\documentstyle[12pt,psfig]{article}
\message{
Copyright, 1993, all rights reserved, Charlie Ebner (Dept.of Physics,
The Ohio State University, Columbus OH 43210) and Mark Jarrell (Dept.of
Physics, The University of Cincinnati, Cincinnati, OH 45221-0011). This
material may not be reproduced for profit, modified or published in any
form (this includes electronic redistribution) without the prior written
permission of the authors listed above.
}
\title{The Special Theory of Relativity}
\author{Albert Einstein\\(1879 - 1955)}
\input{../defs}
\def\baselinestretch{1.4}
\bdo
\maketitle
\tableofcontents
\medskip
In this chapter we depart temporarily from the study of electromagnetism to
explore Einstein's special theory of relativity. One reason for doing so is
that Maxwell's field equations are inconsistent with the tenets of
``classical'' or ``Galilean'' relativity. After developing the special
theory, we will apply it to both particle kinematics and electromagnetism
and will find that Maxwell's equations are completely consistent with the
requirements of the special theory.
\section{Einstein's Two Postulates}
Physical phenomena may be observed and/or described relative to any of an
infinite number of ``reference frames''; we regard the reference frame as
being that one relative to which the measuring apparatus is at rest. The
basic claim (or postulate) of relativity, which predates Einstein's work by
many centuries, is that physical phenomena should be unaffected by
the choice of the frame from which they are observed. This statement is
quite vague. A simple explicit example is a collision of two objects. If
they are seen to collide when observed from one frame, then the postulate
of relativity says that they will be seen to collide no matter what
reference frame is used to make the observation.
\subsection{Galilean Invariance}
If one believes some version of the postulate of relativity, then one
should, when constructing an explanation of the phenomena in
question, make a theory which will predict the same phenomena in all
reference frames. The original great achievements of this kind were Newton's
theories of mechanics and gravitation. Consider, for example, $\F=m\a$. If
the motion of some massive object is observed relative to two different
reference frames, the motion will obey this equation in both frames
provided the frames themselves are not being accelerated. This
qualification leads one to restrict the statement of the relativity
principle to unaccelerated or {\em inertial} reference frames.
In order to test the postulate of relativity, one needs a transformation
that makes it possible to translate the values of physical observables from
one frame to another. Consider two frames $K$ and $K'$ with $K'$ moving at
velocity $\v$ relative to $K$.
\centerline{\psfig{figure=fig1.ps,height=2.5in,width=8.5in}}
{\narrower\em{Figure 1: Inertial frames $K$ and $K'$}}
\medskip
\noindent Then the (almost obvious) way
to relate a space-time point $(t,\x)$ in $K$ to the same
point $(t',\x')$ in
$K'$ is via the {\em Galilean transformation}
\beq
\xp=\x-\v t\andh t'=t,
\eeq
or so it was believed up to the time of Einstein. Notice that the
transformation is written so that the (space) origins coincide at $t=t'=0$;
we shall say simply that the origins (in space and time) coincide.
In what sense is Newton's law of motion consistent with the Galilean
transformation? If his equation satisfies the postulate of relativity, then
the motion of a massive object must obey it in both frames; thus
\beq
\F=m\a\andh\F'=m'\a'
\eeq
where primed quantities are measured in $K'$ and unprimed ones in $K$.
Now, experiments demonstrate (not quite correctly) that the force and mass
are invariants, meaning that they are the same in all inertial frames, so if
Newton's law is to hold in all inertial frames, then it must be the case
that $\a=\a'$. The Galilean transformation provides a way of comparing
these two quantities. In Eqs.~(1), let $\x$ and $\x'$ be the positions of the
mass at, respectively, times $t$ and $t'$ in frames $K$ and $K'$. Then we
have
\beq
\der\xp{t'}=\der \x t-\v
\eeq
and
\beq
\a'=\sde{\x'}{t'}=\sde\x t=\a,
\eeq
assuming $\v$ is a constant. Thus we find that Newton's law, Galileo's
transformation, and the observed motions of massive objects are consistent.
\footnote{Of course, they aren't consistent at all if one either makes
measurements of extraordinary precision or studies particles traveling at
an appreciable fraction of the speed of light. Neither of these things was
done prior to the twentieth century.}
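As a quick numerical aside (not part of the original argument; the trajectory and frame velocity below are arbitrary choices), one can verify that a Galilean boost leaves the acceleration unchanged by differentiating a transformed trajectory:

```python
# Aside: numerically verify that acceleration is unchanged by the Galilean
# transformation x' = x - v t, t' = t.  The numbers are arbitrary.

def accel(x, t, h=1e-3):
    """Second derivative of x(t) by central differences."""
    return (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2

a_true = 9.8                                # constant acceleration in K
v = 5.0                                     # velocity of K' relative to K
x = lambda t: 0.5 * a_true * t**2           # motion observed in K
xp = lambda t: x(t) - v * t                 # same motion observed in K'

print(accel(x, 2.0), accel(xp, 2.0))        # both ~9.8: a = a'
```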
\subsection{The difficulty with Galilean Invariance}
Now we come to the dilemma posed by Maxwell's equations. They are not
consistent with the postulate of relativity if one uses the Galilean
transformation to relate quantities in two different inertial frames.
Imagine the quandary of the late-nineteenth-century physicist. He had the
Galilean transformation and Newton's equations of motion, backed by
enormous experimental evidence, to support the almost self-evident
principle of relativity. But he also had the new - and enormously
successful - Maxwell theory of light which was not consistent with Galilean
relativity. What to do? One possible way out of the morass was easy to
find. It was well-known that wavelike phenomena, such as sound, obey wave
equations which are not properly ``invariant'' under Galilean
transformations. The reason is simple: These waves are vibrational motions
of some medium such as air or water, and this medium will be in motion with
different velocities relative to the coordinate axes of different inertial
frames. If one understood this, then one could see that although the wave
equation takes on different forms relative to different frames, it did
correctly describe what goes on in every frame and was not inconsistent
with the postulate of relativity.
The appreciation of this fact set off a great search to find the medium,
called the ``luminiferous ether'' or simply the {\em ether}, whose
vibrations constitute electromagnetic waves. The search (i.e., the
Michelson-Morley experiment) was, as we know,
completely unsuccessful\footnote{Or completely successful, if we adopt a
somewhat different (Einstein's) point of view.}, as the ether eluded all
seekers.
However, for Einstein, it was the Fizeau experiment (1851) which
convinced him that the ether explanation was incorrect. This experiment
looked for a change in the phase velocity of light due to its passage through
a moving medium, in this case water.
\centerline{\psfig{figure=fig2.ps,height=2.5in,width=4.25in}}
{\narrower\em{Figure 2: Diagram of Fizeau experiment}}
\medskip
\noindent Fizeau found that this phase velocity was given by
\[
v_{phase}=\frac{c}{n} \pm v\lep 1-\frac{1}{n^2}\rip\;\;{\rm{experiment}}
\]
where $n$ is the index of refraction of the water, and $v$ is its
velocity. The plus (minus) sign is taken if the water is moving
with (against) the light.
Let's analyze the experiment from a Galilean point of view.
The dielectric water is moving in either the same or the opposite direction
as the light, and so acts as a moving source for the light which is
refracted (i.e., reradiated by the water molecules). Nonrelativistically,
we just add the velocity $v$ of the source to the wave velocity for
the stationary source. Thus Galilean theory says
\[
v_{phase}=\frac{c}{n} \pm v \;\;{\rm{Galilean\; theory}},
\]
which is clearly inconsistent with experiment.
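As an aside (the numbers are illustrative, and we borrow the relativistic velocity-addition rule derived later in this chapter), relativity reproduces Fizeau's measured form to first order in $v/c$:

```python
# Aside: compose the light speed c/n in the water's rest frame with the
# water's flow speed v using the relativistic velocity-addition rule
# (derived later in this chapter), and compare with Fizeau's result
# c/n + v (1 - 1/n^2).  The numbers are illustrative.
c = 3.0e8          # speed of light, m/s
n = 1.33           # index of refraction of water
v = 10.0           # water flow speed, m/s

u_water = c / n                                     # phase velocity in the water frame
u_lab = (u_water + v) / (1.0 + u_water * v / c**2)  # relativistic addition
u_fizeau = c / n + v * (1.0 - 1.0 / n**2)           # Fizeau's measured form

print(u_lab - c / n)      # ~4.35 m/s, the Fresnel "drag"
print(u_fizeau - c / n)   # ~4.35 m/s as well
```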
The stage was now set for Einstein who, in 1905, made the following
postulates:
\ben
\item Postulate of relativity: The laws of nature and the results of
all experiments performed in a given frame of reference are independent of
the translational motion of the system as a whole.
\item Postulate of the constancy of the speed of light: The speed of light
is independent of the motion of its source.
\een
The first postulate essentially reaffirmed what had long been thought or
believed in the specific case of Newton's law, extending it to all
phenomena. The second postulate was much more radical. It did away with the
ether at a stroke and also with Galilean relativity because it implies that
the speed of light is the same in all reference frames, which is
fundamentally inconsistent with the Galilean transformation.
\section{Simultaneity, Separation, Causality, and the Light Cone}
\subsection{Simultaneity}
The second postulate - disturbing in itself - leads to many additional
``nonintuitive'' predictions. For example, suppose that there are sources
of light at points A and C and that they both emit signals that are
observed by someone at B which is midway between A and C.
\centerline{\psfig{figure=fig3.ps,height=2.in,width=4.25in}}
{\narrower\em{Figure 3: Simultaneity depends upon the rest frame of
the observer}}
\medskip
\noindent
If he sees the two signals simultaneously and knows that he is equidistant
from the sources, he will conclude quite correctly that the signals were
emitted simultaneously. Now suppose that there is a second observer, B', who is
moving along the line from A to C and who arrives at B just when the
signals do. He will know that the signals were emitted at some earlier time
when he was closer to A than to C. Also, since both signals travel with
the same speed $c$ in his rest frame (because the speeds of the signals
relative to him are independent of the speeds of the sources relative to
him), he will conclude that the signal from C was emitted earlier than
that from A because it had to travel the greater distance before reaching
him. He is as correct as the first observer. Similarly, an observer moving
in the opposite direction relative to the first one will conclude from the
same reasoning that the signal from A was emitted before that from C.
{\em{Hence Einstein's second postulate leads us to the conclusion that events,
in this case the emission of light signals, which are simultaneous in one
inertial frame are not necessarily simultaneous in other inertial frames.}}
\subsection{Separation and Causality}
If simultaneity is only a relative fact, as opposed to an absolute one,
what about causality? Because the order of the members of some pairs of events
can be reversed by
changing one's reference frame, we must consider whether the events'
ability to influence each other can similarly be affected by a change of
reference frame. This question is closely related to a quantity that we
shall call the {\em separation} between the events. Given two events A and
B which occur at space-time points $\txu$ and $\txd$, we define the squared
separation $s_{12}^2$ between them to be
\beq
s_{12}^2\equiv c^2(t_1-t_2)^2-|\x_1-\x_2|^2.
\eeq
Let the two events be (1) the emission of an electromagnetic signal at some
point in vacuum and (2) its reception somewhere else. Then, because the
signal travels with the speed $c$, these events have separation zero,
$s_{12}^2=0$. This result will be the same in any inertial frame since the
signal has the same speed $c$ in all such frames.
Now, if we have two events such that $s_{12}^2>0$, then we have a ``causal
relationship'' in the sense that a light signal can get from the first
event to the place where the second one occurs before it does occur. Such a
separation is called {\em timelike}. On the other hand, if $s_{12}^2<0$,
then a light signal cannot get from the first event to the location of the
second event before the second event occurs. This separation is called {\em
spacelike}. A separation $s_{12}^2=0$ is called {\em lightlike}.
It is important to ask whether there is some other type of signal that
travels faster than $c$ and which could therefore produce a causal
relationship between events with a spacelike separation. None has been
found and we shall assume that none exists. Consequently, we claim that
events with a timelike separation are such that the earlier one can
influence the later one, because a signal can get from the first to the
location of the second before the latter occurs, but that events with a
spacelike separation are such that the earlier one cannot influence the
later one because a signal cannot get from the first event to the location
of the second one fast enough.
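A minimal sketch of this classification (an aside, not from the text; one spatial dimension and units with $c=1$ are chosen for simplicity):

```python
# Aside: classify the separation between two events (t1, x1) and (t2, x2)
# using s12^2 = c^2 (t1 - t2)^2 - |x1 - x2|^2, in one spatial dimension
# with units chosen so that c = 1.

def classify(t1, x1, t2, x2, c=1.0):
    s2 = c**2 * (t1 - t2)**2 - (x1 - x2)**2
    if s2 > 0:
        return "timelike"    # a signal slower than light connects them
    if s2 < 0:
        return "spacelike"   # no signal can connect them
    return "lightlike"       # connected exactly by a light signal

print(classify(0.0, 0.0, 2.0, 1.0))  # timelike: light covers 2, the gap is 1
print(classify(0.0, 0.0, 1.0, 3.0))  # spacelike
print(classify(0.0, 0.0, 1.0, 1.0))  # lightlike
```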
\subsection{The Light Cone}
The question now is whether the character of
the separation between two events, timelike, spacelike, or lightlike, can
be changed by changing the frame in which it is measured. For simplicity,
let the ``first'' event, A, occur (in frame $K$) at $(t=0,\x=0)$ while the
second takes place at some general $(t,\x)$ with $t>0$. Further, let $ct$
be larger than $|\x|$ so that $s^2>0$ and A may influence B. Consider these
same two events in another frame $K'$. By an appropriate choice of the origin
(in space and time) of this frame, we can make the first event occur here,
just as it does in frame $K$. The second event will be at some $(t',\x')$.
We can picture the relative positions of the two events in space and time by
using a {\em light cone} as shown.
\centerline{\psfig{figure=fig4.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 4: The Light Cone}}
\medskip
\noindent The vertical axis
measures $ct$; the horizontal one, separation in space, $|\x|$. The two
diagonal lines have slopes $\pm1$. The event B is shown within the cone
whose axis is the $ct$ axis; any event with a timelike separation relative
to the origin will be in here.
%Another event C with spacelike separation is also shown.
The question we wish to ask now is whether, by going to another
reference frame, one may cause event B to move across one of the diagonal
lines and so wind up in a place where it cannot be influenced by the event
at the origin. The point is that A can influence any event inside of the
``future'' cone; it can be influenced by any event inside of the ``past''
cone; but it cannot influence, or be influenced by, any event inside of the
``elsewhere'' region. If an event B and two reference frames $K$ and $K'$ can
be found such that the event when expressed in one frame is on the opposite
side of a diagonal from where it is in the other frame, then we have made
causality a frame-dependent concept.
Suppose that we have two such frames.
We can effect a transformation from one to the other by considering a sequence
of many frames, each moving at a velocity only slightly different from the
previous one, and such that the first frame in the sequence is $K$ and the
final one is $K'$.
\centerline{\psfig{figure=fig5.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 5: Evolution of the event B as we evolve from
frame $K$ to $K'$}}
\medskip
\noindent If we let this become an infinite sequence with
infinitesimal differences in the velocities of two successive members of
the sequence, then the positions of the event B in the sequence of light
cones for each of the frames will form a continuous curve when expressed in
a single light cone. If B crosses from timelike to spacelike in this
sequence, then at some point, B must lie on one of
the diagonals. For such an event, the separation from A is lightlike, or
zero.
Consider now two events with zero separation $s_{12}^2=0$. These events can be
coincident with the emission and reception of a light signal. But these
events must be coincident with the emission and reception of the light
signal in all frames, by the postulate of relativity (the first postulate),
and so these events
must have zero separation in all inertial frames because of the constancy
of the speed of light in all frames. Consequently, what we are trying to do
above is in fact impossible; that is, one cannot move an event such as B onto
or off of the surface of the light cone by looking at it in a different
reference frame, and for this reason, one cannot make it cross the surface of
the light cone. An event in the future will be there in all reference
frames. One in the past cannot be taken out of the past; and one that is
``elsewhere'' will be there in any reference frame.
\subsection{The Invariance of Separation}
With a little more thought we can generalize the conclusion of the
previous paragraph that two events with zero separation in one frame have
zero separation in all frames. In fact, the separation, whatever it may be,
between any two events is the same in all frames. We shall call something
that is the same in all frames an {\em invariant}; the separation is an
invariant. To argue that this should be the case, suppose that we have two
events which are infinitesimally far apart in both space and time so that we
may write $ds_{12}$ for $s_{12}$,
\beq
(ds_{12})^2=c^2(t_1-t_2)^2-|\x_1-\x_2|^2
\eeq
in frame $K$. In another inertial frame $K'$ we have separation
$(ds_{12}')^2$, and we have argued that this is zero if $ds_{12}^2$ is zero. If
$K'$ is moving with a small speed relative to $K$, the separations in the two
frames must be nearly equal which means that they will be infinitesimal
quantities of the same order, or
\beq
(ds_{12})^2=A(ds_{12}')^2
\eeq
where $A=A(v)$ is a finite function of $v$, the relative speed of the
frames. Furthermore, $A(0)=1$ since the two frames are the same if $v=0$.
Now, if time and space are homogeneous and isotropic, then it must also be
true that
\beq
(ds_{12}')^2=A(v)(ds_{12})^2.
\eeq
Comparing the preceding equations, we see that the only solutions are
$A(v)=\pm1$; the condition that $A(0)=1$ means $A(v)=1$. Hence
\beq
(ds_{12}')^2=(ds_{12})^2
\eeq
which is a relation between differentials that may be integrated to give
\beq
(s_{12}')^2=(s_{12})^2,
\eeq
thereby demonstrating that the separation between any two events is an
invariant. The locus of all points with a given separation is a hyperbola
when drawn on a light cone (or a hyperboloid of revolution if more spatial
dimensions are displayed in the light cone).
\section{Proper time}
Another related invariant is the so-called {\em proper time}. This is tied
to a particular object and is the time that elapses in the rest frame of
that object. If the object is accelerated, its rest frame is not an
inertial frame.
\centerline{\psfig{figure=fig6.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 6: Instantaneous rest frames $K'$ and $K''$ of an
object with velocity $\u(t)$ as measured in $K$}}
\medskip
\noindent It is then useful to make use of the ``instantaneous'' rest
frame of the object, meaning an inertial frame relative to which the object is
not moving at a particular instant of time. Thus, if in frame $K$ the
object has a velocity $\u(t)$, its instantaneous rest frame at time $t$ is
a frame $K'$ which moves at velocity $\v=\u(t)$. One may find the object's
proper time by calculating the time that elapses in an infinite sequence of
instantaneous rest frames.
Consider an object moving with a trajectory $\xt$ relative to frame $K$.
Between $t$ and $t+dt$ it moves a distance $d\x$ as measured in $K$. Let us
ask what time $dt'$ elapses in the frame $K'$ which is the instantaneous
rest frame at time $t$. The one thing we know is that
\beq
(ds)^2\equiv c^2(dt)^2-(d\x)^2=(ds')^2=c^2(dt')^2-(d\x')^2
\eeq
where, as usual, unprimed quantities are the ones measured relative to $K$
and primed ones are measured in $K'$. Now, $d\x'=0$\footnote{More correctly,
$d\x'$ is a second-order differential and hence may be neglected.} because the object is at rest in $K'$ at time $t$. Hence we may
drop this contribution to the (infinitesimal) separation and solve for $dt'$:
\beq
dt'=\sqrt{(dt)^2-(d\x)^2/c^2}=dt\sqrt{1-\frac1{c^2}\lep\der\x t\rip^2}
=dt\sqrt{1-u^2/c^2}
\eeq
where $\u\equiv d\x/dt$ is the object's velocity as measured in $K$. Now
we may integrate from some initial time $t_1$ to a final time $t_2$ to find
the proper time of the object which elapses while time is proceeding from
$t_1$ to $t_2$ in frame $K$; that is, we are adding up all of the time that
elapses in an infinite sequence of instantaneous rest frames of the object
while time is developing in $K$ from $t_1$ to $t_2$.
\beq
\ta_2-\ta_1=\int_{t_1}^{t_2}dt\,\sqrt{1-u^2(t)/c^2}.
\eeq
\subsection{Proper Time of an Oscillating Clock}
As an example let the object move along a one-dimensional path with $u(t)=c
\sin(2\pi t/t_0)$ with $t_1=0$ and $t_2=t_0$. This velocity describes a
round trip of a harmonic oscillator with a peak speed of $c$ and a period
of $t_0$. The corresponding elapsed proper time is
\beq
\ta_0=\int_0^{t_0}dt\,\sqrt{1-\sin^2(2\pi t/t_0)}=\int_0^{t_0}dt\,
|\cos(2\pi t/t_0)|=2t_0/\pi.
\eeq
This is smaller than $t_0$ by a factor of $2/\pi$ which means that a clock
carried by the object will show an elapsed time during the trip which is
just $2/\pi$ times what a clock which remains in frame $K$ will show,
provided the acceleration experienced by the clock which makes the trip
doesn't alter the rate at which it runs.
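As a numerical check of this result (an aside; the midpoint rule is just one simple choice of quadrature), we can integrate $dt\,\sqrt{1-u^2(t)/c^2}$ directly:

```python
# Aside: numerically integrate the elapsed proper time
# tau = integral dt sqrt(1 - u(t)^2/c^2) for u(t) = c sin(2 pi t / t0)
# over one period, using a simple midpoint rule.
import math

t0 = 1.0
N = 200000
dt = t0 / N
tau = 0.0
for k in range(N):
    t = (k + 0.5) * dt
    beta = math.sin(2.0 * math.pi * t / t0)          # u(t)/c
    tau += math.sqrt(max(0.0, 1.0 - beta**2)) * dt

print(tau, 2.0 * t0 / math.pi)   # both ~0.6366
```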
\centerline{\psfig{figure=fig7.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 7: The proper time for the oscillating frame is
$2t_0/\pi$, which is less than the elapsed time in frame $K$.}}
\medskip
\noindent If this clock accompanies a traveller, then
the traveller ages during the trip by an amount which is only $2/\pi$ of the
amount by which someone who stays at rest in $K$ ages. One may wonder
whether, from the point of view of the traveller, the one who stayed at
home should be the one who ages more ``slowly.'' If the calculation is done
carefully (correctly), one finds that the same conclusion is reached; the
traveller has in fact aged less than the stay-at-home.
\section{Lorentz Transformations}
\subsection{Motivation}
So far we know the locations $\tx$ and $\tpxp$ of a space-time point as
given in $K$ and $K'$ must be related by
\beq
c^2t^2-\x\cdot\x=c^2t'^2-\x'\cdot\x',
\eeq
given that the origins of the coordinate and time axes of the two frames
coincide. This equation looks a lot like the statement that the inner
product of a four-dimensional vector, having components $ct$ and $i\x$, with
itself is an invariant. It suggests that the transformation relating $\tx$
and $\tpxp$ is an orthogonal transformation in the four-dimensional space
of $ct$ and $\x$. There is an unusual feature in that the transformation
apparently describes an imaginary or complex rotation because the inner
product, or length, that is preserved is $c^2t^2-\x\cdot\x$ as opposed to
$c^2t^2+\x\cdot\x$. Recall that a rotation in three dimensions around the
$\ept$ direction by angle $\ph$ can be represented by a matrix
\beq
a=\lep\barr{ccc}
\cos\ph & -\sin\ph & 0\\
\sin\ph & \cos\ph & 0\\
0 & 0 & 1\earr\rip
\eeq
so that
\beq
x_i'=\sum_{j=1}^3a_{ij}x_j;
\eeq
that is,
\beq\barr{c}
x_1'=\cos\ph\,x_1-\sin\ph\,x_2\\
x_2'=\sin\ph\,x_1+\cos\ph\,x_2\\
x_3'=x_3.
\earr\eeq
For an imaginary angle $\ph=-i\et$, $\cos\ph\rightarrow\cosh\et$ and
$\sin\ph\rightarrow-i\sinh\et$. Further, let us reconstruct the vector as
$\y=(x_1,ix_2,ix_3)$ and make the transformation
\beq
y_i'=\sum_ja_{ij}y_j.
\eeq
The result, expressed in terms of components of $\x$, is
\beq\barr{c}
x_1'=\cosh\et\,x_1-\sinh\et\,x_2\\
x_2'=-\sinh\et\,x_1+\cosh\et\,x_2\\
x_3'=x_3;
\earr\eeq
these are such that
\beq
x_1'^2-x_2'^2-x_3'^2=x_1^2-x_2^2-x_3^2
\eeq
since $\cosh^2(\et)-\sinh^2(\et)=1$, so we have succeeded in
constructing a transformation that produces the
right sort of invariant. All we have to do is generalize to four
dimensions.
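A quick numerical sketch (with arbitrary illustrative values) confirms that the hyperbolic rotation above preserves $x_1^2-x_2^2$:

```python
# Aside (arbitrary values): the hyperbolic rotation
#   x1' = cosh(eta) x1 - sinh(eta) x2,  x2' = -sinh(eta) x1 + cosh(eta) x2
# preserves x1^2 - x2^2 because cosh^2(eta) - sinh^2(eta) = 1.
import math

eta = 0.7
x1, x2 = 3.0, 1.0
x1p = math.cosh(eta) * x1 - math.sinh(eta) * x2
x2p = -math.sinh(eta) * x1 + math.cosh(eta) * x2

print(x1**2 - x2**2, x1p**2 - x2p**2)   # both ~8.0
```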
\subsection{Derivation}
Let's begin by introducing a vector with four components, $(x_0,x_1,x_2,x_3)$
where $x_0=ct$ and the $x_i$ with $i=1,2,3$ are the usual Cartesian components
of the position vector. Then introduce $\vec{y}\equiv(x_0,i\x)$ which has the
property that
\beq
\vec{y}\cdot\vec{y}=x_0^2-\x\cdot\x.
\eeq
This inner product is supposed to be an invariant under the transformation
of $\x$ and $t$ that we seek. The transformation in question is a rotation
through an imaginary angle $-i\et$ that mixes time and one spatial direction,
which we pick to be the first ($y_1$ or $x_1$) without loss of generality.
The matrix representing this rotation is
\beq
a=\lep\barr{cccc}
\cosh\et & i\sinh\et & 0 & 0\\
-i\sinh\et & \cosh\et & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\earr\rip.
\eeq
Now operate with this matrix on $\vec{y}$ to produce $\vec{y}'$. If we write
the components of the latter as $(x_0',i\xp)$, we find the following:
\beq\barr{c}
x_0'=\cosh\et\,x_0-\sinh\et\,x_1\\
x_1'=-\sinh\et\,x_0+\cosh\et\,x_1\\
x_2'=x_2\\
x_3'=x_3.
\earr\eeq
It is a simple matter to show from these results that $x_0^2-\x\cdot\x$ is
an invariant, i.e.,
\beq
x_0^2-\x\cdot\x=x_0'^2-\x'\cdot\x'
\eeq
which means we have devised an acceptable transformation in the sense that
it preserves the separation between two events.
But what is the significance of $\et$? Let us rewrite $\sinh\et$ as $\cosh
\et\,\tanh\et$. Then we have, in particular,
\beq\barr{c}
x_0'=\cosh\et\,(x_0-\tanh\et\,x_1)\\
x_1'=\cosh\et\,(x_1-\tanh\et\,x_0).
\earr\eeq
The second of these is
\beq
x_1'=\cosh\et\,(x_1-\tanh\et\,ct).
\eeq
Suppose that we are looking at an object at rest at the origin of $K'$, and
the space-time point $\tx$ is this object's location. Then $x_1'=0$ for all
$t'$. As seen from $K$, the object is at $x_1=vt$ given that $\v$, the
velocity of the object (and of $K'$) relative to $K$, is parallel to
$\epu$.
\centerline{\psfig{figure=fig8.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 8: Coordinates for our Lorentz transform.}}
\medskip
\noindent This is consistent with \eq{27} provided
\beq
\tanh\et\equiv\frac vc\equiv\be
\eeq
where $\be$ is defined as the ratio of the speed $v$ to the speed of light.
From this relation, we find further that
\beq
\cosh\et=1/\sqrt{1-\be^2}\equiv\ga;
\eeq
this expression defines $\ga$, a parameter that comes up repeatedly in the
special theory of relativity.
Our determination of the transformation, called the {\em Lorentz
transformation}\footnote{H.~A.~Lorentz devised these transformations prior
to Einstein's development of the special theory of relativity; they had in
fact been used even earlier by Larmor and perhaps others. Furthermore, it
was known that Maxwell's equations were invariant under these
transformations, meaning that if these are the right transformations (as
opposed to the Galilean transformations), Maxwell's equations are eligible
for ``law of nature'' status.}, is now complete. We find that,
given a frame $K'$ moving at
velocity $\v=v\epu$ relative to $K$, a space-time point $\tx$ in $K$ becomes,
in $K'$, the space-time point $\tpxp$ with
\beq
x_0'=\ga(x_0-\be x_1)\hsph x_1'=\ga(x_1-\be x_0)\hsph x_2'=x_2\hsph x_3'=x_3.
\eeq
The inverse transformation can be extracted from these equations in a
straightforward manner; it may also be inferred from the fact that $K$ is
moving at velocity $-\v$ relative to $K'$ which tells us immediately that
\beq
x_0=\ga(x_0'+\be x_1')\hsph x_1=\ga(x_1'+\be x_0')\hsph x_2=x_2'\hsph
x_3=x_3'.
\eeq
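The transformation and its inverse can be checked numerically (a sketch with arbitrary values, not part of the notes):

```python
# Aside (arbitrary values): the Lorentz boost along x_1 with beta = v/c,
# its invariant x0^2 - x1^2, and its inverse (a boost by -beta).
import math

def boost(x0, x1, beta):
    """Lorentz boost along x_1: returns (x0', x1')."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * (x0 - beta * x1), gamma * (x1 - beta * x0))

beta = 0.6
x0, x1 = 5.0, 2.0
x0p, x1p = boost(x0, x1, beta)

print(x0**2 - x1**2, x0p**2 - x1p**2)   # both ~21: the separation is invariant
print(boost(x0p, x1p, -beta))           # recovers (5, 2) up to rounding
```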
\subsection{Elapsed Proper Time Revisited}
Let us try to use this transformation to calculate something. First, we
revisit the proper time. For an object at rest in $K'$, $\xp$ does not
change with time. Also, from our transformation,
\beq
ct=\ga(ct'+\be x_1').
\eeq
The differential of this transformation, making use of the fact that the
object is instantaneously at rest in $K'$, gives $dt=\ga dt'$, since
$dx_1'$ is second-order in powers of $dt'$. Stated in another fashion, we are
considering the transformation of two events or space-time
points. They are the locations of the object at times $t'$ and $t'+dt'$.
Because the object is at rest in $K'$ at time $t'$, its displacement $d\xp$
during the time increment $dt'$ is of order $(dt')^2$ and so may be
discarded. The corresponding elapsed time $dt$ in $K$ is thus found to be
$dt=\ga dt'$, using the Lorentz transformations of the two space-time
points. This equation may also be written as
\beq
dt'=dt/\ga=\sqrt{1-v^2/c^2}dt.
\eeq
The left-hand side of this equation is the elapsed proper time of the
object while $dt$ is the elapsed time measured by observers at rest
relative to $K$. If we introduce $\u$, the velocity of the object relative
to $K$, and notice that $\u=\v$ at time $t$, then we can write $dt'$ in terms
of $\u(t)$ as
\beq
dt'=\sqrt{1-|\u(t)|^2/c^2}dt,
\eeq
where now $dt'$ is the elapsed proper time of the object, which moves at
velocity $\u$ relative to frame $K$. We can integrate this relation to find
the finite elapsed proper time during an arbitrary time interval (in $K$),
\beq
\ta_2-\ta_1=\int_{t_1}^{t_2}dt\sqrt{1-|\u(t)|^2/c^2}.
\eeq
\subsection{Proper Length and Length Contraction}
Next, we shall examine the {\em FitzGerald-Lorentz contraction}. Define the
{\em proper length} of an object as its length, measured in the frame where
it is at rest. Let this be $L_0$, and let the rest frame be $K'$, moving at
the usual velocity ($\v=v\epu$) relative to $K$.
\centerline{\psfig{figure=fig9.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 9: Length contraction occurs along the axis parallel
to the velocity.}}
\medskip
\noindent The relative geometry is shown in the figure. The rod, or
object, is positioned
in $K'$ so that its ends are at $x_1'=0$ and $x_1'=L_0$. They are there for all
$t'$. In order to find the length of the rod in $K$, we have to measure the
positions of both ends at the {\bf same} time $t$ as measured in $K$. We
can find the results of these measurements from the Lorentz transformation
\beq
x_1'=\ga(x_1-\be x_0).
\eeq
Use this relation first with $x_1'$ equal to 0 and then with $x_1'=L_0$,
using the same time $x_0$ in both cases, and take the difference of the two
equations so obtained. The result is
\beq
L_0=\ga(x_{1R}-x_{1L})\equiv\ga L
\eeq
where $x_{1R}$ and $x_{1L}$ are the positions of the right and left ends of
the rod at some particular time $x_0$. The difference of these is $L$,
the length of the rod as measured in frame $K$.
Our result for $L$ can be written as
\beq
L=L_0/\ga=\sqrt{1-\be^2}L_0.
\eeq
This length is smaller than $L_0$ which means that the object is found (is
measured) in
$K$ to be shorter than its proper length or its length in the frame where
it is at rest. Notice, however, that if we did the same calculation for its
length in a direction perpendicular to the direction of $\v$, we would find
that it is the same in $K$ as in $K'$. Consequently the transformation of the
object's volume is
\beq
V=V_0/\ga=\sqrt{1-\be^2}\,V_0
\eeq
where $V_0$ is the {\em proper volume} or volume in the rest frame, and $V$
is the volume in a frame moving at speed $\be c$ relative to the rest frame.
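Illustrative numbers (not from the text) for the contraction formulas:

```python
# Aside with illustrative numbers: length and volume contraction at beta = 0.8.
import math

beta = 0.8
gamma = 1.0 / math.sqrt(1.0 - beta**2)   # = 5/3 here

L0, V0 = 2.0, 8.0       # proper length (m) and proper volume (m^3)
L = L0 / gamma          # contracted along the direction of motion only
V = V0 / gamma          # one dimension contracts, two are unchanged

print(L, V)             # ~1.2 m and ~4.8 m^3
```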
\section{Transformation of Velocities}
Because we know how $\x$ and $t$ transform, we can determine how anything
that involves functions of these quantities transforms; velocity, for example.
Let an object have velocity $\u$ in $K$ and velocity $\u'$ in $K'$ and let
$K'$ move at velocity $\v$ relative to $K$. We wish to determine how $\u'$
is related to $\u$. In $K'$, the object moves a distance $d\x'=\u'dt'$ in
time $dt'$. A similar statement, without any primed quantities, holds in
$K$. The infinitesimal displacements in time and space are related by Lorentz
transformations:
\beq
dt=\ga(v)\lep dt'+(v/c^2)dx'\rip\hsph dx=\ga(v)(dx'+vdt')\hsph dy=dy'\hsph
dz=dz',
\eeq
where we have let $\v$ be along the direction of $x_1$ as usual. Taking
ratios of the displacements to the time increment, we have
\beq
u_x=\der xt=\frac{dx'+vdt'}{dt'+(v/c^2)dx'}=\frac{dx'/dt'+v}
{1+(v/c^2)(dx'/dt')}=\frac{u_x'+v}{1+vu_x'/c^2},
\eeq
\beq
u_y=\frac1{\ga(v)}\frac{u_y'}{1+vu_x'/c^2},
\eeq
and
\beq
u_z=\frac1{\ga(v)}\frac{u_z'}{1+vu_x'/c^2}.
\eeq
These results may be summarized in vectorial form:
\beq
\u\pll=\frac{\u\pll'+\v}{1+\v\cdot\u'/c^2}\hsph
\u\per=\frac{\u\per'}{\ga(v)(1+\v\cdot\u'/c^2)}
\eeq
where the subscripts ``$\parallel$'' and ``$\perp$'' refer respectively to the
components of the velocities $\u$ and $\u'$ parallel and perpendicular to
$\v$. Notice too that
\beq
\u\pll=\lep\frac{\u\cdot\v}{v^2}\rip\v\andh\u_\perp=\u-\u\pll
=\u-\lep\frac{\u\cdot\v}{v^2}\rip\v.
\eeq
It is sometimes useful to express the transformations for velocity in polar
coordinates $(u,\th)$ and $(u',\th')$ such that
\beq
u\pll=u\cos\th\andh u_\perp=u\sin\th,
\eeq
etc.; the appropriate expressions are
\beq
\tan\th=\frac{u'\sin\th'}{\ga(v)(u'\cos\th'+v)}\andh
u=\frac{[u'^2+v^2+2u'v\cos\th'-(vu'\sin\th'/c)^2]^{1/2}}{1+u'v\cos\th'/c^2}.
\eeq
\centerline{\psfig{figure=fig10.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 10: Frames for velocity transform.}}
\medskip
\noindent The inverses of all of these velocity transformations are easily
found by appropriate symmetry arguments based on the fact that the
velocity of $K$ relative to $K'$ is just $-\v$.
It is interesting that the velocity transformations are, in contrast to the
ones for $\x$ and $t$, nonlinear. They must be nonlinear because there is a
maximum velocity which is the speed of light; combining two velocities,
both of which are close to, or equal to, $c$, cannot give a velocity
greater than $c$. A linear transformation would necessarily allow this to
happen, so a nonlinear transformation is required. To see how the
transformations rule out finding a frame where an object moves faster than
$c$, let us consider the transformation of a velocity $|\u'|=c$. From the
second of Eqs.~(47), we see that
\beq
u=\frac{[c^2+v^2+2cv\cos\th'-v^2(1-\cos^2\th')]^{1/2}}{1+v\cos\th'/c}=
c\frac{[(1+v\cos\th'/c)^2]^{1/2}}{1+v\cos\th'/c}=c.
\eeq
Thus do we find what we already knew: If something moves at speed $c$ in
one frame, then it moves at the same speed in any other frame. More
generally, if we had used any $u'\le c$ and $v\le c$, we would have
recovered a $u\le c$.
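These closure properties are easy to confirm numerically. The following is a
minimal modern sketch of ours (the function name and sample speeds are our own
choices, in units where $c=1$) implementing the longitudinal addition rule:

```python
def add_velocity(ux_prime, v, c=1.0):
    """Longitudinal velocity addition: u_x = (u_x' + v)/(1 + v u_x'/c^2)."""
    return (ux_prime + v) / (1.0 + v * ux_prime / c**2)

# A signal moving at c in K' still moves at c in K, for any frame speed v.
print(add_velocity(1.0, 0.9))   # -> 1.0
# Two sub-light speeds compose to a speed still below c.
print(add_velocity(0.9, 0.9))   # -> 0.994..., not 1.8
```

Note that the Galilean result $u_x'+v$ is recovered when both speeds are small
compared with $c$, since the denominator then differs negligibly from unity.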
\subsection{Aberration of Starlight}
An interesting example of the application of the velocity transformation is the
observed aberration of starlight. Suppose that an observer is moving with
speed $v$ at right angles to the direction of a star that he is watching.
If a Galilean transformation is applied to the determination of the
apparent direction of the star, one finds that it is seen at an angle $\ph$
away from its true direction where $\tan\ph=v/c$.
\centerline{\psfig{figure=fig11.ps,height=2.0in,width=4.25in}}
{\narrower\em{Figure 11: Due to the finite velocity of light, a star is
seen an angle $\ph$ away from its true direction.}}
\medskip
\noindent One can measure this
angle by waiting six months. The velocity $\v$ is provided by the earth's
orbital motion; six months later it is reversed and if the observer then
looks for the same star, its position will have shifted by $2\ph$, at least
according to the Galilean transformation.
But that prediction is not correct. Consider what happens if the Lorentz
transformation is used to compute the angle $\ph$. Using \eq{47}, we see
that the angle $\th$ at which the light from the star appears to be headed
in the frame $K$ of the observer is
\beq
\tan\th=\frac{c\sin\th'\sqrt{1-v^2/c^2}}{c(\cos\th'+v/c)}.
\eeq
where $\th'$ is its direction in the frame $K'$ which is the rest frame of
the sun.
\centerline{\psfig{figure=fig12.ps,height=2.0in,width=4.25in}}
{\narrower\em{Figure 12: Coordinates for stargazing.}}
\medskip
\noindent Now suppose that $\v$ is at $\pi/2$ radians to the direction of
the light's motion in frame $K'$ so that $\th'=\pi/2$. Then we find
$\tan\th=c/\ga(v)v$. To compare with the prediction of the Galilean
transformation, we need to find the angle $\ph$, which is to say,
$\pi/2-\th$. From a trigonometric identity, we have
\beq
\tan\th=\tan\lep\frac\pi2-\ph\rip=\frac1{\tan\ph},
\eeq
and so
\beq
\tan\ph=v\ga(v)/c\hsph\mbox{or}\hsph\sin\ph=v/c.
\eeq
This is the correct answer; the tangent of the angle $\ph$ differs from the
prediction of the Galilean transformation by a factor of $\ga$, which differs
from unity only at second order in powers of $v/c$.
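Putting in numbers for the earth's orbit (the mean orbital speed below is a
standard approximate value, inserted by us for illustration) reproduces the
familiar ``constant of aberration'':

```python
import math

c = 299792.458            # speed of light in km/s
v = 29.78                 # earth's mean orbital speed in km/s (approximate)

phi = math.asin(v / c)    # relativistic result: sin(phi) = v/c
print(round(math.degrees(phi) * 3600.0, 1))   # -> 20.5 (arcseconds)
```

Over six months the apparent position therefore shifts by $2\ph$, about 41
seconds of arc.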
\section{Doppler Shift}
The Doppler shift of sound is a well-known and easy to understand
phenomenon. It depends on the velocities of the source and observer
relative to the medium in which the waves propagate. For electromagnetic
waves, this medium does not exist and so the Doppler shift for light takes
on its own special - and relatively simple! - form.
Suppose that in frame $K$ there is a plane wave with wave vector $\k$ and
frequency $\om$. Put an observer at some point $\x$ and set him to work
counting wave crests as they go past him.
\centerline{\psfig{figure=fig13.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 13: An observer counts wave crests.}}
\medskip
\noindent Let him begin with the crest
which passes the origin at $t=0$ and continue counting until some later
time $t$. How many crests does he count? We can decide by first determining
when he starts. The starting time is $t_0=(\k\cdot\x)/kv_w$ where $v_w$ is
the velocity of the wave in frame $K$. The observer counts from $t_0$ to
$t$ and so counts $n$ crests where
\beq
n=(t-t_0)/T;
\eeq $T$ is the period of the wave, $T=2\pi/\om$. Hence,
\beq
n=\frac1{2\pi}\lep\om t-\frac\om{kv_w}\k\cdot\x\rip=\frac1{2\pi}(\om t
-\k\cdot\x),
\eeq since $\om=v_wk$.
\centerline{\psfig{figure=fig14.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 14: Frame alignments for the Doppler problem.}}
\medskip
Now let the same measurement be performed by an observer at rest at a point
$\xp$ in frame $K'$ which moves at $\v$ relative to $K$. We choose $\xp$ in
a special way; it must be coincident with $\x$ at time $t$ (measured in $K$).
This observer also counts crests, starting with the one that passed his
origin at time $t'=0$ and stopping with the one that arrives when he (the
second observer) is coincident with the first observer. Given the usual
transformations, the four-dimensional coordinate origins coincide, and so
both observers count the same number of crests. Repeating the argument
given for the number counted by the first observer, we find that the number
counted by the second observer can be written as
\beq
n=\frac1{2\pi}(\om't'-\k'\cdot\xp)
\eeq
where $\om'$ and $\k'$ are the frequency and wave vector of the wave in
$K'$ and $\tpxp$ is the spacetime point that transforms into $\tx$. Thus we
find
\beq
\om t-\k\cdot\x=\om' t'-\k'\cdot\xp.
\eeq
The significance of this relation is that the phase of the wave is an invariant.
Further it appears to be the inner product of $(ct,i\x)$ and $(\om/c,i\k)$.
Because we know how $\tx$ transforms to $\tpxp$, we can figure out how
$(\om/c,i\k)$ transforms to $(\om'/c,i\k')$. Let $\om/c\equiv k_0$ and
$\om'/c\equiv k_0'$ and consider \eq{55} with the transformations \eq{30}
used for $t'$ and $\xp$:
\beq
\om t-k_1x_1-k_2x_2-k_3x_3=\om'\ga(t-\be x_1/c)-k_1'\ga(x_1-\be ct)-k_2'x_2
-k_3'x_3.
\eeq
Because $t$ and $\x$ are completely arbitrary, we may conclude that
\beq
\om=\ga(\om'+\be ck_1')\hsph k_1=\ga(k_1'+\be\om'/c)\hsph k_2=k_2'\hsph
k_3=k_3',
\eeq
or
\beq
k_0=\ga(k_0'+\be k_1')\hsph k_1=\ga(k_1'+\be k_0')\hsph k_2=k_2'\hsph k_3=k_3'.
\eeq
We recognize the form of these transformations; they tell us that $(k_0,\k)$
transforms in the same way as $(x_0,\x)$, i.e., via the Lorentz
transformation.
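The invariance of the phase can be checked numerically. In the sketch below
(our own illustration: a boost along $x^1$ with $\be=1/2$ and arbitrary sample
components), both $(k^0,\k)$ and $(x^0,\x)$ are transformed with the same
matrix, and the contraction $k_\al x^\al=\om t-\k\cdot\x$ is unchanged:

```python
import numpy as np

beta = 0.5
gamma = 1.0 / np.sqrt(1.0 - beta**2)
# Boost along x^1 acting on contravariant components (x^0, x^1, x^2, x^3).
L = np.array([[gamma, -gamma*beta, 0, 0],
              [-gamma*beta, gamma, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])
g = np.diag([1.0, -1.0, -1.0, -1.0])

k = np.array([1.0, 0.5, np.sqrt(0.75), 0.0])   # (omega/c, k) for a light wave
x = np.array([2.0, 0.7, -0.3, 1.1])            # an arbitrary event (ct, x)

phase = (g @ k) @ x                    # k_alpha x^alpha = omega t - k.x
phase_prime = (g @ L @ k) @ (L @ x)    # same contraction in the boosted frame
print(np.isclose(phase, phase_prime))  # -> True
```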
Let's spend a few minutes thinking about the conditions under which our
result is valid. We assumed when making the argument that we have a plane
wave in both $K$ and $K'$ which means, more or less, that we are giving
Maxwell's equations Law of Nature status since we assumed that the relevant
equation of motion produces plane wave solutions in both frames. In fact,
our results are not correct for waves in general, because many types of
waves do not have this property (namely, that a wave which is a plane wave
in one frame is a plane wave in every frame). But our results are correct
for electromagnetic waves in vacuum.
Finally, let us look at an alternative form for our transformations. Let
\beq
\k'=k'(\cos\th'\,\epu+\sin\th'\,\epd);
\eeq
the component of $\k'$ perpendicular to $\epu$ is defined to be in the
direction of $\epd$. Further, $k'=\om'/c$. Then the transformation equations
may be used to produce the relations
\beq
k_1=\ga k'(\cos\th'+\be)\hsph \k_2=k'\sin\th'\,\epd\andh\om=\ga\om'(1+\be
\cos\th')
\eeq
where $\k_2$ is the component of $\k$ which is perpendicular to $\epu$.
From these results it is easy to show that $\om=ck$, no surprise, and that
\beq
\cos\th=\frac{\cos\th'+\be}{1+\be\cos\th'};
\eeq
$\th$ is the angle that $\k$ makes with the direction of $\v$, (or $\epu$);
that is,
\beq
\k=k(\cos\th\,\epu+\sin\th\,\epd).
\eeq
\subsection{Stellar Red Shift}
The last of Eqs.~(60) in particular may be used to describe the Doppler shift
of the frequency of electromagnetic waves in vacuum. A well-known case in
point is the ``redshift'' of light from distant galaxies.
\centerline{\psfig{figure=fig15.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 15: Light from receding stars in $K'$ is
redshifted when seen in $K$.}}
\medskip
\noindent
Given an object receding from the observer in $K$ and emitting light of
frequency $\om'$ in its own rest frame, $K'$, we have $\cos\th'\approx-1$
and
\beq
\om=\ga\om'(1-\be)=\om'\sqrt{\frac{1-\be}{1+\be}}.
\eeq
For, e.g., $\be=1/2$, $\om=\om'/\sqrt3$. The observer sees the light as
having much lower frequency than that with which it is emitted; it is
``red-shifted.''
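A short numerical check of the receding-source formula (the function name and
sample values below are our own, not part of the text):

```python
import math

def redshift(omega_src, beta):
    """Observed frequency for a source receding at speed beta*c:
    omega = omega' * sqrt((1 - beta)/(1 + beta))."""
    return omega_src * math.sqrt((1.0 - beta) / (1.0 + beta))

# beta = 1/2 gives omega = omega'/sqrt(3), as in the text.
print(math.isclose(redshift(1.0, 0.5), 1.0 / math.sqrt(3.0)))  # -> True
```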
\section{Four-tensors and all that}\footnote{The introduction to tensor
calculus given in this section is largely drawn from J.~L.~Synge and
A.~Schild, {\em Tensor Calculus} (University of Toronto Press, Toronto,
1949).}
It is no accident that $(x_0,\x)$ and $(k_0,\k)$ transform from $K$ to $K'$
in the same way. They are but two of many sets of four objects or elements
that have this property. They are called {\em four-vectors}. More generally,
there are sets of $4^p$ elements, with $p=0,1,2 ...$, which have very
similar transformation properties and which are called {\em four-tensors of
rank} $p$. The better to manipulate them when the time comes, let us spend
a little time now learning some of the basics of tensor calculus.
Consider the usual frames $K$ and $K'$ with coordinates $\xb$ and $\xb'$,
respectively; $\xb$ stands for $(x_0,\x)$ and similarly for $x'$. Let there
be some transformation from one frame to the other which gives
\beq
\xb'=x'(\xb),
\eeq
with an inverse,
\beq
\xb=x(\xb').
\eeq
These transformations need not in general be linear.
A {\em{\bf{contravariant vector}}} or {\em rank-one tensor} is defined to
be a set of four quantities or elements $a^\al$, $\al=0,1,2,3$, which transform
from $K$ to $K'$ according to the rule
\beq
a'^\al=\sum_{\be=0}^3\pde{x'^\al}{x^\be}a^\be\equiv A^\al_{\bel}a^\be.
\eeq
This equation serves to define $A^\al_{\bel}$,
\beq
A^\al_{\bel}\equiv\pde{x'^\al}{x^\be};
\eeq
we have also introduced in the last step the summation convention that a
Greek index, which appears in a term as both an upper and a lower index, is
summed from zero to three.
For any contravariant vector or tensor, we are going to introduce also a {\em
covariant} vector or tensor whose components will be designated by subscripts.
Define a {\em{\bf{covariant vector}}} or {\em rank-one tensor} as a set of
four objects $b_\al$, $\al=0,1,2,3$, which transform according to the rule
\beq
b_\al'=\sum_{\be=0}^3\pde{x^\be}{x'^\al}b_\be\equiv A_\al^{\bel}b_\be
\eeq
where we have defined
\beq
A_\al^{\bel}\equiv\pde{x^\be}{x'^\al}.
\eeq
The generalization to tensors of ranks other than one is straightforward.
For example, a rank-two contravariant tensor comprises a set of
sixteen objects $T^{\al\be}$ which transform according to the rule
\beq
T'^{\al\be}=A_{\gal}^\al A_{\del}^\be T^{\ga\de}
\eeq
and a rank-two covariant tensor has sixteen elements $T_{\al\be}$
which transform according to the rule
\beq
T'_{\al\be}=A_\al^{\gal}A_\be^{\del}T_{\ga\de}.
\eeq
Mixed tensors can also be of interest. The rank-two mixed tensor $\Tb$
is a set of sixteen elements $T^\al_{\bel}$ which transform according to
\beq
T'^\al_{\;\bel}=A^\al_{\gal}A_\be^{\del}T^\ga_{\;\del}.
\eeq
Generalizations follow as you would expect.
The {\em inner product} of $\ab$ and $\bb$ can be\footnote{We will present
a different but equivalent definition later.} defined as
\beq
\ab\cdot\bb\equiv b_\al a^\al.
\eeq
Consider the transformation properties of the inner product:
\beq
\ab'\cdot\bb'=a'^\al b'_\al=A_\al^{\gal}A_{\del}^\al b_\ga a^\de;
\eeq
however,
\beq
A_\al^{\gal}A_{\del}^\al=\pde{x^\ga}{x'^\al}\pde{x'^\al}{x^\de}=\pde{x^\ga}{x^
\de}=\de^\ga_\de
\eeq
where
\beq
\de^\ga_\de\equiv\lec\barr{cc}1 & \ga=\de \\ 0 & \ga\ne\de\earr\right.
\eeq
Hence
\beq
\ab'\cdot\bb'=\de^\ga_\de a^\de b_\ga=a^\ga b_\ga=\ab\cdot\bb.
\eeq
The inner product is an invariant, also known as a {\em scalar} or {\em
rank-zero tensor}.
Notice that when we wrote the Kronecker delta function, we gave it a
superscript and subscript as though it were a rank-two mixed tensor. It in
fact is one as we can show by transforming it from one frame to another.
Let $\de_\al^\be$ be defined as above in the frame $K$ and let it be
defined to be a mixed tensor. Then we know how it transforms and so can
find it in a different frame $K'$ (where we hope it will turn out to be the
same as in frame $K$):
\beq
\de'^\al_\be=A^\al_{\gal}A_\be^{\del}\de^\ga_\de=A^\al_{\gal}A_\be^{\gal}
=\pde{x'^\al}{x^\ga}\pde{x^\ga}{x'^\be}=\pde{x'^\al}{x'^\be}=\de^\al_\be
\eeq
which means that the thing we defined to be a rank-two mixed tensor
remains the same as the Kronecker delta function in all frames.
The operation which enters the definition of the inner product is to set a
contravariant and a covariant index equal to each other and then to sum
them. This operation is called a {\em contraction} with respect to the pair
of indices in question. It reduces the rank of something by two. That is,
the sixteen objects $b_\al a^\be$ form a rank-two tensor, as may be shown
easily by checking how it transforms (given the transformation properties
of $b_\al$ and $a^\be$). After we perform the contraction, we are left with
a rank-zero tensor.
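The invariance of the contraction is easy to see numerically. In this sketch
of ours (a boost along $x^1$; the sample components are arbitrary choices),
contravariant components transform with the matrix and covariant components
with its inverse transpose, and the contraction $b_\al a^\al$ comes out the
same in both frames:

```python
import numpy as np

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
A = np.array([[gamma, -gamma*beta, 0, 0],     # boost along x^1, acting on a^alpha
              [-gamma*beta, gamma, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])
g = np.diag([1.0, -1.0, -1.0, -1.0])

a_up = np.array([2.0, 0.3, -1.0, 0.5])        # contravariant a^alpha
b_dn = g @ np.array([1.0, 0.2, 0.4, -0.7])    # covariant b_alpha

a_up_p = A @ a_up                             # a'^alpha
b_dn_p = np.linalg.inv(A).T @ b_dn            # b'_alpha uses dx^beta/dx'^alpha
print(np.isclose(a_up @ b_dn, a_up_p @ b_dn_p))  # -> True
```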
\subsection{The Metric Tensor}
Now think about how we can use these things in relativity. We have a
fundamental invariant which is the separation between two events;
specifically,
\beq
(ds)^2=(dx^0)^2-(dx^1)^2-(dx^2)^2-(dx^3)^2
\eeq
is an invariant,
\beq
(ds)^2=(ds')^2.
\eeq
We would like to write this as an inner product $d\xb\cdot d\xb$, where
\beq
d\xb\cdot d\xb=dx_\al dx^\al.
\eeq
However, in order that we can do so, it must be the case that the covariant
four-vector $d\xb$ have the components
\beq
dx_0=dx^0\andh dx_i=-dx^i\mbox{ for }i=1,2,3.
\eeq
In general, the components of the contravariant and covariant versions of a
four-vector are related by the {\em metric tensor} $\gb$ which is a rank-two
tensor that can be expressed in covariant, contravariant, or mixed form (just
like any other tensor of rank two or more). In particular, the {\em covariant
metric tensor} is defined for any system by the statement that the separation
can be written as
\beq
(ds)^2\equiv g_{\al\be}dx^\al dx^\be,
\eeq
plus the statement that it is a symmetric tensor.
How do we know that this is a tensor? From the fact that its double
contraction with the contravariant vector $d\xb$ is an invariant and from
the fact that it is symmetric, one can prove that it is a rank-two
covariant tensor.\footnote{See, e.g., Synge and Schild.}
In three-dimensional Cartesian coordinates in a Euclidean space such as we
are accustomed to thinking about, the covariant metric tensor is just the unit
tensor. In curvilinear coordinates (for example, spherical coordinates) it is
some other (still simple) thing. For the flat four-dimensional space that one
deals with in the special theory of relativity, we can see from \eqs{79}{83},
and from the condition that $\gb$ is symmetric, that it must be
\beq
g_{00}=1,\hsph g_{ii}=-1,\mbox{ }i=1,2,3,\andh g_{\al\be}=0,\mbox{ }\al\ne\be.
\eeq
Next, we introduce the {\em contravariant metric tensor}. First, we take
the determinant of the matrix formed by the covariant metric tensor,
\beq
g\equiv\det[g_{\al\be}]=-1.
\eeq
Then one introduces the cofactor, written as $\De^{\al\be}$, of each element
$g_{\al\be}$ in the matrix. The elements of the contravariant metric tensor
are defined
as
\beq
g^{\al\be}\equiv\frac{\De^{\al\be}}{g}.
\eeq
We need to demonstrate that this thing is indeed a contravariant tensor.
From the standard definitions of the determinant and cofactor, we can write
\beq
g_{\al\be}\De^{\al\ga}=g_{\be\al}\De^{\ga\al}=\de^\ga_\be g
\eeq
from which it follows that
\beq
g_{\al\be}g^{\al\ga}=\de^\ga_\be=g_{\be\al}g^{\ga\al}.
\eeq
When contracted (as above) with a covariant tensor, the thing we call a
contravariant tensor produces a mixed tensor. In addition, it is symmetric
which follows from the symmetry of the covariant metric tensor. This is
sufficient to prove that the elements $g^{\al\be}$ do form a contravariant
tensor.
It is easy to work out the elements of the contravariant metric tensor if
one knows the covariant one; for our particular metric tensor they are the
same as the elements of the covariant one.
The metric tensor is used to convert contravariant tensors or indices to
covariant ones and conversely. Consider for example the elements $x_\al$
defined by
\beq
x_\al=g_{\al\be}x^\be.
\eeq
It is clear that the result is a covariant tensor of rank one. It is the
covariant version of the position four-vector $\xb$ and has elements $(x^0,
-\x)$. Similarly, we may recover the contravariant version of a four-vector
or tensor from the covariant version of the same tensor by using the
contravariant metric tensor:
\beq
x^\al=g^{\al\be}x_\be=g^{\al\be}g_{\be\ga}x^\ga=\de^\al_\ga x^\ga=x^\al.
\eeq
\eeq
More generally, one may raise or lower as many indices as one wishes by using
the appropriate metric tensor as many times as needed. Among other things,
we can thereby construct a mixed metric tensor,
\beq
g_\al^\be=g_{\al\ga}g^{\ga\be}.
\eeq
Using the explicit components of the covariant and contravariant metric
tensors, one finds that this is precisely the unit mixed tensor, i.e., the
Kronecker delta,
\beq
g_\al^\be=\de_\al^\be.
\eeq
Finally, we earlier defined the inner product of two vectors by contracting
the covariant version of one with the contravariant version of the other;
we can now see that there are numerous other ways to express the inner
product:
\beq
\ab\cdot\bb=a^\al b_\al=g_{\al\ga}a^\al b^\ga=g^{\al\ga}a_\ga b_\al.
\eeq
In particular, the separation is now seen to be the same as $\xb\cdot\xb$,
\beq
(s)^2=g_{\al\be}x^\al x^\be=x^\al x_\al.
\eeq
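For our flat metric, the raising and lowering machinery can be checked in a
few lines (the sample components below are our own):

```python
import numpy as np

g_dn = np.diag([1.0, -1.0, -1.0, -1.0])   # covariant metric g_{alpha beta}
g_up = np.linalg.inv(g_dn)                # contravariant metric; equal to g_dn here

x_up = np.array([3.0, 1.0, -2.0, 0.5])    # x^alpha = (x^0, x)
x_dn = g_dn @ x_up                        # x_alpha = g_{alpha beta} x^beta = (x^0, -x)

print(np.allclose(g_up @ x_dn, x_up))               # raising recovers x^alpha -> True
print(np.isclose(x_up @ x_dn, x_up @ g_dn @ x_up))  # s^2 computed either way -> True
```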
There is one piece of unfinished business in all of this. We have defined a
metric tensor; it was defined so that the separation is an invariant. We
still do not know (if we assume we haven't as yet learned about Lorentz
transformations) the components $A^\al_{\bel}$ and $A^{\all}_\be$ of the
transformation matrices. Just any old transformation won't do; it has
to be consistent with our metric tensor, i.e., with the condition that the
separation is invariant. This implies some conditions on the
transformations. We shall return to this point later.
\subsection{Differential Operators}
Differential operators also have simple transformation properties. Consider
the basic example of the four operators $\partial/\partial x^\al$. The
transformation of this from one frame to another is found from the relation
\beq
\pde{ }{x'^\al}=\pde{x^\be}{x'^\al}\pde{ }{x^\be}\equiv A_\al^{\bel}\pde{ }
{x^\be}.
\eeq
The components of this operator transform in the same way as the components
of a covariant vector which means that the four differential operators
$\partial/\partial x^\al$ form a covariant four-vector. That being the case
the elements
\beq
A_\al^{\bel}=\pde{x^\be}{x'^\al}
\eeq
are the elements of a rank-two mixed tensor, and that is why we have all along
used for them notation which suggests that they are components of such a
tensor.
It is equally true that $\partial/\partial x_\al$ is a contravariant
four-vector operator. Consider
\beqa
\pde{ }{x'_\al}=\pde{x_\be}{x'_\al}\pde{ }{x_\be}=g_{\be\ga}\pde{x^\ga}
{x'^\de}\pde{x'^\de}{x'_\al}\pde{ }{x_\be}\nonumber\\
=g_{\be\ga}g^{\de\al}A_\de^{\gal}\pde{ }{x_\be}=A^\al_{\bel}\pde{ }{x_\be}.
\eeqa
Since the operator transforms in the same way as a contravariant
four-vector, it is a contravariant four-vector!
Either of these four-vectors is called the {\em four-divergence}. Let's
introduce some new notation for them:
\beq
\partial_\al\equiv\pde{ }{x^\al}
\eeq
is the way we shall write a component of the covariant four-divergence, and
\beq
\partial^\al\equiv\pde{ }{x_\al}
\eeq
is the way we write a component of the contravariant four-divergence.
We can construct some interesting invariants using the four-divergence. For
example, the inner product of one of them with a four-vector produces an
invariant,
\beq
\partial^\al A_\al=\partial_\al A^\al=\pde{A^0}{x^0}+\div\A.
\eeq
Also, the four-dimensional Laplacian
\beq
\partial^\al\partial_\al=\spd{ }{{x^0}}-\div\grad\equiv\Box
\eeq
is an invariant, or scalar, operator.
\subsection{Notation}
It is natural to present four-vectors using column vectors and rank-two
tensors using matrices. Thus a four-vector such as $\xb$ becomes
\beq
\xb=\lep\barr{c}x^0\\x^1\\x^2\\x^3\earr\rip,
\eeq
and its transpose is
\beq
\tilde{x}=(x^0\;x^1\;x^2\;x^3).
\eeq
Using this notation, we can write, e.g., the inner product of two vectors
as
\beq
\ab\cdot\bb=a_\al b^\al=g_{\al\be}a^\al b^\be=\tilde{a}\gb\bb.
\eeq
Notice however, that if $\tilde{a}$ were the transpose of the covariant
vector, we would write the inner product as $\tilde{a}\bb$. The notation
leaves something to be desired. Be that as it may, we can write a
transformation as
\beq
x'^\al=A^\al_{\bel}x^\be\hsph\mbox{or }\xb'=\Ab\xb.
\eeq
We will only make use of this abbreviated notation when it is necessary to
cause lots of confusion.
\section{Representation of the Lorentz transformation}
Our next task is to find a general transformation matrix\footnote{This
could be the fully contravariant version, the fully covariant version or one of
the two mixed versions. If we know one of them, we know the others because we
can lower and raise indices with the metric tensor.} $\Ab$. As pointed out
earlier, the basic fact we have to work with is that the separation is
invariant,
\beq
g_{\al\be}dx^\al dx^\be=g_{\al\be}dx'^\al dx'^\be.
\eeq
Knowing this, and knowing also that
\beq
x'^\al=A^\al_{\bel} x^\be,
\eeq
it is a standard and straightforward exercise in linear algebra to show
that
\beq
\det|A|=\pm1.
\eeq
Just as in three dimensions, there are proper and improper transformations
which satisfy our requirements. The proper ones may be arrived at via a
sequence of infinitesimal transformations starting from the identity,
$A^\al_{\bel}=g^\al_\be$. All transformations generated in this manner have
determinant +1. The improper ones cannot be constructed in this way, even
though some of them can have determinant +1. An example is $A^\al_{\bel}=
-g^\al_\be$; it has determinant +1 but is an improper transformation and cannot
be arrived at by a sequence of infinitesimal transformations.
In this investigation we shall construct proper Lorentz transformations and
shall build them from infinitesimal ones. Let's start by writing
\beq
A^\al_{\bel}=\de^\al_\be+\De\om^\al_{\bel},
\eeq
where $\De\om^\al_{\bel}$ is an infinitesimal. From the invariance of the
interval, one can easily show that of the sixteen components
$\De\om^\al_{\bel}$, the diagonal ones must be zero and the off-diagonal
ones must be such that
\beq
\De\om^{\al\be}=-\De\om^{\be\al};
\eeq
notice that both indices are now contravariant, in contrast to the previous
equation. If we write the preceding relation with one contravariant and one
covariant index, we will find the same $-$ sign if the two indices are 1, 2,
or 3, and there will be no $-$ if one index is 0 and the other is one of 1,
2, or 3. Evidently, it is simpler to use a completely contravariant form.
\footnote{The point is, we can use any form for the tensor that we like
because all forms can be found from any single one. Therefore, it makes sense
to use that form in which the relations are simplest, if there is one.}
These results demonstrate that we have just six independent infinitesimals.
We may take them to be a set of six numbers without indices if we introduce
suitable basis matrices. One such set of matrices is given by
\beq
(K_1)^\al_{\bel}=\lep\barr{cccc}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0\earr\rip;\hsph(K_2)^\al_{\bel}=\lep\barr{cccc}
0&0&1&0\\
0&0&0&0\\
1&0&0&0\\
0&0&0&0\earr\rip\nonumber
\eeq
\beq
(K_3)^\al_{\bel}=\lep\barr{cccc}
0&0&0&1\\
0&0&0&0\\
0&0&0&0\\
1&0&0&0\earr\rip;\hsph(S_1)^\al_{\bel}=\lep\barr{cccc}
0&0&0&0\\
0&0&0&0\\
0&0&0&-1\\
0&0&1&0\earr\rip\nonumber
\eeq
\beq
(S_2)^\al_{\bel}=\lep\barr{cccc}
0&0&0&0\\
0&0&0&1\\
0&0&0&0\\
0&-1&0&0\earr\rip;\hsph(S_3)^\al_{\bel}=\lep\barr{cccc}
0&0&0&0\\
0&0&-1&0\\
0&1&0&0\\
0&0&0&0\earr\rip.
\eeq
The most general infinitesimal transformation can now be written as
\beq
\Ab=\gb-\De\vom\cdot\vSb-\De\vze\cdot\vKb,
\eeq
where $\De\vom$ contains three independent infinitesimal components as does
$\De\vze$; these are, respectively, just infinitesimal coordinate rotations and
infinitesimal relative velocities.
Powers of the matrices $\Kb_i$ and $\Sb_i$ have some very special properties.
For example,
\beq
(\Kb_1)^2=\lep\barr{cccc}1&0&0&0\\0&1&0&0\\0&0&0&0\\0&0&0&0\earr\rip\andh
(\Sb_1)^2=\lep\barr{cccc}0&0&0&0\\0&0&0&0\\0&0&-1&0\\0&0&0&-1\earr\rip;
\eeq
consequently, powers of the matrices tend to repeat; the
periods of these cycles are two and four for the $\Kb$'s and the $\Sb$'s,
respectively so that, for any $m$ and integral $n$,
\beq
(\Kb_i)^{m+2n}=(\Kb_i)^m\andh(\Sb_i)^{m+4n}=(\Sb_i)^m.
\eeq
We have displayed the second power in each case above; for the $\Kb$'s,
the third power is the same as the first and for the $\Sb$'s, one finds the
negative of the first,
\beq
(\Sb_i)^3=-\Sb_i;
\eeq
and finally, the fourth power of one of the $\Sb$'s has two 1's on the diagonal,
much like the even powers of the $\Kb$'s.
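These cycling properties are easy to confirm by explicit matrix
multiplication; in this short check of ours, $\Kb_1$ and $\Sb_1$ are entered
directly from the definitions above:

```python
import numpy as np

K1 = np.zeros((4, 4)); K1[0, 1] = K1[1, 0] = 1.0        # boost generator along x^1
S1 = np.zeros((4, 4)); S1[2, 3] = -1.0; S1[3, 2] = 1.0  # rotation generator about x^1

mp = np.linalg.matrix_power
print((mp(K1, 3) == K1).all())    # K^3 = K: period two in the exponent -> True
print((mp(S1, 3) == -S1).all())   # S^3 = -S -> True
print((mp(S1, 5) == S1).all())    # S^5 = S: period four -> True
```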
We can construct the matrix for a finite transformation by making a
sequence of many infinitesimal transformations. To this end consider some
finite $\vom$ and $\vze$ and relate them to the infinitesimals by
\beq
\De\vom=\vom/n\andh\De\vze=\vze/n,
\eeq
where $n$ is a very large number. Now apply $\Ab$ (given by \eq{114}) to $\xb$
$n$ times, thereby producing some $\xb'$:
\beq
x'^\al=\lep g-\frac{\vom\cdot\S}n-\frac{\vze\cdot\K}n\rip^\al_{\;\;\al_1}
\lep g-\frac{\vom\cdot\S}n-\frac{\vze\cdot\K}n\rip^{\al_1}_{\;\;\al_2}
...\lep g-\frac{\vom\cdot\S}n-\frac{\vze\cdot\K}n\rip^{\al_{n-1}}_{\;\;\;\al_n}
x^{\al_n}
\eeq
We want to take the $n\rightarrow\infty$ limit of this expression. In
general,
\beq
\lim_{n\rightarrow\infty}\lep1+\frac an\rip^n=e^a,
\eeq
as one can show by, e.g., considering the logarithm. Applying this fact, we
find that
\beq
x'^\al=A^\al_{\bel}x^\be
\eeq
where
\beq
A^\al_{\bel}=\lep e^{-\vom\cdot\S-\vze\cdot\K}\rip^\al_{\bel}.
\eeq
We can get a little understanding of what this equation is telling us by
considering some special cases which are also familiar. For example, let
$\vom=0$ and $\vze=\ze\epu$. Then
\beqa
\Ab=e^{-\ze\Kb_1}=1-\ze\Kb_1+\frac{\ze^2}2\Kb_1^2-\frac{\ze^3}6\Kb_1^3+...
\nonumber\\=1-\Kb_1\lep\ze+\frac{\ze^3}6+...\rip+\Kb_1^2\lep1+\frac{\ze^2}2+...
\rip-\Kb_1^2\nonumber\\=1-\Kb_1^2+(\cosh\ze)\Kb_1^2-(\sinh\ze)\Kb_1
\eeqa
or
\beq
A^\al_{\bel}
=\lep\barr{cccc}
\cosh\ze & -\sinh\ze & 0 & 0\\
-\sinh\ze & \cosh\ze & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\earr\rip
\eeq
which should be familiar. Similarly, if $\vom=\om\,\epu$ with $\vze=0$, one
finds
\beq
A^\al_{\bel}=\lep\barr{cccc}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & \cos\om & \sin\om\\
0 & 0 & -\sin\om & \cos\om\earr\rip
\eeq
which we recognize as a simple rotation around the $x$-axis.
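The exponentiation can also be carried out numerically; the sketch below (ours;
the truncation at 60 terms is an arbitrary choice, ample at this matrix norm)
sums the series by brute force and reproduces the boost matrix displayed above:

```python
import numpy as np

def mat_exp(M, terms=60):
    """Matrix exponential by truncated Taylor series (adequate here)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

K1 = np.zeros((4, 4)); K1[0, 1] = K1[1, 0] = 1.0

zeta = 0.8
A = mat_exp(-zeta * K1)
boost = np.array([[np.cosh(zeta), -np.sinh(zeta), 0, 0],
                  [-np.sinh(zeta), np.cosh(zeta), 0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]])
print(np.allclose(A, boost))  # -> True
```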
Our general result for $\Ab$ allows us to find the transformation matrix for
any combination of $\vom$ and $\vze$. In particular, one can show that
$\vom=0$ and general $\vze$ produce a boost with velocity $\vbe$ of magnitude
$\tanh\ze$ in the direction of $\vze$. Writing the components of $\vbe$ as $\be_i$,
$i=1,2,3$, we find that $\Ab$ is
\beq
A^\al_{\bel}=\lep\barr{cccc}
\ga & -\ga\be_1 & -\ga\be_2 & -\ga\be_3\\
-\ga\be_1 & 1+\frac{(\ga-1)\be_1^2}{\be^2} & \frac{(\ga-1)\be_1\be_2}{
\be^2} & \frac{(\ga-1)\be_1\be_3}{\be^2}\\ -\ga\be_2 & \frac{(\ga-1)\be_1
\be_2}{\be^2} & 1+\frac{(\ga-1)\be_2^2}{\be^2} & \frac{(\ga-1)\be_2\be_3}{\be^2}
\\-\ga\be_3 & \frac{(\ga-1)\be_1\be_3}{\be^2} & \frac{(\ga-1)\be_2\be_3}{
\be^2} & 1+\frac{(\ga-1)\be_3^2}{\be^2}\earr\rip,
\eeq
in case anybody wanted to know.
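As a check on this matrix (the helper function below is our own construction),
one can verify numerically that it leaves the metric, and hence the
separation, invariant, and that it is a proper transformation:

```python
import numpy as np

def general_boost(beta_vec):
    """Boost matrix for an arbitrary velocity beta_vec (in units of c)."""
    b = np.asarray(beta_vec, dtype=float)
    b2 = b @ b
    gamma = 1.0 / np.sqrt(1.0 - b2)
    A = np.eye(4)
    A[0, 0] = gamma
    A[0, 1:] = -gamma * b
    A[1:, 0] = -gamma * b
    A[1:, 1:] += (gamma - 1.0) * np.outer(b, b) / b2
    return A

g = np.diag([1.0, -1.0, -1.0, -1.0])
A = general_boost([0.3, -0.2, 0.5])
print(np.allclose(A.T @ g @ A, g))        # separation invariant -> True
print(np.isclose(np.linalg.det(A), 1.0))  # proper: det = +1 -> True
```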
\section{Covariance of Electrodynamics}
In this section we are going to demonstrate the consistency of the Maxwell
equations with Einstein's first postulate. But first we must decide more
precisely what it means for a ``law of nature'' to be ``the same'' in all
inertial frames. The relevant statement is this: an equation expressing a
law of nature must be {\bf invariant in form} under Lorentz
transformations. When this is the case, the equation is said to be {\em
Lorentz covariant} or simply {\em covariant}, which has nothing to do with the
definition of covariant as opposed to contravariant tensors. And what is meant
by the phrase ``invariant in form'' which appears above? It means that the
quantities in the equation must transform in well-defined ways (as particular
components of some four-tensors, for example) and that when terms are grouped
in an appropriate manner, each group transforms in the same way as each of the
other groups. In order to determine whether the Maxwell equations can have
this property, we must first figure out how each of the physical objects in
those equations, that is, $\E$, $\B$, $\rh$, and $\J$, transforms.
\subsection{Transformations of Source and Fields}
\subsubsection{$\rh$ and $\J$}
Let's start with the electric charge. It is an experimental observation
that charge is an invariant. If a system has a particular charge $q$ as
measured in one frame, then it has the same charge $q$ when the
measurements are made in a different frame. From this (experimental) fact
and things we already know, we can determine how charge density and current
density transform.
\centerline{\psfig{figure=fig16.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 16: The charge density transforms like time.}}
\medskip
\noindent
Suppose that we have a system with charge density $\rh$, as measured in $K$,
and $\rh'$ as measured in $K'$. Then in volume $d^3x$ in $K$, there is
charge $dq$ where
\beq
dq=\rh d^3x=\rh d^3x\,dt/dt
\eeq
where we have introduced an infinitesimal time element $dt$ as well.
Similarly, in $K'$, the charge $dq'$ in the volume element $d^3x'$ can be
written as
\beq
dq'=\rh'd^3x'dt'/dt'.
\eeq
Now, if $d^3x'$ is what $d^3x$ transforms into (that is, if it is the same
volume element as $d^3x$), then charge invariance implies that $dq=dq'$.
Further, if $dt'$ is what $dt$ transforms into, then we can say that
\beq
c\,d^3x'dt'\equiv d^4x'=\lel\pde{(x'^0,x'^1,x'^2,x'^3)}{(x^0,x^1,x^2,x^3)}\ril
d^4x\equiv|\det[\Ab]|d^4x.
\eeq
But the determinant of $\Ab$ is unity, so we have shown that a spacetime
volume element is an invariant,
\beq
d^4x=d^4x'.
\eeq
As applied to the present inquiry, we use this statement along with the
equality of $dq$ and $dq'$ (and the invariance of $c$) to conclude that
\beq
\rh/dt=\rh'/dt'.
\eeq
This relation can be true only if the charge density transforms in the same
way as the time; that is, it must be the $0^{th}$ component of a four-vector.
Where are the other three components of this four-vector? They are the
current density. Since $\J$ is $\rh$ times a velocity, which is in turn the
ratio $d\x/dt$, we can write
\beq
\J=\rh\u=\rh\der\x t;
\eeq
in view of the fact that $\rh/dt$ is an invariant, $\J$ must transform in
the same way as $d\x$, which is to say, as the 1,2,3 components of a
(contravariant) four-vector. Hence we have the {\em contravariant current
four-vector}
\beq
J^\al=(c\rh,\J);
\eeq
the covariant current four-vector is
\beq
J_\al=(c\rh,-\J).
\eeq
Knowing this, we are not surprised to find that the charge conservation
equation is a four-divergence equation,
\beq
\pde\rh t+\div\J=0\hsph\mbox{or }\partial_\al J^\al=0.
\eeq
Notice that this equation is ``covariant'' in the sense introduced
earlier; both sides are scalars.
\subsubsection{Potentials}
Now we shall proceed by demanding that all the relevant
equations be Lorentz covariant.
We shall apply this requirement to equations that we already have, see what
the implications are for the fields $\E$ and $\B$, and check that no
contradictions arise. Let's start with the equations for the potentials in
the Lorentz gauge. The equations of motion are
\beq
\Box\Axt=\frac{4\pi}c\Jxt\andh\Box\Phxt=4\pi\rhxt;
\eeq
these can all be written in the very brief notation
\beq
\Box A^\al(\x,t)=\frac{4\pi}c J^\al(\x,t)
\eeq
where we have introduced
\beq
A^\al\equiv(\Ph,\A)
\eeq
which must be a contravariant four-vector if the equations of motion above
are the correct equations of motion for the potential in the Lorentz gauge
in every inertial frame. Notice that the potentials in gauges other than
the Lorentz gauge will not form a four-vector.
The Lorentz condition, which is satisfied by potentials in the Lorentz
gauge, is
\beq
\div\Axt+\frac1c\pde\Ph t=0;
\eeq
this equation may also be written as a four-divergence of a four-vector,
\beq
\partial_\al A^\al=0.
\eeq
\subsubsection{Fields, Field-Strength Tensor}
Let's look next at $\E$ and $\B$; these are given by
\beq
\Bxt=\curl\Axt\andh\Ext=-\grad\Phxt-\frac1c\pde\Axt t.
\eeq
Look at just the $x$-components:
\beq
E_x=-\frac1c\pde{A_x}t-\pde\Ph x=-\pde{A^1}{x^0}-\pde\Ph{x^1}=-\pde{A^1}{x_0}
+\pde{A^0}{x_1}=-\partial^0A^1+\partial^1A^0.
\eeq
Similarly, a component of the magnetic induction turns out to be, e.g.,
\beq
B_x=-\partial^2A^3+\partial^3A^2.
\eeq
Given the four-vector character of the differential operators and of the
potentials, we can see that these particular components of the electric
field and magnetic induction are elements of a rank-two tensor which we
have expressed here in contravariant form. Let us define the {\em
field-strength tensor} $\bar{F}$ by
\beq
F^{\al\be}\equiv\partial^\al A^\be-\partial^\be A^\al.
\eeq
This turns out to be
\beq
F^{\al\be}=\lep\barr{cccc}
0 &-E_x&-E_y&-E_z\\E_x&0&-B_z&B_y\\E_y&B_z&0&-B_x\\E_z&-B_y&B_x&0\earr\rip.
\eeq
Because the tensor is antisymmetric, it has just six independent entries,
these being the six components of $\E$ and $\B$.
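As a small illustrative sketch (Python, with arbitrary sample field components rather than values from the text), one can assemble $F^{\al\be}$ from given $\E$ and $\B$ and confirm the antisymmetry and the identification $F^{i0}=E_i$:

```python
import numpy as np

def field_tensor(E, B):
    """Contravariant field-strength tensor F^{ab} assembled from E and B
    (Gaussian units, metric signature (+,-,-,-))."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([[0,  -Ex, -Ey, -Ez],
                     [Ex,  0,  -Bz,  By],
                     [Ey,  Bz,  0,  -Bx],
                     [Ez, -By,  Bx,  0]], dtype=float)

F = field_tensor([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
assert np.allclose(F, -F.T)                     # antisymmetric: six independent entries
assert np.allclose(F[1:, 0], [1.0, 2.0, 3.0])   # F^{i0} = E_i
```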
The corresponding covariant tensor is easily worked out. It is the same as
the contravariant one except that the signs of the entries in the first
column and the first row are reversed. A somewhat different object which
contains the same information is the {\em dual field-strength tensor}
$\bar{{\cal F}}$ which is defined by
\beq
{\cal F}^{\al\be}\equiv\ep^{\al\be\ga\de}\frac12 F_{\ga\de}
\eeq
where the {\em fully antisymmetric rank-four unit pseudotensor} with
components
$\ep^{\al\be\ga\de}$ is in turn defined by specifying (1) that in frame $K$
\beq
\ep^{\al\be\ga\de}\equiv\lec\barr{cl}
1&\mbox{if $\al\be\ga\de$ is an even permutation of 0123}\\
-1&\mbox{if $\al\be\ga\de$ is an odd permutation of 0123}\\
0&\mbox{otherwise,}\earr\right.
\eeq
and (2) that it transforms to other frames as a rank-four pseudotensor must,
\beq
(\ep')^{\al\be\ga\de}\equiv\det[\Ab]A^\al_{\;\;\ph}A^\be_{\;\;\ch}
A^\ga_{\;\;\ps}A^\de_{\;\;\om}\ep^{\ph\ch\ps\om}.
\eeq
Applying this definition, one can show that the components of this
pseudotensor are given by \eq{147} not only in frame $K$ but in all inertial
frames.
Although $\epb$ is a pseudotensor as opposed to a true tensor, the
distinction will not be important for us so long as we stick to proper
Lorentz transformations or to improper ones that have determinant $+1$. In
what follows, we will refer to it as a tensor even though we know better;
similarly we will refer to the dual tensor as a ``tensor'' (as was done
in the definition) even though it is in fact a pseudotensor.
Returning now to the original point, ${\cal F}^{\al\be}$ is, explicitly,
\beq
{\cal F}^{\al\be}=\lep\barr{cccc}0&-B_x&-B_y&-B_z\\B_x&0&E_z&-E_y\\
B_y&-E_z&0&E_x\\B_z&E_y&-E_x&0\earr\rip.
\eeq
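The duality substitution $\E\rightarrow\B$, $\B\rightarrow-\E$ can be verified numerically. The following sketch (Python; the sample field values are arbitrary) builds the symbol $\ep^{\al\be\ga\de}$ with $\ep^{0123}=+1$, lowers the indices of $F$ with the metric $\mbox{diag}(1,-1,-1,-1)$, and contracts:

```python
import itertools
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])            # metric diag(+,-,-,-)

def levi_civita():
    """Fully antisymmetric rank-four symbol with eps^{0123} = +1."""
    eps = np.zeros((4, 4, 4, 4))
    for p in itertools.permutations(range(4)):
        # permutation parity via counting inversions
        inv = sum(p[i] > p[j] for i in range(4) for j in range(i + 1, 4))
        eps[p] = (-1) ** inv
    return eps

def tensor(E, B):
    """Contravariant F^{ab} assembled from E and B (Gaussian units)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([[0,  -Ex, -Ey, -Ez],
                     [Ex,  0,  -Bz,  By],
                     [Ey,  Bz,  0,  -Bx],
                     [Ez, -By,  Bx,  0]], dtype=float)

E, B = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])
F_low = g @ tensor(E, B) @ g                    # lower both indices: F_{cd}
calF = 0.5 * np.einsum('abcd,cd->ab', levi_civita(), F_low)
# duality is exactly the substitution E -> B, B -> -E:
assert np.allclose(calF, tensor(B, -E))
```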
\subsection{Invariance of Maxwell Equations}
Now we know how everything transforms; it remains to be seen whether the
Maxwell equations are Lorentz covariant. The inhomogeneous equations are
\beq
\div\Ext=4\pi\rhxt\andh\curl\Bxt-\frac1c\pde\Ext t=\frac{4\pi}c\Jxt.
\eeq
The first of these is
\beq
\pde{F^{10}}{x^1}+\pde{F^{20}}{x^2}+\pde{F^{30}}{x^3}=\frac{4\pi}cJ^0.
\eeq
Because $F^{00}\equiv0$, we may add a term $\partial F^{00}/\partial x^0$
to the left-hand side of this equation and then find that it reads
\beq
\partial_\al F^{\al0}=\frac{4\pi}cJ^0.
\eeq
This equation is clearly the $0^{th}$ component of a four-vector equation
in which the left-hand side is obtained by taking the divergence of a rank-two
tensor. The other three inhomogeneous Maxwell equations may be analyzed in
similar fashion and the four may be concisely written as
\beq
\partial_\al F^{\al\be}=\frac{4\pi}cJ^\be
\eeq
where $\be=$0,1,2, and 3. These are manifestly Lorentz covariant.
The homogeneous Maxwell equations are
\beq
\div\Bxt=0\andh\curl\Ext=-\frac1c\pde\Bxt t.
\eeq
The first one can be written as
\beq
\pde{{\cal F}^{10}}{x^1}+\pde{{\cal F}^{20}}{x^2}+\pde{{\cal F}^{30}}{x^3}=0,
\eeq
or, since ${\cal F}^{00}=0$,
\beq
\partial_\al{\cal F}^{\al0}=0.
\eeq
The others can be expressed in similar fashion, and all four are contained
in the following equation:
\beq
\partial_\al{\cal F}^{\al\be}=0,
\eeq
where $\be=$0, 1, 2, and 3. This form is clearly covariant, establishing
the covariance of Maxwell's equations.
These equations are components of a rank-one
pseudotensor. They may also be written as components of a rank-three
tensor. Notice that $\div\B=0$ is, in tensor notation,
\beq
\pde{F^{31}}{x^2}+\pde{F^{23}}{x^1}+\pde{F^{12}}{x^3}=0.
\eeq
The remaining three homogeneous Maxwell equations can be expressed in
similar fashion, and all four can be written as
\beq
\partial^\al F^{\be\ga}+\partial^\ga F^{\al\be}+\partial^\be F^{\ga\al}=0
\eeq
where $\al$, $\be$, and $\ga$ are any three distinct members of 0,1,2,3,
giving four equations. The other possible choices of the superscripts
(involving repetition of two or more values) give nothing new; they reduce
to $0=0$. Hence we
have succeeded in writing each of the homogeneous Maxwell equations in the
form of an element of a rank-three tensor and the Lorentz covariant
equation we have constructed simply says that this tensor is equal to zero.
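This cyclic identity holds for any smooth potentials, because with $F^{\al\be}=\partial^\al A^\be-\partial^\be A^\al$ the mixed partial derivatives cancel in pairs. A symbolic sketch using sympy (the test potentials below are arbitrary hypothetical functions, chosen only for illustration):

```python
import sympy as sp

x = sp.symbols('x0 x1 x2 x3')
g = sp.diag(1, -1, -1, -1)                      # metric diag(+,-,-,-)
# arbitrary smooth test potentials A^a (hypothetical, illustration only)
A = [sp.sin(x[1]) * x[0], sp.cos(x[0]) * x[2], x[3]**2, x[0] * x[1]]

def d_up(a, f):
    """Contravariant derivative d^a = g^{aa} d/dx^a (diagonal metric)."""
    return g[a, a] * sp.diff(f, x[a])

# field-strength tensor F^{ab} = d^a A^b - d^b A^a
F = [[d_up(a, A[b]) - d_up(b, A[a]) for b in range(4)] for a in range(4)]

# the cyclic (homogeneous Maxwell) identity holds for every index triple
for a in range(4):
    for b in range(4):
        for c in range(4):
            expr = d_up(a, F[b][c]) + d_up(c, F[a][b]) + d_up(b, F[c][a])
            assert sp.simplify(expr) == 0
```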
\section{Transformation of the electromagnetic field}
The transformation properties of $\E$ and $\B$ are easily worked out by
making use of our knowledge of how a rank-two tensor must transform:
\beq
(F')^{\al\be}=A^\al_{\gal}A^\be_{\del}F^{\ga\de},
\eeq
or, in matrix notation,
\beq
\Fb'=\Ab\Fb\At
\eeq
where $\At$ is the transpose of the matrix representing $\Ab$. If we pick a
frame $K'$ which is moving at velocity $\v=c\be\epu$, then
\beq
\Ab=\lep\barr{cccc}\ga&-\be\ga&0&0\\-\be\ga&\ga&0&0\\0&0&1&0\\0&0&0&1\earr
\rip\equiv\At.
\eeq
Given the field tensor from \eq{145}, we have
\beq
\Fb\At=\lep\barr{cccc}\be\ga E_x&-\ga E_x&-E_y&-E_z\\ \ga E_x&-\be\ga E_x&-
B_z&B_y\\\ga E_y-\be\ga B_z&-\be\ga E_y+\ga B_z&0&-B_x\\ \ga E_z+\be\ga B_y&
-\be\ga E_z-\ga B_y&B_x&0\earr\rip
\eeq
and
\beq
\Fb'=\lep\barr{cccc}0&-E_x&-\ga E_y+\be\ga B_z&-\ga E_z-\be\ga B_y\\
E_x&0&\be\ga E_y-\ga B_z&\be\ga E_z+\ga B_y\\
\ga E_y-\be\ga B_z&-\be\ga E_y+\ga B_z&0&-B_x\\
\ga E_z+\be\ga B_y&-\be\ga E_z-\ga B_y&B_x&0\earr\rip.
\eeq
This is an antisymmetric tensor, as it should be, and we can equate
individual elements to the appropriate components of $\B'$ and $\E'$. One
finds
\beqa
B_x'=B_x\hsph B_y'=\ga(B_y+\be E_z)\hsph B_z'=\ga(B_z-\be E_y)\nonumber\\
E_x'=E_x\hsph E_y'=\ga(E_y-\be B_z)\hsph E_z'=\ga(E_z+\be B_y).
\eeqa
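These component formulas can be checked directly against the matrix product $\Fb'=\Ab\Fb\At$. A numerical sketch (Python; $\be=0.6$ and the field components are arbitrary sample values):

```python
import numpy as np

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
L = np.array([[ gamma,        -beta * gamma, 0, 0],
              [-beta * gamma,  gamma,        0, 0],
              [ 0,             0,            1, 0],
              [ 0,             0,            0, 1]])

Ex, Ey, Ez = 1.0, 2.0, 3.0
Bx, By, Bz = 4.0, 5.0, 6.0
F = np.array([[0,  -Ex, -Ey, -Ez],
              [Ex,  0,  -Bz,  By],
              [Ey,  Bz,  0,  -Bx],
              [Ez, -By,  Bx,  0]])

Fp = L @ F @ L.T                                 # F' = Lambda F Lambda^T

# compare with the component transformation formulas
assert np.isclose(Fp[1, 0], Ex)                            # E_x' = E_x
assert np.isclose(-Fp[0, 2], gamma * (Ey - beta * Bz))     # E_y'
assert np.isclose(-Fp[0, 3], gamma * (Ez + beta * By))     # E_z'
assert np.isclose(Fp[3, 2], Bx)                            # B_x' = B_x
assert np.isclose(Fp[1, 3], gamma * (By + beta * Ez))      # B_y'
assert np.isclose(Fp[2, 1], gamma * (Bz - beta * Ey))      # B_z'
```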
By examining these relations for a bit, one can see that
\beq\barr{cc}
\E\pll'=\E\pll&\E\per'=\ga[\E\per+(\vbe\times\B)]\\
\B\pll'=\B\pll&\B\per'=\ga[\B\per-(\vbe\times\E)]\earr
\eeq
where the subscripts refer to components of the fields parallel or
perpendicular to $\vbe$.
From the transformations one may see that when $\E\perp\B$, it is possible
to find a frame where one of $\E'$ and $\B'$ (which one?) vanishes. This is
achieved by picking $\vbe\perp\E$ and $\vbe\perp\B$ with an appropriate
magnitude. For example, if $|\B|>|\E|$, we take
\beq
\vbe=\be(\E\times\B)/|\E||\B|
\eeq
where $\be$ is to be such that $\E\per'=0$, or
\beq\barr{c}
0=\E+\vbe\times\B=\E+\be[(\E\times\B)\times\B]/|\E||\B|\\=\E-\be\E|\B|/|\E|
=\E(1-\be|\B|/|\E|)\earr
\eeq
so that we find
\beq
\be=|\E|/|\B|
\eeq
which is possible if $B>E$.
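A quick numerical illustration (Python; the particular magnitudes $|\E|=3$, $|\B|=5$ are arbitrary sample values) confirms that the boost $\vbe=\be(\E\times\B)/|\E||\B|$ with $\be=|\E|/|\B|$ kills the electric field:

```python
import numpy as np

E = np.array([0.0, 3.0, 0.0])                    # E perp B, with |B| > |E|
B = np.array([0.0, 0.0, 5.0])
beta = np.linalg.norm(E) / np.linalg.norm(B)     # beta = |E|/|B| = 3/5
beta_vec = beta * np.cross(E, B) / (np.linalg.norm(E) * np.linalg.norm(B))
gamma = 1.0 / np.sqrt(1.0 - beta**2)

# E has no component along beta_vec, so the transformed field is entirely
# the perpendicular part E' = gamma [E + beta x B], which vanishes:
E_prime = gamma * (E + np.cross(beta_vec, B))
assert np.allclose(E_prime, 0.0)                 # the electric field vanishes in K'
```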
\subsection{Fields Due to a Point Charge}
Another example of the use of the transformations is the determination of
the fields of a charge moving at constant velocity. Suppose a charge $q$
has velocity $\u=\be c\epu$ relative to frame $K$. Let $K'$ move at this
velocity relative to $K$ so that the charge is at rest in the primed frame.
Further, choose the coordinates so that the charge is at $\xp=0$. Then the
fields in this frame are
\beq
\B'(\xp,t')=0\andh\E'(\xp,t')=\frac q{r'^3}\xp.
\eeq
Let us restrict attention, without loss of generality, to the $z'=0$ plane.
There, the electric field is
\beq
\E'=\frac q{(\sqrt{x'^2+y'^2})^3}[x'\epu+y'\epd].
\eeq
Using the transformations of the electromagnetic field, we find that the
nonvanishing components of the fields in frame $K$ are
\beq
E_x=\frac{qx'}{(x'^2+y'^2)^{3/2}}\hsph E_y=\frac{\ga qy'}{(x'^2+y'^2)^{3/
2}}
\eeq
and
\beq
B_z=\frac{\ga\be qy'}{(x'^2+y'^2)^{3/2}}.
\eeq
In order for these expressions to be of any use, we should express the
fields in terms of $t$ and $\x$ rather than the primed spacetime variables.
\centerline{\psfig{figure=fig17.ps,height=1.5in,width=4.25in}}
{\narrower\em{Figure 17: Charge fixed in $K'$ detected by an observer
at P in $K$.}}
\medskip
\noindent
We shall look in particular at the fields at the point $\x=b\epd$, where the
observer P of Figure 17 sits. This observer's position translates, via the
Lorentz transformation, into $K'$ as
\beq
y'=y\hsph x'=-\ga vt\andh z'=0.
\eeq
Using these in the expressions for $\E$ and $\B$, we find
\beq\barr{c}
E_x=-\ga qvt/[b^2+(\ga vt)^2]^{3/2}\\
E_y=\ga qb/[b^2+(\ga vt)^2]^{3/2}\\
B_z=\ga\be qb/[b^2+(\ga vt)^2]^{3/2}.\earr
\eeq
It is instructive to study these results. They tell us the field at a point
(0,b,0) in $K$ when a charge $q$ goes along the $x$ axis with speed $v$,
passing the origin at time $t=0$. The fields are zero at large negative
times, then $E_x$ rises and falls to zero at $t=0$ and repeats this pattern
with the opposite sign at positive times. The other two rise to a maximum
value at $t=0$ and then fall to zero at large positive time. The duration
in time of the pulse is of order $b/\ga v$ and becomes very short as $v
\rightarrow c$ because then $\ga$ becomes arbitrarily large. The maximum
field strengths are, for $E_y$, $\ga q/b^2$, and, for $B_z$, $\ga\be q/b^2$.
Notice that for a highly relativistic particle, $\be\rightarrow1$ and
$E_y\approx B_z$; also, the maximum pulse strength scales as $\ga$, which
means it becomes very large (but lasts a very short time) as the velocity
approaches $c$.
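The qualitative statements above can be confirmed numerically. In the sketch below (Python, units with $c=1$; the values of $q$, $b$, and $\be$ are arbitrary), $E_x$ vanishes at $t=0$, $E_y$ peaks at $\ga q/b^2$, and $B_z/E_y=\be$:

```python
import numpy as np

q, b, beta = 1.0, 1.0, 0.9
v = beta                                   # units with c = 1
gamma = 1.0 / np.sqrt(1.0 - beta**2)

def fields(t):
    """E_x, E_y, B_z at the point (0, b, 0) from a charge moving along x,
    passing the origin at t = 0."""
    denom = (b**2 + (gamma * v * t)**2) ** 1.5
    return (-gamma * q * v * t / denom,
            gamma * q * b / denom,
            gamma * beta * q * b / denom)

Ex0, Ey0, Bz0 = fields(0.0)
assert Ex0 == 0.0                          # E_x passes through zero at t = 0
assert np.isclose(Ey0, gamma * q / b**2)   # peak E_y = gamma q / b^2
assert np.isclose(Bz0, gamma * beta * q / b**2)
assert np.isclose(Bz0 / Ey0, beta)         # B_z / E_y = beta, so E_y ~ B_z as beta -> 1
```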
\vspace{6.0in}
\edo