# Einstein notation

In mathematics, especially in applications of linear algebra to physics, the Einstein notation or Einstein summation convention is a notational convention useful when dealing with coordinate formulae. It was introduced by Albert Einstein in 1916.[1]

According to this convention, when an index variable appears twice in a single term it implies summation of that term over all the possible values of the index. In typical applications, the index values are 1,2,3 (representing the three dimensions of physical Euclidean space), or 0,1,2,3 or 1,2,3,4 (for the elements of a basis in four-dimensional spacetime or Minkowski space), but they can range over any indexing set, including an infinite set. Thus in three dimensions

\( y = c_i x^i \, \)

means

\( y = \sum_{i=1}^3 c_i x^i = c_1 x^1 + c_2 x^2 + c_3 x^3 . \)
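As an illustration (not part of the original convention, which is purely notational), NumPy's `einsum` function uses subscript strings modeled directly on this summation convention; the numeric values below are arbitrary:

```python
import numpy as np

# Illustrative coefficients c_i and components x^i in three dimensions.
c = np.array([2.0, 0.0, 1.0])
x = np.array([1.0, 5.0, 3.0])

# y = c_i x^i: the repeated index i is summed over.
y = np.einsum('i,i->', c, x)

# The same quantity written as an explicit sum.
y_explicit = sum(c[i] * x[i] for i in range(3))
```

Both expressions evaluate to the same scalar.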

The upper indices are not exponents but generally relate to indexing of coordinates, coefficients or a basis. Thus, for example, \( x^2 \) should be read as "x-two", not "x squared", and typically \( (x^1, x^2, x^3) \) would be equivalent to the traditional (x, y, z).

In general relativity, a common convention is that the Greek alphabet and the Latin alphabet are used to distinguish between the index ranges 0,1,2,3 and 1,2,3 (usually Greek, μ, ν, ... for 0,1,2,3 and Latin, i, j, ... for 1,2,3). This should not be confused with a typographically similar convention used to distinguish between Einstein notation and the closely related but distinct basis-independent abstract index notation.

Einstein notation can be applied in slightly different ways. Typically, each index occurs once in an upper (superscript) and once in a lower (subscript) position in a term; however, the convention can be applied more generally to any repeated indices within a term.[2] When dealing with covariant and contravariant vectors, where the position of an index also indicates the type of vector, the first case usually applies; a covariant vector can only be contracted with a contravariant vector, corresponding to summation of the products of coefficients. On the other hand, when there is a fixed coordinate basis (or when not considering coordinate vectors), one may choose to use only subscripts; see below.

Introduction

Example of Einstein notation for a vector:

\( y = c_i x^i \, \)

In Einstein notation, vector indices are superscripts (e.g. \( x^i \)) and covector indices are subscripts (e.g. \( x_i \)). The position of the index has a specific meaning. It is important, of course, not to confuse a superscript with an exponent—all the relations with superscripts and subscripts are linear, they involve no power higher than the first. Here, the superscripted i above the symbol x represents an integer-valued index running from 1 to n.

The virtue of Einstein notation is that it represents the invariant quantities with a simple notation.

The basic idea of Einstein notation is that a covector and a vector can be combined to form a scalar:

\( y = c_1 x^1+c_2x^2+c_3x^3+ \cdots + c_nx^n \, \)

This is typically written as an explicit sum:

\( y = \sum_{i=1}^n c_ix^i \)

This sum is invariant under changes of basis, but the individual terms in the sum are not. This led Einstein to propose the convention that repeated indices imply the sum:

\( y = c_i x^i \, \)

This scalar, like any scalar, is invariant under a change of basis. When the basis is changed, the components of a vector change by a linear transformation described by a matrix.

As for covectors, they change by the inverse matrix. This is designed to guarantee that the linear function associated with the covector, the sum above, is the same no matter what the basis is.
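This invariance can be checked numerically. In the sketch below (an illustration with an arbitrary, generically invertible change-of-basis matrix `M`, not anything prescribed by the convention itself), vector components are transformed by the inverse matrix and covector components by the matrix, and the contraction comes out the same either way:

```python
import numpy as np

rng = np.random.default_rng(0)
c = rng.standard_normal(3)        # covector components c_i
x = rng.standard_normal(3)        # vector components x^i
M = rng.standard_normal((3, 3))   # a generic (hence invertible) change-of-basis matrix

# Vector components transform by the inverse matrix,
# covector components by the matrix itself.
x_new = np.linalg.inv(M) @ x
c_new = c @ M

# The contraction c_i x^i is the same in both bases:
# (c M)(M^-1 x) = c x.
s_old = np.einsum('i,i->', c, x)
s_new = np.einsum('i,i->', c_new, x_new)
```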

Vector representations

In linear algebra, Einstein notation can be used to distinguish between the components of vectors and of covectors, and between the vector and covector basis. Given a vector space V and its dual space V*: basis vectors \( e_i \in V \) carry lower indices, and components \( a^i \) of vectors carry upper indices.[Note] So a vector v may be expressed as:

\( v = a^i e_i = \begin{bmatrix}e_1&e_2&\cdots&e_n\end{bmatrix}\begin{bmatrix}a^1\\a^2\\\vdots\\a^n\end{bmatrix} \)

where the set of vectors \( \{ e_i \} \) is a basis for V.

Basis covectors \( e^i \in V^* \) carry upper indices, and components \( b_i \) of covectors carry lower indices.[Note] So a covector w may be expressed as:

\( w = b_i e^i = \begin{bmatrix}b_1 & b_2 & \cdots & b_n\end{bmatrix}\begin{bmatrix}e^1\\e^2\\\vdots\\e^n\end{bmatrix} \)

where the set of covectors \( \{ e^i \} \) is the dual basis for V*, with the defining relation \( e^i(e_j) = \delta^i_j \).

Note that the \( e_i \) are vectors, the \( e^i \) are covectors, and the \( a^i \) and \( b_i \) are scalars. The product returns a vector v or covector w, respectively. Since basis vectors \( e_i \) are given lower indices and coordinates \( a^i \) are labeled with upper indices, summation notation suggests pairing them (in the obvious way) to express the vector.

In a given basis, the coefficient \( a^i \) of \( e_i \) for a vector v is the value of the corresponding dual-basis covector acting on the vector: \( a^i = e^i(v) \). Similarly, the coefficient \( b_i \) of \( e^i \) for a covector w is the value of the covector acting on the corresponding basis vector: \( b_i = w(e_i) \).

In terms of covariance and contravariance of vectors, upper indices represent components of contravariant vectors (vectors), while lower indices represent components of covariant vectors (covectors): such components transform contravariantly (resp., covariantly) with respect to change of basis. In recognition of this fact, the following notation uses the same symbol both for a (co)vector and its components, as in:

\( \, v = v^i e_i \)

\( \, w = w_i e^i \)

Here \( v^i \) means the ith component of the vector v; it does not mean "the ith covector v". It is w that is the covector, and \( w_i \) are its components.

Mnemonics

In the above example, vectors are represented as n×1 matrices (column vectors), while covectors are represented as 1×n matrices (row covectors). The opposite convention is also used. For example, the DirectX API uses row vectors.[3]

When using the column vector convention

"Upper indices go up to down; lower indices go left to right"

You can stack vectors (column matrices) side-by-side:

\( \begin{bmatrix}v_1 & \cdots & v_k\end{bmatrix}. \)

Hence the lower index indicates which column you are in.

You can stack covectors (row matrices) top-to-bottom:

\( \begin{bmatrix}w^1 \\ \vdots \\ w^k\end{bmatrix} \)

Hence the upper index indicates which row you are in.

Superscripts and subscripts vs. only subscripts

In the presence of a non-degenerate form (an isomorphism \( V \to V^* \) , for instance a Riemannian metric or Minkowski metric), one can raise and lower indices.

A basis gives such a form (via the dual basis), hence when working on \( \mathbb{R}^n \) with a Euclidean metric and a fixed orthonormal basis, one can work with only subscripts.

However, if one changes coordinates, the way that coefficients change depends on the variance of the object, and one cannot ignore the distinction; see covariance and contravariance of vectors.

Common operations in this notation

In Einstein notation, the usual element reference \( A_{mn} \) for the mth row and nth column of matrix A becomes \( A^m_n \). The common operations can then be written as follows.

Inner product

Given a row vector v and a column vector u of the same size, we can take the inner product v_i u^i, which is a scalar: it's evaluating the covector on the vector.
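As an illustration (values arbitrary), this contraction is exactly what `einsum` computes when the same subscript appears on both operands:

```python
import numpy as np

v = np.array([1.0, 0.0, 2.0])  # covector (row) components v_i
u = np.array([3.0, 4.0, 5.0])  # vector (column) components u^i

# v_i u^i: evaluating the covector on the vector yields a scalar.
s = np.einsum('i,i->', v, u)
```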

Multiplication of a vector by a matrix

Given a matrix \( A^i_j \) and a (column) vector \( v^j \), the coefficients of the product \( \mathbf{A}v \) are given by \( A^i_j v^j \).

Similarly, \( v^\mathrm{T} \mathbf{A}^\mathrm{T} \) is equivalent to \( A^j_i v_j. \)

Note, however, that notation like \( A^j_i v_j \) does not by itself make clear which index labels the row and which the column. For this reason it is sometimes refined to

\( Av={A^i}_j v^j \)

to keep track of which is column and which is row. In the notation \( {A^i}_j \), the index i (the first index) labels the row, and the index j (the second index) labels the column.
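A small numerical sketch (illustrative values) of the matrix-vector product in this index notation:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # A^i_j: i labels the row, j the column
v = np.array([5.0, 6.0])     # v^j

# (Av)^i = A^i_j v^j: the repeated index j is summed, i remains free.
Av = np.einsum('ij,j->i', A, v)
```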

Matrix multiplication

We can represent matrix multiplication as:

\( C^i_k = A^i_j \, B^j_k \)

This expression is equivalent to the more conventional (and less compact) notation:

\( \mathbf{C}_{ik} = (\mathbf{A} \, \mathbf{B})_{ik} =\sum_{j=1}^N A_{ij} B_{jk} \)
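The same contraction pattern, written as an `einsum` subscript string (illustrative values):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

# C^i_k = A^i_j B^j_k: j is the summation index; i and k remain free.
C = np.einsum('ij,jk->ik', A, B)
```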

Trace

Given a square matrix \( A^i_j \), summing over a common index \( A^i_i \) yields the trace.
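In `einsum` notation, repeating the same subscript in both slots of one array sums its diagonal (illustrative values):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# A^i_i: both slots carry the same index, so the diagonal entries are summed.
tr = np.einsum('ii->', A)
```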

Outer product

The outer product of the column vector u by the row vector v yields an M × N matrix A:

\( \mathbf{A} = \mathbf{u} \, \mathbf{v} \)

In Einstein notation, we have:

\( A^i_j = u^i \, v_j = (uv)^i_j \)

Since i and j represent two different indices, and in this case over two different ranges M and N respectively, the indices are not eliminated by the multiplication. Both indices survive the multiplication to become the two indices of the newly-created matrix A of rank 1.
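A sketch with M = 3 and N = 2 (illustrative values): no subscript repeats, so nothing is summed and both indices survive:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])   # u^i, i ranging over M = 3 values
v = np.array([4.0, 5.0])        # v_j, j ranging over N = 2 values

# A^i_j = u^i v_j: no index is repeated, so there is no summation;
# both indices become the two indices of the matrix A.
A = np.einsum('i,j->ij', u, v)
```

The resulting 3 × 2 matrix has matrix rank 1, as stated above.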

Coefficients on tensors and related

Given a tensor field and a basis (of linearly independent vector fields), the coefficients of the tensor field in that basis can be computed by evaluating the tensor on a suitable combination of basis and dual-basis elements; the coefficients inherit the correct indexing. We list notable examples.

Throughout, let \( e_i \) be a basis of vector fields (a moving frame).

(covariant) metric tensor

\( g_{ij} = g(e_i,e_j) \)

(contravariant) metric tensor

\( g^{ij} = g(e^i,e^j) \)

Torsion tensor (using the below)

\( T^c_{ab} = \Gamma^c_{ab} - \Gamma^c_{ba}-\gamma^c_{ab}, \)

which follows from the formula

\( T = \nabla_X Y - \nabla_Y X - [X,Y]. \)

Riemann curvature tensor

\( {R^\rho}_{\sigma\mu\nu} = dx^\rho(R(\partial_{\mu},\partial_{\nu})\partial_{\sigma}) \)

This also applies for some operations that are not tensorial, for instance:

Christoffel symbols

\( \nabla_ie_j=\Gamma_{ij}^ke_k \)

where \( \nabla_i e_j = \nabla_{e_i} e_j \) is the covariant derivative of \( e_j \) in the direction \( e_i \). Equivalently,

\( \Gamma_{ij}^k = e^k\nabla_ie_j \)

commutator coefficients

\( [e_i,e_j] = \gamma_{ij}^k e_k \)

where \( [e_i,e_j] \) is the Lie bracket. Equivalently,

\( \gamma_{ij}^k = e^k[e_i,e_j]. \)

Vector dot product

In mechanics and engineering, vectors in 3D space are often described in relation to orthogonal unit vectors i, j and k.

\( \mathbf{u} = u_x \mathbf{i} + u_y \mathbf{j} + u_z \mathbf{k} \)

If the basis vectors i, j, and k are instead expressed as \( e_1, e_2 \) , and \( e_3 \), a vector can be expressed in terms of a summation:

\( \mathbf{u} = u^1 \mathbf{e}_1 + u^2 \mathbf{e}_2 + u^3 \mathbf{e}_3 = \sum_{i = 1}^3 u^i \mathbf{e}_i \)

In Einstein notation, the summation symbol is omitted since the index i is repeated once as an upper index and once as a lower index, and we simply write

\( \mathbf{u} = u^i \mathbf{e}_i \)

Using \( e_1, e_2 \) , and \( e_3 \) instead of i, j, and k, together with Einstein notation, we obtain a concise algebraic presentation of vector and tensor equations. For example,

\( \mathbf{u} \cdot \mathbf{v} = \left( \sum_{i = 1}^3 u^i \mathbf{e}_i \right) \cdot \left( \sum_{j = 1}^3 v^j \mathbf{e}_j \right) = (u^i \mathbf{e}_i) \cdot (v^j \mathbf{e}_j)= u^i v^j ( \mathbf{e}_i \cdot \mathbf{e}_j ). \)

Since

\( \mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij} \)

where \( \delta_{ij} \) is the Kronecker delta, which is equal to 1 when i = j, and 0 otherwise, we find

\( \mathbf{u} \cdot \mathbf{v} = u^i v^j\delta_{ij}. \)

One can use \( \delta_{ij} \) to lower indices of the vectors; namely, \( u_i=\delta_{ij}u^j \) and \( v_i=\delta_{ij}v^j \). Then

\( \mathbf{u} \cdot \mathbf{v} = u^i v^j\delta_{ij}= u^i v_i = u_j v^j \)

Note that, despite \( u^i=u_i \) for any fixed i, it is incorrect to write

\( \mathbf{u} \cdot \mathbf{v} = u^iv^i, \)

since on the right hand side the index i is repeated both times as an upper index and so there is no summation over i according to the Einstein convention. Rather, one should explicitly write the summation:

\( \mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^3u^iv^i. \)
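The chain of identities above can be checked numerically (illustrative values; in Euclidean space \( \delta_{ij} \) is just the identity matrix):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])  # u^i
v = np.array([4.0, 5.0, 6.0])  # v^j
delta = np.eye(3)              # Kronecker delta delta_ij

# u . v = u^i v^j delta_ij ...
dot1 = np.einsum('i,j,ij->', u, v, delta)

# ... which equals u^i v_i after lowering an index with delta:
v_lower = np.einsum('ij,j->i', delta, v)  # v_i = delta_ij v^j
dot2 = np.einsum('i,i->', u, v_lower)
```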

Vector cross product

For the cross product,

\( \mathbf{u} \times \mathbf{v}= \left( \sum_{j = 1}^3 u^j \mathbf{e}_j \right) \times \left( \sum_{k = 1}^3 v^k \mathbf{e}_k \right) = (u^j \mathbf{e}_j ) \times (v^k \mathbf{e}_k) = u^j v^k (\mathbf{e}_j \times \mathbf{e}_k ) = u^j v^k\epsilon^i_{jk} \mathbf{e}_i \)

where \( \mathbf{e}_j \times \mathbf{e}_k = \epsilon^i_{jk} \mathbf{e}_i \) and \( \epsilon^i_{jk}=\delta^{il}\epsilon_{ljk} \), with \( \epsilon_{ijk} \) the Levi-Civita symbol defined by:

\( \epsilon_{ijk} = \left\{ \begin{matrix} 0 & \mbox{unless } i,j,k \mbox{ are distinct}\\ +1 & \mbox{if } (i,j,k) \mbox{ is an even permutation of } (1,2,3)\\ -1 & \mbox{if } (i,j,k) \mbox{ is an odd permutation of } (1,2,3) \end{matrix} \right. \)

One then recovers

\( \mathbf{u} \times \mathbf{v} = (u^2 v^3 - u^3 v^2) \mathbf{e}_1 + (u^3 v^1 - u^1 v^3) \mathbf{e}_2 + (u^1 v^2 - u^2 v^1) \mathbf{e}_3 \)

from

\( \mathbf{u} \times \mathbf{v} = u^j v^k \delta^{il}\epsilon_{ljk}\mathbf{e}_i = \epsilon^i_{jk} u^j v^k\mathbf{e}_i = \sum_{i = 1}^3 \sum_{j = 1}^3 \sum_{k = 1}^3 \epsilon^i_{jk} u^j v^k\mathbf{e}_i . \)

In other words, if \( \mathbf{w} = \mathbf{u} \times \mathbf{v} \), then \( w^i \mathbf{e}_i = \epsilon^i_{jk} u^j v^k \mathbf{e}_i \), so that \( w^i = \epsilon^i_{jk} u^j v^k \).
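As a concrete check (illustrative values), the Levi-Civita symbol can be built directly from its definition and contracted against two vectors; in the orthonormal case raising the first index changes nothing:

```python
import numpy as np

# Levi-Civita symbol epsilon_ijk from its definition.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0   # even permutations of the indices
    eps[i, k, j] = -1.0  # odd permutations (one transposition away)

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# w^i = epsilon^i_jk u^j v^k: j and k are summed, i remains free.
w = np.einsum('ijk,j,k->i', eps, u, v)
```

The result agrees with the componentwise cross-product formula above.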

Abstract definitions

In the traditional usage, one has in mind a vector space V of finite dimension n, and a specific basis of V. We can write the basis vectors as \( e_1, e_2, \dots, e_n \). Then if v is a vector in V, it has coordinates \( v^1,\dots,v^n \) relative to this basis.

The basic rule is:

\( \mathbf{v} = v^i\mathbf{e}_i. \)

In this expression, it is understood that the term on the right side is summed as i goes from 1 to n: the index i does not appear on the left side of the equation, so it is not a free index. (Or, using Einstein's convention, because the index i appears twice in the term.)

An index that is summed over, such as i here, is called a summation index. It is also known as a dummy index, since the result does not depend on it; thus we could equally write, for example:

\( \mathbf{v} = v^j\mathbf{e}_j. \)

An index that is not summed over is a free index and should be found in each term of the equation or formula if it appears in any term. Compare dummy indices and free indices with free variables and bound variables.

The value of the Einstein convention is that it applies to other vector spaces built from V using the tensor product and duality. For example, \( V \otimes V \), the tensor product of V with itself, has a basis consisting of tensors of the form \( \mathbf{e}_{ij} = \mathbf{e}_i \otimes \mathbf{e}_j \). Any tensor \( \mathbf{T} \) in \( V \otimes V \) can be written as:

\( \mathbf{T} = T^{ij}\mathbf{e}_{ij}. \)

V*, the dual of V, has a basis \( e^1, e^2, \dots, e^n \) which obeys the rule

\( \mathbf{e}^i (\mathbf{e}_j) = \delta^i_j. \)

Here δ is the Kronecker delta, so \( \delta^i_j \) is 1 if i = j and 0 otherwise.

As

\( \mathrm{Hom}(V,W) = V^* \otimes W \)

the row-column coordinates on a matrix correspond to the upper-lower indices on the tensor product.

Examples

Einstein summation is clarified with the help of a few simple examples. Consider four-dimensional spacetime, where indices run from 0 to 3:

\( a^\mu b_\mu = a^0 b_0 + a^1 b_1 + a^2 b_2 + a^3 b_3 \)

\( a^{\mu\nu} b_\mu = a^{0\nu} b_0 + a^{1\nu} b_1 + a^{2\nu} b_2 + a^{3\nu} b_3. \)

The above example is one of contraction, a common tensor operation. The product \( a^{\mu\nu}b_{\mu} \) yields a new tensor, formed by summing the upper index of a against the lower index of b. Typically the resulting tensor is renamed with the contracted indices removed:

\( s^{\nu} = a^{\mu\nu}b_{\mu}. \)
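A numerical sketch of this contraction (random illustrative values; spacetime indices run over 4 values):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((4, 4))  # a^{mu nu}, indices 0..3
b = rng.standard_normal(4)       # b_mu

# s^nu = a^{mu nu} b_mu: the contracted index mu disappears,
# leaving a tensor with the single free index nu.
s = np.einsum('mn,m->n', a, b)
```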

For a familiar example, consider the dot product of two vectors a and b. The dot product is defined simply as summation over the indices of a and b:

\( \mathbf{a}\cdot\mathbf{b} = a^{\alpha}b_{\alpha} = a^0 b_0 + a^1 b_1 + a^2 b_2 + a^3 b_3, \)

which is our familiar formula for the vector dot product. Remember it is sometimes necessary to change the components of a in order to lower its index; however, this is not necessary in Euclidean space, or any space with a metric equal to its inverse metric (e.g., flat spacetime).

See also

Abstract index notation

Bra-ket notation

Penrose graphical notation

Kronecker delta

Levi-Civita symbol

Notes

This applies only for numerical indices. The situation is the opposite for abstract indices. Then, vectors themselves carry upper abstract indices and covectors carry lower abstract indices, as per the example in the introduction of this article. Elements of a basis of vectors may carry a lower numerical index and an upper abstract index.

References

^ Einstein, Albert (1916). "The Foundation of the General Theory of Relativity" (PDF). Annalen der Physik. Retrieved 2006-09-03.

^ "Einstein Summation". Wolfram Mathworld. Retrieved 13 April 2011.

^ Dunn, Fletcher; Parberry, Ian (2002). 3D Math Primer for Graphics and Game Development. Wordware. pp. 90–91.

Bibliography

Kuptsov, L.P. (2001), "Einstein rule", in Hazewinkel, Michiel, Encyclopedia of Mathematics, Springer, ISBN 978-1556080104.

External links

Rawlings, Steve (2007-02-01). "Lecture 10 - Einstein Summation Convention and Vector Identities". Oxford University.
