# Matrix calculus

This article uses a different definition of vector and matrix calculus from the one often encountered in estimation theory and pattern recognition. The resulting equations will therefore appear transposed when compared with the equations used in textbooks in those fields.

## Notation

Let M(n,m) denote the space of real n×m matrices with n rows and m columns, whose elements will be denoted by boldface uppercase letters F, X, Y, etc. An element of M(n,1), that is, a column vector, is denoted with a boldface lowercase letter x, while x^T denotes its transpose row vector. An element of M(1,1) is a scalar, denoted a, b, c, f, t, etc. All functions are assumed to be of differentiability class C^1 unless otherwise noted.

## Vector calculus

Main article: Vector calculus

Because the space M(n,1) is identified with the Euclidean space R^n and M(1,1) is identified with R, the notations developed here can accommodate the usual operations of vector calculus.

• The tangent vector to a curve x : R → R^n is
$\frac{\partial \mathbf{x}}{\partial t} = \begin{bmatrix} \frac{\partial x_1}{\partial t} \\ \vdots \\ \frac{\partial x_n}{\partial t} \end{bmatrix}.$
• The gradient of a scalar function f : R^n → R is
$\frac{\partial f}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial f}{\partial x_1} & \cdots & \frac{\partial f}{\partial x_n} \end{bmatrix}.$
The directional derivative of f in the direction of v is then
$\nabla_\mathbf{v} f = \frac{\partial f}{\partial \mathbf{x}} \mathbf{v}.$
• The pushforward or differential of a function f : R^m → R^n is described by the Jacobian matrix
$\frac{\partial \mathbf{f}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_1} & \cdots & \frac{\partial f_n}{\partial x_m} \end{bmatrix}.$
The pushforward along f of a vector v in R^m is
$d\,\mathbf{f}(\mathbf{v}) = \frac{\partial \mathbf{f}}{\partial \mathbf{x}} \mathbf{v}.$
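
The Jacobian and pushforward above can be checked numerically. The following is a minimal sketch, not part of the original article: the map `f`, the point `x0`, the vector `v`, and the helpers `jacobian` and `matvec` are all hypothetical names chosen for illustration, and the partial derivatives are approximated by central differences.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the n x m Jacobian of f : R^m -> R^n at x by central differences."""
    m = len(x)
    n = len(f(x))
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(n):
            # Row i indexes the output component, column j the input component.
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


def matvec(A, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


# Hypothetical example map f : R^2 -> R^3.
def f(x):
    return [x[0] * x[1], x[0] ** 2, 3.0 * x[1]]


x0 = [1.0, 2.0]
v = [0.5, -1.0]

J = jacobian(f, x0)    # 3 x 2 matrix of partial derivatives
push = matvec(J, v)    # pushforward of v along f: (df/dx) v
```

With n = 1 the same routine returns a single row, which is the gradient as a row vector, and `matvec` then gives the directional derivative.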


## Matrix calculus

For the purposes of defining derivatives of simple functions, not much changes with matrix spaces; the space of n×m matrices is after all isomorphic as a vector space to R^{nm}. The three derivatives familiar from vector calculus have close analogues here, though beware the complications that arise in the identities below.

• The tangent vector of a curve F : R → M(n,m) is
$\frac{\partial \mathbf{F}}{\partial t} = \begin{bmatrix} \frac{\partial F_{1,1}}{\partial t} & \cdots & \frac{\partial F_{1,m}}{\partial t} \\ \vdots & \ddots & \vdots \\ \frac{\partial F_{n,1}}{\partial t} & \cdots & \frac{\partial F_{n,m}}{\partial t} \end{bmatrix}.$
• The gradient of a scalar function f : M(n,m) → R is
$\frac{\partial f}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial f}{\partial X_{1,1}} & \cdots & \frac{\partial f}{\partial X_{n,1}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f}{\partial X_{1,m}} & \cdots & \frac{\partial f}{\partial X_{n,m}} \end{bmatrix}.$
Notice that the indexing of the gradient with respect to X is transposed as compared with the indexing of X. The directional derivative of f in the direction of the matrix Y is given by
$\nabla_\mathbf{Y} f = \operatorname{tr} \left(\frac{\partial f}{\partial \mathbf{X}} \mathbf{Y}\right),$
where tr denotes the trace.
• The differential or the matrix derivative of a function F : M(n,m) → M(p,q) is an element of M(p,q) ⊗ M(m,n), a fourth rank tensor (the reversal of m and n here indicates the dual space of M(n,m)). In short it is an m×n matrix each of whose entries is a p×q matrix:
$\frac{\partial \mathbf{F}}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial \mathbf{F}}{\partial X_{1,1}} & \cdots & \frac{\partial \mathbf{F}}{\partial X_{n,1}} \\ \vdots & \ddots & \vdots \\ \frac{\partial \mathbf{F}}{\partial X_{1,m}} & \cdots & \frac{\partial \mathbf{F}}{\partial X_{n,m}} \end{bmatrix},$
and note that each ∂F/∂X_{i,j} is a p×q matrix defined as above. Note also that this matrix has its indexing transposed: m rows and n columns. The pushforward along F of an n×m matrix Y in M(n,m) is then
$d\mathbf{F}(\mathbf{Y}) = \operatorname{tr}\left(\frac{\partial \mathbf{F}}{\partial \mathbf{X}} \mathbf{Y}\right).$
Note that this definition encompasses all of the preceding definitions as special cases.
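
The transposed indexing of the gradient and the trace formula for the directional derivative can be illustrated with a small numerical sketch. Everything named here is an assumption made for the example rather than something from the article: the scalar function `f` (sum of squared entries), the matrices `X` and `Y`, and the helpers `gradient`, `matmul`, `trace`, and `shifted`.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    inner = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(len(B[0]))] for i in range(len(A))]


def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


def gradient(f, X, h=1e-6):
    """For an n x m matrix X, return the m x n matrix G with G[j][i] ~ df/dX[i][j]."""
    n, m = len(X), len(X[0])
    G = [[0.0] * n for _ in range(m)]
    for i in range(n):
        for j in range(m):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            G[j][i] = (f(Xp) - f(Xm)) / (2 * h)   # note the swapped indices
    return G


# Hypothetical scalar function on 2 x 3 matrices: sum of squared entries.
def f(X):
    return sum(x * x for row in X for x in row)


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # point in M(2,3)
Y = [[1.0, 0.0, -1.0], [0.5, 2.0, 0.0]]   # direction in M(2,3)

G = gradient(f, X)                  # 3 x 2, transposed relative to X
dir_deriv = trace(matmul(G, Y))     # tr((df/dX) Y)


def shifted(s):
    """Return X + s * Y."""
    return [[X[i][j] + s * Y[i][j] for j in range(3)] for i in range(2)]


# Independent check: difference quotient of f along the line X + s Y.
t = 1e-6
direct = (f(shifted(t)) - f(shifted(-t))) / (2 * t)
```

Because the gradient is stored as an m×n matrix, the product with the n×m direction Y is square, which is what makes the trace well defined.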


## Identities

Note that matrix multiplication is not commutative, so in these identities the order must not be changed.

• Chain rule: If Z is a function of Y, which in turn is a function of X, then
$\frac{\partial \mathbf{Z}}{\partial \mathbf{X}} = \frac{\partial \mathbf{Z}}{\partial \mathbf{Y}} \frac{\partial \mathbf{Y}}{\partial \mathbf{X}}$
• Product rule:
$\frac{\partial (\mathbf{Y}^T \mathbf{Z})}{\partial \mathbf{X}} = (\mathbf{Z}^T) \frac{\partial \mathbf{Y}}{\partial \mathbf{X}} + (\mathbf{Y}^T) \frac{\partial \mathbf{Z}}{\partial \mathbf{X}}$
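
As an illustration of the chain rule and of why the order of the factors matters, the sketch below compares the Jacobian of a composition with the product of the two Jacobians in the vector case. The maps `g` and `h_map`, the point `x0`, and the helper functions are hypothetical, chosen only for this example, and derivatives are approximated by central differences.

```python
def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian of f : R^m -> R^n (rows: outputs, columns: inputs)."""
    m, n = len(x), len(f(x))
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# Hypothetical maps: y = g(x) and z = h_map(y), both R^2 -> R^2.
def g(x):
    return [x[0] + x[1] ** 2, x[0] * x[1]]


def h_map(y):
    return [y[0] * y[1], y[0] - 2.0 * y[1]]


def z_of_x(x):
    return h_map(g(x))


x0 = [1.0, 2.0]
y0 = g(x0)

dz_dy = jacobian(h_map, y0)     # dZ/dY evaluated at y0 = g(x0)
dy_dx = jacobian(g, x0)         # dY/dX evaluated at x0

J_chain = matmul(dz_dy, dy_dx)      # (dZ/dY)(dY/dX), the order in the chain rule
J_direct = jacobian(z_of_x, x0)     # Jacobian of the composition, computed directly
J_swapped = matmul(dy_dx, dz_dy)    # factors in the wrong order
```

In this example `J_chain` and `J_direct` agree up to the finite-difference error, while `J_swapped` is a different matrix.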


## Examples

### Derivative of linear functions

This section lists some commonly used vector derivative formulas for linear expressions evaluating to a scalar or a vector.

$\frac{\partial \; \mathbf{a}^T \mathbf{x}}{\partial \; \mathbf{x}} = \frac{\partial \; \mathbf{x}^T \mathbf{a}}{\partial \; \mathbf{x}} = \mathbf{a}^T$
$\frac{\partial \; \mathbf{A}\mathbf{x}}{\partial \; \mathbf{x}} = \mathbf{A}$

This section lists some commonly used vector derivative formulas for quadratic matrix equations evaluating to a scalar.

$\frac{\partial \; \mathbf{x}^T \mathbf{A} \mathbf{x}}{\partial \; \mathbf{x}} = \mathbf{x}^T (\mathbf{A}^T + \mathbf{A})$
$\frac{\partial \; (\mathbf{A}\mathbf{x} + \mathbf{b})^T \mathbf{C} (\mathbf{D}\mathbf{x} + \mathbf{e})}{\partial \; \mathbf{x}} = (\mathbf{D}\mathbf{x} + \mathbf{e})^T \mathbf{C}^T \mathbf{A} + (\mathbf{A}\mathbf{x} + \mathbf{b})^T \mathbf{C} \mathbf{D}$
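
The quadratic-form identity can be sanity-checked numerically. This is a small sketch under assumed values: the matrix `A`, the point `x`, and the helper names `quad_form` and `row_gradient` are hypothetical and not part of the article.

```python
def quad_form(A, x):
    """Evaluate the scalar x^T A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def row_gradient(f, x, h=1e-6):
    """Central-difference gradient of a scalar function, returned as a row vector."""
    g = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g


A = [[1.0, 2.0], [3.0, 4.0]]   # deliberately not symmetric
x = [1.0, -1.0]

# Left-hand side: numerical derivative of x^T A x with respect to x.
numeric = row_gradient(lambda v: quad_form(A, v), x)

# Right-hand side: x^T (A^T + A), computed entry by entry.
n = len(x)
formula = [sum(x[i] * (A[j][i] + A[i][j]) for i in range(n)) for j in range(n)]
```

Using a non-symmetric `A` matters here: for symmetric `A` the formula collapses to 2 x^T A, which would hide the role of the transpose.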

Related to this is the derivative of the Euclidean norm.
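
As a worked step (a sketch derived from the quadratic formula above rather than quoted from the source), take A to be the identity matrix, so that the quadratic form is the squared norm, and assume x is nonzero. Then

$\frac{\partial \; \mathbf{x}^T \mathbf{x}}{\partial \; \mathbf{x}} = \mathbf{x}^T (\mathbf{I}^T + \mathbf{I}) = 2 \mathbf{x}^T,$

and applying the scalar chain rule to $\|\mathbf{x}\| = (\mathbf{x}^T \mathbf{x})^{1/2}$ gives

$\frac{\partial \; \|\mathbf{x}\|}{\partial \; \mathbf{x}} = \frac{1}{2 \|\mathbf{x}\|} \cdot 2 \mathbf{x}^T = \frac{\mathbf{x}^T}{\|\mathbf{x}\|}, \qquad \mathbf{x} \neq \mathbf{0}.$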

### Derivative of matrix traces

This section shows examples of matrix differentiation of common trace equations.

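$\frac{\partial \; \operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B})}{\partial \; \mathbf{X}} = \frac{\partial \; \operatorname{tr}(\mathbf{B}^T \mathbf{X}^T \mathbf{A}^T)}{\partial \; \mathbf{X}} = \mathbf{A}^T \mathbf{B}^T$

Note that this result is indexed in the same way as X. With the transposed indexing used for the gradient in the Matrix calculus section, the same derivative is written BA.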
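
The trace derivative can be checked entry by entry. The sketch below uses hypothetical matrices `A`, `B`, and `X` and hypothetical helper names; it computes the partial derivatives of tr(AXB) by central differences and stores them in both layouts: indexed like X, where they match A^T B^T, and with the transposed indexing of the gradient defined earlier, where they match BA.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def partials(f, X, h=1e-6):
    """Return P with P[i][j] ~ df/dX[i][j], i.e. indexed the same way as X."""
    n, m = len(X), len(X[0])
    P = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            P[i][j] = (f(Xp) - f(Xm)) / (2 * h)
    return P


A = [[1.0, 2.0], [3.0, 4.0]]                  # 2 x 2
B = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0]]     # 3 x 2
X = [[0.5, -1.0, 2.0], [1.5, 0.0, 3.0]]       # 2 x 3


def f(X):
    return trace(matmul(matmul(A, X), B))


same_layout = partials(f, X)                  # 2 x 3, indexed like X
transposed_layout = transpose(same_layout)    # 3 x 2, indexed like the gradient

At_Bt = matmul(transpose(A), transpose(B))    # A^T B^T, 2 x 3
BA = matmul(B, A)                             # B A, 3 x 2
```

The two results are transposes of each other, so they carry the same information; only the indexing convention differs.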

## Relation to other derivatives

There are other commonly used definitions for derivatives in multivariable spaces. For topological vector spaces, the most familiar is the Fréchet derivative, which makes use of a norm. In the case of matrix spaces, there are several matrix norms available, all of which are equivalent since the space is finite-dimensional. However, the matrix derivative defined in this article makes no use of any topology on M(n,m). It is defined solely in terms of partial derivatives, which are sensitive only to variations in a single dimension at a time, and thus are not bound by the full differentiable structure of the space. For example, it is possible for a map to have all partial derivatives exist at a point, and yet not be continuous in the topology of the space. See for example Hartogs' theorem. The matrix derivative is not a special case of the Fréchet derivative for matrix spaces, but rather a convenient notation for keeping track of many partial derivatives for doing calculations, though in the case that a function is Fréchet differentiable, the two derivatives will agree.
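
A standard textbook example, added here as an illustration and not taken from the article, shows how partial derivatives can exist at a point where the function is not continuous. The function `f` below is an assumption chosen for the sketch: it equals xy / (x² + y²) away from the origin and 0 at the origin.

```python
def f(x, y):
    """xy / (x^2 + y^2) away from the origin, and 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)


h = 1e-6

# Partial derivatives at the origin: f vanishes on both coordinate axes,
# so both difference quotients are zero.
df_dx = (f(h, 0.0) - f(-h, 0.0)) / (2 * h)
df_dy = (f(0.0, h) - f(0.0, -h)) / (2 * h)

# Values along the diagonal y = x as the point approaches the origin.
diagonal_values = [f(t, t) for t in (1e-1, 1e-3, 1e-6)]
```

Both partial derivatives at the origin are zero, yet along the diagonal the function stays at 1/2 however close the point is to the origin, so f is not continuous there and cannot be Fréchet differentiable at that point.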

## Usages

Matrix calculus is used for deriving optimal stochastic estimators, often involving the use of Lagrange multipliers. This includes the derivation of the Kalman filter and the Wiener filter.

## Alternatives

The tensor index notation with its Einstein summation convention is very similar to the matrix calculus, except that one writes only a single component at a time. It has the advantage that one can easily manipulate tensors of arbitrarily high rank, whereas tensors of rank higher than two are quite unwieldy in matrix notation. Note that a matrix can be considered simply a tensor of rank two.
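
To make the comparison concrete, here is how two formulas from this article might be written one component at a time, with summation implied over every repeated index. The index names i, j, k are arbitrary choices for this sketch. The chain rule for vector-valued functions becomes

$\frac{\partial z_i}{\partial x_k} = \frac{\partial z_i}{\partial y_j} \frac{\partial y_j}{\partial x_k},$

and the directional derivative of a scalar function f of a matrix X in the direction Y becomes

$\nabla_\mathbf{Y} f = \frac{\partial f}{\partial X_{i,j}} Y_{i,j}.$

In this form the factors are ordinary scalars and can be written in any order; the bookkeeping that matrix notation does through the order of multiplication and the transposed indexing is carried by the index names instead.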
