Scatter matrix
In multivariate statistics and probability theory, the scatter matrix is a statistic that is used to make estimates of the covariance matrix, for instance of the multivariate normal distribution.
Definition
Given n samples of m-dimensional data, represented as the m-by-n matrix \( X=[\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_n] \), the sample mean is
\( \overline{\mathbf{x}} = \frac{1}{n}\sum_{j=1}^n \mathbf{x}_j \)
where \(\mathbf{x}_j \) is the jth column of \( X\,. \)
The scatter matrix is the m-by-m positive semi-definite matrix
\(S = \sum_{j=1}^n (\mathbf{x}_j-\overline{\mathbf{x}})(\mathbf{x}_j-\overline{\mathbf{x}})^T = \sum_{j=1}^n (\mathbf{x}_j-\overline{\mathbf{x}})\otimes(\mathbf{x}_j-\overline{\mathbf{x}}) = \left( \sum_{j=1}^n \mathbf{x}_j \mathbf{x}_j^T \right) - n \overline{\mathbf{x}} \overline{\mathbf{x}}^T \)
where the superscript T denotes the matrix transpose and \( \otimes \) denotes the outer product. The scatter matrix may be expressed more succinctly as
\(S = X\,C_n\,X^T \)
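where \( C_n = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^T \) is the n-by-n centering matrix, with \( I_n \) the n-by-n identity matrix and \( \mathbf{1} \) the column vector of n ones.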
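The three expressions for the scatter matrix can be checked against one another numerically. The following is a minimal sketch in plain Python, not a reference implementation: the data in `X` are made up for illustration, all helper names (`scatter_outer`, `scatter_moment`, `scatter_centering`, and so on) are hypothetical, and the centering matrix is assumed to take its usual form, the identity minus 1/n times the all-ones matrix. In practice a numerical library would normally replace the hand-written matrix helpers.

```python
# Made-up toy data: n = 3 samples of m = 2 dimensional data,
# stored as the columns of the m-by-n matrix X.
X = [[1.0, 3.0, 2.0],
     [2.0, 2.0, 5.0]]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def sample_mean(X):
    """Mean of the columns of X, returned as a list of length m."""
    n = len(X[0])
    return [sum(row) / n for row in X]


def scatter_outer(X):
    """Sum over j of the outer product of (x_j - mean) with itself."""
    m, n = len(X), len(X[0])
    mean = sample_mean(X)
    S = [[0.0] * m for _ in range(m)]
    for j in range(n):
        d = [X[i][j] - mean[i] for i in range(m)]  # centered j-th column
        for a in range(m):
            for b in range(m):
                S[a][b] += d[a] * d[b]
    return S


def scatter_moment(X):
    """(Sum over j of x_j x_j^T) minus n * mean * mean^T."""
    m, n = len(X), len(X[0])
    mean = sample_mean(X)
    XXt = matmul(X, transpose(X))  # equals the sum of x_j x_j^T
    return [[XXt[a][b] - n * mean[a] * mean[b] for b in range(m)]
            for a in range(m)]


def centering_matrix(n):
    """Assumed form of C_n: identity minus (1/n) times the all-ones matrix."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]


def scatter_centering(X):
    """S = X C_n X^T."""
    n = len(X[0])
    return matmul(matmul(X, centering_matrix(n)), transpose(X))


S1 = scatter_outer(X)
S2 = scatter_moment(X)
S3 = scatter_centering(X)
print(S1, S2, S3)  # each is [[2, 0], [0, 6]] up to floating-point rounding
```

The second form needs only the running sums of \( \mathbf{x}_j \) and \( \mathbf{x}_j \mathbf{x}_j^T \), so it can be accumulated in a single pass over the data, although subtracting two large, nearly equal quantities can lose precision when the mean is large relative to the spread.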
Application
The maximum likelihood estimate, given n samples, for the covariance matrix of a multivariate normal distribution can be expressed as the normalized scatter matrix
\(C_{ML}=\frac{1}{n}S. \)
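As a small worked illustration, using made-up data rather than values from any source, take m = 2 and n = 3 with
\( \mathbf{x}_1 = \begin{pmatrix}1\\2\end{pmatrix},\quad \mathbf{x}_2 = \begin{pmatrix}3\\2\end{pmatrix},\quad \mathbf{x}_3 = \begin{pmatrix}2\\5\end{pmatrix},\quad \overline{\mathbf{x}} = \begin{pmatrix}2\\3\end{pmatrix}. \)
The centered samples are \( (-1,-1)^T \), \( (1,-1)^T \) and \( (0,2)^T \), so
\( S = \begin{pmatrix}1&1\\1&1\end{pmatrix} + \begin{pmatrix}1&-1\\-1&1\end{pmatrix} + \begin{pmatrix}0&0\\0&4\end{pmatrix} = \begin{pmatrix}2&0\\0&6\end{pmatrix}, \qquad C_{ML} = \frac{1}{3}S = \begin{pmatrix}2/3&0\\0&2\end{pmatrix}. \)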
When the columns of \(X\, \) are independently sampled from a multivariate normal distribution, \(S\, \) has a Wishart distribution with n − 1 degrees of freedom.
See also
Estimation of covariance matrices
Sample covariance matrix
Wishart distribution
Outer product: \( XX^T \) or \( X \otimes X \) is the outer product of X with itself.
Gram matrix