Errors-in-variables model

From Freepedia

Errors-in-variables (EIV) is a robust modeling technique in statistics that allows every variable, not only the dependent one, to contain error or noise. In the literature of computational mathematics and engineering, EIV is also referred to, in a broad sense, as total least squares (TLS). In the strict sense, however, TLS means the application of EIV, or orthogonal regression, to a linear model <math>\mathbf{A x} = \mathbf{b}</math>.

Robust linear regression

In linear regression, the least squares (LS) method attributes all error to the dependent variable. Variants exist for other error configurations, such as total least squares (TLS, i.e. orthogonal error), data least squares (DLS), and constrained or structured TLS.

Given an observation vector <math>\mathbf{b} \in \reals^n</math> and a data matrix <math>\mathbf{A} \in \reals^{n \times m}</math> with <math>n > m</math>, consider the solution <math>\mathbf{x} \in \reals^m</math> of the overdetermined system of equations <math>\mathbf{Ax \approx b}</math>.

The ordinary least squares (OLS) method yields the solution <math>\mathbf{x}</math> that minimizes the Euclidean norm of the error or residual <math>||{\mathbf{Ax-b}}||_2</math>, where <math>||\cdot||_2</math> is also known as the two-norm. Equivalently, the problem can be stated as

<math> \min_{\mathbf{x}} ||\Delta\mathbf{b}||_2 \quad \mbox{ subject to } \quad \mathbf{Ax}=\mathbf{b}+\Delta\mathbf{b}. </math>
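The OLS problem above can be sketched numerically as follows. This is a minimal illustration, assuming NumPy is available; the helper name `ols_solve`, the synthetic data, and the noise level are hypothetical choices made only for this example.

```python
import numpy as np

def ols_solve(A, b):
    """Return x minimizing ||A x - b||_2 (ordinary least squares)."""
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Hypothetical data: A is exact, noise is added to b only.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 2))
x_true = np.array([2.0, -1.0])
b = A @ x_true + 0.01 * rng.normal(size=50)

x_ols = ols_solve(A, b)
delta_b = A @ x_ols - b   # correction to b, so that A x = b + delta_b
print(x_ols, np.linalg.norm(delta_b))
```

At the minimizer the correction <math>\Delta\mathbf{b}</math> is orthogonal to the columns of <math>\mathbf{A}</math>, which is one way to check the result.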

If the data matrix <math>\mathbf{A}</math> is also noisy (i.e. there is error in both the dependent and the explanatory variables), the OLS solution is no longer optimal. When an orthogonal-error criterion is acceptable, TLS offers a suitable formulation:

<math> \min_{\mathbf{x}} ||{[{\Delta\mathbf{A}\,\Delta\mathbf{b}}]}||_F \quad \mbox{ subject to } \quad (\mathbf{A}+\Delta\mathbf{A})\mathbf{x}=\mathbf{b}+\Delta\mathbf{b},</math>

where <math>||{\cdot}||_F</math> is the Frobenius norm (the square root of the sum of the squared entries of a matrix, i.e. the Euclidean "length" of the matrix viewed as one long vector), and the perturbations <math>\Delta\mathbf{A}</math> and <math>\Delta\mathbf{b}</math> compensate for the noise in <math>\mathbf{A}</math> and <math>\mathbf{b}</math>, respectively. This formulation of TLS implicitly assumes that the noise is independent and identically distributed (i.i.d.) across the entries of both <math>\mathbf{A}</math> and <math>\mathbf{b}</math>. If the distribution of the errors is known or well estimated, a weighting matrix can be included in the objective accordingly, which leads to the constrained or structured TLS.
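A common way to compute the basic TLS solution uses the singular value decomposition (SVD) of the augmented matrix <math>[\mathbf{A}\ \mathbf{b}]</math>. The sketch below assumes NumPy, assumes <math>n \geq m+1</math>, and assumes the generic case in which the last component of the smallest right singular vector is nonzero; the name `tls_solve` and the synthetic data are hypothetical.

```python
import numpy as np

def tls_solve(A, b):
    """Basic TLS via the SVD of the augmented matrix [A b].

    Returns (x, dA, db) such that (A + dA) @ x equals b + db up to rounding.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    m = A.shape[1]
    C = np.column_stack([A, b])                    # n x (m + 1)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[-1]                                     # right singular vector of the smallest singular value
    if abs(v[m]) < 1e-12:
        raise np.linalg.LinAlgError("last component is ~0: no generic TLS solution")
    x = -v[:m] / v[m]
    # Smallest (rank-one) correction [dA db] that makes [A b] rank deficient.
    correction = -s[-1] * np.outer(U[:, -1], v)
    return x, correction[:, :m], correction[:, m]

# Hypothetical data: i.i.d. noise of equal variance in both A and b.
rng = np.random.default_rng(1)
x_true = np.array([2.0, -1.0])
A_clean = rng.normal(size=(200, 2))
A = A_clean + 0.05 * rng.normal(size=(200, 2))
b = A_clean @ x_true + 0.05 * rng.normal(size=200)

x_tls, dA, db = tls_solve(A, b)
sigma_min = np.linalg.svd(np.column_stack([A, b]), compute_uv=False)[-1]
print(x_tls, np.linalg.norm(np.column_stack([dA, db]), "fro"), sigma_min)
```

In this sketch the Frobenius norm of the returned correction equals the smallest singular value of the augmented matrix, which is the minimum value of the TLS objective.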

In the other case, where the noise is only in <math>\mathbf{A}</math>, DLS can be used instead:

<math> \min_{\mathbf{x}} ||{[{\Delta\mathbf{A}}]}||_F \quad \mbox{ subject to } \quad (\mathbf{A}+\Delta\mathbf{A})\mathbf{x}=\mathbf{b}.</math>
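One way to compute a DLS solution follows directly from the formulation above and is not necessarily the algorithm used in the cited references. For a fixed <math>\mathbf{x}</math>, the smallest <math>\Delta\mathbf{A}</math> satisfying the constraint is <math>(\mathbf{b}-\mathbf{Ax})\mathbf{x}^T/(\mathbf{x}^T\mathbf{x})</math>, so DLS minimizes <math>||\mathbf{Ax}-\mathbf{b}||_2 / ||\mathbf{x}||_2</math>; projecting the columns of <math>\mathbf{A}</math> onto the orthogonal complement of <math>\mathbf{b}</math> turns this into a smallest-singular-vector problem. The sketch assumes NumPy and the generic case; `dls_solve`, `dls_cost`, and the data are hypothetical.

```python
import numpy as np

def dls_cost(A, b, x):
    """Frobenius norm of the smallest dA with (A + dA) @ x == b, for a given x."""
    return np.linalg.norm(A @ x - b) / np.linalg.norm(x)

def dls_solve(A, b):
    """Data least squares: minimize ||dA||_F subject to (A + dA) @ x == b."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Remove from each column of A its component along b.
    PA = A - np.outer(b, b @ A) / (b @ b)
    _, s, Vt = np.linalg.svd(PA, full_matrices=False)
    v = Vt[-1]                                   # direction of the solution
    alpha = (b @ A @ v) / (b @ b)                # scale that best matches b
    if abs(alpha) < 1e-12:
        raise np.linalg.LinAlgError("A @ v is orthogonal to b: no generic DLS solution")
    x = v / alpha
    dA = np.outer(b - A @ x, x) / (x @ x)        # rank-one correction to A
    return x, dA

# Hypothetical data: noise in A only, b is exact.
rng = np.random.default_rng(2)
x_true = np.array([2.0, -1.0])
A_clean = rng.normal(size=(200, 2))
A = A_clean + 0.05 * rng.normal(size=(200, 2))
b = A_clean @ x_true

x_dls, dA = dls_solve(A, b)
print(x_dls, np.linalg.norm(dA, "fro"))
```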

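The OLS solution can be obtained using the (pseudo-)inverse of the data matrix, <math>\mathbf{x} = \mathbf{A}^{+}\mathbf{b}</math>. The TLS and DLS solutions have been shown to be closely connected to the singular vectors of an (augmented) system-related matrix that correspond to its minimum singular value; for TLS this matrix is the augmented matrix <math>[\mathbf{A}\ \mathbf{b}]</math>.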
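The two solution routes named above can be compared on synthetic data in which both <math>\mathbf{A}</math> and <math>\mathbf{b}</math> carry i.i.d. noise of equal variance. The sketch below assumes NumPy; the function names, sample size, and noise level are hypothetical. Under these assumptions the OLS estimate is expected to shrink toward zero, while the TLS estimate stays closer to the true coefficients.

```python
import numpy as np

def ols(A, b):
    """OLS through the pseudo-inverse of the data matrix."""
    return np.linalg.pinv(A) @ b

def tls(A, b):
    """TLS through the smallest right singular vector of [A b]."""
    m = A.shape[1]
    Vt = np.linalg.svd(np.column_stack([A, b]), full_matrices=False)[2]
    v = Vt[-1]
    return -v[:m] / v[m]

# Hypothetical experiment: equal-variance noise in A and b.
rng = np.random.default_rng(42)
n, sigma = 5000, 0.5
x_true = np.array([2.0, -1.0])
A_clean = rng.normal(size=(n, 2))
A = A_clean + sigma * rng.normal(size=(n, 2))
b = A_clean @ x_true + sigma * rng.normal(size=n)

x_ols, x_tls = ols(A, b), tls(A, b)
err_ols = np.linalg.norm(x_ols - x_true)
err_tls = np.linalg.norm(x_tls - x_true)
print(f"OLS error: {err_ols:.3f}, TLS error: {err_tls:.3f}")
```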

References

  • S. Van Huffel and P. Lemmerling, Total Least Squares and Errors-in-Variables Modeling: Analysis, Algorithms and Applications. Dordrecht, The Netherlands: Kluwer Academic Publishers, 2002.
  • S. Jo and S. W. Kim, "Consistent normalized least mean square filtering with noisy data matrix," IEEE Trans. Signal Processing, vol. 53, no. 6, pp. 2112–2123, Jun. 2005.
  • R. D. DeGroat and E. M. Dowling, "The data least squares problem and channel equalization," IEEE Trans. Signal Processing, vol. 41, no. 1, pp. 407–411, Jan. 1993.
  • T. Abatzoglou and J. Mendel, "Constrained total least squares," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP’87), Apr. 1987, vol. 12, pp. 1485–1488.

