
5.12 Best Approximation

Theorem (Gram-Schmidt)

Let \((V,\lang,\rang)\) be a k-IPVS and let \(\{v_{1},...,v_{n}\}\) be an ordered linearly independent set in \(V\)

Then there exists an ordered orthogonal set \(\{w_{1},...,w_{n}\}\subseteq V\) such that \(\forall l,1\leq l\leq n\), the subspace \(\lang v_{1},...,v_{l}\rang\) is equal to the subspace \(\lang w_1,...,w_l\rang\)

Proof (Recursive proof)

Take \(w_{1}=v_{1}\); of course \(\lang v_1\rang =\lang w_1\rang\). Suppose that we have already defined \(\{w_{1},...,w_{l}\}\) for some \(l<n\) such that \(w_i\perp w_j\) for all \(i\neq j\) with \(i,j\leq l\), and \(\lang v_1,...,v_l\rang =\lang w_1,...,w_l\rang\)

Now we want to define \(w_{l+1}\) such that \(w_{l+1}\perp w_i\), \(\forall i,1\leq i\leq l\) and \(\lang v_{1},..,v_{l+1}\rang=\lang w_1,...,w_l,w_{l+1}\rang\)

We define \(w_{l+1}=v_{l+1}-\sum^{l}_{j=1}\frac{\lang v_{l+1},w_{j}\rang}{||w_{j}||^{2}}w_{j}\) (for example, if \(l=1\), then \(w_{2}=v_{2}-\frac{\lang v_2,w_1\rang}{||w_1||^2}w_1\))

We want to prove: 1. \(w_{l+1}\perp w_i,\forall i\), \(1\leq i\leq l\) 2. \(\lang v_{1},...,v_{l+1}\rang =\lang w_1,...,w_{l+1}\rang\)

Note that \(w_{l+1}\neq 0_V\) because if \(w_{l+1}=0_V\), then \(v_{l+1}=\sum^{l}_{j=1}\frac{\lang v_{l+1},w_j\rang }{||w_j||^2}w_j\)

That is, \(v_{l+1}\in \lang w_{1},...,w_{l}\rang=\lang v_1,...,v_l\rang\). This is impossible because \(\{v_1,...,v_{l+1}\}\) is linearly independent

Let's prove 1), let \(i,1\leq i \leq l\),

\(\langle w_{l+1},w_{i}\rangle=\langle v_{l+1}-\sum_{j=1}^{l}\frac{\langle v_{l+1},w_{j}\rangle}{||w_{j}||^{2}} w_{j},w_{i}\rangle=\langle v_{l+1},w_{i}\rangle-\sum_{j=1}^{l}\frac{\langle v_{l+1},w_{j}\rangle}{||w_{j}||^{2}} \langle w_{j},w_{i}\rangle\)

\(=\langle v_{l+1},w_{i}\rangle-\frac{\langle v_{l+1},w_{i}\rangle}{||w_{i}||^{2}} \langle w_{i},w_{i}\rangle=\langle v_{l+1},w_{i}\rangle-\langle v_{l+1},w_{i}\rangle =0\)

Then prove 2)

\(\langle w_{1},...,w_{l},w_{l+1}\rangle=\langle v_{1},...,v_{l},w_{l+1}\rangle=\langle v_{1},...,v_{l},v_{l+1}-\sum_{j=1}^{l}\frac{\langle v_{l+1},w_{j}\rangle}{||w_{j}||^{2}} w_{j}\rangle\) where \(\sum_{j=1}^{l}\frac{\langle v_{l+1},w_{j}\rangle}{||w_{j}||^{2}}w_{j}\in\langle w_{1},\ldots,w_{l}\rangle=\langle v_{1},\ldots,v_{l}\rangle\)

Then \(\langle v_{1},...,v_{l},v_{l+1}-\sum_{j=1}^{l}\frac{\langle v_{l+1},w_{j}\rangle}{||w_{j}||^{2}} w_{j}\rangle=\lang v_{1},...,v_{l+1}\rang\)
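
The recursive construction in the proof can be sketched in code. The following is a minimal illustration, assuming vectors in \(\mathbb{C}^n\) with the standard inner product (linear in the first argument, conjugate-linear in the second); the names `inner`, `gram_schmidt` and the tolerance `tol` are choices made for this sketch, not part of the notes.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def gram_schmidt(vectors, tol=1e-12):
    """Return orthogonal w_1..w_n with span(w_1..w_l) = span(v_1..v_l) for every l."""
    ws = []
    for v in vectors:
        w = [complex(c) for c in v]
        for u in ws:
            # coefficient <v_{l+1}, w_j> / ||w_j||^2 from the theorem
            coeff = inner(v, u) / inner(u, u).real
            w = [wc - coeff * uc for wc, uc in zip(w, u)]
        # w = 0 can only happen when the input is linearly dependent
        if inner(w, w).real <= tol:
            raise ValueError("input vectors are linearly dependent")
        ws.append(w)
    return ws


ws = gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
for i in range(len(ws)):
    for j in range(i):
        # every pair of distinct output vectors should be orthogonal
        print(i, j, abs(inner(ws[i], ws[j])))
```

Each printed inner product should be zero up to floating-point rounding. The coefficients are computed from the original \(v_{l+1}\), exactly as in the formula for \(w_{l+1}\), rather than from the partially reduced vector.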

Remark

  1. If we want the set \(\{w_1,...,w_n\}\) to be orthonormal, then we only have to divide each \(w_j\) by its norm

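  2. In particular, if \(\{v_{1},...,v_n\}\) is a basis of \(V\), then \(\{w_1,...,w_n\}\) is an orthogonal basis of \(V\): by the theorem it spans \(\lang v_1,...,v_n\rang=V\), and \(n\) vectors spanning an \(n\)-dimensional space are linearly independent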

Example

Let \(V=\mathbb{C}^{4}\) as \(\mathbb{C}\)-v.s. with the standard inner product

Let \(v_1=(2i,0,4,0),v_2=(-1+i,0,1,0),v_3=(2,3,4,0)\)

Then \(\{v_1,v_2,v_3\}\) is linearly independent, let's find \(\{w_1,w_2,w_3\}\) such that \(w_i\perp w_j\) if \(i\neq j\)

and such that \(\lang w_1\rang=\lang v_1\rang\), \(\lang w_1,w_2\rang =\lang v_1,v_2\rang\) and \(\lang w_1,w_2,w_3\rang =\lang v_1,v_2,v_3\rang\)

Set \(w_1=v_1=(2i,0,4,0)\). Now \(w_{2}=v_{2}-\frac{\lang v_{2},w_{1}\rang}{||w_{1}||^{2}}w_{1}=(-1+i,0,1,0)-\lambda(2i,0,4,0)\), where \(\lambda=\frac{\lang v_{2},w_{1}\rang}{||w_{1}||^{2}}\)

After computation, \(||w_{1}||^{2}=\langle w_{1},w_{1}\rangle=\langle(2i,0,4,0),(2i,0,4,0)\rangle=2i\cdot \overline{2i}+0\cdot\bar{0}+4\cdot\bar{4}+0\cdot\bar{0}=20\)

\(\langle v_{2},w_{1}\rangle=\langle(-1+i,0,1,0),(2i,0,4,0)\rangle=\left(-1+i\right )\cdot\overline{2i}+0\cdot\bar{0}+1\cdot\bar{4}+0\cdot\bar{0}=2i+6\)

Then \(\lambda =\frac{2i+6}{20}=\frac{3+i}{10}\), so \(w_{2}=(-1+i,0,1,0)-\frac{3+i}{10}(2i,0,4,0)=(-\frac{4}{5}+\frac{2}{5}i,0,-\frac{1}{5}-\frac{2}{5}i,0)\), and \(w_{3}=v_{3}-\frac{\langle v_{3},w_{1}\rangle}{20}w_{1}-\frac{\langle v_{3},w_{2}\rangle}{||w_{2}||^{2}} w_{2}=...\)
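
A hand computation like this one is easy to get wrong, so it can help to check it numerically. The sketch below uses Python's built-in complex numbers; the helper names `inner` and `coeff` are introduced here only for illustration. It recomputes \(w_2\) and \(w_3\) from the formula in the theorem and prints the inner products that should vanish.

```python
def inner(x, y):
    # Standard inner product on C^n, conjugating the second argument.
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def coeff(v, w):
    # The Gram-Schmidt coefficient <v, w> / ||w||^2.
    return inner(v, w) / inner(w, w).real


v1 = (2j, 0, 4, 0)
v2 = (-1 + 1j, 0, 1, 0)
v3 = (2, 3, 4, 0)

w1 = list(v1)

lam = coeff(v2, w1)
w2 = [a - lam * b for a, b in zip(v2, w1)]

c31, c32 = coeff(v3, w1), coeff(v3, w2)
w3 = [a - c31 * b - c32 * c for a, b, c in zip(v3, w1, w2)]

print("lambda =", lam)
print("w2 =", w2)
print("w3 =", w3)
# Orthogonality checks: each value should be zero up to rounding.
print(abs(inner(w2, w1)), abs(inner(w3, w1)), abs(inner(w3, w2)))
```

If any of the three printed inner products is not close to zero, one of the hand-computed vectors needs to be rechecked.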

Remark

  1. Suppose \(B=\{v_1,...,v_n\}\) is an orthonormal basis of \(V\). Then \([\lang ,\rang]_B=Id_{n}\)

    If \(B\) is orthogonal, then \([\lang ,\rang]_B=\begin{pmatrix} \lambda_1&&0\\ &\ddots&\\ 0&&\lambda_n \end{pmatrix}\), \(\lambda_i>0,\forall i\)

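    This is because the diagonal entries are \(\lang v_i,v_i\rang =||v_i||^2=\lambda_i>0\), and the off-diagonal entries \(\lang v_i,v_j\rang\), \(i\neq j\), vanish by orthogonality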

    Suppose \(\{v_1,...,v_n\}\) is already an orthogonal linearly independent set and we compute \(\{w_1,...,w_n\}\) as in the theorem. We start with \(w_1=v_1\), and at each step every coefficient \(\lang v_{l+1},w_j\rang\) is zero, so \(w_{l+1}=v_{l+1}\). Hence we get the same set \(\{v_1,...,v_n\}\)

  2. If we start with a linearly dependent set, we will get the zero vector at some step
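
    For instance, in \(\R^{2}\) with the standard inner product, take \(v_1=(1,1)\) and \(v_2=(2,2)=2v_1\). Then \(w_1=(1,1)\) and \(w_{2}=v_{2}-\frac{\lang v_{2},w_{1}\rang}{||w_{1}||^{2}}w_{1}=(2,2)-\frac{4}{2}(1,1)=(0,0)\), so the zero vector appears at the first step where \(v_{l+1}\in\lang v_1,...,v_l\rang\)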

Best Approximation


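Let \(\{v_1,v_2\}\) be an orthogonal basis of \(\R^2\) and let \(w=\alpha_{1}v_{1}+\alpha_{2}v_{2}\). Then \(||w-\alpha_{1}v_{1}||^{2}=||\alpha_2v_2||^2\), while for any \(\beta_1\in\R\), \(||w-\beta_1v_1||^2=||(\alpha_1-\beta_1)v_1+\alpha_2v_2||^2=(\alpha_1-\beta_1)^2||v_1||^2+||\alpha_2v_2||^2\geq||\alpha_2v_2||^2\). So \(\alpha_1v_1\) is the vector of \(\lang v_1\rang\) closest to \(w\)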


Definition

Let \((V,\lang,\rang)\) be a k-IPVS. Given a subspace \(W\) of \(V\) and \(v\in V\), a best approximation to \(v\) in \(W\) is a vector \(w\in W\) such that \(||v-w||\leq ||v-z||\), \(\forall z\in W\)

Theorem

Let \((V,\lang,\rang)\) be a k-IPVS, \(W\) is a subspace of \(V\) and \(v\in V\). Then

  1. \(w\in W\) is a best approximation to \(v\) in \(W\) \(\iff\) \(v-w\perp W\) (that is, \(v-w\perp z,\forall z\in W\))

  2. If a best approximation to \(v\) in \(W\) exists, then it is unique

  3. If \(\dim W<\infty\) and \(\{v_1,...,v_r\}\) is an orthonormal basis of \(W\), then \(w=\sum_{i=1}^{r}\frac{\langle v,v_{i}\rangle}{||v_{i}||^{2}}v_{i}=\sum_{i=1}^{r} \langle v,v_{i}\rangle v_{i}\) is the unique best approximation to \(v\) in \(W\)
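
The formula in part 3 can be tried out numerically. The sketch below assumes \(V=\R^3\) with the standard inner product and a hand-picked orthonormal basis of a plane \(W\); the names `inner`, `norm` and `best_approximation`, as well as the sample vectors, are assumptions made for this illustration.

```python
import math


def inner(x, y):
    # Standard inner product, conjugating the second argument.
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


def best_approximation(v, orthonormal_basis):
    """Return sum_i <v, v_i> v_i for an orthonormal basis v_1..v_r of W."""
    w = [0j] * len(v)
    for b in orthonormal_basis:
        c = inner(v, b)
        w = [wc + c * bc for wc, bc in zip(w, b)]
    return w


# W = {(a, t, t)}: the plane spanned by (1, 0, 0) and (0, 1, 1)/sqrt(2)
basis = [(1, 0, 0), (0, 1 / math.sqrt(2), 1 / math.sqrt(2))]
v = (1, 2, 4)

w = best_approximation(v, basis)
residual = [a - b for a, b in zip(v, w)]

# Part 1: v - w should be orthogonal to every basis vector of W.
print([abs(inner(residual, b)) for b in basis])

# Definition: ||v - w|| <= ||v - z|| for another vector z in W.
z = (1, 2, 2)
print(norm(residual), norm([a - b for a, b in zip(v, z)]))
```

Here \(w\) comes out as \((1,3,3)\) up to rounding, the residual \(v-w=(0,-1,1)\) is orthogonal to \(W\), and the distance \(||v-w||=\sqrt{2}\) is smaller than the distance \(2\) from \(v\) to the sample vector \(z=(1,2,2)\in W\).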

Proof

  1. \(\Leftarrow\)) Suppose \(v-w\perp W\) and let \(z\in W\). NTP: \(||v-z||\geq ||v-w||\)

    \(||v-z||^{2}=||v-w+(w-z)||^{2}=||v-w||^{2}+||w-z||^{2}+2\text{Re}\lang v-w,w-z\rang\)

    Since \(v-w\perp W\) and \(w-z\in W\), then \(2\text{Re}\langle v-w,w-z\rangle=0\)

    Then \(||v-z||^{2}=||v-w||^{2}+||w-z||^{2}\) and since \(||w-z||^2\geq 0\), then \(||v-z||^{2}\geq||v-w||^{2}\)

    \(\Rightarrow\)) Suppose that \(||v-w||\leq ||v-z||\), \(\forall z\in W\), let's prove that \(v-w\perp W\)

    Then given \(z\in W\), \(||v-w||^{2}\leq||v-z||^{2}=||v-w||^{2}+||w-z||^{2}+2\text{Re}\langle v-w,w-z\rangle\)

    Then \(||w-z||^{2}+2\text{Re}\langle v-w,w-z\rangle\geq 0\). Every \(\gamma\in W\) can be written as \(\gamma=w-z_\gamma\) with \(z_{\gamma}=w-\gamma\in W\), so \(||\gamma||^{2}+2\text{Re}\langle v-w,\gamma\rangle\geq 0\) for all \(\gamma\in W\)

    Now, given \(z\in W\) with \(z\neq w\), take \(\gamma = -\frac{\lang v-w,w-z\rang}{||w-z||^{2}}\cdot(w-z)\in W\). Applying the inequality above with \(\gamma\) in place of \(w-z\) gives \(2\text{Re}\langle v-w,-\frac{\langle v-w,w-z\rangle}{||w-z||^{2}}\cdot(w-z)\rangle +||\frac{\lang v-w,w-z\rang}{||w-z||^2}(w-z)||^2\geq 0\)

    Then \(-2\text{Re}\frac{\overline{\langle v-w,w-z\rangle}}{||w-z||^{2}}\cdot\langle v-w ,w-z\rangle+\frac{\langle v-w,w-z\rangle}{||w-z||^{2}}\cdot\frac{\overline{\langle v-w,w-z\rangle}}{||w-z||^{2}} \cdot||w-z||^{2}\geq0\)

    Then \(-2\frac{|\langle v-w,w-z\rangle|^{2}}{||w-z||^{2}}+\frac{|\langle v-w,w-z\rangle|^{2}}{||w-z||^{2}}\geq0\), that is, \(-\frac{|\langle v-w,w-z\rangle|^{2}}{||w-z||^{2}}\geq0\)

    Then \(\frac{|\langle v-w,w-z\rangle|^{2}}{||w-z||^{2}}=0\Rightarrow|\langle v-w,w-z\rangle |^{2}=0\Rightarrow\langle v-w,w-z\rangle=0\)

    Since any vector \(t\in W\) can be written as \(t=w-z\) for some \(z\in W\) (take \(z=w-t\); the case \(t=0_V\) is trivial), we get that \(\lang v-w,t\rang =0,\forall t\in W\)

  2. Suppose \(w\) and \(w'\) are both best approximations to \(v\) in \(W\). By 1), \(v-w\perp W\) and \(v-w'\perp W\)

    Then \(||v-w^{\prime}||^{2}=||v-w^{\prime}-w+w||^{2}=||v-w||^{2}+||w-w^{\prime}||^{2}+2 \text{Re}\langle v-w,w-w^{\prime}\rangle=||v-w||^{2}+||w-w^{\prime}||^{2}\)

    Since both are best approximations, \(||v-w'||^2\geq ||v-w||^2\) and \(||v-w||^2\geq ||v-w'||^2\), so \(||v-w||^2=||v-w'||^2\)

    Then \(||w-w'||^{2}=0\), then \(w-w'=0_V\), then \(w=w'\)

  3. Suppose \(\dim W<\infty\), let \(\{v_1,...,v_r\}\) be an orthonormal basis of \(W\)

    Given \(v\in V\), let \(w=\sum_{i=1}^{r}\langle v,v_{i}\rangle v_{i}\), we will prove that \(w\) is the best approximation to \(v\) in \(W\)

    By 1), NTP \(v-w\perp W\). Given \(z\in W\), \(z=\sum^{r}_{i=1}\lang z,v_{i}\rang\cdot v_{i}\)

    Then \(\langle v-w,z\rangle=\langle v-\sum_{i=1}^{r}\langle v,v_{i}\rangle v_{i},\sum_{i=1} ^{r}\langle z,v_{i}\rangle\cdot v_{i}\rangle=\langle v,\sum_{i=1}^{r}\langle z,v_{i} \rangle\cdot v_{i}\rangle-\sum_{i,j=1}^{r}\langle v,v_{i}\rangle\overline{\langle z,v_{j}\rangle} \langle v_{i},v_{j}\rangle\)

    \(=\sum_{i=1}^{r}\overline{\langle z,v_{i}\rangle}\langle v,v_{i}\rangle-\sum_{j=1} ^{r}\langle v,v_{j}\rangle\overline{\langle z,v_{j}\rangle}=0\), since the basis is orthonormal, so \(\langle v_{i},v_{j}\rangle=1\) if \(i=j\) and \(\langle v_{i},v_{j}\rangle=0\) otherwise