5.12 Properties of differentiation, directional derivatives and growth.
Recall that
- A function \(f: \mathbb{R}^2 \rightarrow \mathbb{R}\) is differentiable at \(p = (x_0, y_0)\) if both partial derivatives \(\frac{\partial f}{\partial x}(p)\) and \(\frac{\partial f}{\partial y}(p)\) exist and \(\lim_{(x,y)\rightarrow(x_0,y_0)}\frac{f(x,y) - (\frac{\partial f}{\partial x}(p)(x - x_{0}) + \frac{\partial f}{\partial y}(p)(y - y_{0}) + f(p))}{||(x, y) - p||}= 0\)
- The plane in \(\mathbb{R}^3\) defined by the equation \(z=\frac{\partial f}{\partial x}(p)(x-x_{0})+\frac{\partial f}{\partial y}(p)(y-y _{0})+f(p)\) is called the tangent plane of the graph of \(f\) at the point \((p, f (p)) = (x_0, y_0, f (x_0, y_0))\).
- The vector in \(\mathbb{R}^2\), \(\nabla f (p) = (\frac{\partial f}{\partial x}(p), \frac{\partial f}{\partial y}(p))\) is called the gradient of \(f\) at \(p = (x_0, y_0)\).
- Using the dot product, we may write the equation of the tangent plane as follows: \(z = f(x_{0}, y_{0}) + \nabla f (x_{0}, y_{0}) \cdot (x - x_{0}, y - y_{0})\)
Moreover, if we write \(p = (x_0, y_0)\) and \(x=(x,y)\), this equals \(z=f(p)+\nabla f(p)\cdot(x-p)\)
Then the differentiability condition reads \(\lim_{x\rightarrow p}\frac{f(x)-\left(f(p)+\nabla f(p)\cdot(x-p)\right)}{||x-p||} =0\)
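The definition recalled above can be checked numerically. The following is a minimal sketch, assuming the sample function \(f(x,y) = x^3 + y^2\), the point \(p = (1, 2)\) and one particular direction of approach; the helper names `tangent_plane` and `quotient` are ours, chosen only for this illustration. It evaluates the quotient from the definition of differentiability at points closer and closer to \(p\).

```python
import math

def f(x, y):
    # Sample function (an assumption for illustration): f(x, y) = x^3 + y^2
    return x**3 + y**2

def grad_f(x, y):
    # Gradient computed by hand: (3x^2, 2y)
    return (3 * x**2, 2 * y)

def tangent_plane(p, q):
    """Value at q of the tangent plane of f at p: f(p) + grad f(p) . (q - p)."""
    gx, gy = grad_f(*p)
    return f(*p) + gx * (q[0] - p[0]) + gy * (q[1] - p[1])

def quotient(p, q):
    """The quotient from the definition of differentiability at p (q != p)."""
    dist = math.hypot(q[0] - p[0], q[1] - p[1])
    return (f(*q) - tangent_plane(p, q)) / dist

p = (1.0, 2.0)
for k in range(1, 6):
    h = 10.0 ** (-k)
    q = (p[0] + h, p[1] - h)  # approach p along one particular direction
    print(h, quotient(p, q))
```

The printed quotients shrink roughly in proportion to \(h\). This is only evidence along one path, not a proof that the limit is \(0\).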
Let us start by proving that differentiability implies continuity.
Note that both notions (differentiability and continuity) are local notions.
Theorem
Let \(f: \mathbb{R}^2 \rightarrow \mathbb{R}\) be a differentiable function at the point \(p = (x_0, y_0)\). Then \(f\) is continuous at \(p = (x_0, y_0)\).
Proof
We want to show that \(\lim_{(x,y)\to(x_0,y_0)}f(x,y)=f(x_0,y_0)\). We use the notation \(x=(x,y)\) and \(p=(x_0,y_0)\).
Since \(f\) is differentiable at \(p\), \(\lim_{x\rightarrow p}\frac{f(x)-\left(f(p)+\nabla f(p)\cdot(x-p)\right)}{||x-p||} =0\). Multiplying by \(||x-p||\), which also tends to \(0\), gives \(\lim_{x\rightarrow p}\left[f(x)-\left(f(p)+\nabla f(p)\cdot(x-p)\right)\right]=0\)
Hence, by the algebra of limits, \(\lim_{x\to p}\left(f(x)-f(p)\right)=0\) as soon as \(\lim_{x\to p}\nabla f(p)\cdot(x-p)=0\). (Note that \(\nabla f(p)\) and \(x-p\) are vectors, but their dot product is a real number, so only this last limit remains to be checked.)
Using Theorem (Cauchy-Schwarz inequality), \(0\leq |\nabla f(p)\cdot(x-p)|\leq ||\nabla f(p)||\cdot ||x-p||\)
Since \(\lim_{x\to p}||x-p||=0\), we get \(\lim_{x\to p}||\nabla f(p)||\cdot||x-p||=0\), and by the squeeze theorem \(\lim_{x\to p}\nabla f(p)\cdot(x-p)=0\). Therefore \(\lim_{x\to p}f(x)=f(p)\).
Remark. As we saw above, the existence of both partial derivatives \(\frac{\partial f}{\partial x}(p)\) and \(\frac{\partial f}{\partial y}(p)\) of a function \(f\) at \(p\) is a necessary condition for differentiability.
We also know that both partial derivatives at \(p\) may exist without the function being differentiable at \(p\).
Nevertheless, if these partial derivatives exist in a neighborhood of the point \(p\) and are continuous when considered as functions, then the function is differentiable!
This is saying the following: if the behavior of the function in the axis-parallel directions is good enough then we know the behavior in any direction!
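To illustrate that the existence of both partial derivatives at a point is not enough, here is a small sketch. It assumes the function \(f(x,y) = \frac{xy}{x^2+y^2}\) extended by the value \(0\) at the origin (this extension is our choice for the example). Both difference quotients at the origin vanish, so both partial derivatives exist there and equal \(0\), yet along the line \(y = x\) the function stays at \(\frac{1}{2}\). So \(f\) is not continuous at the origin and, by the theorem above, not differentiable there.

```python
def f(x, y):
    # Assumed extension: the value 0 at the origin is our choice for this example.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x**2 + y**2)

def partial_x_quotient(h):
    """Difference quotient for the partial derivative in x at the origin."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h

def partial_y_quotient(h):
    """Difference quotient for the partial derivative in y at the origin."""
    return (f(0.0, h) - f(0.0, 0.0)) / h

for h in (1e-1, 1e-3, 1e-6):
    # The two quotients are 0, while f(h, h) does not approach f(0, 0) = 0.
    print(h, partial_x_quotient(h), partial_y_quotient(h), f(h, h))
```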
Theorem
Let \(p \in \mathbb{R}^2\) and \(r > 0\). Assume that the function \(f: B(p,r) \rightarrow \mathbb{R}\) has partial derivatives in \(B(p,r)\) and that the functions \(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}: B(p,r) \rightarrow \mathbb{R}\)
are continuous at \(p\). Then \(f\) is differentiable at \(p\).
Proof
Let \(q=(x,y)\) and \(p=(x_0,y_0)\). We need to prove (NTP): \(\lim_{q\rightarrow p}\frac{f(q)-\left(f(p)+\nabla f(p)\cdot(q-p)\right)}{||q-p||} =0\)
We write \(\delta = q-p=(\delta_1,\delta_2)=(x-x_0,y-y_0)\), then \(q=p+\delta\)
So \(\frac{f(q)-f(p)-\nabla f(p)\cdot(q-p)}{||q-p||}=\frac{f(q)-f(p)-\nabla f(p)\cdot\delta}{||\delta||} =\frac{f(q)-f(p)-\left(\frac{\partial f}{\partial x}(p),\frac{\partial f}{\partial y}(p)\right)\cdot\left(\delta_{1},\delta_{2}\right)}{\sqrt{\delta_{1}^{2}+\delta_{2}^{2}}}\)
\(=\frac{f(p+\delta)-f(p)-\frac{\partial f}{\partial x}\left(p\right)\delta_{1}-\frac{\partial f}{\partial y}\left(p\right)\delta_{2}}{\sqrt{\delta_{1}^{2}+\delta_{2}^{2}}}=\frac{f(x_{0}+\delta_{1},y_{0}+\delta_{2})-f(x_{0},y_{0})-\frac{\partial f}{\partial x}\left(p\right)\delta_{1}-\frac{\partial f}{\partial y}\left(p\right)\delta_{2}}{\sqrt{\delta_{1}^{2}+\delta_{2}^{2}}}\)
\(=\frac{f(x_{0}+\delta_{1},y_{0}+\delta_{2})-f(x_{0}+\delta_{1},y_{0})+f(x_{0}+\delta_{1},y_{0})-f(x_{0},y_{0})-\frac{\partial f}{\partial x}\left(p\right)\delta_{1}-\frac{\partial f}{\partial y}\left(p\right)\delta_{2}}{\sqrt{\delta_{1}^{2}+\delta_{2}^{2}}}\)
\(=\frac{f(x_{0}+\delta_{1},y_{0})-f(x_{0},y_{0})-\frac{\partial f}{\partial x}\left(p\right)\delta_{1}}{\sqrt{\delta_{1}^{2}+\delta_{2}^{2}}} +\frac{f(x_{0}+\delta_{1},y_{0}+\delta_{2})-f(x_{0}+\delta_{1},y_{0})-\frac{\partial f}{\partial y}\left(p\right)\delta_{2}}{\sqrt{\delta_{1}^{2}+\delta_{2}^{2}}}\)
To prove that \(f\) is differentiable at \(p\), it suffices to prove that both terms tend to \(0\) as \(\delta \to (0,0)\).
For the first term
\(0\leq\left|\frac{f(x_{0}+\delta_{1},y_{0})-f(x_{0},y_{0})-\frac{\partial f}{\partial x}\left(p\right)\delta_{1}}{\sqrt{\delta_{1}^{2}+\delta_{2}^{2}}}\right|\leq\left |\frac{f(x_{0}+\delta_{1},y_{0})-f(x_{0},y_{0})-\frac{\partial f}{\partial x}\left(p\right)\delta_{1}}{\sqrt{\delta_{1}^{2}}} \right|=\left|\frac{f(x_{0}+\delta_{1},y_{0})-f(x_{0},y_{0})-\frac{\partial f}{\partial x}\left(p\right)\delta_{1}}{\delta_{1}}\right|\)
\(=\left|\frac{f(x_{0}+\delta_{1},y_{0})-f(x_{0},y_{0})}{\delta_{1}}-\frac{\partial f}{\partial x}\left(p\right)\right|\)
Since \(\lim_{\delta_1\to0}\frac{f(x_{0}+\delta_{1},y_{0})-f(x_{0},y_{0})}{\delta_{1}}=\frac{\partial f}{\partial x}\left(p\right)\), then \(\lim_{\delta_1\to0}\left|\frac{f(x_{0}+\delta_{1},y_{0})-f(x_{0},y_{0})}{\delta_{1}} -\frac{\partial f}{\partial x}\left(p\right)\right|=0\)
Thus, by the squeeze theorem (and since the first term is \(0\) whenever \(\delta_1 = 0\)): \(\lim_{\delta\to(0,0)}\left|\frac{f(x_{0}+\delta_{1},y_{0})-f(x_{0},y_{0})-\frac{\partial f}{\partial x}\left(p\right)\delta_{1}}{\sqrt{\delta_{1}^{2}+\delta_{2}^{2}}}\right|=0\)
For the second term
\(0\leq\left|\frac{f(x_{0}+\delta_{1},y_{0}+\delta_{2})-f(x_{0}+\delta_{1},y_{0})-\frac{\partial f}{\partial y}\left(p\right)\delta_{2}}{\sqrt{\delta_{1}^{2}+\delta_{2}^{2}}}\right |\leq\left|\frac{f(x_{0}+\delta_{1},y_{0}+\delta_{2})-f(x_{0}+\delta_{1},y_{0})-\frac{\partial f}{\partial y}\left(p\right)\delta_{2}}{\delta_{2}}\right|=\left|\frac{f(x_{0}+\delta_{1},y_{0}+\delta_{2})-f(x_{0}+\delta_{1},y_{0})}{\delta_{2}} -\frac{\partial f}{\partial y}\left(p\right)\right|\)
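Since the partial derivative \(\frac{\partial f}{\partial y}\) exists at \(p+(\delta_1,0)\), \(\lim_{\delta_2\to0}\frac{f(x_{0}+\delta_{1},y_{0}+\delta_{2})-f(x_{0}+\delta_{1},y_{0})}{\delta_{2}} =\frac{\partial f}{\partial y}\left(p+\left(\delta_{1},0\right)\right)\), so \(\lim_{\delta_2\to0}\left|\frac{f(x_{0}+\delta_{1},y_{0}+\delta_{2})-f(x_{0}+\delta_{1},y_{0})}{\delta_{2}} -\frac{\partial f}{\partial y}\left(p\right)\right|=\left|\frac{\partial f}{\partial y}\left(p+\left(\delta_{1},0\right)\right)-\frac{\partial f}{\partial y}\left(p\right)\right|\)
Since \(\frac{\partial f}{\partial y}\) is continuous at \(p\), we get \(\lim_{\delta_1\to0}\left|\frac{\partial f}{\partial y}\left(p+\left(\delta_{1},0\right)\right)-\frac{\partial f}{\partial y}\left(p\right)\right|=0\)
To pass from these iterated limits to the joint limit, fix \(\delta\) with \(||\delta|| < r\) and \(\delta_2 \neq 0\), and apply the mean value theorem to the one-variable function \(t \mapsto f(x_0+\delta_1, y_0+t)\), whose derivative is \(\frac{\partial f}{\partial y}(x_0+\delta_1, y_0+t)\): there is \(\theta \in (0,1)\) with \(f(x_0+\delta_1,y_0+\delta_2)-f(x_0+\delta_1,y_0)=\frac{\partial f}{\partial y}(x_0+\delta_1,y_0+\theta\delta_2)\,\delta_2\). The bound above then equals \(\left|\frac{\partial f}{\partial y}(x_0+\delta_1,y_0+\theta\delta_2)-\frac{\partial f}{\partial y}(p)\right|\), which tends to \(0\) as \(\delta\to(0,0)\) because \(\frac{\partial f}{\partial y}\) is continuous at \(p\).
Thus \(\lim_{\delta_1\to0,\delta_2\to0}\left|\frac{f(x_{0}+\delta_{1},y_{0}+\delta_{2})-f(x_{0}+\delta_{1},y_{0})-\frac{\partial f}{\partial y}\left(p\right)\delta_{2}}{\delta_{2}}\right|=0\), and by the squeeze theorem the second term tends to \(0\) (it is \(0\) whenever \(\delta_2 = 0\)).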
Combining both terms, \(\lim_{q\rightarrow p}\frac{f(q)-\left(f(p)+\nabla f(p)\cdot(q-p)\right)}{||q-p||} =0\)
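The decomposition used in this proof can also be observed numerically. The sketch below assumes the sample function \(f(x,y) = \cos(xy) + x\cos(y)\), with partial derivatives computed by hand, and an arbitrarily chosen point and direction of approach; the helper name `split_terms` is hypothetical. It evaluates the two terms of the split separately for shrinking \(\delta\).

```python
import math

def f(x, y):
    # Sample function (an assumption for illustration)
    return math.cos(x * y) + x * math.cos(y)

def fx(x, y):
    # Partial derivative with respect to x, computed by hand
    return -y * math.sin(x * y) + math.cos(y)

def fy(x, y):
    # Partial derivative with respect to y, computed by hand
    return -x * math.sin(x * y) - x * math.sin(y)

def split_terms(p, d):
    """Return the first and second term of the split, for displacement d != 0."""
    x0, y0 = p
    d1, d2 = d
    norm = math.hypot(d1, d2)
    first = (f(x0 + d1, y0) - f(x0, y0) - fx(x0, y0) * d1) / norm
    second = (f(x0 + d1, y0 + d2) - f(x0 + d1, y0) - fy(x0, y0) * d2) / norm
    return first, second

p = (0.5, 1.0)
for k in range(1, 6):
    s = 10.0 ** (-k)
    first, second = split_terms(p, (s, 2 * s))
    print(s, first, second)
```

Both printed terms shrink as \(\delta\) shrinks, in line with the two squeeze arguments, although only along the one path tested here.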
Remark
Note that \(\frac{\partial f}{\partial y}\) being continuous at \((x_0,y_0)\) means \(\lim_{(x,y) \rightarrow (x_0,y_0)} \frac{\partial f}{\partial y}(x,y) = \frac{\partial f}{\partial y}(x_0,y_0)\)
\(\iff \lim_{(x,y) \rightarrow (x_0,y_0)} (\frac{\partial f}{\partial y}(x,y) - \frac{\partial f}{\partial y}(x_0,y_0)) = 0\)
Since this limit exists, we get the same result if we approach the point \((x_0,y_0)\) along any curve. If we choose, for example, the line \(y=y_0\), we get \(\lim_{x \rightarrow x_0} (\frac{\partial f}{\partial y}(x,y_0) - \frac{\partial f}{\partial y}(x_0,y_0)) = 0\)
Writing \(x = x_0 + \delta_1\), this becomes \(\lim_{\delta_1 \rightarrow 0} (\frac{\partial f}{\partial y}(x_0+\delta_1, y_0) - \frac{\partial f}{\partial y}(x_0,y_0)) = 0\), which is the limit used in the proof.
Examples
- Any polynomial function in two variables is differentiable at every point.
  For example, if \(f(x,y) = x^3 + y^2\), then \(\frac{\partial f}{\partial x}(x,y) = 3x^2\) and \(\frac{\partial f}{\partial y}(x,y) = 2y\), which are continuous on \(\mathbb{R}^2\).
- Any rational function \(f(x,y) = \frac{p(x,y)}{q(x,y)}\), where \(p(x,y)\) and \(q(x,y)\) are polynomials in two variables, is differentiable at all points where \(q(x,y) \neq 0\).
  For example, \(f(x,y) = \frac{xy}{x^2+y^2}\)
  \(\frac{\partial f}{\partial x}(x,y) = \frac{y(x^2+y^2) - xy(2x)}{(x^2+y^2)^2} = \frac{y(y^2-x^2)}{(x^2+y^2)^2}\) for \((x,y) \neq (0,0)\)
  \(\frac{\partial f}{\partial y}(x,y) = \frac{x(x^2+y^2) - xy(2y)}{(x^2+y^2)^2} = \frac{x(x^2-y^2)}{(x^2+y^2)^2}\) for \((x,y) \neq (0,0)\)
- \(f(x,y) = \cos(xy) + x\cos(y)\)
  \(\frac{\partial f}{\partial x}(x,y) = -\sin(xy)y + \cos(y)\)
  \(\frac{\partial f}{\partial y}(x,y) = -\sin(xy)x - x\sin(y)\)
  Both partial derivatives are continuous on \(\mathbb{R}^2\), so \(f\) is differentiable at any \((x,y) \in \mathbb{R}^2\).
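Hand-computed partial derivatives like the ones in these examples can be sanity-checked with finite differences. The sketch below does this for the rational example \(f(x,y) = \frac{xy}{x^2+y^2}\) at a point away from the origin; the helper `central_diff`, the step size and the test point are our assumptions for illustration.

```python
def f(x, y):
    # Rational example from the text, valid for (x, y) != (0, 0)
    return x * y / (x**2 + y**2)

def fx(x, y):
    # Partial derivative with respect to x, as computed in the text
    return y * (y**2 - x**2) / (x**2 + y**2) ** 2

def fy(x, y):
    # Partial derivative with respect to y, as computed in the text
    return x * (x**2 - y**2) / (x**2 + y**2) ** 2

def central_diff(g, x, y, h=1e-6):
    """Central difference approximations of both partial derivatives of g at (x, y)."""
    dx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    dy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return dx, dy

x, y = 1.0, 2.0
approx = central_diff(f, x, y)
exact = (fx(x, y), fy(x, y))
print("finite differences:", approx)
print("hand-computed:     ", exact)
```

Agreement up to a small numerical error supports the formulas at the tested point. It does not replace the symbolic computation.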
Directional derivatives
When we considered partial derivatives, we were differentiating a function of two variables along two preferred directions, those of the \(x\) and \(y\) axes.
These derivatives give us information about the changes of the function in these directions.
Take the function \(f (x, y) = x^2 + y^3\)
\(\frac{\partial f}{\partial x}(x, y) = \lim_{h\to 0} \frac{f(x+h, y) - f(x, y)}{h}\) \(= \lim_{h\to 0} \frac{f((x, y) + h(1, 0)) - f(x, y)}{h} = 2x\)
\(\frac{\partial f}{\partial y}(x, y) = \lim_{h\to 0} \frac{f(x, y+h) - f(x, y)}{h}\) \(= \lim_{h\to 0} \frac{f((x, y) + h(0, 1)) - f(x, y)}{h} = 3y^2\)
We can differentiate \(f\) in any direction.
In general, we want to know the changes of a function with respect to any direction.
For example, if \(f\) represents the temperature of a disc on the plane, we may want to know the variation of the temperature along any direction \(v\) with respect to the time \(t\).
So, we have to analyze the function \(f (p + tv)\) which can be considered as a function of one variable \(g: \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}, \quad g(t) = f(p + tv)\) with respect to the variable \(t\).
For \(p = (x_0, y_0)\) and \(v = (v_1, v_2)\), we have that \(g(t) = f(p + tv) = f(x_0 + tv_1, y_0 + tv_2)\)
Hence, the rate of the temperature change at the point \(p = (x_0, y_0)\) is given by \(g'(0) = \frac{d}{dt}f(p + tv)|_{t=0}\)
It is represented as a tangent vector to the graph of \(f\) at the point \((x_0, y_0, f(x_0, y_0))\) with respect to the direction \(v = (v_1, v_2)\) (in the domain).
As we want to measure quantities, we generally take vectors of length 1, that is, unit vectors.
For example, if we take the direction given by the vector \((1,1)\), we use the unit vector \(v = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})\).
For the description of the line passing through the origin and the point \((1,1)\), it does not matter which of the two vectors we take.
\(L = \{(0,0) + t(1,1)| t \in \mathbb{R}\} = \{t(1,1)| t \in \mathbb{R}\} = \{t(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})| t \in \mathbb{R}\}\)
Any point \((x, y)\) on the line \(L\) satisfies \(\begin{cases} x = t + 0 \\ y = t + 0 \end{cases} \implies x = y\)
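The normalization step can be sketched in a few lines. The helper name `normalize` is ours; it rescales a nonzero direction vector to length 1, and the loop then checks on a few sample parameters that the points \(t\,v\) still satisfy \(x = y\) up to rounding.

```python
import math

def normalize(v):
    """Return the unit vector pointing in the direction of a nonzero vector v."""
    length = math.hypot(v[0], v[1])
    if length == 0.0:
        raise ValueError("the zero vector has no direction")
    return (v[0] / length, v[1] / length)

u = normalize((1.0, 1.0))
print("unit vector:", u, "length:", math.hypot(*u))

# Points t * u lie on the same line x = y as the points t * (1, 1).
for t in (-2.0, 0.5, 3.0):
    x, y = t * u[0], t * u[1]
    print((x, y), abs(x - y) < 1e-12)
```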
From another point of view
Definition
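Let \(f: \mathbb{R}^2 \rightarrow \mathbb{R}\) be a function and \(v \in \mathbb{R}^2\) a unit vector. The directional derivative of \(f\) at a point \(p\) along the vector \(v\) is \(\frac{\partial f}{\partial v}(p)=\lim_{t\to0}\frac{f(p+tv)-f(p)}{t}\), provided this limit exists.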
The following theorem tells us how to compute the directional derivatives using the gradient vector.
Theorem
Let \(f: \mathbb{R}^2 \rightarrow \mathbb{R}\) be a function that is differentiable at \(p \in \mathbb{R}^2\). Then for any unit vector \(v \in \mathbb{R}^2\) we have that \(\frac{\partial f}{\partial v}(p) = \nabla f(p) \cdot v\)
That is, the directional derivative is given by the dot product of the gradient \(\nabla f (p)\) at \(p\) with the vector \(v\)
Exercise
Proof
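Since \(\frac{\partial f}{\partial v}(p)=\lim_{t\to0}\frac{f(p+tv)-f(p)}{t}\) whenever this limit exists, it suffices to show that \(\lim_{t\to 0}\left(\frac{f(p+tv)-f(p)}{t}- \nabla f(p) \cdot v\right) = 0\)
We have \(\frac{f(p+tv)-f(p)}{t}- \nabla f(p) \cdot v = \frac{f(p+tv)-f(p) - t\,\nabla f(p) \cdot v}{t}\) \(=\frac{f(p+tv)-f(p)-\nabla f(p)\cdot((p+tv)-p)}{t}\)
Let \(q = p+tv\). Since \(||v||=1\), we have \(||q-p|| = |t|\), and \(q \to p\) as \(t \to 0\). Hence \(\left|\frac{f(p+tv)-f(p)-\nabla f(p)\cdot((p+tv)-p)}{t}\right| = \frac{\left|f(q)-f(p)-\nabla f(p)\cdot(q-p)\right|}{||q-p||} \to 0\) as \(t \to 0\), because \(f\) is differentiable at \(p\).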
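The statement of the theorem can be compared against the limit definition numerically. The sketch below assumes the function \(f(x,y) = x^2 + y^3\) from earlier in this section, the point \(p = (1,1)\) and the unit vector \(v = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})\); the helper names are hypothetical. It prints difference quotients for small \(t\) next to the value \(\nabla f(p) \cdot v\).

```python
import math

def f(x, y):
    return x**2 + y**3

def grad_f(x, y):
    # Gradient computed by hand: (2x, 3y^2)
    return (2 * x, 3 * y**2)

def directional_quotient(p, v, t):
    """Difference quotient (f(p + t v) - f(p)) / t from the definition, t != 0."""
    return (f(p[0] + t * v[0], p[1] + t * v[1]) - f(*p)) / t

def directional_from_gradient(p, v):
    """Directional derivative computed as the dot product grad f(p) . v."""
    gx, gy = grad_f(*p)
    return gx * v[0] + gy * v[1]

p = (1.0, 1.0)
v = (1 / math.sqrt(2), 1 / math.sqrt(2))
for t in (1e-1, 1e-3, 1e-5, -1e-5):
    print(t, directional_quotient(p, v, t))
print("gradient formula:", directional_from_gradient(p, v))
```

Here \(\nabla f(p) = (2, 3)\), so the gradient formula gives \(\frac{5}{\sqrt{2}}\), and the quotients approach this value from both sides of \(t = 0\).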
Corollary
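Let \(f: \mathbb{R}^2 \rightarrow \mathbb{R}\) be a function that is differentiable at \(p \in \mathbb{R}^2\). Then for any unit vector \(v \in \mathbb{R}^2\) we have that \(\left|\frac{\partial f}{\partial v}(p)\right| \leq ||\nabla f(p)||\)
Proof: Since the directional derivative is the dot product \(\nabla f(p)\cdot v\), apply Theorem (Cauchy-Schwarz inequality) and use \(||v||=1\).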
Remark
The inequality above may be interpreted as follows: if we are standing at a point \((p, f (p))\) on the graph of a function \(f\) and \(\nabla f(p) \neq 0\), then the function grows fastest in the direction of the gradient \(\nabla f(p)\) and decays fastest in the opposite direction \(-\nabla f(p)\).
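This interpretation can be explored with a brute-force sweep over unit vectors. The sketch below assumes the sample function \(f(x,y) = x^2 + y^3\) at \(p = (1,1)\) and a grid of 3600 angles (both our choices). For each unit vector it evaluates the directional derivative through the gradient formula, checks the bound, and records the direction with the largest value.

```python
import math

def grad_f(x, y):
    # Gradient of the sample function f(x, y) = x^2 + y^3
    return (2 * x, 3 * y**2)

p = (1.0, 1.0)
gx, gy = grad_f(*p)
grad_norm = math.hypot(gx, gy)

best_angle, best_value = 0.0, -math.inf
n = 3600  # number of sampled directions (an arbitrary choice)
for k in range(n):
    angle = 2 * math.pi * k / n
    v = (math.cos(angle), math.sin(angle))  # unit vector
    value = gx * v[0] + gy * v[1]  # directional derivative via the gradient formula
    assert abs(value) <= grad_norm + 1e-12  # bound from the Cauchy-Schwarz inequality
    if value > best_value:
        best_angle, best_value = angle, value

print("largest directional derivative found:", best_value)
print("norm of the gradient:                ", grad_norm)
print("best sampled direction:", (math.cos(best_angle), math.sin(best_angle)))
print("gradient direction:    ", (gx / grad_norm, gy / grad_norm))
```

On this grid the best sampled direction is close to \(\nabla f(p)/||\nabla f(p)||\), and the largest value is close to \(||\nabla f(p)||\), up to the resolution of the sweep.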
There is a very useful method for minimizing functions based on this idea, which is called gradient descent.
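As a rough illustration of the idea, here is a minimal sketch of gradient descent with a fixed step size. The objective \(f(x,y) = (x-1)^2 + 2(y+\frac{1}{2})^2\), the starting point, the step size and the number of iterations are all hypothetical choices made for this example. Practical variants choose the step size more carefully.

```python
def f(x, y):
    # Hypothetical objective for illustration, minimized at (1, -0.5)
    return (x - 1) ** 2 + 2 * (y + 0.5) ** 2

def grad_f(x, y):
    # Gradient computed by hand
    return (2 * (x - 1), 4 * (y + 0.5))

def gradient_descent(start, step=0.1, iterations=200):
    """Repeatedly move against the gradient with a fixed step size."""
    x, y = start
    for _ in range(iterations):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

x_min, y_min = gradient_descent((4.0, 3.0))
print("approximate minimizer:", (x_min, y_min))
print("value there:", f(x_min, y_min))
```

Each step moves in the direction \(-\nabla f\), in which the function decreases fastest locally. For this particular objective and step size the iterates approach the minimizer \((1, -\frac{1}{2})\).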