Convergence of Sequences (2)

Multidimensional sequences

In Convergence of Sequences (1) we focused on scalar sequences.

Today we will study the local stability of nonlinear multidimensional sequences defined by recursions like

\[x_{t} = f(x_{t-1}),\quad t=1,2,\dots\] where \(f:\mathbb{R}^n \to \mathbb{R}^n\) is a nonlinear function and \(x_0\) is an initial vector in \(\mathbb{R}^n\). Assume there is a stationary point \(\overline{x}\) such that \[f(\overline{x})=\overline{x}.\]

We want to understand the behavior of \(x_t\) as \(t\to\infty\) when we start near \(\overline{x}\).

At first order, we can study the linearized dynamics around \(\overline{x}\): \[y_t = A y_{t-1},\quad A = f'(\overline{x}),\quad y_t = x_t - \overline{x}.\]

Linearized Dynamics

The recursion \(y_t = A y_{t-1}\) has a closed form solution \[ y_t = A^t\,y_0. \]

Two questions drive the dynamics:

  1. Does \(A^t \to 0\) (stability) or does it grow (instability)?
  2. Can \(\lVert A^t\rVert\) exhibit transient amplification even if the system is asymptotically stable?

The answers depend on how we measure “size” and on whether \(A\) is normal (commutes with \(A^*\)) or non-normal.

Operator norm

We use the following norms:

  • \(\|x\| = \|x\|_2 = \sqrt{\sum_i |x_i|^2}\): Euclidean norm for vectors
    • it can be defined from the scalar product \(\langle x,y\rangle = x^*y\) (where \(x^*\) is the conjugate transpose of \(x\))
    • \(\|x\|_2 = \sqrt{\langle x,x\rangle}\)
  • \(\|A\| = \|A\|_2 = \max_{\|x\|_2=1} \|Ax\|_2\): (induced) operator norm for matrices
    • The operator norm provides a worst-case measure of how much \(A\) can stretch a vector in one step: \(\|Ax\|_2 \le \|A\|_2 \|x\|_2\)

It yields a simple estimate of the convergence rate: \[\frac{\|y_t\|}{\|y_{t-1}\|} = \frac{\|A y_{t-1}\|_2}{\|y_{t-1}\|_2} \le \|A\|_2.\]
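As a quick numerical check (a sketch using NumPy; the matrix and vectors are arbitrary illustrations, not from the text), the induced 2-norm is the largest singular value of \(A\), and the one-step bound holds for every vector:

```python
import numpy as np

# Illustrative check of the one-step bound ||A x||_2 <= ||A||_2 ||x||_2.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
op_norm = np.linalg.norm(A, 2)   # induced 2-norm = largest singular value of A

for _ in range(100):
    x = rng.standard_normal(4)
    # the bound holds for every x (small tolerance for floating-point error)
    assert np.linalg.norm(A @ x) <= op_norm * np.linalg.norm(x) + 1e-12
```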

Spectral radius: the asymptotic growth rate

  • Definition (spectral radius): \[ \rho(A) := \max\{ |\lambda| : \lambda \in \sigma(A)\}. \]

  • Stability (discrete time): \[ \rho(A) < 1 \quad\Longleftrightarrow\quad A^t \to 0\ (t\to\infty). \]

  • Intuition: eigenvalues give the long-run scaling of invariant “modes” (exactly for diagonalizable/normal matrices; asymptotically for all).

  • But: eigenvalues alone can miss large short-run growth when \(A\) is non-normal.

Transitory Dynamics Can Differ from Asymptotic Dynamics

A canonical 2×2 example: \[ A = \begin{bmatrix}0.9 & M\\ 0 & 0.9\end{bmatrix},\quad M\gg 1. \]

  • Both eigenvalues equal \(0.9\), so \(\rho(A)=0.9<1\).
  • But for large \(M\), \(\lVert A\rVert_2\) can be much larger than 1, producing sizeable short-run growth.

We have the closed form:

\[A^k = \begin{bmatrix}0.9^k & k\,M\,0.9^{k-1}\\ 0 & 0.9^k\end{bmatrix}\]

So for large \(M\), \(\|A^k\|_2\) grows like \(k\,M\,0.9^{k-1}\) over the first steps, before eventually decaying to zero as \(k\to\infty\).
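The transient hump is easy to see numerically (a sketch; \(M=50\) is our own illustrative choice):

```python
import numpy as np

# Transient growth for the 2x2 example A = [[0.9, M], [0, 0.9]], here with M = 50.
M = 50.0
A = np.array([[0.9, M], [0.0, 0.9]])

norms = []
Ak = np.eye(2)
for k in range(1, 200):
    Ak = Ak @ A
    norms.append(np.linalg.norm(Ak, 2))

peak = max(norms)
print(max(abs(np.linalg.eigvals(A))))   # rho(A) = 0.9 < 1
print(peak)                             # large transient hump, much bigger than 1
print(norms[-1])                        # eventually tiny: A^k -> 0
```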

Gelfand’s formula

Even though the spectral radius is not a norm (see the appendix), it is related to matrix norms in the following way.

For any matrix norm \(\|\cdot\|\) (including the operator norm \(\|\cdot\|_2\)), we have the following formula for the spectral radius: \[ \boxed{\ \rho(A) = \lim_{k\to\infty} \lVert A^k\rVert^{1/k}\ } \]

Interpretation

  • The quantity \(\lVert A^k\rVert^{1/k}\) is an “average per-step growth factor” over \(k\) steps:
    • suppose a constant per-step growth factor \(g\) satisfies \(g^k = \lVert A^k\rVert\); then \(g=\lVert A^k\rVert^{1/k}\).
  • As \(k\) increases, it filters out transients and converges to the true asymptotic rate \(\rho(A)\).
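This filtering can be observed numerically (a sketch using the same non-normal example as above; the values are our own choices):

```python
import numpy as np

# Gelfand's formula: ||A^k||^{1/k} -> rho(A) as k -> infinity.
# Non-normal example with rho(A) = 0.9 but ||A||_2 >> 1.
A = np.array([[0.9, 50.0], [0.0, 0.9]])
rho = max(abs(np.linalg.eigvals(A)))

for k in (1, 5, 20, 100, 500):
    est = np.linalg.norm(np.linalg.matrix_power(A, k), 2) ** (1.0 / k)
    print(k, est)   # decreases (slowly) toward rho = 0.9 as k grows
```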

Remark

  • This formula is very useful for theoretical insights.
  • In practice, computing operator norms and high matrix powers is expensive, so we typically use eigenvalue-based methods to estimate \(\rho(A)\) for large matrices.

Power iteration: estimating the spectral radius

The classic power iteration targets the eigenvalue of largest modulus, i.e. it can estimate \(\rho(A)\) under standard conditions (dominant eigenvalue, start vector not orthogonal to its eigenvector).

Algorithm (basic form)

  • Choose a nonzero vector \(v_0\).
  • Iterate: \(w_{m+1} \leftarrow A v_m\), then normalize \(v_{m+1}\leftarrow w_{m+1}/\lVert w_{m+1}\rVert_2\).
  • Track the growth ratio \(\lVert A v_m\rVert_2 / \lVert v_m\rVert_2 = \lVert w_{m+1}\rVert_2\) (since \(v_m\) is normalized).
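The steps above can be sketched in NumPy (the function name and test matrix are our own; the test matrix is symmetric, so its dominant eigenvalue is real):

```python
import numpy as np

def power_iteration(A, num_iters=200, seed=0):
    """Basic power iteration: estimate the modulus of the dominant eigenvalue."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)               # start from a normalized random vector
    estimate = 0.0
    for _ in range(num_iters):
        w = A @ v
        estimate = np.linalg.norm(w)     # growth ratio ||A v|| / ||v||, with ||v|| = 1
        v = w / estimate                 # renormalize
    return estimate

A = np.array([[2.0, 1.0], [1.0, 3.0]])   # illustrative symmetric matrix
print(power_iteration(A))                # approx. the largest eigenvalue modulus
```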

Convergence intuition

  • If \(|\lambda_1|>|\lambda_2|\) and \(A\) is diagonalizable, then \(v_m\) aligns with the dominant eigenvector and the estimates converge to \(|\lambda_1|=\rho(A)\).
  • If the dominant eigenvalue is not unique in modulus (e.g. a complex conjugate pair), the iterates may oscillate.

Recap: Spectral Radius and Operator Norm

  • Spectral radius \(\rho(A)\)
    • Governs asymptotic stability and long-run exponential rate.
    • Insensitive to how non-orthogonal the eigenvectors are.
  • Operator norm \(\lVert A\rVert_2\)
    • Exact worst-case one-step state amplification: \(\max_{\|x\|_2=1}\|Ax\|_2\).
    • Most conservative one-step bound.
  • Relation between both: \(\rho(A) \le \|A\|_2\)
    • \(\rho(A)\) can be much smaller than \(\|A\|_2\)
    • For a normal matrix \(A\): \(\rho(A)=\|A\|_2\).
      • normal matrices commute with their conjugate transpose (\(A A^* = A^* A\)); symmetric, Hermitian, and unitary matrices are all normal
    • A matrix can satisfy \(\rho(A)=\|A\|_2\) without being normal
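A quick numerical comparison (our own illustrative matrices): a normal (here symmetric) matrix has \(\rho(A)=\|A\|_2\), while the non-normal example from earlier has \(\rho(A)\ll\|A\|_2\):

```python
import numpy as np

def rho(A):
    """Spectral radius: largest eigenvalue modulus."""
    return max(abs(np.linalg.eigvals(A)))

S = np.array([[2.0, 1.0], [1.0, 2.0]])    # symmetric, hence normal
N = np.array([[0.9, 50.0], [0.0, 0.9]])   # non-normal (Jordan-type block)

print(rho(S), np.linalg.norm(S, 2))       # equal for a normal matrix
print(rho(N), np.linalg.norm(N, 2))       # rho far below the operator norm
```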

Nonlinear Sequences: Contraction Mappings

Now we want to study the convergence of nonlinear sequences defined by recursions like:

\[x_{n+1} = f(x_n),\quad n=0,1,2,\dots\]

with steady-state \(\overline{x}\) such that \(f(\overline{x})=\overline{x}\).

  • Contraction mapping: \(f\) is a contraction if \(\exists L<1\) s.t. for all \(x,y\): \[ \|f(x)-f(y)\| \le L\|x-y\|. \]
  • Banach fixed point theorem: if \(f\) is a contraction on a complete metric space, then \(f\) has a unique fixed point and iterating \(f\) converges to it at a linear rate.
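A minimal one-dimensional illustration (our own example, not from the text): \(f(x)=\cos x\) is a contraction on \([0,1]\) since \(|f'(x)|=|\sin x|\le \sin 1<1\) there and \(f\) maps \([0,1]\) into itself, so iteration converges to the unique fixed point:

```python
import math

# f(x) = cos(x) is a contraction on [0, 1]: |f'(x)| = |sin x| <= sin(1) < 1,
# and f maps [0, 1] into itself. Banach's theorem gives a unique fixed point.
x = 0.5
for _ in range(100):
    x = math.cos(x)

print(x)                               # the unique fixed point of cos on [0, 1]
assert abs(math.cos(x) - x) < 1e-12    # x is (numerically) a fixed point
```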

Nonlinear Sequences: Convergence Rates

Denote by \(A=f'(\overline{x})\) the Jacobian matrix of \(f\) at \(\overline{x}\).

Under which condition on \(A\) can we guarantee local convergence of the iterates to \(\overline{x}\)? At which rate?

  • If \(\|A\|_2<1\)
    • for any \(\epsilon\) such that \(\|A\|_2+\epsilon<1\), by continuity of \(f'\) there is a neighborhood around \(\overline{x}\) where \(\|f'(x)\|_2 \le \|A\|_2 + \epsilon < 1\), so \(f\) is a contraction in that neighborhood and local convergence follows by the Banach fixed point theorem.
    • in particular asymptotic convergence rate is at most \(\|A\|_2+\epsilon\)
  • If \(\rho(A)<1\) (weakest condition)
    • local asymptotic convergence is guaranteed
    • asymptotic convergence rate is at most \(\rho(A)+\epsilon\) for any \(\epsilon>0\)

Going Further

Instead of studying only the stability of \(\overline{x}\), we can also characterize all the different dynamics obtained for different initial conditions \(x_0 \in \mathbb{R}^n\) (read: different initial shocks to the economy).

We can define two sets of initial conditions:

  • the stable manifold: \(W=\{x : \lim_{n\to\infty} f^n(x) = \overline{x}\}\)
  • the unstable set: \(U=\{x : f^n(x) \not\to \overline{x}\}\) (the complement of \(W\); the unstable manifold proper consists of the points converging to \(\overline{x}\) under the backward dynamics)

Even when \(\rho(A)>1\), local dynamics of \(f\) can be characterized by \(A=f'(\overline{x})\)

  • \(\implies\) the stable eigenspace of \(A\) approximates the stable manifold of \(f\)
  • and the unstable eigenspace of \(A\) approximates the unstable manifold of \(f\).

\(\implies\) this is formalized in the stable manifold theorem

Appendix

The spectral radius is not a norm

Power iteration convergence analysis

Proof of Gelfand’s formula

Proof of the Banach fixed point theorem

The Numerical Radius

  • Definition (numerical radius):

    \[ w(A) := \max_{\|x\|_2=1} |\langle Ax,x\rangle|. \]