Categories
Engineering Curriculum

When to introduce the matrix exponential?

In the study of linear systems, a familiar relationship is the homogeneous state-space equation \dot{\mathbf{x}}=A(t)\mathbf{x}(t), where \mathbf{x}(t) is an n-vector, and A is an n \times n matrix. The time-invariant solution, (i.e., when A is a constant matrix), is \mathbf{x}(t) = e^{At} \mathbf{x}_0. When this subject is first introduced, the solution is often assumed, rather than derived.

The thinking is that since the solution to the homogeneous scalar equation is x(t) = e^{at} x(0), then students will willingly accept a matrix-friendly equivalent that solves the state-space differential equation. So the definition for the exponential matrix is given, and is shown to work for the homogeneous case:

\begin{aligned} \dot{\mathbf{x}}(t) & = \frac{d}{dt} \left( e^{At} \mathbf{x}_0 \right) \\ & = \frac{d}{dt} \left( e^{At} \right) \mathbf{x}_0 + e^{At} \frac{d}{dt} \left( \mathbf{x}_0 \right) \\ & = A e^{At} \mathbf{x}_0 + e^{At} \left( 0 \right) \\ & = A e^{At} \mathbf{x}_0 \\ & = A \mathbf{x}(t) \end{aligned}

It seems to me that this presentation sequence, however, masks what is really going on with the system; that there is an infinite recursion on the initial state, \mathbf{x}_0, that converges to a value for \mathbf{x}(t):

\begin{aligned} \mathbf{x}(t) & = \mathbf{x}_0 + A \int_0^t \mathbf{x}(\tau)\, d\tau \\ & = \mathbf{x}_0 + A \int_0^t \left[ \mathbf{x}_0 + A \int_0^t\mathbf{x}(\tau)\, d\tau \right]\,d\tau \\ & = \mathbf{x}_0 + A \int_0^t \left[ \mathbf{x}_0 + A \int_0^t \left[ \mathbf{x}_0 + A \int_0^t\mathbf{x}(\tau)\, d\tau \right] d\tau \right]\,d\tau \end{aligned}

This recursion obviously repeats ad infinitum. However, the matrix exponential can now be defined by collecting terms on the right hand side, leading to:

\begin{aligned} \mathbf{x}(t) & = \left[ \mathbf{I}_n + At + \frac{1}{2!} \left( At \right)^2 + \dots \right] \mathbf{x}_0 \\ & = e^{At} \mathbf{x}_0 \end{aligned}

Presented in this order, the exponential matrix is developed based on system response, rather than the other way around. This strikes me as being easier to comprehend than “guessing” that some seemingly arbitrary function might solve the problem. Is this conceptually easier for anyone else?