Categories
Engineering Curriculum

When to introduce the matrix exponential?

In the study of linear systems, a familiar relationship is the homogeneous state-space equation \dot{\mathbf{x}}(t)=A(t)\mathbf{x}(t), where \mathbf{x}(t) is an n-vector and A(t) is an n \times n matrix. In the time-invariant case (i.e., when A is a constant matrix), the solution is \mathbf{x}(t) = e^{At} \mathbf{x}_0, where \mathbf{x}_0 = \mathbf{x}(0). When this subject is first introduced, the solution is often assumed rather than derived.

The thinking is that since the solution to the homogeneous scalar equation \dot{x} = ax is x(t) = e^{at} x(0), students will willingly accept a matrix-friendly equivalent that solves the state-space differential equation. So the definition of the matrix exponential is given, and is shown to work for the homogeneous case:

\begin{aligned} \dot{\mathbf{x}}(t) & = \frac{d}{dt} \left( e^{At} \mathbf{x}_0 \right) \\ & = \frac{d}{dt} \left( e^{At} \right) \mathbf{x}_0 + e^{At} \frac{d}{dt} \left( \mathbf{x}_0 \right) \\ & = A e^{At} \mathbf{x}_0 + e^{At} \left( 0 \right) \\ & = A e^{At} \mathbf{x}_0 \\ & = A \mathbf{x}(t) \end{aligned}
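The same check can be run numerically. The following is only a minimal sketch: the matrix A, the initial state x0, the time t, and the step h are arbitrary values chosen for illustration, and e^{At} is approximated by a truncated power series (an assumption about which definition is "given"). It compares a central-difference estimate of \dot{\mathbf{x}}(t) against A\mathbf{x}(t).

```python
def mat_mul(X, Y):
    """Multiply two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(X, v):
    """Multiply a matrix by a vector."""
    return [sum(a * vi for a, vi in zip(row, v)) for row in X]

def expm_series(A, t, terms=30):
    """Approximate e^{At} by truncating I + At + (At)^2/2! + ..."""
    n = len(A)
    At = [[a * t for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds (At)^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, At)]
        result = [[r + v for r, v in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

# Illustrative values only (not from the text above).
A = [[0.0, 1.0], [-2.0, -3.0]]
x0 = [1.0, 0.0]
t, h = 0.7, 1e-5

def x(time):
    """Candidate solution x(t) = e^{At} x0."""
    return mat_vec(expm_series(A, time), x0)

# Central-difference estimate of dx/dt versus A x(t).
lhs = [(a - b) / (2 * h) for a, b in zip(x(t + h), x(t - h))]
rhs = mat_vec(A, x(t))
max_residual = max(abs(l - r) for l, r in zip(lhs, rhs))
print(max_residual)
```

The residual should be small, limited by the finite-difference step rather than by the series truncation.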

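It seems to me, however, that this presentation sequence masks what is really going on with the system, namely that there is an infinite recursion on the initial state, \mathbf{x}_0, that converges to a value for \mathbf{x}(t). Integrating both sides of the time-invariant equation from 0 to t, and then substituting the result into itself, gives: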

\begin{aligned} \mathbf{x}(t) & = \mathbf{x}_0 + A \int_0^t \mathbf{x}(\tau_1)\, d\tau_1 \\ & = \mathbf{x}_0 + A \int_0^t \left[ \mathbf{x}_0 + A \int_0^{\tau_1} \mathbf{x}(\tau_2)\, d\tau_2 \right] d\tau_1 \\ & = \mathbf{x}_0 + A \int_0^t \left[ \mathbf{x}_0 + A \int_0^{\tau_1} \left[ \mathbf{x}_0 + A \int_0^{\tau_2} \mathbf{x}(\tau_3)\, d\tau_3 \right] d\tau_2 \right] d\tau_1 \end{aligned}

This recursion obviously repeats ad infinitum. However, the matrix exponential can now be defined by evaluating the integrals of the constant \mathbf{x}_0 terms and collecting the results on the right-hand side, leading to:

\begin{aligned} \mathbf{x}(t) & = \left[ \mathbf{I}_n + At + \frac{1}{2!} \left( At \right)^2 + \dots \right] \mathbf{x}_0 \\ & = e^{At} \mathbf{x}_0 \end{aligned}
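The recursion can also be carried out mechanically. The sketch below is one way to do that, not a definitive implementation: it represents each iterate as a polynomial in t with vector coefficients and applies \mathbf{x}_{k+1}(t) = \mathbf{x}_0 + A \int_0^t \mathbf{x}_k(\tau)\, d\tau repeatedly. The matrix A (a rotation generator), the initial state, and the helper names are assumptions chosen for illustration; this A was picked because its exact solution, (\cos t, -\sin t), is known in closed form.

```python
import math

def mat_vec(A, v):
    """Multiply a matrix by a vector."""
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]

def picard_step(A, x0, coeffs):
    """One pass of x_{k+1}(t) = x0 + A * integral_0^t x_k(tau) dtau.

    coeffs[j] is the vector coefficient of t**j in the current iterate.
    Integrating t**j gives t**(j+1) / (j+1), so every coefficient moves
    up one power of t and picks up a factor of A / (j+1).
    """
    new = [list(x0)]
    for j, c in enumerate(coeffs):
        new.append([v / (j + 1) for v in mat_vec(A, c)])
    return new

def evaluate(coeffs, t):
    """Evaluate the vector-valued polynomial at time t."""
    n = len(coeffs[0])
    return [sum(c[i] * t**j for j, c in enumerate(coeffs)) for i in range(n)]

# Illustrative system (an assumption, not from the text).
A = [[0.0, 1.0], [-1.0, 0.0]]
x0 = [1.0, 0.0]
t = 1.0
exact = [math.cos(t), -math.sin(t)]

coeffs = [list(x0)]  # zeroth iterate: x(t) = x0
errors = []
for k in range(15):
    coeffs = picard_step(A, x0, coeffs)
    approx = evaluate(coeffs, t)
    errors.append(max(abs(a - e) for a, e in zip(approx, exact)))

print(errors[0], errors[-1])
```

After k passes the coefficient of t^j is A^j \mathbf{x}_0 / j! for every j up to k, which is exactly the truncated series for e^{At}\mathbf{x}_0, so the error shrinks as more passes are applied.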

Presented in this order, the matrix exponential is developed from the system response, rather than the other way around. This strikes me as being easier to comprehend than “guessing” that some seemingly arbitrary function might solve the problem. Is this conceptually easier for anyone else?

Categories
Mathematics

Mathematicians write like novelists

As I attempt to teach myself something about stochastic calculus, I have been reading a great many articles, and several textbooks, on the subject. It has left me with the distinct notion that mathematicians hate to spill the plot too early in the story. Each proof builds upon clues that have been scattered throughout the text, just as a writer might have the butler passing down a hallway in the second chapter for no apparent reason. Lemmas and sub-theorems that, to all outward appearances, are wholly unrelated to the general theme begin to appear. Then slowly, sometimes painfully, these logical manipulations are pulled together as the proof draws to a conclusion. But even so, the mathematician is reluctant to come out and say, “the butler did it.” Rather, phrases like “it is clearly obvious,” and “it is easily proven” are used to inform the reader that the point of mathematics is the mental challenge of figuring out how the pieces fit together.

There is no enjoyment in reading a novel that lays out the entire plot in the first paragraph. A plot is simply a literary construct used by authors to evoke emotion, just as sculptors do with statues, and dancers with physical movement. Likewise, spelling out every mathematical truism and trick would make a proof lengthy and boring, with no pleasure left to be savored, right? So I think that mathematicians must write proofs (whether intentionally or not) in a manner intended to allow other mathematicians to sense the thrill of unraveling a logical knot.

Sometimes I enjoy this artistry. Today I do not. 🙁