Monday, April 6, 2020

Jean-Baptiste Joseph Fourier: Analysis of determined equations (1827)

One of the parts of mathematics most easily applicable to practice is the study of equations. You need to find out a certain quantity or certain quantities, and you know that it has or they have a certain relation to other quantities. Finding out what these unknown quantities are means solving the equation.

Although a layman might think that mathematics should always give exact solutions to such problems, it is quite obvious that whether, and to what extent, exact solutions can be given is a question that does not always have a clear-cut answer. Take a simple equation like x² = 2. We know that the solution to this problem cannot be expressed as a ratio of whole numbers. Still, despite the objections the Pythagoreans would have had, we are usually accustomed to saying that solutions involving roots are precise - at least they tell us the relationship that the sought-for quantity has to known quantities, even if we can express the numeric value of the former only approximately.

It has long been known that for some relatively simple equations, such relatively exact answers can be found, if there just is an answer to be found. Let’s take a case where we are searching for a single unknown quantity, with a relation to zero, describable in terms of such simple calculations as addition, multiplication and squaring: x² + ax + b = 0 (the so-called quadratic equation). There’s a simple enough formula for solving such equations, using again only very simple operations - addition, subtraction, multiplication, division, squaring and square roots.

We know that the formula for the quadratic equation will give us two, one or no solutions - the last option occurs when the formula would involve a square root of a negative number, something that is usually an impossibility when applying mathematics in more concrete fields, although we can construct an abstract system with such square roots of negative numbers (the so-called imaginary numbers). We also know that the solutions revealed by the formula are all the solutions the equation could have, and we can even represent this geometrically: the equation describes a figure known as a parabola, which can cut one of the axes of the coordinate system twice, touch it at one point, or not cut or touch it at all.
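To make this concrete, here is a small Python sketch (my own illustration, not from Fourier’s text) of the quadratic formula for the monic equation x² + ax + b = 0: the sign of the discriminant a² - 4b decides whether there are two, one or no real solutions.

```python
import math

def solve_monic_quadratic(a, b):
    """Real solutions of x^2 + a*x + b = 0 via the quadratic formula."""
    discriminant = a * a - 4 * b
    if discriminant > 0:               # parabola cuts the x-axis twice
        root = math.sqrt(discriminant)
        return ((-a - root) / 2, (-a + root) / 2)
    if discriminant == 0:              # parabola touches the x-axis at one point
        return (-a / 2,)
    return ()                          # parabola stays above the x-axis: no real solution

print(solve_monic_quadratic(-3, 2))    # x^2 - 3x + 2 = 0  ->  (1.0, 2.0)
print(solve_monic_quadratic(0, 1))     # x^2 + 1 = 0       ->  no real solutions
```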

The situation becomes somewhat more complicated when we allow exponents larger than 2 in the equations, that is, when we deal with general polynomial equations of the type xⁿ + axⁿ⁻¹ + bxⁿ⁻² + … + rx + s = 0, where the highest exponent is called the degree of the polynomial. We do know something general about the solutions of such equations. If n is odd, the figure described by the polynomial function xⁿ + axⁿ⁻¹ + bxⁿ⁻² + … + rx + s is like a rising line: with very large, but negative values of x, the result of the polynomial is negative, while with very large, positive values of x, the result is positive. If n is even, the figure resembles a parabola, where large values of x, whether positive or negative, produce positive results. The only difference is that with larger exponents the figures might have more bends - the maximal number of bends in the figure is always n - 1, where n is the degree of the polynomial. This means that the maximal number of solutions for the equation is also the degree of the polynomial - every new bend makes one more point of contact with the x-axis of the coordinate system possible.
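A quick numeric illustration of this end behaviour (my own example polynomials, not Fourier’s): the sign of such a polynomial at very large positive and negative values of x is governed by the parity of its degree.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial given as coefficients [c_n, ..., c_1, c_0] (Horner's rule)."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

odd = [1, 0, -3, 1]       # x^3 - 3x + 1, degree 3: behaves like a rising line
even = [1, 0, -5, 0, 4]   # x^4 - 5x^2 + 4, degree 4: behaves like a parabola
print(evaluate(odd, -1000) < 0, evaluate(odd, 1000) > 0)    # True True
print(evaluate(even, -1000) > 0, evaluate(even, 1000) > 0)  # True True
```

The degree-4 example also shows the maximum being reached: x⁴ - 5x² + 4 = (x² - 1)(x² - 4) has exactly four roots, namely -2, -1, 1 and 2.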

Although the maximum number of solutions of a polynomial equation is known, we might not always be exactly sure what these solutions are. With polynomials of degree 3 or 4, a general solution of a similar sort as with quadratic equations can be given. Then again, with polynomials of higher degree such a general solution does not - and indeed cannot - exist. We might be able to find the exact solutions sometimes, but there’s no guarantee that we could always do it.

Even if a general method for finding exact solutions does not exist, we might still have a method for finding inexact solutions, that is, better and better approximations of the solutions we are searching for. Such a method of approximation can also be of a mathematical nature, because we might have good mathematical reasons to say why a certain method works. A good example is the method invented by Isaac Newton. The basic idea behind Newton’s method is that at small distances a curve is similar to its tangent. Thus, if we have an estimate that is close to the final solution, we can use the tangent at the point of the estimate to get an even better estimate of the solution - just check where the tangent hits the x-axis and you get the new estimate.
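A minimal Python sketch of this idea (the function names and the example equation are my own illustration of the tangent step): at each iteration the tangent at the current estimate x meets the x-axis at x - f(x)/f'(x), which becomes the new estimate.

```python
def newton(f, f_prime, x0, steps=10):
    """Newton's method: follow the tangent at the current estimate to the x-axis."""
    x = x0
    for _ in range(steps):
        # the tangent at x hits the x-axis at x - f(x)/f'(x)
        # (assumes f_prime(x) stays away from zero along the way)
        x = x - f(x) / f_prime(x)
    return x

# Example: approximate the square root of 2 as a root of x^2 - 2 = 0.
f = lambda x: x * x - 2
f_prime = lambda x: 2 * x
print(newton(f, f_prime, x0=1.5))   # ~1.4142135623730951
```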

The problem with this method is that if the first estimate is not close enough to the real solution, it might take many iterations to get even fairly good approximations. The problem thus becomes how to determine the regions where we should go looking for the solutions. A partial answer to this problem is provided by Fourier’s posthumously published work, Analyse des équations déterminées.

Fourier’s starting point is unexpected. He asks us to produce the derivative of the original polynomial, then the derivative of this derivative and so forth, until nothing else is left but a constant function. The series beginning from the constant function and ending with the original function has n + 1 members. What has this series of derivatives to do with the solutions of the original equation? Well, consider the results of the polynomial and the series of derivatives for very large negative numbers. The final constant function is always positive, the result of the next member of the series - a polynomial of degree 1 - is negative for very large negative values of x, while the next one - a polynomial of degree 2 - has positive results with these values. Generally, the polynomials of odd degree in this series have negative results for very large negative values of x, while polynomials of even degree have positive results. In other words, with these large negative values of x, the result of a function in the series is always of a different sign than the result of its derivative, which means that the sign changes throughout the series n times.
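A rough Python sketch of the construction (again my own illustration; polynomials are represented as lists of coefficients from the highest power down): differentiate repeatedly until only a constant is left, then evaluate every member of the series at a very large negative x and watch the signs alternate.

```python
def derivative(coeffs):
    """Derivative of a polynomial given as coefficients [c_n, ..., c_1, c_0]."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x (Horner's rule)."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

def derivative_series(coeffs):
    """The original polynomial followed by its derivatives, down to a constant."""
    series = [coeffs]
    while len(series[-1]) > 1:
        series.append(derivative(series[-1]))
    return series

# Example: x^3 - 3x + 1 and its derivatives 3x^2 - 3, 6x and 6,
# evaluated at a large negative x: the signs alternate -, +, -, +.
series = derivative_series([1, 0, -3, 1])
print([evaluate(p, -1000) for p in series])
```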

By itself, this result seems quite meager, but some further reflections show its importance. Firstly, checking what happens with very large positive values of x, we notice that the original polynomial and all the derivatives in the series have positive results, which means that the series has no sign variations. All the sign variations have vanished when moving from very large negative to very large positive numbers. Indeed, the only points where the number of sign variations may change are those values of x at which the original polynomial or one of the derivatives in the series produces zero - either the function producing zero cuts the x-axis at that point and its own sign changes when moving through it, or it just touches the x-axis and the sign of its derivative changes.

A further important point is that the number of sign variations can never increase. If a function changes from positive to negative near a certain value of x, then the function is decreasing and its derivative must be negative near the same value of x, and if the function changes from negative to positive, then the function is growing and its derivative must be positive. Thus, supposing that the function changing its sign is also the derivative of another function of the series and thus in the middle of two other functions, the change of its sign can never increase the number of sign variations. For instance, if in one part of the series the signs are + for f’’(x), - for f’(x) and - for f(x) (with one sign variation between them), the change of the sign of the middle term changes the series into +, + and - (again with one sign variation between them). Then again, if the series was at first +, - and + (with two sign variations), after the sign change it will be +, + and + (with no sign variations).

In effect, then, if we take two different values of x, at the smaller value the series of derivatives of the polynomial cannot have fewer sign variations than at the larger value. In fact, if we consider the difference between the numbers of sign variations at these two places, this difference gives the maximum number of points between these values of x at which the result of the polynomial will be 0 (the roots of the polynomial, as they are called).
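Here is a sketch of how this bound might be computed (my own illustration of the counting rule, not Fourier’s notation, and with the handling of exact zeros in the series simplified): count the sign variations of the derivative series at the two endpoints of an interval and take the difference.

```python
def fourier_root_bound(coeffs, a, b):
    """Upper bound on the number of roots of the polynomial between a and b,
    given by the drop in sign variations of the derivative series."""
    def derivative(cs):
        n = len(cs) - 1
        return [c * (n - i) for i, c in enumerate(cs[:-1])]

    def evaluate(cs, x):
        result = 0
        for c in cs:
            result = result * x + c
        return result

    series = [coeffs]
    while len(series[-1]) > 1:
        series.append(derivative(series[-1]))

    def variations(x):
        signs = [s for s in (evaluate(p, x) for p in series) if s != 0]  # skip exact zeros
        return sum(1 for s, t in zip(signs, signs[1:]) if s * t < 0)

    return variations(a) - variations(b)

# x^3 - 3x + 1 has three real roots, near -1.88, 0.35 and 1.53.
print(fourier_root_bound([1, 0, -3, 1], -10, 10))   # 3
print(fourier_root_bound([1, 0, -3, 1], 0, 1))      # 1
print(fourier_root_bound([1, 0, -3, 1], 2, 3))      # 0
```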

Of course, this method provides us only with the maximum number of possible roots of the polynomial between two values of x, and the interval might actually contain far fewer roots. Still, with systematic division of such intervals - and a few tricks Fourier uses to weed out intervals which really contain no roots, despite the number of sign variations - it is possible to pick out certain intervals where the solutions we search for lie. The next step in Fourier’s method is then simply to use Newton’s method to approximate the solution found within a certain interval. The whole procedure is then strictly mathematical, although the result might never be truly exact - we can even calculate, Fourier notes, how close our approximations are to the real solution.
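To round things off, here is a minimal end-to-end sketch of how the two ingredients fit together (my own simplified reconstruction, not Fourier’s actual procedure, and without his extra tricks for weeding out empty intervals): split an interval until the sign-variation bound allows at most one root in each piece, then polish each candidate with Newton’s method.

```python
def derivative(coeffs):
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def evaluate(coeffs, x):
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

def variations(series, x):
    """Sign variations of the derivative series at x (exact zeros skipped)."""
    signs = [s for s in (evaluate(p, x) for p in series) if s != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s * t < 0)

def find_roots(coeffs, lo, hi, newton_steps=50):
    """Isolate roots with the sign-variation bound, then refine with Newton's method."""
    series = [coeffs]
    while len(series[-1]) > 1:
        series.append(derivative(series[-1]))
    deriv = series[1]                      # first derivative, used by Newton

    roots, pending = [], [(lo, hi)]
    while pending:
        a, b = pending.pop()
        bound = variations(series, a) - variations(series, b)
        if bound == 0:
            continue                       # no root can hide in this interval
        if bound == 1 or b - a < 1e-6:
            x = (a + b) / 2                # start Newton from the midpoint
            for _ in range(newton_steps):
                slope = evaluate(deriv, x)
                if slope == 0:
                    break
                x = x - evaluate(coeffs, x) / slope
            roots.append(x)
        else:
            mid = (a + b) / 2              # still ambiguous: split and try again
            pending += [(a, mid), (mid, b)]
    return sorted(roots)

# x^3 - 3x + 1 has three real roots, roughly -1.8794, 0.3473 and 1.5321.
print(find_roots([1, 0, -3, 1], -10, 10))
```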