The Difference Engine – the Charles Babbage machine, not the steampunk novel – is a device for tabulating successive values of a polynomial by repeatedly adding the finite differences between its values at successive inputs.

This sounds like a fairly niche market, but in fact it’s quite useful because there are a whole lot of other functions that can be *approximated* by polynomials. The approach, which is based on calculus, generates a Taylor series (or a Maclaurin series, if the approximation is for input values near zero).

Now, it happens that this collection of other functions includes logarithms:

\(\ln(1+x) \approx x - x^2/2 + x^3/3 - x^4/4 + \ldots\)

and exponents:

\(e^x \approx 1 + x + x^2/2! + x^3/3! + x^4/4! + \ldots\)

and so, given a difference engine, you can make tables of logarithms and exponents.

In fact, your computer is probably using exactly this approach to calculate those functions. Here’s how glibc calculates `ln(x)` for x roughly equal to 1:

```
r = x - 1.0;
r2 = r * r;
r3 = r * r2;
y = r3 * (B[1] + r * B[2] + r2 * B[3]
+ r3 * (B[4] + r * B[5] + r2 * B[6]
+ r3 * (B[7] + r * B[8] + r2 * B[9] + r3 * B[10])));
// some more twiddling that adds terms in r and r*r, then returns y
```

In other words, it works out `r` so that it is calculating `ln(1+r)` instead of `ln(x)`. Then it adds together `r + a*r^2 + b*r^3 + c*r^4 + d*r^5 + ... + k*r^12` …it does the Taylor series for `ln(1+r)`!

Now, given these approximations, we can combine numbers into probabilities (using the sigmoid function, which is in terms of `e^x`) and find the errors on those probabilities (using the cross-entropy, which is in terms of `ln(x)`). We can build a learning neural network!

And, more than a century after it was designed, our technique could still do it using the Difference Engine.