Numerical differentiation#

REFS:

  • Cheney, Ward; Kincaid, David. Numerical Mathematics and Computing.

  • Wikipedia

Computing derivatives is at the core of much of the numerical work done in computational physics, but it is not a trivial problem to do so with good accuracy. Numerical derivatives are the central computation in finite difference methods, optimization problems, interpolation, and much more. For instance, you need numerical derivatives to compute:

  • your speed based on position and time. This is what your GPS (Google Maps, Waze, etc.) or tracker software does.

  • the rate of change of any quantity that changes in time: speedometers, weather models, finance, etc.

  • solutions of partial differential equations using “finite differences”: https://en.wikipedia.org/wiki/Finite_difference.

  • solutions for ordinary differential equations.

First approach: forward difference#

The Taylor expansion of an analytical function around \(x\) is given by (see an interactive example at https://www.geogebra.org/m/s9SkCsvC)

(1)#\[\begin{equation} f(x+h) = f(x) + h f'(x) + \frac{h^2}{2!} f''(x) + \frac{h^3}{3!} f'''(x) + \dots, \end{equation}\]

If we stop at the second order, we can rewrite it as

(2)#\[\begin{equation} f(x+h) = f(x) + h f'(x) + O(h^2), \end{equation}\]

where \(O(h^2)\) means that we truncate the expansion at the second-order term, so that, given that \(h\) is small, the dominant error term is of order \(h^2\). This equation can be used either to solve an ODE (where we already know \(f'(x)\), see https://www.geogebra.org/m/NUeFjm9J) or to actually compute \(f'(x)\), that is, the slope (https://en.wikipedia.org/wiki/Numerical_differentiation):

(3)#\[\begin{equation} f'(x) \simeq \dfrac{f(x+h) - f(x)}{h} + O(h), \end{equation}\]

which shows that, mathematically, the error should decrease as \(h\) (although, in the computer, there are additional error sources, such as round-off). This approximation basically estimates the derivative using a linear interpolation, and assumes that you can evaluate \(f(x)\) at arbitrary points.

Figure: the derivative as the slope of a line through nearby points: https://upload.wikimedia.org/wikipedia/commons/thumb/1/18/Derivative.svg/460px-Derivative.svg.png

Implementing it in C++#

Think about a function declaration to implement the forward difference. What data type should it return? What arguments should it take? Especially, should the function to be differentiated be an argument? For instance, if you want to differentiate the \(\sin\) function, you can use something that returns

...
return (std::sin(x + h) - std::sin(x))/h;

That would work for that particular function, but if we want to differentiate other functions, then we would need to write more functions like this one, maybe playing with the names to avoid collisions (see the sketch below). When programming, it is important to be generic, so it is much better to think in terms of a generic function \(f\) that receives a double and returns a double.
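For instance, hardcoding a second function immediately forces a second, nearly identical implementation (the names below are just illustrative choices):

#include <cmath>

// One hardcoded derivative per function: this does not scale.
double forward_diff_sin(double x, double h) {
    return (std::sin(x + h) - std::sin(x))/h;
}

double forward_diff_cos(double x, double h) {
    return (std::cos(x + h) - std::cos(x))/h;
}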

A better estimate with less error: central difference#

We have used the forward derivative, but there is nothing special about it. What if we played with the expansions both forward and backward?

If you compute a forward (\(+h\)) expansion, you have

(4)#\[\begin{equation} f(x+h) = f(x) + f'(x) h + f''(x) h^2/2 + f'''(x) h^3/6 + f''''(x) h^4/24 + O(h^5), \end{equation}\]

and a backward (\(-h\)) Taylor expansion,

(5)#\[\begin{equation} f(x-h) = f(x) - f'(x) h + f''(x) h^2/2 - f'''(x) h^3/6 + f''''(x) h^4/24 + O(h^5), \end{equation}\]

and then you subtract the second from the first, so that the even-order terms cancel (notice that adding them instead cancels the odd-order terms and allows you to compute the second-order derivative, \(f''(x) \simeq [f(x+h) - 2f(x) + f(x-h)]/h^2\)); dividing by \(2h\), you get

\[f'(x) \simeq \frac{f(x+h) - f(x-h)}{2h} + O(h^2)\]

which is called the central difference. The order is improved to \(O(h^2)\), so this version is better than the simple forward difference. See: https://en.wikipedia.org/wiki/Finite_difference
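A minimal sketch in the same hardcoded style as before (central_diff_sin is again an illustrative name):

#include <cmath>

// Central difference hardcoded to the sin function.
double central_diff_sin(double x, double h) {
    return (std::sin(x + h) - std::sin(x - h))/(2*h);
}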

Comparison among methods#

Dependence on \(h\)#

Figure: forward, backward, and central difference approximations to the derivative: https://upload.wikimedia.org/wikipedia/commons/thumb/9/90/Finite_difference_method.svg/614px-Finite_difference_method.svg.png

Figure: absolute error of numerical differentiation as a function of \(h\): https://upload.wikimedia.org/wikipedia/commons/thumb/4/41/AbsoluteErrorNumericalDifferentiationExample.png/600px-AbsoluteErrorNumericalDifferentiationExample.png

Other important cases#

  • Interpolating data and then estimating the derivative

  • Derivative of noisy data: first compute binned averages and then differentiate.

  • Higher order derivatives

  • N-dimensional derivatives.

Implementing forward and central derivatives#

Implement functions to compute both the forward and central derivatives.

The prototypes might be

double forward_diff(double x, double h);
double central_diff(double x, double h);

Use a modular implementation.
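A possible sketch for these prototypes, hardcoding \(f(x) = \sin x\) as the function to be differentiated (any other function would follow the same pattern):

#include <cmath>

double f(double x) { return std::sin(x); } // hardcoded for now

double forward_diff(double x, double h) {
    return (f(x + h) - f(x))/h;
}

double central_diff(double x, double h) {
    return (f(x + h) - f(x - h))/(2*h);
}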

Error as a function of \(h\)#

Now fix \(x = 1.454335\), and compute (using another main function) the relative error as a function of \(h \in [10^{-16}, 10^{-15}, 10^{-14}, \ldots, 10^{-3}, 10^{-2}, 10^{-1}]\). Plot it. Explain the differences between the forward and central results. A possible starting point is sketched below.
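A minimal sketch of such a main function, assuming the forward_diff and central_diff sketches above with \(f(x) = \sin x\) (so the exact derivative is \(\cos x\)); redirect the output to data.txt for plotting:

#include <cmath>
#include <iostream>

double f(double x) { return std::sin(x); }
double forward_diff(double x, double h) { return (f(x + h) - f(x))/h; }
double central_diff(double x, double h) { return (f(x + h) - f(x - h))/(2*h); }

int main() {
    const double x = 1.454335;
    const double exact = std::cos(x); // exact derivative of sin
    std::cout.precision(16);
    std::cout.setf(std::ios::scientific);
    // columns: h, forward relative error, central relative error
    for (int k = -16; k <= -1; ++k) {
        const double h = std::pow(10.0, k);
        const double ef = std::fabs(forward_diff(x, h) - exact)/std::fabs(exact);
        const double ec = std::fabs(central_diff(x, h) - exact)/std::fabs(exact);
        std::cout << h << "\t" << ef << "\t" << ec << "\n";
    }
    return 0;
}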

You can plot using the following python script, but you have to adapt it and improve it:

import matplotlib.pyplot as plt
import numpy as np
# import seaborn as sns
# sns.set_context("poster")

# read data
x, y = np.loadtxt("data.txt", unpack=True)

# plot 
fig, ax = plt.subplots()
ax.plot(x, y, '-o', label='mylabel')
ax.set_xscale('log')  # log scales help: h and the error span many orders of magnitude
ax.set_yscale('log')
ax.legend()
fig.savefig('deriv.pdf')

File modularization#

Now try to modularize the code. Create header and implementation files called derivatives.h and derivatives.cpp, where you are going to put the declarations and implementations for both the forward and the central derivative methods. What happens now? Where should the function to be differentiated go? We need a more generic approach. Use using fptr = double(double), a function type alias which represents what a function receives and returns (used as a parameter, it behaves like a function pointer). The ideal is to pass the function as a parameter:

using fptr = double(double); // alias (C++11) for a function which receives a double and returns a double
double forward_diff(double x, double h, fptr fun);
double central_diff(double x, double h, fptr fun);

We can even go further and try to use C++ templates, so we will have

template <class funptr>
double forward_diff(double x, double h, funptr fun);
template <class funptr>
double central_diff(double x, double h, funptr fun);

Or even use functors or std::function. For now let’s just keep this simple with using.
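A sketch of how the fptr version can be implemented and called; a small named wrapper around std::sin is used because std::sin itself is overloaded and cannot always be passed directly:

#include <cmath>
#include <iostream>

using fptr = double(double);

double forward_diff(double x, double h, fptr fun) {
    return (fun(x + h) - fun(x))/h;
}

double central_diff(double x, double h, fptr fun) {
    return (fun(x + h) - fun(x - h))/(2*h);
}

double mysin(double x) { return std::sin(x); } // plain wrapper, no overload ambiguity

int main() {
    std::cout << forward_diff(1.0, 0.01, mysin) << "\n";
    std::cout << central_diff(1.0, 0.01, mysin) << "\n";
    return 0;
}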

Testing a first implementation#

Compute the derivative as a function of \(x\), with fixed \(h = 0.01\). Use \(f(x) = \sin(x^2)\). Compare with the exact derivative. Plot the relative difference.
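A minimal sketch using the fptr-based central_diff above; the exact derivative of \(\sin(x^2)\) is \(2x\cos(x^2)\), and the scan range \([0.1, 10]\) is an arbitrary choice (starting at \(0.1\) avoids dividing by the exact derivative, which vanishes at \(x = 0\)):

#include <cmath>
#include <iostream>

using fptr = double(double);

double central_diff(double x, double h, fptr fun) {
    return (fun(x + h) - fun(x - h))/(2*h);
}

double f(double x) { return std::sin(x*x); }

int main() {
    const double h = 0.01;
    std::cout.precision(16);
    std::cout.setf(std::ios::scientific);
    for (double x = 0.1; x <= 10.0; x += 0.1) {
        const double exact = 2*x*std::cos(x*x); // exact derivative
        const double rel = std::fabs(central_diff(x, h, f) - exact)/std::fabs(exact);
        std::cout << x << "\t" << rel << "\n";
    }
    return 0;
}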

Error as a function of both \(h\) and \(x\)#

Is the error also dependent on \(x\)? Explore variations of both \(x\) and \(h\) and plot the result in a 3D picture, as in the sketch below.
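A compact sketch of the double loop that generates the surface data (columns: \(x\), \(h\), relative error), reusing the same f and central_diff as above; the output can then be fed to a 3D plotting tool of your choice:

#include <cmath>
#include <iostream>

using fptr = double(double);

double central_diff(double x, double h, fptr fun) {
    return (fun(x + h) - fun(x - h))/(2*h);
}

double f(double x) { return std::sin(x*x); }

int main() {
    std::cout.precision(16);
    std::cout.setf(std::ios::scientific);
    for (double x = 0.1; x <= 10.0; x += 0.1) {
        const double exact = 2*x*std::cos(x*x);
        for (int k = -16; k <= -1; ++k) {
            const double h = std::pow(10.0, k);
            const double rel = std::fabs(central_diff(x, h, f) - exact)/std::fabs(exact);
            std::cout << x << "\t" << h << "\t" << rel << "\n";
        }
    }
    return 0;
}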

Richardson extrapolation#

By using Richardson extrapolation, you can improve the central difference estimate even further,

(6)#\[\begin{equation} f'(x) = \phi\left(\frac{h}{2}\right) + \frac{1}{3}\left[\phi\left(\frac{h}{2} \right) - \phi(h) \right] + O(h^4), \end{equation}\]

where \(\phi(h)\) is the derivative estimate obtained from the central difference with a given \(h\) value. NOTE: This form is only valid for the central difference, since the coefficient depends on the algorithm order.
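Equation (6) translates almost literally into code; a sketch, assuming the fptr-based central_diff from before (richardson_central is an illustrative name):

#include <cmath>

using fptr = double(double);

double central_diff(double x, double h, fptr fun) {
    return (fun(x + h) - fun(x - h))/(2*h);
}

// Richardson extrapolation specialized to the central difference (order 2).
double richardson_central(double x, double h, fptr fun) {
    const double phi_h  = central_diff(x, h, fun);
    const double phi_h2 = central_diff(x, h/2, fun);
    return phi_h2 + (phi_h2 - phi_h)/3.0; // error improves to O(h^4)
}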

Implementing Richardson extrapolation#

Now, use the general expression for Richardson extrapolation applied to an algorithm of order \(O(h^\alpha)\), and also plot its relative differences for fixed \(x\) (see https://en.wikipedia.org/wiki/Richardson_extrapolation),

(7)#\[\begin{equation} I \simeq \dfrac{t^\alpha I(\frac{h}{t}) - I(h)}{t^\alpha - 1}, \end{equation}\]

where \(t\) is a factor larger than 1 (typically \(t=2\)), \(\alpha\) is the algorithm order, and \(I\) is the quantity you are computing. This implies additions to both the header and implementation files, and a new main.
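A sketch of the general formula, passing the underlying algorithm as a function pointer; the parameter names, the default \(t = 2\), and this raw-pointer signature are illustrative choices (the section below formalizes the idea with std::function):

#include <cmath>

using fptr = double(double);

// Generic Richardson extrapolation: alg(x, h, fun) plays the role of I(h),
// t is the step reduction factor, alpha is the algorithm order.
double richardson(double x, double h, fptr fun,
                  double (*alg)(double, double, fptr),
                  int alpha, double t = 2.0) {
    const double I_h  = alg(x, h, fun);
    const double I_ht = alg(x, h/t, fun);
    const double ta = std::pow(t, alpha);
    return (ta*I_ht - I_h)/(ta - 1.0);
}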

As you can see, the Richardson implementation is very similar for both the central and the forward algorithms, so maybe there is some hope of removing this duplication. Think about a new using ... construct (like the algorithm parameter sketched above). Implement it and plot all the errors as a function of \(h\).

Modularizing the files with Richardson extrapolation#

Until now, we have used the simple using fptr = double(double) to model a generic function to be differentiated. In the case of Richardson extrapolation, we have two generic components:

  • the actual function to be differentiated,

  • the actual algorithm to be used (central, forward, etc).

Our simple alias approach runs into trouble here: nesting one function-type alias inside another gets awkward, and the alias does not accept other kinds of callables, such as functors or lambdas with captures. Therefore, we need a more generic approach. Using std::function, which is a general wrapper for any object behaving like a function (you need to #include <functional>), we can solve this and modularize our code.

Our files will look like the following (fill in the // todo)

derivatives.h: To store the declarations

#pragma once

#include <functional>
#include <cmath>

using fptr = std::function<double(double)>;

double deriv_central(fptr f, double x, double h);
double deriv_forward(fptr f, double x, double h);

// notice the fptr inside the algptr
using algptr = std::function<double(fptr, double, double)>;

double richardson(fptr f, double x, double h, algptr alg, int n); // n is the algorithm order

derivatives.cpp:

#include "deriv.h"

double deriv_central(fptr f, double x, double h) {
	// Central difference formula
    // todo
}

double deriv_forward(fptr f, double x, double h) {
	// Forward difference formula
    // todo
}

double richardson(fptr f, double x, double h, algptr alg, int n) // n is the algorithm order
{
    // todo
}

and our main function file main_deriv.cpp

#include <iostream>

#include "deriv.h"

double fun(double x) {
    return x*x*std::sin(x);
}

int main(int argc, char **argv) {
    std::cout.precision(16);
    std::cout.setf(std::ios::scientific);
	std::cout << deriv_forward(fun, 1.0, 0.01) << std::endl;
	std::cout << deriv_central(fun, 1.0, 0.01) << std::endl;
	std::cout << deriv_central([](double x){ return x*x*std::sin(x);}, 1.0, 0.01) << std::endl; // works for a lambda function
	std::cout << richardson(fun, 1.0, 0.01, deriv_central, 2) << std::endl;

	return 0;
}

To compile, use either full compilation (easy but not practical with many cpp files)

g++ -std=c++17 derivatives.cpp main_deriv.cpp

or the modularized one, which can be automated using a Makefile

g++ -std=c++17 -c derivatives.cpp # Produces the object derivatives.o
g++ -std=c++17 -c main_deriv.cpp # Produces the object main_deriv.o
g++ -std=c++17 derivatives.o main_deriv.o # links both objects and produces the executable

Testing#

Can you think of some tests to write for your derivatives? Implement some tests in another main file. A minimal example is sketched below.
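For instance, a minimal hand-rolled test against a known derivative (a testing framework such as Catch2 would be a more robust option; the tolerances below are illustrative, chosen from the expected orders of the methods):

#include <cmath>
#include <iostream>

#include "derivatives.h"

int main() {
    // Known case: d/dx sin(x) = cos(x) at x = 1.
    const double x = 1.0, h = 0.001;
    const double exact = std::cos(x);

    bool ok = true;
    // Forward difference has O(h) error; central has O(h^2).
    ok = ok && std::fabs(deriv_forward([](double u){ return std::sin(u); }, x, h) - exact) < 1.0e-2;
    ok = ok && std::fabs(deriv_central([](double u){ return std::sin(u); }, x, h) - exact) < 1.0e-5;

    std::cout << (ok ? "All tests passed" : "Some test FAILED") << "\n";
    return ok ? 0 : 1;
}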