The most common way of computing numerical derivative of a function
at any point
is to approximate
by some polynomial
in the neighborhood of
. It is expected that if selected neighborhood of
is sufficiently small then
approximates
near
well and we can assume that
.
Let’s consider this approach in details (or go directly to the table of formulas).
At first, we sample
at the
(
is odd) equidistant points around
:
,
where
is some step.
Then we interpolate points
by polynomial of
degree:
. Its coefficients
are found as a solution of system of linear equations:

By our assumption
can be approximated by the derivative of the constructed interpolating polynomial:

To illustrate the process let’s consider case when
. Here interpolating parabola are defined by the system (we assume
for simplicity):

After subtracting first equation from the last we get final result:

In the same way we can obtain expressions for any
. Formulas for
are listed below:
These expressions are very widely used in numerical analysis and commonly refered as central(finite) differences. As it can be clearly seen they have simple anti-symmetric structure and in general difference of
-th order can be written as:
,
where
are coefficients derived by procedure described above.
It is easy to see that if
is a polynomial of a degree
, then central differences of order
give precise values for
derivative at any point. This follows from the fact that central differences are result of approximating
by polynomial. If
is a polynomial itself then approximation is exact and differences give absolutely precise answer.
To differentiate a digital signal
we need to use h=1/SamplingRate and replace
by
in the expressions above. In this case derivative of a signal
is found by

Frequency response of central differences is:

Magnitude responses for
are drawn below:
Red dashed line is the magnitude response of an ideal differentiator
. As
grows
moves closer to ideal differentiator response (by increasing tangency order with
at
).
In practice there is no need for ideal differentiators because usually signals contain noise at high frequencies which should be suppresed. This means that response of practical differentiator should be close to
on some passband interval
and should tend to zero in the upper part of Nyquist interval
.
From the plot we can see that central differences don’t resemble such behavior, all they care about is to get as closer as possible to the response of ideal differentiator, without supression of noisy high frequencies. As a consequence they perform well only on exact values, which contain no noise. Different technique is needed for robust derivative estimation of noisy signals. Such approaches are considered in the next sections:




17 Comments
Hi Pavel, great tutorial!
How can this be done for the second derivatives f”(x) and also for the 2D case , especially how can we approximate d^2 f(x,y) / dxdy with different orders of approximation
Thank you!
Thanks, Kem.
1.
is very simple to derive. We use the same interpolating polynomial and assume that
. Second order central differences derived this way are:
3. Third order central differences are:
2. Estimation of the mixed second order derivative
is a little more elaborate but still follows the same idea. In this case we should build interpolating polynomial over the 2D grid
and find its mixed derivative at the point
assuming that it is approximately equal to
.
Here are reference formulas for mixed second order differences:
For the
order of approximation:
For the
order of approximation:
anyone know about the derivation of the third order central differences formula?
Hi Pavel –
Do you have a published reference for the O(h^4) approximation for the mixed 2nd partial derivatives? Thanks.
Here is formula for the
order of approximation:
Hi Pavel -
Thanks for your quick reply. I read that you had posted that same formula earlier … in fact, it’s the earlier post that prompted me to write. I’m looking for a published reference (book or journal article) that contains the formula so that I can cite the reference in a paper I’m writing. Thanks.
Best regards,
Chris Willy
Well, I’ve derived this formula by myself. I have no idea is it published somewhere or not.
You can reference my site – it would be of huge support for me. Several published papers already include references to my site – so I think there is no problem.
You can tell me more about your task, maybe I can derive more suitable filters for your conditions.
Hello Pavel,
Congratulations for your good and well organised work. I’m not mathematic and I’m writing an algorithm to derivate a discrete function. My set of points are not equidistant, so they have x values that are no constant. I read your advanced work about derivatives for noisy functions and probably I will use it in a future.
For the moment, I will derivate with central differences method. I wonder if it’s easy to extend the method when points are not equidistant. Is it ok to divide by the x increment?
Thanks for your blog
Thank you for your comment!
Central difference extension for irregular spaced data is:
So, yes, difference of function values should be divided by the
Actually it is better to use
from smooth noise-robust differentiators, since they are much more numerically stable even in the absence of noise. Plus they have similar approximation properties to central differences – so you won’t lose any precision and get more stability as a bonus.
Coefficients are for
and for 
. See referenced page for other
.
Let me know about the results please.
Hello again Pavel,
Thank you very much for your fast and good answer.
Really, I have a noisy discrete function (points come from peak calculation of a laser stripe, acquired by a digital camera. I use the laser stripe to enhance profile of objects).
I need to smooth the function for display purpose and I need also to derivate the function in order to find inflexion points, local maxima and other features that I use to do measurements of the object that I’ve enhanced the profile.
At this moment I’m smoothing the function using Savitzky-Golay filter and I apply central differences to the smoothed function. I read your work and I realized that smooth noise-robust differentiators would be really useful to my work, but I don’t know how to use it to smooth the function for display purposes.
Thank you for the description of your project – it is really interesting.
I think your application could benefit from using SNRD filters. I will help you to apply them for smoothing (for visual output) and for finding derivative.
Just let me know what polynomial degree and number of points you are using now in Savitzky-Golay filter. Knowing that info I can build SNRD filters with better properties for your task.
Hello Pavel, I appreciate your help very much.
I’m smoothing my function with a third degree polynomial and I’m using 7 points. Once smoothed I’m applying central differences to calculate first and second derivative.
I’ve tryed to calculate first derivative by using SNRD filters on the Savitzy-Golay smoothed functions, and I’m really loosing important function detail, I’m smoothing too much, I think this is not a good way to proceed.
Thanks again for your interest and best regards from Spain.
Congratulations on the Victory!!
Thank you for the information about your test results.
I have several comments though.
I’ve checked properties of 3rd order 7 point SG and derive some special filters with better properties. I would appreciate if you would try them and give me feedback.
1. For your application I recommend to use following filters for smoothing (for visualization):
where coefficients
Both of them preserve important details better than SG and have soft and guaranteed noise suppression (
2. For derivative estimation I recommend to use
with
Important: do not apply differentiation filters after smoothing filters. This could hurt approximation order (as you experienced with SNRD after SG smoothing).
Apply derivative filter directly on noisy non-smoothed data.
Awesome tutorial.
I’m writing a GPU raytracer and need to estimate the gradient (∂/∂x, ∂/∂y, ∂/∂z) of a distance function (for surface normals). I assume that the function is expensive to evaluate.
Now I’m using central differences – these should get me O(h²) error with seven function samples (the center is always evaluated). I was wondering if I could get the same error order with fewer samples – some kind of sample sharing between the three axes (I’ve tried the tetrahedral arrangement, but it’s even worse than forward differences because of ∂x∂y-like terms).
Thanks.
Actually 7-points central difference has
. Three point stencil is enough to achieve
.
To get the lowest error with the smallest number of points in 3D one should apply 1D stencil along the corresponding axis. Only points along the required axis play major role for approximation order, others can be used for other goals (like noise suppression).
P.S.
I like your site. Some of your projects made me nostalgic about old good DOS-hacking times. Trees were taller and grass was greener back then…….
Thanks for the quick answer and for visiting my site
.
What I originally meant was using the three-point stencil along every axis (which makes 6 evaluations in total + one in the center for other unrelated purposes).
I’ve finally settled for 4-point forward differences (1+9 evaluations in total with error O(h³)):
.
Oh, sorry for misunderstanding. Glad you have found suitable solution.
Maybe I should include one sided differences in the tutorial.