Introduction
This page is about numerical differentiation of a noisy data or functions. Its first part briefly summarizes results from the other articles I wrote (see menu on the right) covering standard approaches of computing derivative numerically. Text is self-contained, so you don’t have to read all other articles for understanding.
Description begins with analysis of well known central differences establishing reasons of its bad noise suppression capabilities.
Then I consider differentiators with improved noise suppression derived by classical method of least-squares smoothing. Such filters are known as low-noise Lanczos differentiators or Savitzky-Golay filters. I briefly sketch its properties and main drawback – “wavy” always-non-zero response on noisy high frequencies.
Finally I propose my original filters for numerical differentiation with guaranteed noise suppression and very efficient computational structure (coefficients are fractions with the denominators of power of 2). You can skip directly to the table of final formulas.
Prior methods
Classical approach of computing derivative numerically relies on approximation of
near target point
by some polynomial
. If neighborhood of
is chosen properly then
approximates
well on it and we can assume that
.
Particular methods can be derived by using different approximation technique of
by
in this general framework.
Central differences are obtained by interpolation of
by
. Since interpolation lacks for high-frequencies suppression they work well only for noiseless functions whose values can be computed precisely. In the case of noisy functions/data central differences give inadequate results.
Red dashed line is the response of an ideal differentiator
. More detailed analysis on their properties can be found on central differences page.
To resolve this issue Lanczos differentiators (Savitzky-Golay filters) use smoothing least-squares approximation of
by
to remove noise from the data. As a result such method is much more robust to noise. However due to least-squares nature Lanczos differentiators cannot guarantee complete noise suppression in high frequencies. Its magnitude response doesn’t go smoothly to zero near
. Instead they resemble “wavy” always-non-zero response on high frequencies.
Although amplitude of waves is decreasing for longer filters in order to achieve acceptable suppression one should choose very long filters. See Low-noise Lanczos differentiators page for more details.
From signal processing point of view differentiators are anti-symmetric digital filters with finite impulse response of Type III. There are powerful methods for its construction: based on windowed Fourier series, frequency sampling, etc. Parks-McClellan algorithm is one of the best among others. The main idea behind them is to construct filter from the required magnitude response. Although such algorithms are capable to generate near-optimal solution they spread error (deviance from the target magnitude response) uniformly in passband (low frequencies) and stopband (high frequencies). On the contrary differentiator needs to be very precise on low frequencies. Moreover filters designed by standard frequency-based methods lack for crucial time-domain characteristic – exactness on polynomials.
Task statement
To summarize, desired numerical derivative computation schema (filter) should posses following properties:
- Exactness on polynomials.
- Preciseness on low frequencies.
- Smooth and guaranteed suppression of high frequencies (to be noise robust).
Additionally it should have computationally favorable structure to be effectively applied in practice. This page is devoted to development of such method (skip to the table of final results).
Derivation
Let’s denote filter length as
(odd), filter’s coefficients as
, function values sampled at
equidistant points around
with some step
as:

Then numerical derivative can be written in general form as

As we said before,
is anti-symmetric filter of Type III. Its frequency response is (
):

Our goal is to select coefficients
such that
will be as close as possible to the response of an ideal differentiator
in low frequency region and smoothly tend to zero towards highest frequency
.
The most intuitive way of doing this is to force
to have high tangency order with
at
as well as high tangency order with
axis at
. This leads us to the system of linear equations against
:

Let’s consider particular example for
and
:

Finally we arrive at the simple solution
. In the same way we can obtain differentiators for any
.
As it can be clearly seen tangency of
with response of ideal differentiator at
is equivalent to exactness on monomials up to corresponding degree:
. Thus this condition reformulates exactness on polynomials in terms of frequency domain. So, we don’t need to impose additional requirements to guarantee this valuable property.
Results
Below we present coefficients for the case when
and
are chosen to make quantity of unknowns to be equal to the quantity of equations.
Differentiator of any filter length
can be written as:

where
![<br />
\displaystyle {c_k = \frac{1}{2^{2m+1}}\left[{2m\choose m-k+1}-{2m\choose m-k-1}\right]},\quad \displaystyle{m=\frac{N-3}{2}},\quad M=\frac{N-1}{2}<br />
<br />
\displaystyle {c_k = \frac{1}{2^{2m+1}}\left[{2m\choose m-k+1}-{2m\choose m-k-1}\right]},\quad \displaystyle{m=\frac{N-3}{2}},\quad M=\frac{N-1}{2}<br />](http://www.holoborodko.com/pavel/wp-content/ql-cache/quicklatex-949454708eb234d390aac25c9eb44c5a.gif)
Interestingly that number sequence in numerator (formed by expression in the square brackets) is well known in other areas of mathematics. Please check “On-Line Encyclopedia of Integer Sequences” for more information: A039598, A050166.
Frequency-domain characteristics for the differentiators are drawn below. Red dashed line is the response of ideal differentiator
.
Besides guaranteed noise suppression smooth differentiators have efficient computational structure. Costly floating-point division can be completely avoided. Denominator is a power of
and with appropriate selection of
( h=1/SamplingRate for digital data) division can be replaced by fast shift operation.
Shift operation doesn’t introduce any rounding error (only exponent is altered, mantissa remains the same) comparing to plain division. As a consequence smooth differentiators are not only computationally efficient but also capable to give more accurate results comparing to other methods (Savitzky-Golay filters, etc.).
Also smooth differentiators can be effectively implemented using fixed point (e.g. for robust edge detection in images).
If we will require tangency of 4th order at
with
(which is equivalent to exactness on polynomials up to 4th degree) following filters are obtained:
These filters also show smooth noise suppression with extended passband. As filter length
grows it smoothly zeroing frequencies starting from the highest towards the lowest.
Extensions
There are several obvious extensions of proposed smooth differentiators: second(and higher) order filters, forward(without time lag) filters, 2D smooth differentiators of any order, band-pass differentiators, etc. Here I present only second order smooth differentiators with their properties. I will post other extensions upon request.
Disclaimer
You can use materials from this site in any legal activity (research, teaching, engineering, etc.) freely on the basis of respecting my authorship by giving clear statement and reference to my site or particular page on it in your work. Brief description of your project is welcome in the comments below. One point though, if you use my materials in commercial project please consider supporting my site and me by donation.












25 Comments
Hi Pavel.
Nice work. I’m using it now to compute the velocity of a robot (MBARS) and your methods give very good results.
I’d like to know if you have the formula of a one-sided version, as using a centered version forces me to introduce a time lag.
Thanks, Yoel Shapiro.
Hi Yoel.
I am very grateful for your feedback.
It is very exciting for me to know about real-world applications using my work.
Noise robust differentiators without time delay (one-sided or forward filters):
(As always, longer impulse response means better noise suppression.)
Precise on 1, x:
1/4 (h[0] = 1, h[-1] = 1, h[-2] = -1, h[-3] = -1)
1/8 (h[0] = 1, h[-1] = 2, h[-2] = 0, h[-3] = -2, h[-4] = -1)
1/16 (h[0] = 1, h[-1] = 3, h[-2] = 2, h[-3] = -2, h[-4] = -3, h[-5] = -1)
1/32 (h[0] = 1, h[-1] = 4, h[-2] = 5, h[-3] = 0, h[-4] = -5, h[-5] = -4, h[-6] = -1)
1/64 (h[0] = 1, h[-1] = 5, h[-2] = 9, h[-3] = 5, h[-4] = -5, h[-5] = -9, h[-6] = -5, h[-7] = -1)
Precise on 1, x, x^2:
h[0] = 5/8, h[-1] = 1/4, h[-2] = -1, h[-3] = -1/4, h[-4] = 3/8
h[0] = 3/8, h[-1] = 1/2, h[-2] = -1/2, h[-3] = -3/4, h[-4] = 1/8, h[-5] = 1/4
h[0] = 7/32, h[-1] = 1/2, h[-2] = -1/32, h[-3] = -3/4, h[-4] = -11/32, h[-5] = 1/4, h[-6] = 5/32
h[0] = 1/8, h[-1] = 13/32, h[-2] = 1/4, h[-3] = -15/32, h[-4] = -5/8, h[-5] = -1/32, h[-6] = 1/4, h[-7] = 3/32
One-sided filters have several disadvantages comparing to centered versions.
For example, to achieve the same noise suppression one-sided filter
should be much longer than centered one. Also they strongly amplify noise
in the mid-frequency range.
It would be nice if you would say me which particular centered filter
works the best for you. Then I can try to design one-sided filter with the similar properties.
Let me know about the results.
Also I would gladly make reference on your project – web page, journal paper, etc. if you would provide me such info.
Hi Pavel
Thanks again.
I experimented with a few filters of yours and found the smooth noise-robust estimator with n=4, N=7 to be satisfactory.
I’m about to hand in my thesis soon, but haven’t published yet. I can contact you later and you can reference my thesis if that helps.
you can check this to see what I’m working on:
http://brml.technion.ac.il/show_project.php?id=14&type=1
I used your filter to damp the motion of my controller (PHANToM Desktop).
if you want to stay in contact better address me at my email.
Cheers, Yoel.
Hi Pavel:
This is an excellent work!
Could you please post 3rd order for both central difference and smooth noise robust (N=5, 7 or 9)?
I’d like to study the difference (enhancement) between these two.
How to estimate the error, O(h^n)?
Thanks
Hi,
thanks for the comment.
Estimation of the error can be done by plug in Taylor series (evaluated at corresponding points
) instead of
in the formulas.
I’ll publish third order formulas shortly.
I cannot imagine the practical need of 3rd order numerical approximation of the derivative though. Could you please enlighten me – what kind of task are you solving?
1. Third order smooth differentiators of
approximation error:
2. Third order smooth differentiators of
approximation error:
Hello Pavel,
I tried your method on some very noisy surface scan data collected from a coordinate measurement machine and it gave impressive results using the n=4, 11 point filter. However, I believe that I need something more than 11 points to get the results I need. Can you publish the general form of this particular filter or at the very least post the N=19 coefficients?
Thank you very much.
Hello!
Thank you for your interest in my results!
Here is n=4, N=19 filter:
where
Let me know about the test results. I’ll publish filters for other N if you will need them.
I think for your task (noisy surface data) you need to use 2D smooth differentiators.
They would perform much better than 1D filters. That is because two dimensional data have noise in diagonal frequencies which cannot be suppressed well by 1D filters by definition.
Only 2D filters are capable to do that and filter out noise in all directions uniformly (isotropic noise suppression). However it is not easy to build such smooth noise robust filters for 2D and it requires considerable efforts and time.
If you are targeting commercial applications I would be grateful if you would consider supporting me in this task.
Thank you!
Pavel,
Thank you for your quick response. In our case ,the surface data was collected by scanning curves on the surface so a 1D filter is working perfectly. We are still in the research stage at this point and have not made a decision as to how to proceed.
On a slightly different topic, can your work be applied to the data itself?
Regards,
Steve
Yes, these filters can be applied to the raw data itself. One point though – data should be uniformly spaced.
How do you apply them now? Do you process data somehow beforehand? Smoothing? Curve fitting (you would not need numerical derivative in this case)?
I have some (rather little) experience in processing of data from a coordinate measurement machine and IMHO following workflow is preferable:
Smoothing filters with guaranteed noise suppression (similar to the described derivative filters) can be applied on the first stage of data processing pipeline to prepare data for all range of other operations – finding surface characteristics (first and second derivatives to estimate curvature, normal vectors, etc.), fitting smooth analytical surface representation (by splines) or else.
In this method you remove noise from data once and forever, using standard algorithms for further operations without worrying about noise anymore.
Does it have sense?
Hi Pavel,
Now I’m using n=2 and N=5 in the 1D form of your smooth noise-robus, which already gives impressive results for edge detection on medical images. I would be interested to use your smooth noise-robust differentiator in the 2D form, to improve the analysis. Could you be so kind to post this?
Regards,
Astrid
Hi, thanks for the interest in my work.
I didn’t finish research on 2D filters yet. Please use 1D filters for the moment. They estimate gradient components in 2D well.
I’ll get in touch once I will be ready to share essential 2D filters for edge detection.
Cheers!
I’ve published 2D noise robust derivative filters here: Edge Detection.
First is similar to n=2, N=5 but has isotropic noise suppression. Second is more noise robust, corresponds to n=2, N=7.
Give it a try and let me know about the results please.
Pavel – I am looking at some data where there is noise but where looking at the derivative profile is very informative and so I am interested in your approach. Two questions:
1. have you published any papers on this and if so can you give me the references?
2. My data starts off with h=1 then chnages to h=5 and then for the last portion h=15. Any suggestions as to how I could handle the changes to h when i us eyour formulas?
Thanks
I have not published paper yet, although I plan to do so. You can ask me any questions by commenting or by e-mail if something unclear in the above article. Also you can reference this webpage in your paper.
In case of step change simple linear interpolation could be used to derive missing uniformly spaced data on the boundary. Then it could be plugged into differentiator. I’m sure it will give satisfactory results.
Thanks for the quick reply. What I did was to divide each component of the calculation say
[y(i+k) - y(i-k)] by [x(i+k) - x(i-k)]/k
so that in effect each y contribution is divided by the x distance between the values of the contribution.
With this approach the transition from one h value to the enxt h value was far less obvious.
Any comments?
I think your idea is very good. Actually your idea is exactly what I needed to extend my differentiators on irregular spaced data!
One note though, it is better to divide [y(i+k) - y(i-k)] by [x(i+k) - x(i-k)]/(2*k).
In this case formulas will give precise derivative estimation for any data spacing. Also formulas will coincide with referenced above for uniform data.
To sum up, general form of a differentiator for irregular spaced data is:
with the same
Thank you very much!
You are correct – I had used 2*k but had mis-typed.
Thanks
Pavel,
thank you very much for posting this project online, very useful indeed. I have been using Savitsky-Golay differentiators for ages, but faced recently a new class of problems requiring more serious attention to the details. In particular, I need minimal delays even at the expense of quality, because the results are used for real-time control system. The quality of the smoothing can be compensated for later in time, but the delay limits the response efficiency in principal. In addition to differentiators I need to smooth signal itself as well. I am curious if you experimented and/or can I suggest one-sided smoothers with N around, say 10-15. I will be more than happy to tell you about the application where these smoothers play a very important role in saving lives.
Best regards,
Michael (Михаил)
Are there two dimensional versions of the smooth noise robust differentiators?
Yes, check this page: Edge detection. So far I’ve published only two filters there. More to come shortly.
Thank you very much, Pavel – hugely appreciated. This is precisely what I was looking for!
Looks like great work. I am interested in implementing a general form of this differentiating filter in Matlab function (since I couldn’t find one you had already made) and believe that you missed a k in the last equation you give in the example for the N=5, n=2 and m=1 case. Is that correct and do you mind if I continue working on the Matlab function?
Thanks,
Nate
Yes, you are right, I missed k. I’ve fixed it. Thank you!
As far as I know these filters are not implemented as Matlab function yet. So, I think many people would be very grateful if you would do this and share source code. I will give link to your web page with the files or I can make them available for download through this page if you want.
Thank you for your interest and efforts!
Pavel,
Thanks that would be nice. I currently have a code that is working but for n values other than n=2 where I didn’t know a closed form solution I begin to run into numerical issues for large n values because I do not know how to directly implement the shift operation in Matlab. Therefore, if you could post or email me a code so I can see how you approach these problems I would appreciate it.
Thanks,
Nate