This page is about numerical differentiation of a noisy data or functions. Description begins with analysis of well known central differences establishing reasons of its weak noise suppression properties.

Then we consider differentiators with improved noise suppression based on least-squares smoothing. Such filters are known as low-noise Lanczos differentiators or Savitzky-Golay filters. I briefly sketch its properties and main drawback – “wavy” always-non-zero response on noisy high frequencies.

Finally I propose my original filters for numerical differentiation with guaranteed noise suppression and very efficient computational structure (coefficients are fractions with the denominators of power of 2). You can skip directly to the table of final formulas.

**August 13, 2015**: Andrey Paramonov went further and extended the ideas to build similar filters for smoothing. Please read his article for more details: Noise-robust smoothing filter.

**November 10, 2011**: It is not technically correct to state that Savitzky-Golay filters are inferior to proposed ones. Actually both have situations when one outperforms another. Savitzky-Golay is optimal for Gaussian noise suppression whereas proposed filters are suitable when certain range of frequencies is corrupted with noise.

## Prior methods

Classical approach of computing derivative numerically relies on approximation of near target point by some polynomial . If neighborhood of is chosen properly then approximates well on it and we can assume that .

Particular methods can be derived by using different approximation technique of by in this general framework.

Central differences are obtained by interpolation of by . Since interpolation lacks for high-frequencies suppression they work well only for noiseless functions whose values can be computed precisely. In the case of noisy functions/data central differences give inadequate results.

Red dashed line is the response of an ideal differentiator . More detailed analysis on their properties can be found on central differences page.

To resolve this issue Lanczos differentiators (Savitzky-Golay filters) use smoothing least-squares approximation of by to remove noise from the data. As a result such method is much more robust to noise. However due to least-squares nature Lanczos differentiators cannot guarantee complete noise suppression in high frequencies. Its magnitude response doesn’t go smoothly to zero near . Instead they resemble “wavy” always-non-zero response on high frequencies.

Although amplitude of waves is decreasing for longer filters in order to achieve acceptable suppression one should choose very long filters. See Low-noise Lanczos differentiators page for more details.

From signal processing point of view differentiators are anti-symmetric digital filters with finite impulse response of Type III. There are powerful methods for its construction: based on windowed Fourier series, frequency sampling, etc. Parks-McClellan algorithm is one of the best among others. The main idea behind them is to construct filter from the required magnitude response. Although such algorithms are capable to generate near-optimal solution they spread error (deviance from the target magnitude response) uniformly in passband (low frequencies) and stopband (high frequencies). On the contrary differentiator needs to be very precise on low frequencies. Moreover filters designed by standard frequency-based methods lack for crucial time-domain characteristic – exactness on polynomials.

## Task statement

To summarize, desired numerical derivative computation schema (filter) should posses following properties:

- Exactness on polynomials.
- Preciseness on low frequencies.
- Smooth and guaranteed suppression of high frequencies (to be noise robust).

Additionally it should have computationally favorable structure to be effectively applied in practice. This page is devoted to development of such method (skip to the table of final results).

## Derivation

Let’s denote filter length as (odd), filter’s coefficients as , function values sampled at equidistant points around with some step as:

Then numerical derivative can be written in general form as

As we said before, is anti-symmetric filter of Type III. Its frequency response is ():

Our goal is to select coefficients such that will be as close as possible to the response of an ideal differentiator in low frequency region and smoothly tend to zero towards highest frequency .

The most intuitive way of doing this is to force to have high tangency order with at as well as high tangency order with axis at . This leads us to the system of linear equations against :

Let’s consider particular example for and :

Finally we arrive at the simple solution . In the same way we can obtain differentiators for any .

As it can be clearly seen tangency of with response of ideal differentiator at is equivalent to exactness on monomials up to corresponding degree: . Thus this condition reformulates exactness on polynomials in terms of frequency domain. So, we don’t need to impose additional requirements to guarantee this valuable property.

## Results

Below we present coefficients for the case when and are chosen to make quantity of unknowns to be equal to the quantity of equations.

Differentiator of any filter length can be written as:

where

Interestingly that number sequence in numerator (formed by expression in the square brackets) is well known in other areas of mathematics. Please check “On-Line Encyclopedia of Integer Sequences” for more information: A039598, A050166.

Frequency-domain characteristics for the differentiators are drawn below. Red dashed line is the response of ideal differentiator .

Besides guaranteed noise suppression smooth differentiators have efficient computational structure. Costly floating-point division can be completely avoided. Denominator is a power of and with appropriate selection of ( h=1/SamplingRate for digital data) division can be replaced by fast shift operation.

Shift operation doesn’t introduce any rounding error (only exponent is altered, mantissa remains the same) comparing to plain division. As a consequence smooth differentiators are not only computationally efficient but also capable to give more accurate results comparing to other methods (Savitzky-Golay filters, etc.).

Also smooth differentiators can be effectively implemented using fixed point (e.g. for robust edge detection in images).

If we will require tangency of 4th order at with (which is equivalent to exactness on polynomials up to 4th degree) following filters are obtained:

These filters also show smooth noise suppression with extended passband. As filter length grows it smoothly zeroing frequencies starting from the highest towards the lowest.

Proposed method can be easily extended to calculate numerical derivative for irregular spaced data. General formula is:

where coefficients are the same as for uniform data (can be chosen from the both tables above).

Another extension is filters using only past data or forward differentiators. Please check this report for more information:

One Sided Noise Robust Differentiators.pdf (170KB)

## Extensions

There are several obvious extensions of proposed smooth differentiators: second(and higher) order filters, forward(without time lag) filters, 2D smooth differentiators of any order, band-pass differentiators, etc. Here I present only second order smooth differentiators with their properties. I will post other extensions upon request.

Coefficients of these filters can be computed by simple recursive procedure for any . Also they can be easily extended for irregular spaced data as well as for one-sided derivative estimation when only past data are available. Please check this report for more information:

Noise Robust Second Derivative.pdf (75KB)

Contact me by e-mail if you are interested in more specific cases.

In case of higher approximating order we get following estimation.

These filters can be used to approximate second derivative for irregular spaced data as

where are coefficients from the formulas for uniform spaced data derived above.

### BibTeX

Here’s a BibTeX entry that you can use to cite smooth differentiators in your academic publications:

@misc{snrd,

author = {Pavel Holoborodko},

title = {Smooth Noise Robust Differentiators},

howpublished = {http://www.holoborodko.com/pavel/numerical-methods/numerical-derivative/smooth-low-noise-differentiators/}

year = {2008}

}

## Disclaimer

You can use materials from this site in any legal activity (research, teaching, engineering, etc.) freely on the basis of respecting my authorship by giving clear statement and reference to my site or particular page on it in your work. Brief description of your project is welcome in the comments below. One point though, if you use my materials in commercial project please consider supporting my site and me financially.

## 201 Comments

#Hi Pavel.

Nice work. I’m using it now to compute the velocity of a robot (MBARS) and your methods give very good results.

I’d like to know if you have the formula of a one-sided version, as using a centered version forces me to introduce a time lag.

Thanks, Yoel Shapiro.

#Hi Yoel.

I am very grateful for your feedback.

It is very exciting for me to know about real-world applications using my work.

Noise robust differentiators without time delay (one-sided or forward filters):

(As always, longer impulse response means better noise suppression.)

Precise on 1, x:

1/4 (h[0] = 1, h[-1] = 1, h[-2] = -1, h[-3] = -1)

1/8 (h[0] = 1, h[-1] = 2, h[-2] = 0, h[-3] = -2, h[-4] = -1)

1/16 (h[0] = 1, h[-1] = 3, h[-2] = 2, h[-3] = -2, h[-4] = -3, h[-5] = -1)

1/32 (h[0] = 1, h[-1] = 4, h[-2] = 5, h[-3] = 0, h[-4] = -5, h[-5] = -4, h[-6] = -1)

1/64 (h[0] = 1, h[-1] = 5, h[-2] = 9, h[-3] = 5, h[-4] = -5, h[-5] = -9, h[-6] = -5, h[-7] = -1)

Precise on 1, x, x^2:

h[0] = 5/8, h[-1] = 1/4, h[-2] = -1, h[-3] = -1/4, h[-4] = 3/8

h[0] = 3/8, h[-1] = 1/2, h[-2] = -1/2, h[-3] = -3/4, h[-4] = 1/8, h[-5] = 1/4

h[0] = 7/32, h[-1] = 1/2, h[-2] = -1/32, h[-3] = -3/4, h[-4] = -11/32, h[-5] = 1/4, h[-6] = 5/32

h[0] = 1/8, h[-1] = 13/32, h[-2] = 1/4, h[-3] = -15/32, h[-4] = -5/8, h[-5] = -1/32, h[-6] = 1/4, h[-7] = 3/32

One-sided filters have several disadvantages comparing to centered versions.

For example, to achieve the same noise suppression one-sided filter

should be much longer than centered one. Also they strongly amplify noise

in the mid-frequency range.

It would be nice if you would say me which particular centered filter

works the best for you. Then I can try to design one-sided filter with the similar properties.

Let me know about the results.

Also I would gladly make reference on your project – web page, journal paper, etc. if you would provide me such info.

#Pavel,

Thanks for the great treatment of this filter. I was wondering if you could explain how you derived the one-sided filters that you gave to Yoel in this comment. And am I correct that you are giving the coefficients for each element of your measurement history? So that the first filter mentioned in the second block of your comment, with length $N = 5$ and exact on $1$, $x$, and $x^2$, would be:

where $h$ is the step size.

Thank you very much!

Nathan

#Nathan, your understanding and formula is correct.

I used similar reasoning for building these filters as outlined on the page above.

But relaxing the tight noise suppression requirements, to minimize overshoot in mid-frequencies.

#Hello Pavel,

thank you for posting your great work. I am an university student working on DSP. I come across in a project that I need an one-sided differentiator and I found that your centered SNRD with n=4 , N=7 works great in my test. I would like to ask do you have some one-sided differentiators which behave similarly ?

Thanks in advance.

Best regards,

Jim

#Just want to add that I tried your one-sided differentiator formulas for being precise on 1, x, x^2, but they didn’t work well. It would be great if you can post the one-sided differentiator formulas for higher order of n =3 or 4.

Thanks in advance again.

Best regards,

Jim

#Hi Pavel

Thanks again.

I experimented with a few filters of yours and found the smooth noise-robust estimator with n=4, N=7 to be satisfactory.

I’m about to hand in my thesis soon, but haven’t published yet. I can contact you later and you can reference my thesis if that helps.

you can check this to see what I’m working on:

http://brml.technion.ac.il/show_project.php?id=14&type=1

I used your filter to damp the motion of my controller (PHANToM Desktop).

if you want to stay in contact better address me at my email.

Cheers, Yoel.

#Hi Pavel:

This is an excellent work!

Could you please post 3rd order for both central difference and smooth noise robust (N=5, 7 or 9)?

I’d like to study the difference (enhancement) between these two.

How to estimate the error, O(h^n)?

Thanks

#Hi,

thanks for the comment.

Estimation of the error can be done by plug in Taylor series (evaluated at corresponding points ) instead of in the formulas.

I’ll publish third order formulas shortly.

I cannot imagine the practical need of 3rd order numerical approximation of the derivative though. Could you please enlighten me – what kind of task are you solving?

#1. Third order smooth differentiators of approximation error:

2. Third order smooth differentiators of approximation error:

#Hello Pavel,

I tried your method on some very noisy surface scan data collected from a coordinate measurement machine and it gave impressive results using the n=4, 11 point filter. However, I believe that I need something more than 11 points to get the results I need. Can you publish the general form of this particular filter or at the very least post the N=19 coefficients?

Thank you very much.

#Hello!

Thank you for your interest in my results!

Here is n=4, N=19 filter:

where

Let me know about the test results. I’ll publish filters for other N if you will need them.

I think for your task (noisy surface data) you need to use 2D smooth differentiators.

They would perform much better than 1D filters. That is because two dimensional data have noise in diagonal frequencies which cannot be suppressed well by 1D filters by definition.

Only 2D filters are capable to do that and filter out noise in all directions uniformly (isotropic noise suppression). However it is not easy to build such smooth noise robust filters for 2D and it requires considerable efforts and time.

If you are targeting commercial applications I would be grateful if you would consider supporting me in this task.

Thank you!

#Pavel,

Thank you for your quick response. In our case ,the surface data was collected by scanning curves on the surface so a 1D filter is working perfectly. We are still in the research stage at this point and have not made a decision as to how to proceed.

On a slightly different topic, can your work be applied to the data itself?

Regards,

Steve

#Yes, these filters can be applied to the raw data itself. One point though – data should be uniformly spaced.

How do you apply them now? Do you process data somehow beforehand? Smoothing? Curve fitting (you would not need numerical derivative in this case)?

I have some (rather little) experience in processing of data from a coordinate measurement machine and IMHO following workflow is preferable:

Smoothing filters with guaranteed noise suppression (similar to the described derivative filters) can be applied on the first stage of data processing pipeline to prepare data for all range of other operations – finding surface characteristics (first and second derivatives to estimate curvature, normal vectors, etc.), fitting smooth analytical surface representation (by splines) or else.

In this method you remove noise from data once and forever, using standard algorithms for further operations without worrying about noise anymore.

Does it have sense?

#Hi Pavel,

Now I’m using n=2 and N=5 in the 1D form of your smooth noise-robus, which already gives impressive results for edge detection on medical images. I would be interested to use your smooth noise-robust differentiator in the 2D form, to improve the analysis. Could you be so kind to post this?

Regards,

Astrid

#Hi, thanks for the interest in my work.

I didn’t finish research on 2D filters yet. Please use 1D filters for the moment. They estimate gradient components in 2D well.

I’ll get in touch once I will be ready to share essential 2D filters for edge detection.

Cheers!

#I’ve published 2D noise robust derivative filters here: Edge Detection.

First is similar to n=2, N=5 but has isotropic noise suppression. Second is more noise robust, corresponds to n=2, N=7.

Give it a try and let me know about the results please.

#Pavel – I am looking at some data where there is noise but where looking at the derivative profile is very informative and so I am interested in your approach. Two questions:

1. have you published any papers on this and if so can you give me the references?

2. My data starts off with h=1 then chnages to h=5 and then for the last portion h=15. Any suggestions as to how I could handle the changes to h when i us eyour formulas?

Thanks

#I have not published paper yet, although I plan to do so. You can ask me any questions by commenting or by e-mail if something unclear in the above article. Also you can reference this webpage in your paper.

In case of step change simple linear interpolation could be used to derive missing uniformly spaced data on the boundary. Then it could be plugged into differentiator. I’m sure it will give satisfactory results.

#Thanks for the quick reply. What I did was to divide each component of the calculation say

[y(i+k) – y(i-k)] by [x(i+k) – x(i-k)]/k

so that in effect each y contribution is divided by the x distance between the values of the contribution.

With this approach the transition from one h value to the enxt h value was far less obvious.

Any comments?

#I think your idea is very good. Actually your idea is exactly what I needed to extend my differentiators on irregular spaced data!

One note though, it is better to divide [y(i+k) – y(i-k)] by [x(i+k) – x(i-k)]/(

2*k).In this case formulas will give precise derivative estimation for any data spacing. Also formulas will coincide with referenced above for uniform data.

To sum up, general form of a differentiator for irregular spaced data is:

with the same as explained in the article above.

Thank you very much!

#You are correct – I had used 2*k but had mis-typed.

Thanks

#I have been trying your differentiating filter for irregularly spaced data (two related variables, sampled periodically).

The formula based on Bernard McGarvey’s suggestion is susceptible to noise in x, particularly when $noise \approx \Delta x$ so that the denominator in the suggested formula can be $\approx 0$.

I suggest the following alternative that follows from applying ${dy \over dx} = {dy \over dn} / {dx \over dn}$ to the evenly spaced version, that gives noise immunity in both x and y.

As neither $x_i$ nor $f(x_i)$ has any effect on $f'(x_i)$, I suggest using the same coefficients to smooth the x values:

#Pavel,

thank you very much for posting this project online, very useful indeed. I have been using Savitsky-Golay differentiators for ages, but faced recently a new class of problems requiring more serious attention to the details. In particular, I need minimal delays even at the expense of quality, because the results are used for real-time control system. The quality of the smoothing can be compensated for later in time, but the delay limits the response efficiency in principal. In addition to differentiators I need to smooth signal itself as well. I am curious if you experimented and/or can I suggest one-sided smoothers with N around, say 10-15. I will be more than happy to tell you about the application where these smoothers play a very important role in saving lives.

Best regards,

Michael (Михаил)

#Are there two dimensional versions of the smooth noise robust differentiators?

#Yes, check this page: Edge detection. So far I’ve published only two filters there. More to come shortly.

#Thank you very much, Pavel – hugely appreciated. This is precisely what I was looking for!

#Looks like great work. I am interested in implementing a general form of this differentiating filter in Matlab function (since I couldn’t find one you had already made) and believe that you missed a k in the last equation you give in the example for the N=5, n=2 and m=1 case. Is that correct and do you mind if I continue working on the Matlab function?

Thanks,

Nate

#Yes, you are right, I missed k. I’ve fixed it. Thank you!

As far as I know these filters are not implemented as Matlab function yet. So, I think many people would be very grateful if you would do this and share source code. I will give link to your web page with the files or I can make them available for download through this page if you want.

Thank you for your interest and efforts!

#Pavel,

Thanks that would be nice. I currently have a code that is working but for n values other than n=2 where I didn’t know a closed form solution I begin to run into numerical issues for large n values because I do not know how to directly implement the shift operation in Matlab. Therefore, if you could post or email me a code so I can see how you approach these problems I would appreciate it.

Thanks,

Nate

#Good work!!!… hugely appreciated.

I’ m try your ideas and generate examples!!

#Hi Pavel,

This is quite interesting. One thing I like about Savitzky-Golay is the ability to choose some arbitrary lag count where the interpolation is done, rather than choosing only between central and forward operations. It sounds like your work can be extended to do that, as well. You mentioned antisymmetry as one constraint in the central case.

I’d be interested in knowing the complete set of contraints required to design coefficient sets with the additional degree of freedom I mentioned (choice of lag). I’m not as interested in knowing the coefficients themselves as the specifications used to generate them.

Thanks!

#Nice! These smoothing functions should be implemented in Scilab and Maxima as built in functions.

At this time we use the Savitsky-Golay algorithm for smoothing motion data before graphing but I would like to try the smooth noise robust filter. The first derivative coefficients were nicely documented. Are there formulas for the second and third derivative and for n=3,4 and 5?

I know that there are tables that display the coefficients but I would like to see the formulas or some method algorithm for implementing this smooth noise robust filter in Scilab or Maxima.

I would like to be able to do something like this

http://www.deltamotion.com/peter/Maxima/Savitzky-Golay.html

You can see I don’t have all the combinations worked out but it is easy for anyone to expand on what I have done to compute what they want.

Peter Nachtwey

#I am a beginner in this scope and do not know much on the use of filter. By the way, how to use your method to denoise the data?In the S-G filter, there is a col of coefficient in the matrix (g(:,1)) to reconstruct the smoothed data by multipy the g(:,1) with original data. In your filter, the first and second derivative can be used to detect the sudden change or break point, but I do not know how to use your method to smooth the noisy data. This is maybe a very simple to you, but it’s a difficult for me. Wish to get your help. Thanks

#Pavel,

This is a very interesting and refreshing approach. However, I’ve noticed few missing points which might be very helpful to clarify your results:

1. There is a generic expression for coefficients {Ck} but it is not clear how it was generated.

2. For centered filters the process is clear but what is you approach for asymmetrical (“forward”) filters?

3. You describe few second and third order centered filters but few words about the process would help a lot.

These questions are important for me and others who is working with real-time DSP (forward/backward filters) on irregular time intervals with high data frequency meaning working windows more then 11 points . And, unfortunately, I could not find anything published. I am also curious if you experimented with circular convolutions and mirrored data windows to solve deficiencies of one-sided filters you mentioned in one of your replies.

I understand you might not have time to respond with answers or descriptions of your method. Maybe it is not a big deal then to post one-sided first and second order filters for N>=19?

Thanks in advance

#Thanks a lot for the much appreciable work.

I applied the fisrt order formula for n = 2 and N = 7 . It misses a minus before the whole formula.

Luca

#These formulas look extremely useful for regular meshes. But will the basic premise also work on an irregular mesh where the step size is not constant? Of course I expect the formulas to get more complicated, but I am curious if noisy data on an irregular mesh can be properly filtered? Thanks for your site. It’s quite useful.

#Thanks.

Yes, these differentiators can be extended for irregular spaced data:

with the same as explained in the article above. Idea was suggested by Bernard McGarvey in this comment..

#Fantastic! Sorry I missed the earlier comment. I assume this extends similarly to the 2nd derivative? This is what would interest me most. Thanks again.

#I actually have one more question. Can these be extended to data at the ends of the domain? Suppose I want an “upwind-biased” or even fully upwind differentiator where I only use data to the left of the point in question? I understand the order of accuracy will diminish but if I only have data on the left, it’s necessary. Thanks again.

#1. Yes, you are right, second derivative for irregular spaced data can be approximated as

where are coefficients from the formulas for uniform spaced data derived above.

2. It is also possible to derive one-sided filters for 2nd derivative & irregular spacing (I have not done it yet though).

Could you briefly explain application you need it for?

#This is great…looking forward to trying this.

The application we need it for is a CFD code. But it’s not average everyday CFD code, however, as one thing we need is a 2nd derivative of an array of values. Because of the topology of the mesh, the array of values can be very noisy. We currently use Lagrange polynomials to estimate the 2nd derivative and while the method converges, it converges to a noisy solution in some regions. A smoother differentiator that is less noisy will be tremendously helpful, I believe, Of course, we need the 2nd derivative all the way to the ends of the arrays so these stencils with a lot of points (to reduce noise) become less useful at the ends.

#David,

I am sorry for the late answer.

Thank you for clear explanation of your project – it is very interesting.

Your request motivated me to derive general formula for second derivative filters (centered) – so you can have flexibility on generating filters of any length, even long differentiators with strong noise suppression which are needed for very noisy data (as in your project).

Please try them (see updated Extensions section). In case of satisfactory results I will derive one-sided filters for the boundaries with the similar properties.

Let me know if you need my help in this experiment.

#We tried the 5 points smooth differentiator in our application. Unfortunately, for our code, central differencing is unstable..this is known to be true for this particular part of the code. We have been successful with differentiating using one-sided Lagrange polynomials with a tiny bit of smoothing, so I’m still confident a smooth upwind or even upwind-biased differentiator will be useful. We can try to derive it ourselves or if someone has time and would like to derive it for us, we woud be most obliged. Thanks for your help so far.

If this is successful, we will probably write a paper and would love to collaborate on it with you. Thanks.

#Well, I am not surprised since 5 points filter has the weakest noise suppression capability.

As I understood from your previous explanation data in your application is very noisy. So I suggest you to try filters with .

Longer filter length means stronger noise suppression.

Now I am preparing report on one sided filters for noise robust second derivative estimation. Tomorrow it will be available for download.

#David,

Here is report I promised: Noise Robust Second Derivative.pdf

It includes simple formula for coefficients and one-sided filters.

Keep me updated on progress.

#Dear Pavel,

I want to calculate a differentiator for N=19 from your formular given above. Therby I always obtain 2 for the coefficient c_{k}. How to do you obtain 42 or 48 respectivley for N=11? Thanks a lot

#Here is n=2, N=19 filter:

where

#Dear Pavel,

thanks a lot again. I saw this post before. However maybe I want to use N=17 or N=21. How to you obtain these values for the coefficient c.

If I plug in this values into your formula above I obtain c=2 for each of them.

How do you derive for example c=42 (f_1), when n=2 and N=11?

Best Wishes!!

#Here is code for Maple based on formulas above:

> c:=(k)->binomial(2*m,m-k+1)-binomial(2*m,m-k-1);

> N:=11:

> m:=(N-3)/2:

> M:=(N-1)/2:

It gives correct results:

c(1) = 42, c(2) = 48, c(3)=27, c(4) = 8, c(5) = 1

#You are wonderfull! I did not considered it to be binomial distributed! Thanks for your fast replies!

#Pavel,

Very interesting!

I have a quick question:

Is there a difference between smoothing data and then differentiating it, compared to this algorithm? I would think so, since smoothing and differentiation are trying to do opposite things – but it’s not obvious to me.

At the moment we smooth our data and then do a least squares fit of (1,x) to our data, and take the slope – which I suspect is the same as Lancszos and Savitzky-Golay for this low order? If we want to improve noise rejection, would using these techniques offer any advantage do you think?

Be interested in your thoughts,

Cheers,

Alex.

#Hi, Alex!

Bare numerical differentiation is very sensible to noise. And it is only natural to reduce such noise by smoothing beforehand. There is no contradiction – smoothing just improves quality of differentiation.

Proposed method includes smoothing inherently, by design.

It might be advantageous to try this algorithm in your application at least for one reason: performance. All filters derived above combine smoothing and differentiation in one operation. So you don’t need to smooth data first (one pass) and only after that find derivative (second pass through the data). Virtually you get 2 times speed up as you need to apply one filter only.

Other possible reason is quality improvement. One thing is that estimating derivative by slope of straight line is very rough I think. For example, even if data contain no noise (only smooth parabolic-like regions), such algorithm can not restore derivative exactly. Proposed filters can give more precise values, since they have higher approximation order.

Secondly, quality of derivative estimation in your system depends on how well smoothing is done. If it does not reject noise completely results might be deteriorated since Savitzky-Golay is not completely noise robust too. Described filters are free from such flaws.

To sum up I would advise to try smooth differentiators exact for (first table in Results section). They have simple structure, higher approximating order, general formula for the coefficients. Filter length should be selected based on noise properties.

I would gladly help if you would decide to try them.

#Hello Pavel,

Very nice and interesting work. I am currently using a SG differentiator on noisy temperature data for a boiling experiment and was bothered by the non-zero response of the SG routine at the higher frequencier. I was wondering if you had any information on the magntiude of noise reduction compared to say a basic backward difference differentiator.

Thanks,

Ben

#Please check this report One Sided Noise Robust Differentiators.

#Pavel,

Great stuff here. This is exactly what I was looking for and presented in a nice clean manner. I’m using your methods to filter very noisy tank levels to look for unauthorized movement. The algorithm is implemented using a simple weighted moving average for near-real-time differentiation.

I’m pretty interested in the one-sided filters, because the longer filters that I need to use (N=19) introduce fairly significant delay into the response of the alarm (about 30 secs). I’d like to see that drop down to closer to 0 secs, and I believe that using a one-sided filter would accomplish that. Am I right? I only need accuracy up to X or X^2 at most. Could you post the derivation of the one-sided filters? I can follow the original, but I don’t know enough about filter design to synthesize the one-sided derivation.

Thanks for the work. It’s giving us the results we want.

#Thank you Bill for your feedback.

Please check this paper for the one-sided filters: … It includes filters for up to length.

I would recommend you to use hybrid differentiators (see pages 4-5 ) for your conditions. They are particularly optimized for no-delay derivative estimation in the presense of noise. I am sure filter with will be enough – please try it. Let me know about the results.

I’ve checked website of the company you work for – automation, control etc. are tasks where numerical methods can be of great interest (e.g. PID).

I would appreciate if you would explain me some of the math-related problems you have in your products. May be we can collaborate on them. Here is my private e-mail: pavel@holoborodko.com.

P.S.

If you want to support me please use this e-mail: pavel.holoborodko@gmail.com for donations through paypal.com.

#I can help with the PID control. Control theory is one of my areas of expertise. The first thing you have got to realize is that each system is different but one can develop formulas that will provide the kind of response you desire.

See this.

http://www.deltamotion.com/peter/Mathcad/Mathcad%20-%20T0C1%20MassOnASpring-PID.pdf

The formulas for the PID gains are only valid for the mass on a spring or similar. A servo motor doing position control would be require a different set of PID gain formulas because it isn’t under damped and it is an integrating system. Temperature and velocity control systems are non integrating because if the control signal goes away the system settles at a ambient point like room temperature or stopped. In an integrating process like position control the system coasts to a stop but does not return to some starting position.

The work sheet I linked to is an easy example but it assumes that the plant transfer function is known and that is the real trick of tuning PIDs. Finding the plant parameters is called system identification and requires using optimizing or minimizing algorithms that minimize the error between a estimated value and the actual process value.

I can see you have Maple so it will be easy for you to translate the first page into maple so you can generate the symbolic gains for just about any linear system. I am sure Maple has minimizing algorithms such as Levenberg-Marquardt or L-BFGS.

#Hi Peter,

Thank you for the clear introduction to the control theory and worksheet to study.

As I understood, task of system identification is very similar to filter design. Here we also have desired transfer function, and filter’s parameters (coefficients) are derived to reproduce it as close as possible. So, basically these tasks are the same.

In filter design nature of the parameters and relations among them are “highly” nonlinear. As a consequence target function has many local extrema, and minimization becomes nontrivial task. Applying straightforward methods like L-BFGS etc. can be misleading since they will converge just to the nearest (to starting point) local minimum. And global optimal solution (minimum among of all local minima) won’t be found.

I faced such situation many times. To solve it we need to use global optimization routines. I tried numerous optimization packages of such kind even commercial and ridiculously expensive. Unfortunately none of them could solve my target function. I end up writing my own global solver, which uses simple algorithm:

1. generate random points (or low-discrepancy sequence of points).

2. simultaneously start local searches (L-BFGS) from every point.

3. select and keep track of the best local minima found so far.

4. estimate coverage of the solution space by already conducted searches – stopping criteria.

It doesn’t require huge memory or CPU power, quickly gives list of acceptable solutions to continue research, refines best solutions as long as program runs. I believe this kind of solvers is called stochastic.

Do you face similar difficulties in control theory? What do you think about my simple global solver – can it be applied in your field?

Sorry for the long post, just need to share my thoughts.

#I like to think of the problem as going down hill and trying to seek the lowest spot. What I have found in motion control that often the optimizer comes out of the hills to find a huge flat desert that is so flat that numerical errors become significant or the error windows must be extremely small. You can try to come out of the hills from different directions but finding the true bottom is difficult without a lot of starting points and even then you may not try the right one. There is nothing magic about L-BFGS except is finds an answer quickly but who is to say that one answer on the flat desert is any better than the other? My answer was to try to shrink the desert. So we have found that the trick is to excite the system to get information so there are more ‘hills’ and the valley floor is narrow and quickly leads to a small pond ( the answer). I found this out by taking a problem I had difficulties with and plotting the resulting norm vs two of the parameters at a time so I had multiple 3D plots. This gave me a visual clue of what I was against and how to excite the system to yield the best information.

Other things I have thought of is finding the point that is in the middle of the desert. This would allow for maximum variation of parameters before the norm increases. I don’t know how valid this is though.

Your multiple search technique is one that I used way back in the 80s ( yeah, I’m older ) but I/we did use random starting places. Think of coming out of the hills from different directions. BBackack then we were optimizing the orientation of logs for peeling into veneer. Since this was a real time process we would trying different solutions from different ‘directions’ until time ran out and we use the best solution so far.

I can ID non-linear systems. I write up my system as differential equations and solve using simple RK4. The optimizing routine tries different coefficients in the differential equation(s). Usually it is something simple like a non-linear valve or in one case it was water level control in a condenser with non-vertical sides. These are easy IF the excitation is good and all the significant things can be measured.

I haven’t had time to follow up on what has been said in the last two months. At this time we still use the Savitsky-Golay filter in some parts of our application. I am also looking at a simple Alpha-Beta-Gamma filter which is a very simplified steady state Kalman filter. The advantage of the Alpha-Beta-Gamm filter is that is produces the position, velocity and acceleration with little or no lag and filters noise like a Kalman filter but one doesn’t need to know the covariance of the process and measurement noise. One simply selects a bandwidth that probably isn’t optimal but more than good enough. The idea is to get 90% with 10% effort.

I am interested in the one sided filters for position, velocity and acceleration to compare with the Alpha-Beta-Gamma filter

http://en.wikipedia.org/wiki/Alpha_beta_filter

#Peter,

I’m experimenting with ABG and one-sided differentiators. I sample sine wave on with additive gaussian noise at uniform points.

I apply one-sided differentiator listed above to estimate derivative and I get average absolute error around . I did not tune filters for this data.

When I apply Alpha-Beta-Gamma filter (ABG) results depend very much on parameters selection. I’ve spent an hour to select parameters to minimize error of velocity estimation for this dataset and the best I could get was error around . The best combination was . For other combinations error was much higher.

It is too early to make conclusions but from this little experience I can say that ABG filter is not easy to tune for the particular signal/noise properties. But what if the input data change characteristics with time?

One-sided fixed filters are much easy to use – no tuning is needed+they have less computations. ABG can give comparable results after much tuning.

Could you give me test data and parameters so I can make comparison for it.

#In this directory I have my Alpha-Beta-Gamma files. They are written using wxMaxima which is free. Since you know how to use Maple you should wxMaxima to be similar with a few quirks.

I did this simulation to see how well I can estimate positions from a 14 bit marine motion sensor that measures roll, pitch and heave for a crane application. The motion controller must hold the crane steady in rolling waters. I am assuming the wave has a period of 10 seconds and an amplitude of 1 meter. I added sampling jitter and noise.

At line %i54 there is a comment about how I ‘tune’ the gains. Basically, I have 3 choices that can be cut and pasted from the comments section into the code.

The Alpha-Beta-Gamma filter is a very simple form of steady state Kalman filter where the gains are chosen to have the desired response to an error but the response may not be optimal. The IAE, integrated absolute error, coefficients work the best. The end user must only adjust the bandwidth to get acceptable results.

A Kalman filter would ensure the minimum error but it is very compute intensive. Also, the use must know covariances of the measuring device and system. Fat chance. Even experienced engineers end up ‘fudging’ the parameters to get desired results so why bother?

Notice that I can estimate positions between the readings. The sensor updates only every 10 milliseconds but I can estimate position, velocity and acceleration every millisecond. I don’t expect your filter to do this.

What I am looking for is an easy way of estimating the position, velocity and acceleration with minimal error. It will be interesting to see how your filter does with this problem.

#Oh, I found your Maxima worksheet – ABG.html

I’m studying it, thank you.

#You should load the ABG.wxm file into wxMaxima.

#Hi Peter,

Finally I’ve got time to continue studying ABG filter – sorry for the delay.

I’ve generated noisy data using your Maxima worksheet, plugged it into C program and tested ABG against SG filters.

I’ve implemented ABG in C using your algorithm and selected parameters to be optimal for IAE according to formulas in your worksheet.

Here are my results and thoughts.

ABG is second order filter in a sense that it uses 2nd degree polynomial for signal modeling with adaptive update of its coefficients on every step in the direction of minimum error. This has significant impact on quality of acceleration modeling.

I’ve got following comparison of ABG with the second order one-sided SG in terms of IAE of position(X), velocity (V) and acceleration (A) modeling (lower is better):

Second number in SG2() is a filter length. We see that by using long filters SG2 can reach and even outperform ABG. Also both types of filters predict (A) with low quality which is natural since they are based on parabola, which can predict second derivative (A) only by primitive step wise function.

Results for comparison of ABG with third order SG (on the same data) are:

Here we see that 3rd order SG filter outperforms ABG in prediction of acceleration as we expected since SG3 approximates second derivative (A) by linear function.

My conclusion.

ABG is good for signal smoothing (X) and estimating its first derivative (V). However for the second derivative modeling I would recommend to use third order filters (Alpha-Beta-Gamma-Delta?).

The question of which filter is better has no simple answer. It depends on application. The only disadvantage of SG filters is that they need N previous samples in order to do next prediction. Cases of non-uniform data and estimation between samples are covered by SG (it is just polynomial approximation after all – we can evaluate polynomial at any point, even extrapolate). For uniform sampling SG have simple rational coefficients and can be easily implemented using fixed point arithmetic – which is huge plus.

If application allows it is good idea to try both approaches.

I didn’t use SNRD filters in comparison for two reasons

1.In our application we have Gaussian noise, SG filters are optimal in such conditions. SNRD will give worse results for sure. After much research I found that SG filters have much broader applications in comparison to SNRD, which are needed only in very specific fields. I will reflect this on the page soon.2.It is not easy to build SNRD of 3rd order (even 2nd order is a problematic issue) since condition of complete noise rejection force filter to have significant error in pass-band.Anyway I am really impressed with the performance of ABG filter. Thank you for introducing it to me.

I would be grateful if you would give me further references on this topic (optimal parameters selection, etc.)

#Hi Pavel.

I am trying to calculating the acceleration of a wheel.

Magnets are regularly spaced on the wheel and time is measured when they pass before a Hall sensor.

In this case, time is x in your formulas and is not regularly spaced if the rotation speed is not constant.

So I am trying to apply the formula for the irregular spacing.

The problem is that it yields a second derivative that is always equal to zero, whatever the times that are recorded:

f(k) = f(0) + k*delta and f(-k) = f(0) – k * delta

Thus f(k) + f-k) = 2 * f(0), and the sum for the second derivative is equal to zero.

Am I missing something , or is there another way to proceed ?

#I am afraid you have detected flaw in second derivative formulas for irregular spaced data.

I will develop another generalization for such case.

#“However for the second derivative modeling I would recommend to use third order filters (Alpha-Beta-Gamma-Delta?). ”

OK, I have done that too.

http://www.deltamotion.com/peter/Maxima/ABGD.html

Pavel, now I am estimating a jerk too. In motion control the derivatives are used calculate the velocity, acceleration and even jerk of a master or reference axis that slave axes are geared too. This way feed forward gains can be calculated for the slave axes which results in a much smaller following error.

I will update the ABGD.wxm to provide IAE coefficients for the gains like the ABG filter. The IAE coefficients seem to work just a little bit better than the Butterworth coefficients.

I can expand ABG filter to any order I need but after a while there is no point. The most I have ever needed is the position, velocity, acceleration and jerk (PVAJ).

I have no doubt the Holoborodko filter can beat the ABG filter if you use enough coefficients but now you are doing quite a few multiply and adds. Is your filter still one sided? I haven’t seen how you calculate the one sided second derivative yet. Now I have an idea of what you are doing I can strain my brain and figure it out given time.

Here is a question that will stump you because I wrestle with this too. How do you initialize 20 or even 40 past values? Problems, problems, problems.

#In my previous post I explained comparison of one-sided Savitzky-Golay filters (SG) with ABG. So you can repeat tests easily.

I didn’t use my filters (SNRD) in tests since it is pretty difficult to design them for such case.

#Hi Pavel.

Great and interesting stuff that you have worked out here. I’ve been using your smooth noise-robust differentiators in my scientific work. It worked very well. Therefore I would like to refer to you/your work in my paper I am writing on.

Is there a publication dealing with this differentiators that I can refer to?

#Hi!

I am glad my method helped in your research. Thank you for letting me know.

I do not have published paper on these filters. But I would be grateful if you would site it using something like:

P. Holoborodko. (2008) Smooth noise robust differentiators.

Please send me info on your paper once you finish it – I will reference it on this page.

Thanks!

#Hi Pavel,

Thanks for an extremely helpful website!

I’m trying to determine acceleration from heave (position) data for waves in a hydraulics lab using second derivative central difference . (Sampling at 0.02sec, wave period 1-3sec)

This should be relatively simple and the results seem reasonable, but I’m getting very different results depending on the filter used:

-N=7 central difference

-N=9 central difference

-N=7 Noise robust CD

-N=9 Noise robust CD

-N =9 Noise robust CD with ‘higher approximating order’

They all follow the same basic trend, but individual values vary wildly. And the Noise Robust method gives ~1/5 magnitude of the ordinary method.

Am I using N far too low given the relatively high sampling rate? Or am I completely off track?

Thanks

#Hi Ben,

@”And the Noise Robust method gives ~1/5 magnitude of the ordinary method.”

Ordinary CD is unstable on noisy data but constant ~1/5 ratio seems strange. Usually difference between CD and SNRD looks like:

Blue – sine wave data corrupted by noise.

Red – true derivative of sine.

Grey – CD estimation.

Green – Noise robust differentiators.

CD is a ‘saw’, SNRD is in the middle of that ‘saw’. So ratio is variable.

Drawing shows first derivative estimation. Situation for second derivative is similar.

@”Am I using N far too low given the relatively high sampling rate?”

Maybe it is reasonable to increase filter length until difference between estimated values from and becomes below some threshold. But I didn’t test this idea.

Although it is hard to guess, I think it is enough to use filters exact on – their coefficients can be calculated for any , and they provide reasonable approximating precision. Just be aware that for large multiprecision arithmetic is needed for coefficients calculation.

I would suggest you to try Lanczos (Savitzkiy–Golay) filters too – they might be even better for your application depending on noise characteristics.

#Ben, what is the estimated dimension of your system (the number of degrees of freedom) ?

#Pavel,

I’ve had some convergence with the SNRD N=19 and N=21 with all values within a few percent, so we’re on the right track. I’ll try a couple orders higher and try to get it below 1%, which will be fine for this application. I’m using excel- any idea what N will be limited by excels accuracy?

I should have clarified the 1/5 ratio – this is a rough idea of the magnitude of events, it wasn’t an exact 1/5 for each timestep. Your plot above explains it nicely (though 5x is a lot of noise). And it will explain some troubles I’ve had with another application so we’ve solved two things.

I don’t really understand the phrase “exact on 1,…,x^3”. If the formulation I’ve used applies to the second differential, where does 1,x,x^3 fit in.

Many thanks Pavel – for the quick reply, for the coefficients algorithm and a great website.

#Assuming Excel is 32-bit, . Exact on means that approximation order is of , so filters give precise derivative on polynomials up to cubic parabola.

#Hi Pavel,

I have just come across your work with much interest.

I am trying to implement a backward second order derivative – in time – as part of a code I am writing for the numerical simulation of electromagnetic wave propagation.

The problem I am having is the late time instability with the propagated signal, and think that using your filtered derivative approximation might help in this regard.

With respect the PDF link you had posted above on (June 22, 2010 at 4:03 pm | Permalink), I have derived the coefficients for N=5 for the second order derivative, which gives the following:

f”(xo) = 1./4./h/h [ fo – 2. f-2 + f-4)]

When I had applied it, it does give a better response than the classical backward difference in terms of filtering out the very high frequencies that give rise to unwanted noise, however, the amplitude response has been damped by a factor of about 30% of the expected value. I am not sure if this is expected, the above formula is wrong, or do I have to use more terms to ensure better accuracy.

Also, as you recommend that the above formula is based on an exact approximation for up to x^2. Is it possible that you can make available second order derivative approximation to a higher degree at all please?

Thanks for your help with this in advance,

Adam

#Hi Adam,

Thank you for your question and feedback.

Formula for $N=5$ you’ve tried is Ok. It just targeted for low-frequency signals.

But it seems that your data contains frequencies in mid-range – maybe somewhere in $f=\frac{\omega}{2\pi}\in\,[0.1;0.15]$. You can check first drawing in the Extensions section – $N=5$ gives second derivative with reduced amplitude in that range – as you described. Longer filters will damp amplitude even more.

Special filter can be (and should be) constructed for your case. We have to use completely different procedure for derivation – one in the PDF is just not applicable.

Could you estimate major frequency range of your data? Does is change with time?

Having this information I’ll try to derive suitable filter (for the small reward of cause:-)).

#Hi Pavel,

Thank you very much for your reply.

The range of frequency in the signal I had used with respect to my previous post was extending up to 0.125 in comparison to the normalized frequency range in the extensions section you are referring to.

I have now reduced the maximum frequency range of the signal to 0.075 instead, but the same observation in terms of damping is still preset.

As for your questions:

1- I would appreciate it very much if you can construct an alternative filter that can maintain a higher accuracy on the amplitude obtained.

2- For the example I am using at the moment, the signal frequency range does not change with time. However, I will be using other structures where this is going to be the case. For example no-linear materials will generate higher frequencies to those in the input signal and so this would be extremely useful if can be considered.

In all cases, I presume the alternative filter you are going to generate will be dependent on the range of frequency used, and so if you can allow for this to be variable within the remit of a generalized formula then that would be ideal. Currently, I am using signals that have a maximum range of frequency that falls inbetween this interval [0.05; 0.2], and extending from zero.

Thanks again,

Adam

#I’ve checked $N=5,\,7$ filters on 0.075 frequency signal – estimation error < 1% for both of them. For 0.2 frequency signal estimation error was less than 12%. $N=5$ gave better estimation in both tests – it has wider bandpass interval which is required for your task. I think there is some inaccuracy in your implementation (or some misunderstanding between us) - error shouldn't be so high 30%. Could you just take simple pure-frequency signal $\cos(2\pi f x)$ and apply filters for it? That is how I did my tests using $f=0.075$ and $f=0.2$. Or just test your routine on $x^2$. Filters should give precise second derivative for it at any point. Let me know about results.

#Hey,

This looks useful. Is the method different than doing a polynomial fit over a window and then differentiating the polynomial? I am interested in the Matlab code mentioned earlier. Does it exist and is it available?

Thanks,

Daniel

#Low-order smooth differentiators (first table) can be found as least-squares polynomial fit with binomial weights + subsequent differentiation.

However this is not very helpful, since we know analytical expressions for the coefficients – no need to fit anything.

High-order approximating filters (and second order derivative) cannot be derived from polynomials.

Several guys volunteered to write Matlab library – I have no idea of the results (if any) though.

May be somebody reading this page will respond to you.

#Pavel,

I was looking for a smoothing method for data from optical sepctroscopy and stumbled upon this site. I’m not directly interested in derivatives but understand that smoothing is an inherent property of the method due to polynomial interpolation. I was wondering if you could provide a recursive formula to calculating coefficients of the noise-robust smoothing function (n=2, any odd N). I could find some hard values of coefficients for N=7 and N=9 in a comment on one of your other pages (about central differences), but not a formula for arbitrary N.

Thank you,

Dirk

#Sorry for the late response, Tokyo still having strong aftreshocks.

Smoothing filters for n=2 are just rows of Pascal’s_triangle, which can be easily calculated (check “Calculating an individual row or diagonal by itself” on wiki page).

I do not know closed form expression for n>2 though. BTW those formulas are not based on polynomial fit.

#Hello Pavel,

Thanks for the post, this is just what I was looking for. However, in following it I get a little lost between the 5 equation expansion i.e 2i*sin(0)sum… etc and the formula for ck, i.e ck=1/2^(2m+1)[… etc.

I am trying to form a method in matlab that can output a matrix of coefficients cf, given n and N. Could you possibly provide a formula that can yield this?

-Cheers

Colin

P.S I hope all is well with you with the earthquake situation, hope you didn’t lose anything.

#Hi,

Thanks for your site- I have to work through it properly but a quick question- if the optimal filter for an n’th order derivative is two sided then do you know of a simple way of providing the best one sided approximation to that one sided derivative?

Any ideas would help

many thanks,

Mark

#For noisy data you can use: (a) Savitzky-Golay one sided filters, (b) Alpha-Beta-Gamma filter, (c) or Hybrid filters described here …

(a) optimal if data corrupted by Gaussian noise.

(b) deliver similar performance to (a) but doesn’t require memory to store previous values.

(c) good for rejecting high-frequency noise and cascaded filtration.

#Hi Pavel,

I have a problem that I’m hoping you have a method for, or can give some direction.

My task is to measure the velocity of water currents at the sea bed induced by a manoeuvring boat, to determine if seagrass will be affected. Using a downward looking ADCP (Doppler) I get current vectors at 5cm bins at 0.2Hz, however the data is quite spiky particularly at the lowest bins where the signal is affected by the bed itself. So I need a filter that is one-sided vertically so that it is accurate at the bed, but possibly two-sided in the time domain, ideally it should be vectorised to account for the direction of the flow (in two dimensions, the vertical velocity is ignored)

Is this unreasonable or very difficult? This is a relatively small project so a simple, less-than-ideal solution rather than a complex one is needed. Can I filter the x and y components of velocity separately? Maybe a one sided filter applied to the vertical bins will be sufficient. Any comments that you have will be welcome.

Regards, Ben

#Hi Ben,

Sorry for late response – I was on vacation.

It is really difficult to suggest something without knowing the specifics and seeing the data.

As far as I understand – you have components of velocity vectors (d/dx, d/dy) already calculated (if not – this is very different task, look for isotropic gradient estimation, Scharr mask).

If they are noisy just apply some 2D smoothing filter on them, like median filter or Gaussian smoothing filter. And yes, you can filter x and y components separately.

However if you want some advanced processing (like rejecting outliers, making vector field very smooth, etc) you can look up methods developed for motion vectors handling in video processing. Specifically in time interpolation schemes (insertion of a new frames between existing in a video sequence) – they usually apply sophisticated methods on vector field smoothing.

#Hi Pavel,

Thanks for getting back to me. After a bit more research I ended up applying a Gaussian filter to the magnitude only to remove some noise, and the results were fine. I know its not a robust as it could be by separating the components, but will do for this project. I’ll keep your other suggestions in mind for future.

All the best -Ben

#Hi Pavel,

I used your one sided filter (of coefficients given in the report paper, with $ 1, x, x^2 $ with 15 backward points of coefficient $ 1/2856 $) to a sinusoidal (namely $ 100*\sin(2*PI*0,2*t)$ ), and see that resultant maximum value should be near 125,66 but it was 158,14. Moreover when you plot, the difference is clearly seen. What do you think the caveat comes from? Maybe I am doing something wrong?

In the report only the results are being given, I would like to learn how to generate coefficients in the general case for example up to $ x^3 $ with 15 or 19 points to try something,

Thank you, this is a very very illuminating post …

#Hard to say, could you send your plots (or code on how do you apply filters)? My e-mail is pavel@holoborodko.com

#Thanks for the answer, I have send you an excel file that should explain what I wanna say in a more clear way I hope, thanks again.

#Thanks for the data.

Frequency response plots in my report are drawn for $h=1$ (Time Diff in your worksheet). If we use different $h$ then actual input signal frequency is $f*h$.

Signal with $f=0.2$ and sampling step $h=0.1$ corresponds to $\omega/2\pi=0.02$ point on frequency response plot.

In this point $N=15$ filter gives greater values than ideal differentiator – so this is correct behavior.

If you want more precise estimation – (a) decrease sampling step (usual way in numerical analysis). This will shift high frequencies towards origin (low frequency region), where filter is more precise. Or if you cannot change sampling step then you have to (b) design specific filter which is precise on this particular frequency (so called band-pass differentiator).

Filters in report (and this page) are designed to be precise on lowest frequencies, near zero, for the most common case.

#Oops, I missed a critical point …

Well, I cannot change the values present in the excel file, and that frequency is expected maximum for our data. From your explanation what I understand is I need a smooth low-pass differentiator up to that frequency…

Many thanks for the clarification,

@li

#hi Pavel,

I’m working on some economical series and test with success your one-sided differentiator hybrid version (which is really really good) and I would like to know how it’s possible to have an N up than 15 for the hybrid formula as i need to test from 21 to 28 ??

thank you

#The irregular spacing formula doesn’t seem to work. To see this, notice that the derivative doesn’t depend on $x_0$ so we could place $x_0$ anywhere we like between $x_{-1}$ and $x_1$ and still get the same result. Perhaps you could try weighted differences, but I’m not sure that will work either – for example if the weights have linear terms in $x_0$ it would fail for quadratic functions, etc.

#Points $\{x_{k}\}$ depend on $x^{*}$, please check notation under

Derivation. They are just distances from $x^{*}$ to points where $f_{k}$ are sampled.#Not sure I follow, because Derivation defines the $x_k$ as equally spaced.

Anyway, if we set $f(x)=(x-a)^2$ with $x={-2,-1,0,2,4}$ then the irregular spacing formula seems to imply that $f'(0)=3/2-2a$. Could you please show how to apply it correctly?

#I’m sorry for late response. The problem with irregular spaced formulas is that they accurate only on $1,x$.

It is extremely difficult to derive high approximation order formulas for arbitrary spaced data.

I should have written about this fact on the page – sorry….

#No worries. The derivation of the regularly spaced formula is a very neat trick though!

#Hi Pavel,

I use your smooth noise-robust differentiators, for the analysis of time resolution of electromagnetic calorimeter prototype for the SuperB Factory R&D. I will cite your page on my thesis work.

Really a well done algorithm! 😉

cheers

Matteo

#Thanks Matteo, I appreciate this.

#I am having a really hard time with the general form for c_k. Specifically, the lower term in the second combination will be negative as k gets closer to M in the sum. I am not aware of a generalization of the binomial coefficient formula to negative k (lower number), only to negative n. Is there a problem with the formula?

#Hi Holoborodko,

For some reason, the hybrid differentiators proposed in ‘One

Sided Noise Robust Differentiators.pdf’ does not comply with

the tangency constraints proposed above!

For example, when N=4 and n=2, I’ve taken m=1 and I’ve generated a system

of (3+2) equations for tangency constraint and I’ve found that

the coefficients from ‘One Sided Noise Robust

Differentiators.pdf’ satisfies all equations but not the last one!

Can you point me to what I’m doing wrong?

Thanks in advance!

#Hi Islam,

You are doing nothing wrong – hybrid filters from the paper are not derived from tangency constraints.

They are experimental one – I’ve used very different idea to build them.

#Pavel, have you published any articles for this work?

#No.

In average 22 visitors read this page every day. Total number of readers is 16,417.

I think no journal can provide such audience.

#Hi Pavel,

I want to estimate velocity from position data at very low low speeds. My input data to the controller is digital in nature (Optical Encoder). How do i make use of your algorithm to estimate velocity information.

Thanks

#“I want to estimate velocity from position data at very low low speeds. ”

Do you want to do this real time or in some post processing phase?

Are you trying to do this with a PLC?

Calculating velocities real time at low speeds is very difficult. First you must provide yourself good data to work with and use the highest resolution encoder you can find. You may get better results counting the time between counts instead of counting the counts per fixed time. If you end up counting the time between counts then the smooth noise-robust differentiator will not work because the time periods will not be equal.

#I want to estimate velocity in real time. I am doing it in a digital device (FPGA) with optical encoder inputs (4000 cpr). I want to see improvements with the Pavel’s algo if it can help.

#Majid, an algorithm will only do so much to compensate for a poor data. 4000 cpr is not enough. Try 400,000 cpr! I have customers that use encoders that provide over 500,000 cpr!

I will assume some numbers since you have not provided any. If 1 revolution is 4000 cpr and the distance per revolution is 100mm then you have 40 counts/mm. If the product is moving 1mm/sec then there are 25 milliseconds/count. If you try to compute the speed every millisecond then most of the time you will not even see the counts change. Even at 400,000 cpr the counts will change every 0.25 milliseconds. This would mean that most of the time you would see 4 counts per millisecond with some jitter so some times there will be 3 counts and other time 5 counts in a millisecond. Even this is not very good but now you can consider using the SNRD algorithm. You can see you don’t have a chance with only 4000 cpr.

One can see that it is better to count the micro seconds between the pulses to estimate speed when the encoder resolution is low. This should be easy if you are working with an FPGA. Then I would use the αβγ algorithm discussed above but allow the time between readings to be variable. This is easy to do.

#Hi Pavel,

very good work! I use your differentiators for parameter identification of fuel cell models

and get very fine results.

Thank you for publishing online.

Regards, Markus.

#Thanks Markus, I appreciate your feedback.

#Hi Pavel,

Have you ever look into the fourth order?

Best,

Kevin

#Hi Kevin,

Not yet – (why) do you need one?

#Thanks for responding Pavel,

I’m looking into high-order image reconstruction, and I think the fourth order would be interesting to play with =D.

#Pavel, I am working with stock price data which is very jagged, up one day down the next. I would like to see the first N derivatives (N<5) side by side with minimal delay and error. What formulas on this site would you recommend?

#Hi Shelly,

I assume you need real-time processing, when you can use only past values to compute derivative at present moment.

Filters of such type are called one-sided.

If you want to compute just first derivative – you can use filters from my report: One Sided Noise Robust Differentiators.pdf .

Use filters from second table – they give less time delay.

However, if you want to see several derivatives simultaneously – it is a little bit more challenging task.

We cannot apply the same filter (1st derivative) several times to get higher order derivative, since results will be de-synchronized in time.

We need distinct filter to estimate derivative of every order – 1st, 2nd, 3rd, etc. Moreover all the filters should be

synchronizedwith each other by time delay and error. So that estimates will be provided for the same time (e.g. all delayed by one sample).Otherwise results of estimation will relate to different moments in time and analysis will be meaningless. Synchronization issue is particularly important if you want to use computed derivatives in further numerical modeling (e.g. PDE or else).

I have experience in building such special set of synchronized filters (although only for the 1st and 2nd order).

If you are interested – I can provide you assistance on the task.

Let me know your proposal by e-mail: pavel@holoborodko.com

#To get the higher derivatives would consider something like the alpha-beta-gamma filter mentioned above. However, differentiating is noisy and since all the derivatives are related by a factor of the sample time there are real problems with numbers getting too big or too small when trying to estimate higher derivatives.

Here is an example of the alpha-beta-gamma filter as used in motion control. In motion control phase delay must be minimized.

http://deltamotion.com/peter/Videos/AlphaBetaGamma.mp4

I only calculated the second derivative in this example but I could expand that to add two more but one needs 64 floats to have any chance of it working properly because the round off errors get to be significant. See this:

http://deltamotion.com/peter/Maxima/Savitzky-Golay.html

You should notice that each derivative is a factor of T from the other. Dividing by T^4 can result in very big numbers if the same time is small. If the sample time is big then the numbers can get to be very small.

#@…one needs 64 floats…

I think we can even use arbitrary precision floats, since most likely stock price analysis applications are not limited in computational resources. But most probably they just use sampling step =1.

Peter, could you give me pointers where to read about ABG analysis.

As I recall, ABG was equivalent to SG of some particular degree in my tests, which means phase delays for position, velocity and acceleration are not the same, de-synchronized.

#Paval, there is lots of information if you search for “alpha beta gamma filter”. Most applications involve tracking. In my case I want to track another motor so I can gear another motor to it. There are other applications like tracking images.

The a-b-g filter isn’t magical. It is the poor cousin of the Kalman filter. The big difference between the Kalman filter and the a-b-g filter is how the gains are computed. After that they are identical. In both cases there is a system matrix that predicts or estimates where the state will be in the next iteration. The actual position is compared to the estimated position and multiplied by a vector of gains that corrects the estimated position, velocity and acceleration.

The Kalman filter will minimize the error at the cost of a lot of CPU power. One also needs to know about the system noise and measurement noise. Most of my customers have no clue how to calculate that so even many engineers just fudge the numbers to get acceptable results. So if one is going to do that why not simply use a a-b-g filter where the gains are easily calculated but not optimal. In my links in previous threads I believe everything is shown how to implement a a-b-g. I did include three different ways of calculating the a-b-g coefficients. One closes the error using real poles, one uses a Butterworth filter. The B-filter reduces the errors faster than using 3 real poles but it can overshoot when correcting. The third one uses coefficients than minimize the integrated absolute error. It closes the poles faster yet but is still more prone to overshooting .

What ever method you chose one still selects a bandwidth and increases or decreases it to get the desired results.

BTW, in the video. In the video the estimated acceleration would sometime not settle down to zero when the motor stopped turning. The acceleration would oscillate a bit around zero. This happened because the estimated position and the quantized measure position would often never be the same so the error was multiplied by the gains so the estimated value always changed unless buy luck the estimated and actual position became identical. The articles don’t go into the quantizing effect of the feed back since they are just academic articles. The motor example is real and the resolution of the feed back is quantized. The same would happen with pixels in any vision system. Stock prices don’t have infinite resolution either.

#Hi Peter, sorry for the late response, I’ve been busy with other tasks.

I know what a-b-g is, what I asked for was detailed analysis of synchronization between position, velocity and acceleration estimated by a-b-g. Sorry for not making clear myself before. And let me describe my ideas.

A-b-g assumes that position can be approximated by parabolic polynomial, velocity by linear polynomial and acceleration by constant at any given moment of time.

In nature this is equivalent to Savitzky-Golay filter of 2nd degree (at least locally). We build parabola based on the past values, sample it at current time (position), and sample its first and second derivatives (for velocity and acceleration):

x(t) = at^2+bt+c

v(t) = 2at+b

a(t) = 2a

The problem with this approach is that all three quantities are estimated with very different approximation errors. Acceleration is approximated by step function – O(1), while position estimation error is of O(Δt^3).

In practice this means that phase delay for every parameter (x, v, a) is different. They are de-synchronized in other words. Let’s take a look at following plot for demonstration (I use N=20):

Left axis is a time delay in samples. As we see filter gives position, velocity and acceleration for different moments in time. For example, velocity is delayed by 4 samples and acceleration by 8 comparing to position at w=0.05.

From the plot we see that one-polynomial approach tries to estimate position with very high precision at cost of acceleration & velocity. Note the negative phase delay near origin – this is attempt to “predict” position ahead in time, while acceleration is given for 10th sample from the past (this mean 100ms delay for 100Hz rate). As some of my clients reported, such time shifts can be disastrous for math models (PDE or else).

I’ve been working on designing filters bank which provides synchronized estimation for position, velocity and acceleration. This is what I’ve got so far:

All three parameters are estimated with the same approximation error, and same time delay in entire pass-band (up to w=0.05).

Let me know your opinion.

#This doesn’t surprise me at all in fact it is desired. If you plot a sine wave the acceleration leads the velocity by 90 degrees and the velocity leads the position by 90 degrees.

#Yes, phases of the acceleration, velocity and position are pi/2 apart.

Of cause, plots above are already normalized in this respect, all shifts are subtracted. We see “true” delay (in samples) which every filter introduces in estimation of characteristic it aimed to. We compare apples to apples :).

So, a-b-g estimates position with approx. 1.7 samples delay, velocity with 5.5 and acceleration with 10 at frequency w=0.05.

UPDATE: Pass-band of a-b-g filter is adjustable, so precise numbers may be different for different parameters. Main point is that a-b-g by design approximates position, velocity and acceleration with different accuracy (parabolic, linear, constant), which leads to different phase delay in every characteristics.

#I need to estimate the time domain second derivative for a time domain finite difference simulator with attenuation (the second derivative is a need for loss calculation). I used a simple finite difference approach with 3, 4 and 5 samples. The one with four samples has the best characteristics of convergence and the 5 coefficients has the worst. I suspect that the poor performance of 5th order estimation of the second derivative is due to numerical noise. I would like to use one of your forward filters with 5 samples (exact on 1, x, and x^2). I cannot use the central differences because I do not have the value of the function in advance and I cannot use too many values because the storage and the reading of the values in the past are already very large with 5 samples. What are the coefficients for a five samples forward second derivative and … do you have any suggestion?

Very nice work,

Gaetano

#Here is 5-tap Savitzky-Golay filter (which I believe will be better than smooth differentiators) for quick test:

It gives 2 samples delay, and provides following noise suppression:

Good alternative is Alpha-Beta-Gamma filter which needs only 5 variables to store its state. With proper tuning it can provide even better noise suppression.

#Hi Pavel (and Peter),

thank you for the quick answer.

If you want to better understand what I want to do, you can look at this document (that I used as a base of my simulation for propagation with loss): http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CFcQFjAB&url=http%3A%2F%2Fmurphylibrary.uwlax.edu%2Fdigital%2Fjournals%2FJASA%2FJASA2004%2Fpdfs%2Fvol_116%2Fiss_5%2F2742_1.pdf&ei=gF_IT8unN8-p-gbcpslg&usg=AFQjCNH3ciKvlTelE1SAj4I6vJoqVJHwSw&sig2=sMhbmdaJvLAoXdXsN7y__g (if the link does not work well just google: liebler efficient time domain implementation). As you can see, in that document they end up to need an approximation of the second order time derivative (equation 22).

I tried the Savitzky-Golay filter you gave me in my simulator: the resulting code is less stable than simple finite difference approximation (with 3, 4 and 5 coefficients, http://en.wikipedia.org/wiki/Finite_difference_coefficients):

(1 -2 1)/Dt^2

(2 -5 4 -1)/Dt^2

(35/12 −26/3 19/2 −14/3 11/12)/Dt^2

I’m puzzled: if I use your formula on noisy data, it gives a lower noise estimation of the second derivative but if I use in my code the results is worse. It is unstable and after a few step the estimation of the stress “explode”. As you can see, in the paper of the link, they say that the second of the above finite difference approximation will work better than the first: that is true. What I do not understand is why the third is not even better.

Is the equation you sent me delayed in time or is a forward estimation? I did not understand what you mean with: <>.

Bye

Gaetano

#I didn’t read your post of Mar 27 very well. Now I understand what you are trying to do. You must mean the phase delay for the estimated values compared to the true values. Yes, but one can increase the order of the αβγ filter to a αβγδ filter and then the acceleration is estimated using a second order equation. The usefulness of this is limited as the quantizing error multiplied by 1/Δt³. Increasing same time helps but only of the resolution of the feed back device is finer. There is no magic.

The key is to adjust the bandwidth to calculate the K gains to give the desired results. The Kalman filter finds the optimal K gains but often these gains are calculated using inaccurate noise estimates so the engineers fudge the gains anyway. You can see in the video that the phase delay is very small if I keep the bandwidth up high but if I lower it then the phase delay becomes noticeable. I have also tested the Savitsky-Golay and the Smooth noise robust differentiator using the same motor setup. The problem with these two methods is that the quantizing error from the encoder really degrades the performance if the filter is one sided.

This out weighs the easy of use ( no bandwidth to adjust ) advantage of the SG and SNRD filters.

Another problem is initializing the history or state. The S-G and SNRD filter one must filling past history to initialize it. The A-B-G filter simply needs the current state. In motion control the current state is almost always known or what it should be ideally.

The state for the alpha-beta-gamma filter only requires 3 variables. That should be all Gaetano needs to get a good velocity estimate.

Gaetano, look at my posts above about the αβγ filter.

#This won’t improve synchronization – x,v,a will be estimated with different phase delay each anyway.

This problem relates to αβγδ only, since it constructs cubic polynomial on the fly on every step (hence 1/Δt³ is needed).

However fixed filters have no such a problem. We can use polynomial of any degree in filter design process (using Maxima, Maple or else) – final formula for acceleration will be multiplied by only 1/Δt

^{2}and velocity by 1/Δt. High degree Δt^{n}cancel each other along the way, thanks to equidistant spacing and laws of physics.Bandwidth is adjusted by filter length. Phase delay is anti-proportional to bandwidth (as smaller band pass/smoothness as higher time delay) in any filter.

I beg to disagree. Besides α, β and γ values we have to store position, velocity and acceleration values from the previous step. In total 6 floating point values should be stored between steps (I was mistaken about 5 values, I forgot acceleration:)).

My point is that αβγ filter is near-equivalent to SG of some high degree (20), with its own pros and cons. But they both have drawback of de-synchronized estimation – position, velocity and acceleration have different phase delays. This can be alleviated by building family of filters which are synchronized by nature, for any sampling rate.

#Hi Pavel,

This post is some time ago, however did you continue working on synchronising the phase delay to improve with SG or ABG? Did you come to any conclusions about this in relation to ideal filtering on motion control? Thank you.

#Hi Pavel,

first of all great job, your work is amazing.

I’m writing you cause facing with some trouble with estimation of coefficients $s(k)$ in “Noise Robust Differentiators for Second Derivative Estimation”.

I would be gratefull if you could post the extended formulas for numerical estimation of second derivative-backward for 15 and 22 points.

Thanks a lot Pavel.

Jax

#Pavel,

I am not very well versed on the math of the subject however by reading your paper, I think I can use your filters in my application.

I have a signal that I sample at 500khz. I am trying to detect a rise, fall and the peak in the incoming data. The base of the peak could be for 250 usec or 2.5msec, amplitude could be 6db or 15db above the noise floor. I don’t have good snr unfortunately. The dc level of the signal is not constant but move much more slower than the ac component.

I couldn’t determine what are the N and n I should use so that I pick the right frequencies and suppress the noise.

At the decision point, I need to know the slope of the rise and fall. This is a hard realtime system and I really need to make a decision in the 100usec after the downward slope reach to dc level.

A secondary side question is the amplitude, due the signal, I cannot know the exact amplitude of the signal, however I like to compare one signal to the other one, which means I need to normalize the signal. Is there a way to do this using your filters.

Thx

KT

#Hi Pavel,

It is a great work. By the way, how can I calcualte the frequency BW for a first order smooth noise-robust differentiators?

Thanks,

Thomas

#Hi Pavel,

As with your original (two-sided) differentiators, can you explain the thinking behind the one-sided differentiators? That is, how the coefficients are derived, etc.?

Thank you for your excellent work!

Uzair

#Pavel,

I have tried to use your formula for the first derivative for irregularly spaced data without success. I can exactly reproduce the derivative of a polynomila (2nd order) when using evenly spaced data, but when the data are irregularly spaced the estimated derivative is quite a bit off. I’m happy to share my data, etc., if it helps to gain an understanding why the method is not working.

Best,

Cliff

#Do you know the corresponding smooth filters to replace the asymmetric (and symmetric) filters in the code below — I would like to compare stability within some other code. (It is a straight paste, but should be readable.)

DO 60 J=4,N-3

der1(J) = ( f_l(J+3) – f_l(J-3) &

– 9.D0 * (f_l(J+2) – f_l(J-2)) &

+ 45.D0 * (f_l(J+1) – f_l(J-1)) ) &

/ 60.D0

60 CONTINUE

! Use unsymmetric 6-point formulae for the derivative at the

! last 3 N points.

der1(N-2) = – ( 3.D0*f_l(N ) &

– 30.D0*f_l(N-1) &

– 20.D0*f_l(N-2) &

+ 60.D0*f_l(N-3) &

– 15.D0*f_l(N-4) &

+ 2.D0*f_l(N-5) ) &

/ 60.D0

der1(N-1) = – (- 12.D0*f_l(N ) &

– 65.D0*f_l(N-1) &

+ 120.D0*f_l(N-2) &

– 60.D0*f_l(N-3) &

+ 20.D0*f_l(N-4) &

– 3.D0*f_l(N-5) ) &

/ 60.D0

der1(N ) = – (- 137.D0*f_l(N ) &

+ 300.D0*f_l(N-1) &

– 300.D0*f_l(N-2) &

+ 200.D0*f_l(N-3) &

– 75.D0*f_l(N-4) &

+ 12.D0*f_l(N-5) ) &

/ 60.D0

#N.B., I am probably interested in N=7 n=4

#Very interesting post. I was wondering could you publish the one-sided first derivative on irregular spaced data. Thanks.

#Hi Pavel!

I stumbled across this web-site while searching for robust differentiators, and was impressed by your work!

I’m also wondering if it’s possible to extend the one-sided filters you published to irregularly spaced data?

Cheers

Konstantin

#Thanks for an interesting article.

I have a question about your formula for the general coefficient $c_k$ in terms of binomial coefficients,

there appears the binomial coefficient

$\left( \begin{array}{c} 2m \\ m – k – 1 \end{array} \right)$

where k goes from 1 to M. But with the way m and M are defined $m -k – 1$ will become negative for the larger values of k.

Is there a typo in the formula, or how do you define the binomial coefficient for a negative argument?

#Hi Magnus, thank you for your question!

This binomial coefficient is zero when $m-k-1 \le 0$. We can compute this by using general definition of binomial coefficient through $\Gamma(n)$ function, with reflection (+limiting) trick for negative arguments.

#“<0", not "<=0" I guess.

Is there a formula using a more common definition of the binomial coefficents so we could use the appropriate functions built in math programs (which fail for neagitive values)

#Hi Pavel,

I am trying to do some pulse sharpening using Y(t)-a*Y”(t)+b*Y””(t) where a and b are real constants. Is there a way to derive a filter for this directly ? If not, is there a formula for the fourth derviative that does not do the second derivative twice ?

Thanks

Amit

#Hi Amit,

Yes, this expression can be written as one filter (from superposition of filters for every summand: Y(t), a*Y”(t) and b*Y””(t)).

And yes, it is possible to derive explicit filter for fourth derivative (e.g. from superposition or directly).

Hope this helps.

#Could you please add a general expression for coefficients for polynominals up to 4th degree or a way to derive them? I understand how to get the number for each N, but I am not sure how to change them into binomial coefficients…

#Hi Pavel,

I am interested in a *formula* for 1D “One Sided Hybrid Differentiators” (as described in your pdf: http://www.holoborodko.com/pavel/wp-content/plugins/download-monitor/download.php?id=7 ). Note that I will need filters with N>15. Values for a specific N are thus not that useful. Any help/pointer is much appreciated.

Martin

#Hi Pavel,

I have a question about the eigenvalues of the numerical differentiation matrices which I believe is closely related to this post. If you have some time, could you take a look at it?

http://math.stackexchange.com/questions/491170/1d-schrodinger-laplace-equation-via-finite-differences-incompatible-eigenvalues

And thanks for this nice and clear post!

Felipe

#Hi Felipe,

I answered to your question on math.stackexchange site few days ago.

Short answer is that you have to use operator convolution instead of product to get higher-order derivative filter from low-order ones.

In frequency domain you can use product of transfer functions, but not in time domain.

#If I understand well, taking nth order derivative that is exact for 1, x (either centered or backward)

is numerically equivalent to

making least-square fit over the same points with Ax+B then taking A as the derivative, right?

Consequently, when the formula is exact to 2nd order polynomials, it is equivalent to fitting N points with

Ax^2+Bx+C then taking df(x_0)=2Ax_0+B at given point as a slope?

Except that those formulas are faster to calculate than performing general least-squares fit formulas, of course…

#Yes, this is correct, filters designed that way called Savitzky-Golay filters.

The only difficulty – there is no flexible way to control noise suppression nor phase delay of such filters (nor other properties).

These properties are fixed in least-squares method + polynomials (time domain).

If you consider filter design in frequency domain (as above) – you will have much more freedom to build filter with any properties preserving exactness on polynomials if desired.

It is like having more parameters / degrees of freedom to tune your filters, whereas least-squares approach have all parameters fixed except for filter length and degree of polynomial. (You can use weighted least-squares to have more degrees of freedom)

BTW, least-squares can be formulated in frequency domain too – it is just one of many possible design criteria accessible in frequency domain.

Just imagine – in frequency domain you can add additional conditions to the least-squares (like phase shift minimization) – and get improved Savitzky-Golay filters with smaller delay (if we talking about one-sided filters).

This topic is so deep, I would be a millionaire if I would get penny for every imaginable criteria & its combinations ;))).

#Hi, i’m doing my thesis in physics.

In one of my experiments i need the second derivative of one of my measurements.

The data is irregularly spaced.

Searching for the “best” second derivative algorithm i came up with this site, which is very interesting by the way.

So, i tried to implement the “second derivative for irregular spaced data ” algorithm

What are the possibilities that something is wrong in the formula?

Just by looking at it i realize that there is no S[0] coefficient.

alpha[k]~k^2*S[k] . So it means alpha[0]=0.

Also, the sumatories there goes from k=1 to M, so the alpha[k] involved does not cointain S[0].

So S[0] is not in the formula, but if you choose regularly saced data “h” you need to recover the original “Uniform Spacing” formula in which S[0] is present.

Could it be that “-2” the missing S[0]?

Thanks in advance

ps: i tried the algorithm “as is” and get all near zero and zero values.

Nevertheless my code could be wrong i need to chek!!!.

But i wanted to be sure of the algorithm first

#Me again,

I leave the algorithm as you said with the “-2” and check with some “test” functions (some polynomials).

It is working fine with these test functions.

(still don’t understand when the S[0] enter in the formula)

Nevertheless, when i use the algorithm in my data measurements i get second derivative near zero values!

If, on the contrary, i use 5 point stencil the derivative works fine.

#Thanks for the algorithms; I am using it for the development of a rate time decline analysis software suite of – always noisy – oil and gas production rate versus time data. Very useful; thanks again

#Hello,

I saw your solution and I tried what you suggested to the first user in order to avoid time lag, in particular

h[0] = 3/8, h[-1] = 1/2, h[-2] = -1/2, h[-3] = -3/4, h[-4] = 1/8, h[-5] = 1/4

and it seems pretty good. For this filter I noticed that the gain remains pretty high and alligned witht he ideal solution even at high frequencies while the phase starts going out of 90° before. Is there a pattern of coefficient that can do the viceversa? I mean keep the phase higher in frequency and start decresing the gain before?

thank you very much

Michele

#Hello,

at first I didn’t notice your pdf “One Sided Noise Robust Differentiators.pdf “, I just found it now. I tested the One Sided Hybrid Differentiators and they are fantastic. Thank you very much to share your work.

Michele

#Pavel,

I am looking at implementing your equations in MATLAB, as shown below. I need to work out the general form of the one sided Z-transform of your difference equation to directly calculate the coefficients in the transfer function. If I have the coefficients in the transfer function, I can directly use MATLAB’s very fast “filter” function. This is not my area of expertise and frankly I am new to the Z-transform. Do you have any suggestions?

,

#A matlab function implement one of Pavel’s formula can be found here:

http://www.mathworks.com/matlabcentral/fileexchange/45745-smooth-robust-differentiators

#The links to the On-Line Encyclopedia of Integer Sequences” (A039598, A050166) are broken. The Encyclopedia can now be found at https://oeis.org. The links are https://oeis.org/A039598 and https://oeis.org/A050166

#Now links are fixed. Thank you.

#First, excellent work… Second, in the generalized form shown for n=2, it appears that the coefficient for the last term, deltaf( (N-1)/2) is always one and the coefficient for the previous term, deltaf((N-3)/2) is always (N-3). Is that true for all cases involving n=2? If so, that might help address some of the previous posts concerning the computation of these coefficients when the binomial coefficient involves a negative value, which appears to only occur for the last two coefficients.

Also, is it possible to derive this generalized equation for n=2 s.t. the location of the derivative estimate within the window is a variable? If that is feasible, would you be able to post that?

Again, excellent work, and thank you

#Hello Pavel,

great work indeed. I played around a bit, and I found I found an interesting implementation in Matlab for the noise differentiators with n=2.

This implementation also generalizes for even filter length.

It is based on first computing the cumulative sum of the filter coefficients, based on tartaglia triangle, or repeated duo-binary filtering,

and then extracting the coefficient as the derivative of this. The cumsum tends to a gaussian since repeated convolutions of [1,1] bring to a gaussian,

and the differentiator filter kernel tends to its derivative.

—————————-

function [filterKer,filterKerINT] = lowFreqDiffCoeffOrderN_v2(N)

% x1 = [1,1]; x11=[1 2 1]; %M=1 1 1 0

% x2 = conv(x1,x11) %M=2 1 3 3 1 0

% x3 = conv(x2,x11) %M=3 1 5 10 10 5 1 0

% x4 = conv(x3,x11) %M=4 1 7 21 35 35 21 7 1 0

% xx = [-0.5 0.5 1.5]; N = 3; 2*filterKerCumsum = 1 1 0

% xx = [ -1.5 -0.5 0.5 1.5 2.5]; N = 5; 8*filterKerCumsum = 1 3 3 1 0

% xx = [-2.5 -1.5 -0.5 0.5 1.5 2.5 3.5]; N = 7; 32*filterKerCumsum = 1 5 10 10 5 1 0

% xx = [-3.5 -2.5 -1.5 -0.5 0.5 1.5 2.5 3.5 4.5]; N = 9; 128*filterKerCumsum = 1 7 21 35 35 21 7 1 0

% 2^(N-2)*filterKerCumsum = correct convolution

%like this it is valid also for even N

if N < 3

error('invalid param');

end

x1 = [1,1];

for jj = 2:N-2

x1 = conv(x1,[1,1]); %this tends to a Gaussian

end

filterKerINT = diff([0,x1,0]); %a LPF differentiator is indeed diff(Gaussian)

filterKer = filterKerINT./(2^(N-2));

#Hello Massimo,

Thank you for sharing your code & findings. I am sure – it will be useful for other people, as I have been receiving requests for MATLAB code :).

I suspected that filters should be close to derivative of Gaussian. The difference comes from the fact that low-noise filters are more “perfectionistic”, using all the DOFs to explicitly zero the high frequencies (at any cost). Gaussian is more relaxed in that respect.

I would say that least squares filters and low-noise are two extreme examples of how noise can be suppressed (non-smooth and smooth).

There are infinite number of filters could be designed in-between of the extremes.

Gaussian is just one of the possibility, which are closer to low-noise.

P.S.

Another interesting fact is that low-noise filters (n=2) can be build using least squares (aka Lanczos or Savitzky-Golay approach) with weights from Pascal triangle. This way we would use Choleskiy decomposition to build the filters. Math is fascinating indeed!

#Dear Pavel,

Thank you for the great solutions that you offer to solve my problems!

I’m interested in advanced finite difference schemes computed on noisy signals. More precisely, I focus on real-time calculation because of my laboratory applications, and thus I’m only looking for backward schemes (because these are causal).

Could you please give me the general formula for the coefficients ck for your one-sided differentiators?

Best regards,

Fabien

#Hi Fabien,

In my Matlab implementation’s of Pavel’s work, I layout the recursive formula and non-recursive formula.

http://www.mathworks.com/matlabcentral/fileexchange/45745-smooth-robust-differentiators

This is from the documentation I wrote for the one sided robust differentiation:

Recursive Formula

coefficients = zeros(1,n+1);

coefficients(1) = 1;

for iRow = 2:n

previousCoefficients = coefficients;

for iColumn = 2:((iRow+1)/2)

coefficients(iColumn) = previousCoefficients(iColumn-1) + previousCoefficients(iColumn);

end

end

% Reflect coefficients about the center of the vector and multiply by -1

coefficients(ceil((n+1)/2+1):end) = -coefficients(floor((n+1)/2):-1:1);

Non-Recursive Formula

k = 1:(n+1)/2;

coefficients(1) = 1;

coefficients(2:ceil((n+1)/2)) = gamma(n)./(gamma(k+1).*gamma(n-k)).*(n-2*k)./(n-k);

% Reflect coefficients about the center of the vector and multiply by -1

coefficients(ceil((n+1)/2+1):end) = -coefficients(floor((n+1)/2):-1:1);

#Thank you Jason, very efficient!

#Hello Pavel!

I have tried your filter for smoothing chromatographic data, and it seems to perform much better compared to classic Savitsky-Golay. In particular, peaks are preserved even for large filter lengths — while Savitsky-Golay tends to “merge” them into one.

I wonder however whether filter order n > 2 would bring additional desired properties, for my type of data. I wasn’t able to derive closed-form expressions for filter coefficients in case of n > 2. I noticed that you provide the coefficients for particular filter length, but not general formula — is it possible that you share a sketch of how you derive these?

Your idea and results seem very simple and sharp at the same time… I think that’s what called brilliance 🙂

#Hello Andrey,

Thank you for your warm feedback!

I am not sure that closed-form expression exists for the case. Just derive few filters and try it.

#I’ll try to follow your derivation, and share the results if discover something interesting.

I have another question, however. Your filter works very well for estimating derivative. But is it possible to obtain just smoothing filter (not differentiator) with similar properties (exact on 1, x, x^2, … and good noise suppression)? I guess is it 🙂 — did you try it?

#Yes, it is possible. Use symmetric filter structure and use other boundary conditions at origin (frequency response = 1, and derivatives = 0).

I did derived such smoothing filters (even for 2D), but couldn’t find my scripts at the moment :(.

Keep me updated on your progress, my direct email is: pavel@holoborodko.com

#hi Pavel,

i am using your differentiator in a very different way as most people here.

i have a gps watch which allows me to enter some basic programming. i am using your algorithm in order to calculate current slope, as derivative of altitude with respect to distance. gps data is noisy and with a very poor granularity (altitude resolution is 1m, and i get distance data only every aprox. 5 seconds, but not constant, might be 3 but also 8, f.i.).

this is one of the versions of my program

http://www.movescount.com/apps/app10090854-Running_Power_GOVSS_V2

you’ll see i mention you in the source code and the app description.

i need the slope to calculate estimated power output, and two sided versions of your algorithm cause a delay which causes problems with the power calculation. therefore i am using your One Sided Noise Robust Differentiators.pdf

my problem is, i have irregularly spaced data. there are no equations for this in your pdf. i could infere the first one, which was pretty obvious. but does not filter that much. could you provide me the formula for N=3, and maybe also N=4?

thank you in advance

#Hi, your algorithms are great – to say the least.

I was looking at NoiseRobustSecondDerivative.pdf for second order derivates with N = 11 or higher.

However I fail to reproduce the coefficients given on this page for 5, 7 and 9.

I checked my algorithm several times, but instead for N=5 of

0: -2

1: 0

2: 1

I keep getting

0: -125.75

1: 0

2: 1

I am using the following node.js code, identical to yours in NoiseRobustSecondDerivative.pdf:

var n = process.argv[2];

var m = (n-1)/2;

`function s(k) {`

if (k>m) return 0;

if (k==m) return 1;

return ( (2*n-10)*s(k+1) - (n+2*k+3)*s(k+2) ) / (n-2*k-1);

}

`console.log("k s(k) for N = ", n);`

for (var i=0;i<=m;i++) console.log(i, s(i));

console.log("Divider", Math.pow(2,n-3));

which can be run by

`node file.js 5`

#Dear Pavel,

excellent work! Are there closed formulas for the coefficients for n>2, i.e. n=3 or n=4?

Best regards,

Marc

#Thank you, Marc.

Probably there is analytic expressions, but unfortunately I am unaware of it.

Andrey Paramonov has used new approach to derive the closed formulas for smoothing filter of the same type: http://www.holoborodko.com/pavel/numerical-methods/noise-robust-smoothing-filter/

Something similar can be used for differentiators I guess.

#Thank you Pavel. Could you publish the frequency response of the second derivative filter as function of the coefficients sk (centered type)?

#I mean the formula like H(w) = 2i … for the first derivative.

#Hi Pavel,

Thanks for your great formulae for noise-robust differentiators!

I’m using your differentiator to calculate D part of PID controller in my INAV flight control software. Discussion is done at RCGroups thread here: http://www.rcgroups.com/forums/showthread.php?t=2495732. Source code (if you are interested) is hosted on GitHub – https://github.com/iNavFlight/inav.

I find PID controller for flight stabilisation with your differentiators much more robust and noise-resistent – which is exactly the way to have perfect flight performance.

Best regards,

Konstantin

#Thank you for letting me know and congratulations on the great project! (omg, there are 365 pages of discussion…).

#Thanks! The project is about a year old, so it’s a page per day on average…

#Pavel, hi again.

There’s an ongoing discussion on using your approach for derivative calculations on GitHub here https://github.com/cleanflight/cleanflight/pull/2087

People are discussing frequency responce and phase delay of your filters against 1-st order/2-order low-pass filtering and Savitzky Golay filter. I feel you may be interested in this discussion and may provide a valuable input from mathematical point of view.

#Very interesting discussion indeed!

Of course, it is not good idea to apply (anti-)symmetric filters to estimate characteristics of a signal in real-time (delay will be half of the filter length). Causal systems need causal filters :). Only then good balance between delay/desired characteristics/overshoot can be achieved.

Design of causal filter comes to minimization of non-linear cost function with many degrees of freedom. I have been working on this for a while with some promising results, but not ready to share the details now.

Actually I know entire industry where such filters are of crucial importance – finance. They need them to detect/predict price change with minimum delay :).

(Sorry for delay in reply, we were having looong Golden Week here).

#Very nice method. I looked for, but didn’t find, a python implementation of it, so I created one. You can find it on my github page under the name holoborodko_diff.

#Dear Jonathan, thank you very much for your efforts!

#Hello, I’d like to thank you for the derivative expressions as they’ve been a huge help.

However, when I tried to recreate One Sided Smooth Differentiators using the formula given at OneSidedNoiseRobustDifferentiators.pdf my results did not match with the ones obtained from given expressions. I was wondering if you could check that general formula in OneSidedNoiseRobustDifferentiators.pdf as I suspect there is a second component missing (similar to Fk – F-k in your article).

Thank you

#Dear Pavel,

For my research work I obtain numerical derivatives of oscillating bodies’ displacement data to reconstruct the force on the body undergoing fluid induced motions as a spring-mass-damper oscillator. The experiment data is coming from a virtual spring-damper system, so the displacement I get directly is pretty smooth. However, when I take the first and second derivatives of it to be used in reconstrucctin of the force the error introduced by the numerical differentiation becomes important. I use the reconstructed force in an empirical mode decomposition method for which mode-mixing can be an important issue if the signal has high frequency noise (error coming from the numerical differentiation in my case.) As a result, I wanted to loo kfor noise-smoothing/robust numerical differentiation schemes. I am assuming I will need a long stencil-low order method to do this.

Having summarized my interest and problem, I am willing to code the procedure you describe here. I am thinking in a way such that the proper zero-lag filter properties (using a function like MATLAB’s filtfilt) with the stencil length corresponding to a desired smooth ratio are automatically computed by looking at the frequency characteristics (i.e. average zero-crossings and FFT) of the signal. I already have a way of computing the proper stencil length for a rectangular or triangular pass with a desired smooth ratio assuming the signal is behaving like a sinusoidal. What I am curios about is that could you please provide a general system of equations for any desired order of tangency accuracy? You already provided an example for the case where n=2 like:

Differentiator of any filter length N can be written as:

f'(x*) = …

where

c_k= …

Would you mind putting general expressions for the system of equations to be solved for any order of accuracy ‘n’?

I am open to an email conversation on this if you are willing as well. I would be glad to hear from you.

#Hello Mert, Pavel sent me an e-mail because he knows I do what you want to do. My company makes hydraulic motion controllers. Controlling hydraulic motion is controlling a mass between two springs. Our motion controller can compute the actual velocity and actual acceleration fairly accurately. Calculating the actual velocity is necessary for using the derivative gain. Calculating the actual acceleration is necessary for using the second derivative gain..

The problem you have is that you are using the quantized and possibly noisy data to compute the actual velocity and actual acceleration. As you have found out that doesn’t work unless the signal is clean and the feedback resolution is close to infinite.

My solution is much different than using a Savitsky-Golay or Pavel’s smooth robust differentiator. These methods are easy to use but have weaknesses when data is digitized ( quantized ) or not sampled at regular intervals. The solution I use requires a lot more math. First, I use a method called system identification to compute the coefficients of the second order differential function. The specific functions I use are Levenberg-Marquardt or the BFGS minimization functions. What they do is minimized the sum of squared errors between the measured data and the estimated data generated by the model. I use the model to estimate position, velocity and acceleration and sometimes even the jerk or third derivative but the estimated jerk starts to get “noisy”.

Since the model is never perfect there needs to be a correction method. For this I use a Luenberg Observer to estimate position, velocity and acceleration. The Luenberg observer uses the error between the estimated and measured data. The observer is a simplified Kalman filter. Both techniques are about the same but the Kalman filter guarantees minimizing the error between the estimated data and the actual data in theory but the means the measurement and system noise must be known which rarely happens. The Luenberg observer simply lets the user select the desired pole locations that provide what appears to provide the best results]

See my videos on system identification. There is a simple version and advanced version.

https://www.youtube.com/channel/UCW-m6-nwUfJrnZ0ftoaTU_w

or simply search for Peter Ponders PID on YouTube.

Another thing you should do is see some of my previous post on this forum about alpha-beta-gamma and alpha-beta-gamma filters.

#Hello Pavel

I’m making a program in C++ (learning) and I’d like to try my hand in implementing your differentiator for various n, m, and orders. But I’d like to clear out a few misunderstandings from my part (I’m not a mathematician).

In generating the system of equations, you use the nth derivative of H(x) (for the sake of cleanliness, let’s consider the real axis, get rid of i). For x=0, you equate diff(H(x), x, n) = diff(Hd(x), x, n), but Hd(x) = x, which means the only time when the evaluation has a result is at the 1st derivative, when it’s 1, in rest, it’s zero. For x=pi, everything should be zero. Is this true? And all even derivatives are zero, which means the matrix will be square, size M. With these in mind (assuming they are true), is there a special arrangement this leads to because of the “trend”: diff(,,1)=>k*c[k], diff(,,3)=>k^3*c[k], etc? I mean if this leads to some special, known form of matrix that would facilitate its numerical evaluation (I plan to use LU decomposition).

I am already using MPFRC++ which turns out to be of a great help compared to bare MPFR, but I’d also like to ask if it’s alright to use your method (and, maybe, others) in a program that, if and when it will see the daylight, it will be open-source (Qt)?

Vlad

PS: I sent this yesterday, but it didn’t appear, so if it comes up as a duplicate, my apologies.

#Hi Pavel,

Please take a look at the following works, I beilieve what’re you trying to do here is known as MaxFlat lowpass differentiaion

Selesnick, Ivan W. “Maximally flat low-pass digital differentiator.” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing 49.3 (2002): 219-223.

Also, a recent what just got published, might be useful:

Hosseini, Mahdi S., and Konstantinos N. Plataniotis. “Derivative Kernels: Numerics and Applications.” IEEE Transactions on Image Processing 26.10 (2017): 4596-4611.

Cheers,

Mike

#Hi Pavel, Hi Mike,

I can confirm that the latter paper of the two papers Mike mentions produces the same filter coefficients, as far as I have tested.

Thank you in any case Pavel, I really appreciate your write-up on the matter.

Cheers

Oliver Haas

#The second paper is on arxiv as well:

https://arxiv.org/pdf/1709.08321.pdf

#Did you ever think that this methodology could be used to derive Linear Multistep Methods for integration of Ordinary Differential Equations?

The Backward Differentiation Formulas (BDF’s, aka Gear’s method) are derived from Lagrange polynomials. The Lagrange polynomial is differentiated at right point of the window and thus “backwards difference.”

https://en.wikipedia.org/wiki/Backward_differentiation_formula

I can use using this idea to step over faster dynamics when running ODE fixed step integration. We often need a solution in fixed step solver for microcontrollers. However, we need stability over accuracy. Accuracy is important but stability is more important at the fixed time step.

#Dear Pavel,

i am using the approximation formula for second derivative for irregular spaced data in the backward form which I found in “Noise Robust Second Derivative.pdf”. I want to estimate acceleration from optical encoder counts (position) using the timestamp information of the counts.

I get strange results (always 0) using the formula $\frac{1}{2^{N-3}} \left(\sum_{k=1}^M \beta_k(f_{-M-k}+f_{-M+k})-2f_{-M}\sum_{k=1}^M \beta_k \right)$, and looking deeper on it i realized that $(f_{-M-k}+f_{-M-k}-2f_{-M})$ is always zero if there are no changes in direction an no missing counts. This because the position $f_k$ is equidistant spaced.

I see the same problem happen in the centered version of the formula.

I am missing something? can I have some hint to derive myself the formula?

Thanks

#Hello Pavel

Nice work publishing this here and thanks for sharing. Really educative.

A question:

I tried applying the 4-point differentiator you list in the table titled as “smooth noise robust differentiators (n=2, exact on 1, x, x^2 )”.

The sign of the derivatives is not what I expected and not what I get by using a 2-point central difference scheme.

Could you check the signs in these formulas?

I believe they are the opposite of what they should, unless I am doing something wrong.

Many thanks,

Konstantinos

#The low noise differentiators have been used in our signal processing systems for the BCM1F detector of the CMS Experiment at CERN.

Thank you very much.

Here is the link of a publication:

https://www.sciencedirect.com/science/article/pii/S0168900218310647?via%3Dihub#bb5

#I like your one-sided differentiators. Is there a simple way to introduce various levels of lag, from very small to larger, in the formulas/results?

#Have you seen this:

Maximally Flat Lowpass Digital Differentiators

Ivan W. Selesnick

August 3, 2001

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.218.4587&rep=rep1&type=pdf

Looks awfully similar to what you are proposing here.

#I had no idea about the results of other researchers when I was working on the filters (~2008). I saw the related papers long after writing this post.

(We exchanged emails with Ivan on this topic few years ago).

In fact, design of maxflat filters is a long standing research topic in pure signal processing – there are many related papers (including smoothing and narrow band filters), see comments here: http://www.holoborodko.com/pavel/numerical-methods/noise-robust-smoothing-filter/

My initial motivation to build the filters was rooted in numerical analysis (rather than in signal processing) and my design approach reflects this – providing (I think) interesting mixture of tools from both fields.

Classic numerical methods are usually build around accuracy on polynomials (interpolation, series approximation, etc.).

But, if we consider its properties in frequency domain (as digital filter), then we have entire new dimension of possibilities – while keeping the polynomial accuracy we can improve stability (e.g. high-freq. suppression), derive coefficients of any form (e.g. powers of 2), and derive method with special properties for the particular task. This is the main point of my approach – combine numerical analysis and signal processing to view classic methods from new perspective. After all, exactness on polynomials is just one-point constraint in frequency domain, all other DoFs can be used for other purposes.

This idea can be applied to nearly any other classic numerical methods (e.g. integration, interpolation, etc.). Numerical derivative is supposed to be one simple demonstration of the flexibility we can get from this approach.

#Hi Pavel,

this is a really interesting read.

I took the liberty of taking the large m limit of your expression, as I was interested what form it would take if I had very finely sampled data, and found that it seemed quite clearly to be converging to:

how interesting! It appears just to be a Gaussian multiplied by a linear term! Is this in essence a reinvention of differentiation by integration with a Gaussian kernel? I’m quite interested to hear your thoughts on this.

I repeated the same for Andrey’s work, but instead did so numerically. It became quite difficult to expand analytically and numerically, so I took the Stirling Approximation to compute the terms and found that if one takes the large m limit there, and plots numerically, the function appears to be very close in form to the Lanczos filter sinc(x) sinc(x/2) for |x|<2 and 0 else, but with some extra undulations to obtain the required smoothness in frequency space. I found very little difference (aesthetically and numerically) if one uses this replacement kernel, but instead note that the kernal is about 1/20th the size. Curious to hear your thoughts!

Cheers,

Jesse

#Hello Pavel,

After reading your work, I implemented a small C++ library that I was using for trying out different filters. I think the lib can largely be improved but if you are interest in, please check here: https://github.com/vsamy/DiFipp

Cheers,

Vincent

#Pavel

I have been reading your emails for many years. I have a problem that I hope you could help me with:

I found it amazing that “Borwein, etc.” had come up with a technique to calculate pi digit-by-digit. Their technique puts out the digits in the hexadecimal format. I would prefer decimal format. Would you know of any technique that accomplishes this?

Thank you

Sheldon Lebowitz

#Hi Pavel,

Thank you so much for sharing these helpful information to get smooth derivatives, however I have data of non-constant intervals, so I would appreciate if you could share the equation related to non-constant interval.

Great thanks,

Ashkan Alborzi