Smooth noise-robust differentiators

Introduction

This page is about numerical differentiation of noisy data or functions. Its first part briefly summarizes results from my other articles covering standard approaches to computing derivatives numerically. The text is self-contained, so you don’t have to read the other articles to understand it.

The description begins with an analysis of the well-known central differences, establishing the reasons for their poor noise-suppression capabilities.

Then I consider differentiators with improved noise suppression, derived by the classical method of least-squares smoothing. Such filters are known as low-noise Lanczos differentiators or Savitzky-Golay filters. I briefly sketch their properties and their main drawback: a “wavy”, always-nonzero response at noisy high frequencies.

Finally, I propose my original filters for numerical differentiation, with guaranteed noise suppression and a very efficient computational structure (the coefficients are fractions whose denominators are powers of 2). You can skip directly to the table of final formulas.

November 10, 2011: It is not technically correct to state that Savitzky-Golay filters are inferior to the proposed ones. In fact, each outperforms the other in certain situations. Savitzky-Golay is optimal for Gaussian noise suppression, whereas the proposed filters are suitable when a certain range of frequencies is corrupted by noise.

Prior methods

The classical approach to computing derivatives numerically relies on approximating f(x) near the target point x^* by some polynomial P(x). If the neighborhood of x^* is chosen properly, then P(x) approximates f(x) well on it, and we can assume that f'(x^*)\approx P'(x^*).

Particular methods can be derived within this general framework by using different techniques for approximating f(x) by P(x).

Central differences are obtained by interpolating f(x) by P(x). Since interpolation provides no high-frequency suppression, they work well only for noiseless functions whose values can be computed precisely. For noisy functions or data, central differences give inadequate results.

The red dashed line is the response of an ideal differentiator, H_d(\omega) = i\omega. A more detailed analysis of their properties can be found on the central differences page.

To resolve this issue, Lanczos differentiators (Savitzky-Golay filters) use a smoothing least-squares approximation of f(x) by P(x) to remove noise from the data. As a result, this method is much more robust to noise. However, due to their least-squares nature, Lanczos differentiators cannot guarantee complete noise suppression at high frequencies. Their magnitude response does not go smoothly to zero near \pi; instead, it exhibits a “wavy”, always-nonzero response at high frequencies.

Although the amplitude of the waves decreases for longer filters, one must choose very long filters to achieve acceptable suppression. See the Low-noise Lanczos differentiators page for more details.

From a signal-processing point of view, differentiators are anti-symmetric digital filters with finite impulse response of Type III. There are powerful methods for their construction, based on windowed Fourier series, frequency sampling, etc.; the Parks-McClellan algorithm is one of the best among them. The main idea behind these methods is to construct a filter from the required magnitude response. Although such algorithms are capable of generating near-optimal solutions, they spread the error (deviation from the target magnitude response) uniformly across the passband (low frequencies) and the stopband (high frequencies). A differentiator, on the contrary, needs to be very precise at low frequencies. Moreover, filters designed by standard frequency-based methods lack a crucial time-domain characteristic: exactness on polynomials.

Task statement

To summarize, the desired numerical-derivative computation scheme (filter) should possess the following properties:

  • Exactness on polynomials.
  • Precision at low frequencies.
  • Smooth and guaranteed suppression of high frequencies (noise robustness).

Additionally, it should have a computationally favorable structure so that it can be applied efficiently in practice. This page is devoted to the development of such a method (skip to the table of final results).

Derivation

Let’s denote the filter length by N (odd), the filter coefficients by \{c_k\}, and the function values sampled at N equidistant points around x^* with step h by:

f_k = f(x_k),\: x_k = x^*+kh,\: k=-M,\dots,M,\, M = \frac{N-1}{2}

Then the numerical derivative can be written in the general form

f'(x^*)\approx \frac{1}{h}\sum_{k=1}^{M}{c_k\cdot(f_k-f_{-k})}.

As noted above, \{c_k\} defines an anti-symmetric filter of Type III. Its frequency response is (with h=1):

H(\omega)=2i\sum_{k=1}^{M}{c_k\sin(k\omega)}

Our goal is to select the coefficients \{c_k\} such that H(\omega) is as close as possible to the response of an ideal differentiator, H_d(\omega)=i\omega, in the low-frequency region, and tends smoothly to zero towards the highest frequency \omega=\pi.

The most intuitive way of doing this is to force H(\omega) to have a high order of tangency with H_d(\omega) at \omega=0, as well as a high order of tangency with the \omega axis at \omega=\pi. This leads to a system of linear equations in \{c_k\}:

 \left\{ \begin{matrix} \displaystyle \left. \frac{\partial^{i} H(\omega)}{ \partial \omega^{i}} \right|_0 & = & \displaystyle \left. \frac{\partial^{i} H_d(\omega)}{ \partial \omega^{i}} \right|_0 & i = 0,\dots,n\\ &&&\\ \left. \displaystyle \frac{\partial^{j} H(\omega)}{ \partial \omega^{j}} \right|_{\pi} & = & 0 & j = 0,\dots,m \end{matrix} \right.

Let’s consider a particular example, for N=5,\, n=2, and m=1:

 \left\{ \begin{matrix} 2i\sin(0)\sum_{k=1}^{2}{c_k} & = & 0\\ &&\\ 2i\cos(0)\sum_{k=1}^{2}{k\,c_k} & = & i\\ &&\\ -2i\sin(0)\sum_{k=1}^{2}{k^2\,c_k} & = & 0\\ &&\\ 2i\sum_{k=1}^{2}{c_k\sin(k\pi)} & = & 0\\ &&\\ 2i\sum_{k=1}^{2}{kc_k\cos(k\pi)} & = & 0 \end{matrix} \right. \quad \Rightarrow \quad \left\{ \begin{matrix} 0 & = & 0\\ &&\\ 2c_1+4c_2 & = & 1\\ &&\\ 0 & = & 0\\ &&\\ 0 & = & 0\\ &&\\ -2c_1+4c_2 & = & 0 \end{matrix} \right.

Finally, we arrive at the simple solution c_1=\frac{1}{4},\,c_2=\frac{1}{8}. In the same way we can obtain differentiators for any N.
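As a sanity check, the reduced 2×2 system can be solved in exact rational arithmetic. Below is a minimal Python sketch (the variable names are mine, not from the derivation):

```python
from fractions import Fraction

# Reduced system from the N=5, n=2, m=1 example above:
#    2*c1 + 4*c2 = 1    (H matches the slope of i*omega at omega = 0)
#   -2*c1 + 4*c2 = 0    (first derivative of H vanishes at omega = pi)
# Solved by Cramer's rule so the result stays exact.
a11, a12, b1 = Fraction(2), Fraction(4), Fraction(1)
a21, a22, b2 = Fraction(-2), Fraction(4), Fraction(0)

det = a11 * a22 - a12 * a21        # 2*4 - 4*(-2) = 16
c1 = (b1 * a22 - a12 * b2) / det   # 4/16 = 1/4
c2 = (a11 * b2 - b1 * a21) / det   # 2/16 = 1/8
print(c1, c2)  # 1/4 1/8
```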

As can be clearly seen, tangency of H(\omega) with the response of the ideal differentiator at \omega=0 is equivalent to exactness on monomials up to the corresponding degree: 1,\,x,\,x^2,\dots,x^n. Thus this condition reformulates exactness on polynomials in frequency-domain terms, so we don’t need to impose additional requirements to guarantee this valuable property.

Results

Below we present coefficients for the case n=2, with m chosen so that the number of unknowns equals the number of equations.


 \begin{tabular}{|r|c|} \hline $N$& Smooth noise-robust differentiators (n=2, exact on $1,\,x,\,x^2$)\\ \hline &\\ 5&$\displaystyle\frac{2(f_{1}-f_{-1})+f_{2}-f_{-2}}{8h}$\\ &\\ 7&$\displaystyle\frac{5(f_{1}-f_{-1})+4(f_{2}-f_{-2})+f_{3}-f_{-3}}{32h}$\\ &\\ 9&$\displaystyle\frac{14(f_{1}-f_{-1})+14(f_{2}-f_{-2})+6(f_{3}-f_{-3})+f_{4}-f_{-4}}{128h}$\\ &\\ 11&$\displaystyle\frac{42(f_{1}-f_{-1})+48(f_{2}-f_{-2})+27(f_{3}-f_{-3})+8(f_{4}-f_{-4})+f_{5}-f_{-5}}{512h}$\\ &\\ \hline \end{tabular}

A differentiator of any filter length N can be written as:

 \displaystyle {f'(x^*)\approx\frac{1}{h}\sum_{k=1}^{M}{c_k\cdot(f_k-f_{-k})}},

where

 \displaystyle {c_k = \frac{1}{2^{2m+1}}\left[{2m\choose m-k+1}-{2m\choose m-k-1}\right]},\quad \displaystyle{m=\frac{N-3}{2}},\quad M=\frac{N-1}{2}
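The closed-form expression translates directly into code. Here is a short Python sketch (the function names are mine); it reproduces the table rows exactly:

```python
from fractions import Fraction
from math import comb

def binom(n, k):
    """Binomial coefficient, extended with zeros outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def snrd_coeffs(N):
    """Coefficients c_1..c_M of the length-N (odd, N >= 5) smooth
    noise-robust differentiator of the n = 2 family, as exact fractions."""
    m = (N - 3) // 2
    M = (N - 1) // 2
    return [Fraction(binom(2 * m, m - k + 1) - binom(2 * m, m - k - 1),
                     2 ** (2 * m + 1))
            for k in range(1, M + 1)]

# N = 11 row of the table: numerators 42, 48, 27, 8, 1 over 512.
print([c * 512 for c in snrd_coeffs(11)])
```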

Interestingly, the number sequence in the numerators (formed by the expression in square brackets) is well known in other areas of mathematics. See the On-Line Encyclopedia of Integer Sequences for more information: A039598, A050166.

Frequency-domain characteristics of the differentiators are drawn below. The red dashed line is the response of an ideal differentiator, H_d(\omega) = i\omega.

Besides guaranteed noise suppression, smooth differentiators have an efficient computational structure. Costly floating-point division can be avoided completely: the denominator is a power of 2, so with an appropriate choice of h=2^{-p} (h=1/SamplingRate for digital data) the division can be replaced by a fast shift operation.

Unlike plain division, a shift operation does not introduce any rounding error (only the exponent is altered; the mantissa remains the same). As a consequence, smooth differentiators are not only computationally efficient but also capable of giving more accurate results than other methods (Savitzky-Golay filters, etc.).

Smooth differentiators can also be implemented efficiently in fixed-point arithmetic (e.g., for robust edge detection in images).
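As an illustration, on integer samples with h=1 the N=5 filter needs nothing beyond additions, subtractions and one shift; a minimal sketch (the helper name and sample data are mine):

```python
def snrd5_int(f, i):
    """N=5 smooth differentiator on integer samples with unit spacing:
    (2*(f[i+1] - f[i-1]) + f[i+2] - f[i-2]) / 8, where the division by
    8 = 2**3 is performed as an arithmetic right shift."""
    return (2 * (f[i + 1] - f[i - 1]) + f[i + 2] - f[i - 2]) >> 3

# The filter is exact on linear data: for f(x) = 3x the numerator is
# 2*6 + 12 = 24, and 24 >> 3 recovers the slope 3.
samples = [3 * x for x in range(10)]
print(snrd5_int(samples, 4))  # 3
```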

If we require 4th-order tangency with H_d(\omega) at \omega=0 (which is equivalent to exactness on polynomials up to the 4th degree), the following filters are obtained:


 \begin{tabular}{|r|c|} \hline $N$& Smooth noise-robust differentiators (n=4, exact on $1,\,x,\,x^2,\,x^3,\,x^4$)\\ \hline &\\ 7&$\displaystyle\frac{39(f_{1}-f_{-1})+12(f_{2}-f_{-2})-5(f_{3}-f_{-3})}{96h}$\\ &\\ 9&$\displaystyle\frac{27(f_{1}-f_{-1})+16(f_{2}-f_{-2})-(f_{3}-f_{-3})-2(f_{4}-f_{-4})}{96h}$\\ &\\ 11&$\displaystyle\frac{322(f_{1}-f_{-1})+256(f_{2}-f_{-2})+39(f_{3}-f_{-3})-32(f_{4}-f_{-4})-11(f_{5}-f_{-5})}{1536h}$\\ &\\ \hline \end{tabular}

These filters also show smooth noise suppression, with an extended passband. As the filter length N grows, frequencies are smoothly zeroed out starting from the highest towards the lowest.
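The exactness claim can be confirmed numerically. Below is a small Python check of the n=4, N=7 filter against the analytic derivative of each monomial (the harness is mine; exact rational arithmetic avoids any tolerance):

```python
from fractions import Fraction

def d_n4_N7(f, x0, h):
    """n = 4, N = 7 smooth differentiator from the table above."""
    D = lambda k: f(x0 + k * h) - f(x0 - k * h)
    return (39 * D(1) + 12 * D(2) - 5 * D(3)) / (96 * h)

x0, h = Fraction(3), Fraction(1, 2)
for p in range(5):  # monomials 1, x, x^2, x^3, x^4
    est = d_n4_N7(lambda x, p=p: x ** p, x0, h)
    exact = p * x0 ** (p - 1) if p > 0 else Fraction(0)
    assert est == exact
print("exact on 1, x, x^2, x^3, x^4")
```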

The proposed method can easily be extended to calculate numerical derivatives for irregularly spaced data. The general formula is:

    \[ \displaystyle {f'(x^*)\approx \sum_{k=1}^{M}{c_k\cdot\frac{f_{k}-f_{-k}}{ x_{k}-x_{-k}}\cdot 2k}}\]

where the coefficients \{c_k\} are the same as for uniform data (they can be taken from either of the tables above).
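A direct Python transcription of this formula (the names are mine) shows that, like the uniform version, it remains exact on linear data for arbitrary spacing:

```python
from fractions import Fraction

def snrd_irregular(c, x, f, i):
    """First-derivative estimate at x[i] on irregularly spaced data:
    sum over k of c_k * (f_{k} - f_{-k}) / (x_{k} - x_{-k}) * 2k."""
    return sum(ck * (f[i + k] - f[i - k]) / (x[i + k] - x[i - k]) * 2 * k
               for k, ck in enumerate(c, start=1))

# Made-up irregular abscissas; f(x) = 2x + 1 should give slope 2 exactly.
x = [Fraction(v, 10) for v in (0, 7, 11, 24, 30, 31, 45)]
f = [2 * xi + 1 for xi in x]
c5 = [Fraction(1, 4), Fraction(1, 8)]  # N = 5 coefficients from the table
print(snrd_irregular(c5, x, f, 3))  # 2
```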

Another extension is filters that use only past data, i.e., forward differentiators. Please check this report for more information:

One Sided Noise Robust Differentiators.pdf (170KB)

Extensions

There are several obvious extensions of the proposed smooth differentiators: second- (and higher-) order filters, forward (zero time lag) filters, 2D smooth differentiators of any order, band-pass differentiators, etc. Here I present only second-order smooth differentiators with their properties. I will post other extensions upon request.


 \begin{tabular}{|r|c|} \hline $N$& Second order smooth differentiators (exact on $1,\,x,\,x^2,\,x^3$)\\ \hline &\\ 5&$\displaystyle{\frac {(f_{{2}}+f_{{-2}})-2\,f_{{0}}}{4\,{h}^{2}}}$\\ &\\ 7&$\displaystyle{\frac {(f_{{3}}+f_{{-3}})+2\,(f_{{2}}+f_{{-2}})-(f_{{1}}+f_{{-1}})-4\,f_{{0}}}{16\,{h}^{2}}}$\\ &\\ 9&$\displaystyle{\frac {(f_{{4}}+f_{{-4}})+4\,(f_{{3}}+f_{{-3}})+4\,(f_{{2}}+f_{{-2}})-4\,(f_{{1}}+f_{{-1}})- 10\,f_{{0}}}{64\,{h}^{2}}}$\\ &\\ \hline \end{tabular}

The coefficients of these filters can be computed by a simple recursive procedure for any N. They can also easily be extended to irregularly spaced data, as well as to one-sided derivative estimation when only past data are available. Please check this report for more information:

Noise Robust Second Derivative.pdf (75KB)

Contact me by e-mail if you are interested in more specific cases.

In the case of a higher approximating order, we get the following estimates.


 \begin{tabular}{|r|c|} \hline $N$& Second order smooth differentiators (exact on $1,\,x,\,x^2,\,x^3,\,x^4,\,x^5$)\\ \hline &\\ 7&$\displaystyle {\frac {-(f_{{3}}+f_{{-3}})+5\,(f_{{2}}+f_{{-2}})+(f_{{1}}+f_{{-1}})-10\,f_{{0}}}{12\,{h}^{2}}} $\\ &\\ 9&$\displaystyle {\frac {-7\,(f_{{4}}+f_{{-4}})+12\,(f_{{3}}+f_{{-3}})+52\,(f_{{2}}+f_{{-2}})-12\,(f_{{1}}+f_{{-1}})-90\,f_{{0}}}{192\,{h }^{2}}} $\\ &\\ \hline \end{tabular}

These filters can be used to approximate the second derivative for irregularly spaced data as

    \[  \displaystyle{f^{\prime\prime}(x^*) =  \sum_{k=1}^{M}{{\alpha_k}(f_k+f_{-k})}}-2f_{0}\sum_{k=1}^{M}{\alpha_k}, \quad\displaystyle{\alpha_k = \frac{4k^2s_k}{(x_k-x_{-k})^2}}\]

where \{s_k\} are the coefficients from the formulas for uniformly spaced data derived above.
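In Python this reads as follows (a sketch with my own names; the check verifies that on a uniform grid the formula reduces to the N=5 entry of the second-order table, returning f'' = 2 exactly for f(x) = x²):

```python
from fractions import Fraction

def snrd2_irregular(s, x, f, i):
    """Second-derivative estimate at x[i] per the formula above,
    with alpha_k = 4*k^2*s_k / (x_{k} - x_{-k})^2."""
    alpha = [4 * k * k * sk / (x[i + k] - x[i - k]) ** 2
             for k, sk in enumerate(s, start=1)]
    return (sum(a * (f[i + k] + f[i - k])
                for k, a in enumerate(alpha, start=1))
            - 2 * f[i] * sum(alpha))

s5 = [Fraction(0), Fraction(1, 4)]   # s_1, s_2 for the N = 5 filter
x = [Fraction(k) for k in range(5)]  # uniform grid, h = 1
f = [xi ** 2 for xi in x]
print(snrd2_irregular(s5, x, f, 2))  # 2
```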

BibTeX

Here’s a BibTeX entry that you can use to cite smooth differentiators in your academic publications:

@misc{snrd,
author = {Pavel Holoborodko},
title = {Smooth Noise Robust Differentiators},
howpublished = {http://www.holoborodko.com/pavel/numerical-methods/numerical-derivative/smooth-low-noise-differentiators/},
year = {2008}
}

Disclaimer

You can use materials from this site freely in any legal activity (research, teaching, engineering, etc.), provided you respect my authorship by clearly stating and referencing my site, or the particular page on it, in your work. A brief description of your project is welcome in the comments below. One point, though: if you use my materials in a commercial project, please consider supporting my site and me financially.



153 Comments

  1. Yoel
    Posted February 4, 2009 at 12:43 am | #

    Hi Pavel.

    Nice work. I’m using it now to compute the velocity of a robot (MBARS) and your methods give very good results.

    I’d like to know if you have the formula for a one-sided version, as using a centered version forces me to introduce a time lag.

    Thanks, Yoel Shapiro.

  2. Posted February 6, 2009 at 3:28 pm | #

    Hi Yoel.

    I am very grateful for your feedback.
    It is very exciting for me to learn about real-world applications of my work.

    Noise robust differentiators without time delay (one-sided or forward filters):
    (As always, longer impulse response means better noise suppression.)

    Precise on 1, x:
    1/4 (h[0] = 1, h[-1] = 1, h[-2] = -1, h[-3] = -1)
    1/8 (h[0] = 1, h[-1] = 2, h[-2] = 0, h[-3] = -2, h[-4] = -1)
    1/16 (h[0] = 1, h[-1] = 3, h[-2] = 2, h[-3] = -2, h[-4] = -3, h[-5] = -1)
    1/32 (h[0] = 1, h[-1] = 4, h[-2] = 5, h[-3] = 0, h[-4] = -5, h[-5] = -4, h[-6] = -1)
    1/64 (h[0] = 1, h[-1] = 5, h[-2] = 9, h[-3] = 5, h[-4] = -5, h[-5] = -9, h[-6] = -5, h[-7] = -1)

    Precise on 1, x, x^2:
    h[0] = 5/8, h[-1] = 1/4, h[-2] = -1, h[-3] = -1/4, h[-4] = 3/8
    h[0] = 3/8, h[-1] = 1/2, h[-2] = -1/2, h[-3] = -3/4, h[-4] = 1/8, h[-5] = 1/4
    h[0] = 7/32, h[-1] = 1/2, h[-2] = -1/32, h[-3] = -3/4, h[-4] = -11/32, h[-5] = 1/4, h[-6] = 5/32
    h[0] = 1/8, h[-1] = 13/32, h[-2] = 1/4, h[-3] = -15/32, h[-4] = -5/8, h[-5] = -1/32, h[-6] = 1/4, h[-7] = 3/32
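    For example, the second “precise on 1, x” filter can be checked on linear data with a few lines of Python (a quick sketch; the sample data are made up):

```python
def one_sided_deriv(f, i, h=1.0):
    """One-sided filter 1/8 * (f[i] + 2*f[i-1] - 2*f[i-3] - f[i-4]);
    it uses only the current and past samples, so there is no lag."""
    return (f[i] + 2 * f[i - 1] - 2 * f[i - 3] - f[i - 4]) / (8 * h)

# Exact on linear data: the slope 0.5 is recovered from past samples only.
f = [0.5 * t for t in range(10)]
print(one_sided_deriv(f, 9))  # 0.5
```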

    One-sided filters have several disadvantages compared to centered versions.
    For example, to achieve the same noise suppression a one-sided filter
    must be much longer than a centered one. They also strongly amplify noise
    in the mid-frequency range.

    It would be nice if you could tell me which particular centered filter
    works best for you. Then I can try to design a one-sided filter with similar properties.

    Let me know about the results.
    Also, I would gladly reference your project (web page, journal paper, etc.) if you provide me with such info.

  3. Yoel
    Posted February 9, 2009 at 7:33 pm | #

    Hi Pavel

    Thanks again.

    I experimented with a few filters of yours and found the smooth noise-robust estimator with n=4, N=7 to be satisfactory.

    I’m about to hand in my thesis soon, but haven’t published yet. I can contact you later and you can reference my thesis if that helps.

    you can check this to see what I’m working on:

    http://brml.technion.ac.il/show_project.php?id=14&type=1

    I used your filter to damp the motion of my controller (PHANToM Desktop).

    if you want to stay in contact better address me at my email.

    Cheers, Yoel.

  4. Henry
    Posted May 26, 2009 at 3:29 pm | #

    Hi Pavel:

    This is an excellent work!
    Could you please post 3rd-order formulas for both central differences and the smooth noise-robust filters (N=5, 7 or 9)?
    I’d like to study the difference (enhancement) between the two.
    How do I estimate the error, O(h^n)?
    Thanks

    • Posted May 26, 2009 at 5:12 pm | #

      Hi,
      thanks for the comment.

      Estimation of the error can be done by plugging Taylor series (evaluated at the corresponding points kh) into the formulas in place of the f_k.

      I’ll publish third order formulas shortly.

      I cannot imagine a practical need for a numerical approximation of the 3rd derivative, though. Could you please enlighten me: what kind of task are you solving?

  5. Posted May 29, 2009 at 4:17 pm | #

    1. Third order smooth differentiators of O(h^2) approximation error:

     \begin{tabular}{|r|c|} \hline $N$&Third order smooth differentiators (exact on $x^m,\,m=0,\dots,4$)\\ \hline &\\ 7&$\frac{-3(f_{1}-f_{-1})+f_{3}-f_{-3}}{8h^3}$\\ &\\ 9&$\frac{-6(f_{1}-f_{-1})-2(f_{2}-f_{-2})+2(f_{3}-f_{-3})+f_{4}-f_{-4}}{32h^3}$\\ &\\ 11&$\frac{-14(f_{1}-f_{-1})-8(f_{2}-f_{-2})+3(f_{3}-f_{-3})+4(f_{4}-f_{-4})+f_{5}-f_{-5}}{128h^3}$\\ &\\ \hline \end{tabular}

    2. Third order smooth differentiators of O(h^4) approximation error:

     \begin{tabular}{|r|c|} \hline $N$&Third order smooth differentiators (exact on $x^m,\,m=0,\dots, 6$)\\ \hline &\\ 9&$\frac{-12(f_{1}-f_{-1})+2(f_{2}-f_{-2})+4(f_{3}-f_{-3})-(f_{4}-f_{-4})}{16h^3}$\\ &\\ 11&$\frac{-54(f_{1}-f_{-1})-8(f_{2}-f_{-2})+23(f_{3}-f_{-3})+4(f_{4}-f_{-4})-3(f_{5}-f_{-5})}{128h^3}$\\ &\\ \hline \end{tabular}

  6. Steve Scigliano
    Posted July 14, 2009 at 2:04 am | #

    Hello Pavel,

    I tried your method on some very noisy surface-scan data collected from a coordinate measurement machine, and it gave impressive results using the n=4, 11-point filter. However, I believe that I need more than 11 points to get the results I need. Can you publish the general form of this particular filter, or at the very least post the N=19 coefficients?

    Thank you very much.

    • Posted July 14, 2009 at 11:43 am | #

      Hello!
      Thank you for your interest in my results!

      Here is n=4, N=19 filter:

           \[ \displaystyle{ \frac{1}{393216\,h} \left[\begin{matrix} 36894\Delta_1+45552\Delta_2+28028\Delta_3+7056\Delta_4\\ -2700\Delta_5-3152\Delta_6-1281\Delta_7-264\Delta_8-23\Delta_9 \end{matrix}\right] } \]

      where

          \[ \Delta_i = f_i-f_{-i}\]

      Let me know about the test results. I’ll publish filters for other N if you will need them.

      I think for your task (noisy surface data) you need to use 2D smooth differentiators.
      They would perform much better than 1D filters, because two-dimensional data have noise at diagonal frequencies which, by definition, cannot be suppressed well by 1D filters.

      Only 2D filters are capable of filtering out noise in all directions uniformly (isotropic noise suppression). However, building such smooth noise-robust filters in 2D is not easy; it requires considerable effort and time.
      If you are targeting commercial applications, I would be grateful if you would consider supporting me in this task.

      Thank you!

      • Steve Scigliano
        Posted July 14, 2009 at 11:53 pm | #

        Pavel,

        Thank you for your quick response. In our case, the surface data was collected by scanning curves on the surface, so a 1D filter is working perfectly. We are still in the research stage at this point and have not decided how to proceed.

        On a slightly different topic, can your work be applied to the data itself?

        Regards,

        Steve

        • Posted July 15, 2009 at 12:21 pm | #

          Yes, these filters can be applied to the raw data itself. One point, though: the data should be uniformly spaced.

          How do you apply them now? Do you process the data somehow beforehand? Smoothing? Curve fitting (you would not need a numerical derivative in that case)?

          I have some (rather limited) experience in processing data from a coordinate measurement machine, and IMHO the following workflow is preferable:

          Smoothing filters with guaranteed noise suppression (similar to the derivative filters described here) can be applied at the first stage of the data-processing pipeline to prepare the data for the whole range of other operations: finding surface characteristics (first and second derivatives to estimate curvature, normal vectors, etc.), fitting a smooth analytical surface representation (by splines), or anything else.

          With this method you remove noise from the data once and for all, and can then use standard algorithms for further operations without worrying about noise anymore.

          Does that make sense?

  7. Astrid
    Posted July 24, 2009 at 11:17 pm | #

    Hi Pavel,

    I’m now using n=2 and N=5 in the 1D form of your smooth noise-robust differentiator, which already gives impressive results for edge detection on medical images. I would be interested in using your smooth noise-robust differentiator in the 2D form, to improve the analysis. Could you be so kind as to post this?

    Regards,
    Astrid

    • Posted July 27, 2009 at 11:03 am | #

      Hi, thanks for the interest in my work.

      I haven’t finished my research on 2D filters yet. Please use 1D filters for the moment; they estimate gradient components in 2D well.

      I’ll get in touch once I am ready to share the essential 2D filters for edge detection.

      Cheers!

      • Posted July 27, 2009 at 6:42 pm | #

        I’ve published 2D noise-robust derivative filters here: Edge Detection.

        The first is similar to n=2, N=5 but has isotropic noise suppression. The second is more noise-robust and corresponds to n=2, N=7.

        Give them a try and let me know about the results, please.

  8. Bernard McGarvey
    Posted August 5, 2009 at 8:26 pm | #

    Pavel – I am looking at some data where there is noise, but where the derivative profile is very informative, so I am interested in your approach. Two questions:
    1. Have you published any papers on this, and if so, can you give me the references?
    2. My data starts off with h=1, then changes to h=5, and for the last portion h=15. Any suggestions as to how I could handle the changes in h when I use your formulas?

    Thanks

    • Posted August 5, 2009 at 9:02 pm | #

      I have not published a paper yet, although I plan to do so. You can ask me any questions by commenting or by e-mail if something in the article above is unclear. You can also reference this web page in your paper.

      In the case of a step change, simple linear interpolation could be used to derive the missing uniformly spaced data at the boundary, which could then be plugged into the differentiator. I’m sure it will give satisfactory results.

      • Bernard McGarvey
        Posted August 6, 2009 at 1:05 am | #

        Thanks for the quick reply. What I did was to divide each component of the calculation, say

        [y(i+k) - y(i-k)] by [x(i+k) - x(i-k)]/k

        so that in effect each y contribution is divided by the x distance between the values of the contribution.

        With this approach, the transition from one h value to the next was far less obvious.

        Any comments?

        • Posted August 6, 2009 at 10:58 pm | #

          I think your idea is very good. In fact, it is exactly what I needed to extend my differentiators to irregularly spaced data!

          One note, though: it is better to divide [y(i+k) - y(i-k)] by [x(i+k) - x(i-k)]/(2*k).

          In this case the formulas will give precise derivative estimates for any data spacing, and they will coincide with the ones referenced above for uniform data.

          To sum up, the general form of a differentiator for irregularly spaced data is:

              \[ \displaystyle {f'(x^*)\approx \sum_{k=1}^{M}{c_k\cdot\frac{f_{k}-f_{-k}}{ x_{k}-x_{-k}}\cdot 2k}}\]

          with the same \{c_k\} as explained in the article above.

          Thank you very much!

          • Bernard McGarvey
            Posted August 8, 2009 at 9:17 pm | #

            You are correct – I had used 2*k but had mis-typed.
            Thanks

  9. Michael Kremliovsky
    Posted August 20, 2009 at 4:47 pm | #

    Pavel,

    thank you very much for posting this project online; it is very useful indeed. I have been using Savitzky-Golay differentiators for ages, but recently faced a new class of problems requiring more serious attention to detail. In particular, I need minimal delays, even at the expense of quality, because the results are used in a real-time control system. The quality of the smoothing can be compensated for later in time, but the delay limits the response efficiency in principle. In addition to differentiators, I need to smooth the signal itself as well. I am curious whether you have experimented with, or can suggest, one-sided smoothers with N around, say, 10-15. I will be more than happy to tell you about the application in which these smoothers play a very important role in saving lives.

    Best regards,
    Michael (Михаил)

  10. Atul
    Posted August 31, 2009 at 3:05 pm | #

    Are there two dimensional versions of the smooth noise robust differentiators?

  11. Atul
    Posted September 1, 2009 at 12:20 pm | #

    Thank you very much, Pavel – hugely appreciated. This is precisely what I was looking for!

  12. Posted November 14, 2009 at 5:53 am | #

    Looks like great work. I am interested in implementing a general form of this differentiating filter as a Matlab function (since I couldn’t find one you had already made), and believe that you missed a k in the last equation you give in the example for the N=5, n=2 and m=1 case. Is that correct, and do you mind if I continue working on the Matlab function?

    Thanks,
    Nate

    • Posted November 14, 2009 at 7:42 pm | #

      Yes, you are right, I missed k. I’ve fixed it. Thank you!

      As far as I know, these filters are not implemented as a Matlab function yet. So I think many people would be very grateful if you did this and shared the source code. I will link to your web page with the files, or I can make them available for download through this page if you want.

      Thank you for your interest and efforts!

      • Posted November 15, 2009 at 12:36 am | #

        Pavel,
        Thanks, that would be nice. I currently have code that works, but for n values other than n=2, where I don’t know a closed-form solution, I run into numerical issues for large n because I do not know how to directly implement the shift operation in Matlab. Therefore, if you could post or e-mail me code showing how you approach these problems, I would appreciate it.

        Thanks,
        Nate

  13. Eduardo Parada
    Posted February 11, 2010 at 2:39 am | #

    Good work!!! Hugely appreciated.
    I’m trying your ideas and generating examples!!

  14. Michael
    Posted February 17, 2010 at 2:43 am | #

    Hi Pavel,

    This is quite interesting. One thing I like about Savitzky-Golay is the ability to choose an arbitrary lag count at which the interpolation is done, rather than choosing only between central and forward operations. It sounds like your work can be extended to do that as well. You mentioned antisymmetry as one constraint in the central case.

    I’d be interested in knowing the complete set of constraints required to design coefficient sets with the additional degree of freedom I mentioned (choice of lag). I’m not as interested in the coefficients themselves as in the specifications used to generate them.

    Thanks!

  15. Posted February 18, 2010 at 4:03 am | #

    Nice! These smoothing functions should be implemented as built-in functions in Scilab and Maxima.

    At this time we use the Savitzky-Golay algorithm for smoothing motion data before graphing, but I would like to try the smooth noise-robust filter. The first-derivative coefficients were nicely documented. Are there formulas for the second and third derivatives, and for n=3, 4 and 5?
    I know that there are tables that display the coefficients, but I would like to see the formulas, or some algorithm, for implementing this smooth noise-robust filter in Scilab or Maxima.
    I would like to be able to do something like this:
    http://www.deltamotion.com/peter/Maxima/Savitzky-Golay.html
    You can see I don’t have all the combinations worked out, but it is easy for anyone to expand on what I have done to compute what they want.

    Peter Nachtwey

  16. Junrong
    Posted May 4, 2010 at 7:18 pm | #

    I am a beginner in this area and do not know much about the use of filters. How can your method be used to denoise data? In the S-G filter, there is a column of coefficients in the matrix (g(:,1)) that reconstructs the smoothed data when g(:,1) is multiplied with the original data. In your filter, the first and second derivatives can be used to detect sudden changes or break points, but I do not know how to use your method to smooth the noisy data. This may be very simple for you, but it’s difficult for me. I hope to get your help. Thanks

  17. Andrew
    Posted May 15, 2010 at 6:24 am | #

    Pavel,

    This is a very interesting and refreshing approach. However, I’ve noticed a few missing points whose clarification might be very helpful:
    1. There is a generic expression for the coefficients {Ck}, but it is not clear how it was generated.
    2. For centered filters the process is clear, but what is your approach for asymmetrical (“forward”) filters?
    3. You describe a few second- and third-order centered filters, but a few words about the process would help a lot.
    These questions are important for me and for others working with real-time DSP (forward/backward filters) on irregular time intervals with high data frequency, meaning working windows of more than 11 points. And, unfortunately, I could not find anything published. I am also curious whether you have experimented with circular convolutions and mirrored data windows to address the deficiencies of one-sided filters you mentioned in one of your replies.
    I understand you might not have time to respond with answers or descriptions of your method. Maybe it is not a big deal, then, to post one-sided first- and second-order filters for N>=19?

    Thanks in advance

  18. Luca
    Posted May 24, 2010 at 11:46 pm | #

    Thanks a lot for the much-appreciated work.
    I applied the first-order formula for n = 2 and N = 7. It is missing a minus sign before the whole formula.

    Luca

  19. David
    Posted June 12, 2010 at 3:30 pm | #

    These formulas look extremely useful for regular meshes. But will the basic premise also work on an irregular mesh, where the step size is not constant? Of course I expect the formulas to get more complicated, but I am curious whether noisy data on an irregular mesh can be properly filtered. Thanks for your site; it’s quite useful.

    • Posted June 13, 2010 at 10:23 pm | #

      Thanks.
      Yes, these differentiators can be extended to irregularly spaced data:

          \[ \displaystyle {f'(x^*)\approx \sum_{k=1}^{M}{c_k\cdot\frac{f_{k}-f_{-k}}{ x_{k}-x_{-k}}\cdot 2k}}\]

      with the same \{c_k\} as explained in the article above. The idea was suggested by Bernard McGarvey in this comment.

      • David
        Posted June 14, 2010 at 3:22 am | #

        Fantastic! Sorry I missed the earlier comment. I assume this extends similarly to the 2nd derivative? This is what would interest me most. Thanks again.

        • David
          Posted June 14, 2010 at 3:30 am | #

          I actually have one more question. Can these be extended to data at the ends of the domain? Suppose I want an “upwind-biased” or even fully upwind differentiator, where I only use data to the left of the point in question. I understand the order of accuracy will diminish, but if I only have data on the left, it’s necessary. Thanks again.

        • Posted June 14, 2010 at 9:56 pm | #

          1. Yes, you are right: the second derivative for irregularly spaced data can be approximated as

              \[  \displaystyle{f^{\prime\prime}(x^*) =  \sum_{k=1}^{M}{{\alpha_k}(f_k+f_{-k})}}-2f_{0}\sum_{k=1}^{M}{\alpha_k}, \quad\displaystyle{\alpha_k = \frac{4k^2s_k}{(x_k-x_{-k})^2}}\]

          where \{s_k\} are the coefficients from the formulas for uniformly spaced data derived above.

          2. It is also possible to derive one-sided filters for the 2nd derivative with irregular spacing (I have not done it yet though).
          Could you briefly explain the application you need it for?

          • David
            Posted June 15, 2010 at 2:15 am | #

            This is great…looking forward to trying this.

            The application we need it for is a CFD code. It’s not an average everyday CFD code, however: one thing we need is a 2nd derivative of an array of values. Because of the topology of the mesh, the array of values can be very noisy. We currently use Lagrange polynomials to estimate the 2nd derivative, and while the method converges, it converges to a noisy solution in some regions. A smoother, less noisy differentiator will be tremendously helpful, I believe. Of course, we need the 2nd derivative all the way to the ends of the arrays, so these stencils with a lot of points (to reduce noise) become less useful at the ends.

          • Posted June 17, 2010 at 11:59 pm | #

            David,

            I am sorry for the late answer.
            Thank you for clear explanation of your project – it is very interesting.

            Your request motivated me to derive general formula for second derivative filters (centered) – so you can have flexibility on generating filters of any length, even long differentiators with strong noise suppression which are needed for very noisy data (as in your project).

            Please try them (see updated Extensions section). In case of satisfactory results I will derive one-sided filters for the boundaries with the similar properties.

            Let me know if you need my help in this experiment.

          • David
            Posted June 21, 2010 at 11:50 am | #

            We tried the 5-point smooth differentiator in our application. Unfortunately, for our code, central differencing is unstable; this is known to be true for this particular part of the code. We have been successful with differentiating using one-sided Lagrange polynomials with a tiny bit of smoothing, so I’m still confident a smooth upwind or even upwind-biased differentiator will be useful. We can try to derive it ourselves, or if someone has time and would like to derive it for us, we would be most obliged. Thanks for your help so far.

            If this is successful, we will probably write a paper and would love to collaborate on it with you. Thanks.

          • Posted June 21, 2010 at 10:30 pm | #

            Well, I am not surprised, since the 5-point filter has the weakest noise suppression capability.

            As I understood from your previous explanation, the data in your application is very noisy. So I suggest you try filters with N\ge 11.
            Longer filter length means stronger noise suppression.

            Now I am preparing a report on one-sided filters for noise-robust second derivative estimation. Tomorrow it will be available for download.

          • Posted June 22, 2010 at 4:03 pm | #

            David,

            Here is the report I promised: Noise Robust Second Derivative.pdf

            It includes a simple formula for the coefficients and one-sided filters.

            Keep me updated on progress.

  20. Sebastian
    Posted June 14, 2010 at 8:59 pm | #

    Dear Pavel,

    I want to calculate a differentiator for N=19 from your formula given above. However, I always obtain 2 for the coefficient c_{k}. How do you obtain 42 or 48, respectively, for N=11? Thanks a lot

    • Posted June 14, 2010 at 9:37 pm | #

      Here is n=2, N=19 filter:

          \[ \displaystyle{ \frac{1}{131072\,h} \left[\begin{matrix} 4862\Delta_1+7072\Delta_2+6188\Delta_3+3808\Delta_4+\\ +1700\Delta_5+544\Delta_6+119\Delta_7+16\Delta_8+\Delta_9 \end{matrix}\right] }\]

      where

          \[ \Delta_i = f_i-f_{-i}\]
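      A short Python sketch applying this filter (the test function is my own choice): filters of this family are exact on quadratics, so on f(x)=x^2 the estimate must equal 2x_0.

```python
# n = 2, N = 19 smooth noise-robust differentiator from the formula above.
COEF = [4862, 7072, 6188, 3808, 1700, 544, 119, 16, 1]  # c_1 .. c_9
DENOM = 131072                                          # 2^17

def snrd19(f, x0, h):
    """Estimate f'(x0) from 19 uniformly spaced samples of f."""
    s = sum(c * (f(x0 + k * h) - f(x0 - k * h))
            for k, c in enumerate(COEF, start=1))
    return s / (DENOM * h)

print(snrd19(lambda x: x * x, 3.0, 0.1))  # exact on x^2: f'(3) = 6
```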

      • Sebastian
        Posted June 14, 2010 at 9:44 pm | #

        Dear Pavel,

        thanks a lot again. I saw this post before. However, maybe I want to use N=17 or N=21. How do you obtain these values for the coefficient c?

        If I plug these values into your formula above, I obtain c=2 for each of them.

        How do you derive for example c=42 (f_1), when n=2 and N=11?

        Best Wishes!!

        • Posted June 14, 2010 at 10:21 pm | #

          Here is Maple code based on the formulas above:

          > c:=(k)->binomial(2*m,m-k+1)-binomial(2*m,m-k-1);
          > N:=11:
          > m:=(N-3)/2:
          > M:=(N-1)/2:

          It gives correct results:

          c(1) = 42, c(2) = 48, c(3)=27, c(4) = 8, c(5) = 1
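          The same computation sketched in Python (note that Maple’s binomial returns 0 for a negative lower index, which must be reproduced explicitly):

```python
from math import comb

def c(k, N):
    """Coefficient c_k of the n = 2, length-N smooth differentiator."""
    m = (N - 3) // 2
    def binom(n, r):                 # Maple-style: 0 outside 0 <= r <= n
        return comb(n, r) if 0 <= r <= n else 0
    return binom(2 * m, m - k + 1) - binom(2 * m, m - k - 1)

print([c(k, 11) for k in range(1, 6)])   # -> [42, 48, 27, 8, 1]
print([c(k, 19) for k in range(1, 10)])  # -> 4862, 7072, 6188, ...
```

          The N=19 values reproduce the coefficients of the filter posted above.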

          • Sebastian
            Posted June 14, 2010 at 10:34 pm | #

            You are wonderful! I did not consider it to be binomially distributed! Thanks for your fast replies!

  21. Posted June 16, 2010 at 10:55 pm | #

    Pavel,

    Very interesting!

    I have a quick question:
    Is there a difference between smoothing data and then differentiating it, compared to this algorithm? I would think so, since smoothing and differentiation are trying to do opposite things – but it’s not obvious to me.

    At the moment we smooth our data and then do a least squares fit of (1,x) to our data, and take the slope – which I suspect is the same as Lanczos and Savitzky-Golay for this low order? If we want to improve noise rejection, would using these techniques offer any advantage, do you think?

    Be interested in your thoughts,

    Cheers,

    Alex.

    • Posted June 17, 2010 at 6:00 pm | #

      Hi, Alex!

      Bare numerical differentiation is very sensitive to noise. And it is only natural to reduce such noise by smoothing beforehand. There is no contradiction – smoothing just improves the quality of differentiation.

      Proposed method includes smoothing inherently, by design.

      It might be advantageous to try this algorithm in your application for at least one reason: performance. All filters derived above combine smoothing and differentiation in one operation. So you don’t need to smooth the data first (one pass) and only after that find the derivative (a second pass through the data). You get roughly a 2x speed-up, as you need to apply only one filter.

      Another possible reason is quality improvement. For one thing, estimating the derivative by the slope of a straight line is very rough, I think. For example, even if the data contain no noise (only smooth parabolic-like regions), such an algorithm cannot restore the derivative exactly. The proposed filters can give more precise values, since they have higher approximation order.

      Secondly, the quality of derivative estimation in your system depends on how well the smoothing is done. If it does not reject noise completely, results might deteriorate, since Savitzky-Golay is not completely noise robust either. The described filters are free from such flaws.

      To sum up, I would advise trying the smooth differentiators exact for 1, \,x,\, x^2 (first table in the Results section). They have a simple structure, higher approximating order, and a general formula for the coefficients. Filter length should be selected based on noise properties.

      I would gladly help if you would decide to try them.

  22. Benjamin Tan
    Posted June 19, 2010 at 7:49 am | #

    Hello Pavel,

    Very nice and interesting work. I am currently using an SG differentiator on noisy temperature data for a boiling experiment and was bothered by the non-zero response of the SG routine at the higher frequencies. I was wondering if you had any information on the magnitude of noise reduction compared to, say, a basic backward difference differentiator.

    Thanks,
    Ben

  23. Bill
    Posted September 1, 2010 at 1:26 pm | #

    Pavel,
    Great stuff here. This is exactly what I was looking for and presented in a nice clean manner. I’m using your methods to filter very noisy tank levels to look for unauthorized movement. The algorithm is implemented using a simple weighted moving average for near-real-time differentiation.

    I’m pretty interested in the one-sided filters, because the longer filters that I need to use (N=19) introduce fairly significant delay into the response of the alarm (about 30 secs). I’d like to see that drop down to closer to 0 secs, and I believe that using a one-sided filter would accomplish that. Am I right? I only need accuracy up to X or X^2 at most. Could you post the derivation of the one-sided filters? I can follow the original, but I don’t know enough about filter design to synthesize the one-sided derivation.

    Thanks for the work. It’s giving us the results we want.

    • Posted September 1, 2010 at 9:42 pm | #

      Thank you Bill for your feedback.

      Please check this paper for the one-sided filters: … It includes filters of up to N=15 length.

      I would recommend you use the hybrid differentiators (see pages 4-5) for your conditions. They are particularly optimized for no-delay derivative estimation in the presence of noise. I am sure a filter with N=15 will be enough – please try it. Let me know about the results.

      I’ve checked the website of the company you work for – automation, control, etc. are tasks where numerical methods can be of great interest (e.g. PID).
      I would appreciate it if you would explain to me some of the math-related problems you have in your products. Maybe we can collaborate on them. Here is my private e-mail: pavel@holoborodko.com.

      P.S.
      If you want to support me please use this e-mail: pavel.holoborodko@gmail.com for donations through paypal.com.

  24. Posted September 2, 2010 at 1:57 am | #

    I can help with the PID control. Control theory is one of my areas of expertise. The first thing you have to realize is that each system is different, but one can develop formulas that will provide the kind of response you desire.
    See this.
    http://www.deltamotion.com/peter/Mathcad/Mathcad%20-%20T0C1%20MassOnASpring-PID.pdf
    The formulas for the PID gains are only valid for the mass on a spring or similar. A servo motor doing position control would require a different set of PID gain formulas because it isn’t underdamped and it is an integrating system. Temperature and velocity control systems are non-integrating because if the control signal goes away the system settles at an ambient point, like room temperature or stopped. In an integrating process like position control the system coasts to a stop but does not return to some starting position.

    The worksheet I linked to is an easy example, but it assumes that the plant transfer function is known, and that is the real trick of tuning PIDs. Finding the plant parameters is called system identification and requires optimizing or minimizing algorithms that minimize the error between an estimated value and the actual process value.

    I can see you have Maple, so it will be easy for you to translate the first page into Maple and generate the symbolic gains for just about any linear system. I am sure Maple has minimizing algorithms such as Levenberg-Marquardt or L-BFGS.

    • Posted September 2, 2010 at 11:17 pm | #

      Hi Peter,

      Thank you for the clear introduction to the control theory and worksheet to study.

      As I understood, the task of system identification is very similar to filter design. There we also have a desired transfer function, and the filter’s parameters (coefficients) are derived to reproduce it as closely as possible. So, basically, these tasks are the same.

      In filter design the nature of the parameters and the relations among them are “highly” nonlinear. As a consequence the target function has many local extrema, and minimization becomes a nontrivial task. Applying straightforward methods like L-BFGS can be misleading, since they will converge just to the local minimum nearest to the starting point, and the global optimal solution (the minimum among all local minima) won’t be found.

      I faced such a situation many times. To solve it we need to use global optimization routines. I tried numerous optimization packages of this kind, even commercial and ridiculously expensive ones. Unfortunately none of them could solve my target function. I ended up writing my own global solver, which uses a simple algorithm:
      1. generate random points (or low-discrepancy sequence of points).
      2. simultaneously start local searches (L-BFGS) from every point.
      3. select and keep track of the best local minima found so far.
      4. estimate coverage of the solution space by already conducted searches – stopping criteria.

      It doesn’t require huge memory or CPU power, quickly gives a list of acceptable solutions to continue research with, and refines the best solutions as long as the program runs. I believe this kind of solver is called stochastic.
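      A toy Python sketch of steps 1–3 (the coverage-based stopping criterion of step 4 is omitted; a crude 1-D descent stands in for L-BFGS, and the target function is just an illustration):

```python
import random
from math import sin

def target(x):
    """Toy multimodal objective with several local minima."""
    return sin(3.0 * x) + 0.1 * x * x

def local_search(f, x, step=0.5, tol=1e-8):
    """Crude descent: move while it helps, otherwise halve the step."""
    while step > tol:
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            step *= 0.5
    return x

def global_search(f, n_starts=40, lo=-5.0, hi=5.0, seed=1):
    rng = random.Random(seed)
    starts = [rng.uniform(lo, hi) for _ in range(n_starts)]  # 1. random points
    minima = [local_search(f, s) for s in starts]            # 2. local searches
    return min(minima, key=f)                                # 3. keep the best

x_best = global_search(target)
print(x_best, target(x_best))  # global minimum is near x = -0.51
```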

      Do you face similar difficulties in control theory? What do you think about my simple global solver – can it be applied in your field?

      Sorry for the long post, just need to share my thoughts.

  25. Posted September 3, 2010 at 2:42 am | #

    I like to think of the problem as going downhill and trying to seek the lowest spot. What I have found in motion control is that often the optimizer comes out of the hills to find a huge flat desert that is so flat that numerical errors become significant or the error windows must be extremely small. You can try to come out of the hills from different directions, but finding the true bottom is difficult without a lot of starting points, and even then you may not try the right one. There is nothing magic about L-BFGS except it finds an answer quickly, but who is to say that one answer on the flat desert is any better than the other? My answer was to try to shrink the desert. So we have found that the trick is to excite the system to get information so there are more ‘hills’ and the valley floor is narrow and quickly leads to a small pond (the answer). I found this out by taking a problem I had difficulties with and plotting the resulting norm vs two of the parameters at a time so I had multiple 3D plots. This gave me a visual clue of what I was up against and how to excite the system to yield the best information.

    Other things I have thought of is finding the point that is in the middle of the desert. This would allow for maximum variation of parameters before the norm increases. I don’t know how valid this is though.

    Your multiple search technique is one that I used way back in the 80s (yeah, I’m older), and I/we did use random starting places. Think of coming out of the hills from different directions. Back then we were optimizing the orientation of logs for peeling into veneer. Since this was a real-time process we would try different solutions from different ‘directions’ until time ran out, and we used the best solution so far.

    I can ID non-linear systems. I write up my system as differential equations and solve using simple RK4. The optimizing routine tries different coefficients in the differential equation(s). Usually it is something simple like a non-linear valve, or in one case it was water level control in a condenser with non-vertical sides. These are easy IF the excitation is good and all the significant things can be measured.

    I haven’t had time to follow up on what has been said in the last two months. At this time we still use the Savitzky-Golay filter in some parts of our application. I am also looking at a simple Alpha-Beta-Gamma filter, which is a very simplified steady-state Kalman filter. The advantage of the Alpha-Beta-Gamma filter is that it produces the position, velocity and acceleration with little or no lag and filters noise like a Kalman filter, but one doesn’t need to know the covariance of the process and measurement noise. One simply selects a bandwidth that probably isn’t optimal but more than good enough. The idea is to get 90% with 10% effort.

    I am interested in the one-sided filters for position, velocity and acceleration to compare with the Alpha-Beta-Gamma filter:
    http://en.wikipedia.org/wiki/Alpha_beta_filter

    • Posted September 7, 2010 at 4:56 pm | #

      Peter,

      I’m experimenting with ABG and one-sided differentiators. I sample sine wave on [0,2\pi] with additive gaussian noise \mu=0,\,\,\sigma=0.02 at N=100 uniform points.

      I apply the one-sided differentiator listed above to estimate the derivative, and I get an average absolute error around 0.2\%. I did not tune the filters for this data.

      When I apply the Alpha-Beta-Gamma (ABG) filter, results depend very much on parameter selection. I spent an hour selecting parameters to minimize the error of velocity estimation for this dataset, and the best I could get was an error around 0.4\%. The best combination was \alpha=0.8,\,\beta=0.2,\,\gamma=0.05. For other combinations the error was much higher.

      It is too early to make conclusions, but from this little experience I can say that the ABG filter is not easy to tune for particular signal/noise properties. And what if the input data change characteristics with time?

      One-sided fixed filters are much easier to use – no tuning is needed, and they require fewer computations. ABG can give comparable results after much tuning.

      Could you give me test data and parameters so I can make a comparison for them?

  26. Posted September 8, 2010 at 1:25 am | #

    In this directory I have my Alpha-Beta-Gamma files. They are written using wxMaxima, which is free. Since you know how to use Maple, you should find wxMaxima to be similar, with a few quirks.

    I did this simulation to see how well I can estimate positions from a 14 bit marine motion sensor that measures roll, pitch and heave for a crane application. The motion controller must hold the crane steady in rolling waters. I am assuming the wave has a period of 10 seconds and an amplitude of 1 meter. I added sampling jitter and noise.
    At line %i54 there is a comment about how I ‘tune’ the gains. Basically, I have 3 choices that can be cut and pasted from the comments section into the code.

    The Alpha-Beta-Gamma filter is a very simple form of steady state Kalman filter where the gains are chosen to have the desired response to an error but the response may not be optimal. The IAE, integrated absolute error, coefficients work the best. The end user must only adjust the bandwidth to get acceptable results.

    A Kalman filter would ensure the minimum error, but it is very compute-intensive. Also, the user must know the covariances of the measuring device and system. Fat chance. Even experienced engineers end up ‘fudging’ the parameters to get desired results, so why bother?

    Notice that I can estimate positions between the readings. The sensor updates only every 10 milliseconds but I can estimate position, velocity and acceleration every millisecond. I don’t expect your filter to do this.

    What I am looking for is an easy way of estimating the position, velocity and acceleration with minimal error. It will be interesting to see how your filter does with this problem.

  27. Posted September 9, 2010 at 1:14 am | #

    You should load the ABG.wxm file into wxMaxima.

    • Posted September 28, 2010 at 5:57 pm | #

      Hi Peter,

      Finally I’ve got time to continue studying ABG filter – sorry for the delay.

      I’ve generated noisy data using your Maxima worksheet, plugged it into C program and tested ABG against SG filters.

      I’ve implemented ABG in C using your algorithm and selected parameters to be optimal for IAE according to formulas in your worksheet.

      Here are my results and thoughts.

      ABG is a second order filter in the sense that it uses a 2nd degree polynomial for signal modeling, with adaptive update of its coefficients on every step in the direction of minimum error. This has a significant impact on the quality of acceleration modeling.

      I’ve got the following comparison of ABG with the second order one-sided SG, in terms of IAE of position (X), velocity (V) and acceleration (A) modeling (lower is better):

      SG2(X,10) = 0.3897 ABG(X) = 0.4915
      SG2(V,25) = 2.462 ABG(V) = 4.016
      SG2(A,20) = 35.64 ABG(A) = 35.63

      The second number in SG2() is the filter length. We see that by using long filters SG2 can reach and even outperform ABG. Also, both types of filters predict (A) with low quality, which is natural since they are based on a parabola, which can model the second derivative (A) only as a primitive step-wise function.

      Results for the comparison of ABG with third order SG (on the same data) are:

      SG3(X,40) = 0.3838 ABG(X) = 0.4915
      SG3(V,40) = 2.301 ABG(V) = 4.016
      SG3(A,40) = 26.69 ABG(A) = 35.63

      Here we see that the 3rd order SG filter outperforms ABG in prediction of acceleration, as we expected, since SG3 approximates the second derivative (A) by a linear function.

      My conclusion.

      ABG is good for signal smoothing (X) and estimating its first derivative (V). However, for second derivative modeling I would recommend using third order filters (Alpha-Beta-Gamma-Delta?).

      The question of which filter is better has no simple answer. It depends on the application. The only disadvantage of SG filters is that they need N previous samples in order to make the next prediction. Cases of non-uniform data and estimation between samples are covered by SG (it is just polynomial approximation after all – we can evaluate the polynomial at any point, even extrapolate). For uniform sampling SG filters have simple rational coefficients and can be easily implemented using fixed point arithmetic – which is a huge plus.
      If the application allows, it is a good idea to try both approaches.

      I didn’t use SNRD filters in the comparison for two reasons:

      1. In our application we have Gaussian noise, and SG filters are optimal in such conditions. SNRD will give worse results for sure. After much research I found that SG filters have much broader applications in comparison to SNRD, which are needed only in very specific fields. I will reflect this on the page soon.

      2. It is not easy to build an SNRD of 3rd order (even 2nd order is a problematic issue), since the condition of complete noise rejection forces the filter to have significant error in the pass-band.

      Anyway, I am really impressed with the performance of the ABG filter. Thank you for introducing it to me.
      I would be grateful if you would give me further references on this topic (optimal parameter selection, etc.).

  28. Roland
    Posted September 29, 2010 at 8:14 am | #

    Hi Pavel.

    I am trying to calculate the acceleration of a wheel.
    Magnets are regularly spaced on the wheel and time is measured when they pass before a Hall sensor.
    In this case, time is x in your formulas and is not regularly spaced if the rotation speed is not constant.
    So I am trying to apply the formula for the irregular spacing.
    The problem is that it yields a second derivative that is always equal to zero, whatever times are recorded:
    since f_k = f_0 + k\delta and f_{-k} = f_0 - k\delta,
    we have f_k + f_{-k} = 2f_0, and so the sum for the second derivative is equal to zero.

    Am I missing something, or is there another way to proceed?

    • Posted September 30, 2010 at 9:59 am | #

      I am afraid you have detected a flaw in the second derivative formulas for irregularly spaced data.
      I will develop another generalization for this case.

  29. Posted September 30, 2010 at 2:11 am | #

    “However for the second derivative modeling I would recommend to use third order filters (Alpha-Beta-Gamma-Delta?). ”
    OK, I have done that too.
    http://www.deltamotion.com/peter/Maxima/ABGD.html
    Pavel, now I am estimating jerk too. In motion control the derivatives are used to calculate the velocity, acceleration and even jerk of a master or reference axis that slave axes are geared to. This way feed-forward gains can be calculated for the slave axes, which results in a much smaller following error.

    I will update the ABGD.wxm to provide IAE coefficients for the gains like the ABG filter. The IAE coefficients seem to work just a little bit better than the Butterworth coefficients.

    I can expand the ABG filter to any order I need, but after a while there is no point. The most I have ever needed is position, velocity, acceleration and jerk (PVAJ).

    I have no doubt the Holoborodko filter can beat the ABG filter if you use enough coefficients, but then you are doing quite a few multiply-and-adds. Is your filter still one-sided? I haven’t seen how you calculate the one-sided second derivative yet. Now that I have an idea of what you are doing, I can strain my brain and figure it out given time.

    Here is a question that will stump you because I wrestle with this too. How do you initialize 20 or even 40 past values? Problems, problems, problems.

    • Posted September 30, 2010 at 10:07 am | #

      In my previous post I explained the comparison of one-sided Savitzky-Golay (SG) filters with ABG. So you can repeat the tests easily.

      I didn’t use my filters (SNRD) in tests since it is pretty difficult to design them for such case.

  30. Janusz
    Posted November 6, 2010 at 12:15 am | #

    Hi Pavel.

    Great and interesting stuff that you have worked out here. I’ve been using your smooth noise-robust differentiators in my scientific work. They worked very well. Therefore I would like to refer to your work in the paper I am writing.
    Is there a publication dealing with these differentiators that I can refer to?

    • Posted November 6, 2010 at 12:55 am | #

      Hi!

      I am glad my method helped in your research. Thank you for letting me know.

      I do not have a published paper on these filters. But I would be grateful if you would cite it using something like:


      P. Holoborodko. (2008) Smooth noise robust differentiators.

      Please send me info on your paper once you finish it – I will reference it on this page.

      Thanks!

  31. Ben
    Posted November 17, 2010 at 10:34 am | #

    Hi Pavel,
    Thanks for an extremely helpful website!

    I’m trying to determine acceleration from heave (position) data for waves in a hydraulics lab using second derivative central differences. (Sampling at 0.02 sec, wave period 1-3 sec.)

    This should be relatively simple and the results seem reasonable, but I’m getting very different results depending on the filter used:
    -N=7 central difference
    -N=9 central difference
    -N=7 Noise robust CD
    -N=9 Noise robust CD
    -N =9 Noise robust CD with ‘higher approximating order’

    They all follow the same basic trend, but individual values vary wildly. And the Noise Robust method gives ~1/5 the magnitude of the ordinary method.

    Am I using an N far too low given the relatively high sampling rate? Or am I completely off track?
    Thanks

    • Posted November 17, 2010 at 1:43 pm | #

      Hi Ben,

      @”And the Noise Robust method gives ~1/5 magnitude of the ordinary method.”
      Ordinary CD is unstable on noisy data, but a constant ~1/5 ratio seems strange. Usually the difference between CD and SNRD looks like:

      Blue – sine wave data corrupted by noise.
      Red – true derivative of sine.
      Grey – CD estimation.
      Green – Noise robust differentiators.

      CD is a ‘saw’; SNRD goes through the middle of that ‘saw’. So the ratio is variable.

      The drawing shows first derivative estimation. The situation for the second derivative is similar.

      @”Am I using N far too low given the relatively high sampling rate?”
      Maybe it is reasonable to increase the filter length N until the difference between the estimated values from N-1 and N drops below some threshold. But I didn’t test this idea.

      Although it is hard to guess, I think it is enough to use filters exact on 1,\dots,x^3 – their coefficients can be calculated for any N, and they provide reasonable approximating precision. Just be aware that for large N multiprecision arithmetic is needed for coefficient calculation.

      I would suggest you try Lanczos (Savitzky–Golay) filters too – they might be even better for your application, depending on noise characteristics.

  32. Michael Kremliovsky
    Posted November 17, 2010 at 11:43 am | #

    Ben, what is the estimated dimension of your system (the number of degrees of freedom) ?

  33. Ben
    Posted November 17, 2010 at 2:25 pm | #

    Pavel,
    I’ve had some convergence with the SNRD N=19 and N=21, with all values within a few percent, so we’re on the right track. I’ll try a couple of orders higher and try to get it below 1%, which will be fine for this application. I’m using Excel; any idea what N will be limited by Excel’s accuracy?

    I should have clarified the 1/5 ratio – this is a rough idea of the magnitude of events; it wasn’t an exact 1/5 for each timestep. Your plot above explains it nicely (though 5x is a lot of noise). And it explains some troubles I’ve had with another application, so we’ve solved two things.

    I don’t really understand the phrase “exact on 1,\dots,x^3”. If the formulation I’ve used applies to the second derivative, where does 1,\,x,\dots,x^3 fit in?

    Many thanks Pavel – for the quick reply, for the coefficients algorithm and a great website.

    • Posted November 17, 2010 at 3:40 pm | #

      Assuming Excel is 32-bit, N_{max}=34. Exact on 1,\dots,x^3 means that the approximation order is O(f^{(4)}h^2), so the filters give the precise derivative on polynomials up to a cubic parabola.

  34. Adam
    Posted January 17, 2011 at 2:17 am | #

    Hi Pavel,

    I have just come across your work with much interest.

    I am trying to implement a backward second order derivative – in time – as part of a code I am writing for the numerical simulation of electromagnetic wave propagation.

    The problem I am having is the late time instability with the propagated signal, and think that using your filtered derivative approximation might help in this regard.

    With respect the PDF link you had posted above on (June 22, 2010 at 4:03 pm | Permalink), I have derived the coefficients for N=5 for the second order derivative, which gives the following:

        \[ f''(x_0) \approx \frac{1}{4h^2}\left[f_0 - 2f_{-2} + f_{-4}\right]\]

    When I applied it, it did give a better response than the classical backward difference in terms of filtering out the very high frequencies that give rise to unwanted noise. However, the amplitude response has been damped by about 30% of the expected value. I am not sure if this is expected, whether the above formula is wrong, or whether I have to use more terms to ensure better accuracy.

    Also, you mention that the above formula is based on an exact approximation for up to x^2. Is it possible that you could make available a second derivative approximation of higher degree?

    Thanks for your help with this in advance,
    Adam

    • Posted January 17, 2011 at 9:59 am | #

      Hi Adam,

      Thank you for your question and feedback.

      The formula for N=5 you’ve tried is OK. It is just targeted at low-frequency signals.

      But it seems that your data contains frequencies in the mid-range – maybe somewhere in f=\frac{\omega}{2\pi}\in\,[0.1;0.15]. You can check the first drawing in the Extensions section: N=5 gives the second derivative with reduced amplitude in that range – as you described. Longer filters will damp the amplitude even more.

      A special filter can be (and should be) constructed for your case. We have to use a completely different derivation procedure – the one in the PDF is just not applicable.

      Could you estimate the major frequency range of your data? Does it change with time?

      Having this information I’ll try to derive a suitable filter (for a small reward, of course :-)).

  35. Adam
    Posted January 17, 2011 at 8:48 pm | #

    Hi Pavel,

    Thank you very much for your reply.

    The range of frequencies in the signal I used in my previous post extended up to 0.125 of the normalized frequency range in the Extensions section you are referring to.

    I have now reduced the maximum frequency of the signal to 0.075 instead, but the same damping is still present.

    As for your questions:
    1- I would appreciate it very much if you can construct an alternative filter that can maintain a higher accuracy on the amplitude obtained.

    2- For the example I am using at the moment, the signal frequency range does not change with time. However, I will be using other structures where this is going to be the case. For example, non-linear materials will generate higher frequencies than those in the input signal, so it would be extremely useful if this could be considered.

    In all cases, I presume the alternative filter you are going to generate will depend on the frequency range used, so if you can allow for this to be variable within the remit of a generalized formula, that would be ideal. Currently, I am using signals whose maximum frequency falls within the interval [0.05; 0.2], extending from zero.

    Thanks again,
    Adam

    • Posted January 18, 2011 at 1:53 pm | #

      I've checked the N=5,\,7 filters on a 0.075-frequency signal – estimation error < 1% for both of them.

      For a 0.2-frequency signal the estimation error was less than 12%.

      N=5 gave the better estimation in both tests – it has the wider pass-band, which is required for your task.

      I think there is some inaccuracy in your implementation (or some misunderstanding between us) – the error shouldn't be as high as 30%.

      Could you just take a simple pure-frequency signal \cos(2\pi f x) and apply the filters to it? That is how I did my tests, using f=0.075 and f=0.2.

      Or just test your routine on x^2. The filters should give the precise second derivative for it at any point.
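
      Both checks can be sketched in a few lines of Python (my illustration, not code from this page; `d2_n5` is a hypothetical helper implementing the N=5 formula quoted above, assuming h = 1):

```python
import math

def d2_n5(f, k, h=1.0):
    # Backward N=5 filter from above: f''(x_k) ~ (f_k - 2 f_{k-2} + f_{k-4}) / (4 h^2)
    return (f[k] - 2.0 * f[k - 2] + f[k - 4]) / (4.0 * h * h)

# Check 1: on f(x) = x^2 the filter must return exactly 2 at any point.
f = [float(i * i) for i in range(8)]
assert abs(d2_n5(f, 5) - 2.0) < 1e-12

# Check 2: on a pure tone cos(2*pi*f0*x) the ideal second derivative has
# amplitude (2*pi*f0)^2; the ratio below shows how much the filter damps it.
for f0 in (0.075, 0.2):
    g = [math.cos(2.0 * math.pi * f0 * i) for i in range(400)]
    peak = max(abs(d2_n5(g, k)) for k in range(4, 400))
    print(f0, peak / (2.0 * math.pi * f0) ** 2)  # ratio <= 1; closer to 1 = less damping
```

      The ratio drops as the tone frequency grows, which is exactly the amplitude reduction discussed above.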

      Let me know about results.

  36. Daniel
    Posted February 1, 2011 at 7:21 am | #

    Hey,

    This looks useful. Is the method different from doing a polynomial fit over a window and then differentiating the polynomial? I am interested in the Matlab code mentioned earlier. Does it exist, and is it available?

    Thanks,
    Daniel

    • Posted February 1, 2011 at 9:09 am | #

      Low-order smooth differentiators (first table) can be derived as a least-squares polynomial fit with binomial weights plus subsequent differentiation.

      However this is not very helpful, since we know analytical expressions for the coefficients – there is no need to fit anything.

      High-order approximating filters (and the second-order derivative ones) cannot be derived from polynomials.
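
      As a quick check of the first statement (my own sketch, not the author's code): a weighted least-squares line fit with binomial weights (1, 4, 6, 4, 1)/16 over five symmetric nodes reproduces the N=5 smooth differentiator coefficients exactly.

```python
from fractions import Fraction

# Binomial weights (row 4 of Pascal's triangle, normalized) and node offsets.
w = [Fraction(c, 16) for c in (1, 4, 6, 4, 1)]
x = [-2, -1, 0, 1, 2]

# For a weighted least-squares line fit over symmetric nodes (weighted mean
# of x is zero), the slope is
#   b = sum(w_k x_k f_k) / sum(w_k x_k^2),
# which is linear in the samples f_k, so its weights are the filter taps.
denom = sum(wk * xk * xk for wk, xk in zip(w, x))
taps = [wk * xk / denom for wk, xk in zip(w, x)]

print([str(t) for t in taps])  # ['-1/8', '-1/4', '0', '1/4', '1/8']
```

      Those taps are precisely (2(f_1 - f_{-1}) + f_2 - f_{-2}) / 8, i.e. the N=5 smooth noise-robust first-derivative filter.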

      Several people volunteered to write a Matlab library – I have no idea of the results (if any) though.

      May be somebody reading this page will respond to you.

  37. Dirk
    Posted March 12, 2011 at 3:24 am | #

    Pavel,

    I was looking for a smoothing method for data from optical spectroscopy and stumbled upon this site. I'm not directly interested in derivatives but understand that smoothing is an inherent property of the method due to polynomial interpolation. I was wondering if you could provide a recursive formula for calculating the coefficients of the noise-robust smoothing function (n=2, any odd N). I could find some hard-coded coefficient values for N=7 and N=9 in a comment on one of your other pages (about central differences), but not a formula for arbitrary N.

    Thank you,
    Dirk

    • Posted March 14, 2011 at 5:34 pm | #

      Sorry for the late response, Tokyo is still having strong aftershocks.

      Smoothing filters for n=2 are just rows of Pascal's triangle, which can be easily calculated (check “Calculating an individual row or diagonal by itself” on the wiki page).

      I do not know a closed-form expression for n>2 though. BTW, those formulas are not based on a polynomial fit.
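
      For completeness, a single row of Pascal's triangle can be computed without building the whole triangle (a sketch; the normalization by 2^{N-1}, so the weights sum to one, is my assumption about the intended scaling):

```python
def pascal_row(m):
    # Row m of Pascal's triangle via C(m, k+1) = C(m, k) * (m - k) / (k + 1).
    row = [1]
    for k in range(m):
        row.append(row[-1] * (m - k) // (k + 1))
    return row

def smoothing_weights(N):
    # n=2 smoothing filter of odd length N: row N-1, normalized to sum to 1.
    return [c / 2 ** (N - 1) for c in pascal_row(N - 1)]

print(pascal_row(6))         # [1, 6, 15, 20, 15, 6, 1]
print(smoothing_weights(7))  # the N=7 weights: [0.015625, 0.09375, 0.234375, 0.3125, ...]
```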

  38. Colin
    Posted March 16, 2011 at 1:04 pm | #

    Hello Pavel,

    Thanks for the post, this is just what I was looking for. However, in following it I get a little lost between the 5-equation expansion, i.e. 2i*sin(0)sum… etc., and the formula for c_k, i.e. c_k=1/2^(2m+1)[… etc.

    I am trying to write a method in Matlab that can output a matrix of coefficients c_k, given n and N. Could you possibly provide a formula that can yield this?

    -Cheers
    Colin

    P.S I hope all is well with you with the earthquake situation, hope you didn’t lose anything.

  39. mark
    Posted March 29, 2011 at 1:02 am | #

    Hi,
    Thanks for your site. I have to work through it properly, but a quick question: if the optimal filter for an n-th order derivative is two-sided, do you know of a simple way of getting the best one-sided approximation to that derivative?
    Any ideas would help
    many thanks,
    Mark

    • Posted March 29, 2011 at 11:24 am | #

      For noisy data you can use: (a) Savitzky-Golay one-sided filters, (b) the Alpha-Beta-Gamma filter, (c) or the hybrid filters described here …

      (a) is optimal if the data is corrupted by Gaussian noise.
      (b) delivers performance similar to (a) but doesn't require memory to store previous values.
      (c) is good for rejecting high-frequency noise and for cascaded filtration.

  40. Ben
    Posted May 3, 2011 at 10:49 am | #

    Hi Pavel,
    I have a problem that I’m hoping you have a method for, or can give some direction.
    My task is to measure the velocity of water currents at the sea bed induced by a manoeuvring boat, to determine if seagrass will be affected. Using a downward-looking ADCP (Doppler) I get current vectors in 5 cm bins at 0.2 Hz; however, the data is quite spiky, particularly at the lowest bins where the signal is affected by the bed itself. So I need a filter that is one-sided vertically, so that it is accurate at the bed, but possibly two-sided in the time domain. Ideally it should be vectorised to account for the direction of the flow (in two dimensions; the vertical velocity is ignored).
    Is this unreasonable or very difficult? This is a relatively small project, so a simple, less-than-ideal solution rather than a complex one is needed. Can I filter the x and y components of velocity separately? Maybe a one-sided filter applied to the vertical bins will be sufficient. Any comments that you have will be welcome.
    Regards, Ben

    • Posted May 6, 2011 at 10:22 am | #

      Hi Ben,
      Sorry for the late response – I was on vacation.
      It is really difficult to suggest something without knowing the specifics and seeing the data.
      As far as I understand, you have the components of the velocity vectors (d/dx, d/dy) already calculated (if not, this is a very different task – look for isotropic gradient estimation, the Scharr mask).

      If they are noisy, just apply some 2D smoothing filter to them, like a median filter or a Gaussian smoothing filter. And yes, you can filter the x and y components separately.

      However, if you want some advanced processing (like rejecting outliers, making the vector field very smooth, etc.) you can look up methods developed for motion-vector handling in video processing. Specifically in time-interpolation schemes (insertion of new frames between existing ones in a video sequence) – they usually apply sophisticated methods of vector-field smoothing.

      • Ben
        Posted May 6, 2011 at 10:33 am | #

        Hi Pavel,
        Thanks for getting back to me. After a bit more research I ended up applying a Gaussian filter to the magnitude only to remove some noise, and the results were fine. I know it's not as robust as separating the components, but it will do for this project. I'll keep your other suggestions in mind for the future.
        All the best -Ben

  41. Ali
    Posted May 12, 2011 at 4:27 pm | #

    Hi Pavel,

    I applied your one-sided filter (with the coefficients given in the report paper, exact on 1, x, x^2, using 15 backward points with the common factor 1/2856) to a sinusoid (namely 100\sin(2\pi\cdot 0.2\,t)), and see that the resultant maximum value should be near 125.66, but it was 158.14. Moreover, when you plot it, the difference is clearly seen. Where do you think the discrepancy comes from? Maybe I am doing something wrong?

    In the report only the results are given; I would like to learn how to generate the coefficients in the general case, for example up to x^3 with 15 or 19 points, to try something.

    Thank you, this is a very very illuminating post …

    • Posted May 12, 2011 at 5:09 pm | #

      Hard to say – could you send your plots (or the code showing how you apply the filters)? My e-mail is pavel@holoborodko.com

      • Ali
        Posted May 12, 2011 at 6:04 pm | #

        Thanks for the answer. I have sent you an Excel file that should explain what I mean more clearly, I hope. Thanks again.

    • Posted May 12, 2011 at 6:42 pm | #

      Thanks for the data.
      Frequency response plots in my report are drawn for h=1 (Time Diff in your worksheet). If we use a different h, then the actual input signal frequency is f\cdot h.

      A signal with f=0.2 and sampling step h=0.1 corresponds to the \omega/2\pi=0.02 point on the frequency response plot.
      At this point the N=15 filter gives greater values than the ideal differentiator – so this is correct behavior.

      If you want a more precise estimation: (a) decrease the sampling step (the usual way in numerical analysis). This will shift high frequencies towards the origin (the low-frequency region), where the filter is more precise. Or, if you cannot change the sampling step, then you have to (b) design a specific filter which is precise on this particular frequency (a so-called band-pass differentiator).

      Filters in the report (and on this page) are designed to be precise on the lowest frequencies, near zero, for the most common case.

      • Ali
        Posted May 12, 2011 at 7:37 pm | #

        Oops, I missed a critical point …

        Well, I cannot change the values present in the Excel file, and that frequency is the expected maximum for our data. From your explanation, what I understand is that I need a smooth low-pass differentiator accurate up to that frequency…

        Many thanks for the clarification,

        @li

  42. delarocca
    Posted June 8, 2011 at 8:31 pm | #

    hi Pavel,

    I'm working on some economic series and tested with success your one-sided differentiator, hybrid version (which is really, really good). I would like to know how it's possible to get N larger than 15 for the hybrid formula, as I need to test from 21 to 28?

    thank you

  43. Ben
    Posted June 15, 2011 at 4:12 pm | #

    The irregular-spacing formula doesn't seem to work. To see this, notice that the derivative doesn't depend on x_0, so we could place x_0 anywhere we like between x_{-1} and x_1 and still get the same result. Perhaps you could try weighted differences, but I'm not sure that will work either – for example, if the weights have linear terms in x_0 it would fail for quadratic functions, etc.

  44. Posted June 15, 2011 at 4:33 pm | #

    The points \{x_{k}\} depend on x^{*} – please check the notation under Derivation. They are just distances from x^{*} to the points where the f_{k} are sampled.

    • Ben
      Posted June 15, 2011 at 5:57 pm | #

      Not sure I follow, because Derivation defines the x_k as equally spaced.

      Anyway, if we set f(x)=(x-a)^2 with x={-2,-1,0,2,4} then the irregular spacing formula seems to imply that f'(0)=3/2-2a. Could you please show how to apply it correctly?

      • Posted June 17, 2011 at 1:47 pm | #

        I'm sorry for the late response. The problem with irregularly spaced formulas is that they are accurate only on 1, x.
        It is extremely difficult to derive high-approximation-order formulas for arbitrarily spaced data.

        I should have written about this fact on the page – sorry….

        • Ben
          Posted June 17, 2011 at 6:30 pm | #

          No worries. The derivation of the regularly spaced formula is a very neat trick though!

  45. Matteo
    Posted June 17, 2011 at 2:14 am | #

    Hi Pavel,
    I use your smooth noise-robust differentiators for the analysis of the time resolution of an electromagnetic calorimeter prototype for the SuperB Factory R&D. I will cite your page in my thesis work.

    Really a well done algorithm! ;)

    cheers
    Matteo

  46. Steve
    Posted June 24, 2011 at 6:27 am | #

    I am having a really hard time with the general form for c_k. Specifically, the lower term in the second binomial coefficient becomes negative as k gets closer to M in the sum. I am not aware of a generalization of the binomial coefficient formula to a negative k (lower number), only to negative n. Is there a problem with the formula?

  47. Posted July 3, 2011 at 7:04 pm | #

    Hi Holoborodko,
    For some reason, the hybrid differentiators proposed in ‘One Sided Noise Robust Differentiators.pdf’ do not comply with the tangency constraints proposed above!
    For example, for N=4 and n=2 I took m=1 and generated a system of (3+2) equations for the tangency constraint, and found that the coefficients from ‘One Sided Noise Robust Differentiators.pdf’ satisfy all the equations but the last one!
    Can you point me to what I'm doing wrong?
    Thanks in advance!

    • Posted July 4, 2011 at 10:55 am | #

      Hi Islam,
      You are doing nothing wrong – the hybrid filters from the paper are not derived from tangency constraints.
      They are experimental ones – I used a very different idea to build them.

  48. Anonymous
    Posted August 10, 2011 at 12:34 am | #

    Pavel, have you published any articles for this work?

    • Posted August 10, 2011 at 11:13 am | #

      No.
      On average, 22 visitors read this page every day. The total number of readers is 16,417.
      I think no journal can provide such an audience.

  49. Posted September 11, 2011 at 7:39 pm | #

    Hi Pavel,

    I want to estimate velocity from position data at very low speeds. My input data to the controller is digital in nature (optical encoder). How do I make use of your algorithm to estimate the velocity?

    Thanks

  50. Posted September 13, 2011 at 2:11 am | #

    “I want to estimate velocity from position data at very low speeds.”
    Do you want to do this real time or in some post processing phase?
    Are you trying to do this with a PLC?
    Calculating velocities in real time at low speeds is very difficult. First you must provide yourself with good data to work with, and use the highest-resolution encoder you can find. You may get better results counting the time between counts instead of counting the counts per fixed time. If you end up counting the time between counts, then the smooth noise-robust differentiator will not work, because the time periods will not be equal.

    • Posted September 13, 2011 at 1:44 pm | #

      I want to estimate velocity in real time. I am doing it in a digital device (FPGA) with optical encoder inputs (4000 cpr). I want to see whether Pavel's algorithm can bring improvements.

  51. Posted September 14, 2011 at 12:59 am | #

    Majid, an algorithm will only do so much to compensate for poor data. 4000 cpr is not enough. Try 400,000 cpr! I have customers that use encoders that provide over 500,000 cpr!

    I will assume some numbers, since you have not provided any. If 1 revolution is 4000 counts and the distance per revolution is 100 mm, then you have 40 counts/mm. If the product is moving at 1 mm/sec, then there are 25 milliseconds/count. If you try to compute the speed every millisecond, then most of the time you will not even see the counts change. Even at 400,000 cpr the counts will change only every 0.25 milliseconds. This would mean that most of the time you would see 4 counts per millisecond with some jitter, so sometimes there will be 3 counts and other times 5 counts in a millisecond. Even this is not very good, but now you can consider using the SNRD algorithm. You can see you don't have a chance with only 4000 cpr.

    One can see that it is better to count the microseconds between pulses to estimate speed when the encoder resolution is low. This should be easy if you are working with an FPGA. Then I would use the αβγ algorithm discussed above, but allow the time between readings to be variable. This is easy to do.
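
    The time-between-counts idea can be illustrated in a few lines (a sketch using the same assumed numbers as the worked example above: 4000 cpr and 100 mm per revolution):

```python
CPR = 4000                       # encoder counts per revolution (assumed)
MM_PER_REV = 100.0               # travel per revolution (assumed)
MM_PER_COUNT = MM_PER_REV / CPR  # 0.025 mm of travel per count

def speed_mm_per_s(dt_us):
    # One count of travel observed dt_us microseconds after the previous one.
    return MM_PER_COUNT / (dt_us * 1e-6)

# The worked example above: at 1 mm/s there are 25 ms between counts.
print(speed_mm_per_s(25000))  # ~ 1.0 mm/s
```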

  52. Posted December 8, 2011 at 5:40 am | #

    Hi Pavel,

    very good work! I use your differentiators for parameter identification of fuel cell models and get very fine results.

    Thank you for publishing online.
    Regards, Markus.

  53. Kevin
    Posted February 21, 2012 at 8:26 am | #

    Hi Pavel,

    Have you ever looked into the fourth order?

    Best,
    Kevin

    • Posted February 21, 2012 at 2:36 pm | #

      Hi Kevin,

      Not yet – (why) do you need one?

      • Kevin
        Posted February 23, 2012 at 2:58 am | #

        Thanks for responding Pavel,

        I’m looking into high-order image reconstruction, and I think the fourth order would be interesting to play with =D.

  54. Shelly
    Posted March 24, 2012 at 5:59 am | #

    Pavel, I am working with stock price data, which is very jagged – up one day, down the next. I would like to see the first N derivatives (N<5) side by side with minimal delay and error. What formulas on this site would you recommend?

    • Posted March 24, 2012 at 9:19 pm | #

      Hi Shelly,

      I assume you need real-time processing, where you can use only past values to compute the derivative at the present moment.
      Filters of this type are called one-sided.

      If you want to compute just the first derivative, you can use filters from my report: One Sided Noise Robust Differentiators.pdf.
      Use the filters from the second table – they give less time delay.

      However, if you want to see several derivatives simultaneously, it is a slightly more challenging task.
      We cannot apply the same (1st derivative) filter several times to get a higher-order derivative, since the results will be de-synchronized in time.

      We need a distinct filter to estimate the derivative of every order – 1st, 2nd, 3rd, etc. Moreover, all the filters should be synchronized with each other in time delay and error, so that the estimates are provided for the same time (e.g. all delayed by one sample).

      Otherwise the estimation results will relate to different moments in time and the analysis will be meaningless. The synchronization issue is particularly important if you want to use the computed derivatives in further numerical modeling (e.g. a PDE or else).

      I have experience in building such special sets of synchronized filters (although only for the 1st and 2nd order).
      If you are interested, I can provide you assistance with the task.

      Let me know your proposal by e-mail: pavel@holoborodko.com

  55. Posted March 25, 2012 at 1:48 am | #

    To get the higher derivatives I would consider something like the alpha-beta-gamma filter mentioned above. However, differentiating is noisy, and since all the derivatives are related by a factor of the sample time, there are real problems with numbers getting too big or too small when trying to estimate higher derivatives.
    Here is an example of the alpha-beta-gamma filter as used in motion control, where phase delay must be minimized.
    http://deltamotion.com/peter/Videos/AlphaBetaGamma.mp4

    I only calculated the second derivative in this example, but I could expand that to add two more; however, one needs 64-bit floats to have any chance of it working properly, because the round-off errors get to be significant. See this:
    http://deltamotion.com/peter/Maxima/Savitzky-Golay.html
    You should notice that each derivative is a factor of T from the other. Dividing by T^4 can result in very big numbers if the sample time is small. If the sample time is big, then the numbers can get to be very small.

    • Posted March 25, 2012 at 9:19 am | #

      @…one needs 64 floats…

      I think we can even use arbitrary-precision floats, since most likely stock-price analysis applications are not limited in computational resources. But most probably they just use sampling step = 1.

      Peter, could you give me pointers on where to read about ABG analysis?
      As I recall, ABG was equivalent to SG of some particular degree in my tests, which means the phase delays for position, velocity and acceleration are not the same – they are de-synchronized.

  56. Posted March 25, 2012 at 4:23 pm | #

    Pavel, there is lots of information if you search for “alpha beta gamma filter”. Most applications involve tracking. In my case I want to track one motor so I can gear another motor to it. There are other applications, like tracking images.

    The a-b-g filter isn't magical. It is the poor cousin of the Kalman filter. The big difference between the Kalman filter and the a-b-g filter is how the gains are computed. After that they are identical. In both cases there is a system matrix that predicts or estimates where the state will be in the next iteration. The actual position is compared to the estimated position, and the difference is multiplied by a vector of gains that corrects the estimated position, velocity and acceleration.

    The Kalman filter will minimize the error at the cost of a lot of CPU power. One also needs to know the system noise and measurement noise. Most of my customers have no clue how to calculate these, so even many engineers just fudge the numbers to get acceptable results. So if one is going to do that, why not simply use an a-b-g filter, where the gains are easily calculated but not optimal? In my links in previous threads I believe everything is shown on how to implement an a-b-g filter. I did include three different ways of calculating the a-b-g coefficients: one closes the error using real poles, one uses a Butterworth filter. The B-filter reduces the errors faster than using 3 real poles, but it can overshoot when correcting. The third one uses coefficients that minimize the integrated absolute error. It closes the poles faster yet, but is still more prone to overshooting.

    Whatever method you choose, one still selects a bandwidth and increases or decreases it to get the desired results.

    BTW, in the video the estimated acceleration would sometimes not settle down to zero when the motor stopped turning. The acceleration would oscillate a bit around zero. This happened because the estimated position and the quantized measured position would often never be the same, so the error multiplied by the gains meant the estimated value always changed – unless, by luck, the estimated and actual positions became identical. The articles don't go into the quantizing effect of the feedback, since they are just academic articles. The motor example is real, and the resolution of the feedback is quantized. The same would happen with pixels in any vision system. Stock prices don't have infinite resolution either.
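
    For readers following along, the predict/correct loop described above can be sketched as follows (an illustration with hand-picked gains, not Peter's production code; conventions for the γ correction term vary between references):

```python
def abg_update(state, z, dt, alpha, beta, gamma):
    # One predict/correct step of an alpha-beta-gamma tracker (sketch).
    x, v, a = state
    # Predict with a constant-acceleration model.
    xp = x + v * dt + 0.5 * a * dt * dt
    vp = v + a * dt
    # Correct by the innovation (measured minus predicted position).
    r = z - xp
    return (xp + alpha * r,
            vp + (beta / dt) * r,
            a + (2.0 * gamma / (dt * dt)) * r)

# Track a noiseless constant-acceleration target z(t) = t^2 (true v = 2t, a = 2).
state = (0.0, 0.0, 0.0)
for k in range(1, 301):
    state = abg_update(state, float(k * k), 1.0, 0.5, 0.2, 0.02)
print(state)  # after the transient: ~ (90000.0, 600.0, 2.0)
```

    Since the constant-acceleration model matches this target exactly, the innovation decays to zero and the estimates converge; with quantized feedback, as Peter notes, the innovation never quite vanishes.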

    • Posted March 27, 2012 at 4:28 pm | #

      Hi Peter, sorry for the late response, I’ve been busy with other tasks.

      I know what a-b-g is; what I asked for was a detailed analysis of the synchronization between position, velocity and acceleration estimated by a-b-g. Sorry for not making myself clear before. Let me describe my ideas.

      A-b-g assumes that, at any given moment of time, position can be approximated by a parabolic polynomial, velocity by a linear polynomial, and acceleration by a constant.

      In nature this is equivalent to a Savitzky-Golay filter of 2nd degree (at least locally). We build a parabola based on the past values, sample it at the current time (position), and sample its first and second derivatives (for velocity and acceleration):

      x(t) = at^2+bt+c
      v(t) = 2at+b
      a(t) = 2a
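
      This one-polynomial construction can be sketched directly (my illustration of the idea, not Pavel's filter-design code): fit a parabola to the most recent samples by least squares and read off x, v, a at the newest point.

```python
def parabola_state(samples, h=1.0):
    # Fit p(t) = c0 + c1 t + c2 t^2 to the samples (oldest first; newest at
    # t = 0, older at t = -h, -2h, ...) and return (p(0), p'(0), p''(0)).
    N = len(samples)
    ts = [-(N - 1 - i) * h for i in range(N)]
    # Normal equations A c = r for the monomial basis 1, t, t^2.
    A = [[sum(t ** (i + j) for t in ts) for j in range(3)] for i in range(3)]
    r = [sum(f * t ** i for f, t in zip(samples, ts)) for i in range(3)]
    # Solve the 3x3 system by Gaussian elimination with partial pivoting.
    M = [list(row) + [b] for row, b in zip(A, r)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, 3):
            m = M[i][col] / M[col][col]
            for j in range(col, 4):
                M[i][j] -= m * M[col][j]
    c = [0.0] * 3
    for i in (2, 1, 0):
        c[i] = (M[i][3] - sum(M[i][j] * c[j] for j in range(i + 1, 3))) / M[i][i]
    return c[0], c[1], 2.0 * c[2]

# On an exact parabola the fit recovers position, velocity, acceleration exactly.
f = [5 * t * t + 2 * t + 3 for t in (-4.0, -3.0, -2.0, -1.0, 0.0)]
print(parabola_state(f))  # ~ (3.0, 2.0, 10.0)
```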

      The problem with this approach is that all three quantities are estimated with very different approximation errors. Acceleration is approximated by a step function – O(1) – while the position estimation error is O(Δt^3).

      In practice this means that the phase delay for every parameter (x, v, a) is different. They are de-synchronized, in other words. Let's take a look at the following plot for demonstration (I use N=20):

      The left axis is the time delay in samples. As we see, the filter gives position, velocity and acceleration for different moments in time. For example, velocity is delayed by 4 samples and acceleration by 8 compared to position at w=0.05.

      From the plot we see that the one-polynomial approach tries to estimate position with very high precision at the cost of acceleration and velocity. Note the negative phase delay near the origin – this is an attempt to “predict” position ahead in time, while acceleration is given for the 10th sample in the past (this means a 100 ms delay at a 100 Hz rate). As some of my clients reported, such time shifts can be disastrous for mathematical models (PDE or else).

      I've been working on designing a filter bank which provides synchronized estimation of position, velocity and acceleration. This is what I've got so far:

      All three parameters are estimated with the same approximation error, and the same time delay, in the entire pass-band (up to w=0.05).

      Let me know your opinion.

  57. Posted March 31, 2012 at 1:24 pm | #

    This doesn't surprise me at all; in fact it is desired. If you plot a sine wave, the acceleration leads the velocity by 90 degrees and the velocity leads the position by 90 degrees.

    • Posted March 31, 2012 at 1:49 pm | #

      Yes, phases of the acceleration, velocity and position are pi/2 apart.

      Of course, the plots above are already normalized in this respect – all such shifts are subtracted. We see the “true” delay (in samples) which every filter introduces in estimating the characteristic it is aimed at. We compare apples to apples :).

      So, a-b-g estimates position with approx. 1.7 samples delay, velocity with 5.5 and acceleration with 10 at frequency w=0.05.

      UPDATE: The pass-band of the a-b-g filter is adjustable, so the precise numbers may differ for different parameters. The main point is that a-b-g by design approximates position, velocity and acceleration with different accuracy (parabolic, linear, constant), which leads to a different phase delay in each characteristic.

  58. Gaetano
    Posted May 31, 2012 at 7:42 am | #

    I need to estimate the time-domain second derivative for a time-domain finite-difference simulator with attenuation (the second derivative is needed for loss calculation). I used a simple finite-difference approach with 3, 4 and 5 samples. The one with four samples has the best convergence characteristics, and the 5-sample one has the worst. I suspect that the poor performance of the 5-sample estimation of the second derivative is due to numerical noise. I would like to use one of your forward filters with 5 samples (exact on 1, x, and x^2). I cannot use the central differences because I do not have the value of the function in advance, and I cannot use too many values because the storage and reading of past values are already very large with 5 samples. What are the coefficients for a five-sample forward second derivative and … do you have any suggestions?
    Very nice work,
    Gaetano

    • Posted May 31, 2012 at 6:15 pm | #

      Here is a 5-tap Savitzky-Golay filter (which I believe will be better than the smooth differentiators) for a quick test:

          \[ \frac{1}{7}\,{\frac {2\,f_{{0}}-f_{{-1}}-2\,f_{{-2}}-f_{{-3}}+2\,f_{{-4}}}{{h} ^{2}}} \]

      It gives a 2-sample delay and provides the following noise suppression:

      A good alternative is the Alpha-Beta-Gamma filter, which needs only a few variables to store its state. With proper tuning it can provide even better noise suppression.
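
      A quick sanity check of this filter (my sketch): being exact on 1, x, x^2, it must reproduce the exact second derivative of any quadratic, e.g. f(t) = 5t^2 - 2t + 3 with f'' = 10.

```python
def sg5_d2(f, k, h=1.0):
    # One-sided 5-tap Savitzky-Golay second derivative from above:
    # (2 f_k - f_{k-1} - 2 f_{k-2} - f_{k-3} + 2 f_{k-4}) / (7 h^2)
    return (2 * f[k] - f[k - 1] - 2 * f[k - 2] - f[k - 3] + 2 * f[k - 4]) / (7 * h * h)

h = 0.5
f = [5 * t * t - 2 * t + 3 for t in (i * h for i in range(10))]
print(sg5_d2(f, 7, h))  # ~ 10.0 (exact up to rounding, at any k >= 4)
```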

  59. Posted June 1, 2012 at 3:42 am | #

    I didn't read your post of Mar 27 very well. Now I understand what you are trying to do. You must mean the phase delay of the estimated values compared to the true values. Yes, but one can increase the order of the αβγ filter to an αβγδ filter, and then the acceleration is estimated using a second-order equation. The usefulness of this is limited, as the quantizing error is multiplied by 1/Δt³. Increasing the sample time helps, but only if the resolution of the feedback device is finer. There is no magic.

    The key is to adjust the bandwidth used to calculate the K gains to give the desired results. The Kalman filter finds the optimal K gains, but often these gains are calculated using inaccurate noise estimates, so the engineers fudge the gains anyway. You can see in the video that the phase delay is very small if I keep the bandwidth up high, but if I lower it then the phase delay becomes noticeable. I have also tested the Savitzky-Golay and the smooth noise-robust differentiator using the same motor setup. The problem with these two methods is that the quantizing error from the encoder really degrades the performance if the filter is one-sided.
    This outweighs the ease-of-use (no bandwidth to adjust) advantage of the SG and SNRD filters.

    Another problem is initializing the history or state. With the S-G and SNRD filters one must fill in past history to initialize them. The A-B-G filter simply needs the current state. In motion control the current state is almost always known, or at least what it ideally should be.

    The state for the alpha-beta-gamma filter only requires 3 variables. That should be all Gaetano needs to get a good velocity estimate.
    Gaetano, look at my posts above about the αβγ filter.

    • Posted June 8, 2012 at 10:33 am | #

      I didn't read your post of Mar 27 very well. Now I understand what you are trying to do. You must mean the phase delay of the estimated values compared to the true values. Yes, but one can increase the order of the αβγ filter to an αβγδ filter, and then the acceleration is estimated using a second-order equation.

      This won't improve synchronization – x, v, a will each be estimated with a different phase delay anyway.

      The usefulness of this is limited, as the quantizing error is multiplied by 1/Δt³. Increasing the sample time helps, but only if the resolution of the feedback device is finer. There is no magic.

      This problem relates to αβγδ only, since it constructs a cubic polynomial on the fly at every step (hence 1/Δt³ is needed).
      However, fixed filters have no such problem. We can use a polynomial of any degree in the filter design process (using Maxima, Maple or else) – the final formula for acceleration will be multiplied by only 1/Δt², and velocity by 1/Δt. The high-degree Δtⁿ terms cancel each other along the way, thanks to equidistant spacing and the laws of physics.

      The key is to adjust the bandwidth used to calculate the K gains to give the desired results. The Kalman filter finds the optimal K gains, but often these gains are calculated using inaccurate noise estimates, so the engineers fudge the gains anyway. You can see in the video that the phase delay is very small if I keep the bandwidth up high, but if I lower it then the phase delay becomes noticeable. I have also tested the Savitzky-Golay and the smooth noise-robust differentiator using the same motor setup. The problem with these two methods is that the quantizing error from the encoder really degrades the performance if the filter is one-sided. This outweighs the ease-of-use (no bandwidth to adjust) advantage of the SG and SNRD filters.

      Bandwidth is adjusted by the filter length. Phase delay is inversely proportional to bandwidth (the smaller the pass-band / the stronger the smoothing, the higher the time delay) in any filter.

      The state for the alpha-beta-gamma filter only requires 3 variables. That should be all Gaetano needs to get a good velocity estimate. Gaetano, look at my posts above about the αβγ filter.

      I beg to disagree. Besides the α, β and γ values, we have to store the position, velocity and acceleration values from the previous step. In total, 6 floating-point values should be stored between steps (I was mistaken about 5 values – I forgot acceleration :)).

      My point is that the αβγ filter is near-equivalent to SG of some high degree (20), with its own pros and cons. But both have the drawback of de-synchronized estimation – position, velocity and acceleration have different phase delays. This can be alleviated by building a family of filters which are synchronized by nature, for any sampling rate.

  60. Jax
    Posted June 20, 2012 at 6:44 pm | #

    Hi Pavel,
    first of all, great job – your work is amazing.
    I’m writing because I’m facing some trouble with the estimation of the coefficients s(k) in “Noise Robust Differentiators for Second Derivative Estimation”.
    I would be grateful if you could post the extended formulas for numerical estimation of the second derivative (backward) for 15 and 22 points.
    Thanks a lot Pavel.
    Jax

  61. KT
    Posted July 8, 2012 at 7:07 pm | #

    Pavel,

    I am not very well versed in the math of the subject; however, by reading your paper, I think I can use your filters in my application.

    I have a signal that I sample at 500 kHz. I am trying to detect the rise, the fall and the peak in the incoming data. The base of the peak could last 250 µs to 2.5 ms, and the amplitude could be 6 dB to 15 dB above the noise floor. I don’t have a good SNR unfortunately. The DC level of the signal is not constant but moves much more slowly than the AC component.
    I couldn’t determine which N and n I should use so that I pick the right frequencies and suppress the noise.

    At the decision point, I need to know the slope of the rise and fall. This is a hard real-time system and I really need to make a decision within 100 µs after the downward slope reaches the DC level.

    A secondary side question is about amplitude: due to the nature of the signal, I cannot know its exact amplitude; however, I would like to compare one signal to another, which means I need to normalize the signal. Is there a way to do this using your filters?

    Thx
    KT

  62. Thomas Ong
    Posted July 9, 2012 at 6:54 pm | #

    Hi Pavel,
    It is great work. By the way, how can I calculate the frequency bandwidth of a first-order smooth noise-robust differentiator?
    Thanks,
    Thomas

  63. Uzair
    Posted September 6, 2012 at 8:35 pm | #

    Hi Pavel,

    As with your original (two-sided) differentiators, can you explain the thinking behind the one-sided differentiators? That is, how the coefficients are derived, etc.?

    Thank you for your excellent work!

    Uzair

  64. Cliff
    Posted September 19, 2012 at 5:10 am | #

    Pavel,

    I have tried to use your formula for the first derivative for irregularly spaced data without success. I can exactly reproduce the derivative of a polynomial (2nd order) when using evenly spaced data, but when the data are irregularly spaced the estimated derivative is quite a bit off. I’m happy to share my data, etc., if it helps to gain an understanding of why the method is not working.

    Best,

    Cliff

  65. Posted November 6, 2012 at 8:25 am | #

    Do you know the corresponding smooth filters to replace the asymmetric (and symmetric) filters in the code below? I would like to compare stability within some other code. (It is a straight paste, but should be readable.)

    DO 60 J = 4, N-3
       der1(J) = (          f_l(J+3) - f_l(J-3)   &
                  -  9.D0 * (f_l(J+2) - f_l(J-2)) &
                  + 45.D0 * (f_l(J+1) - f_l(J-1)) ) &
                 / 60.D0
    60 CONTINUE
    ! Use unsymmetric 6-point formulae for the derivative at the
    ! last 3 points.
    der1(N-2) = - (    3.D0*f_l(N  ) &
                   -  30.D0*f_l(N-1) &
                   -  20.D0*f_l(N-2) &
                   +  60.D0*f_l(N-3) &
                   -  15.D0*f_l(N-4) &
                   +   2.D0*f_l(N-5) ) &
                / 60.D0

    der1(N-1) = - ( - 12.D0*f_l(N  ) &
                    -  65.D0*f_l(N-1) &
                    + 120.D0*f_l(N-2) &
                    -  60.D0*f_l(N-3) &
                    +  20.D0*f_l(N-4) &
                    -   3.D0*f_l(N-5) ) &
                / 60.D0

    der1(N  ) = - ( - 137.D0*f_l(N  ) &
                    + 300.D0*f_l(N-1) &
                    - 300.D0*f_l(N-2) &
                    + 200.D0*f_l(N-3) &
                    -  75.D0*f_l(N-4) &
                    +  12.D0*f_l(N-5) ) &
                / 60.D0

  66. Posted November 6, 2012 at 10:38 am | #

    N.B., I am probably interested in N=7 n=4

  67. Kong
    Posted November 16, 2012 at 1:04 am | #

    Very interesting post. I was wondering could you publish the one-sided first derivative on irregular spaced data. Thanks.

  68. Magnus
    Posted December 17, 2012 at 11:29 pm | #

    Thanks for an interesting article.
    I have a question about your formula for the general coefficient c_k in terms of binomial coefficients,
    there appears the binomial coefficient
    \left( \begin{array}{c} 2m \\ m - k - 1 \end{array} \right)
    where k goes from 1 to M. But with the way m and M are defined m -k - 1 will become negative for the larger values of k.
    Is there a typo in the formula, or how do you define the binomial coefficient for a negative argument?

    • Posted December 19, 2012 at 11:04 am | #

      Hi Magnus, thank you for your question!

      This binomial coefficient is zero when m-k-1 \le 0. We can compute this by using general definition of binomial coefficient through \Gamma(n) function, with reflection (+limiting) trick for negative arguments.
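      In code, this convention simply amounts to treating the binomial coefficient as zero whenever its lower argument falls outside 0..n, which avoids the \Gamma-function machinery. A minimal Python sketch (function names are mine, not from the article) that reproduces the c_k of the central smooth noise-robust differentiator:

```python
from fractions import Fraction
from math import comb

def comb0(n, k):
    """Binomial coefficient, defined as 0 for k < 0 or k > n
    (Python's math.comb raises ValueError for negative k)."""
    return comb(n, k) if 0 <= k <= n else 0

def smooth_diff_coeffs(N):
    """Coefficients c_k of the smooth noise-robust differentiator,
    f'(x*) ~ (1/h) * sum_{k=1..M} c_k (f_k - f_{-k}), for odd N >= 5."""
    m = (N - 3) // 2
    M = (N - 1) // 2
    return [Fraction(comb0(2 * m, m - k + 1) - comb0(2 * m, m - k - 1),
                     2 ** (2 * m + 1))
            for k in range(1, M + 1)]

print(smooth_diff_coeffs(5))  # [Fraction(1, 4), Fraction(1, 8)]
print(smooth_diff_coeffs(7))  # [Fraction(5, 32), Fraction(1, 8), Fraction(1, 32)]
```

      Note that the denominators come out as powers of 2, matching the tables in the article.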

      • rmix
        Posted April 13, 2014 at 5:22 am | #

        “<0″, not “<=0″, I guess.
        Is there a formula using a more common definition of the binomial coefficients, so we could use the appropriate functions built into math programs (which fail for negative values)?

  69. Amit
    Posted February 12, 2013 at 5:58 am | #

    Hi Pavel,

    I am trying to do some pulse sharpening using Y(t) - a*Y”(t) + b*Y””(t), where a and b are real constants. Is there a way to derive a filter for this directly? If not, is there a formula for the fourth derivative that does not apply the second derivative twice?

    Thanks

    Amit

    • Posted February 13, 2013 at 11:00 am | #

      Hi Amit,

      Yes, this expression can be written as one filter (from the superposition of the filters for each summand: Y(t), a*Y”(t) and b*Y””(t)).
      And yes, it is possible to derive an explicit filter for the fourth derivative (e.g. from superposition or directly).

      Hope this helps.
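      As a sketch of the superposition idea (using plain central-difference kernels for illustration – any of the noise-robust kernels from the article could be substituted), the whole operator Y - a*Y” + b*Y”” collapses into a single convolution kernel:

```python
import numpy as np

# Plain central-difference kernels at unit spacing (illustrative choice;
# a noise-robust second-derivative kernel works the same way).
d2 = np.array([1.0, -2.0, 1.0])   # second-derivative kernel
d4 = np.convolve(d2, d2)          # fourth derivative = d2 applied twice
                                  # -> [1, -4, 6, -4, 1]

def combined_kernel(a, b):
    """Single-filter kernel of the operator Y - a*Y'' + b*Y''''."""
    out = b * d4.copy()
    out[1:-1] += -a * d2          # embed the shorter kernel centrally
    out[2] += 1.0                 # identity (unit impulse) at the center
    return out

k = combined_kernel(a=0.5, b=0.1)
y = np.random.default_rng(0).normal(size=100)
sharpened = np.convolve(y, k, mode="same")  # one pass instead of three
```

      The fourth-derivative kernel is obtained by operator convolution of the second-derivative kernel with itself, not by squaring its coefficients.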

  70. Lech Piotrowski
    Posted February 25, 2013 at 8:31 pm | #

    Could you please add a general expression for the coefficients for polynomials up to 4th degree, or a way to derive them? I understand how to get the numbers for each N, but I am not sure how to turn them into binomial coefficients…

  71. Martin
    Posted April 4, 2013 at 9:04 pm | #

    Hi Pavel,

    I am interested in a *formula* for 1D “One Sided Hybrid Differentiators” (as described in your pdf: http://www.holoborodko.com/pavel/wp-content/plugins/download-monitor/download.php?id=7 ). Note that I will need filters with N>15. Values for a specific N are thus not that useful. Any help/pointer is much appreciated.

    Martin

  72. Felipe
    Posted September 13, 2013 at 5:38 am | #

    Hi Pavel,

    I have a question about the eigenvalues of the numerical differentiation matrices which I believe is closely related to this post. If you have some time, could you take a look at it?
    http://math.stackexchange.com/questions/491170/1d-schrodinger-laplace-equation-via-finite-differences-incompatible-eigenvalues

    And thanks for this nice and clear post!
    Felipe

    • Posted September 24, 2013 at 8:32 pm | #

      Hi Felipe,

      I answered your question on the math.stackexchange site a few days ago.
      The short answer is that you have to use operator convolution instead of the product to get a higher-order derivative filter from low-order ones.

      In the frequency domain you can use the product of transfer functions, but not in the time domain.

  73. Krzysztof Bosak
    Posted September 24, 2013 at 7:59 pm | #

    If I understand correctly, taking an nth-order derivative that is exact for 1, x (either centered or backward)
    is numerically equivalent to
    making a least-squares fit over the same points with Ax+B and then taking A as the derivative, right?

    Consequently, when the formula is exact for 2nd-order polynomials, it is equivalent to fitting N points with
    Ax^2+Bx+C and then taking df(x_0)=2Ax_0+B at the given point as the slope?

    Except that those formulas are faster to calculate than performing the general least-squares fit, of course…

    • Posted September 24, 2013 at 8:29 pm | #

      Yes, this is correct – filters designed that way are called Savitzky-Golay filters.

      The only difficulty – there is no flexible way to control the noise suppression or phase delay of such filters (nor other properties).
      These properties are fixed by the least-squares method + polynomials (time domain).

      If you consider filter design in the frequency domain (as above) – you will have much more freedom to build a filter with any properties, preserving exactness on polynomials if desired.

      It is like having more parameters / degrees of freedom to tune your filters, whereas the least-squares approach has all parameters fixed except for the filter length and the degree of polynomial. (You can use weighted least-squares to have more degrees of freedom.)

      BTW, least-squares can be formulated in frequency domain too – it is just one of many possible design criteria accessible in frequency domain.

          \[ \int_0^{\pi/2}H(\omega)^2\,\mathrm{d}\omega \rightarrow \min  \]

      Just imagine – in the frequency domain you can add extra conditions to the least-squares criterion (like phase-shift minimization) – and get improved Savitzky-Golay filters with a smaller delay (if we are talking about one-sided filters).

      This topic is so deep, I would be a millionaire if I got a penny for every imaginable criterion & its combinations ;))).
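      The least-squares equivalence above is easy to verify numerically. A short sketch (assuming NumPy): for a filter exact on 1 and x over symmetric points, the least-squares slope reduces to the direct weighted sum A = Σ k·f_k / (h·Σ k²), the "low-noise Lanczos" form:

```python
import numpy as np

# 2M+1 noisy samples around x*, equally spaced with step h
rng = np.random.default_rng(1)
M, h = 3, 0.1
k = np.arange(-M, M + 1)
x = k * h
f = np.sin(x) + 0.01 * rng.normal(size=x.size)

# slope of the least-squares straight-line fit A*x + B
A_fit = np.polyfit(x, f, 1)[0]

# direct weighted-sum form of the same estimate
A_direct = np.dot(k, f) / (h * np.dot(k, k))

print(abs(A_fit - A_direct))  # the two estimates coincide up to round-off
```

      The direct form needs only one weighted sum per point, which is why these filters are cheaper than running a general least-squares fit at every sample.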

  74. lanosalf
    Posted October 6, 2013 at 9:24 pm | #

    Hi, I’m doing my thesis in physics.
    In one of my experiments I need the second derivative of one of my measurements.
    The data is irregularly spaced.
    Searching for the “best” second-derivative algorithm, I came upon this site, which is very interesting by the way.
    So, I tried to implement the “second derivative for irregularly spaced data” algorithm.
    What are the chances that something is wrong in the formula?
    Just by looking at it, I realize that there is no S[0] coefficient:
    alpha[k]~k^2*S[k], so it means alpha[0]=0.
    Also, the summations there go from k=1 to M, so the alpha[k] involved do not contain S[0].

    So S[0] is not in the formula, but if you choose regularly spaced data “h” you need to recover the original “Uniform Spacing” formula in which S[0] is present.

    Could it be that the “-2″ is the missing S[0]?

    Thanks in advance

    ps: I tried the algorithm “as is” and got all near-zero and zero values.
    Nevertheless my code could be wrong – I need to check!
    But I wanted to be sure of the algorithm first.

  75. lanosalf
    Posted October 7, 2013 at 6:55 pm | #

    Me again.
    I left the algorithm as you said, with the “-2″, and checked it with some “test” functions (some polynomials).
    It is working fine with these test functions.
    (I still don’t understand where the S[0] enters the formula.)

    Nevertheless, when I use the algorithm on my measurement data I get near-zero second-derivative values!
    If, on the contrary, I use the 5-point stencil, the derivative works fine.

  76. Posted December 30, 2013 at 10:30 pm | #

    Thanks for the algorithms; I am using them in the development of a rate-time decline analysis software suite for – always noisy – oil and gas production rate versus time data. Very useful; thanks again.

  77. michele
    Posted January 10, 2014 at 4:12 pm | #

    Hello,
    I saw your solution and I tried what you suggested to the first user in order to avoid time lag, in particular

    h[0] = 3/8, h[-1] = 1/2, h[-2] = -1/2, h[-3] = -3/4, h[-4] = 1/8, h[-5] = 1/4

    and it seems pretty good. For this filter I noticed that the gain remains pretty high and aligned with the ideal solution even at high frequencies, while the phase starts deviating from 90° earlier. Is there a pattern of coefficients that does the vice versa? I mean, keep the phase accurate up to higher frequencies and start decreasing the gain earlier?
    Thank you very much
    Michele

  78. michele
    Posted January 12, 2014 at 3:18 am | #

    Hello,
    at first I didn’t notice your pdf “One Sided Noise Robust Differentiators.pdf”; I just found it now. I tested the One Sided Hybrid Differentiators and they are fantastic. Thank you very much for sharing your work.
    Michele

  79. Posted February 6, 2014 at 9:06 am | #

    Pavel,

    I am looking at implementing your equations in MATLAB, as shown below. I need to work out the general form of the one sided Z-transform of your difference equation to directly calculate the coefficients in the transfer function. If I have the coefficients in the transfer function, I can directly use MATLAB’s very fast “filter” function. This is not my area of expertise and frankly I am new to the Z-transform. Do you have any suggestions?

        \[\displaystyle {f'(x^*)\approx\frac{1}{h}\sum_{k=1}^{M}{c_k\cdot(f_k-f_{-k})}}\]

    ,

        \[\displaystyle {c_k = \frac{1}{2^{2m+1}}\left[{2m\choose m-k+1}-{2m\choose m-k-1}\right]},\quad \displaystyle{m=\frac{N-3}{2}},\quad M=\frac{N-1}{2}\]
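    For what it’s worth, this difference equation is a plain FIR filter: its impulse response is the antisymmetric tap vector [c_M, …, c_1, 0, −c_1, …, −c_M]/h, and those taps are exactly the numerator coefficients b of the transfer function (denominator a = 1) that MATLAB’s filter(b, 1, f) expects, at the cost of an M-sample delay in a causal pass. A sketch in Python/NumPy rather than MATLAB:

```python
import numpy as np
from math import comb

def comb0(n, k):
    """Binomial coefficient, 0 for lower argument outside 0..n."""
    return comb(n, k) if 0 <= k <= n else 0

def fir_kernel(N, h):
    """Impulse response b = [c_M, ..., c_1, 0, -c_1, ..., -c_M] / h of the
    smooth differentiator; b are the transfer-function numerator taps."""
    m, M = (N - 3) // 2, (N - 1) // 2
    c = [(comb0(2*m, m - k + 1) - comb0(2*m, m - k - 1)) / 2**(2*m + 1)
         for k in range(1, M + 1)]
    return np.array(c[::-1] + [0.0] + [-ck for ck in c]) / h

h = 0.01
x = np.arange(0.0, 1.0, h)
b = fir_kernel(7, h)
# Centered estimate; the MATLAB equivalent filter(b, 1, f) returns the
# same values delayed by M samples.
df = np.convolve(np.sin(x), b, mode="same")   # approximates cos(x) inside
```

    The first and last M output samples are edge-contaminated and should be discarded or handled with one-sided filters.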

  80. Posted March 7, 2014 at 10:46 am | #

    A MATLAB function implementing one of Pavel’s formulas can be found here:

    http://www.mathworks.com/matlabcentral/fileexchange/45745-smooth-robust-differentiators

  81. Wil Koenen
    Posted April 29, 2014 at 8:55 pm | #

    The links to the “On-Line Encyclopedia of Integer Sequences” (A039598, A050166) are broken. The Encyclopedia can now be found at https://oeis.org. The correct links are https://oeis.org/A039598 and https://oeis.org/A050166
