Low-noise Lanczos differentiators

A common approach to computing the derivative of a noisy function f(x) numerically is similar to the classic central differences we considered before. We approximate the “difficult” f(x) by a polynomial near the point of interest x^* and assume that f'(x^*) equals (up to some error) the derivative of that polynomial.

Only one important change is required: the method of approximation. For noisy data it makes no sense to use interpolation as we did for central differences (the values are already corrupted by noise, so there is no need to preserve them exactly). Instead, a smoothing least-squares approximation is used to remove noise from the data. Filters derived by this procedure are commonly referred to as Savitzky-Golay digital differentiators. We will consider several particular filters of this class named after Cornelius Lanczos. The detailed procedure is described below (or skip to the final formulas).

Analogously to central differences, we fix some step h and sample f(x) at N (odd) points around x^*:

f_k = f(x_k),\: x_k = x^*+kh,\: k=-\frac{N-1}{2},\dots,\frac{N-1}{2}.

Then we construct the approximating polynomial P_{M}(x)=\sum_{j=0}^{M}{a_jx^j} for f(x) by minimizing the cost function

Z = \sum_{k=-(N-1)/2}^{(N-1)/2}{\left(f_k-P_{M}(x_k)\right)^2}

with respect to the unknown coefficients \{a_j\}. To do this we solve the system of linear equations:

 \left\{ \displaystyle\frac{\partial Z}{\partial a_0}=0,\ldots,\displaystyle\frac{\partial Z}{\partial a_{M}}=0 \right\}.

Once the polynomial is constructed we can estimate f'(x^*) according to our assumption:

f'(x^*)\approx P'_{M}(x^*)

The degree M of the polynomial should be strictly less than N-1; otherwise the least-squares approximation becomes equivalent to interpolation (the number of unknowns equals the number of degrees of freedom) and we arrive back at central differences.
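The weights of any differentiator of this family can be computed numerically from the procedure above. Below is a minimal sketch (the helper name `lanczos_weights` is my own, and NumPy is assumed): since the least-squares solution is linear in the samples, the derivative estimate P'_{M}(x^*) = a_1 is a fixed linear combination of the f_k, given by the row of the pseudoinverse of the design matrix that corresponds to a_1.

```python
import numpy as np

def lanczos_weights(N, M):
    """Least-squares differentiator weights for N points and a degree-M fit.

    Returns w such that f'(x*) ≈ (1/h) * sum_k w[k] * f_k,
    with samples f_k taken at x* + k*h, k = -(N-1)//2 .. (N-1)//2.
    """
    m = (N - 1) // 2
    k = np.arange(-m, m + 1, dtype=float)
    # Design matrix of the polynomial fit P_M(x) = sum_j a_j x^j (with h = 1)
    A = np.vander(k, M + 1, increasing=True)
    # Least-squares solution: a = pinv(A) @ f, so the derivative at 0,
    # which equals a_1, is row 1 of the pseudoinverse applied to the samples.
    return np.linalg.pinv(A)[1]
```

For example, `lanczos_weights(5, 2)` yields the weights (-2, -1, 0, 1, 2)/10, matching the first tabulated formula below.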

Methods derived by this procedure are called low-noise Lanczos differentiators for M=2 and super Lanczos low-noise differentiators for M=4. Particular formulas for several N are given below.

Low-Noise Lanczos Differentiators (M = 2):

N = 5:\quad \displaystyle\frac{f_{1}-f_{-1}+2(f_{2}-f_{-2})}{10h}

N = 7:\quad \displaystyle\frac{f_{1}-f_{-1}+2(f_{2}-f_{-2})+3(f_{3}-f_{-3})}{28h}

N = 9:\quad \displaystyle\frac{f_{1}-f_{-1}+2(f_{2}-f_{-2})+3(f_{3}-f_{-3})+4(f_{4}-f_{-4})}{60h}

N = 11:\quad \displaystyle\frac{f_{1}-f_{-1}+2(f_{2}-f_{-2})+3(f_{3}-f_{-3})+4(f_{4}-f_{-4})+5(f_{5}-f_{-5})}{110h}

These differentiators have a very simple structure and can be written in a general form for any N:

\displaystyle{f'(x^*)\approx \frac{3}{h}\sum_{k=1}^{m}{k\frac{f_k-f_{-k}}{m(m+1)(2m+1)}}, \quad m=\frac{N-1}{2}}

An interesting fact is that the denominators form a special sequence of integers that appears in many fields of mathematics and physics. For example, these numbers coincide with the maximum accumulated number of electrons up to energy level m. Check this website for more details.
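The general formula is straightforward to code; here is a short sketch of it (the function name is my own, not from the original text):

```python
import numpy as np

def lanczos_derivative(f, x, h, N=7):
    """Low-noise Lanczos (M = 2) estimate of f'(x) from N = 2m+1 samples."""
    m = (N - 1) // 2
    k = np.arange(1, m + 1)
    # Antisymmetric sum: sum_k k * (f_k - f_{-k})
    num = np.sum(k * (f(x + k * h) - f(x - k * h)))
    return 3.0 * num / (h * m * (m + 1) * (2 * m + 1))
```

Since the underlying fit is a quadratic, the estimate is exact for noise-free polynomials up to degree 2, which makes for an easy sanity check.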

Super Lanczos Low-Noise Differentiators (M = 4):

N = 7:\quad \displaystyle\frac{58(f_{1}-f_{-1})+67(f_{2}-f_{-2})-22(f_{3}-f_{-3})}{252h}

N = 9:\quad \displaystyle\frac{126(f_{1}-f_{-1})+193(f_{2}-f_{-2})+142(f_{3}-f_{-3})-86(f_{4}-f_{-4})}{1188h}

N = 11:\quad \displaystyle\frac{296(f_{1}-f_{-1})+503(f_{2}-f_{-2})+532(f_{3}-f_{-3})+294(f_{4}-f_{-4})-300(f_{5}-f_{-5})}{5148h}
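The tabulated M = 4 weights can be applied directly; the sketch below (names and structure are my own) stores the weights w_k for k = 1..m and evaluates the antisymmetric sum:

```python
import numpy as np

# Weights w_k (k = 1..m) from the table above; the estimate is
# f'(x*) ≈ sum_k w_k * (f_k - f_{-k}) / h.
SUPER_LANCZOS = {
    7:  np.array([58.0, 67.0, -22.0]) / 252.0,
    9:  np.array([126.0, 193.0, 142.0, -86.0]) / 1188.0,
    11: np.array([296.0, 503.0, 532.0, 294.0, -300.0]) / 5148.0,
}

def super_lanczos_derivative(f, x, h, N=9):
    """Super Lanczos (M = 4) estimate of f'(x); exact for noise-free
    polynomials up to degree 4."""
    w = SUPER_LANCZOS[N]
    k = np.arange(1, len(w) + 1)
    return np.sum(w * (f(x + k * h) - f(x - k * h))) / h
```

A quick check of the weights: for every N they satisfy sum_k w_k k = 1/2 and sum_k w_k k^3 = 0, which is exactly what makes the estimate exact on cubics.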

To differentiate a digital signal u_n we set h = 1/SamplingRate and replace f_k by u_{n+k} in the expressions above. Magnitude responses of the low-noise differentiators for M = 2, 4 are shown below:
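In the digital-signal setting the differentiator is just an antisymmetric FIR filter. Here is a sketch of that application (function name mine), using the M = 2 closed-form weights:

```python
import numpy as np

def differentiate_signal(u, sampling_rate, N=7):
    """Differentiate a uniformly sampled signal with the N-point low-noise
    Lanczos filter (M = 2), applied as an antisymmetric FIR filter.
    Only the 'valid' part is returned: N - 1 edge samples are dropped."""
    m = (N - 1) // 2
    k = np.arange(1, m + 1)
    w = 3.0 * k / (m * (m + 1) * (2 * m + 1))   # weights w_k, k = 1..m
    # Kernel [w_m..w_1, 0, -w_1..-w_m] so that the convolution computes
    # y[n] = sum_k w_k * (u[n+k] - u[n-k]) * sampling_rate   (h = 1/fs)
    kernel = np.concatenate([w[::-1], [0.0], -w])
    return np.convolve(u, kernel, mode='valid') * sampling_rate
```

On a noise-free linear ramp the filter recovers the slope exactly, since the underlying quadratic fit reproduces linear data.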

As can be clearly seen, this method is much more robust to noise than plain central differences. However, due to their least-squares nature, Lanczos differentiators cannot guarantee complete noise suppression at high frequencies. Their magnitude response does not go smoothly to zero near \pi; instead it shows a “wavy”, always-nonzero response at high frequencies.

Filters with guaranteed suppression of noisy high frequencies are presented in the next article.




  1. manoj
    Posted August 9, 2009 at 1:31 pm | #

    Why do low-noise Lanczos differentiators not work with noisy data?

    • Wirbel
      Posted January 1, 2010 at 8:38 am | #

      I suspect wrong coefficients!
      Change the signs of 22 (N=7), 86 (N=9) and 300 (N=11)

      • Posted January 13, 2010 at 8:06 pm | #

        I’ve checked the coefficients and their signs – all are correct.

  2. Sergey Kolonitskii
    Posted April 3, 2011 at 2:19 am | #

    Thank you for the tutorial. I am currently working on Savitzky-Golay differentiation as applied to differential spectrometry, and it seems that you may know how this stuff works. I can manage the algorithm, but there is the question of error estimates. Do you know any results in this direction? I mean, I derived these estimates myself, but I don’t know if I have reinvented the wheel.

    • Posted April 7, 2011 at 9:15 am | #

      Hi Sergey,
      What do you mean by error estimation? Asymptotics in the sampling step h?

      • Sergey Kolonitskii
        Posted April 7, 2011 at 5:33 pm | #

        No. Well, in my problem the sampling step is given – the device’s resolution is such and such, and you can’t increase it. My boss says we need the 4th derivative, and the noise is nasty enough to require Savitzky-Golay filters. The problem is, when you use too few points the noise is not filtered out, and if you use too many points you can smooth the sample all the way to zero.

        What I am talking about are estimates like this:
        let g_i = f(x_0 + i h) + \xi_i, where \xi_i is the noise.

            \[  \mathcal P \left( |f^{(4)}(x_0) - \frac 1 {h^4} \sum_{i=-n}^n a_i g_i| \leqslant R^{(4,M)}_{f,\xi,\alpha} (n,h) \right) \geqslant \alpha  \]

        I mean, if you compute something, you must know the precision of your calculations, i.e. what is the difference between the value of the derivative you computed and the actual value of the derivative? I made some quick calculations myself, because I didn’t find any results in this direction. Either my googling skills suck (and it may actually be the case) or nobody has come up with such results. I wonder which it is 🙂

        • Posted April 7, 2011 at 6:36 pm | #

          Yep, this is what I was talking about.

          Usually the residual term R depends on the sampling step and the higher derivatives. You can easily derive this kind of error bound R by using a Taylor series in place of g.

          However, this kind of analysis has a major flaw – it is based on the assumption that the magnitudes of the higher derivatives go to zero. This is not the case for practical data, especially in the presence of noise – the derivatives behave wildly. Hence the range of the possible difference R from the true derivative becomes huge and meaningless.
          Besides, you end up chasing your own tail, since you have to calculate higher derivatives in order to get the accuracy of a lower-order estimate.

          All you can do is study your signal: its range of frequencies, whether it can be represented by a polynomial, of what degree, and its noise properties.
          Knowing these characteristics you can build an appropriate filter.

          Take a look at this paper on Savitzky-Golay properties:

          • Sergey Kolonitskii
            Posted April 7, 2011 at 10:26 pm | #

            Thanks for the paper. At the very least it’s a good place to start.

            The problem is, my data are not signal strength at given times. My data are transparencies of a sample at given wavelengths, and the noise comes from the limited sensitivity of the spectrometer, so as far as I know these derivatives have no physical meaning. Frankly speaking, for now I just hope that the derivatives exist and the noise is somewhat decent. Anyway, mathematics cannot answer these questions without some serious chemistry or physics.

            As for what my data really are, most probably it’s a sum of a lot of Gaussian curves with an occasional Lorentzian curve. If only I knew that they were uniformly wide…

            As I said before, the step h is given and I can’t change it. The only parameters under my control are the approximation degree and the sample width. And for a given function and step there is an optimal sample width that depends on a lot of things, including higher derivatives. The problem is, this optimal sample width is a tricky beast – miss it by ten percent and your errors go through the ceiling, so taking a wild guess and hoping for the best isn’t going to work.

            Actually, being a PDE guy I do not see much of a problem in the dependence on higher derivatives. All you have to do is get a decent equation that your function satisfies, derive some a priori estimates and then improve them with a posteriori estimates. Sounds scary, but it’s still better than blindly hoping that everything goes well. I tried, and it doesn’t. By the way, the asymptotics of the Fourier image of a function are in fact equivalent to estimates on the higher derivatives, so you must have them one way or another, and in either case you can’t do it by mathematics alone. Sigh.

            Anyway, thanks for helping a newbie.

  3. Carlo
    Posted September 15, 2011 at 9:59 pm | #

    Dear Pavel,
    I do Monte Carlo simulations of self-avoiding walks on the lattice as a model of polymers.
    I use the low-noise differentiator to calculate the derivative of the function Entropy(Energy) obtained from the simulations.
    How do I cite the low-noise differentiator method in a scientific paper?
    I think it is not possible to cite a web site.
    Is there an original paper where the method is presented?

    Thank you,

  4. Daniel Walsh
    Posted January 29, 2022 at 5:27 am | #

    Hello, Pavel,

    I am interested in your discussion of the low-noise Lanczos differentiators on http://www.holoborodko.com/pavel/numerical-methods/numerical-derivative/lanczos-low-noise-differentiators/. The idea is very natural to me. I noticed that on your page you describe results for M=2 (regular) and M=4 (super). I can’t help but wonder why we only consider even M. Lowering M decreases the number of degrees of freedom in the approximating polynomial, so my intuition tells me that we will have a less noisy result because we’re farther from overfitting. Am I missing something?
