A method of noise-robust numerical differentiation was proposed by Pavel Holoborodko in this article. The novel idea is to design differentiator filter in frequency domain; it appears that it’s possible to build a family of numerical differentiators that possess compelling noise suppression and allow very efficient computation scheme.

A closely related mathematical problem is noise-robust smoothing filter. The classical approach is Savitsky-Golay smoothing; however, Savitsky-Golay filters may be suboptimal in terms of noise suppression. Magnitude response plot reveals heavy ripples at high frequencies:

Magnitude response of Savitsky-Golay smoothing filters, quadratic/cubic

It seemed natural to re-use the Pavel’s idea to design smoothing filter which would be more noise-robust. Below I will walk through task statement and derivation down to the filter coefficient formulae.

## Task statement

Just like with differentiator, the following conditions are imposed on the smoothing operator:

- Exactness on polynomials,
- Preciseness on low frequencies,
- Suppression of high frequencies.

I will look for solution in the form of *Type I symmetrical FIR filter:*

where $f_k = f(x_k),\: x_k = x^* + kh,\: \: k=-m,\dots,m,\: c_k = c_{-k},\: N = 2m + 1$ is the filter length.

## Derivation

The frequency response of Type I filter is

Following Pavel’s approach, let’s set up the following conditions on $H(\omega)$ and its derivatives:

- High tangency order with identity filter $I(\omega) = 1$ at low frequencies,
- High tangency order with $\omega$ axis at high frequencies (noise suppression).

Taking symmetrical nature of the filter into account, exactly $m + 1$ equations are needed to form a well-conditioned system. I use two equations at $w = 0$:

and take the remaining conditions from

Doing so makes the smoother exact on $1, x, x^2, x^3, x^4$. Different scheme would yield different balance between degree of polynomial exactness and noise suppression strength. However, in this article I focus on the derivation for one specific case.

So, we get:

(1)

This system of equations can be solved by general methods, but only for small $m$. In particular, R’s *solve* function fails for $m > 7$ which makes it hardly sufficient for practical applications. So, a specific analytical solution is needed.

To calculate $c_k$ directly, one straightforward method is Cramer’s rule. By substituting column $k + 1$ with the vector of constant terms, one gets for $k \geq 1$:

For now, let’s focus on the determinant and denote the preceding multiplier (independent of $k$) as some unknown constant $c$. It saves us from calculating $\det A$ and gives much freedom for further transformations of the matrix. We can multiply the odd columns by $-1$:

Now it’s evident that the rows $2 \ldots m – 1$ form Vandermonde matrix with $x_i = i^2, i \neq k$. Let’s denote such matrix as $V_{k}$ and, analogously, Vandermonde matrix with $x_i = i^2, i \neq l, i \neq k$ as $V_{l, k}$.

Using the top row, one gets

(2)

The determinant of “truncated” Vandermonde matrix $V_{l, k}$ can be expressed as

where $v=\prod_{0 \leq i < j \leq m} \left(j^2 – i^2\right)$ only depends on $m$, not $l$ or $k$.

Going back to (2), we obtain

Luckily enough, this formula holds also for $k = 0$ and even negative $k$.

The direct expression above is more practical as it enables to calculate $c_k$ for $m > 100$. The normalization factor can be estimated using the first equation of (1).

However, much more efficient solution can be derived. Denoting the summand expression as

one can note that

Expanding sum limits and substituting $m + l$ with $i$, we get

(3)

The rest is obtained using the Binomial theorem

(4)

and additionally, upon differentiating (4) with respect to $x$ and subsequent evaluation with $x = 1$:

Applying these equations on (3) and expanding parentheses carefully (a good test for perseverance) finally leads to

The one last thing is to determine the normalization factor $\tilde{c}$. Applying binomial sum expressions to the first equation of (1), it becomes a technical task to show that

## Results

The smoother filter of length $N$ can be written as

(5)

For example:

As expected, filter profiles look very attractively in frequency domain:

Magnitude response of Noise-robust smoothing filters, exact on $1, x, x^2, x^3, x^4$

It is interesting that filter coefficients above all have form $\frac{i}{2^j}$ with $i,j$ integer. In fact, this property holds for any $N$. After expanding

it appears that

Like for Pavel’s differentiator, this formula enables very computationally efficient convolution implementation using only integer addition, multiplication and binary shift. At the same time, expression (5) may be more convenient, e.g. for analysis of asymptotic behavior.

## Open problems

Some applications may demand filters with higher polynomial exactness order. For small $N$, one can get filter coefficients by solving system similar to (1) numerically:

Magnitude response of Noise-robust smoothing filters, exact on $1, x, x^2, x^3, x^4, x^5, x^6$

For larger $N$, computationally efficient solution is to be discovered.

It would be very useful to generalize the method to 2D and higher dimensions; this problem is also waiting for its researcher. Both generalizations will likely require development of more elegant method for closed-form coefficient derivation.

*July 30, 2015*

*Andrey Paramonov, PhD Senior Software Developer @ ACD/Labs Inc. paramon@acdlabs.ru*

## 22 Comments

#Hi Andrey

Thanks for the very nice extension of Pavel’s method. I am enjoying testing it out for a project.

I am not a mathematician and so I don’t follow everything but I seem to run into a possible math error that I would like you to check. You state:

it appears that

Perhaps there is an error in the last term of the above equation. It seems that if m = k that the upper number of the binomial coefficient will be less than the lower number. If I understand this correctly this situation is not allowed.

Please take a look at this. It could easily be that I just don’t understand something.

Thanks,

Larry

#Hi Larry!

I use the usual definition of binomial coefficient as stated in e.g. Wikipedia:

It is the coefficient of the $x^k$ term in the polynomial expansion of the binomial power $(1 + x)^n$.For $k$ not in $0\ldots n$ the binomial coefficient is zero.

Thanks for your feedback!

#Thanks for the quick clarification. Indeed you pointed to something that I did not understand. Now my program gives the same results for this equation and equation (5).

Thanks for your nice work.

#Hi Andrey,

Thanks a lot for all this great information about the noise-robust smoothing filter!

Do you have the extension of the method to two dimensions? I need to numerically determine the cross second order derivative for a noise two dimensional surface.

Very appreciated!

Paul

#Hello Paul!

Thank you for your comment.

The generalization for 2D and higher dimensions is yet to be built; no formulae are available at this time.

For the second order derivative I recommend that you use original Pavel’s low-noise differentiators.

You might also get some inspiration for 2D extensions from this article.

#Hi Andrey,

Thanks a lot to share your knowledge!

I need to filter a signal in real time and so I need a one sided/backward filter. Can the formulas $f(x)\approx\sum_{k\doteq -m}^{m} c_{k}*f_{k}$ and (5) rearranged to achieve my scope? If so, how?

Thanks in advance for your availability!

Regards

Ciriaco Pannese

#Hello Ciriaco!

Thank you for your interest.

I think you can actually use the formulae by summing for $k=-m\ldots0$, with proper normalization. Quick consideration suggests that you will need division in this case which breaks the elegance and probably speed; however if you manage to build Type II filter then you’ll only need to multiply centered coefficients by 2, due to symmetry.

For small $m$ it’s easy to find $c_k$ with the help of modern computer math systems. Following my derivation it’s likely that closed-form expressions for any $N$ can be built; if you go for it and succeed, please don’t hesitate to share!

#Please note, that application of centered/symmetric filters for real-time signals introduce inevitable delay (= half of the filter in symmetric case).

Probably exponential smoothing or similar will be helpful in your case (keywords: moving average with zero-lag).

The entire financial industry is looking for the “Holy Grail” smoothing filter with minimal time lag/delay.

The thing is that time lag minimization affects other properties of a filter , e.g. adds severe overshoot – so that all properties should be balanced which is quite tricky.

Just my few cents…

#Hi Andrey,

Thank you for an inspiring text. It gives a nice and easily understandable method for designing maximally flat FIR filters.

I was especially interested in the cases where the passband and stopband are symmetric. This leads to so called half-band FIR filter structure which is computationally very attractive since all the even index filter coefficients are zero except the center coefficient which has value of 1/2. Because half of the filter coefficient values are known in advance, the design effort is reduced accordingly.

There are simple closed form design equations for calculating the coefficients of such maximally flat half-band FIR filters published in

C. Gumacos, Weighting coefficients for certain maximally flat nonrecursive digital filters. IEEE Trans. Circuits Syst. CAS-25(4), 234-235, (1978).

Best regards,

– Tero Sihvo

#This paper gives explicit formulas for filter coefficients in all cases.

Khan, I. R. & Ohba, R., Explicit formulas for coefficients of maximally flat FIR low/highpass digital filters, Electronics Letters 36.23 (Nov 9, 2000): 1-2.

Regards,

– Tero

#Hello Tero!

Unfortunately, I cannot access the article you pointed, but quick search by MAXFLAT suggests that my filter is indeed MaxFlat one š

Do the numbers in Khan, Ohba coincide with what I derived?

#Here is the paper: http://goo.gl/K7JNoq

The authors use quite different idea to derive the coefficients, and do not focus on fast filters nor on relation to time domain (exactness on polys). But of course, their formula is more general (based on required cut-off frequency) and they go further to presenting the high-pass filters (only sign of the coefficients should be changed).

#Quick question for equation (2), was the lower bound on the second sum suppose to be $l = k+1$ ?

Since your skipping the k-th column to solve for $c_k$?

#Nice catch, the typo is now fixed.

Thank you!

#Yea just realized this doesn’t matter much since $V_{l,k} = 0$ when $l = k$.

#So if I’m not mistaken on the second to third step of ‘solving for the determinant of ātruncatedā Vandermonde matrix $V_{l, k}$’ I think there is a left out factor. Here is my work if you could verify or not that would be great:

likewise for the other product term:

resulting in the net result:

I’ve been steadily working through this **amazing** proof on my own, trying to understand & reproduce your results/derivation down to the ‘t’ with my own logic in MS Word. Very pain staking work with a limited math background, proofs of each of the steps seems necessary, but this stuff for some reason is very intriguing to me. To find your post was honestly like finding the “holy grail” for me a couple of weeks ago (which tells you how slow I am).

But Thanks again for posting this & I hope you don’t mind my questions/corrections being a noob!

#Hello Nate!

It’s been a long time since your comment, but now re-examining my proof with a fresh eye I saw that there was inaccuracy indeed. After correcting Vandermonde determinant product limits to include $i=0$, terms $l^2$, $k^2$ are now successfully reduced (please note that multiplier 4 stays, at least until next step š ).

Thank you for your contribution, it did really help me now!

#Hi Andrey,

thanks for sharing your nice work!!!

I would like to smooth out noise from sensor data sampled at irregular time.

Without this extension (unevenly sampling), on synthetic data, your filter seems to do the dirty job.

Is there any kind of weighting schema to deal with that ?

Best regards.

Marcello

#Hi Andrey,

Nice work.

I to am interested in a formulation for non-equally spaced data.

Thanks in advance,

— Howard.

#Hi Andrey,

Thank you for the work on these filters.

I had a couple of questions for you.

1) About half way through the page, just below equation 5, you give the results for filters of length 5, 7, 9 and 11.

It looks like filter of length 7 is missing the term for (f-2 + f+2)

If these arenāt zero, can you add them / send me a copy?

Iād love to learn how to do this but Iām not sure where to start since my math background is weak.

2) What is the difference between the table in Results (middle of the page) and Open problems?

Thank you again and all the best,

#Hi Andrey,

thank you for your work!

Is there any formula that allows me to calculate the coefficients for larger filters?

I am trying to use the method proposed in your work on a very noisy signal but I think I need quite “larger” filters.

Thank you in advance

#Great intuition! From a mechanical engineering perspective applying the Chebyshev polynomials is solving the Sturm-Liouville problem and minimizing Runge’s phenomenon.

I’m working on an anisotropic composite shell solver that suffers from oscillation with traditional central difference methods, and am interested in applying some higher order schemes.

It appears you are using polynomials of the second kind for their computational efficiency, as the coefficients are powers of two. I would be interested in comparing the frequency response to the first kind of polynomials, which although containing many primes (5,7,11) in the odd expansions appear less asymptotic at the boundaries.