# Noise-robust smoothing filter

A method of noise-robust numerical differentiation was proposed by Pavel Holoborodko in this article. The novel idea is to design differentiator filter in frequency domain; it appears that it’s possible to build a family of numerical differentiators that possess compelling noise suppression and allow very efficient computation scheme.

A closely related mathematical problem is noise-robust smoothing filter. The classical approach is Savitsky-Golay smoothing; however, Savitsky-Golay filters may be suboptimal in terms of noise suppression. Magnitude response plot reveals heavy ripples at high frequencies:

Magnitude response of Savitsky-Golay smoothing filters, quadratic/cubic

It seemed natural to re-use the Pavel’s idea to design smoothing filter which would be more noise-robust. Below I will walk through task statement and derivation down to the filter coefficient formulae.

Just like with differentiator, the following conditions are imposed on the smoothing operator:

• Exactness on polynomials,
• Preciseness on low frequencies,
• Suppression of high frequencies.

I will look for solution in the form of Type I symmetrical FIR filter:

where $f_k = f(x_k),\: x_k = x^* + kh,\: \: k=-m,\dots,m,\: c_k = c_{-k},\: N = 2m + 1$ is the filter length.

## Derivation

The frequency response of Type I filter is

Following Pavel’s approach, let’s set up the following conditions on $H(\omega)$ and its derivatives:

• High tangency order with identity filter $I(\omega) = 1$ at low frequencies,
• High tangency order with $\omega$ axis at high frequencies (noise suppression).

Taking symmetrical nature of the filter into account, exactly $m + 1$ equations are needed to form a well-conditioned system. I use two equations at $w = 0$:

and take the remaining conditions from

Doing so makes the smoother exact on $1, x, x^2, x^3, x^4$. Different scheme would yield different balance between degree of polynomial exactness and noise suppression strength. However, in this article I focus on the derivation for one specific case.

So, we get:

(1)

This system of equations can be solved by general methods, but only for small $m$. In particular, R’s solve function fails for $m > 7$ which makes it hardly sufficient for practical applications. So, a specific analytical solution is needed.

To calculate $c_k$ directly, one straightforward method is Cramer’s rule. By substituting column $k + 1$ with the vector of constant terms, one gets for $k \geq 1$:

For now, let’s focus on the determinant and denote the preceding multiplier (independent of $k$) as some unknown constant $c$. It saves us from calculating $\det A$ and gives much freedom for further transformations of the matrix. We can multiply the odd columns by $-1$:

Now it’s evident that the rows $2 \ldots m – 1$ form Vandermonde matrix with $x_i = i^2, i \neq k$. Let’s denote such matrix as $V_{k}$ and, analogously, Vandermonde matrix with $x_i = i^2, i \neq l, i \neq k$ as $V_{l, k}$.

Using the top row, one gets

(2)

The determinant of “truncated” Vandermonde matrix $V_{l, k}$ can be expressed as

where $v=\prod_{0 \leq i < j \leq m} \left(j^2 – i^2\right)$ only depends on $m$, not $l$ or $k$.

Going back to (2), we obtain

Luckily enough, this formula holds also for $k = 0$ and even negative $k$.

The direct expression above is more practical as it enables to calculate $c_k$ for $m > 100$. The normalization factor can be estimated using the first equation of (1).

However, much more efficient solution can be derived. Denoting the summand expression as

one can note that

Expanding sum limits and substituting $m + l$ with $i$, we get

(3)

The rest is obtained using the Binomial theorem

which gives the well-known

(4)

and additionally, upon differentiating (4) with respect to $x$ and subsequent evaluation with $x = 1$:

Applying these equations on (3) and expanding parentheses carefully (a good test for perseverance) finally leads to

The one last thing is to determine the normalization factor $\tilde{c}$. Applying binomial sum expressions to the first equation of (1), it becomes a technical task to show that

## Results

The smoother filter of length $N$ can be written as

where

(5)

For example:

As expected, filter profiles look very attractively in frequency domain:

Magnitude response of Noise-robust smoothing filters, exact on $1, x, x^2, x^3, x^4$

It is interesting that filter coefficients above all have form $\frac{i}{2^j}$ with $i,j$ integer. In fact, this property holds for any $N$. After expanding

it appears that

Like for Pavel’s differentiator, this formula enables very computationally efficient convolution implementation using only integer addition, multiplication and binary shift. At the same time, expression (5) may be more convenient, e.g. for analysis of asymptotic behavior.

## Open problems

Some applications may demand filters with higher polynomial exactness order. For small $N$, one can get filter coefficients by solving system similar to (1) numerically:

Magnitude response of Noise-robust smoothing filters, exact on $1, x, x^2, x^3, x^4, x^5, x^6$

For larger $N$, computationally efficient solution is to be discovered.

It would be very useful to generalize the method to 2D and higher dimensions; this problem is also waiting for its researcher. Both generalizations will likely require development of more elegant method for closed-form coefficient derivation.

July 30, 2015

Andrey Paramonov, PhD
Senior Software Developer @ ACD/Labs Inc.
paramon@acdlabs.ru

1. Larry Weber
Posted September 3, 2015 at 9:07 am | #

Hi Andrey

Thanks for the very nice extension of Pavel’s method. I am enjoying testing it out for a project.

I am not a mathematician and so I don’t follow everything but I seem to run into a possible math error that I would like you to check. You state:

it appears that

Perhaps there is an error in the last term of the above equation. It seems that if m = k that the upper number of the binomial coefficient will be less than the lower number. If I understand this correctly this situation is not allowed.

Please take a look at this. It could easily be that I just don’t understand something.

Thanks,

Larry

• Andrey Paramonov
Posted September 3, 2015 at 2:20 pm | #

Hi Larry!

I use the usual definition of binomial coefficient as stated in e.g. Wikipedia:
It is the coefficient of the $x^k$ term in the polynomial expansion of the binomial power $(1 + x)^n$.

For $k$ not in $0\ldots n$ the binomial coefficient is zero.

• Larry Weber
Posted September 3, 2015 at 8:42 pm | #

Thanks for the quick clarification. Indeed you pointed to something that I did not understand. Now my program gives the same results for this equation and equation (5).

2. Paul
Posted September 10, 2015 at 12:20 am | #

Hi Andrey,

Thanks a lot for all this great information about the noise-robust smoothing filter!

Do you have the extension of the method to two dimensions? I need to numerically determine the cross second order derivative for a noise two dimensional surface.

Very appreciated!

Paul

• Andrey Paramonov
Posted September 10, 2015 at 5:48 am | #

Hello Paul!

The generalization for 2D and higher dimensions is yet to be built; no formulae are available at this time.

For the second order derivative I recommend that you use original Pavel’s low-noise differentiators.
You might also get some inspiration for 2D extensions from this article.

3. Ciriaco
Posted February 13, 2016 at 12:33 am | #

Hi Andrey,

Thanks a lot to share your knowledge!

I need to filter a signal in real time and so I need a one sided/backward filter. Can the formulas $f(x)\approx\sum_{k\doteq -m}^{m} c_{k}*f_{k}$ and (5) rearranged to achieve my scope? If so, how?

Regards
Ciriaco Pannese

• Andrey Paramonov
Posted February 13, 2016 at 1:51 am | #

Hello Ciriaco!

I think you can actually use the formulae by summing for $k=-m\ldots0$, with proper normalization. Quick consideration suggests that you will need division in this case which breaks the elegance and probably speed; however if you manage to build Type II filter then you’ll only need to multiply centered coefficients by 2, due to symmetry.

For small $m$ it’s easy to find $c_k$ with the help of modern computer math systems. Following my derivation it’s likely that closed-form expressions for any $N$ can be built; if you go for it and succeed, please don’t hesitate to share!

• Posted February 14, 2016 at 3:44 pm | #

Please note, that application of centered/symmetric filters for real-time signals introduce inevitable delay (= half of the filter in symmetric case).
Probably exponential smoothing or similar will be helpful in your case (keywords: moving average with zero-lag).

The entire financial industry is looking for the “Holy Grail” smoothing filter with minimal time lag/delay.
The thing is that time lag minimization affects other properties of a filter , e.g. adds severe overshoot – so that all properties should be balanced which is quite tricky.

Just my few cents…

4. Tero Sihvo
Posted February 14, 2016 at 7:33 pm | #

Hi Andrey,

Thank you for an inspiring text. It gives a nice and easily understandable method for designing maximally flat FIR filters.

I was especially interested in the cases where the passband and stopband are symmetric. This leads to so called half-band FIR filter structure which is computationally very attractive since all the even index filter coefficients are zero except the center coefficient which has value of 1/2. Because half of the filter coefficient values are known in advance, the design effort is reduced accordingly.

There are simple closed form design equations for calculating the coefficients of such maximally flat half-band FIR filters published in
C. Gumacos, Weighting coefficients for certain maximally flat nonrecursive digital filters. IEEE Trans. Circuits Syst. CAS-25(4), 234-235, (1978).

Best regards,

– Tero Sihvo

• Tero Sihvo
Posted February 19, 2016 at 5:31 am | #

This paper gives explicit formulas for filter coefficients in all cases.
Khan, I. R. & Ohba, R., Explicit formulas for coefficients of maximally flat FIR low/highpass digital filters, Electronics Letters 36.23 (Nov 9, 2000): 1-2.

Regards,
– Tero

• Andrey Paramonov
Posted February 19, 2016 at 11:54 pm | #

Hello Tero!
Unfortunately, I cannot access the article you pointed, but quick search by MAXFLAT suggests that my filter is indeed MaxFlat one 🙂
Do the numbers in Khan, Ohba coincide with what I derived?

• Posted March 2, 2016 at 8:49 pm | #

Here is the paper: http://goo.gl/K7JNoq

The authors use quite different idea to derive the coefficients, and do not focus on fast filters nor on relation to time domain (exactness on polys). But of course, their formula is more general (based on required cut-off frequency) and they go further to presenting the high-pass filters (only sign of the coefficients should be changed).

5. Nate Imig
Posted September 14, 2016 at 7:54 pm | #

Quick question for equation (2), was the lower bound on the second sum suppose to be $l = k+1$ ?

Since your skipping the k-th column to solve for $c_k$?

• Andrey Paramonov
Posted September 16, 2016 at 4:48 am | #

Nice catch, the typo is now fixed.
Thank you!

• Nate Imig
Posted October 6, 2016 at 11:24 pm | #

Yea just realized this doesn’t matter much since $V_{l,k} = 0$ when $l = k$.

6. Nate Imig
Posted October 6, 2016 at 10:31 pm | #

So if I’m not mistaken on the second to third step of ‘solving for the determinant of “truncated” Vandermonde matrix $V_{l, k}$’ I think there is a left out factor. Here is my work if you could verify or not that would be great:

likewise for the other product term:

resulting in the net result:

I’ve been steadily working through this **amazing** proof on my own, trying to understand & reproduce your results/derivation down to the ‘t’ with my own logic in MS Word. Very pain staking work with a limited math background, proofs of each of the steps seems necessary, but this stuff for some reason is very intriguing to me. To find your post was honestly like finding the “holy grail” for me a couple of weeks ago (which tells you how slow I am).

But Thanks again for posting this & I hope you don’t mind my questions/corrections being a noob!

• Andrey Paramonov
Posted November 9, 2017 at 11:40 pm | #

Hello Nate!
It’s been a long time since your comment, but now re-examining my proof with a fresh eye I saw that there was inaccuracy indeed. After correcting Vandermonde determinant product limits to include $i=0$, terms $l^2$, $k^2$ are now successfully reduced (please note that multiplier 4 stays, at least until next step 😉 ).
Thank you for your contribution, it did really help me now!

7. Marcello
Posted March 15, 2017 at 5:14 pm | #

Hi Andrey,
thanks for sharing your nice work!!!
I would like to smooth out noise from sensor data sampled at irregular time.
Without this extension (unevenly sampling), on synthetic data, your filter seems to do the dirty job.
Is there any kind of weighting schema to deal with that ?

Best regards.
Marcello

• Howard Lovatt
Posted December 21, 2018 at 10:15 am | #

Hi Andrey,

Nice work.

I to am interested in a formulation for non-equally spaced data.