Decimation

resynth

Hi!

I'm heavily processing audio before a pitch detector and it doubles the cpu I'm spending just on detecting pitch, so I tried decimating the audio down to 22050 before the processing. This is fine as I don't need the higher harmonics for detection, and this isn't an audio path that will reach ears so I don't even need to upsample later.
After trying various approaches and a lot of reading, interspersed with banging my head on my desk, it finally worked and it did save some cpu, woohoo.

I was reading about half band FIR decimation filters and all these apparently efficient ways of doing things and these huge dB numbers for attenuation in the stopband etc. In the end I just whacked a couple of biquads on and threw out half the samples. I can't help but think there's a better way to go about things but I'm too rubbish at maths to work out what they're on about 😃
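
For anyone landing here later, the "couple of biquads, then throw out half the samples" approach can be sketched roughly as below. This is a minimal sketch and not the code actually used in this thread: the names `Biquad` and `BiquadDecimator2`, the 4th-order Butterworth Q values and the cutoff passed to `setup()` are all assumptions for illustration. The lowpass coefficients follow the well-known RBJ "Audio EQ Cookbook" formulas.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Second-order lowpass section, run in transposed direct form II.
struct Biquad {
    float b0 = 1, b1 = 0, b2 = 0, a1 = 0, a2 = 0; // normalised so that a0 == 1
    float z1 = 0, z2 = 0;                          // filter state

    void setLowpass(float sampleRate, float cutoffHz, float q) {
        assert(cutoffHz > 0 && cutoffHz < 0.5f * sampleRate && q > 0);
        const float w0 = 2.0f * 3.14159265358979f * cutoffHz / sampleRate;
        const float alpha = std::sin(w0) / (2.0f * q);
        const float cosw0 = std::cos(w0);
        const float a0 = 1.0f + alpha;
        b0 = (1.0f - cosw0) * 0.5f / a0;
        b1 = (1.0f - cosw0) / a0;
        b2 = b0;
        a1 = -2.0f * cosw0 / a0;
        a2 = (1.0f - alpha) / a0;
        z1 = z2 = 0;
    }

    float process(float x) {
        const float y = b0 * x + z1;
        z1 = b1 * x - a1 * y + z2;
        z2 = b2 * x - a2 * y;
        return y;
    }
};

// Two cascaded lowpass biquads followed by "keep one sample in two".
class BiquadDecimator2 {
public:
    void setup(float inputSampleRate, float cutoffHz) {
        // Assumed Q values: a 4th-order Butterworth split into two sections.
        stage1.setLowpass(inputSampleRate, cutoffHz, 0.54119610f);
        stage2.setLowpass(inputSampleRate, cutoffHz, 1.30656296f);
        phase = 0;
    }

    // Reads n input samples, writes every second filtered sample to `out`
    // (which must hold at least (n + 1) / 2 floats) and returns the count.
    std::size_t process(const float* in, std::size_t n, float* out) {
        std::size_t written = 0;
        for (std::size_t i = 0; i < n; ++i) {
            // The IIR state has to be updated for every input sample,
            // including the ones that get thrown away afterwards.
            const float y = stage2.process(stage1.process(in[i]));
            if (phase == 0)
                out[written++] = y;
            phase ^= 1u;
        }
        return written;
    }

private:
    Biquad stage1, stage2;
    unsigned phase = 0;
};
```

Typical use would be one `setup(44100, cutoff)` call and then `process()` once per block. The cutoff has to sit below the new Nyquist (11025 Hz when going from 44100 to 22050), and how far below is a trade-off between aliasing and how much of the band you keep.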

Has anyone played around with this on Bela? (I've searched the forum and docs to no avail.) I'd be interested to know if there's a better way. I'd like to step down again to 11025 and suspect this is a trick I'll want to use wherever possible.

giuliomoro

resynth I was reading about half band FIR decimation filters and all these apparently efficient ways of doing things and these huge dB numbers for attenuation in the stopband etc. In the end I just whacked a couple of biquads on and threw out half the samples. I can't help but think there's a better way to go about things but I'm too rubbish at maths to work out what they're on about

That's what I would have done, too. Unless you care about phase distortion you are probably fine. There are some decimation filters in ne10 by the way. This is the same library we use for FFT. Their API is VERY "old-style" embedded C (partly because they use their own data types for everything in order to ensure data alignment), but if you look at the Bela Fft or Convolve library, you will hopefully get a sense of how to approach these and turn them into a useful Bela library (wink wink nudge nudge).

resynth

giuliomoro you are probably fine.

excellent thanks, I thought it might be one of those things where I hadn't realised why what I was doing was wrong! You gotta watch out for those unknown unknowns.

I'll have a look at ne10 decimation filters, I know the Fft and Convolve libraries reasonably well. I'll try and emulate what you've done there and we'll see if I can get it to library standard!

giuliomoro

Or you may just find out that biquads are cheaper and they are good enough for your needs.

resynth

Every time I've developed something I've tried to put it in a class, and every time I've failed.

This time I think it worked! (though I do have QWERTY imprinted backwards on my forehead ;)

https://github.com/resynth/Decimator

I've only done a very dirty test on it so far but audio goes in and comes out the other end without noticeable errors.

resynth

In my excitement I forgot I haven't actually finished it! If you put any factor other than 2 it won't work.

I guess, if it's to be a library, there's a question of what functionality it should be capable of, i.e.:
how many possible decimation factors?
is it worth including high-quality and low-quality options (more or fewer filter coeffs)?

Otherwise I'm just making it suit my original intended application.

giuliomoro

resynth how many possible decimation factors?

What is the cost of adding more?

Thanks for sharing your code. I had a quick look at it, so some comments:

How are those firFilterCoeffs computed?
Why is numTaps fixed to 9? Wouldn't it be better exposed publicly? (especially if there is an easy way to automate the computation of the firFilterCoeffs).
You have double tabs in the class declaration, should be single tab
for the private variables, I'd avoid the tabs after the type to align the names. It is hard to maintain and also to get right the first time (and it depends on tab size).
you have plenty of trailing whitespaces
is there a way to avoid having #define ENABLE_NE10_FIR_DECIMATE_FLOAT_NEON in the header file?

How is this performing for you in terms of CPU and suitability for your application with respect to having a few lowpass biquads?

resynth

giuliomoro How are those firFilterCoeffs computed?

I used the online calculator Tfilter. It's actually the first time I've used it, so rather than suggesting these are necessarily the final coeffs, they're just what's there for now!

giuliomoro Why is numTaps fixed to 9? Wouldn't it be better exposed publicly? (especially if there is an easy way to automate the computation of the firFilterCoeffs).

My idea for numTaps was to keep it private but set it in setup depending on the passed in decimation factor (and quality factor if implemented). I just got excited that the thing had actually worked and forgot I hadn't finished 😃

Funnily enough I was actually going to ask you if you know a function for generating the coeffs in setup so they can be bespoke for the given args passed in?

The ne10 library states user is responsible for providing their own coeffs so I think it's a case of looking elsewhere. If you dunno either I'll do some reading...

Tabbing and whitespaces I'll sort then. Some people's code I've seen is easier to read than others' and I guess I'm going through a phase of playing around in that respect but yes, probably best to keep it simple.
Do the trailing whitespaces cause any issues or is it just a case of being neat?

giuliomoro is there a way to avoid having #define ENABLE_NE10_FIR_DECIMATE_FLOAT_NEON in the header file?

I have no idea! Bela wasn't accepting function call ne10_fir_decimate_float_neon (can't remember the specific error message). #define ENABLE_NE10_FIR_FLOAT_NEON was at head of Convolver.h so I copied it over and put DECIMATE in the middle. It worked so I asked no more questions.

I intend to test the performance but have only had a quick glance down at cpu in my quick dirty decimation test in render. cpu is at 11 or 12% so, like the biquad, a single instance isn't really noticeable. I'll try stacking a few up and comparing.
In theory it ought to perform better than the biquads: because an FIR has no feedback, the function only has to filter the samples it's actually outputting. This seems to be the point of using FIR for decimation.
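
The "only filter the samples you output" idea can be sketched in plain C++ as below. This is an illustration of the principle and not how Ne10 implements it: the class name `FirDecimator` and its interface are made up for the example, and a NEON version would vectorise the inner loop.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// FIR decimator that stores every input sample but only runs the
// (expensive) dot product for the outputs that are actually kept.
class FirDecimator {
public:
    FirDecimator(std::vector<float> coeffs, std::size_t factor)
        : h(std::move(coeffs)), M(factor), history(h.size(), 0.0f) {
        assert(!h.empty() && M >= 1);
    }

    // Reads n input samples, writes one output per M inputs to `out`
    // and returns how many outputs were written.
    std::size_t process(const float* in, std::size_t n, float* out) {
        std::size_t written = 0;
        for (std::size_t i = 0; i < n; ++i) {
            // Cheap part, done for every input: store it in the circular history.
            history[pos] = in[i];
            if (++count == M) {
                count = 0;
                // Costly part, done once per M inputs: y = sum of h[k] * x[n - k].
                float acc = 0.0f;
                std::size_t idx = pos;
                for (std::size_t k = 0; k < h.size(); ++k) {
                    acc += h[k] * history[idx];
                    idx = (idx == 0) ? history.size() - 1 : idx - 1;
                }
                out[written++] = acc;
            }
            pos = (pos + 1 == history.size()) ? 0 : pos + 1;
        }
        return written;
    }

private:
    std::vector<float> h;       // filter coefficients (taps)
    std::size_t M;              // decimation factor
    std::vector<float> history; // last h.size() input samples
    std::size_t pos = 0;        // write position in history
    std::size_t count = 0;      // inputs seen since the last output
};
```

A recursive filter can't skip work like this, because each output depends on the previous outputs, so every intermediate sample has to be computed even if it is discarded.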
It's a shame we have to use memcpy like that, I'm guessing it's not hugely efficient doing that but it's the only way to change the variable type?
In terms of the Convolver and Fft libraries, were they tested for performance relative to other methods? It's interesting reading around the forums and seeing so many cases of people saying they slowed their code down by using neon intrinsics, are surprised and want to know why! There are a lot of delicious looking functions in the ne10 library so I'll be interested to see how this fares performance wise before looking at other desirables.

giuliomoro

resynth It's a shame we have to use memcpy like that, I'm guessing it's not hugely efficient doing that but it's the only way to change the variable type?

memcpy() is normally very well optimised. It's only slightly inconvenient to use.
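
As a rough sketch of that pattern (not the code from the Decimator repo): copying a block of ordinary `float` samples into a buffer declared with a library's own float typedef. `lib_float32_t` and `copyToLibBuffer` are hypothetical names standing in for whatever types and buffers the library actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a library's own 32-bit float typedef.
using lib_float32_t = float;

// Copy `count` samples from an ordinary float buffer into a library-typed buffer.
inline void copyToLibBuffer(lib_float32_t* dst, const float* src, std::size_t count) {
    static_assert(sizeof(lib_float32_t) == sizeof(float),
                  "memcpy assumes both types have the same size and layout");
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, count * sizeof(float));
}
```

For a block of a few hundred floats this is a single bulk copy, which is usually small next to the filtering itself.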

resynth In terms of Convolver and Fft libraries were they tested for performance relative to other methods?

The ne10 FIR used for Convolve is in the same speed ballpark as some finely hand-crafted assembly. In fact I believe most of the Ne10 is written in assembly. We didn't test the speed of the FFT stuff, but it would be interesting to do it against e.g.: fftw3. However, I expect Ne10 to win or to break even.

resynth I have no idea! Bela wasn't accepting function call ne10_fir_decimate_float_neon (can't remember the specific error message). #define ENABLE_NE10_FIR_FLOAT_NEON was at head of Convolver.h so I copied it over and put DECIMATE in the middle. It worked so I asked no more questions.

I'll look into a neater solution for that.

resynth Funnily enough I was actually going to ask you if you know a function for generating the coeffs in setup so they can be bespoke for the given args passed in?

I wouldn't know...maybe something in libresample? Ultimately if it's just plain old window-based filter design ... it's probably:
a) not too hard to do in C++
b) not worth the time of doing it in C++

It could be a good enough solution to provide a few presets and then point users to the site.

Or if you want to do it by hand in C++ ... I should have some slides that explain the process from a course I was teaching. Those were explaining the theory and perhaps using MATLAB to implement it, but the port shouldn't be hard.
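
For reference, plain window-based design is short enough to sketch here. This is a generic windowed-sinc lowpass with a Hamming window, written under my own assumptions and not taken from the slides or from Tfilter: the function name `designLowpassFir` and the choice of window are illustrative only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Windowed-sinc lowpass design.
// `cutoff` is normalised to the input sample rate (cycles per sample),
// so it must lie between 0 and 0.5.
std::vector<float> designLowpassFir(std::size_t numTaps, double cutoff) {
    assert(numTaps >= 2 && cutoff > 0.0 && cutoff < 0.5);
    const double pi = 3.14159265358979323846;
    const double centre = 0.5 * (numTaps - 1);
    std::vector<double> h(numTaps);
    double sum = 0.0;
    for (std::size_t n = 0; n < numTaps; ++n) {
        const double t = n - centre;
        // Ideal lowpass impulse response: sin(2*pi*fc*t) / (pi*t), with limit 2*fc at t = 0.
        const double ideal = (std::fabs(t) < 1e-12)
                                 ? 2.0 * cutoff
                                 : std::sin(2.0 * pi * cutoff * t) / (pi * t);
        // Hamming window tapers the truncated response to reduce stopband ripple.
        const double window = 0.54 - 0.46 * std::cos(2.0 * pi * n / (numTaps - 1));
        h[n] = ideal * window;
        sum += h[n];
    }
    // Normalise so that the gain at DC is exactly 1.
    std::vector<float> coeffs(numTaps);
    for (std::size_t n = 0; n < numTaps; ++n)
        coeffs[n] = static_cast<float>(h[n] / sum);
    return coeffs;
}
```

For decimation by 2 the cutoff would need to be below 0.25 (the new Nyquist), e.g. `designLowpassFir(9, 0.2)`. With only 9 taps the transition band is wide, so more taps buy a steeper slope at more cpu.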

resynth

That's very impressive performance from the Convolver and Fft libraries.

giuliomoro Or if you want to do it by hand in C++ ...

Thanks but I'll keep it simple regarding coeffs I think, already going down enough of a rabbit hole (relative to apparently being a musician)

giuliomoro

Wise choice

resynth

I've tried the decimator out on a couple of applications and it's working well.

Did a couple of rough performance tests relative to biquads.
Approximated the biquad slope with FIR coefficients and ran the decimator function ~900 times at blocksize 128, which came to ~80% cpu.
With biquads I only managed around 300 or so.
As that's not how it'll actually be used I tried a 256 tap decimator filter vs 12 biquads (back of envelope calculation gave a slope in a similar ballpark). At blocksize 4 this was ~35% for Neon Fir decimator and ~45% for biquad.

I guess the concept of only filtering the samples that will be used does pay off 🙂

"Finished" (nearly?) version with example on github https://github.com/resynth/Decimator