Sample rate / pitch correction

Sskjelten · Apr 5, 2024

Hi there, I am working on a synth emulator and have a question regarding proper sample playback when the sample rate and pitch change.

The first issue is that the samples used for my emulator are recorded at 32 kHz, while playback is at 44.1 kHz. The next issue is that the pitch changes depending on a number of factors. The most prominent factor is probably the difference in MIDI key number between the key the original sample was recorded for and the key you play in the emulator. What I do today, which gives a (surprisingly) accurate pitch, is to calculate a float value with the accumulated pitch adjustments needed and then read the closest sample value from the original instrument sample. The only problem with this solution, and the reason for my question here, is that the further away from the original pitch I get, the more I can hear artifacts in the sample playback (at low frequencies it almost sounds like it is using 8-bit samples). So clearly there needs to be some proper math / filters behind changing the sample rate / pitch during playback. I suspect this is what the libsamplerate and libsoxr libraries are used for.
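To make the "accumulated pitch adjustment" concrete, here is a minimal sketch of how such a float step could be computed. The function name `playbackIncrement` and its parameters are hypothetical, and it assumes equal temperament (a factor of 2 per 12 semitones); the actual emulator may combine further factors.

```cpp
#include <cmath>

// Hypothetical helper: number of source frames to advance per output frame.
// sourceRate / outputRate compensates for the sample-rate mismatch
// (e.g. 32000 / 44100), and 2^(semitones / 12) applies the pitch offset
// between the key the sample was recorded for and the key being played.
double playbackIncrement(double sourceRate, double outputRate,
                         int rootKey, int playedKey,
                         double extraSemitones = 0.0)
{
    const double rateRatio = sourceRate / outputRate;
    const double semitones = (playedKey - rootKey) + extraSemitones;
    return rateRatio * std::pow(2.0, semitones / 12.0);
}
```

With this step, a note one octave below the root key advances at half speed and therefore lasts twice as long, which matches the "natural string instrument" behaviour rather than a time-preserving pitch shifter.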

I really enjoy watching the Bela YouTube lectures and recently watched the episodes about the phase vocoder. They show how to use the FFT to make a pitch shifter, which is pretty similar to, but not exactly, what I want. I want samples to play longer at lower frequencies, and to be shorter at higher frequencies (just like a natural string instrument).

Any pointers on how to properly adapt sample rate / pitch for sample based synth playback? I suspect people have asked about this before, but I really could not find any exact matches to my question in the forum archive.

giuliomoro · Apr 7, 2024

In order to read an audio sample back at a different rate than the one it was recorded at, you'd normally use some sort of interpolation. In your case you are using no interpolation, just getting the closest sample; this is often called "nearest neighbour". Common simple interpolation techniques are linear interpolation and cubic interpolation. Pure Data uses nearest neighbour for [tabread~] and cubic interpolation for [tabread4~]. Linear is miles better than nearest neighbour and very simple to implement. Cubic is only slightly harder to implement and much better than linear. Linear interpolation is explained in the Bela C++ course here in the context of wavetables. The exact same approach applies to playing back audio fragments.
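As a rough illustration of the two techniques, the sketch below reads a buffer at a fractional position. All names (`frameAt`, `readLinear`, `readCubic`) are made up for this example. The cubic version is a 4-point Catmull-Rom (Hermite) interpolator, which is one common choice of cubic and not necessarily the same polynomial that [tabread4~] uses. Reads past either end of the buffer are clamped to the first or last frame, which is an assumption; a looping sample would wrap instead.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Fetch a frame, clamping the index to the valid range (assumption:
// edges repeat the first/last frame rather than wrapping).
static float frameAt(const std::vector<float>& buf, long i)
{
    if (buf.empty()) return 0.0f;
    const long last = static_cast<long>(buf.size()) - 1;
    if (i < 0) i = 0;
    if (i > last) i = last;
    return buf[static_cast<std::size_t>(i)];
}

// Linear interpolation: weighted average of the two neighbouring frames.
float readLinear(const std::vector<float>& buf, double pos)
{
    const long i = static_cast<long>(std::floor(pos));
    const float frac = static_cast<float>(pos - static_cast<double>(i));
    const float a = frameAt(buf, i);
    const float b = frameAt(buf, i + 1);
    return a + frac * (b - a);
}

// Cubic (Catmull-Rom) interpolation: uses one frame before and two after.
float readCubic(const std::vector<float>& buf, double pos)
{
    const long i = static_cast<long>(std::floor(pos));
    const float t = static_cast<float>(pos - static_cast<double>(i));
    const float ym1 = frameAt(buf, i - 1);
    const float y0  = frameAt(buf, i);
    const float y1  = frameAt(buf, i + 1);
    const float y2  = frameAt(buf, i + 2);
    const float c1 = 0.5f * (y1 - ym1);
    const float c2 = ym1 - 2.5f * y0 + 2.0f * y1 - 0.5f * y2;
    const float c3 = 0.5f * (y2 - ym1) + 1.5f * (y0 - y1);
    return ((c3 * t + c2) * t + c1) * t + y0; // Horner evaluation
}
```

In a playback loop you would keep a `double` read position, call one of these once per output frame, and then advance the position by the pitch-dependent step.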

Sskjelten · Apr 7, 2024

giuliomoro

Thank you for the input. I am familiar with linear interpolation from the LFO implementation (inspired by that video you linked), so adding cubic interpolation to sample playback does not sound like a big undertaking.

If I may ask one follow-up question: Do you know the purpose or benefit of using libsamplerate, libsoxr or similar libraries? They seem to use sinc filters and FFT to do resampling, and there seems to be much emphasis on avoiding aliasing. Are these libraries and filters for a different use case, or would this approach give even better results than e.g. cubic interpolation?

giuliomoro · Apr 7, 2024

Those libraries are for higher quality (or at least adjustable quality) sampling rate conversion. I am not familiar with libsoxr, but I do know quite a bit about libsamplerate. Sinc conversion is more mathematically accurate, but its higher computational cost makes it a poor choice for real-time implementation on embedded hardware.
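To give a feel for where the extra cost comes from, here is a naive windowed-sinc reader. It is only a sketch for illustration and not how libsamplerate or libsoxr are implemented. The names `readSinc`, `halfWidth` and `increment` are made up, the Hann window is an arbitrary choice, and frames outside the buffer are treated as silence. Each output frame costs `2 * halfWidth` multiply-adds plus trigonometric evaluations, compared with four frames for cubic interpolation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normalised sinc: sin(pi x) / (pi x), with the limit value 1 at x = 0.
static double sincNorm(double x)
{
    if (std::fabs(x) < 1e-12) return 1.0;
    const double pix = 3.14159265358979323846 * x;
    return std::sin(pix) / pix;
}

// Read buf at fractional position pos using a Hann-windowed sinc kernel.
// increment is the number of source frames advanced per output frame; when
// it exceeds 1 (pitching up) the cutoff is lowered to reduce aliasing.
// Simplification: the window length stays fixed, so the filter's transition
// band gets relatively wider as the cutoff drops.
float readSinc(const std::vector<float>& buf, double pos,
               double increment, int halfWidth = 16)
{
    const double pi = 3.14159265358979323846;
    const double cutoff = increment > 1.0 ? 1.0 / increment : 1.0;
    const long centre = static_cast<long>(std::floor(pos));
    const long last = static_cast<long>(buf.size()) - 1;
    double acc = 0.0;
    for (long k = centre - halfWidth + 1; k <= centre + halfWidth; ++k) {
        if (k < 0 || k > last) continue;   // outside the buffer: silence
        const double d = pos - static_cast<double>(k); // distance in frames
        const double window = 0.5 * (1.0 + std::cos(pi * d / halfWidth));
        acc += buf[static_cast<std::size_t>(k)]
             * cutoff * sincNorm(cutoff * d) * window;
    }
    return static_cast<float>(acc);
}
```

Production resamplers typically avoid the per-tap `sin`/`cos` calls by precomputing the kernel into a table, but the number of taps per output frame remains much larger than for linear or cubic interpolation.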