hi there,

i tried several times to create a vocoder patch in puredata, approaching it the same way an analog vocoder works but i always failed horribly.

are there any vocoder patches out there that actually work in realtime and could be ported to bela? without fft but with tuned bandpass filters...

I made one in Pd about 7 years ago, that is pre-Bela. I think it worked okay. It used dynamic patching to generate an arbitrary number of bands at runtime. I will look into my old laptop around June 5th to see if I can find it.

It was time-domain, so it should work just fine on Bela in Pd or even Heavy (if you just avoid the dynamic patching bit).

I'm actually developing a digital vocoder right now using a bank of tuned bandpass filters. The filters are the bandpass output of several 2-pole state variable filters that I designed using the zero delay feedback method.
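
For readers who have not met the zero-delay-feedback idea before, here is a rough Python sketch of a 2-pole state variable filter with a band-pass output. It is not the capstone code mentioned above. It assumes the widely used trapezoidal-integrator form, and the names `SvfBandpass` and `steady_state_peak` are made up for this illustration.

```python
import math


class SvfBandpass:
    """2-pole state variable filter, trapezoidal (zero-delay feedback) form."""

    def __init__(self, fc, q, fs):
        g = math.tan(math.pi * fc / fs)  # prewarped integrator gain
        self.k = 1.0 / q                 # damping, the inverse of Q
        self.a1 = 1.0 / (1.0 + g * (g + self.k))
        self.a2 = g * self.a1
        self.a3 = g * self.a2
        self.ic1eq = 0.0                 # state of the first integrator
        self.ic2eq = 0.0                 # state of the second integrator

    def process(self, x):
        v3 = x - self.ic2eq
        v1 = self.a1 * self.ic1eq + self.a2 * v3               # band-pass
        v2 = self.ic2eq + self.a2 * self.ic1eq + self.a3 * v3  # low-pass
        self.ic1eq = 2.0 * v1 - self.ic1eq
        self.ic2eq = 2.0 * v2 - self.ic2eq
        return self.k * v1  # scaled so the gain at fc is 1


def steady_state_peak(filt, freq, fs, seconds=0.5):
    """Feed a unit sine through filt and return the peak of the second half."""
    n = int(seconds * fs)
    out = [filt.process(math.sin(2.0 * math.pi * freq * i / fs)) for i in range(n)]
    return max(abs(y) for y in out[n // 2:])
```

Calling `steady_state_peak(SvfBandpass(1000.0, 5.0, 44100.0), 1000.0, 44100.0)` should give a value close to 1, and test tones two octaves away should come out much quieter. The same per-sample arithmetic should port to C++ on Bela more or less line by line.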

The project is part of my electrical engineering senior capstone. If you're interested I can share the SVF source code and ideas about how to make a vocoder out of them.

    matt yes i am interested!
    i only did puredata stuff on bela though, c on arduino and axoloti. so i might need a little assistance to get the ball rolling.

      lokki my goal is to have a version of a basic state variable filter program (which I already have) posted on my GitHub (which I need to create) by the end of the week. Making a vocoder out of that without much C++ knowledge may be difficult but not impossible. I'll be working on a vocoder in the coming weeks as well as a multi-mode auto filter.

      I will keep you posted!

      Yet another voice in the mix...and unfortunately I don't have anything helpful to say about Pd implementations.

      I know for sure the Sirlabs vocoder uses the analog-style approach using band-pass filter banks:
      https://www.sirlab.de/linux/download_vocoder.html

      I also made one for Rakarrack years ago:
      https://sourceforge.net/p/rakarrack/git/ci/master/tree/src/Vocoder.C
      https://sourceforge.net/p/rakarrack/git/ci/master/tree/src/Vocoder.h
      [audio starts at about 2:35 -- sorry bad video]

      The unique feature was implementation of dynamic range compression in the (voice) side-chain. This helped reduce how much you need to swallow the mic for it to work. It also allows for an arbitrary number of channels.

      Unfortunately this is the same situation as with what @matt has offered: it is not straightforward to implement any of these on Bela for somebody not familiar with C programming. The new challenge with Rakarrack or the sirlabs vocoder (even though these are complete, fully functioning vocoders) is pulling the right files from the sources and getting the setup, inputs & outputs correctly configured in Bela.

      As for making a patch in puredata, maybe the best answer is "try, try again". This is fun and rewarding when you do it yourself and succeed. If you understand the basic concept, then the problems in your failed attempts are likely simple. Each channel has two identically tuned band-pass filters (though you might try playing with different Q between the detector and carrier filters):
      Voice->Filter1 -> envelope detector
      Carrier->Filter2 -> variable gain cell
      Envelope detector -> sets gain in variable gain cell.
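
The three-line signal flow above can be sketched in Python roughly as follows. This is only an offline illustration, not the Rakarrack or sirlabs code. The band-pass uses the well-known Audio EQ Cookbook coefficients, and the class names, the `vocode` helper and the default attack/release times are assumptions picked for the example.

```python
import math


class Bandpass:
    """Biquad band-pass with unity peak gain (Audio EQ Cookbook coefficients)."""

    def __init__(self, fc, q, fs):
        w0 = 2.0 * math.pi * fc / fs
        alpha = math.sin(w0) / (2.0 * q)
        a0 = 1.0 + alpha
        self.b0 = alpha / a0
        self.b2 = -alpha / a0
        self.a1 = -2.0 * math.cos(w0) / a0
        self.a2 = (1.0 - alpha) / a0
        self.s1 = 0.0
        self.s2 = 0.0

    def process(self, x):
        # Transposed direct form II (b1 is zero for this filter).
        y = self.b0 * x + self.s1
        self.s1 = -self.a1 * y + self.s2
        self.s2 = self.b2 * x - self.a2 * y
        return y


class EnvelopeFollower:
    """Full-wave rectifier followed by a one-pole attack/release smoother."""

    def __init__(self, attack_s, release_s, fs):
        self.attack = math.exp(-1.0 / (attack_s * fs))
        self.release = math.exp(-1.0 / (release_s * fs))
        self.env = 0.0

    def process(self, x):
        level = abs(x)
        coef = self.attack if level > self.env else self.release
        self.env = coef * self.env + (1.0 - coef) * level
        return self.env


class VocoderChannel:
    def __init__(self, fc, q, fs, attack_s=0.005, release_s=0.02):
        self.voice_filter = Bandpass(fc, q, fs)    # Filter1
        self.carrier_filter = Bandpass(fc, q, fs)  # Filter2
        self.detector = EnvelopeFollower(attack_s, release_s, fs)

    def process(self, voice, carrier):
        # Voice -> Filter1 -> envelope detector sets the gain ...
        gain = self.detector.process(self.voice_filter.process(voice))
        # ... applied to Carrier -> Filter2 (the variable gain cell).
        return self.carrier_filter.process(carrier) * gain


def vocode(channels, voice, carrier):
    """Sum all channels, sample by sample, over two equal-length signals."""
    return [sum(ch.process(v, c) for ch in channels) for v, c in zip(voice, carrier)]
```

A full vocoder would build one `VocoderChannel` per band and pass the whole list to `vocode`. Giving the detector and carrier filters different Q values only means constructing the two `Bandpass` objects with different arguments.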

      Putting a dynamic range compressor in the voice channel helps make it a little more sensitive since you can jack up the gain to capture softer speech without overloading the circuit too badly when you scream.
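
As a rough idea of what such a side-chain compressor could look like, here is a minimal feed-forward sketch in Python. It is not the Rakarrack implementation. The threshold, ratio, make-up gain and time constants are placeholder assumptions, and `SidechainCompressor` and `tail_peak` are hypothetical names.

```python
import math


class SidechainCompressor:
    """Feed-forward compressor meant to sit in front of the voice filters."""

    def __init__(self, fs, threshold_db=-30.0, ratio=4.0, makeup_db=20.0,
                 attack_s=0.005, release_s=0.1):
        self.threshold_db = threshold_db
        self.slope = 1.0 - 1.0 / ratio  # dB of gain reduction per dB over threshold
        self.makeup_db = makeup_db
        self.attack = math.exp(-1.0 / (attack_s * fs))
        self.release = math.exp(-1.0 / (release_s * fs))
        self.env = 0.0

    def process(self, x):
        # Track the input level with an attack/release smoother.
        level = abs(x)
        coef = self.attack if level > self.env else self.release
        self.env = coef * self.env + (1.0 - coef) * level
        # Constant make-up gain, reduced only by the amount over the threshold.
        level_db = 20.0 * math.log10(max(self.env, 1e-9))
        over_db = max(level_db - self.threshold_db, 0.0)
        gain_db = self.makeup_db - self.slope * over_db
        return x * 10.0 ** (gain_db / 20.0)


def tail_peak(comp, amplitude, fs, freq=220.0, seconds=0.5):
    """Peak output level over the second half of a test sine."""
    n = int(seconds * fs)
    out = [comp.process(amplitude * math.sin(2.0 * math.pi * freq * i / fs))
           for i in range(n)]
    return max(abs(y) for y in out[n // 2:])
```

With these placeholder settings a soft input below the threshold is boosted by the full make-up gain, while a full-scale input ends up slightly attenuated, so the level difference reaching the envelope detectors shrinks considerably.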

      The "gotchas":

      1) Attack/release times on the envelope detector. If too short then the output will be too distorted. If too long then the formants will be smeared and you won't get a very prominent sound.
      2) Filter Q: higher resonance generally helps make for more intelligible speech, but if you have insufficient filter bands then this simply sounds bad...well it sounds bad when extreme no matter how many bands you have.
      3) Filter bands: At least 7
      4) Filter tuning: Evenly spaced on a logarithmic scale to cover 200 Hz to 4 kHz. Add a low-pass filter below the lowest band and a high-pass filter above the highest band. If you spread your filters out like a graphic EQ (20 Hz to 20 kHz) then you're wasting the vocoder's resolution on bands that do not contain the formants.
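
To make the tuning rule in 4) concrete, the center frequencies can be computed with a constant ratio between neighbouring bands. The sketch below uses the 7 bands and the 200 Hz to 4 kHz range from the list. `band_centers` and `crossover_q` are hypothetical helper names, and the Q formula is only an assumed starting point (adjacent second-order band-pass responses crossing at their -3 dB points), not a recommendation from this thread.

```python
import math


def band_centers(n_bands=7, f_lo=200.0, f_hi=4000.0):
    """Center frequencies spaced evenly on a logarithmic scale."""
    if n_bands < 2:
        raise ValueError("need at least two bands to span a range")
    ratio = (f_hi / f_lo) ** (1.0 / (n_bands - 1))
    return [f_lo * ratio ** i for i in range(n_bands)]


def crossover_q(centers):
    """Q at which adjacent band-pass responses cross at their -3 dB points."""
    ratio = centers[1] / centers[0]
    return math.sqrt(ratio) / (ratio - 1.0)


centers = band_centers()
print([round(f, 1) for f in centers])
print(round(crossover_q(centers), 2))
```

Since higher resonance is said above to help intelligibility, the computed Q is probably best treated as a lower bound to tune upward by ear. The same function can be pointed at a narrower range, for example `band_centers(7, 450.0, 2500.0)`.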

      I found in practice that tuning to the typical Crybaby wah range (450 Hz to 2.5 kHz) covers almost all of the interesting formants. This is a hint that most of your bands should be focused in this range. It is all a trade-off between the number of bands and the frequency resolution in the range where formants are dominant.

      Each of those parameters can be run-time adjustable. If you design your patch to allow you to dial in these parameters then you can find out what works best for your voice and carrier source.


        ryjobil As for making a patch in puredata, maybe the best answer is "try, try again". This is fun and rewarding when you do it yourself and succeed.

        yes! maybe i have to clarify a little... the approach i took (bandpass filters for the modulator, measuring the output of those with an env~ and then applying those levels to bp filters that are equally tuned but filter the carrier signal) worked ok in puredata. the quality was just never even close to the analog vocoder i possess. (mam vf-11)

        but probably you are right, i have to fine-tune it some more. the rakarrack version you posted already sounds quite good!!

        This Music From Outer Space (MFOS) vocoder schematic could be a good start for selecting initial values. The designer was even kind enough to put the center frequency, Q and relative channel gain on each sheet in the schematic:
        http://musicfromouterspace.com/analogsynth_new/VOCODER2013/pdf/VOCODER_BOOK_document.pdf
        Actually his whole document on this is pretty good (makes me want to actually build one):
        http://musicfromouterspace.com/analogsynth_new/VOCODER2013/VOCODER2013.php#SCHEMATICS

        The envelope detect channel circuit values can lead you into the right ballpark for envelope time constants for each channel.

        [EDIT] The gain in the VCA section is linear contrary to my initial assessment. I looked up the datasheet for the LM13600 and realized the CV current-to-transconductance (gain) transfer characteristic is very nearly linear. There may still be something to toy with here in terms of loudness perception, but the analog VCA is not implemented with an exponential generator in the MFOS vocoder.

        19 days later