General

A vocoder is a sound effect that can make a human voice sound synthetic. It is often used to speak like a robot, with a metallic and monotonous voice.

Perhaps you remember the German group Kraftwerk, which used a lot of vocoder effects in their songs ("We are the robots", for example). Vocoders are becoming popular again.

Here is a sample taken from the song "Right type of mood" by Herbie which was processed by the vocoder available in the download section:

     output.wav (48.6 KB) processed sample
     formant.wav (48.6 KB) original sample
     carrier.wav (33.9 KB) carrier sample (see below)

Background

To put it simply: whenever you speak, your voice consists of two components. The first component is your basic voice type, produced by your vocal cords. It varies in pitch but remains nearly constant in character and is quite unique. That's why you can tell people apart by their voices. The second component is how you modulate the basic voice. Modulation means that you dynamically amplify and attenuate frequencies. This is done by the mouth and tongue when you speak.

Example: Say a long "ohh". To do this, you nearly close your mouth. Next, say a long "ahh". This time, you open your mouth. Your vocal cords produce the same sound for both "ohh" and "ahh", but the modulation makes them sound different.

The modulation signal is called the formant, because it forms and shapes the basic voice, which is called the carrier because it carries the formant signal. The formant signal carries the information and has a much lower frequency than the carrier, a circumstance that can be used to reduce bandwidth consumption for telephone services. This was also the original purpose of the vocoder.

What does a vocoder do?

A vocoder aims to replace the carrier of your voice with a carrier from another source. Thus, it changes the sound of the voice but not the message. It takes formant and carrier from external sources and splits each of them into bands (a band is a range of frequencies, just as in an equalizer). Then, the envelope (the modulation) is extracted from each formant band. This is done by an envelope follower, an extreme low-pass filter. Next, the formant envelopes are modulated onto the corresponding carrier bands, and the resulting bands are mixed together into the output signal.
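The steps above can be sketched in Python as follows. This is only a minimal illustration, not the code of the vocoder offered in the download section: the function names, the band centers, the Q value, the envelope cutoff and the choice of a biquad band-pass filter (coefficients in the common Audio EQ Cookbook form) are all assumptions made for the example. The envelope follower here rectifies each formant band before low-pass filtering it.

```python
import math

def bandpass(signal, center_hz, q, rate):
    """Biquad band-pass filter (Audio EQ Cookbook form, 0 dB peak gain).
    center_hz must be below rate / 2."""
    w0 = 2.0 * math.pi * center_hz / rate
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha                      # b1 is zero for this filter
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def envelope(signal, cutoff_hz, rate):
    """Envelope follower: rectify, then apply a one-pole low-pass filter."""
    coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / rate)
    env = 0.0
    out = []
    for x in signal:
        env += coeff * (abs(x) - env)
        out.append(env)
    return out

def vocode(formant, carrier, rate,
           centers=(200, 400, 800, 1600, 3200),  # assumed band centers in Hz
           q=4.0, env_cutoff_hz=30.0):           # assumed filter settings
    """Modulate each carrier band with the envelope of the matching formant band."""
    n = min(len(formant), len(carrier))
    output = [0.0] * n
    for center in centers:
        formant_band = bandpass(formant[:n], center, q, rate)
        carrier_band = bandpass(carrier[:n], center, q, rate)
        env = envelope(formant_band, env_cutoff_hz, rate)
        for i in range(n):
            output[i] += env[i] * carrier_band[i]   # mix bands into the output
    return output
```

Calling `vocode(formant, carrier, rate)` with two mono sample lists recorded at the same rate returns the mixed output. A silent formant yields a silent output, whatever the carrier contains.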

vocoder block diagram

The benefit of doing this is that you can make the carrier speak or sing. As a side effect, the formant's voice type is irrelevant to the output, so everybody (even those with an ugly voice) can create cool and futuristic samples :-)

You usually use a human voice as the formant and an instrument as the carrier, which makes the instrument speak. Good results can be achieved with strings, brass, flutes or any other sound with nearly constant dynamics. Even chords may be used to give the result more depth.

Each input source may be a file or a soundcard (if supported), and so may the output. If you use a soundcard as a source, please note: input is sampled in stereo but internally processed as two mono channels. One channel is treated as the formant, the other as the carrier. If your soundcard supports full duplex, you may even use it as source and destination at the same time, so you can talk into the microphone and hear yourself. Unfortunately, soundcards have a high latency, so the result has a noticeable delay :-(
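Splitting a stereo input into two mono channels could look roughly like the sketch below. It assumes interleaved 16-bit little-endian PCM data, and it assumes that the left channel is the formant and the right channel the carrier; the text above does not say which channel is which, and `split_stereo` is a hypothetical helper name.

```python
import struct

def split_stereo(frames):
    """Split interleaved 16-bit little-endian stereo PCM bytes into two
    mono lists of floats in the range -1.0 .. 1.0."""
    count = (len(frames) // 4) * 2              # whole stereo frames only
    samples = struct.unpack("<%dh" % count, frames[:count * 2])
    left = [s / 32768.0 for s in samples[0::2]]
    right = [s / 32768.0 for s in samples[1::2]]
    # Assumption for this sketch: left = formant, right = carrier.
    return left, right
```

The two returned lists can then be processed independently, one as the formant and one as the carrier.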

Each band can be modified in various ways. Volume and panning are two that are already implemented.
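As an illustration of per-band volume and panning, the sketch below mixes a list of mono bands into a stereo pair. The function name `mix_bands`, the pan range of -1.0 (left) to +1.0 (right) and the constant-power pan law are assumptions for the example; the actual program may handle both differently.

```python
import math

def mix_bands(bands, volumes, pans):
    """Mix equal-length mono bands into (left, right) sample lists.
    volumes are linear gains; pans run from -1.0 (left) to +1.0 (right)."""
    if not bands:
        return [], []
    n = len(bands[0])
    left = [0.0] * n
    right = [0.0] * n
    for band, volume, pan in zip(bands, volumes, pans):
        angle = (pan + 1.0) * math.pi / 4.0     # 0 = hard left, pi/2 = hard right
        gain_left = volume * math.cos(angle)    # constant-power pan law (assumed)
        gain_right = volume * math.sin(angle)
        for i, x in enumerate(band):
            left[i] += gain_left * x
            right[i] += gain_right * x
    return left, right
```

With this pan law a band panned to the center goes to both channels at about 0.707 times its volume, so its total power stays the same wherever it is placed.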

Last update: Thu, 21 Nov 2013
Design by klHexe, Settel and Gimp. Content by Settel and FTE.