Audio, Media and Codecs
🎵 Audio, Media and Codecs covers how voice is captured, compressed, transmitted, and played back in VoIP and Cloud PBX systems. From basic codecs to echo cancellation and encryption, this section contains 27 terms that explain what happens to your voice between speaking and hearing.
On this page: Codec · G.711 · G.711 u-law/PCMU · G.711 a-law/PCMA · G.729 · G.722 · Opus · iLBC · AMR · Speex · RTP · RTCP · SRTP · ZRTP · Transcoding · Sampling Rate · Narrowband Audio · Wideband Audio/HD Voice · Super-Wideband/Full-Band · Comfort Noise/CNG · VAD · Silence Suppression · Echo · Echo Cancellation/AEC · AGC · PLC · Duplex/Half-Duplex/Full-Duplex
Codec
Short for "coder-decoder." A codec compresses your voice into digital data for transmission and decompresses it at the other end. Different codecs offer different trade-offs between audio quality and bandwidth usage. Common VoIP codecs include G.711 (high quality, more bandwidth) and G.729 (lower quality, less bandwidth).
Related: Sampling Rate · Transcoding · G.711
G.711
The most widely used voice codec. It provides toll-quality audio (similar to a traditional phone call) and produces 64 kbps of audio in each direction, before packet headers are added. G.711 is supported by virtually every VoIP device. It comes in two variants: u-law (used in North America and Japan) and a-law (used in Europe and most other regions).
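The 64 kbps figure covers the audio payload only. As a rough sketch of what a call costs on the wire, the snippet below adds the usual 40 bytes of IPv4, UDP, and RTP headers to each packet. It assumes 20 ms packets and ignores link-layer framing; the function name is illustrative, not part of any library.

```python
def ip_bandwidth_kbps(codec_kbps: float, packet_ms: int = 20,
                      header_bytes: int = 40) -> float:
    """Approximate one-way IP bandwidth of an RTP stream, in kbps.

    header_bytes assumes IPv4 (20) + UDP (8) + RTP (12). Ethernet framing,
    VLAN tags, and SRTP authentication tags are not counted.
    """
    packets_per_second = 1000 / packet_ms
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


# G.711: 160 payload bytes + 40 header bytes, 50 packets per second
g711_kbps = ip_bandwidth_kbps(64)   # 80.0
# G.729: 20 payload bytes + 40 header bytes, 50 packets per second
g729_kbps = ip_bandwidth_kbps(8)    # 24.0
```

Under these assumptions the headers add a fixed 16 kbps per direction, which is why low-bitrate codecs save less on the wire than their payload rate suggests.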
Related: G.711 u-law/PCMU · G.711 a-law/PCMA · Codec
G.711 u-law / PCMU
The North American variant of G.711, also used in Japan. It uses a method called "mu-law companding," which gives quiet sounds finer resolution than loud ones so that each audio sample fits in 8 bits. PCMU is the name this variant carries in RTP and SDP. If you are connecting to American or Canadian phone systems, they will typically use this variant. European systems usually use the a-law variant instead.
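The idea behind companding can be sketched with the continuous mu-law curve. This is an illustration only: real G.711 encoders use a piecewise-linear approximation of this curve that outputs 8-bit codes, and the function names here are hypothetical.

```python
import math

MU = 255  # companding parameter used by mu-law


def mu_law_compress(x: float) -> float:
    """Map a sample in [-1.0, 1.0] onto the companded scale in [-1.0, 1.0]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


# A quiet sample at 1% of full scale lands above 20% of the companded
# scale, so quiet speech gets far more of the available code values.
quiet = mu_law_compress(0.01)
```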
Related: G.711 · G.711 a-law/PCMA
G.711 a-law / PCMA
The European variant of G.711, also used in most countries outside North America and Japan. It uses "A-law companding" to fit each audio sample into 8 bits. PCMA is the name this variant carries in RTP and SDP. This is the standard codec for phone networks in Europe, including Luxembourg. When your Cloud PBX connects to European carriers, it will most likely use G.711 a-law.
Related: G.711 · G.711 u-law/PCMU
G.729
A codec that compresses voice to only about 8 kbps, using much less bandwidth than G.711. The trade-off is slightly lower audio quality. G.729 was popular when internet connections were slow and expensive. Today, with fast broadband widely available, most providers prefer G.711 or HD codecs like G.722.
Related: Codec · G.711 · Bandwidth
G.722
A wideband codec that delivers HD Voice quality, making calls sound noticeably clearer and more natural than standard G.711. It uses the same 64 kbps bandwidth as G.711 but captures a wider range of sound frequencies. Most modern IP phones support G.722, and many Cloud PBX providers enable it by default.
Related: Wideband Audio/HD Voice · Codec · Opus
Opus
A modern, open, royalty-free codec designed for both voice and music and standardised by the IETF as RFC 6716. Opus can adapt its quality and bitrate in real time, from about 6 kbps up to 510 kbps, based on network conditions. It supports everything from narrowband voice to full-band audio. Opus is widely used in WebRTC (browser-based calling) and is considered one of the best codecs available today.
Related: Codec · G.722 · WebRTC
iLBC (Internet Low Bitrate Codec)
A codec designed to perform well even on networks with high packet loss, because each frame is encoded independently of the ones before it. It runs at 13.33 kbps (30 ms frames) or 15.2 kbps (20 ms frames) and produces acceptable voice quality. iLBC is useful for connections where network conditions are unpredictable, though Opus has largely replaced it in modern systems.
Related: Codec · PLC · Packet Loss
AMR (Adaptive Multi-Rate)
A codec primarily used in mobile phone networks (GSM, 3G). AMR adjusts its bitrate based on network conditions. It is rarely used in office VoIP systems, but you may encounter it when calls pass between mobile networks and your Cloud PBX, requiring transcoding.
Related: Transcoding · Codec
Speex
An older open-source codec designed for VoIP. Speex supports narrowband, wideband, and ultra-wideband audio. It was popular before Opus was created. Today, most systems have replaced Speex with Opus, which offers better quality and flexibility. You may still find Speex in older VoIP equipment.
Related: Opus · Codec
RTP (Real-time Transport Protocol)
The protocol that carries the actual voice audio during a VoIP call. While SIP sets up and manages the call, RTP delivers the sound. Each RTP packet contains a small piece of compressed audio (typically 20 ms). RTP usually runs over UDP because speed is more important than guaranteed delivery for real-time audio.
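To make the packet layout concrete, here is a minimal sketch that builds and parses the 12-byte fixed RTP header (version 2, with no padding, extension, or CSRC entries). The function names are hypothetical; payload type 0 is the static assignment for PCMU, and the payload bytes are just a placeholder.

```python
import struct


def build_rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
                     payload_type: int = 0, marker: bool = False) -> bytes:
    """Prepend a 12-byte RTP fixed header to an audio payload."""
    byte0 = 2 << 6                                   # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed header fields of an RTP packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# 20 ms of 8 kHz G.711 audio is 160 bytes, so the timestamp advances
# by 160 from one packet to the next.
packet = build_rtp_packet(b"\xff" * 160, seq=1, timestamp=160, ssrc=0x1234)
fields = parse_rtp_header(packet)
```

The sequence number lets the receiver spot lost or reordered packets, while the timestamp tells it when each piece of audio should be played.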
Related: RTCP · SRTP · UDP
RTCP (RTP Control Protocol)
A companion protocol to RTP that sends statistics about call quality, such as packet loss, jitter, and round-trip delay. RTCP does not carry voice audio. Instead, it provides feedback that helps devices adjust their behaviour (for example, switching to a lower bitrate codec if the network is congested).
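The jitter figure in an RTCP report is a running estimate of how much packet spacing varies on arrival. A minimal sketch of the estimator described in RFC 3550 follows, assuming send and arrival times are already expressed in the same RTP timestamp units; the function name is illustrative.

```python
def interarrival_jitter(send_ts, recv_ts):
    """Running interarrival jitter estimate, in timestamp units.

    For each consecutive pair of packets, d is the change in transit
    time; the estimate moves 1/16 of the way towards |d| each time.
    """
    jitter = 0.0
    for i in range(1, len(send_ts)):
        d = (recv_ts[i] - recv_ts[i - 1]) - (send_ts[i] - send_ts[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


# Packets sent every 160 units; the third one arrives 80 units late.
steady = interarrival_jitter([0, 160, 320], [0, 160, 320])   # 0.0
late = interarrival_jitter([0, 160, 320], [0, 160, 400])     # 5.0
```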
Related: RTP · Jitter · Packet Loss
SRTP (Secure RTP)
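A secure profile of RTP that encrypts and authenticates voice audio so that it cannot be intercepted or altered during transmission. SRTP encrypts the audio in each packet so that anyone capturing network traffic cannot listen to the conversation. The encryption keys are normally exchanged through the signalling channel, which is why SRTP is usually paired with TLS. Many modern Cloud PBX systems support SRTP, and some enable it by default to keep calls private.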
Related: RTP · ZRTP · TLS
ZRTP
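A protocol for negotiating encryption keys for voice calls without needing a central server. ZRTP performs a Diffie-Hellman key exchange directly between the two endpoints at the start of the call and then encrypts the audio using SRTP. The callers can compare a short code shown on both devices to confirm that nobody is intercepting the call. It was designed to provide end-to-end encryption even when the PBX or provider does not manage encryption centrally.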
Related: SRTP · RTP
Transcoding
The process of converting audio from one codec to another during a call. Transcoding happens when two devices do not support the same codec. For example, if your phone uses G.722 but the other party only supports G.711, a server in between must decode the G.722 audio and re-encode it as G.711 in real time. This adds processing load and can slightly reduce quality.
Related: Codec · G.711 · G.722
Sampling Rate
How many times per second an audio signal is measured and converted to digital data, expressed in Hertz (Hz). Standard phone calls use 8,000 Hz (8 kHz). HD Voice codecs use 16,000 Hz (16 kHz) or higher. A higher sampling rate captures higher frequencies (up to half the sampling rate), making voices sound clearer and more natural.
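Two small calculations follow from the sampling rate: how many samples fit in one packet, and the highest frequency the signal can represent. The helper names below are illustrative, and the 20 ms packet length is an assumption.

```python
def samples_per_packet(rate_hz: int, packet_ms: int = 20) -> int:
    """Number of audio samples carried in one packet of packet_ms."""
    return rate_hz * packet_ms // 1000


def highest_frequency_hz(rate_hz: int) -> float:
    """Upper limit of representable frequencies (half the sampling rate)."""
    return rate_hz / 2


narrowband = samples_per_packet(8000)    # 160 samples, up to 4,000 Hz
wideband = samples_per_packet(16000)     # 320 samples, up to 8,000 Hz
fullband = samples_per_packet(48000)     # 960 samples, up to 24,000 Hz
```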
Related: Narrowband Audio · Wideband Audio/HD Voice · Codec
Narrowband Audio
Audio captured at a sampling rate of 8 kHz, covering frequencies from about 300 Hz to 3,400 Hz. This is the quality of a traditional phone call. It is understandable but can sound muffled compared to wideband audio. Standard codecs like G.711 and G.729 produce narrowband audio.
Related: Wideband Audio/HD Voice · Sampling Rate
Wideband Audio / HD Voice
Audio captured at 16 kHz, covering a wider frequency range (typically 50 Hz to 7,000 Hz). Voices sound richer and more natural because more of the original sound is preserved. G.722 and Opus are common wideband codecs. Many modern IP phones display an "HD" indicator when a wideband call is active.
Related: Narrowband Audio · G.722 · Opus
Super-Wideband / Full-Band
Audio quality levels above standard wideband. Super-wideband samples at 32 kHz, while full-band samples at 48 kHz, approaching music quality. The Opus codec supports full-band audio. In practice, most business calls use wideband (HD Voice) at most, but super-wideband is available for high-quality conferencing.
Related: Wideband Audio/HD Voice · Opus · Sampling Rate
Comfort Noise / CNG (Comfort Noise Generation)
Artificial background noise generated during silent moments in a call. Without comfort noise, complete silence during pauses can make callers think the call has dropped. CNG produces a low, natural-sounding hiss that reassures both parties the connection is still active.
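As a very simple illustration, comfort noise can be approximated by low-level random samples. This sketch assumes 16-bit audio and plain Gaussian noise at a fixed level; real implementations usually shape the noise to resemble the caller's actual background, and the names and default level here are hypothetical.

```python
import math
import random


def comfort_noise_frame(rms_level=30.0, frame_len=160, rng=None):
    """Return frame_len 16-bit samples of Gaussian noise near rms_level."""
    rng = rng or random.Random()
    frame = []
    for _ in range(frame_len):
        sample = round(rng.gauss(0.0, rms_level))
        frame.append(max(-32768, min(32767, sample)))  # clamp to 16-bit range
    return frame


def rms(frame):
    """Root-mean-square level of a frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


# One 20 ms frame at 8 kHz, played while the far end is sending nothing.
noise = comfort_noise_frame()
```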
Related: VAD · Silence Suppression
VAD (Voice Activity Detection)
Technology that detects when someone is speaking and when they are silent. During silence, the system can stop sending audio packets to save bandwidth. VAD works together with comfort noise generation to ensure the call still sounds natural during quiet moments.
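The simplest form of VAD compares the energy of each audio frame with a threshold. The sketch below assumes 16-bit samples at 8 kHz in 20 ms frames and a fixed, arbitrary threshold; production detectors also look at the spectrum and keep transmitting briefly after speech ends. The function names are illustrative.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_voice(samples, frame_len=160, threshold=500.0):
    """Return one flag per full frame: True when the frame looks active."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_rms(frame) > threshold)
    return flags
```

Frames flagged False are the ones a sender could skip when silence suppression is enabled.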
Related: Silence Suppression · Comfort Noise/CNG · Bandwidth
Silence Suppression
A technique that reduces bandwidth by not transmitting audio packets when no one is speaking. Instead of sending empty audio frames, the system stops transmission during silence and relies on comfort noise to fill the gap at the receiver's end. This can save 30% to 50% of bandwidth on a typical call.
Related: VAD · Comfort Noise/CNG
Echo
When a speaker hears their own voice reflected back with a slight delay. In VoIP, echo is typically caused by the far-end device's speaker audio leaking into its microphone. Echoes delayed by less than about 25 ms are not noticeable, but longer delays are very distracting. Echo cancellation technology is used to remove it.
Related: Echo Cancellation/AEC
Echo Cancellation / AEC (Acoustic Echo Cancellation)
Technology that removes echo from a voice call in real time. The system learns the acoustic characteristics of the room and the device, then subtracts the reflected sound from the microphone signal. AEC is built into virtually all modern IP phones, headsets, and softphone applications.
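One common way to "learn and subtract" is a normalised least-mean-squares (NLMS) adaptive filter, sketched below. This is an illustration under simplifying assumptions (a short, fixed echo path, no local talker, no background noise), not a description of what any particular phone does; all names are hypothetical.

```python
import random


def simulate_mic(far_end, echo_path):
    """Microphone signal containing only the echo of far_end."""
    mic = []
    for n in range(len(far_end)):
        mic.append(sum(g * far_end[n - k]
                       for k, g in enumerate(echo_path) if n - k >= 0))
    return mic


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Return the mic signal with the estimated echo subtracted."""
    weights = [0.0] * taps
    history = [0.0] * taps            # recent far-end samples, newest first
    out = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate          # what is left after removing the echo
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm
                   for w, h in zip(weights, history)]
        out.append(error)
    return out


rng = random.Random(1)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = simulate_mic(far_end, [0.0, 0.5, -0.3, 0.1])
residual = nlms_echo_cancel(far_end, mic)
```

In this idealised setup the residual shrinks towards zero as the filter converges. Real rooms need much longer filters plus extra logic for moments when both parties speak at once.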
Related: Echo · AGC
AGC (Automatic Gain Control)
A feature that automatically adjusts the microphone volume so that quiet speakers are boosted and loud speakers are toned down. AGC helps maintain a consistent volume level throughout a call, regardless of how close or far the speaker is from the microphone.
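A minimal sketch of the idea: measure the level of each frame, work out the gain that would bring it to a target level, and move towards that gain gradually. The target level, gain limit, and smoothing factor are arbitrary assumptions, and the function name is illustrative.

```python
import math


def agc(frames, target_rms=3000.0, max_gain=8.0, smoothing=0.5):
    """Scale each frame of 16-bit samples towards target_rms."""
    gain = 1.0
    out = []
    for frame in frames:
        level = math.sqrt(sum(s * s for s in frame) / len(frame))
        if level > 0:
            desired = min(target_rms / level, max_gain)
            gain += smoothing * (desired - gain)   # avoid abrupt jumps
        out.append([max(-32768, min(32767, round(s * gain))) for s in frame])
    return out
```

The gain limit keeps the feature from amplifying background noise without bound when nobody is speaking.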
Related: Echo Cancellation/AEC
PLC (Packet Loss Concealment)
A technique used by codecs and VoIP devices to hide the effect of lost audio packets. When a packet is missing, PLC generates a short segment of replacement audio, usually by repeating or extending the audio received just before the gap and, if the next packet is already waiting in the jitter buffer, blending into it. This makes small amounts of packet loss nearly unnoticeable to the listener.
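The most basic form of concealment repeats the last good frame and fades it out if the loss continues. The sketch below assumes frames arrive as lists of samples with None marking a lost packet; codecs such as Opus and iLBC use more sophisticated methods, and the function name is hypothetical.

```python
def conceal_losses(frames, frame_len=160, attenuation=0.5):
    """Replace lost frames (None) with a fading copy of the last good one."""
    out = []
    last_good = [0] * frame_len
    scale = 1.0
    for frame in frames:
        if frame is None:
            scale *= attenuation                  # fade a little more each loss
            out.append([int(s * scale) for s in last_good])
        else:
            last_good, scale = frame, 1.0
            out.append(list(frame))
    return out


# Two consecutive losses between good frames (4-sample frames for brevity).
received = [[100] * 4, None, None, [50] * 4]
played = conceal_losses(received, frame_len=4)
# played == [[100]*4, [50]*4, [25]*4, [50]*4]
```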
Related: Packet Loss · Codec · iLBC
Duplex / Half-Duplex / Full-Duplex
Duplex describes whether both parties can speak at the same time. Full-duplex means both people can talk simultaneously, like a normal phone conversation. Half-duplex means only one person can speak at a time, like a walkie-talkie. Virtually all modern VoIP phones and headsets are full-duplex, although some speakerphones fall back to half-duplex to control echo.
Related: Speakerphone · Echo
Related Sections
🔗 Networking for VoIP — Bandwidth, latency, QoS, and network infrastructure
📡 SIP Protocol — The signalling protocol that sets up and manages calls
🔒 Security — TLS, encryption, and VoIP fraud prevention
🎧 Devices and Hardware — IP phones, headsets, and conference equipment