Given a discrete probability distribution , we define the modified cumulative distribution function as: where is the usual cumulative distribution function.
Shannon-Fano-Elias Coding
For each outcome ,
Calculate the value of modified cumulative distribution function .
Calculate the length of the required binary string .
Convert the value to binary .
Trim the binary string to the required length , assign the trimmed binary string to the outcome as its codeword.
e.g. Suppose has outcomes and probabilities . Then we have the following table:
Proof Denote the Shannon-Fano-Elias code as for the th outcome . Then we have That is the codeword lies entirely in the interval between and , and these intervals are disjoint. Therefore, the code is lossless.
Proof According to the previous argument, we know that . Additionally, Hence The intervals for each codeword are thus trivially disjoint, since we know each of the interval is disjoint, thus the code is prefix free.
Shannon-Fano-Elias Decoding
Shannon-Fano-Elias Decoding
To decode a given bitstring,
start with the first bit, compute the corresponding binary interval;
if the interval is strictly contained within that of a codeword:
output the corresponding outcome
skip over any redundant bits for this codeword
repeat the process with the remaining bits
otherwise include next bit and compute the corresponding interval.
e.g. Suppose we have with and we want to decode :
We could actually stop the process once we see since .
Expected Length
Theorem
The expected length of a Shannon-Fano-Elias code on ensemble satisfies: