Task: encoding strings ("messages") of symbols (digits)
Suppose we want to obtain an optimal set of codes for messages. Each code is a binary string that will be used for transmission of the corresponding message.
At the receiving end the code will be decoded using a decode tree.
A decode tree is a binary tree in which the external nodes represent messages. The binary bits in the code word for a message determine the branching needed at each level of the decode tree to reach the external node.
| 1 | 01 | 1 | 001 |
| M4 | M3 | M4 | M2 |
To decode efficiently, the goal is to have the lowest possible average message
length
(minimum weighted external path length)
Let P(i) = probability of ith message
L(i) = length of ith message
Therefore what we want to minimize is
Intuitively, the most common message should be the shortest
Notice how no message begins with a 1 except M4
This is the "trick"
What would message M4M4 be?
Suppose we had messages of yes, no, maybe and repeat and consider the following
codes:
P(i) L(i)
| Message | P(i) | code | P(i)L(i) |
| yes | .50 | 00 | 1.00 |
| no | .40 | 01 | .80 |
| maybe | .09 | 10 | .18 |
| repeat | .01 | 11 | .02 |
The
P(i) L(i) = 2.00 this is not very low, but the messages are easy to
parse
YNYYMRN would be ?
Messages are delimited in a fixed manner.
| Message | P(i) | code | P(i)L(i) |
| yes | .50 | 0 | 1.00 |
| no | .40 | 10 | .80 |
| maybe | .09 | 110 | .27 |
| repeat | .01 | 111 | .03 |
The
P(i) L(i) = 1.60 smaller and still easy to parse.
Consider the same message: YNYYMRN
What is it? Pictorially:
The idea here is to have instantaneous codes. Messages are delimited in a
variable manner however no good digit sequence can be the prefix of a
longer good one.
| yes | P =.50 |
| no | P = .40 |
| maybe/repeat | P = .10 (.09 +.01) |
Do it again (i.e., repeat until one message keeping track of P)
yes .50 yes .50 yes .50 y/n/m/r 1.0 no .40 no .40 n/m/r .50 maybe .09 m/r .10 repeat .01
So
yes 0
so what is the message 1000111110?
Probabilities with the same value can go either way - they will result in codes
with the same average length.
The most compression (obviously) occurs when probabilities are lopsided.
Consider a file with predominately numeric data and some variables:
The text notes are on page 534 and discuss the diagram Figure 13.11 for Constructing a Huffman tree.
yes .50 yes yes 0
no .40 no --0 n/m/r 1
maybe .09 ---0 m/r --1
repeat .01 ---1

is
no 10
maybe 110
repeat 111
_ .30 _ .30 _ .30 _ .30 _ .30 023 .40
0 .20 0 .20 0 .20 0 .20 1abc .30 - .30
1 .15 1 .15 1 .15 23 .20 0 .20 1abc .30
2 .10 2 .10 abc.15 1 .15 23 .20
3 .10 3 .10 2 .10 abc .15
a .08 a .08 3 .10
b .04 bc .07
c .03
Finally
_1abc .60 0
023 .40 1