# Understanding Cell Phone Technology: Information Content

Every message begins with an information source, like someone talking on a cell phone, or a computer transmitting a file across the internet.

In the 1940s, the mathematician Claude Shannon came up with the idea of the information content of a message: the essential elements of the message that have to be preserved in order to convey the content originally intended.

The information content of a message is expressed in terms of the probabilities of its different possible symbols, and involves their logarithms. Shannon showed that the only suitable measure of information has, in fact, the same mathematical form as the thermodynamic entropy that we met two weeks ago.

Suppose I want to send one of eight letters. I could encode the letters as: 000, 001, 010, 011, 100, 101, 110, and 111. Since 8 = 2^3, the log (base 2) of 8 is 3; that is, three bits of information need to be sent to denote each of the eight letters when they are all equally likely. More generally, if the probabilities of the letters are (p0, p1, p2, p3, p4, p5, p6, p7), then the amount of information is measured by the entropy (with every log taken base 2):

H = -[p0*log p0 + p1*log p1 + p2*log p2 + p3*log p3 + p4*log p4 + p5*log p5 + p6*log p6 + p7*log p7].

What’s that negative sign doing there? Probabilities are measured as values between 0 and 1. The log of a number between 0 and 1 is negative, and therefore the log of each probability above is negative. So the negative sign outside the expression just makes the overall sum positive.

Remember that entropy measures randomness. Things appear most random when all the probabilities are equal, which is when the entropy is maximized. The more equally likely symbols there are to choose from, the greater the entropy. Entropy is thought of as being measured in bits per symbol.
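To make the formula concrete, here is a minimal Python sketch of the entropy calculation. The function name `entropy` and the skewed distribution are my own illustrative choices, not anything from Shannon; the only assumption is that the probabilities add up to 1.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits per symbol: H = -sum(p * log2(p))."""
    # Symbols with probability 0 are skipped; they contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Eight equally likely letters, as in the 000 ... 111 example.
uniform = [1 / 8] * 8

# A made-up, uneven distribution over the same eight letters (sums to 1).
skewed = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 64, 1 / 64, 1 / 64, 1 / 64]

h_uniform = entropy(uniform)
h_skewed = entropy(skewed)

print(h_uniform)  # 3.0 bits per symbol
print(h_skewed)   # 2.0 bits per symbol
```

The equally likely case gives the full three bits per letter, while the uneven one comes out lower, which is the sense in which equal probabilities maximize the entropy.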

That’s a lot of words and complicated ideas, so let’s consider another example. Suppose that the letters A, B, C, D occur in a symbol stream with probabilities ½, ¼, 1/8, and 1/8 respectively. Then the entropy is:

H = -(½ log ½ + ¼ log ¼ + 2 × (1/8 log 1/8)) = -(-1/2 - 2/4 - 6/8) = 7/4 bits per symbol.

The maximum possible entropy for four symbols would be achieved if each symbol were equally likely, i.e.

-( ¼ log ¼ + ¼ log ¼ + ¼ log ¼ + ¼ log ¼) = 2 bits per symbol.
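In practice you often don't know the probabilities in advance and have to estimate them by counting symbols. Here is a rough sketch of that idea, assuming two hypothetical symbol streams that I constructed so the letter frequencies match the two cases above exactly; the function name `entropy_of_stream` is likewise just for illustration.

```python
import math
from collections import Counter

def entropy_of_stream(stream):
    """Estimate entropy (bits per symbol) from how often each symbol occurs."""
    counts = Counter(stream)
    total = len(stream)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical stream: A, B, C, D occur with frequencies 1/2, 1/4, 1/8, 1/8.
skewed_stream = "AAAABBCD" * 100

# Hypothetical stream: all four letters occur equally often.
uniform_stream = "ABCD" * 200

print(entropy_of_stream(skewed_stream))   # 1.75
print(entropy_of_stream(uniform_stream))  # 2.0
```

Counting symbols only looks at how often each letter appears, not at the order of the letters, so this is an estimate of the per-symbol entropy under the assumption that each symbol is drawn independently.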

We can use these ideas about entropy to design efficient codes for our alphabet. By efficient, we mean two things. First, the codes compress the information to be sent by using the statistics of the alphabet, just as Morse code did for telegraphy by giving the most common letters the shortest codes. Second, the codes can add back in some redundancy so that the receiver can detect and correct errors in transmission. We’ll talk more about these ideas another day.
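As a small preview of the compression half, here is a sketch using the A, B, C, D example. The particular code table is my own assumption, picked for illustration: likelier letters get shorter codewords, and no codeword is the beginning of another, so the bits can be decoded without separators.

```python
# Probabilities from the A, B, C, D example.
probabilities = {"A": 1 / 2, "B": 1 / 4, "C": 1 / 8, "D": 1 / 8}

# Illustrative code table (an assumption, not from the original text).
code = {"A": "0", "B": "10", "C": "110", "D": "111"}

def encode(message):
    """Replace each letter with its codeword and join the bits together."""
    return "".join(code[symbol] for symbol in message)

def decode(bits):
    """Read bits left to right, emitting a letter whenever a codeword is completed."""
    reverse = {word: symbol for symbol, word in code.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            decoded.append(reverse[current])
            current = ""
    return "".join(decoded)

# Average number of bits sent per letter, weighted by how often each letter occurs.
average_length = sum(p * len(code[s]) for s, p in probabilities.items())

print(average_length)             # 1.75
print(encode("ABACAD"))           # 01001100111
print(decode(encode("ABACAD")))   # ABACAD
```

For these particular probabilities the average of 1.75 bits per letter matches the entropy of 7/4 bits per symbol, compared with 2 bits per letter for a fixed-length code like 00, 01, 10, 11.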

Closely related to information content or entropy is the idea of channel capacity C, the amount of information a particular communications channel can carry, measured in bits per second. In his fundamental theorem for a noiseless channel, Shannon proved that, for a source with entropy H bits per symbol, there exists a way to encode the output of the source so as to transmit at an average rate of C/H - epsilon symbols per second (where epsilon is an arbitrarily small positive number), but never at an average rate exceeding C/H.
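To get a feel for the C/H limit, here is a quick back-of-the-envelope sketch. The capacity of 7000 bits per second is a made-up number chosen so the arithmetic comes out cleanly; the two entropies are the ones from the A, B, C, D example.

```python
def max_symbol_rate(capacity_bits_per_second, entropy_bits_per_symbol):
    """Upper limit C / H on the average symbol rate over a noiseless channel."""
    return capacity_bits_per_second / entropy_bits_per_symbol

capacity = 7000.0        # hypothetical channel capacity C, in bits per second
entropy_skewed = 1.75    # H for probabilities 1/2, 1/4, 1/8, 1/8
entropy_uniform = 2.0    # H for four equally likely letters

print(max_symbol_rate(capacity, entropy_skewed))   # 4000.0 symbols per second
print(max_symbol_rate(capacity, entropy_uniform))  # 3500.0 symbols per second
```

The lower the entropy of the source, the more symbols per second the same channel can carry, provided the source is encoded well.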

I’ve mentioned a lot of complicated ideas here. This is the field of information theory, and I’m only going to touch on the basic concepts, so don’t worry! We will use them in subsequent weeks as we continue to talk about cell phone technology.