The amount of information a symbol contains is related to its probability of occurrence, and hence to its entropy. If you process a lot of text, you can work out the relative probabilities of the letters.
Once you know this, you can calculate their maximum possible compression. The perfect compression algorithm for the English language could compress a letter down to just over one bit.
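The two steps above can be sketched in code: count how often each letter occurs, then apply Shannon's formula H = -Σ p·log₂(p). This is a minimal sketch (the helper name and sample strings are illustrative); note that this single-letter estimate lands around 4 bits per letter for English prose, while the "just over one bit" figure also exploits the predictability of each letter from its context, as Shannon estimated.

```python
from collections import Counter
from math import log2

def letter_entropy(text: str) -> float:
    """First-order Shannon entropy, in bits per letter,
    of the alphabetic characters in text."""
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    # H = -sum(p * log2(p)) over the observed letter frequencies
    return -sum((n / total) * log2(n / total) for n in counts.values())

# Two symbols, equally likely: exactly 1 bit per symbol
print(letter_entropy("abab"))  # 1.0
```

Run on a large English corpus, this gives the theoretical floor for a compressor that treats each letter independently; better compressors beat it by modelling longer-range structure.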
So then I headbutted him in the crotch and ran away in tears.