PDF version of this document

next up previous contents index
Next: MDL MODELS Up: INFORMATION THEORY Previous: Entropy   Contents   Index


MDL

Minimum description length [] provides a measure of the minimal amount of information necessary to encode some data. Any data can be transformed in a particular way so that it becomes a sequence of symbols (numbers or signals even, to be less general)4.1. The transition between one symbol to another can be encoded by some transition table which holds the probabilities of all possible transitions. For $n$ symbols, up to $n^{2}$ transition need be defined. Markov chains are one such model type which is a convenient way of explaining the nature of MDL. Markov chains of a high order can accommodate for data of more awkward and unpredictable variance. MDL infrequently defaults to higher-order models (see examples at [WWW-14]) which are superiorly expressive, although they require a far greater number of transitions to be specified.

With proper use of models as in the case above, data can and should be represented solely by all transitions and can then essentially replicated from these transitions. Unless the data is peculiar and shows no patterns, such a description would be compact for data large enough in bulk. MDL attempts to describe the extent to which some data is capable of diminishing in bulk (with or without loss being a separate issue) or rather the minimal amount of information that needs to be available to describe and reconstruct that data. In most cases, the more uncertainty present (i.e. higher entropy), the greater the minimal description length would be.

As an example, here is a vector representation of some arbitrary data: $v=\{3,4,3,1,1,3,4,2,2\}.$ There is an alternative way of representing this data. By using a first-order transition table, e.g. $3\rightarrow4:$1, $3\rightarrow2:0$..., the likelihood or probability of transition from every element to its successor is revealed. Observable patterns are merely meaningless in this data example. Encoding of transitions will also be an inefficient approach as a result of the small vector size and the low sequential correlation4.2. On the sharp contrary, vectors such as $v=\{0\}$ or $v=\{1,1,1,1,1,1,1,1,1,1,1\}$ bear a very small measure of uncertainty. In the second of these4.3, only one transition exists so it can be represented by a tiny model and the entropy is 0.

To summarise, MDL is a measure of the minimal amount of information that expresses a sequence4.4. By inspecting transitions it is possible to get an insight into the complexity of some model. A heart beat pattern, for instance, is rather predictable and repetitive in comparison with the positions of a person's fingers over time. This means that the description length of the heart state should be shorter than that of the hand. Less information is required to capture the behaviour of the heart in motion (heart beats are not sporadic).

Issues of transitions and understanding of data patterns will later be explained in a different context. Rather than reconstructing vectors, images need to be reconstructed by the least number of bits. These bits are permitted to permute quantised natural numbers as the above arguments suggest.


next up previous contents index
Next: MDL MODELS Up: INFORMATION THEORY Previous: Entropy   Contents   Index
2004-08-02