Entropy is a measure of uncertainty, which can be used to assess the complexity of patterns in data. In the context of our experiments, entropy is used to estimate the complexity of clouds of data by interpreting them as connected graphs. The distances between points in a data cloud form a matrix, which can in turn be evaluated for its complexity using a generalisation of Shannon's entropy. In the context of data clouds, we seek to measure the dispersion of the points, as well as the correlation of that dispersion when two Gaussian distributions are involved.

As a simple example, we consider a spherical normal distribution in hyperspace. We then take a second such distribution and let it gradually drift away from the first. We can estimate the entropy of each distribution and of their union, and from these derive a measure of similarity. As expected, we observe a well-behaved decrease in that measure as the clouds gradually drift apart.
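To make this set-up concrete, the following is a minimal sketch in Python (using NumPy); the dimensionality, sample sizes and drift steps are illustrative choices of ours, not values taken from the experiments.

\begin{verbatim}
# Minimal sketch of the drifting-clouds set-up; dimension, sample size
# and drift steps are illustrative assumptions, not experimental values.
import numpy as np

rng = np.random.default_rng(0)

def spherical_gaussian(n_points, dim, centre):
    """Draw points from an isotropic (spherical) normal distribution."""
    return rng.standard_normal((n_points, dim)) + centre

dim, n_points = 10, 200
cloud_a = spherical_gaussian(n_points, dim, np.zeros(dim))

# Let an identical second cloud drift away from the first along one axis.
for drift in np.linspace(0.0, 5.0, 6):
    centre_b = np.zeros(dim)
    centre_b[0] = drift
    cloud_b = spherical_gaussian(n_points, dim, centre_b)
    separation = np.linalg.norm(cloud_a.mean(axis=0) - cloud_b.mean(axis=0))
    print(f"drift {drift:.1f}: empirical centre separation {separation:.2f}")
\end{verbatim}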
The formulation we use to calculate entropy involves the notion of a graph, $G$, and a symbol for entropy, $H$. It also involves the two data clouds, which in the name of simplicity we shall refer to as $A$ and $B$. For the two clouds we may assume, for the sake of the argument, that we have obtained the distances between all of the points they comprise. Our estimate of the overall entropy then combines the entropy of each cloud with that of their union, i.e. $H(A)$, $H(B)$ and $H(A \cup B)$.
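The sketch below shows one way the ingredients of this estimate could be obtained: the pairwise distances are assembled into a matrix and a connected graph is built over each cloud and over their union. The use of a Euclidean minimal spanning tree (via SciPy) as that graph is our assumption; the text only requires the points to be treated as a connected graph.

\begin{verbatim}
# Sketch: distance matrix and graph length for a point cloud.  Using a
# Euclidean minimal spanning tree as the connected graph is an assumption.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def graph_length(points):
    """Total edge length of a minimal spanning tree over the cloud."""
    dist_matrix = squareform(pdist(points))   # all pairwise distances
    return minimum_spanning_tree(dist_matrix).sum()

# The quantities entering the overall estimate: a graph length for each
# cloud and for their union.
rng = np.random.default_rng(0)
cloud_a = rng.standard_normal((100, 5))
cloud_b = rng.standard_normal((100, 5)) + 2.0
length_a = graph_length(cloud_a)
length_b = graph_length(cloud_b)
length_ab = graph_length(np.vstack([cloud_a, cloud_b]))
\end{verbatim}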
More recently, we have replaced our earlier, confusing notation. In practice, we ought to replace $B$ with $S$, which corresponds to the word ``synthetic''. In our experiments, we tend to deal with synthetic images that are generated from a model of appearance (combining shape and intensity).
$S$ is also known as the synthetic set, whose size is arbitrary and can be extended at will.
$S$ used to be merely a subset of the full set $A$, but it must not be contained in $A$. It is only derived from the same model as $A$, so it is not the case that $S \subset A$. Likewise, $A \not\subset S$, and even more strictly, no instance in $S$ should be contained in $A$.
We can extend the number of synthetic sets $S$ we consider in order to improve our estimates, if time permits. Ultimately, we are left with a graph which shows the formulation to be rather helpful. The calculation of entropy itself is as follows:
\[
H_{\alpha}(p) \approx \frac{1}{1-\alpha}\left[\ln\frac{L}{n^{\alpha}} - \mathrm{const}\right],
\]
where $p$ is the distribution (of varying density), $L$ is the length of the graph $G$ over the $n$ points, $\alpha$ is a value that lies between 0 and 1, and $\mathrm{const}$ is an unimportant constant, at least at this stage.
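By way of illustration, the sketch below evaluates this estimate with a minimal spanning tree standing in for the graph; the value of $\alpha$ and the constant (set to zero) are placeholders of ours, and the exact way in which $H(A)$, $H(B)$ and $H(A \cup B)$ are combined into a single similarity score is not reproduced here.

\begin{verbatim}
# Sketch of the graph-based entropy estimate above.  The value of alpha
# and the additive constant are placeholders; the exact combination of
# H(A), H(B) and H(A u B) into a similarity score is not reproduced here.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def graph_entropy(points, alpha=0.5, const=0.0):
    """Entropy estimate from the length of a spanning graph over the points."""
    n = len(points)
    length = minimum_spanning_tree(squareform(pdist(points))).sum()
    return (np.log(length / n**alpha) - const) / (1.0 - alpha)

rng = np.random.default_rng(0)
a = rng.standard_normal((200, 10))
for drift in (0.0, 2.0, 4.0):
    b = rng.standard_normal((200, 10))
    b[:, 0] += drift
    print(f"drift {drift}: H(A)={graph_entropy(a):.3f}  "
          f"H(B)={graph_entropy(b):.3f}  "
          f"H(A u B)={graph_entropy(np.vstack([a, b])):.3f}")
\end{verbatim}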