PDF version of this entire document


Roy Schestowitz

T HE G-H-inspired method strives to identify and then calculate minimal distances for a group of geometric points with commonality in a more rigid space, wherein harmonic variation occurs in inherently non-orthogonal spaces. One way to model this type of variation and then explain its nature would be high-dimensional decomposition, which evidently requires that data be represented in a high-dimensional form such as vector of coordinates, intensities, energy, or discrete/quantised G-H distances (geometric terms). Figure [*] provides an example of that subtle point.

Figure: Crude visual example of how typical PCA and GMDS relate to one another, approach-wise
Image hyperspace

Depending on the circumstance, different measurable attributes can be added to the space, even a hybrid of them (e.g. shape and texture, so as to reconstruct/recover the relationship between image intensity and the image shape in 2- and 3-D). For synthesis of images belonging to a particular class/subspace, e.g. a canonical form (bar embedding error), one requires that the model should be specific and generic. Specific - for the fact that it need preferably not be confused with similar images belonging to another class, and generic - for the fact that it must span a sufficiently large cloud in hyperspace in order to capture the variation of all images of the same class.

As opposed to models that build upon a texture of pixels1, the GMDS approach largely discards the notion of geometry in the way the human eye perceives it; just as PCA/SVD facilitate the modeling of heat, quantifiable recipe ingredients, strings of letters and just about any parameterisable and measurable element, it should be possible to encode or at least translate a given image into a set of properties of some significant role; this is particularly attractive in 3-D, where the size of the given set can be vast - far greater than the actual entropy of the set. Considerable reduction in size can be achieved by considering distances between corresponding points or geodesic distances between neighbouring points, whereupon the image can be reconstructed by merely plugging in the modeled parameters and scaling accordingly. In that respect, it is a pseudo-dimensionality reduction problem. The elasticity of observed objects is implicitly modeled by a collection of metric measurements - measurements taken not in Euclidean space but in a more inherent space, more robust to the external viewer (judging an object from the outside and not relative to its neighbourhood (e.g. landmark points in its vicinity). Intrinsic similarities are also more resistant to error due to some topological changes in the sense that, assuming there is awareness of the topological changes typically introduced (e.g. hand touches leg), it should be possible to define 'sanity' ranges within which the distances do make sense or alternatively use diffusion distance, or intrinsic symmetry tests for something like animals where preservation of this property is expected. Euclidean distances are hardly enough for determining if two points/objects are just close to one another or actually connected.

Isometric embedding is somewhat analogous to putting an image in a reference frame from which to consistently sample parameters. As all comparisons are better off done in a spatially-neutral reference (such as a sphere onto which a more complex image is mapped/flattered), this method seems to follow the same intuition as Davies et al. who find parameters to model shape by (in 2- and 3-D) by mapping everything onto a sphere and then applying kernel functions to move those around consistently, for PCA to automatically use the ``best'' points that encode a complex shape. When dealing with modeling in this context it is typically used for segmentation, non-rigid registration, and synthesis. The problem of recognition and automated classification (e.g. telling apart extrinsic and intrinsic differences) hardly arises in this area.

In GMDS, numerical analysis and multidimensional scaling can be thought of as an iterative, concurrent search for directions and axes that better distinguish between pairs (or groups) of shapes; the general optimiser optimises over image parts and also over corresponding points, which relate to the former to an extent. When dealing with images in a metric space, the situation is merely identical or at least analogous to how image registration problems are solved, by embedment in a high-dimensional space, where the difference between them is estimated as per the vector in the manifold (Euclidean, shuffle distance, etc.), depending on the problem domain.

It ought to be possible to make the problems at hand complementary and not just ,mere surrogates.

Roy Schestowitz 2011-08-19