The two components $\mathbf{b}_s$ and $\mathbf{b}_g$ (the parameter vectors above, which belong to the generative shape and intensity models respectively) need to be merged to establish a new model. This more expressive model accounts for both types of variability (shape and intensity) and captures the correlation between the two.
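The parameters $\mathbf{b}_s$ and $\mathbf{b}_g$ are aggregated to form a single column vector

$$\mathbf{b} = \begin{pmatrix} \mathbf{b}_s \\ \mathbf{b}_g \end{pmatrix}.$$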
This is, in some sense, a simple concatenation of the two. However, since the values of intensity and shape can differ considerably in nature and granularity, some weighting is needed to reach a balance under which both shape and intensity retain a noticeable effect. The danger is that, if no weighting of any sort is applied, intensity values may overwhelm those of shape or vice versa. In less practical terms, if the ranges of the data values differ greatly, the spread of the points in space becomes undesirable, and the components identified by PCA are not as useful as they otherwise would have been. If some values are far greater than others, the neighbourhood structure of the points deteriorates and the cloud may be elongated instead of nearly spherical (to borrow a 3-D analogy).[2.21] For roughly spherical spreads (those of almost homogeneous variation), a greater number of large components is available for selection. Consequently, the variation expressed by a fixed number of principal components will be higher.
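The effect of mismatched scales can be illustrated with a small numerical sketch. The data below are synthetic, and the function name `shape_share_of_leading_component` and the factor of 100 are purely illustrative assumptions, not part of the model described here. The sketch measures how much of the first principal direction lies in the shape block, with and without a scalar weighting of the shape parameters.

```python
import numpy as np

def shape_share_of_leading_component(shape_params, intensity_params):
    """Fraction of the leading principal direction's squared length
    that falls in the shape block of the concatenated vector."""
    data = np.hstack([shape_params, intensity_params])
    cov = np.cov(data, rowvar=False)          # np.cov centres the data itself
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    leading = eigvecs[:, -1]                  # direction of largest variance
    n_shape = shape_params.shape[1]
    return float(np.sum(leading[:n_shape] ** 2))

# Synthetic training set (assumption): one shared latent factor drives both
# blocks, but intensity values are about 100 times larger than shape values.
rng = np.random.default_rng(0)
n_samples = 500
latent = rng.normal(size=(n_samples, 1))
shape_params = latent + 0.1 * rng.normal(size=(n_samples, 2))
intensity_params = 100.0 * (latent + 0.1 * rng.normal(size=(n_samples, 2)))

# No weighting: the leading component is almost entirely intensity.
share_unweighted = shape_share_of_leading_component(shape_params, intensity_params)

# Scalar weighting of the shape block (illustrative factor matching the scales).
share_weighted = shape_share_of_leading_component(100.0 * shape_params, intensity_params)

print(f"shape share, unweighted: {share_unweighted:.4f}")
print(f"shape share, weighted:   {share_weighted:.4f}")
```

In this sketch the unweighted concatenation leaves shape with a negligible share of the leading component, whereas after weighting the two blocks contribute roughly equally.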
A weighting matrix that resolves the problem introduced above is, by convention, named $\mathbf{W}_s$ (the subscript refers to shape because, by default, this matrix scales the shape parameters only; this gives results logically equivalent to those of applying a corresponding factor to the intensities). The form in which coordinates are stored depends on the accuracy required (e.g. integers or floating-point numbers), the image size and the number of dimensions, whereas for grey-level values the form depends on the number of bits allocated per pixel.[2.22] With weighting in place, the aggregation takes a form such as

$$\mathbf{b} = \begin{pmatrix} \mathbf{W}_s \mathbf{b}_s \\ \mathbf{b}_g \end{pmatrix},$$
where $\mathbf{W}_s$ is chosen to minimise inconsistencies due to scale. Lastly, by applying a further PCA stage to the aggregated data, the following combined model is obtained:

$$\mathbf{b} = \mathbf{Q}\mathbf{c}.$$
The appearance (shape and brightness levels) is now controlled purely by the parameters $\mathbf{c}$, and there is no need to choose values for two distinct families of parameters as before. This combined model benefits from the dimensionality reduction performed, which is based on shape as well as intensity. It therefore encompasses all the variation learned, together with the correlation between these two distinct components. Since PCA was applied, the number of parameters in $\mathbf{c}$ is expected to be smaller than (or, in the extreme case, equal to) the number of parameters in $\mathbf{b}_s$ and $\mathbf{b}_g$ put together.
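The construction described in this section (weighted concatenation followed by a further PCA) might be sketched as follows. This is a minimal illustration under stated assumptions rather than a definitive implementation: the weighting is taken to be a scalar multiple of the identity, with the scalar set from the ratio of total intensity variance to total shape variance, which is one common choice and not necessarily the one used here; the names `build_combined_model`, `to_combined` and `from_combined`, the 98% variance threshold and the synthetic training data are all hypothetical.

```python
import numpy as np

def build_combined_model(b_s, b_g, variance_kept=0.98):
    """b_s: (n_samples, n_shape) shape parameters, one row per example.
    b_g: (n_samples, n_grey) intensity parameters, one row per example.
    Returns the scalar weight r, the mean of the aggregated vectors and
    the matrix Q whose columns are the retained principal directions."""
    # Assumed weighting: W_s = r * I, where r^2 is the ratio of total
    # intensity variance to total shape variance.
    r = np.sqrt(b_g.var(axis=0).sum() / b_s.var(axis=0).sum())
    b = np.hstack([r * b_s, b_g])             # weighted concatenation
    mean_b = b.mean(axis=0)
    # PCA via SVD of the centred data; rows of vt are principal directions.
    _, s, vt = np.linalg.svd(b - mean_b, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(explained, variance_kept)) + 1, len(s))
    return r, mean_b, vt[:k].T

def to_combined(b_s, b_g, r, mean_b, Q):
    """Project one (b_s, b_g) pair onto the combined parameters c."""
    return Q.T @ (np.concatenate([r * b_s, b_g]) - mean_b)

def from_combined(c, r, mean_b, Q, n_shape):
    """Recover approximate (b_s, b_g) from combined parameters c."""
    b = mean_b + Q @ c
    return b[:n_shape] / r, b[n_shape:]

# Synthetic, correlated training parameters (assumption): two latent factors
# drive 4 shape parameters and 6 intensity parameters on very different scales.
rng = np.random.default_rng(1)
n_samples, n_shape, n_grey = 200, 4, 6
latent = rng.normal(size=(n_samples, 2))
b_s = latent @ rng.normal(size=(2, n_shape)) + 0.05 * rng.normal(size=(n_samples, n_shape))
b_g = 50.0 * (latent @ rng.normal(size=(2, n_grey)) + 0.05 * rng.normal(size=(n_samples, n_grey)))

r, mean_b, Q = build_combined_model(b_s, b_g)
c = to_combined(b_s[0], b_g[0], r, mean_b, Q)
b_s_rec, b_g_rec = from_combined(c, r, mean_b, Q, n_shape)

print("combined parameters:", Q.shape[1], "of a possible", n_shape + n_grey)
```

Because the synthetic shape and intensity parameters are correlated through shared latent factors, the sketch retains fewer combined parameters than the two original families have together, which is the behaviour the text anticipates.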