Methods for interpreting a self-organized map in data analysis

The Self-Organizing Map (SOM) can be used for forming overviews of multivariate data sets and for visualizing them on graphical map displays. Each map location represents certain kinds of data items, and the value of a variable in the representations can be visualized in the corresponding locations on the map display. Such component plane displays contain all the information needed for interpreting the map, but information about the relations of the variables remains implicit. We have developed methods that visualize explicitly the contribution of each variable in the organization of the map at different locations. It is also possible to measure the contribution of each variable in the cluster structure within an area of the map to summarize, for instance, the characteristics of clusters.

1. Introduction

The SOM algorithm [2, 3] forms a mapping of a usually two-dimensional map lattice into the high-dimensional data space. There is a model vector connected to each point of the discrete lattice. The model vectors are situated in the data space; they act as an ordered set of models of different types of data items.

The map can be used as an ordered groundwork for illustrating different aspects of the data set. In addition to visualizing the values of the original variables as component planes (examples are shown in Fig. 2a), the map can be used to visualize the clustering tendency of the data in different regions of the data space. The model vectors follow the distribution of the data items, and therefore the distances between the model vectors connected to neighboring points on the map lattice are shorter in clustered areas than in sparser regions. The so-called U-matrix display [4], an example of which is shown in Fig. 1, depicts the distances between neighboring model vectors as gray levels.
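The idea behind the U-matrix can be illustrated with a minimal sketch. It rests on assumptions not taken from the text: a rectangular lattice with four-neighbor connectivity, Euclidean distance, and one averaged value per map unit, whereas the display in [4] shows the distance between each pair of neighbors. The function name `u_matrix` and the toy model vectors are hypothetical.

```python
import math

def u_matrix(models):
    """Return, for each lattice point, the mean Euclidean distance between
    its model vector and the model vectors of its lattice neighbors.

    models: rows x cols nested list of model vectors (assumed rectangular
    lattice, 4-neighborhood; a simplification chosen for illustration).
    """
    rows, cols = len(models), len(models[0])
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            # Visit the up, down, left and right neighbors that exist.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(models[r][c], models[nr][nc]))
            # A 1 x 1 lattice has no neighbors; report 0.0 in that case.
            u[r][c] = sum(dists) / len(dists) if dists else 0.0
    return u

# Toy 2 x 4 map: the two left columns and the two right columns each form
# a tight group of model vectors, separated by a gap in the data space.
models = [
    [(0.0, 0.0), (0.0, 1.0), (0.0, 5.0), (0.0, 6.0)],
    [(1.0, 0.0), (1.0, 1.0), (1.0, 5.0), (1.0, 6.0)],
]
u = u_matrix(models)
for row in u:
    print(" ".join(f"{value:.2f}" for value in row))
```

Units next to the gap between the two groups receive larger values than units inside a group. Rendering large values as dark gray levels would draw a border between the clustered areas.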