Spotting Relevant Information in Extremely Large Document Collections

The self-organizing map (SOM) converts statistical relationships among high-dimensional data items into geometric relationships on a low-dimensional grid. It can thus be regarded as both a projection and a similarity graph of the primary data: the display shows the most important topological and metric relationships among the primary data elements. The SOM can also be regarded as producing a kind of abstraction. These two aspects, visualization and abstraction, can be utilized in many complex tasks. This presentation discusses the automatic organization of very large document collections and the search for relevant information in them.
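
As an illustration of how a SOM maps high-dimensional items onto a grid, the following is a minimal sketch of the standard online SOM update in plain Python. It is not the implementation used for the document collections discussed here. The function names (train_som, best_matching_unit), the toy three-term "document" vectors, the grid size, and the linearly decaying learning rate and neighborhood radius are all assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose model vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, seed=0):
    """Train a rows x cols SOM on a list of equal-length float vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One model vector per grid node, initialised uniformly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0  # initial neighborhood radius (assumed)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        # Present the items in a fresh random order each epoch.
        for x in rng.sample(data, len(data)):
            frac = step / total_steps
            alpha = 0.5 * (1.0 - frac)               # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking radius
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Squared distance on the grid, not in the data space.
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = alpha * math.exp(-d2 / (2.0 * sigma * sigma))
                # Pull the winner and its grid neighbors toward the input.
                for i in range(dim):
                    w[i] += h * (x[i] - w[i])
            step += 1
    return weights


# Hypothetical toy "documents": weights of three terms per document.
docs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.1, 0.9],
]
som = train_som(docs, rows=3, cols=3)
for d in docs:
    print(d, "->", best_matching_unit(som, d))
```

Each item is assigned to the grid node with the nearest model vector, and because neighboring nodes are updated together, items that are statistically similar tend to land on the same or adjacent nodes. That neighborhood coupling is what turns statistical relationships into geometric ones on the display. A collection of realistic size would need far larger maps and computational shortcuts that this sketch does not attempt.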