Contextually Self-Organized Maps of Chinese Words

This work constructs contextual self-organizing maps (SOMs) of Chinese words. Unlike previous approaches, in which individual words were mapped onto the SOM, here histograms of various word classes, or of otherwise defined subsets of words, were formed on the SOM array. The words were found to cluster not only according to word class: joint or overlapping clusters of words from different classes also formed according to the role of the words as sentence constituents. A further new effect was found: when the histograms were formed using test words restricted to certain intervals of word frequency, the histograms depended on the frequency, and the corresponding partial clusters were often very compact.
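
As an illustration of what forming a histogram of a word subset on the SOM array could look like, the following minimal sketch assumes an already trained map and one context vector per word. The function names (`best_matching_units`, `subset_histogram`), the squared Euclidean matching criterion, the 2 x 2 toy map, and the frequency threshold are all assumptions made for this example and are not taken from the work itself.

```python
import numpy as np

def best_matching_units(weights, vectors):
    """Index of the closest SOM node for each input vector.

    weights: (n_nodes, dim) array of node weight vectors (assumed trained).
    vectors: (n_words, dim) array of word context vectors.
    """
    # Squared Euclidean distance between every word and every node.
    dists = ((vectors[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def subset_histogram(weights, vectors, mask, grid_shape):
    """Count how many words of the selected subset hit each SOM node."""
    bmus = best_matching_units(weights, vectors)
    counts = np.bincount(bmus[mask], minlength=weights.shape[0])
    return counts.reshape(grid_shape)

# Hypothetical toy data: a 2 x 2 map and four words.
weights = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
vectors = np.array([[0.1, 0.0], [0.0, 0.9], [0.9, 1.0], [0.1, 0.1]])
labels = np.array(["noun", "verb", "verb", "noun"])
freqs = np.array([500, 20, 30, 800])  # made-up word frequencies

# Histogram of one word class on the map.
noun_hist = subset_histogram(weights, vectors, labels == "noun", (2, 2))
verb_hist = subset_histogram(weights, vectors, labels == "verb", (2, 2))

# Histogram of words restricted to a frequency interval (threshold is arbitrary).
high_freq_hist = subset_histogram(weights, vectors, freqs >= 100, (2, 2))
```

Because the subset is passed as a boolean mask, the same routine serves for word classes, sentence-constituent roles, or frequency intervals; only the selection criterion changes.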