The discovery of segments in natural language

This paper describes a computer model of linguistic segmentation and evaluates its performance on natural language. The model identifies words quite successfully and seems to have some sensitivity to morphs, but it performs poorly with structures larger than words. From the language samples, the program extracts most of the sequential redundancy and some of the redundancy due to the unequal frequencies of elements. This accords with the principle of economical coding in cognition (Attneave, 1954; Oldfield, 1954). The process also seems to model certain aspects of the growth of children's vocabularies and the increasing lengths of the words that children acquire. It may have a bearing on the explanation of infantile amnesia and the word transformation effect.