A Multifont Word Recognition System for Postal Address Reading

This paper describes the basic design principles of a multifont word recognition system developed for postal address reading. Of the three main subsystems (image preprocessing, single character recognition, and contextual postprocessing), the last two are considered in detail. A multiple-channel/multiple-choice approach is taken in designing the overall system. The character images produced by the image preprocessing subsystem are fed into three parallel single character recognition (SCR) channels. Each channel classifies the raster image according to one of the three character types: capital letter, small letter, or numeral. A second-degree polynomial classifier is required in order to satisfy the multifont requirements of address reading. Each SCR channel outputs a rank-ordered list of potential character meanings for the character type being processed, together with a channel-specific figure of confidence. This figure of confidence serves a twofold purpose. First, it is used to determine the number of alternatives in the rank-ordered list; second, it is used by the subsequent contextual postprocessor in calculating a word-specific discriminant function designed to discriminate between four different kinds of words: numeric, all upper case, all lower case, and lower case with upper case initial. Based on this discriminant function, only one channel output per character position is passed on to the word recognition system. From the list of alternatives for each character position, a set of alternative words can be constructed that, with high probability, contains the correct word.
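
The channel selection and word construction steps described above can be illustrated with a minimal Python sketch. It is not the system's actual method: the abstract does not give the discriminant function or the rule linking confidence to list length, so the summed-confidence score, the 0.9 threshold, the cap of three alternatives, and all names (`WORD_TYPES`, `truncate`, `word_type_score`, `select_channels`, `alternative_words`) are assumptions chosen only for illustration, as are the example confidence values.

```python
from itertools import product

# Which SCR channel each of the four word types would use at position i.
# The four word types come from the text; the mapping names are hypothetical.
WORD_TYPES = {
    "numeric": lambda i: "numeral",
    "upper": lambda i: "capital",
    "lower": lambda i: "small",
    "capitalized": lambda i: "capital" if i == 0 else "small",
}


def truncate(ranked, confidence, max_alternatives=3):
    """Assumed rule: a confident channel keeps one candidate, otherwise several."""
    keep = 1 if confidence >= 0.9 else max_alternatives
    return ranked[:keep]


def word_type_score(word_type, positions):
    """Assumed discriminant: sum of confidences of the channels this type selects."""
    channel_for = WORD_TYPES[word_type]
    return sum(pos[channel_for(i)][1] for i, pos in enumerate(positions))


def select_channels(positions):
    """Pick the best-scoring word type and keep one channel output per position."""
    best = max(WORD_TYPES, key=lambda t: word_type_score(t, positions))
    channel_for = WORD_TYPES[best]
    lists = [truncate(*pos[channel_for(i)]) for i, pos in enumerate(positions)]
    return best, lists


def alternative_words(lists):
    """Build every word that can be formed from the per-position alternatives."""
    return ["".join(chars) for chars in product(*lists)]


# Made-up channel outputs for a three-character word:
# each channel maps to (rank-ordered candidates, figure of confidence).
positions = [
    {"capital": (["U", "V", "O"], 0.95), "small": (["u", "v", "o"], 0.60),
     "numeral": (["0", "4", "6"], 0.20)},
    {"capital": (["I", "L", "T"], 0.50), "small": (["l", "i", "t"], 0.70),
     "numeral": (["1", "7", "4"], 0.65)},
    {"capital": (["M", "N", "H"], 0.40), "small": (["m", "n", "r"], 0.92),
     "numeral": (["3", "8", "0"], 0.10)},
]

word_type, lists = select_channels(positions)
print(word_type, alternative_words(lists))
# prints: capitalized ['Ulm', 'Uim', 'Utm']
```

In this sketch the set of alternative words stays small because confident positions contribute a single candidate, which mirrors the stated role of the figure of confidence in limiting the number of alternatives.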