DWITE Online Computer Programming Contest

Kind of like OCR

March 2010
Problem 4

Optical character recognition (OCR) is the process of extracting textual information from images. While the current technology is mostly software-based, rather than using optical devices, the term has stuck around.

For this problem, we’ll work with a very simple alphabet in which each letter is strictly defined by a 2×2 or 3×2 bitmap.

A: x.

B: xx
C: x.x

D: xx

E: xxx

The input file DATA4.txt will contain 5 sets, each 2 lines long, with each line at least 2 and no more than 30 characters long. Each set spells out a word in the above alphabet. The word will always be valid and unambiguous: exactly one word can produce the given pattern.

The output file OUT4.txt will contain 5 lines of output – the recognized words.
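
As a rough illustration of the input layout described above, the following Python sketch groups raw input lines into two-line sets. The helper name `read_sets` and its `set_count` parameter are hypothetical, introduced only for this example; it assumes the file holds nothing but the 10 bitmap lines.

```python
def read_sets(lines, set_count=5):
    """Group the first 2 * set_count lines into (top, bottom) row pairs.

    Assumption: every set is exactly two consecutive lines, with no
    blank lines or headers in between.
    """
    # Strip line endings only; '.' and 'x' characters are kept untouched.
    rows = [line.rstrip("\r\n") for line in lines[: 2 * set_count]]
    # Walk the rows two at a time: even index is the top row, odd the bottom.
    return [(rows[i], rows[i + 1]) for i in range(0, len(rows) - 1, 2)]


# Possible usage (file name taken from the problem statement):
#     with open("DATA4.txt") as f:
#         sets = read_sets(f.readlines())
```

Each recognized word would then be written on its own line of OUT4.txt, in the same order as the sets.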

Note: You need to consider a word as a whole to distinguish between some of the cases. For example, in the sample below, the first character could be read as C, but the rest of the word would then not be made of valid characters.
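
One way to handle the situation in the note is a backtracking search that tries every letter at the current column and abandons a choice when the remainder cannot be tiled. The Python sketch below is a minimal illustration, not a reference solution. The top rows in `GLYPHS` follow the alphabet listing, but the bottom rows are placeholders chosen for illustration and are NOT the contest's actual bitmaps; the names `GLYPHS` and `recognize` are likewise hypothetical.

```python
# Hypothetical glyph table: letter -> (top_row, bottom_row).
# ASSUMPTION: the bottom rows are made-up placeholders, not the real bitmaps.
GLYPHS = {
    "A": ("x.", ".x"),
    "B": ("xx", "x."),
    "C": ("x.x", ".x."),
    "D": ("xx", ".x"),
    "E": ("xxx", "x.x"),
}


def recognize(top, bottom):
    """Return every word whose glyphs tile the two rows exactly.

    Assumes top and bottom have the same length.
    """
    results = []

    def walk(pos, letters):
        # All columns consumed: the letters chosen so far form a valid word.
        if pos == len(top):
            results.append("".join(letters))
            return
        for letter, (g_top, g_bottom) in GLYPHS.items():
            # A glyph fits only if both of its rows match at this column.
            if top.startswith(g_top, pos) and bottom.startswith(g_bottom, pos):
                walk(pos + len(g_top), letters + [letter])
        # If no glyph fits, this branch is dropped and the caller tries
        # its next candidate letter.

    walk(0, [])
    return results


# With the placeholder table, the first three columns look like C,
# but the leftover column matches nothing, so only "AA" survives.
print(recognize("x.x.", ".x.x"))  # prints ['AA']
```

Since the problem guarantees an unambiguous word, the returned list should hold exactly one entry once the real bitmaps are substituted. With lines of at most 30 characters, plain backtracking is likely fast enough without memoization.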

Sample Input (first two shown):
Sample Output (first two shown):