When symbols touch or overlap, they must first be separated from one another. This is harder with scanned input, where the input is simply an image. With handwritten input, the timing and order of the strokes are also available.
Figure 2.4 shows another problem, where the tail of
the y overlaps the fraction bar and the x. This makes it hard to
determine the geometric relationship between these symbols reliably.
Anderson
initially represents the position of each symbol with a rectangular
bounding box, the smallest rectangle that contains all of the symbol's
original pixels. He then shrinks these bounding boxes to a single
centre point to avoid problems when considering the relationships
between symbols. The position of the centre point depends on the
identity of the character. The way Anderson defines the centre points
of bounding boxes is covered in more detail in the discussion of his
equation parser in Section .
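
The two steps described above, computing a bounding box and then reducing it to a centre point that depends on the character, can be sketched in Python as follows. This is only an illustration and not Anderson's actual definition: the character classes (ASCENDERS, DESCENDERS), the fractions in CENTRE_FRACTION, and the function names are assumptions made here for the example. Image coordinates are assumed, with y increasing downwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def bounding_box(pixels):
    """Return the smallest axis-aligned rectangle containing every (x, y) pixel."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return BoundingBox(min(xs), min(ys), max(xs), max(ys))


# Hypothetical character classes, chosen only for illustration.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")

# Hypothetical vertical position of the centre, as a fraction of the box
# height measured from the top edge. A descender such as 'y' has its body
# in the upper part of the box, so its centre is placed higher.
CENTRE_FRACTION = {"ascender": 0.75, "descender": 0.25, "plain": 0.5}


def centre_point(box, char):
    """Shrink a bounding box to one point whose height depends on the character."""
    if char in ASCENDERS:
        fraction = CENTRE_FRACTION["ascender"]
    elif char in DESCENDERS:
        fraction = CENTRE_FRACTION["descender"]
    else:
        fraction = CENTRE_FRACTION["plain"]
    cx = (box.x_min + box.x_max) / 2
    cy = box.y_min + fraction * (box.y_max - box.y_min)
    return (cx, cy)
```

With a rule of this kind, the centre of the y in Figure 2.4 would sit in the body of the letter rather than in its tail, so the tail crossing the fraction bar would not affect the relationship computed between the two symbols.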