
Procedurally Coded Math Syntax

Procedurally coded math syntax uses a collection of rule-of-thumb observations about formulae, coded directly into a formula processing program. An example of such a rule, quoted in Blostein's paper, is: ``A length threshold of 20 pixels is used to classify a horizontal line as a long bar or a short bar. If a long bar has symbols both above and below it, it is treated as a division. If there are no symbols above it, it is treated as boolean negation. If a short bar has no symbols above or below it, it is treated as a minus sign. If it has characters above or below it, then combination characters (e.g.: =, $\geq$, $\leq$) are formed.''
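The quoted rule can be sketched as a small hard-coded function. This is only an illustration, not code from any system described here: the names `Bar` and `classify_bar` are hypothetical, the choice to treat a bar of exactly 20 pixels as long is an assumption, and the `unknown` result covers a case (a long bar with symbols above but none below) that the quoted rule does not address.

```python
from dataclasses import dataclass

# Threshold taken from the quoted rule. Whether a bar of exactly
# 20 pixels counts as long is an assumption made for this sketch.
LONG_BAR_THRESHOLD = 20


@dataclass
class Bar:
    """A horizontal line plus what was found around it (hypothetical type)."""
    length: float     # length in pixels
    has_above: bool   # symbols found above the bar
    has_below: bool   # symbols found below the bar


def classify_bar(bar: Bar) -> str:
    """Apply the quoted rule of thumb to a single horizontal bar."""
    if bar.length >= LONG_BAR_THRESHOLD:
        # Long bar: division or boolean negation.
        if bar.has_above and bar.has_below:
            return "division"
        if not bar.has_above:
            return "negation"
        # Symbols above only: not covered by the quoted rule.
        return "unknown"
    # Short bar: minus sign, or part of a combination character.
    if not bar.has_above and not bar.has_below:
        return "minus"
    return "combination"
```

For example, `classify_bar(Bar(length=35, has_above=True, has_below=True))` yields a division, while the same bar at 12 pixels would be treated as part of a combination character. The whole decision hinges on one fixed pixel count.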

A collection of rules such as these is used to parse formulae. Fixed thresholds are sufficient for processing typeset input taken from a uniform source, but the high variability of handwritten input could cause them to fail.

The rules that are hard coded into the system perform essentially the same function as the rules in a box language, described in Section [*]. The only difference is that in procedurally coded syntax, they are built into the system as part of the code for the formula processor. In a box language, the rules are provided through a modifiable external data file.
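The contrast can be made concrete with a rough sketch in which the quoted bar rule is held as data rather than as program logic. The JSON layout, the field names and the functions `matches` and `classify` are all invented for this illustration and do not reflect the actual format of any box language; the string `RULES_TEXT` stands in for an external data file.

```python
import json

# Stand-in for an external, user-editable rule file (hypothetical format).
# A null or missing field means "don't care"; the first matching rule wins.
RULES_TEXT = """
[
  {"min_length": 20, "above": true,  "below": true,  "label": "division"},
  {"min_length": 20, "above": false, "below": null,  "label": "negation"},
  {"max_length": 20, "above": false, "below": false, "label": "minus"},
  {"max_length": 20, "label": "combination"}
]
"""


def matches(rule, length, above, below):
    """Check one data-defined rule against a bar's measurements."""
    if "min_length" in rule and length < rule["min_length"]:
        return False
    if "max_length" in rule and length >= rule["max_length"]:
        return False
    for key, value in (("above", above), ("below", below)):
        expected = rule.get(key)
        if expected is not None and expected != value:
            return False
    return True


def classify(rules, length, above, below):
    """Generic interpreter: it knows nothing about bars or thresholds."""
    for rule in rules:
        if matches(rule, length, above, below):
            return rule["label"]
    return "unknown"


rules = json.loads(RULES_TEXT)
```

Here changing the 20 pixel threshold, or adding a rule, means editing the data only; `classify` itself is untouched. In a procedurally coded system the same change would require editing and rebuilding the program.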

Procedurally coded math syntax may make it easier to write more complex or ``intelligent'' rules that use extra processing that could not be encoded as part of box language rules. The major disadvantage, however, is that the rules are coded into the system itself, so changing them involves rewriting parts of the program. It may be impossible for the end user to make modifications or extensions. The ability to modify a handwriting-based formula parser is important because notations vary and advanced users need to be able to create new notations.

Rules are typically added to the system as necessary throughout its development, correcting errors as they occur. As a result, the set of rules for a system progressively grows, with each new rule addressing the current problem. Systems end up with a large number of rules, with specialised sections of code for dealing with particular situations and problems.


Steve Smithies
1999-11-13