Creating off-line parsers or state machines with parselr.exe

To convert a grammar file such as mygrammar.g into the equivalent parser or finite state machine source code, run this command-line tool either in a command prompt window or as part of a script. Add the resulting source file to the project that needs the parser or state machine, and reference the ParserGenerator and Parsing library DLLs from that project.
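
As an illustration of the scripted use described above, the following is a minimal sketch of a build step that regenerates the parser source. It assumes a POSIX-style shell (for example Git Bash on Windows), that parselr is on the PATH, and that the tool signals failure through a non-zero exit status. None of these assumptions is confirmed by the text here, and generate_parser is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: regenerate the parser source as a scripted build step.
# Assumptions (not confirmed by the documentation): parselr is on the PATH
# and returns a non-zero exit status when generation fails.

generate_parser() {
    grammar="$1"
    if ! command -v parselr >/dev/null 2>&1; then
        echo "parselr not found on PATH" >&2
        return 127
    fi
    # No output argument is given, so the tool chooses the output file name.
    parselr "$grammar"
}

generate_parser mygrammar.g || echo "parser generation skipped or failed" >&2
```

In a command prompt window the equivalent is simply the command parselr mygrammar.g; the wrapper function only adds the PATH check and the failure message.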

The command-line arguments to this program behave as follows:

parselr [-ptdfrgh?] input.g [output.cs]

ARGUMENT DESCRIPTION
[-r] Tells the parser to use a version of the parsing algorithm that attempts error recovery when it receives an input token or symbol that is not valid at that point in the grammar. This version of the algorithm is slightly slower and more complex. It works in a similar way to the parsers produced by Bison or Yacc: the parser state stack is unwound until a state is uncovered whose next valid input token is the error symbol. The parser then looks at the token following the error symbol and discards input tokens until a matching token (or end of input) is encountered, at which point parsing attempts to resume. The keyword 'error' is placed at strategic points in the grammar input file to mark acceptable recovery points.
[-f] Use this flag if the input file describes a state machine rather than an LR(1) grammar. The tool then creates an offline state machine rather than a parser. When this flag is included, the -t and -p flags are ignored, and the -r flag causes the state machine to ignore unrecognised input events rather than abort at the first one.
[-t] Tells the parser to generate an output text file that describes, in human-readable form, all the grammar rules (productions) and all the states and transitions generated in the output parser source code. This description is written to a file named input.table.txt if the input grammar file is named input.g. With a little understanding of how the parser works, it can be used to follow the parser as it reads and processes input tokens. Specifying this flag also turns on detailed/verbose mode for reporting messages within the parser source code itself.
[-d] This flag is only useful if you need to debug the parser itself, and would not be used on a day-to-day basis. It writes a file named input.debug.txt if the input grammar file is named input.g. The file contains detailed step-by-step descriptions of exactly what the parser did with each input token.
[-p] Tells the parser to perform some table optimisation. This flag uses a variation of David Pager's algorithm for folding states in an LR(1) parser, thereby reducing the size of the finite state machine that performs the parsing. The algorithm shrinks the state machine without reducing its ability to recognise an input LR(1) grammar. By contrast, the more traditional LALR(1) parsers generated by programs such as Yacc can handle only a reduced set of grammars, because the LALR(1) algorithm can introduce conflicts that are not present in the full LR(1) state machine.
[-g] Generates a version of the parser tables that supports a generalised LR (GLR) parser, which is capable of tracking multiple valid parses of the input for an ambiguous input grammar. This option causes the -p and -r options to be ignored, and shift/reduce and reduce/reduce errors are no longer reported. Instead, a GLR parser tracks both routes emerging from each conflict.
[-h] Prints an abbreviated version of the help information given here to the standard output.
[-?] The same as -h.
input.g The input file containing the grammar to be parsed, in the correct format for the parser. This filename is mandatory: if it is omitted, the program gives an error message rather than reading from the standard input.
[output.cs] The optional name of an output file to which the C# source code for the parser will be emitted. If this argument is not included, the output filename is synthesised from the input grammar filename. For an input file named mygrammar.g, the synthesised output filename would be mygrammar.designer.cs. This unusual choice of name reflects the fact that the generated source file contains a partial class that implements the parser. The developer is expected to implement the other half of the partial class, putting application-specific methods and fields into the parser class, in a file called mygrammar.cs. Visual Studio manages two files with these naming conventions as a pair.
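
The file naming rules described in the table can be sketched as a small script helper, which may be useful when a build script needs to know which files to expect. This is a hedged sketch for a POSIX-style shell: derived_name is a hypothetical helper, and it assumes the tool forms each name by replacing the trailing .g of the grammar filename, which matches the examples given here but is not otherwise confirmed.

```shell
#!/bin/sh
# Sketch: compute the file names that parselr is documented to derive from
# the grammar file name. derived_name is a hypothetical helper; it assumes
# the names are formed by replacing the trailing ".g" extension.

derived_name() {
    grammar="$1"   # for example mygrammar.g
    suffix="$2"    # designer.cs, table.txt or debug.txt
    printf '%s.%s\n' "${grammar%.g}" "$suffix"
}

derived_name mygrammar.g designer.cs   # default parser source file
derived_name mygrammar.g table.txt     # description file written with -t
derived_name mygrammar.g debug.txt     # trace file written with -d
```

A script could compare these names against the files actually present after running the tool, for example to confirm that the -t description file was produced before archiving it.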