To convert a grammar file, such as mygrammar.g, into its
equivalent parser or finite state machine source code, run this command-line tool, either in
a command prompt window or as part of a script. Add the resulting output source file
to the project that needs the parser or state machine, and give that project references to the ParserGenerator
and Parsing library DLLs.
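As an illustration of the scripted use mentioned above, the following is a minimal sketch of a build step that regenerates the parser source only when the grammar file is newer than the generated file. It assumes the tool is on the PATH under the name parselr; the function names needs_regen and regen are hypothetical helpers, not part of the tool. If parselr is not installed, the sketch only prints the command it would have run.

```shell
# Sketch only: regenerate the parser source when the grammar has changed.
# Assumptions: the tool is on PATH as "parselr"; needs_regen and regen are
# hypothetical helper names introduced for this example.

needs_regen() {
    grammar=$1
    output=$2
    # True when the output is missing or older than the grammar.
    [ ! -e "$output" ] || [ "$grammar" -nt "$output" ]
}

regen() {
    grammar=$1
    output=$2
    if needs_regen "$grammar" "$output"; then
        if command -v parselr >/dev/null 2>&1; then
            # Input grammar first, then the optional output file name.
            parselr "$grammar" "$output"
        else
            echo "would run: parselr $grammar $output"
        fi
    else
        echo "up to date: $output"
    fi
}
```

A build script could then call, for example, regen mygrammar.g mygrammar.designer.cs before compiling the project.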
The command-line arguments to this program behave as follows:
parselr [-ptdfrgh?] input.g [output.cs]
ARGUMENT | DESCRIPTION |
[-r] |
Tells the parser to use a version of the parsing algorithm that attempts
error recovery when an input token or symbol is received that is not appropriate
for this point in the input grammar. The parsing algorithm is slightly slower
and more complex. It employs a similar algorithm to that used in parsers
like Bison or Yacc, where the parser state stack is unwound until a state
is uncovered whose next valid input token is the error symbol. The parser
looks at the token following the error symbol, and throws away input tokens
until a matching token (or end of input) is encountered. Parsing then attempts
to resume at that point. The keyword 'error'
is placed at strategic points in the grammar input file to mark where
recovery is allowed. |
[-f] |
Use this flag if the input describes a state machine rather than an LR(1)
grammar. The tool then creates an offline state machine rather than a parser. When this
flag is included, the -t and -p flags are ignored.
The -r flag takes on a different meaning: it causes the state
machine to ignore unrecognised input events rather than abort at the
first unrecognised input event. |
[-t] |
Tells the parser to generate an output text file that contains a description
in human-readable form of all the grammar rules (productions) and all the
states and transitions generated in the output parser source code. The file
to which this description is written has the name
input.table.txt if the input grammar file
had the name input.g. With a little understanding of how the parser works,
this can be used to follow the parser as it reads and processes input tokens.
Specifying this flag also turns on detailed (verbose) message reporting
within the parser source code itself. |
[-d] |
This flag is only useful if you need to debug the parser itself, and
would not be used on a day-to-day basis. It writes detailed step-by-step
descriptions of exactly what the parser did with each input token to a
file named input.debug.txt, if the input grammar
file was named input.g. |
[-p] |
Tells the parser to perform some table optimisation. This flag uses a variation of David Pager's algorithm for folding states in an LR(1) parser, thereby reducing the size of the finite state machine that performs the parsing. The reduction in size does not reduce the ability to recognise an input LR(1) grammar. By contrast, the more traditional LALR(1) parsers generated by programs such as Yacc can handle a smaller set of grammars, because the LALR(1) algorithm can introduce conflicts that the original LR(1) grammar did not have. |
[-g] |
Generates a version of the parser tables that supports a generalised LR (GLR) parser, which is capable of tracking multiple valid parses of the input for an ambiguous input grammar. This option causes the -p and -r options to be ignored, and shift/reduce and reduce/reduce conflicts are no longer reported as errors. Instead, a GLR parser will track both routes emerging from each conflict. |
[-h] |
Writes an abbreviated version of the help information given here to the standard output. |
[-?] |
The same as -h. |
input.g |
The input file containing the grammar to be parsed, in the correct format for the parser. This filename is mandatory: the program does not read from the standard input if the argument is omitted, and gives an error message instead. |
[output.cs] |
The optional name of an output file to which the C# source code for
the parser will be emitted. If this optional file argument is not
included, the output filename is synthesised from the input grammar
filename. For an input file whose name is
mygrammar.g, the corresponding synthesised output filename would
be mygrammar.designer.cs. This unusual name is chosen
because the generated output source file
includes a partial class that implements the parser. It is expected that
the developer may also want to implement the other half of the partial
class, putting application-specific methods and fields into the parser
class, and that this would be placed in a file called
mygrammar.cs. Visual Studio manages two
files with these naming conventions as a pair. |
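The file naming rules in the table above can be summarised in a short sketch. It assumes the grammar file name ends in .g and sits in the current directory; how the tool treats other extensions or directory paths is not stated here. The four function names are hypothetical helpers for illustration only.

```shell
# Sketch only: derive the file names described in the table from a grammar
# file name. Assumes the name ends in ".g"; all function names are
# hypothetical and exist only for this example.

# Default generated parser source when output.cs is not given.
default_output()   { printf '%s\n' "${1%.g}.designer.cs"; }

# Hand-written other half of the partial class.
companion_source() { printf '%s\n' "${1%.g}.cs"; }

# Human-readable rules, states and transitions written by -t.
table_listing()    { printf '%s\n' "${1%.g}.table.txt"; }

# Step-by-step trace written by -d.
debug_trace()      { printf '%s\n' "${1%.g}.debug.txt"; }

for f in default_output companion_source table_listing debug_trace; do
    "$f" mygrammar.g
done
```

For mygrammar.g, the loop prints mygrammar.designer.cs, mygrammar.cs, mygrammar.table.txt and mygrammar.debug.txt, one per line.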