Here we describe how you structure your source code and make references to libraries so that you can create a parser that is generated in-line, in the same program that runs that parser. The grammar description is provided as either a character string in source code, or as an input stream, meaning it could be read from a database or from a file.
Any in-line parser project needs to include two DLLs among its references: Parsing.DLL, which contains the classes in the namespace Parsing, and ParserGenerator.DLL, which contains the classes in the namespace ParserGenerator. Adding using statements for these namespaces to the top of source files that refer to members of these DLLs will simplify your source code.
First, your source code needs an application-specific parser class that you have written. The actual parser class already exists as Parsing.Parser in Parsing.DLL, so all you need to ensure is that your application-specific parser class inherits from Parsing.Parser.
An example of the structure of your source file containing your application specific parser class might be:
using Parsing;
// ... other using statements ...

namespace MyApplication
{
    public class MyParser : Parser
    {
        // ... Application specific class members here ...
    }
}
Your parser will represent a single object that acts as the interface to any data structures that are being built or manipulated while the parser is executing. You would typically include any data members and methods to access them within the application specific parser class.
With both in-line and offline parsers, you may have elected for the action code executed on rule reduction to be added to the source code for the parser. These action functions will have had their names listed in the actions section of the grammar description, and all share the same standard function signature. For example, an action called BumpPlural should have its implementation written inside your application-specific parser class as follows:
using Parsing;
// ... other using statements ...

namespace MyApplication
{
    public class MyParser : Parser
    {
        // Application specific class members
        public int PluralCount { get; set; }

        public void BumpPlural(object[] args)
        {
            // someErrorOccurred is a placeholder for your own error test
            if (someErrorOccurred)
                // Report the fault to the parser through slot zero
                args[0] = new ParserError
                {
                    Message = "Nature of error"
                };
            else
                PluralCount++;
        }
    }
}
Note the signature of the action method. It has a return type of void, as the parsing engine makes no attempt to pick up a return value. The object array passed as the single argument to the action function will have been filled in with a selection of the values from the tokens on the right-hand side of the production being reduced when the action was called. The array always has at least one element. The zeroth element is used to return a value to the parser, which becomes the value of the single LHS token that replaces the RHS tokens being removed from the top of the parser stack.
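The use of slot zero as the return channel can be sketched as follows. This is a minimal illustration, not the library's own code: the Parser class here is a stand-in so that the example compiles without Parsing.DLL, the action name AddOperands is hypothetical, and the assumption that args[1] and args[2] hold the first two right-hand-side values depends on how your grammar selects values for the action.

```csharp
using System;

// Hypothetical stand-in for Parsing.Parser so this sketch compiles on its own.
// In a real project, reference Parsing.DLL and inherit from Parsing.Parser.
public class Parser { }

public class MyParser : Parser
{
    // Hypothetical action: adds two operand values.
    // Assumption: args[1] and args[2] carry the values of the first two
    // right-hand-side tokens; the actual layout depends on the grammar.
    public void AddOperands(object[] args)
    {
        int left = Convert.ToInt32(args[1]);
        int right = Convert.ToInt32(args[2]);

        // Slot zero carries the value of the LHS token back to the parser.
        args[0] = left + right;
    }
}

public static class Demo
{
    public static void Main()
    {
        var parser = new MyParser();

        // Simulate the argument array the parsing engine would pass on reduction.
        object[] args = { null, 2, 3 };
        parser.AddOperands(args);

        Console.WriteLine(args[0]); // prints 5
    }
}
```

The method still returns void. The only result the parser sees is whatever the action leaves in args[0].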
If you construct an instance of a Parsing.ParserError object and place it into args[0] before returning from any action function, this indicates to the parser that your action function identified an error in the parsing process. The Message property of the ParserError object should be filled in with an explanation of the fault, as this field is used to report the nature of the fault to the parser error output channels. On receipt of a ParserError object in slot zero of the argument array, the parser will either abort the current parse or, if error recovery has been enabled, attempt to pop the parser state stack and shift input tokens until a suitable resumption point has been reached. A description of how this error recovery mechanism functions is given elsewhere.
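As a hedged sketch of the error path, the variant of BumpPlural below replaces the placeholder error test with a concrete one. The Parser and ParserError classes are stand-ins so that the example compiles without Parsing.DLL, and the MaxPlurals limit is a hypothetical rule invented for illustration. The Demo class only simulates what the parsing engine would observe in slot zero.

```csharp
using System;

// Hypothetical stand-ins for Parsing.Parser and Parsing.ParserError so this
// sketch compiles on its own. The real classes live in Parsing.DLL.
public class Parser { }

public class ParserError
{
    public string Message { get; set; }
}

public class MyParser : Parser
{
    public int PluralCount { get; set; }

    // Hypothetical application rule: allow at most this many plural nouns.
    public int MaxPlurals { get; set; } = 2;

    public void BumpPlural(object[] args)
    {
        if (PluralCount >= MaxPlurals)
            // Signal the fault by placing a ParserError in slot zero.
            args[0] = new ParserError
            {
                Message = $"More than {MaxPlurals} plural nouns found"
            };
        else
            PluralCount++;
    }
}

public static class Demo
{
    public static void Main()
    {
        var parser = new MyParser();

        for (int i = 0; i < 3; i++)
        {
            // Simulate a fresh argument array for each reduction.
            object[] args = new object[1];
            parser.BumpPlural(args);

            // The parser would inspect slot zero in a similar way.
            if (args[0] is ParserError error)
                Console.WriteLine("Error: " + error.Message);
            else
                Console.WriteLine("Count is now " + parser.PluralCount);
        }
    }
}
```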
If your grammar requires guard functions to be evaluated for some of the tokens as they are parsed, you might have decided to write these guard functions in your application-specific parser class. If so, their names will also have been listed in the guards section of the grammar.
Guard functions take a single object argument that contains the value of the most recently received input token. This is usually the terminal token the guard expression is positioned next to in the grammar. However, it is possible for a guard expression to be placed alongside a non-terminal token, in which case the object argument is the value of the terminal token the guard is being tested against. For more details of how guards on non-terminals are mapped back to their respective terminal tokens, see the detailed article describing this mapping.
Guard functions should return a boolean result. This should be true if the token should be accepted at this point in the parse, and false if it is not appropriate at this point. An example of a guard function named PluralNoun is given in the parser class below:
using Parsing;

namespace MyApplication
{
    public class MyParser : Parser
    {
        // Application specific class members
        public int PluralCount { get; set; }

        public void BumpPlural(object[] args)
        {
            // someErrorOccurred is a placeholder for your own error test
            if (someErrorOccurred)
                args[0] = new ParserError
                {
                    Message = "Nature of error"
                };
            else
                PluralCount++;
        }

        // Guard: accept the token only if its value ends in "s"
        public bool PluralNoun(object arg)
        {
            return arg.ToString().EndsWith("s");
        }
    }
}
As a guide to writing boolean guard functions, try to arrange that each function evaluates a single truth value about the state of the parse or its data. Avoid combining several evaluable boolean items with boolean operators within a single function. The reason for this is that the grammar description allows compound boolean expressions to be constructed using the boolean guard functions and the operators 'and', 'or' and 'not'. By keeping the guard functions as primitive as possible, you can reuse them in different combinations for different guard expressions at different points in the grammar.
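To illustrate the guideline, the sketch below keeps two guards primitive rather than merging them into one function. The Parser class is a stand-in so that the example compiles without Parsing.DLL, and the Capitalised guard is a hypothetical addition. The boolean combinations in Main only mimic what a guard expression in the grammar would evaluate; the grammar syntax itself is not shown here.

```csharp
using System;

// Hypothetical stand-in for Parsing.Parser so this sketch compiles on its own.
public class Parser { }

public class MyParser : Parser
{
    // Primitive guard: one truth value only.
    public bool PluralNoun(object arg)
    {
        return arg.ToString().EndsWith("s");
    }

    // Hypothetical second primitive guard: one truth value only.
    public bool Capitalised(object arg)
    {
        string text = arg.ToString();
        return text.Length > 0 && char.IsUpper(text[0]);
    }

    // Avoid writing a combined guard such as CapitalisedPluralNoun here.
    // The grammar can build that from the two primitives using 'and',
    // and can reuse either primitive alone or negated elsewhere.
}

public static class Demo
{
    public static void Main()
    {
        var parser = new MyParser();
        object token = "Cats";

        // Mimics a guard expression of the form: PluralNoun and Capitalised
        bool both = parser.PluralNoun(token) && parser.Capitalised(token);

        // Mimics a guard expression of the form: PluralNoun and not Capitalised
        bool pluralOnly = parser.PluralNoun(token) && !parser.Capitalised(token);

        Console.WriteLine(both);       // prints True
        Console.WriteLine(pluralOnly); // prints False
    }
}
```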
Your in-line parser class should either have no constructor, or should be written with a default (parameterless) constructor. If you need to initialise members with values, this should be done via a separate initialisation method, or via property assignments after the parser instance has been created. The reason for this is that the ParserFactory<T>.InitializeFromGrammar and ParserFactory<T>.CreateInstance methods used to create an instance of the resulting parser expect the application-specific parser class to have a default constructor. An example of some suitable creation and initialisation code is given below. Note that the ParserFactory<T>.InitializeFromGrammar and ParserFactory<T>.CreateInstance methods are described elsewhere.
ParserFactory<MyParser>.InitializeFromGrammar( ... args ... );
MyParser p = ParserFactory<MyParser>.CreateInstance();
p.SomeMemberProperty = someValue;
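The alternative of a separate initialisation method might look like the sketch below. It is an illustration under stated assumptions: the Parser class is a stand-in so that the example compiles without Parsing.DLL, the Initialise method and the Language property are hypothetical names, and the call to new MyParser() stands in for ParserFactory<MyParser>.CreateInstance(), which is not reproduced here.

```csharp
using System;

// Hypothetical stand-in for Parsing.Parser so this sketch compiles on its own.
public class Parser { }

public class MyParser : Parser
{
    public int PluralCount { get; set; }

    // Hypothetical member that needs a value before parsing starts.
    public string Language { get; private set; }

    // No constructor is declared, so the compiler supplies the default
    // (parameterless) constructor that the factory expects.

    // Hypothetical initialisation method, called after the instance exists.
    public void Initialise(string language, int startingCount)
    {
        Language = language;
        PluralCount = startingCount;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Stand-in for: MyParser p = ParserFactory<MyParser>.CreateInstance();
        MyParser p = new MyParser();

        p.Initialise("en", 0);

        Console.WriteLine(p.Language + " " + p.PluralCount); // prints en 0
    }
}
```

Grouping the start-up values into one method keeps the calling code short when several members must be set together, while property assignments remain convenient for one-off values.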