Enhanced C#
Language of your choice: library documentation
The recommended base class for lexers generated by LLLPG, when not using the inputSource option.
If you are using the inputSource and inputClass options of LLLPG, use LexerSource{CharSource} instead. If you want to write a lexer that implements ILexer{Tok} (so that it is compatible with postprocessors like IndentTokenGenerator and TokensToTree), use BaseILexer{CharSrc,Tok} as your base class instead.
This class contains many methods required by LLLPG, such as NewSet, LA(int), LA0, Skip, Match(...), and TryMatch(...), along with a few properties that are not used by LLLPG that you still might want to have around, such as FileName, CharSource and SourceFile.
It also implements the caching behavior for which ICharSource was created. See the documentation of ICharSource for more information.
All lexers derived from BaseLexer should call AfterNewline() at the end of their newline rule, in order to increment the current line number. Alternatively, your lexer can borrow the newline parser built into BaseLexer, which is called Newline() and calls AfterNewline() for you. It is possible to have LLLPG treat this method as a rule, and tell LLLPG the meaning of the rule like this:
    extern rule Newline @[ '\r' + '\n'? | '\n' ];
The extern modifier tells LLLPG not to generate code for the rule, but the rule must still have a body so that LLLPG can perform prediction.
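The bookkeeping that AfterNewline() and Newline() perform can be pictured with a small standalone sketch. The `LineTracker` class below is hypothetical and does not use or reproduce Loyc's code; it only borrows the member names from this page, and it assumes that "\r\n" counts as a single line break.

```csharp
using System;

// Hypothetical stand-in for a lexer's line bookkeeping; NOT Loyc's implementation.
public class LineTracker
{
    private readonly string _text;

    public int InputPosition { get; private set; }
    public int LineNumber { get; private set; } = 1;  // the first line is line 1
    public int LineStartAt { get; private set; }      // index where the current line began

    public LineTracker(string text) { _text = text; }

    // Lookahead: the character at InputPosition + i, or -1 at end of input.
    public int LA(int i)
    {
        int index = InputPosition + i;
        return index >= 0 && index < _text.Length ? _text[index] : -1;
    }

    public void Skip() { InputPosition++; }

    // Must run exactly once after the position has moved past a newline.
    public void AfterNewline()
    {
        LineNumber++;
        LineStartAt = InputPosition;
    }

    // Consumes "\r\n", "\r" or "\n" as ONE line break (an assumption of this
    // sketch), then updates the counters through AfterNewline().
    public void Newline()
    {
        if (LA(0) == '\r') {
            Skip();
            if (LA(0) == '\n') Skip();
        } else if (LA(0) == '\n') {
            Skip();
        } else {
            throw new FormatException("Expected a newline");
        }
        AfterNewline();
    }

    // Walks the whole input, routing every line break through Newline().
    public void ScanAll()
    {
        while (LA(0) != -1) {
            if (LA(0) == '\r' || LA(0) == '\n') Newline();
            else Skip();
        }
    }
}
```

Calling `ScanAll()` on a tracker built from "a\r\nb\nc" leaves `LineNumber` at 3 and `LineStartAt` at 5, the index of the final "c". If a rule skips a line break without going through `Newline()` or `AfterNewline()`, both counters fall behind, which is why the call is required even inside comments and strings.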
By default, errors are handled by throwing FormatException. The recommended way to alter this behavior is to change the ErrorSink property. For example, set it to MessageSink.Console to send errors to the console, or use MessageSink.FromDelegate to provide a custom handler.
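The idea of a replaceable error sink can be illustrated without the library. In the sketch below, `MiniLexer` and its `Action<int, string>` sink are hypothetical simplifications; Loyc's real ErrorSink is an IMessageSink, whose interface is not reproduced here. Only the default-throws behavior and the InputPosition + lookaheadIndex arithmetic are taken from this page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature of the "pluggable error sink" design; NOT Loyc's API.
public class MiniLexer
{
    // Default sink: throw, mirroring the documented FormatException default.
    public static readonly Action<int, string> ThrowingSink =
        (position, message) =>
        {
            throw new FormatException("(" + position + ") " + message);
        };

    // Replace this delegate to change how errors are handled.
    public Action<int, string> ErrorSink { get; set; } = ThrowingSink;

    public int InputPosition { get; set; }

    // Reports an error located at InputPosition + lookaheadIndex.
    public void Error(int lookaheadIndex, string message)
    {
        ErrorSink(InputPosition + lookaheadIndex, message);
    }
}
```

Assigning a sink such as `(pos, msg) => errors.Add(pos + ": " + msg)` makes the lexer collect errors and keep going, while the default sink aborts on the first error. That is the trade-off the ErrorSink property exposes.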
CharSrc | A class that implements ICharSource. In order to write lexers that can accept any source of characters, set CharSrc=ICharSource. For maximum performance when parsing strings (or to avoid memory allocation), set CharSrc=UString (UString is a wrapper around System.String that, among other things, implements ICharSource; note that C# will implicitly convert normal strings to UString for you). |
Type constraint | CharSrc : ICharSource |
Nested classes | |
struct | SavePosition |
A helper class used by LLLPG for backtracking. | |
Public static fields | |
static readonly IMessageSink | FormatExceptionErrorSink |
Throws FormatException when it receives an error. Non-errors are sent to MessageSink.Current. | |
Properties | |
IMessageSink | ErrorSink [get, set] |
Gets or sets the object to which error messages are sent. The default object is FormatExceptionErrorSink, which throws FormatException if an error occurs. | |
int | LA0 [get, set] |
CharSrc | CharSource [get] |
string | FileName [get] |
int | InputPosition [get, set] |
LexerSourceFile< CharSrc > | SourceFile [get] |
int | LineNumber [get] |
Current line number. Starts at 1 for the first line, unless a derived class changes it. | |
int | LineStartAt [get] |
Index at which the current line started. | |
Public Member Functions | |
BaseLexer (CharSrc chars, string fileName="", int inputPosition=0, bool newSourceFile=true) | |
Initializes BaseLexer. | |
virtual void | Reset (CharSrc chars, string fileName="", int inputPosition=0, bool newSourceFile=true) |
Reinitializes the object. This method is called by the constructor. | |
SourcePos | IndexToLine (int index) |
Returns the position in a source file of the specified index. | |
Protected Member Functions | |
void | Reset () |
int | LA (int i) |
void | Skip () |
Increments InputPosition. Called by LLLPG when prediction already verified the input (and caller doesn't save LA(0)). | |
virtual void | AfterNewline () |
The lexer must call this method exactly once after it advances past each newline, even inside comments and strings. This method keeps the LineNumber and LineStartAt properties updated. | |
void | Newline () |
Default newline parser that matches '\n' or '\r' unconditionally. | |
void | Spaces () |
Skips past any spaces at the current position. Equivalent to rule Spaces @[ (' '|'\t')* ] in LLLPG. | |
int | MatchAny () |
int | Match (HashSet< int > set) |
int | Match (int a) |
int | Match (int a, int b) |
int | Match (int a, int b, int c) |
int | Match (int a, int b, int c, int d) |
int | MatchRange (int aLo, int aHi) |
int | MatchRange (int aLo, int aHi, int bLo, int bHi) |
int | MatchExcept () |
int | MatchExcept (HashSet< int > set) |
int | MatchExcept (int a) |
int | MatchExcept (int a, int b) |
int | MatchExcept (int a, int b, int c) |
int | MatchExcept (int a, int b, int c, int d) |
int | MatchExceptRange (int aLo, int aHi) |
int | MatchExceptRange (int aLo, int aHi, int bLo, int bHi) |
bool | TryMatch (HashSet< int > set) |
bool | TryMatch (int a) |
bool | TryMatch (int a, int b) |
bool | TryMatch (int a, int b, int c) |
bool | TryMatch (int a, int b, int c, int d) |
bool | TryMatchRange (int aLo, int aHi) |
bool | TryMatchRange (int aLo, int aHi, int bLo, int bHi) |
bool | TryMatchExcept () |
bool | TryMatchExcept (HashSet< int > set) |
bool | TryMatchExcept (int a) |
bool | TryMatchExcept (int a, int b) |
bool | TryMatchExcept (int a, int b, int c) |
bool | TryMatchExcept (int a, int b, int c, int d) |
bool | TryMatchExceptRange (int aLo, int aHi) |
bool | TryMatchExceptRange (int aLo, int aHi, int bLo, int bHi) |
virtual void | Check (bool expectation, string expectedDescr="") |
virtual void | Error (int lookaheadIndex, string message) |
This method is called to handle errors that occur during lexing. | |
virtual void | Error (int lookaheadIndex, string format, params object[] args) |
This method is called to format and handle errors that occur during lexing. The default implementation sends errors to ErrorSink, which, by default, throws a FormatException. | |
virtual void | Error (bool inverted, int range0lo, int range0hi) |
virtual void | Error (bool inverted, params int[] ranges) |
virtual void | Error (bool inverted, IList< int > ranges) |
virtual void | Error (bool inverted, HashSet< int > set) |
string | RangesToString (IList< int > ranges) |
Converts a list of character ranges to a string, e.g. for input list {'*','*','a','z'}, the output is "'*' 'a'..'z'". | |
void | PrintChar (int c, StringBuilder sb) |
Prints a character as a string, e.g. 'a' -> "'a'", with the special value -1 representing EOF, so PrintChar(-1, ...) == "EOF". | |
Static Protected Member Functions | |
static HashSet< int > | NewSet (params int[] items) |
static HashSet< int > | NewSetOfRanges (params int[] ranges) |
Protected fields | |
int | CachedBlockSize = 128 |
int | _lineStartAt |
int | _lineNumber = 1 |
BaseLexer (CharSrc chars, string fileName="", int inputPosition=0, bool newSourceFile=true) | inline |
Initializes BaseLexer.
chars | A source of characters, e.g. UString. |
fileName | A file name associated with the characters, which will be used for error reporting. |
inputPosition | A location to start lexing (normally 0). Careful: If you're starting to lex in the middle of the file, the LineNumber still starts at 1, and (if newSourceFile is true) the SourceFile object may or may not discover line breaks prior to the starting point, depending on how it is used. |
newSourceFile | Whether to create a LexerSourceFile{C} object (an implementation of ISourceFile) to keep track of line boundaries. The SourceFile property will point to this object, and it will be null if this parameter is false. Using 'false' will avoid memory allocation, but prevent you from mapping character positions to line numbers and vice versa. However, this object will still keep track of the current LineNumber and LineStartAt (the index where the current line started) when this parameter is false. |
virtual void AfterNewline () | inline, protected, virtual |
The lexer must call this method exactly once after it advances past each newline, even inside comments and strings. This method keeps the LineNumber and LineStartAt properties updated.
Reimplemented in Loyc.Syntax.Lexing.BaseILexer< CharSrc, Token >, and Loyc.Syntax.Lexing.LexerSource< CharSrc >.
virtual void Error (int lookaheadIndex, string message) | inline, protected, virtual |
This method is called to handle errors that occur during lexing.
lookaheadIndex | Index where the error occurred, relative to the current InputPosition (i.e. InputPosition + lookaheadIndex is the position of the error). |
message | An error message, not including the error location. |
Reimplemented in Loyc.Syntax.Lexing.LexerSourceWorkaround< CharSrc >, and Loyc.Syntax.Lexing.LexerSource< CharSrc >.
virtual void Error (int lookaheadIndex, string format, params object[] args) | inline, protected, virtual |
This method is called to format and handle errors that occur during lexing. The default implementation sends errors to ErrorSink, which, by default, throws a FormatException.
lookaheadIndex | Index where the error occurred, relative to the current InputPosition (i.e. InputPosition + lookaheadIndex is the position of the error). |
format | An error description with argument placeholders. |
args | Arguments to insert into the error message. |
Reimplemented in Loyc.Syntax.Lexing.LexerSourceWorkaround< CharSrc >, and Loyc.Syntax.Lexing.LexerSource< CharSrc >.
SourcePos IndexToLine (int index) | inline |
Returns the position in a source file of the specified index.
If index is negative, this should return a SourcePos where Line and PosInLine are zero (signifying an unknown location). If index is beyond the end of the file, this should return the final position in the file.
Implements Loyc.Syntax.IIndexToLine.
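The contract described above (zero for an unknown location, clamping past the end of the file) can be sketched in a few lines of standalone code. `LineIndex` is a hypothetical class, not part of Loyc, and it returns a plain tuple instead of SourcePos. It assumes that Line and PosInLine are 1-based, which fits with zero meaning "unknown", and that "\r\n" is a single line break.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical index-to-line mapper; NOT Loyc's LexerSourceFile or SourcePos.
public class LineIndex
{
    private readonly List<int> _starts = new List<int> { 0 }; // start index of each line
    private readonly int _length;

    public LineIndex(string text)
    {
        _length = text.Length;
        for (int i = 0; i < text.Length; i++) {
            bool crBeforeLf = text[i] == '\r' && i + 1 < text.Length && text[i + 1] == '\n';
            // A line break ends at '\n', or at a '\r' that is not followed by '\n'.
            if (text[i] == '\n' || (text[i] == '\r' && !crBeforeLf))
                _starts.Add(i + 1);
        }
    }

    // Maps a character index to (Line, PosInLine), both 1-based.
    // (0, 0) signals an unknown location (negative index).
    public (int Line, int PosInLine) IndexToLine(int index)
    {
        if (index < 0) return (0, 0);
        if (index > _length) index = _length;     // clamp to the final position
        int found = _starts.BinarySearch(index);
        // On a miss, BinarySearch returns the complement of the index of the
        // next larger element, so the containing line is one before that.
        int line = found >= 0 ? found : ~found - 1;
        return (line + 1, index - _starts[line] + 1);
    }
}
```

Keeping the line starts in a sorted list makes each lookup a binary search, at the cost of allocating the list. That cost is the kind of allocation the newSourceFile=false option of the constructor lets you avoid when you do not need this mapping.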
void Newline () | inline, protected |
Default newline parser that matches '\n' or '\r' unconditionally.
You can use this implementation in an LLLPG lexer with "extern", like so: extern rule Newline @[ '\r' + '\n'? | '\n' ];
By using this implementation everywhere in the grammar in which a newline is allowed (even inside comments and strings), you can ensure that AfterNewline() is called, so that the line number is updated properly.
void PrintChar (int c, StringBuilder sb) | inline, protected |
Prints a character as a string, e.g. 'a' -> "'a'", with the special value -1 representing EOF, so PrintChar(-1, ...) == "EOF".
string RangesToString (IList< int > ranges) | inline, protected |
Converts a list of character ranges to a string, e.g. for input list {'*','*','a','z'}, the output is "'*' 'a'..'z'".
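The output format of PrintChar and RangesToString can be reproduced with a short standalone sketch. The `RangeFormatter` class below is hypothetical and is not Loyc's code. It assumes the list holds (lo, hi) pairs, as the documented example suggests, and it does not escape control characters or quotes, which a real implementation may well do.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical re-creation of the documented formatting; NOT Loyc's implementation.
public static class RangeFormatter
{
    // Appends one character in quoted form, or "EOF" for the special value -1.
    public static void PrintChar(int c, StringBuilder sb)
    {
        if (c == -1) sb.Append("EOF");
        else sb.Append('\'').Append((char)c).Append('\'');
    }

    // ranges holds (lo, hi) pairs: { lo0, hi0, lo1, hi1, ... }.
    // A pair with lo == hi prints as a single character.
    public static string RangesToString(IList<int> ranges)
    {
        var sb = new StringBuilder();
        for (int i = 0; i + 1 < ranges.Count; i += 2) {
            if (i > 0) sb.Append(' ');
            PrintChar(ranges[i], sb);
            if (ranges[i + 1] != ranges[i]) {
                sb.Append("..");
                PrintChar(ranges[i + 1], sb);
            }
        }
        return sb.ToString();
    }
}
```

With the input from the description, `RangeFormatter.RangesToString(new int[] { '*', '*', 'a', 'z' })` yields `'*' 'a'..'z'`.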
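virtual void Reset (CharSrc chars, string fileName="", int inputPosition=0, bool newSourceFile=true) | inline, virtual |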
Reinitializes the object. This method is called by the constructor.
See the constructor for documentation of the parameters.
This method can be used to avoid memory allocations when you need to parse many small strings in a row. If that's your goal, you should set the newSourceFile parameter to false if possible.
Reimplemented in Loyc.Syntax.Lexing.BaseILexer< CharSrc, Token >, and Loyc.Syntax.Lexing.LexerSource< CharSrc >.
void Skip () | inline, protected |
Increments InputPosition. Called by LLLPG when prediction already verified the input (and caller doesn't save LA(0)).
void Spaces () | inline, protected |
Skips past any spaces at the current position. Equivalent to rule Spaces @[ (' '|'\t')* ] in LLLPG.
static readonly IMessageSink FormatExceptionErrorSink | static |
Throws FormatException when it receives an error. Non-errors are sent to MessageSink.Current.
IMessageSink ErrorSink | get, set |
Gets or sets the object to which error messages are sent. The default object is FormatExceptionErrorSink, which throws FormatException if an error occurs.
int LineNumber | get |
Current line number. Starts at 1 for the first line, unless a derived class changes it.
int LineStartAt | get, protected |
Index at which the current line started.