Contains classes related to lexical analysis, such as the universal token type (Loyc.Syntax.Lexing.Token) and Loyc.Syntax.Lexing.TokensToTree.

Classes
class	BaseILexer< CharSrc, Token >
	A version of BaseLexer{CharSrc} that implements ILexer{Token}. You should use this base class if you want to wrap your lexer in a postprocessor such as IndentTokenGenerator or TokensToTree.

class	BaseLexer
	Alias for BaseLexer{C} where C is ICharSource.

class	BaseLexer< CharSrc >
	The recommended base class for lexers generated by LLLPG, when not using the `inputSource` option.

interface	ILexer< Token >
	A standard interface for lexers.

interface	ILllpgApi< Token, MatchType, LaType >
	For reference purposes, this interface is a list of the non-static methods that LLLPG expects to be able to call when it is generating code. LLLPG does not actually need lexers and parsers to implement this interface; they simply need to implement the same set of methods as this interface contains.

interface	ILllpgLexerApi< Token >
	For reference purposes, this interface contains the non-static methods that LLLPG expects lexers to implement. LLLPG does not actually expect lexers to implement this interface; they simply need to implement the same set of methods as this interface contains.

class	IndentTokenGenerator
	A preprocessor usually inserted between the lexer and parser that inserts "indent", "dedent", and "end-of-line" tokens at appropriate places in a token stream.

class	IndentTokenGenerator< Token >
	A preprocessor usually inserted between the lexer and parser that inserts "indent", "dedent", and "end-of-line" tokens at appropriate places in a token stream.

interface	ISimpleToken
	Alias for ISimpleToken{int}.

interface	ISimpleToken< TokenType >
	Basic information about a token as expected by BaseParser{Token}: a token Type, which is the type of a "word" in the program (string, identifier, plus sign, etc.), a value (e.g. the name of an identifier), and an index where the token starts in the source file.

interface	IToken< TT >
	The methods of Token in the form of an interface.

class	LexerSource
	A synonym for LexerSource{C} where C is ICharSource.

class	LexerSource< CharSrc >
	An implementation of the LLLPG Lexer API, used with the LLLPG options `inputSource` and `inputClass`.

class	LexerSourceFile< CharSource >
	Adds the AfterNewline method to SourceFile.

class	LexerSourceWorkaround< CharSrc >
	This class only exists to work around a limitation of the C# language: "cannot change access modifiers when overriding 'protected' inherited member Error(...)".

class	LexerWrapper< Token >
	A base class for wrappers that modify lexer behavior. Implements the ILexer interface, except for the NextToken() method.

struct	Token
	A common token type recommended for Loyc languages that want to use features such as token literals or the TokensToTree class.

class	TokenListAsLexer
	Adapter: converts `IEnumerable{Token}` to the ILexer{Token} interface.

class	TokensToTree
	A preprocessor usually inserted between the lexer and parser that converts a token list into a token tree. Everything inside brackets, parentheses or braces is made a child of the opening bracket.

class	TokenTree
	A list of Token structures along with the ISourceFile object that represents the source file that the tokens came from.

class	WhitespaceFilter
	Alias for `WhitespaceFilter{Token}`.

class	WhitespaceFilter< Token >
	Filters out tokens whose `Value` is WhitespaceTag.Value.

class	WhitespaceTag
	WhitespaceTag.Value can be used as the Token.Value of whitespace tokens, to make whitespace easy to filter out.

Enumerations
enum	TokenKind {
	Spaces = 0x0000, Comment = 0x0100, Id = 0x0200, Literal = 0x0300,
	Dot = 0x0600, Assignment = 0x0700, Operator = 0x0800, Separator = 0x0900,
	AttrKeyword = 0x0A00, TypeKeyword = 0x0B00, OtherKeyword = 0x0C00, Other = 0x0F00,
	LParen = 0x1000, RParen = 0x1100, LBrack = 0x1200, RBrack = 0x1300,
	LBrace = 0x1400, RBrace = 0x1500, Indent = 0x1600, Dedent = 0x1700,
	LOther = 0x1800, ROther = 0x1900, KindMask = 0x1F00
}
	A list of token categories that most programming languages have.

Detailed Description

Contains classes related to lexical analysis, such as the universal token type (Loyc.Syntax.Lexing.Token) and Loyc.Syntax.Lexing.TokensToTree.

Enumeration Type Documentation

enum Loyc.Syntax.Lexing.TokenKind

A list of token categories that most programming languages have.

Some Loyc languages will support the concept of a "token literal", which is a TokenTree, and some DSLs will rely on these token literals for input. However, tokens differ between languages; for instance, the set of operators varies from one language to another. On the other hand, most languages do have some concept of "an operator" and "an identifier", and TokenKind reflects this fact.

When you are using Token to represent tokens in your language, it is recommended to define every value of your "TokenType" enumeration in terms of TokenKind using integer offsets, like this:

enum MyTokenType {
    EOF         = TokenKind.Spaces,
    Id          = TokenKind.Id,
    IfKeyword   = TokenKind.OtherKeyword,
    ForKeyword  = TokenKind.OtherKeyword + 1,
    LoopKeyword = TokenKind.OtherKeyword + 2,
    ...
    MulOp   = TokenKind.Operator,
    AddOp   = TokenKind.Operator + 1,
    DivOp   = TokenKind.Operator + 2,
    DotOp   = TokenKind.Dot,
    ...
}

Using TokenKind is only important if you intend to support DSLs via token literals (e.g. LLLPG) in your language.

A DSL that needs only simple tokens such as "strings", "identifiers" and "dots" can write a parser based on the values of Token.Kind alone. If it needs specific operators or "keywords" that do not have a dedicated TokenKind, such as + and %, it can additionally check the Value of the token. For that purpose, the host language is expected to put a global Symbol in Token.Value to represent operators, keywords and identifiers.
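
The following is a minimal sketch of how a token's category might be recovered from a token type that was defined as an offset from TokenKind. It assumes that the category is obtained by masking the integer value with KindMask (0x1F00), which the enumeration values suggest but which this page does not state. The TokenKind declared here is a partial local copy so that the sketch compiles without Loyc.Syntax, and MyTokenType, TokenKindDemo and KindOf are hypothetical names used only for illustration. The explicit (int) casts are a conservative choice for a standalone example.

```csharp
using System;

// Partial local copy of TokenKind values listed on this page (not the Loyc type itself).
enum TokenKind
{
    Spaces       = 0x0000,
    Id           = 0x0200,
    Dot          = 0x0600,
    Operator     = 0x0800,
    OtherKeyword = 0x0C00,
    KindMask     = 0x1F00
}

// Hypothetical token type: each value is a TokenKind plus a small offset.
enum MyTokenType
{
    EOF         = (int)TokenKind.Spaces,
    Id          = (int)TokenKind.Id,
    IfKeyword   = (int)TokenKind.OtherKeyword,
    ForKeyword  = (int)TokenKind.OtherKeyword + 1,
    LoopKeyword = (int)TokenKind.OtherKeyword + 2,
    MulOp       = (int)TokenKind.Operator,
    AddOp       = (int)TokenKind.Operator + 1,
    DivOp       = (int)TokenKind.Operator + 2,
    DotOp       = (int)TokenKind.Dot
}

static class TokenKindDemo
{
    // Assumption: masking with KindMask discards the offset and leaves the category.
    public static TokenKind KindOf(MyTokenType type)
    {
        return (TokenKind)((int)type & (int)TokenKind.KindMask);
    }

    public static void Main()
    {
        Console.WriteLine(KindOf(MyTokenType.ForKeyword)); // OtherKeyword
        Console.WriteLine(KindOf(MyTokenType.DivOp));      // Operator
        Console.WriteLine(KindOf(MyTokenType.DotOp));      // Dot
    }
}
```

Under this scheme, a DSL parser can branch on the category alone and remain unaware of which specific keyword or operator the host language defined, as long as each offset stays below 0x100.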

Enumerator
Spaces	Spaces, tabs, non-semantic newlines, and EOF. Spaces and comments are typically filtered out before parsing and will not appear in token literals.
Comment	Single- and multi-line comments. Spaces and comments are typically filtered out before parsing and will not appear in token literals.
Id	Simple identifiers
Literal	Literals, such as numbers and strings.
Dot	Scope operator (dot and dot-like ops such as :: in C++)
Assignment	Simple or compound assignment
Operator	All operators except assignment, dot, or separators
Separator	e.g. semicolon, comma (if not considered an operator)
AttrKeyword	e.g. public, private, static, virtual
TypeKeyword	e.g. int, bool, double, void
OtherKeyword	e.g. sizeof, struct
Other	For token types not covered by other token kinds.

Documentation moved to ecsharp.net
