Enhanced C#
Language of your choice: library documentation
Contains Precedence objects that represent the precedence levels of LES.
In LES, the precedence of an operator is decided based simply on the text of the operator. The precedence of each one-character operator is predefined; the precedence of any operator with two or more characters is decided based on the last character, or the first and last character; the middle characters, if any, do not affect precedence.
The LES precedence table is designed to match most programming languages.
As a nod to functional languages, the arrow operator "->" is right-associative and has a precedence below '*' so that int * int -> int parses as (int * int) -> int rather than int * (int -> int) as in the C family of languages.
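The effect of this placement can be checked with a small precedence-climbing parser. The Python sketch below is not part of the Loyc library; it assumes that "*" sits at the Multiply level (70) and "->" at the Arrow level (65), and the names OPS and parse are hypothetical, chosen only for illustration.

```python
# Precedence numbers copied from the Multiply and Arrow fields of this class.
# Assumption: "*" is at the Multiply level and "->" is at the Arrow level.
OPS = {
    "*":  (70, "left"),
    "->": (65, "right"),
}

def parse(tokens, min_prec=0):
    """Parse a flat token list into a fully parenthesized string."""
    left = tokens.pop(0)                      # an operand (identifier)
    while tokens and tokens[0] in OPS:
        prec, assoc = OPS[tokens[0]]
        if prec < min_prec:
            break
        op = tokens.pop(0)
        # A left-associative operator demands a strictly higher level on its
        # right side; a right-associative one accepts its own level again.
        next_min = prec + 1 if assoc == "left" else prec
        right = parse(tokens, next_min)
        left = f"({left} {op} {right})"
    return left

print(parse("int * int -> int".split()))   # ((int * int) -> int)
print(parse("a -> b -> c".split()))        # (a -> (b -> c))
```

Because "->" is right-associative in this sketch, a chain of arrows nests to the right, which is the usual reading of function types in functional languages.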
An operator consists of a sequence of the following characters:
~ ! % ^ & * \ - + = | < > / ? : . $
Or a backslash (\) followed by a sequence of the above characters and/or letters, numbers, underscores or #s. Or a string with `backtick quotes`.
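As a rough illustration of the three operator forms just described, here is a Python sketch of a tokenizer. It is not the real LES lexer: the names OP_CLASS, TOKEN and tokenize are hypothetical, identifiers are approximated as runs of word characters and #, anything else (such as "@" or parentheses) is silently skipped, and the separators "," and ";" are assumed to always form tokens of their own.

```python
import re

# The operator characters listed above, escaped for use inside a regex class.
OP_CLASS = r"~!%^&*\\\-+=|<>/?:.$"

TOKEN = re.compile("|".join([
    r"`[^`]*`",                     # a string with backtick quotes
    r"\\[" + OP_CLASS + r"\w#]+",   # backslash, then operator chars, letters, digits, _ or #
    "[" + OP_CLASS + "]+",          # a run of plain operator characters
    "[,;]",                         # a separator never merges with an operator
    r"[\w#]+",                      # identifier-like run, included for context
]))

def tokenize(text):
    """Return the tokens found in text; whitespace is dropped."""
    return TOKEN.findall(text)

print(tokenize("x >>= y;"))   # ['x', '>>=', 'y', ';']
print(tokenize("?,!"))        # ['?', ',', '!']
```

The alternatives are tried in order, so a backslash followed by letters is matched as a single backslash operator before the plain operator rule gets a chance to match the backslash alone.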
"@" is not considered an operator. It is used to mark a sequence of punctuation and/or non-punctuation characters as an identifier, a symbol, or a special literal. "#" is not an operator; like an underscore, the hash sign is considered to be an identifier character, and while it is conventionally used to mark "keywords", the parser does not assign any special meaning to it.
"," and ";" are not considered operators; rather they are separators, and they cannot be combined with operators. For example, "?,!" is parsed as three separate tokens.
The following table shows all the precedence levels and associativities of the "built-in" LES operators, except `backtick` and the "lambda" operator =>, which is special. Each precedence level has a name, which corresponds to a static field of this class. All binary operators are left-associative unless otherwise specified.
Not listed in table: binary => ~ <> `backtick`; prefix / \ < > ? =
Notice that the precedence of an operator depends on how it is used. The prefix operator '-' has higher precedence than the binary operator '-', so for example - y * z is parsed as (- y) * z, while x - y * z is parsed as x - (y * z).
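The two parses above can be reproduced with a short Python sketch. It assumes that prefix '-' uses the Prefix level (85), binary '*' the Multiply level (70) and binary '-' the Add level (60); the names PREFIX, BINARY and parse are hypothetical and the code is an illustration, not the LES parser.

```python
PREFIX = 85                     # assumed level of prefix "-" (the Prefix field)
BINARY = {"*": 70, "-": 60}     # assumed Multiply and Add levels, left-associative

def parse(tokens, min_prec=0):
    """Parse a flat token list into a fully parenthesized string."""
    tok = tokens.pop(0)
    if tok == "-":
        # A "-" found where an operand is expected is the prefix operator.
        # Its operand is parsed at the Prefix level, so it cannot absorb
        # a binary operator that binds less tightly than Prefix.
        left = f"(- {parse(tokens, PREFIX)})"
    else:
        left = tok
    # A binary operator continues the expression only if it binds more
    # tightly than the level required by the caller.
    while tokens and BINARY.get(tokens[0], -1) > min_prec:
        op = tokens.pop(0)
        right = parse(tokens, BINARY[op])
        left = f"({left} {op} {right})"
    return left

print(parse("- y * z".split()))     # ((- y) * z)
print(parse("x - y * z".split()))   # (x - (y * z))
```

The same token "-" is looked up in a different place depending on its position, which is all that "precedence depends on how it is used" means in this sketch.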
The Lambda operator =>, which is right-associative, has a precedence level above Multiply on the left side, but below Assign on the right side. For example, the expression a = b => c = d is parsed as a = (b => (c = d)), and similarly a + b => c + d is parsed as a + (b => (c + d)), but a ** b => c ** d is parsed as (a ** b) => (c ** d). The idea of two different precedences on the two sides of an operator may seem strange; see the documentation of Precedence for more explanation.
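To make the two-sided precedence concrete, the Python sketch below gives every operator a (left, right) pair, using the first two constructor arguments of the Power, Add, Assign and Lambda fields of this class. The comparison rule (an operator may continue the expression only if its left value exceeds the right value of the operator before it) is an assumption that happens to reproduce the three parses above; the names OPS and parse are hypothetical.

```python
# (left, right) pairs read from the first two constructor arguments of the fields.
OPS = {
    "**": (80, 80),   # Power
    "+":  (60, 60),   # Add
    "=":  (6, 5),     # Assign
    "=>": (77, 0),    # Lambda: binds tightly on the left, loosely on the right
}

def parse(tokens, floor=-1):
    """Parse alternating operands and operators into a parenthesized string."""
    left = tokens.pop(0)
    # Assumed rule: continue only while the next operator's LEFT precedence
    # is greater than the floor set by the operator that precedes this operand.
    while tokens and OPS[tokens[0]][0] > floor:
        op = tokens.pop(0)
        # The operator's RIGHT precedence becomes the floor for its right side.
        right = parse(tokens, OPS[op][1])
        left = f"({left} {op} {right})"
    return left

print(parse("a = b => c = d".split()))     # (a = (b => (c = d)))
print(parse("a + b => c + d".split()))     # (a + (b => (c + d)))
print(parse("a ** b => c ** d".split()))   # ((a ** b) => (c ** d))
```

Under this reading, a pair such as (6, 5) makes an operator right-associative, and a pair of equal numbers makes it left-associative.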
In addition to these, the binary `backtick` operators have a "precedence range" that is above Compare and below Power. This means that they are immiscible with the Multiply, Add, Arrow, AndBits, OrBits, OrIfNull, PrefixDots, and Range operators, as explained in the documentation of Precedence.
After constructing an initial table based on common operators from other languages, I noticed that
I also wanted to have a little "room to grow": to defer the precedence decision to a future time for some operators. So the precedence of the binary operators ~ and <> is constrained to be above Compare and below NullDot; mixing one of these operators with any operator in this range will produce a "soft" parse error (meaning that parsing still proceeds but the exact precedence is undefined).
The operators / \ < > ? = can be used as prefix operators, but their precedence is similarly undefined (but definitely above Compare and below NullDot).
The way that low-precedence prefix operators are parsed deserves some discussion... TODO.
Most operators can have two roles: they can be either binary operators or prefix operators. For example, !*! is a binary operator in x !*! y but a prefix operator in x + !*! y.
The operators ++ and -- also have two roles, but different ones: they can be either prefix or suffix operators, but not binary operators. For example, -*- is a suffix operator in x -*- + y and a prefix operator in x + -*- y. Please note that x -*- y is ambiguous (it could be parsed as either of two superexpressions, (x -*-) (y) or (x) (-*- y)) and it is illegal.
Operators that end with $ can only be prefix operators (not binary or suffix). Operators that start and end with \ can only be suffix (not binary or prefix) operators. Having only a single role makes these operators unambiguous inside superexpressions.
An operator cannot have all three roles (suffix, prefix and binary); that would be overly ambiguous. For example, if "-" could also be a suffix operator then x - + y could be parsed as (x -) + y as well as x - (+ y). More subtly, LES does not define any operators that could take binary or suffix roles, because that would also be ambiguous. For example, suppose |?| is a binary or suffix operator, but not a prefix operator. Clearly x |?| y and x |?| |?| y are unambiguous, but x |?| + y is ambiguous: it could be parsed as (x |?|) + y or x |?| (+ y). It turns out that a computer language can contain operators that serve as binary and prefix operators, OR it can contain operators that serve as binary and suffix operators, but a language is ambiguous if it has both kinds of operators at the same time.
To determine the precedence of any given operator, first you must decide, mainly based on the context in which the operator appears and the text of the operator, whether it is a prefix, binary, or suffix operator. Suffix operators can only be derived from the operators ++, -- and \ ("derived" means that you can add additional operator characters in the middle, e.g. +++ and -%- can be prefix or suffix operators).
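The role rules described so far can be summarized as a small classification function. The Python sketch below is one reading of those rules, not the actual LES implementation: the name roles is hypothetical, "derived from ++ or --" is interpreted as "at least two characters, starting and ending with the same + or -", the backslash rule is assumed to need at least two characters, and every other operator is given the default binary and prefix roles even though the text only says "most".

```python
def roles(op):
    """Return the set of roles that the text of an operator allows."""
    if op.endswith("$"):
        return {"prefix"}                       # ends with $: prefix only
    if len(op) >= 2 and op[0] == "\\" and op[-1] == "\\":
        return {"suffix"}                       # starts and ends with \: suffix only
    if len(op) >= 2 and op[0] == op[-1] and op[0] in "+-":
        return {"prefix", "suffix"}             # derived from ++ or --
    return {"prefix", "binary"}                 # assumed default for everything else

for op in ["!*!", "-*-", "+++", "-", "$"]:
    print(op, sorted(roles(op)))
```

Which of the permitted roles an operator actually takes in a given expression still depends on where it appears, as the text explains.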
If an operator starts with a backslash (\), the backslash is not considered part of the operator name and it is not used for the purpose of choosing precedence either (rather, it is used to allow letters and digits in the operator name). A `backquoted` operator always has precedence of Backtick and again, the backticks are not considered part of the operator name.
Next, if the operator is only one character, simply find it in the above table. If the operator is two or more characters, take the first character A and the last character Z, and apply the following rules in order:
The first two rules are special cases that exist for the sake of the shift operators, so that ">>=" has the same precedence as "=" instead of ">=".
Please note that the plain colon ':' is not treated as an operator at statement level; it is assumed to introduce a nested block, as in the languages Python and Boo (e.g. "if x: y();" is interpreted as "if x { y(); }"). However, ':' is allowed as an operator inside a parenthesized expression. ([June 2014] Python-style blocks are not yet implemented.)
The double-colon :: has the "wrong" precedence according to C# and C++ rules; a.b::c.d is parsed as (a.b)::(c.d) although it would be parsed as ((a.b)::c).d in C# and C++. The change in precedence allows double colon to be used for variable declarations in LeMP, as in x::System.Drawing.Point. The lower precedence allows this to be parsed properly, but it sacrifices full fidelity with C#/C++.
There are no ternary operators in LES. '?' and ':' are right-associative binary operators, so c ? a : b is parsed as c ? (a : b). The lack of an official ternary operator reduces the complexity of the parser; C-style conditional expressions could still be parsed in LEL with the help of a macro, but they are generally not necessary since the if-else superexpression is preferred: if c a else b.
I suppose I should also mention the way operators map to function names. In LES, there is no semantic distinction between operators and functions; x += y is equivalent to the function call @+=(x, y), and the actual name of the function is "+=" (the @ character informs the lexer that a special identifier name follows). Thus, the name of most operators exactly matches the operator; the + operator is named "+", the |*| operator is named "|*|", and so forth. There are a couple of exceptions:
`backquotes`, the backquotes are not part of the name either; > and > and > differ only in precedence.

Public static fields
static readonly Precedence Substitute = new Precedence(106, 105)
static readonly Precedence Primary = new Precedence(100)
static readonly Precedence NullDot = new Precedence(95)
static readonly Precedence DoubleBang = new Precedence(91, 90)
static readonly Precedence Prefix = new Precedence(85)
static readonly Precedence Power = new Precedence(80)
static readonly Precedence Suffix2 = new Precedence(75)
static readonly Precedence Multiply = new Precedence(70)
static readonly Precedence Arrow = new Precedence(65)
static readonly Precedence Add = new Precedence(60)
static readonly Precedence Shift = new Precedence(55, 55, 55, 70)
static readonly Precedence PrefixDots = new Precedence(50)
static readonly Precedence Range = new Precedence(45)
static readonly Precedence OrIfNull = new Precedence(40, 40, 40, 76)
static readonly Precedence Backtick = new Precedence(40, 40, 40, 75)
static readonly Precedence Reserved = new Precedence(40, 40, 40, 90)
static readonly Precedence Compare = new Precedence(35)
static readonly Precedence AndBits = new Precedence(30, 30, 25, 50)
static readonly Precedence OrBits = new Precedence(25, 25, 25, 50)
static readonly Precedence And = new Precedence(20)
static readonly Precedence Or = new Precedence(15)
static readonly Precedence IfElse = new Precedence(11, 10)
static readonly Precedence Assign = new Precedence(6, 5)
static readonly Precedence Lambda = new Precedence(77, 0, -1, -1)
static readonly Precedence PrefixOr = new Precedence(0)
static readonly Precedence SuperExpr = new Precedence(-5)