Goodbye, ANTLR

This post was imported from blogspot.

Three days ago, after finding workarounds for the ANTLR3 (C#) bugs detailed here, I immediately ran into even more bugs. For instance, I had a rule that said:

SL_COMMENT: '#' (~NEWLINE_CHAR)*;

Somehow the generated code for this rule included a check (during the matching stage, if I remember correctly) that said, in essence, "if the comment contains a slash character, generate a syntax error". What the hell? There was another bug besides that one, which I've since forgotten. My bug report on the first batch of bugs went mostly unacknowledged, so I didn't bother to try isolating this new problem.

Instead, I'm planning to try another approach: I'll make my own ANTLR. I bought the ANTLR book on May 26, and I've been unable to get the thing to work for me since then. I'm getting impatient. I know how an LL parser generator should behave, so I ought to be able to make one... right?

Of course, I would like a parser generator done the Loyc way - as an extension to Loyc. But it'll be a little bit tricky to do this, because Loyc does not actually exist yet. It's still in the planning stages! There are no AST classes, no ONEP. So what will I do?

Well, my initial goal will be a translator from boo to boo. I'll make some AST classes and give them the ability to print themselves out as source code. Then I'll create a lexer and tree parser by hand; as for the main parser, I'm not sure how to approach it. But after I've done those things, I'll write some routines for printing out AST nodes as text. So it will be able to read source code and spit it back out.
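
To give a rough idea of what "AST classes that print themselves out as source code" could look like, here is a minimal sketch. It is written in Python purely for brevity (the real thing would be C#), and the class names `Name`, `Call` and `Assign` and the `to_source` method are hypothetical, not part of any existing Loyc design.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Name:
    """An identifier such as `x` or `print` (hypothetical node type)."""
    ident: str

    def to_source(self) -> str:
        return self.ident


@dataclass
class Call:
    """A call expression such as `f(x, y)` (hypothetical node type)."""
    target: Name
    args: List[object]

    def to_source(self) -> str:
        # Each argument prints itself; the call only adds punctuation.
        printed_args = ", ".join(arg.to_source() for arg in self.args)
        return self.target.to_source() + "(" + printed_args + ")"


@dataclass
class Assign:
    """An assignment statement such as `x = f(y)` (hypothetical node type)."""
    target: Name
    value: object

    def to_source(self) -> str:
        return self.target.to_source() + " = " + self.value.to_source()


# Build a tiny tree by hand and print it back out as source text.
tree = Assign(Name("x"), Call(Name("f"), [Name("y"), Name("z")]))
print(tree.to_source())
```

Once a parser can build trees like this from real input, calling `to_source()` on the root would complete the round trip from source code to AST and back.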

At this point I've already written a lot of the lexer by hand. I've taken it as an opportunity to figure out how a parser generator should work, by attempting to write the lexer the way a machine would do it. I started by writing the lexer grammar in a hypothetical boo-style syntax; then I translated that grammar--mechanically, by hand--to C# source code.
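
As an illustration of what "the way a machine would do it" means, here is how the SL_COMMENT rule from earlier might be translated mechanically, one grammar element per block of code. This is only a sketch in Python (my actual lexer is in C#); the function name `match_sl_comment` is made up, and I'm assuming NEWLINE_CHAR means a carriage return or a line feed.

```python
# Assumption: NEWLINE_CHAR in the grammar stands for '\r' or '\n'.
NEWLINE_CHARS = "\r\n"


def match_sl_comment(text, pos):
    """Try to match  SL_COMMENT: '#' (~NEWLINE_CHAR)*  starting at text[pos].

    Returns the index just past the comment, or None if the rule
    does not match at this position.
    """
    # '#'
    if pos >= len(text) or text[pos] != "#":
        return None
    pos += 1

    # (~NEWLINE_CHAR)*  -- loop while the next character is NOT a newline
    while pos < len(text) and text[pos] not in NEWLINE_CHARS:
        pos += 1

    return pos


source = "# a/b is fine\nx = 1"
end = match_sl_comment(source, 0)
print(repr(source[:end]))
```

Note that nothing in this translation cares about slashes: any character other than a newline, `/` included, is simply consumed by the loop.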

There is so much work I have to do before I start making the parser generator, though. I fear that by the time I'm done with the prerequisites, I will have forgotten the lessons I'm now learning about making a parser generator. We'll see.