Go 2.0 feature?
One thing I like about Rust and Ruby is how you can enhance these languages
to be more useful for a specific problem domain. The result is called a
domain-specific language, or DSL, nowadays. I think this ability in Ruby is one
of the main reasons it gained popularity as an HTML generation language.
I've toyed with syntax extensibility before. My first one was a Scheme
interpreter with rewriting rules, which allowed a lot of Scheme to be
written in itself. I believe the same could be done for Go. For example,
the parenthesized version of import (and other) statements could be
auto-rewritten to the non-parenthesized form. Go could have a "core"
simpler syntax, with all the syntactic sugar we want written in Go
rewriting rules, and for improved speed, implementations could be free to
directly parse some or all of the larger syntax.
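
Here is a rough sketch of the import rewrite, just to make the idea
concrete. It leans on the standard go/parser and go/ast packages as a
stand-in for a real rewriting-rule engine, which I have not written; the
function name flattenImports is made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// flattenImports returns one non-parenthesized import line for every
// import spec in src. It is a toy stand-in for a syntax rewriting rule.
func flattenImports(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, decl := range file.Decls {
		gen, ok := decl.(*ast.GenDecl)
		if !ok || gen.Tok != token.IMPORT {
			continue
		}
		for _, spec := range gen.Specs {
			imp := spec.(*ast.ImportSpec)
			line := "import "
			if imp.Name != nil { // keep an alias such as `str "strings"`
				line += imp.Name.Name + " "
			}
			out = append(out, line+imp.Path.Value)
		}
	}
	return out, nil
}

func main() {
	src := "package p\n\nimport (\n\t\"fmt\"\n\tstr \"strings\"\n)\n"
	lines, err := flattenImports(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(strings.Join(lines, "\n"))
}
```

A real rule would rewrite the node tree rather than emit text, but the
mapping from the sugared form to the core form is the same.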
This was easy in Scheme because everything was just a list, and there was
little concern about speed. My first approach for applying this to a
modern high performance language was to write a GLR parser, which made
everything easy, but it was way too slow. Last year, I wrote a parser that
assumes the input is made of statements and expressions, where the
statements are newline or semicolon separated, and grouped into blocks with
curly braces or by indentation. I looked up statements in constant time in
a hash table once the keyword signature was found (a linear pass over the
input tokens). An enhanced precedence parser then parsed the expressions
in the statement. It was very fast, but not quite powerful enough. In
particular, it could parse Python, but not JavaScript, because it could not
handle expressions containing statements.
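
To illustrate the statement lookup, here is a minimal sketch. It assumes
that a "keyword signature" is simply the sequence of keywords appearing in
the statement, which is a simplification for this example; the keyword set
and rule names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// keywords is a hypothetical keyword set for a small language.
var keywords = map[string]bool{"if": true, "else": true, "while": true, "return": true}

// statementRules maps a keyword signature to a (made-up) rule name.
var statementRules = map[string]string{
	"if":      "ifStatement",
	"if else": "ifElseStatement",
	"while":   "whileStatement",
	"return":  "returnStatement",
}

// signature makes one linear pass over the tokens and keeps only keywords.
func signature(tokens []string) string {
	var keys []string
	for _, tok := range tokens {
		if keywords[tok] {
			keys = append(keys, tok)
		}
	}
	return strings.Join(keys, " ")
}

// lookupStatement finds the rule for a statement with one hash lookup.
func lookupStatement(tokens []string) (string, bool) {
	rule, ok := statementRules[signature(tokens)]
	return rule, ok
}

func main() {
	rule, ok := lookupStatement(strings.Fields("if x < 3 else y"))
	fmt.Println(rule, ok)
}
```

Once the rule is known, the remaining non-keyword tokens are handed to the
expression parser.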
I haven't finished writing the new parser yet, but I think I've finally got
the spec about right. I dropped statements altogether, and now
everything is just an expression. Conceptually, it operates in two passes
(though they will mostly be combined into one):
1) Parse the input into a node tree with the precedence parser
2) Match the node tree to yacc-like matching rules to find syntax errors
not found by the precedence parser
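
The two passes can be sketched roughly as follows. This is not my actual
parser, only an illustration: the operator table, the whitespace-separated
tokens, and the single matching rule ("the left side of = must be an
identifier") are all assumptions made for the example, and every operator
is treated as left-associative.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Node is a leaf (Op == "") or a binary operator with two children.
type Node struct {
	Op, Text    string
	Left, Right *Node
}

func (n *Node) String() string {
	if n.Op == "" {
		return n.Text
	}
	return "(" + n.Left.String() + " " + n.Op + " " + n.Right.String() + ")"
}

// prec is a hypothetical operator table; higher numbers bind tighter.
var prec = map[string]int{"=": 1, "+": 2, "*": 3}

type parser struct {
	toks []string
	pos  int
}

// parseExpr is pass 1: precedence climbing over binary operators.
func (p *parser) parseExpr(minPrec int) (*Node, error) {
	if p.pos >= len(p.toks) {
		return nil, fmt.Errorf("unexpected end of input")
	}
	tok := p.toks[p.pos]
	if _, isOp := prec[tok]; isOp {
		return nil, fmt.Errorf("unexpected operator %q", tok)
	}
	p.pos++
	left := &Node{Text: tok}
	for p.pos < len(p.toks) {
		op := p.toks[p.pos]
		pr, ok := prec[op]
		if !ok || pr < minPrec {
			break
		}
		p.pos++
		right, err := p.parseExpr(pr + 1)
		if err != nil {
			return nil, err
		}
		left = &Node{Op: op, Left: left, Right: right}
	}
	return left, nil
}

// Parse runs pass 1 over whitespace-separated tokens.
func Parse(src string) (*Node, error) {
	p := &parser{toks: strings.Fields(src)}
	n, err := p.parseExpr(0)
	if err == nil && p.pos != len(p.toks) {
		err = fmt.Errorf("unexpected token %q", p.toks[p.pos])
	}
	return n, err
}

// Match is pass 2: it rejects trees the precedence parser accepted but
// that break a structural rule (here: only an identifier may precede =).
func Match(n *Node) error {
	if n.Op == "" {
		return nil
	}
	if n.Op == "=" && (n.Left.Op != "" || !unicode.IsLetter([]rune(n.Left.Text)[0])) {
		return fmt.Errorf("left side of = must be an identifier, got %s", n.Left)
	}
	if err := Match(n.Left); err != nil {
		return err
	}
	return Match(n.Right)
}

func main() {
	for _, src := range []string{"x = 1 + 2 * 3", "1 = x"} {
		tree, err := Parse(src)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Println(tree, "match error:", Match(tree))
	}
}
```

The second input parses without complaint in pass 1 and is only rejected by
the matching pass.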
To be able to parse languages like Go, I have to allow the operator
precedence tables to change while parsing sub-expressions. For example, in
parameter declarations, comma is just a separator - the lowest precedence
operator forming the root of the parameter expression tree. In block
statements, it has a different precedence, and in the parenthesized form of
import statements, commas aren't allowed at all. Clearly, a fixed
operator-table based precedence parser won't work. Instead, a sub-expression
preceded by an operator keyword can switch the parser to a different
operator table (the keyword is "(" in the parameter parsing example). It
switches back once the largest possible sub-expression has been matched.
So far, I've found this to be powerful enough to parse everything I've had
trouble with before, including Python, JavaScript, and Go.
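
A minimal sketch of the table switching, under made-up assumptions: two
hypothetical tables named "default" and "params", whitespace-separated
tokens, and "(" as the only switching keyword. Comma is an operator only
inside the parentheses.

```go
package main

import (
	"fmt"
	"strings"
)

type Node struct {
	Op, Text    string
	Left, Right *Node
}

func (n *Node) String() string {
	switch {
	case n.Op == "":
		return n.Text
	case n.Right == nil: // a group opened by a switching keyword
		return "(" + n.Left.String() + ")"
	default:
		return "[" + n.Left.String() + " " + n.Op + " " + n.Right.String() + "]"
	}
}

type table map[string]int

// Hypothetical operator tables; comma exists only in "params".
var tables = map[string]table{
	"default": {"+": 1, "*": 2},
	"params":  {",": 1, "+": 2, "*": 3},
}

// tableSwitch names the table used after a keyword and the closing token.
type tableSwitch struct{ inner, closer string }

var switches = map[string]tableSwitch{"(": {"params", ")"}}

type parser struct {
	toks []string
	pos  int
}

func (p *parser) parseExpr(tbl table, minPrec int) (*Node, error) {
	if p.pos >= len(p.toks) {
		return nil, fmt.Errorf("unexpected end of input")
	}
	tok := p.toks[p.pos]
	p.pos++
	var left *Node
	if sw, ok := switches[tok]; ok {
		// Switch tables for the largest sub-expression that follows.
		inner, err := p.parseExpr(tables[sw.inner], 0)
		if err != nil {
			return nil, err
		}
		if p.pos >= len(p.toks) || p.toks[p.pos] != sw.closer {
			return nil, fmt.Errorf("missing %q", sw.closer)
		}
		p.pos++ // back to the caller's table from here on
		left = &Node{Op: tok + sw.closer, Left: inner}
	} else if _, isOp := tbl[tok]; isOp || tok == ")" {
		return nil, fmt.Errorf("unexpected token %q", tok)
	} else {
		left = &Node{Text: tok}
	}
	for p.pos < len(p.toks) {
		op := p.toks[p.pos]
		pr, ok := tbl[op]
		if !ok || pr < minPrec {
			break
		}
		p.pos++
		right, err := p.parseExpr(tbl, pr+1)
		if err != nil {
			return nil, err
		}
		left = &Node{Op: op, Left: left, Right: right}
	}
	return left, nil
}

func Parse(src string) (*Node, error) {
	p := &parser{toks: strings.Fields(src)}
	n, err := p.parseExpr(tables["default"], 0)
	if err == nil && p.pos != len(p.toks) {
		err = fmt.Errorf("unexpected token %q", p.toks[p.pos])
	}
	return n, err
}

func main() {
	fmt.Println(Parse("x + ( a , b * c )"))
	fmt.Println(Parse("x , y"))
}
```

The recursive call returns as soon as it sees a token that is not an
operator in the inner table, which is where the parser switches back.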
Also, the matching phase is quite a bit more powerful than bison/yacc.
Most expression parsers I've written in bison force me to make everything a
generic expression, and then I have to check that boolean expressions are
not being improperly mixed with integer expressions, and I have to verify
constant expressions are actually constant in the semantic checker. This
is because bison can't have multiple legal interpretations for a set of
input tokens. If I'm trying to verify that the expression is valid where
we require a Boolean in a bison parser, I would try to write a rule called
boolExpr which matches only Boolean expressions, and arithExpr which
matches only arithmetic expressions. Similarly, I might want to have a
constExpr rule for matching constant expressions. However, when matching
an identifier, bison doesn't know what it is. It has to pick, and the best it
can do is call it a generic expression, leaving it up to the programmer to
write code to check all the cases for all the types. I just checked the Go
bison parser, and sure enough, it calls everything a generic expression...
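
To show what I mean by allowing multiple interpretations, here is a small
sketch of a matcher in which an identifier can match either boolExpr or
arithExpr. The Kind bit set, the three operator rules, and the helper
names are all invented for this example and are not taken from my parser
or from the Go grammar.

```go
package main

import "fmt"

// Kind is a bit set of the rules a node can match.
type Kind uint8

const (
	Bool Kind = 1 << iota
	Arith
)

type Node struct {
	Op, Text    string
	Left, Right *Node
}

func leaf(text string) *Node            { return &Node{Text: text} }
func bin(op string, l, r *Node) *Node { return &Node{Op: op, Left: l, Right: r} }

// rule gives the kind an operator requires of its operands and produces.
type rule struct{ operand, result Kind }

// rules is a hypothetical rule set for three operators.
var rules = map[string]rule{
	"&&": {Bool, Bool},
	"<":  {Arith, Bool},
	"+":  {Arith, Arith},
}

// kinds returns every kind n can match; 0 means no rule matches.
func kinds(n *Node) Kind {
	if n.Op == "" {
		switch {
		case n.Text == "true" || n.Text == "false":
			return Bool
		case n.Text != "" && n.Text[0] >= '0' && n.Text[0] <= '9':
			return Arith
		default:
			return Bool | Arith // an identifier stays ambiguous
		}
	}
	r, ok := rules[n.Op]
	if !ok {
		return 0
	}
	if kinds(n.Left)&r.operand == 0 || kinds(n.Right)&r.operand == 0 {
		return 0
	}
	return r.result
}

func main() {
	a, b, c := leaf("a"), leaf("b"), leaf("c")
	fmt.Println(kinds(bin("&&", bin("<", a, b), c)) == Bool)
	fmt.Println(kinds(bin("&&", bin("+", a, leaf("1")), c)) == 0)
}
```

The identifier keeps both interpretations until an operator narrows them,
so mixing an arithmetic result into && is rejected during matching rather
than in a later semantic check.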
Is this a good place to bounce around dumb ideas like this?
Thanks,
Bill