Win a copy of The Little Book of Impediments (e-book only) this week in the Agile and Other Processes forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Using regex in a Recursive Descent Parser

 
Ryan Carter-Reich
Greenhorn
Posts: 3
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
First post.

Anyway, I've got the following grammar I'm trying to write an RDP for:

<prog> -> ["prog" [id] ][ "(" {id = (id | number) ")"] "{"<sl>"}"
<sl> -> <s> {";" <s> } ";"
<s> -> id = <expr> | <expr>
<expr> -> <term> {("+" |"-") <term>}
<term> -> <pow> {("*"|"/") <pow>}
<pow> -> <fact> {"^" <fact>}
<fact> -> "(" <expr>")" | id | number | trig "(" expr ")"

I posted this on DevShed yesterday, but it doesn't look like anyone knew what to do.

The grammar's EBNF notation. "trig" is a terminal with lexemes "sin" "cos" or "tan". "id" is a terminal that's any character followed my any number of numbers or characters, and number is any signed number.

Here's my progress on this so far:

<expr>, <term>, <pow> are all completed methods, since they're basically the same thing. Ex:



Driver.nextToken() pulls the next token, and Driver.match compares it to what's expected.

I'm having a plethora of issues with this one, so I'll just list them, and any help is greatly appreciated:

With <prog>, here's what I have not:



p() in this one parses and checks for the string "prog"

In prog I don't know how to look that sequence of ids. I don't know whether the loop should be in the switch or out of it or I should just make a new method and loop it, or just recurse. Also, should I recurse after I call id in the 'p' case?
In <sl> (not posted because there's nothing to do):
How should I determine whether a semi-colon is the last in the line or if there's an <s> after it?
And finally, in <fact> how do I differentiate between an id taht starts with a trig token and a plan trig token?

I know regex goes in here somewhere. I'm already going to use it to see if the next character is an number, then loop in a number method, etc.

I know that's a lot of questions, but thanks in advance to any help.
 
David Newton
Author
Rancher
Posts: 12617
IntelliJ IDE Ruby
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I'm not sure how a regex is involved; how are you doing your lookahead?
 
Campbell Ritchie
Sheriff
Posts: 51453
87
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
What does the nextToken() method return?
 
Ryan Carter-Reich
Greenhorn
Posts: 3
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
David Newton wrote:I'm not sure how a regex is involved; how are you doing your lookahead?


The only way I implement it at the moment is in <fact>, it looks while the next character matches the pattern for a digit (though this could have been done without regex). I'm not sure where else to put it, but I was told that it might be helpful.

And apologies about nextToken(); it was late and I wasn't thinking. In my original file, I input the text, then scan it, removing comments, etc, and then parse each character into an ArrayList. getNextToken() updates the nextToken by getting the first character from that ArrayList. Right now I have it removing that index, but I think I might need to just have it iterate over the list.
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic