Hello group, I recently had the idea:
"A rule system (like Drools) is ideal for making programs with complex
rules simpler. Writing a lexer or parser can be non-trivial. So, is it
possible and also meaningful to express such a task with rules?"
Has anyone here tried that already?
The two big questions for me are:
1) how easy is it to express a lexer with rules?
2) how badly (or how well) will it perform?
If you happen to have a good idea of how to do it, could you please give
me an example for a simple lexer?
Let's say it gets natural language (a string, such as this email) as
input and should return a sequence (say, an ArrayList) of Tokens, where
a Token may look like this:
public class Token {
    public String value;
    public String category;

    Token(String value, String category) {
        this.value = value;
        this.category = category;
    }
}
We could have three categories:
"word", "numeric" and "whitespace".
An input String could be:
"We can see 500 cars"
And it should produce an ArrayList with the contents:
[
Token("We", "word"),
Token(" ", "whitespace"),
Token("can", "word"),
Token(" ", "whitespace"),
Token("see", "word"),
Token(" ", "whitespace"),
Token("500", "numeric"),
Token(" ", "whitespace"),
Token("cars", "word")
]
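To pin down the expected behaviour, here is a minimal hand-written sketch in plain Java. It does not use Drools or any other rule engine; it only mimics the idea of "one rule per category". The names SimpleLexer, CharRule, ruleFor and tokenize are made up for illustration, and rejecting characters outside the three categories (punctuation, for example) is my own assumption, not a requirement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

public class SimpleLexer {

    // Same shape as the Token class above, repeated so the sketch compiles on its own.
    static class Token {
        public String value;
        public String category;

        Token(String value, String category) {
            this.value = value;
            this.category = category;
        }

        @Override
        public String toString() {
            return "Token(\"" + value + "\", \"" + category + "\")";
        }
    }

    // Hypothetical "rule": a category name plus a test for characters that belong to it.
    static class CharRule {
        final String category;
        final IntPredicate matches;

        CharRule(String category, IntPredicate matches) {
            this.category = category;
            this.matches = matches;
        }
    }

    static final List<CharRule> RULES = Arrays.asList(
            new CharRule("word", Character::isLetter),
            new CharRule("numeric", Character::isDigit),
            new CharRule("whitespace", Character::isWhitespace));

    // Returns the first rule whose test accepts c.
    // Assumption: characters outside the three categories are treated as an error.
    static CharRule ruleFor(char c) {
        for (CharRule rule : RULES) {
            if (rule.matches.test(c)) {
                return rule;
            }
        }
        throw new IllegalArgumentException("No category for character: " + c);
    }

    // Splits the input into maximal runs of characters that share one category.
    static ArrayList<Token> tokenize(String input) {
        ArrayList<Token> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            CharRule rule = ruleFor(input.charAt(pos));
            int end = pos + 1;
            // Extend the token while the same rule keeps matching.
            while (end < input.length() && rule.matches.test(input.charAt(end))) {
                end++;
            }
            tokens.add(new Token(input.substring(pos, end), rule.category));
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints one token per line for the example sentence.
        for (Token t : tokenize("We can see 500 cars")) {
            System.out.println(t);
        }
    }
}
```

For the example input this yields the nine tokens listed above. A rule-based version would presumably need to express the same two steps declaratively: pick a category for a character, and merge adjacent characters of the same category into one token.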
At the moment I have difficulty seeing whether, or how, this could be
achieved with a rule system.
If you find this easy, please post a solution.
I am aware that JavaCC is really good for such tasks and will also
perform extremely well.
Greetings,
André