ANTLR4 - Lexers w/ Phrase Tokens


I am using ANTLR4 with a grammar that would be best tokenized in phrases rather than words (i.e., tokens with spaces in them). In some cases, however, I want to capture specific phrases as separate tokens. Consider the following example:

    The occurrence A of the encounter performance

The phrase "The occurrence A of the" is special - whenever I see it, I want to pull out the label ("A"). The rest of the statement ("encounter performance") is fairly arbitrary and, for the purposes of this example, could be anything.

For this example, I have whipped up this quick grammar:

    grammar test;
    stat: occurrence PHRASE;
    occurrence: 'The occurrence' LABEL 'of the';
    LABEL: [A-Z];
    PHRASE: (WORD ' ')* WORD;
    fragment WORD: [a-zA-Z\-]+;
    WS: [ \t\n\r]+ -> skip;

If I test it against the statement above, it fails with a mismatched input error at line 1:0. I believe this is because the lexer will match the token that can consume the most characters in a row (PHRASE, in this case).
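The longest-match behaviour suspected above can be illustrated outside ANTLR. The following is only a sketch in plain Python, not ANTLR itself: it assumes the literals 'The occurrence' and 'of the' and the sample statement used in this question, and the helper name `first_token` is made up for illustration. It mimics the rule that the lexer emits whichever rule matches the most characters, with ties going to the rule listed first.

```python
import re

# Lexer rules in grammar order (implicit literals first, as in a combined grammar).
# Assumption: these regexes approximate the grammar's lexer rules.
RULES = [
    ("'The occurrence'", re.compile(r"The occurrence")),
    ("'of the'", re.compile(r"of the")),
    ("LABEL", re.compile(r"[A-Z]")),
    ("PHRASE", re.compile(r"(?:[a-zA-Z\-]+ )*[a-zA-Z\-]+")),
    ("WS", re.compile(r"[ \t\n\r]+")),
]


def first_token(text):
    """Return (rule name, matched text) for the first token a longest-match lexer would emit."""
    best = None
    for name, pattern in RULES:
        m = pattern.match(text)
        # Only a strictly longer match replaces the current best,
        # so an earlier rule keeps the token when lengths are equal.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best


print(first_token("The occurrence A of the encounter performance"))
```

Under these assumptions the whole statement comes back as one PHRASE token, so the parser never gets to see 'The occurrence' as a token of its own.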

So ... I think I understand the problem - I'm just not clear on the best solution. Is it possible? Or am I just going to have to live with a lexer that matches on word boundaries and a parser that puts the words together into phrases? I would prefer to do it in the lexer because the phrase (such as "encounter performance") is really one unit.

I am new to ANTLR (and to lexers/parsers in general), so please forgive me if there is a solution staring me in the face! So far, though, I have not found any answers. Thanks!

While what you want to do in the lexer is possible**, for such a simple grammar it is unlikely to be worth the effort. Also, by packing everything into one token, you are just setting yourself up to have to manually walk the token string to pick out the LABEL value.
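To make the cost of a single packed token concrete, here is a rough sketch of the manual unpacking it would force on you. It is plain Python rather than ANTLR code, and the token text, the pattern, and the helper name `unpack` are assumptions made up for this illustration.

```python
import re

# Hypothetical text of one token that swallowed the whole statement.
token_text = "The occurrence A of the encounter performance"

# Hand-written pattern that repeats structure the grammar already describes.
PACKED = re.compile(r"The occurrence ([A-Z]) of the (.+)")


def unpack(text):
    """Split a packed token back into (label, phrase) by hand."""
    m = PACKED.fullmatch(text)
    if m is None:
        raise ValueError("token text does not have the expected shape")
    return m.group(1), m.group(2)


label, phrase = unpack(token_text)
print(label, "|", phrase)
```

The point is less the regex than the duplication: the shape of the statement now lives both in the grammar and in hand-written string handling.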

You can still reflect the semantic rules - the rules that you consider to be 'tokens' - just as simple, 'lower level' parser rules:

    stat: occurrence phrase ;
    occurrence: OCCUR label=WORD OF ;   // label is the word between the two fixed parts
    phrase: WORD+ ;                     // a phrase is one or more words

    OCCUR: 'The occurrence' ;
    OF: 'of the' ;
    WORD: [a-zA-Z\-]+ ;
    WS: [ \t\n\r]+ -> skip ;
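How word-level tokens and a small parser rule fit together can be sketched without the ANTLR tool. This is a simulation in plain Python under stated assumptions: the token names OCCUR, OF and WORD, the literals 'The occurrence' and 'of the', and the helpers `tokenize` and `parse_stat` are chosen for illustration and are not generated ANTLR code.

```python
import re

# Word-level lexer rules in grammar order; WS is matched but skipped.
TOKEN_RULES = [
    ("OCCUR", re.compile(r"The occurrence")),
    ("OF", re.compile(r"of the")),
    ("WORD", re.compile(r"[a-zA-Z\-]+")),
    ("WS", re.compile(r"[ \t\n\r]+")),
]


def tokenize(text):
    """Longest-match tokenizer; ties go to the rule listed first."""
    tokens, pos = [], 0
    while pos < len(text):
        best = None
        for name, pattern in TOKEN_RULES:
            m = pattern.match(text, pos)
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"no rule matches at offset {pos}")
        name, m = best
        if name != "WS":
            tokens.append((name, m.group()))
        pos = m.end()
    return tokens


def parse_stat(tokens):
    """stat: OCCUR label=WORD OF WORD+ ; returns (label, phrase)."""
    kinds = [kind for kind, _ in tokens]
    well_formed = (
        len(tokens) >= 4
        and kinds[:3] == ["OCCUR", "WORD", "OF"]
        and all(kind == "WORD" for kind in kinds[3:])
    )
    if not well_formed:
        raise ValueError("statement does not match stat")
    return tokens[1][1], " ".join(text for _, text in tokens[3:])


print(parse_stat(tokenize("The occurrence A of the encounter performance")))
```

In this sketch the label is available directly as a token, and the phrase is reassembled from its words at the parser level. One consequence of the fixed 'of the' token is that a phrase which itself contains "of the" would be rejected by `parse_stat`, so the phrase rule would need extending if such phrases occur.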

** If you really want to, you can implement a lexer mode and, using the 'more' operator, collect the characters into a single token string. This is untested - I think 'more' will work as shown, but if not, you will need to pack the token text yourself. In any event, it shows the potential complexity of what you intend to do.

    START: 'The occurrence' -> pushMode(content), more ;

    mode content;
    OF: 'of the' -> popMode, more ;
    OTHER: . -> more ;
