Description
I'm building a parser that accepts a custom token stream.
I've made TokenStream (from lexer-applicative) an instance of Stream:
instance Stream (TokenStream (L tok)) where
That was wonderful: everything worked as expected until a "lexical error" appeared in my token stream:
-- | A stream of tokens
data TokenStream tok
  = TsToken tok (TokenStream tok)
  | TsEof
  | TsError LexicalError
The parser complained about an unexpected end of input. That's because I had no choice but to treat TsError like TsEof.
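To make the problem concrete, here is a minimal sketch of what such an instance boils down to. The types are simplified stand-ins for the lexer-applicative ones (LexicalError holds a plain Int here instead of a real source position), and take1Lossy is a hypothetical name that only mirrors the Maybe-returning shape of take1_:

```haskell
-- Simplified stand-in: the real LexicalError carries a source position.
newtype LexicalError = LexicalError Int
  deriving (Eq, Show)

-- Same shape as the TokenStream type from lexer-applicative.
data TokenStream tok
  = TsToken tok (TokenStream tok)
  | TsEof
  | TsError LexicalError
  deriving (Eq, Show)

-- Hypothetical helper with the same result shape as take1_.
-- With only Maybe available, a lexical error has to collapse into
-- Nothing, so it looks exactly like the end of input.
take1Lossy :: TokenStream tok -> Maybe (tok, TokenStream tok)
take1Lossy (TsToken t rest) = Just (t, rest)
take1Lossy TsEof            = Nothing
take1Lossy (TsError _)      = Nothing -- the error is silently dropped
```

Because take1Lossy gives the same answer for TsEof and TsError, the parser can only ever report "unexpected end of input".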
I think there are 3 ways of solving this:
- Make Stream "aware" of these lexical errors: for example, let take1_ return an Either value instead of just a Maybe value.
- Make the parser incremental, so that users can check whether the next token is TsError before feeding it to the parser.
- The "happy" way, something between the first two options.
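The first option could look roughly like the sketch below. This is not megaparsec's actual Stream class: ErrStream, Tok, StreamErr and take1E are hypothetical names, and LexicalError is simplified to hold an Int instead of a real position.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Simplified stand-in: the real LexicalError carries a source position.
newtype LexicalError = LexicalError Int
  deriving (Eq, Show)

data TokenStream tok
  = TsToken tok (TokenStream tok)
  | TsEof
  | TsError LexicalError
  deriving (Eq, Show)

-- Hypothetical error-aware variant of a stream class.
-- take1E distinguishes three outcomes:
--   Left e                -> the stream itself failed
--   Right Nothing         -> the stream ended normally
--   Right (Just (t, s'))  -> next token and the remaining stream
class ErrStream s where
  type Tok s
  type StreamErr s
  take1E :: s -> Either (StreamErr s) (Maybe (Tok s, s))

instance ErrStream (TokenStream tok) where
  type Tok (TokenStream tok) = tok
  type StreamErr (TokenStream tok) = LexicalError
  take1E (TsToken t rest) = Right (Just (t, rest))
  take1E TsEof            = Right Nothing
  take1E (TsError e)      = Left e
```

With this shape the parser could report the Left case as a lexical error instead of folding it into "end of input", at the cost of changing the type of a core class method.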
I'll explain how this is done in happy.
Happy also lets users choose their own token stream type (usually produced by alex), as long as we tell happy which token marks eof:
%lexer { <lexer> } { <eof> }
and what to do when a token comes in:
lexer :: (Token -> P a) -> P a
For example, this is how to deal with a token stream from lexer-applicative:
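lexer :: (Token -> P a) -> P a
lexer f = scanNext >>= f

scanNext :: P Token
scanNext = do
  stream <- gets tokenStream
  case stream of
    -- Return the token and store the rest of the stream for the next call
    -- (assumes tokenStream is a record field of the parser state).
    TsToken (L _ tok) rest -> do
      modify (\s -> s { tokenStream = rest })
      return tok
    TsEof -> return TokenEOF
    TsError (LexicalError pos) -> throwError $ Lexical pos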
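For anyone who wants to try the idea without happy, here is a self-contained sketch that uses only base. Everything in it is a stand-in: P is a hand-rolled state-plus-Either monad rather than happy's generated one, L and LexicalError carry Ints instead of real locations, and Token, ParseError and allTokens are hypothetical names used only for illustration.

```haskell
-- Simplified stand-ins for the lexer-applicative types.
newtype LexicalError = LexicalError Int deriving (Eq, Show)
data L a = L Int a deriving (Eq, Show)

data TokenStream tok
  = TsToken tok (TokenStream tok)
  | TsEof
  | TsError LexicalError
  deriving (Eq, Show)

-- Hypothetical token and error types for the parser.
data Token = TokenInt Int | TokenEOF deriving (Eq, Show)
newtype ParseError = Lexical Int deriving (Eq, Show)

-- A tiny parser monad: threads the token stream, fails with ParseError.
newtype P a = P
  { runP :: TokenStream (L Token) -> Either ParseError (a, TokenStream (L Token)) }

instance Functor P where
  fmap f (P g) = P $ \s -> case g s of
    Left e        -> Left e
    Right (a, s') -> Right (f a, s')

instance Applicative P where
  pure a = P $ \s -> Right (a, s)
  P f <*> P g = P $ \s -> case f s of
    Left e        -> Left e
    Right (h, s') -> case g s' of
      Left e         -> Left e
      Right (a, s'') -> Right (h a, s'')

instance Monad P where
  P g >>= k = P $ \s -> case g s of
    Left e        -> Left e
    Right (a, s') -> runP (k a) s'

-- Fetch the next token, advancing the stream; a TsError becomes a
-- proper parse error instead of looking like end of input.
scanNext :: P Token
scanNext = P $ \s -> case s of
  TsToken (L _ tok) rest     -> Right (tok, rest)
  TsEof                      -> Right (TokenEOF, TsEof)
  TsError (LexicalError pos) -> Left (Lexical pos)

lexer :: (Token -> P a) -> P a
lexer f = scanNext >>= f

-- Example consumer: collect every token up to (not including) EOF.
allTokens :: P [Token]
allTokens = lexer $ \t -> case t of
  TokenEOF -> pure []
  _        -> (t :) <$> allTokens
```

Running runP allTokens on a stream that ends in TsError (LexicalError pos) yields Left (Lexical pos), so the lexical error surfaces to the caller rather than being reported as an unexpected end of input.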
I think this is the best of the 3 solutions, because it lets users handle lexical errors however they like, and it isn't overkill the way making megaparsec incremental would be.
But I'm still not sure how to incorporate this into the Stream class, if we are going to do this.