How to report "lexical" errors?

I'm building a parser that accepts a custom token stream.
I've made [TokenStream](http://hackage.haskell.org/package/lexer-applicative-2.1.0.2/docs/Language-Lexer-Applicative.html#t:TokenStream) (from `lexer-applicative`) an instance of `Stream`:

```haskell
instance Stream (TokenStream (L tok)) where
```

That was wonderful, and everything worked as expected, until a "lexical error" appeared in my token stream:


```haskell
-- | A stream of tokens
data TokenStream tok
  = TsToken tok (TokenStream tok)
  | TsEof
  | TsError LexicalError
```

The parser complained about `unexpected end of input`, because I had no choice but to treat `TsError` like `TsEof`.
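To make the failure mode concrete, here is a minimal sketch of a `take1_`-like function that has nowhere to put the error. It uses a local copy of the `TokenStream` type with a simplified `LexicalError` that carries an `Int` position (an assumption, the real type in `lexer-applicative` carries a source position), and `take1Lossy` is a hypothetical name, not part of `megaparsec`.

```haskell
-- Local, simplified copies of the types; not imported from lexer-applicative.
newtype LexicalError = LexicalError Int -- Int position is an assumption
  deriving (Eq, Show)

data TokenStream tok
  = TsToken tok (TokenStream tok)
  | TsEof
  | TsError LexicalError

-- Hypothetical stand-in for a Maybe-returning take1_:
-- a Maybe result can only say "here is a token" or "no more tokens".
take1Lossy :: TokenStream tok -> Maybe (tok, TokenStream tok)
take1Lossy (TsToken t rest) = Just (t, rest)
take1Lossy TsEof            = Nothing
take1Lossy (TsError _)      = Nothing -- the lexical error is discarded here
```

Because both `TsEof` and `TsError` collapse to `Nothing`, a parser built on top of this can only report an unexpected end of input.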

I think there are 3 ways of solving this:

1. Make `Stream` "aware" of these lexical errors: for example, let `take1_` return an `Either` value instead of just a `Maybe` value.
2. Make the parser incremental, so that users can check whether the next token is `TsError` before feeding it to the parser.
3. The "happy" way, something between 1. and 2.
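As a rough sketch of what option 1 could look like, the function below returns an `Either` so that a lexical error stays distinguishable from end of input. This is not `megaparsec`'s API: `take1E` and `drain` are hypothetical names, the types are local copies, and the `Int` inside `LexicalError` is a simplifying assumption.

```haskell
-- Local, simplified copies of the types; not imported from lexer-applicative.
newtype LexicalError = LexicalError Int -- Int position is an assumption
  deriving (Eq, Show)

data TokenStream tok
  = TsToken tok (TokenStream tok)
  | TsEof
  | TsError LexicalError

-- Hypothetical Either-returning variant of take1_:
--   Left e         : the lexer failed
--   Right Nothing  : genuine end of input
--   Right (Just _) : a token and the rest of the stream
take1E :: TokenStream tok -> Either LexicalError (Maybe (tok, TokenStream tok))
take1E (TsToken t rest) = Right (Just (t, rest))
take1E TsEof            = Right Nothing
take1E (TsError e)      = Left e

-- Example consumer: collect all tokens, stopping at the first lexical error.
drain :: TokenStream tok -> Either LexicalError [tok]
drain s = do
  next <- take1E s
  case next of
    Nothing        -> Right []
    Just (t, rest) -> (t :) <$> drain rest
```

With this shape, a consumer such as `drain` surfaces the `LexicalError` itself instead of silently ending the stream.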

I'll explain more about how it can be done in `happy`:

[Happy also allows](https://www.haskell.org/happy/doc/html/sec-monads.html#sec-lexers) users to choose their own token stream type (usually produced by `alex`), as long as we tell `happy` which token stands for `eof`:
```
%lexer { <lexer> } { <eof> }
```
and what to do when a token comes in:

```haskell
lexer :: (Token -> P a) -> P a
```

For [example](https://github.com/scmlab/cp/blob/a92c21b3c7db7a3e95610804f15e150f6faed4a0/src/Syntax/Parser/Lexer.hs#L115-L131), this is how to deal with a token stream from `lexer-applicative`:

```haskell
lexer :: (Token -> P a) -> P a
lexer f = scanNext >>= f

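scanNext :: P Token
scanNext = do
  stream <- gets tokenStream
  case stream of
    -- consume the token and store the rest of the stream
    -- (assumes `tokenStream` is a record field of the parser state)
    TsToken (L _ tok) rest -> do
      modify $ \st -> st { tokenStream = rest }
      return tok
    TsEof -> return TokenEOF
    TsError (LexicalError pos) -> throwError $ Lexical pos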
```

I think this is the best of the 3 solutions, because it lets users handle lexical errors the way they like, and it isn't overkill the way making `megaparsec` incremental would be.

But I'm still not sure how to incorporate this into the `Stream` class, if we decide to do this.
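For anyone who wants to try the idea without `happy` or `mtl`, here is a self-contained sketch of the same threaded-lexer pattern. Everything in it is an assumption made for illustration: `P` is a hand-rolled state-plus-error monad rather than the real one, the `L` location wrapper is dropped, `Token` and `ParseError` are toy types, and `collectWords` is a hypothetical stand-in for a generated parser.

```haskell
-- Local, simplified copies of the types; not imported from lexer-applicative.
newtype LexicalError = LexicalError Int -- Int position is an assumption
  deriving (Eq, Show)

data TokenStream tok
  = TsToken tok (TokenStream tok)
  | TsEof
  | TsError LexicalError

-- Toy token and error types, for illustration only.
data Token = TokenWord String | TokenEOF deriving (Eq, Show)
newtype ParseError = Lexical Int deriving (Eq, Show)

-- Hand-rolled state-plus-error monad standing in for the real `P`.
newtype P a = P
  { runP :: TokenStream Token -> Either ParseError (a, TokenStream Token) }

instance Functor P where
  fmap f (P g) = P $ \s -> case g s of
    Left e        -> Left e
    Right (a, s') -> Right (f a, s')

instance Applicative P where
  pure a = P $ \s -> Right (a, s)
  P f <*> P g = P $ \s -> case f s of
    Left e        -> Left e
    Right (h, s') -> case g s' of
      Left e         -> Left e
      Right (a, s'') -> Right (h a, s'')

instance Monad P where
  P g >>= f = P $ \s -> case g s of
    Left e        -> Left e
    Right (a, s') -> runP (f a) s'

-- Pull one token; a lexical error becomes its own kind of failure.
scanNext :: P Token
scanNext = P $ \s -> case s of
  TsToken tok rest           -> Right (tok, rest)
  TsEof                      -> Right (TokenEOF, TsEof)
  TsError (LexicalError pos) -> Left (Lexical pos)

lexer :: (Token -> P a) -> P a
lexer f = scanNext >>= f

-- Hypothetical stand-in for a parser: collect words until EOF.
collectWords :: P [String]
collectWords = lexer $ \tok -> case tok of
  TokenWord w -> (w :) <$> collectWords
  TokenEOF    -> pure []
```

Running `runP collectWords` on a stream that ends in `TsError` yields a `Left (Lexical pos)` rather than an end-of-input result, which is the distinction the current `Stream` interface cannot express.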
