[Question] How to propagate line number from lexer to parser #560

Open
@leana8959

Hello,

I'm writing a lexer and a parser for a custom language. I would like to propagate the line numbers of each token from the lexing stage (should it succeed) to the parsing stage, so that parsing errors can be more friendly.

The problem is that parsing a stream of tokens paired with a SourcePos seems to be significantly harder.
I can't tell token what is expected using a value of type MyTokenType, because it wants a MyToken (the input token type of my parser), which is the wrapped type carrying the position. It doesn't make sense to include the position in the expected token.

Even if I were to use [MyTokenType] (the type without the position) as the input stream directly, other variants of the same issue crop up. For example, if I were to parse a sequence of TInt Int tokens, it doesn't make sense to say that TInt 123 is expected, because a TInt token with any value that the lexer produces is valid.
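One direction that might sidestep the "expected TInt 123" problem, as far as I can tell, is to put a textual Label in the expected set instead of a concrete token, since ErrorItem has a Label constructor next to Tokens. Below is a minimal sketch of that idea. The helper name tokenLabelled, the extra TPlus constructor and the label text are made up for illustration, and I have not checked how this interacts with a larger grammar.

```haskell
import qualified Data.List.NonEmpty as NE
import qualified Data.Set as Set
import Data.Void (Void)
import Text.Megaparsec

-- TPlus is a hypothetical second constructor, added only so that
-- the parser below has something to reject.
data MyTokenType = TInt Int | TPlus deriving (Eq, Ord, Show)

data MyToken = MyToken {tokenPos :: SourcePos, tokenType :: MyTokenType}
  deriving (Eq, Ord, Show)

type Parser = Parsec Void [MyToken]

-- | Hypothetical helper: match on the token type only and, on failure,
-- report a textual label rather than a concrete token value.
tokenLabelled :: String -> (MyTokenType -> Maybe a) -> Parser a
tokenLabelled name p = token (p . tokenType) expected
  where
    -- Label needs a non-empty string; an empty name yields no expectation.
    expected = maybe Set.empty (Set.singleton . Label) (NE.nonEmpty name)

-- | Accept any TInt, whatever value the lexer produced.
pInt :: Parser Int
pInt = tokenLabelled "integer literal" $ \t -> case t of
  TInt n -> Just n
  _ -> Nothing
```

With this, the expected set no longer mentions a position or a particular integer, only the label. Writing token f Set.empty <?> "integer literal" should give a similar result.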

Is there an easy way out of this, or is there a better, more idiomatic alternative?

Lexer:

```haskell
import Data.Char (isDigit)
import Data.Void (Void)
import Text.Megaparsec

-- Wrapped token type with position in the source file
data MyToken = MyToken {tokenPos :: SourcePos, tokenType :: MyTokenType} deriving (Eq, Ord, Show)
data MyTokenType = TInt Int deriving (Eq, Ord, Show)

type Lexer = Parsec Void String

-- Wrap with position
withPos :: Lexer MyTokenType -> Lexer MyToken
withPos p = MyToken <$> getSourcePos <*> p

spaceConsumer :: Lexer ()
spaceConsumer = (skipMany . oneOf) ['\t', ' ']

lexer :: Lexer [MyToken]
lexer = do
  between spaceConsumer (spaceConsumer <* eof) (many (lInt <* spaceConsumer))
  where
    lInt = withPos $ TInt . read <$> takeWhile1P Nothing isDigit
```

Parser:

```haskell
import qualified Data.List.NonEmpty as NE
import qualified Data.Set as S

type Parser = Parsec Void [MyToken]

-- Primitive over `token` to parse a lexical item
token'
  :: (MyTokenType -> Maybe a)
  -- ^ Predicate
  -> MyTokenType
  -- ^ Expected
  -> Parser a
token' p t =
  token
    (p . tokenType)
    (S.singleton . Tokens . NE.singleton $ t {- type error, expected `Token` but got `TokenType` -})
```
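As for getting the position back out of a parse failure, one workaround I can think of is to take the offset stored in the ParseError and use it as an index into the token list, then read tokenPos from that token. The sketch below assumes the parser is run with parse, so offsets start at 0 and advance by one per token. The names firstErrorPos and pInt and the TPlus constructor are hypothetical, and this is not a replacement for a proper custom stream instance.

```haskell
import qualified Data.List.NonEmpty as NE
import qualified Data.Set as Set
import Data.Maybe (listToMaybe)
import Data.Void (Void)
import Text.Megaparsec

-- TPlus is a hypothetical constructor used only to provoke a failure.
data MyTokenType = TInt Int | TPlus deriving (Eq, Ord, Show)

data MyToken = MyToken {tokenPos :: SourcePos, tokenType :: MyTokenType}
  deriving (Eq, Ord, Show)

type Parser = Parsec Void [MyToken]

-- | Accept any TInt token (no expected items, to keep the sketch short).
pInt :: Parser Int
pInt = token (\t -> case tokenType t of TInt n -> Just n; _ -> Nothing) Set.empty

-- | Hypothetical helper: source position of the token at which the first
-- error in the bundle occurred. Returns Nothing when the error offset is
-- past the last token, i.e. the failure happened at end of input.
firstErrorPos :: [MyToken] -> ParseErrorBundle [MyToken] Void -> Maybe SourcePos
firstErrorPos toks bundle =
  tokenPos <$> listToMaybe (drop offset toks)
  where
    -- Assumes offsets count tokens from 0, as they do when using `parse`.
    offset = errorOffset (NE.head (bundleErrors bundle))
```

The resulting SourcePos can be rendered with sourcePosPretty and prepended to a hand-written message. The end-of-input case would need a fallback, for example the position of the last token.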

Thank you for maintaining the library, megaparsec is awesome :)
