Megaparsec - Haskell

Welcome to the Functional Programming Zulip Chat Archive.

Adam Flott

Does anyone have an example project (with source) whose Megaparsec parsers preserve comments in the parsed structure? I'm trying to find a way to avoid doing something like:

data ADTPiece
    = ADTPieceX (Either Comment X)
    | ADTPieceY (Either Comment Y)
data ADT = ADT [ADTPiece]

Notice: order is preserved

Adam Flott

Additionally, if I'm going to use Lexer's lexeme, do I attach it to every parser or only at the top level?

Simon Michael

I'm afraid I don't fully understand the question, but here's a real world example that might give ideas: https://github.com/simonmichael/hledger/blob/master/hledger-lib/Hledger/Data/Types.hs#L352, https://github.com/simonmichael/hledger/blob/master/hledger-lib/Hledger/Read/JournalReader.hs#L686

easy-to-use command-line/curses/web plaintext accounting tool; a modern and largely compatible Haskell rewrite of Ledger - simonmichael/hledger
Adam Flott

Ha! I was just looking at hledger for inspiration....
If my input looks like

# outside comment
some_section {
    # a comment
    var1 = whatever
}

I'm not sure how to encode that... I want an ordered list of things. Eventually I want to fully parse and pretty print the input.

Adam Flott

If I throw away the comments I have no problem, but I'm not sure how to preserve them in a data type so that, when printing, "a comment" comes before var1.

Georgi Lyubenov // googleson78

GHC seems to take the approach of keeping a list of the comments it finds, along with their locations, instead of attaching them directly to the AST:
http://hackage.haskell.org/package/ghc-lib-parser-8.10.1.20200412/docs/Lexer.html#t:PState
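
As a rough illustration of that side-table idea (not GHC's actual types): the sketch below keeps comments in a separate list tagged with their start position, and recovers "the comments before this node" by comparing positions at print time. The names Pos, LocatedComment, Located and commentsBefore are all hypothetical.

```haskell
import Data.List (partition)

-- Hypothetical source position: line and column, compared line-first.
data Pos = Pos { posLine :: Int, posCol :: Int }
  deriving (Eq, Ord, Show)

-- A comment stored outside the AST, tagged with where it started.
data LocatedComment = LocatedComment
  { commentPos  :: Pos
  , commentText :: String
  } deriving (Eq, Show)

-- AST nodes only remember their own start position.
data Located a = Located { locPos :: Pos, unLoc :: a }
  deriving (Eq, Show)

-- Split the comment list into (comments starting before the node, the rest).
-- A pretty printer could call this for each node in source order and
-- emit the first component ahead of the node.
commentsBefore :: Located a -> [LocatedComment] -> ([LocatedComment], [LocatedComment])
commentsBefore node = partition (\c -> commentPos c < locPos node)
```

The AST stays free of comment fields this way, at the cost of the printer having to merge two position-ordered streams.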

Georgi Lyubenov // googleson78

I guess you can go

data CommentInfo
data Com a = Com a CommentInfo

wrap everything in your AST in a Com,
tokenise comments into CommentInfos,
and then attach every CommentInfo to the next valid AST object during parsing?
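
One way this attach-to-the-next-node idea could look, as a minimal sketch over an already-tokenised stream. The Token type, the list-of-strings CommentInfo and the attachComments function are assumptions made for illustration; only the shape of Com comes from the message above.

```haskell
-- Hypothetical token type: the lexer keeps comments instead of skipping them.
data Token = TComment String | TItem String
  deriving (Eq, Show)

-- Stand-in for CommentInfo: the comment lines that preceded a node.
newtype CommentInfo = CommentInfo [String]
  deriving (Eq, Show)

data Com a = Com a CommentInfo
  deriving (Eq, Show)

-- Walk the tokens, buffering comments until the next item claims them.
-- The second component holds trailing comments that no item followed.
attachComments :: [Token] -> ([Com String], [String])
attachComments = go []
  where
    go pending []                  = ([], reverse pending)
    go pending (TComment c : rest) = go (c : pending) rest
    go pending (TItem x : rest)    =
      let (items, trailing) = go [] rest
      in  (Com x (CommentInfo (reverse pending)) : items, trailing)
```

Trailing comments are returned separately here because there is no "next" node to hang them on; a real AST would need to decide where those belong.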

Simon Michael

Adam, here's how I handle a similar need in hledger. You'll see in my Transaction data type the tpreceding_comments field - that's where I'd store your "outside comment" (not yet implemented). And the tcomments field is where I'd store "a comment". The main thing is to design your data type to hold all the parts you care about, with a field for each one. Then you write your parser to parse and save each of those parts. (This is the dumb, obvious, no-lexer way, at least.)
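
Applied to the sample input from earlier in the thread, the "one field per part" approach might look like the sketch below. This is not hledger's code: the Section and Assignment types, their field names and the grammar are assumptions, and hspace requires megaparsec 9.0 or later.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char

type Parser = Parsec Void String

-- Hypothetical AST: every node has a field for the comment lines that
-- precede it, so a printer can emit them in the original order.
data Assignment = Assignment
  { aComments :: [String]
  , aName     :: String
  , aValue    :: String
  } deriving (Eq, Show)

data Section = Section
  { sComments :: [String]
  , sName     :: String
  , sBody     :: [Assignment]
  } deriving (Eq, Show)

-- One "# ..." line, returned without the marker; also skips the
-- whitespace and newlines that follow it.
comment :: Parser String
comment = char '#' *> hspace *> takeWhileP (Just "comment text") (/= '\n') <* space

identifier :: Parser String
identifier = some (alphaNumChar <|> char '_') <* hspace

-- Preceding comments, then "name = value" up to the end of the line.
assignment :: Parser Assignment
assignment = Assignment
  <$> many comment
  <*> identifier
  <*> (char '=' *> hspace *> takeWhile1P (Just "value") (/= '\n') <* space)

-- Preceding comments, then "name { assignments }".
section :: Parser Section
section = Section
  <$> many comment
  <*> identifier
  <*> (char '{' *> space *> many assignment <* char '}' <* space)

config :: Parser [Section]
config = space *> many section <* eof
```

Running parseTest config on the sample should yield one Section carrying "outside comment", whose single Assignment carries "a comment". The sketch has no place for a comment that appears just before a closing brace or at the end of the file; such trailing comments would need their own field.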