hanging indent with symbolic operators - Haskell

Welcome to the Functional Programming Zulip Chat Archive. You can join the chat here.

Torsten Schmits

can anyone explain why this parses:

a :: IO Int
a = do a <- pure 5
       pure a
 >>= pure
       pure 5

is there a rule about operators that overrides normal layouting rules? it appears that the indent of that line must be larger than the parent layout (like in case of a nested do, it's the outer do's layout indent). it also only works for expression statements, not binders.

if someone can point me to documentation, that would be great

TheMatten

https://www.haskell.org/onlinereport/haskell2010/haskellch10.html#x17-17800010.3 provides algorithmic description of layout resolution - if you look at lines mentioning <n> (line fold), in your case they'll add } before the operator, because indent is lower than one set by do, and pop the <n>, because it's bigger than outer context

TheMatten

Oh, wait, I've now realized you mean line after that :sweat_smile:

Georgi Lyubenov // googleson78

ghc parser confirmed that in fact the whole do is the arg to the bind

Georgi Lyubenov // googleson78

I would have guessed this requires BlockArguments tbh

TheMatten

Okay, but how's

pure
  pure 5

valid?

TheMatten

Ah, I see now :joy:

Georgi Lyubenov // googleson78

maybe we should assassinate that instance from the ghc codebase
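For reference, a minimal sketch of how the snippet from the start of the thread appears to be read. The `>>=` line is indented less than the `do` block, so the block closes there, and the two lines after it form the right operand `pure pure 5`. That operand typechecks through the `Applicative` instance for functions, where `pure` is `const`. The names `original`, `explicit` and `viaConst` are made up for this illustration.

```haskell
-- Same shape as the snippet at the top of the thread.
original :: IO Int
original = do a <- pure 5
              pure a
 >>= pure
              pure 5

-- The reading with explicit braces and parentheses.
explicit :: IO Int
explicit = (do { a <- pure 5; pure a }) >>= (pure pure 5)

-- For functions `pure = const`, so `pure pure 5` is just `pure`.
viaConst :: Int -> IO Int
viaConst = const pure (5 :: Int)
```

All three end up returning whatever value is fed into them: `original` and `explicit` yield 5, and `viaConst n` yields `n`.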

Torsten Schmits

now how to parse this:

a :: IO Int
a =
  do
       pure 5
       >>= f
Georgi Lyubenov // googleson78

whoever is using it can't be up to any good

Torsten Schmits

(inspired by a file from HLS breaking my parser)

Torsten Schmits

apparently it might be (do pure 5) >>= f

Torsten Schmits

Georgi Lyubenov // googleson78 said:

maybe we should assassinate that instance from the ghc codebase

hacker in hoodie gif

TheMatten

I would expect GHC to do do (pure 5 >>= f) because of NondecreasingIndentation

Torsten Schmits

if you move the >>= one char to the right, it's like you wrote

TheMatten

Oh, it's for {n}, not <n>, right
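A small illustration of the difference one column makes, using the list monad so that the two readings give different results. This is a sketch of how GHC appears to read these definitions, following the messages above; `aligned`, `shifted`, `blockThenOp` and `opInsideBlock` are hypothetical names.

```haskell
-- The operator sits in the same column as the statements, so a `;` is
-- inserted before it. An operator cannot start a statement, so the block
-- is closed instead and the operator applies to the whole do block.
aligned :: [Int]
aligned =
  do
    x <- [1, 2]
    [x, x * 10]
    ++ [0]

-- One column further right, the line continues the last statement.
shifted :: [Int]
shifted =
  do
    x <- [1, 2]
    [x, x * 10]
     ++ [0]

-- The two readings written with explicit braces.
blockThenOp :: [Int]
blockThenOp = (do { x <- [1, 2]; [x, x * 10] }) ++ [0]   -- [1,10,2,20,0]

opInsideBlock :: [Int]
opInsideBlock = do { x <- [1, 2]; [x, x * 10] ++ [0] }   -- [1,10,0,2,20,0]
```

If the reading above is right, `aligned` equals `blockThenOp` and `shifted` equals `opInsideBlock`.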

Georgi Lyubenov // googleson78

https://giphy.com/gifs/theoffice-the-office-tv-frame-toby-vyTnNTrs3wqQ0UIvwE

TheMatten

We need -Wimplicit-layout in GHC, so that we can safely forget about it and write braces everywhere :sweat_smile:

Torsten Schmits

doesn't help if you want to parse other people's code!

Georgi Lyubenov // googleson78

or be more strict about what is allowed :sweat:

Torsten Schmits

that will never happen :cry:

Torsten Schmits

two spaces indent enforced everywhere. newline after layout open

TheMatten

Thing is though, we often write stuff that seems completely reasonable to humans but requires this flexibility, like

if _ then do
  _
else do
  _
Torsten Schmits

looks like it matches my rules

Georgi Lyubenov // googleson78

then opens a layout, but no newline after it?

Torsten Schmits

then don't open no layout

TheMatten

only do does for expressions

Torsten Schmits

only let, of, where, do

TheMatten

@Georgi Lyubenov // googleson78 what you see with newlines after tokens like = is indentation rule for surrounding layout

TheMatten
L (< n >: ts) (m : ms) = ; : (L ts (m : ms)) if m = n
                       = } : (L (< n >: ts) ms) if n < m
L (< n >: ts) ms = L ts ms -- this line
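The rules quoted above come from the function `L` in section 10.3 of the Haskell 2010 report. Below is a rough transcription of that function as a sketch, not GHC's implementation. The `Tok` type, the name `layout` and the hand-computed columns in `example` are assumptions for illustration, and the `parse-error(t)` rule is left out because it needs feedback from the parser.

```haskell
-- Tokens as the report annotates them before layout resolution.
data Tok
  = Indent Int      -- <n>: the next lexeme starts a line at column n
  | Open Int        -- {n}: an implicit block opens at column n
  | Lexeme String   -- any ordinary lexeme
  deriving (Eq, Show)

-- layout tokens contextStack, following the equations for L.
layout :: [Tok] -> [Int] -> [String]
layout (Indent n : ts) (m : ms)
  | m == n = ";" : layout ts (m : ms)
  | n < m  = "}" : layout (Indent n : ts) ms
layout (Indent _ : ts) ms = layout ts ms       -- plain continuation line
layout (Open n : ts) (m : ms)
  | n > m = "{" : layout ts (n : m : ms)
layout (Open n : ts) []
  | n > 0 = "{" : layout ts [n]
layout (Open n : ts) ms = "{" : "}" : layout (Indent n : ts) ms
layout (Lexeme "}" : ts) (0 : ms) = "}" : layout ts ms
layout (Lexeme "}" : _) _ = error "parse error: unmatched }"
layout (Lexeme "{" : ts) ms = "{" : layout ts (0 : ms)
-- The parse-error(t) rule would go here; it is omitted in this sketch.
layout (Lexeme t : ts) ms = t : layout ts ms
layout [] [] = []
layout [] (m : ms)
  | m /= 0    = "}" : layout [] ms
  | otherwise = error "parse error: unclosed explicit brace"

-- The definition from the start of the thread, without its signature.
example :: [Tok]
example =
  [ Open 1, Lexeme "a", Lexeme "=", Lexeme "do"
  , Open 8, Lexeme "a", Lexeme "<-", Lexeme "pure", Lexeme "5"
  , Indent 8, Lexeme "pure", Lexeme "a"
  , Indent 2, Lexeme ">>=", Lexeme "pure"
  , Indent 8, Lexeme "pure", Lexeme "5"
  ]
```

`unwords (layout example [])` gives `{ a = do { a <- pure 5 ; pure a } >>= pure pure 5 }`: the `<2>` closes the `do` block, and the final `<8>` falls through to the marked continuation clause.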
Torsten Schmits

a :: IO Int
a =
  do
       pure 5
       >>= f

my question now is: what is the condition in the layout algorithm that detects that >>= closes the do instead of only starting a new statement? is it because the op is symbolic?

Torsten Schmits

or are there more cases that cause this?

Torsten Schmits

it can't be type-directed, right?

TheMatten
L (t : ts) (m : ms) = } : (L (t : ts) ms) if m /= 0 and parse-error(t)

parse-error(>>=) should be true, shouldn't it?

TheMatten

it expects expression, but gets some operator

TheMatten

Yeah - reason why lexer and parser have to be entangled in Haskell

Torsten Schmits

guess I'll have to use the symbolic character as a condition for layout_end and hope for the best
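A minimal sketch of what such a heuristic could look like, standing in for the real `parse-error(t)` rule. Everything here is hypothetical (`symbolChars`, `prefixForms`, `isInfixSymbol`, `Action`, `lineStart`) and it is only an approximation: backtick operators such as `` `catch` `` are not covered, and a few symbolic lexemes can legitimately begin a statement.

```haskell
-- Characters that may appear in a symbolic operator (ASCII subset only).
symbolChars :: String
symbolChars = "!#$%&*+./<=>?@\\^|-~:"

-- Symbolic lexemes assumed to be able to begin a statement: negation,
-- lazy and bang patterns, and lambda. The heuristic must not treat
-- them as block terminators.
prefixForms :: [String]
prefixForms = ["-", "~", "!", "\\"]

isInfixSymbol :: String -> Bool
isInfixSymbol s =
  not (null s) && all (`elem` symbolChars) s && s `notElem` prefixForms

data Action = CloseLayout | NewStatement | Continue
  deriving (Eq, Show)

-- What a line starting at column n with first lexeme t does to a do
-- block whose statements sit at column m.
lineStart :: Int -> Int -> String -> Action
lineStart m n t
  | n < m                     = CloseLayout   -- ordinary dedent
  | n == m && isInfixSymbol t = CloseLayout   -- the heuristic
  | n == m                    = NewStatement
  | otherwise                 = Continue
```

For example, `lineStart 8 8 ">>="` is `CloseLayout`, while `lineStart 8 8 "pure"` is `NewStatement` and `lineStart 8 9 ">>="` is `Continue`.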

TheMatten

One more case is then BTW

TheMatten

You can do if do True then ...

Torsten Schmits

indeed, that confuses my parser as well

TheMatten

Or

if do True
      then True else False
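A short sketch of the one-line case: `then` cannot continue the block opened by `do`, so the block is closed implicitly in front of it. The names `implicitClose` and `explicitClose` are made up here.

```haskell
-- The do block ends where `then` would otherwise be a parse error.
implicitClose :: Bool
implicitClose = if do True then True else False

-- The same expression with the block delimited by hand.
explicitClose :: Bool
explicitClose = if do { True } then True else False
```

A `do` with a single expression is just that expression, so no `Monad` is involved and both definitions are plain `Bool` values.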
Torsten Schmits

does that apply to other layouts or only do?

Torsten Schmits

nope that's a parse error

Torsten Schmits

right, where isn't allowed after just any expression

TheMatten

Hmm, let is always guarded by something, and where could only appear inside of it or in top level block

Torsten Schmits

I'm beginning to think that I should track do separately from the other layouts

TheMatten

I mean, if you did the same trick for all of them, what would happen? It will end up with parse error anyway, won't it?

Torsten Schmits

a = if case 5 of 5 -> True then 1 else 1

this is weird
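The same implicit close seems to apply to the block after `of`, and it is what makes a one-line `let ... in` work. A minimal sketch with explicit-brace counterparts; the names are hypothetical and the scrutinee is annotated as `Int` only to avoid type defaulting.

```haskell
-- `then` closes the block of alternatives opened by `of`.
viaCase :: Int
viaCase = if case (5 :: Int) of 5 -> True then 1 else 1

viaCaseExplicit :: Int
viaCaseExplicit = if case (5 :: Int) of { 5 -> True } then 1 else 1

-- `in` closes the block of bindings opened by `let`.
viaLet :: Int
viaLet = let x = 1 in x + 1

viaLetExplicit :: Int
viaLetExplicit = let { x = 1 } in x + 1
```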

TheMatten

But if you want to be GHC-compliant, you do have to support NondecreasingIndentation, so do actually ends up being different after all

Torsten Schmits

I mean, if you did the same trick for all of them, what would happen? It will end up with parse error anyway, won't it?

:thinking:

Torsten Schmits

TheMatten said:

But if you want to be GHC-compliant, you do have to support NondecreasingIndentation, so do actually ends up being different after all

why does that make it different?

TheMatten

You accept layout of do starting at the same level as the previous one

TheMatten
do _
   if _ then _ else do
   _
TheMatten

This doesn't apply to other layouts AFAIK
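A compilable version of that shape, as a sketch. It assumes the extension is switched on with a `LANGUAGE` pragma (GHC spells it `NondecreasingIndentation`); `nested` and `nestedExplicit` are hypothetical names.

```haskell
{-# LANGUAGE NondecreasingIndentation #-}

-- The inner do opens its block at the same column as the outer one,
-- so the last two lines belong to the inner block.
nested :: Maybe Int
nested = do
  x <- Just 1
  if x > 0 then Just 0 else do
  y <- Just 2
  pure (x + y)

-- The same structure with explicit braces.
nestedExplicit :: Maybe Int
nestedExplicit = do
  { x <- Just 1
  ; if x > 0 then Just 0 else do { y <- Just 2; pure (x + y) }
  }
```

Under the plain Haskell 2010 rule the inner block would be empty, because its column is not greater than the enclosing one, and the definition would be rejected.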

Torsten Schmits

welp, guess it's time to rewrite the layout handler

Torsten Schmits

do you have a link for NondecreasingIndentation?

Torsten Schmits

or is that encoded in the layout section of the syntax reference?

TheMatten

In principle it's matter of switching > for >= in case of do layout
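As a sketch of that switch in a hand-written layout handler: the check that decides whether `{n}` opens a real block compares the new column with the enclosing one, and only the comparison changes. `LayoutKind` and `opensBlock` are hypothetical names, not part of GHC.

```haskell
data LayoutKind = DoBlock | OtherBlock
  deriving (Eq, Show)

-- Can an implicit block at column n open inside a context at column m?
-- The first argument says whether NondecreasingIndentation is enabled.
opensBlock :: Bool -> LayoutKind -> Int -> Int -> Bool
opensBlock nondecreasing kind n m
  | nondecreasing && kind == DoBlock = n >= m   -- relaxed rule, do only
  | otherwise                        = n > m    -- Haskell 2010 rule
```

With the flag on, `opensBlock True DoBlock 3 3` is `True`, while `opensBlock True OtherBlock 3 3` and `opensBlock False DoBlock 3 3` are both `False`.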

Torsten Schmits

"but not in Haskell2010 mode"?

Torsten Schmits

does that mean it's impossible to tell whether it's legal to use the same indent?

Torsten Schmits

(just by parsing the file)

TheMatten

I guess that applies to all syntactic extensions

Torsten Schmits

well it's simple to assume that a \case means we want LambdaCase

TheMatten

But what about static keyword? :big_smile:

TheMatten

That's basically why I've decided to work with Haskell2010 in Hask - there's just too much stuff in GHC :sweat_smile:

Torsten Schmits

well if you're lucky the parser will just decide the right variant of rec being keyword or identifier

Torsten Schmits

in my case, the parse tree isn't used for typing afterwards, so it doesn't matter