I think the idea was just to use the `mergeSort` function he already defined earlier. In an actual library, yeah, you'd define the smart constructor as `toSort = Sort . sort`
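For context, a minimal sketch of what such a library definition might look like. The shape of `Sort` and `merge` here is my guess at the linked post's approach (the field name `getSorted` is made up), not its actual source:

```haskell
import Data.List (sort)

-- Invariant: the wrapped list is always sorted.
newtype Sort a = Sort { getSorted :: [a] }
  deriving Show

-- Smart constructor: sort once on the way in.
toSort :: Ord a => [a] -> Sort a
toSort = Sort . sort

-- Combining two Sorts is a merge of two sorted lists,
-- which preserves the invariant.
instance Ord a => Semigroup (Sort a) where
  Sort xs <> Sort ys = Sort (merge xs ys)
    where
      merge [] bs = bs
      merge as [] = as
      merge (a:as) (b:bs)
        | a <= b    = a : merge as (b:bs)
        | otherwise = b : merge (a:as) bs

instance Ord a => Monoid (Sort a) where
  mempty = Sort []
```

With this in hand, `foldMap (Sort . pure)` really does sort: `getSorted (foldMap (Sort . pure) [5,3,4])` is `[3,4,5]`.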

The specific implementation of sort you get out is related to the implementation of foldMap.

You could define a variant of `foldMap` that recursively splits a list in half and merges the halves together with `mconcat` or `<>`.
This is normally a bad idea because we have to:

- Traverse the list to find its length
  - This automatically fails on infinite lists, even if `<>` can short-circuit
- Juggle around multiple intermediate values instead of 1 single accumulator (bad for efficiency/stack overflows)

but for certain monoids, we can get an improvement.

```haskell
-- WARNING: UNTESTED CODE, DON'T USE THIS IN PRACTICE
import Data.List (splitAt)

foldMap' :: Monoid b => (a -> b) -> [a] -> b
foldMap' f xs = reduce . fmap f $ xs

reduce :: Monoid b => [b] -> b
reduce xs = reduceFirstN (length xs) xs

reduceFirstN :: Monoid b => Int -> [b] -> b
reduceFirstN 0 _ = mempty
reduceFirstN _ [x] = x
reduceFirstN n [x, y] | n > 1 = x <> y
reduceFirstN len xs = left' <> right'
  where
    halfLen = len `quot` 2
    (left, right) = splitAt halfLen xs
    left' = reduceFirstN halfLen left
    right' = reduceFirstN (len - halfLen) right
```

If you use this, `toSort = foldMap' (Sort . pure)` gets you ~~basically classical merge sort~~ something much more like classical merge sort.
(I'm still worried that my looping over the left half of the list messes up the `n log n` time, because splitting linked lists is slow.)

Your choice of what `foldMap` to use is like your choice of what `sort` to use.
It shouldn't affect the answer you get, but it can impact the performance of your code.
(Modulo non-termination because you made your code too strict.)

I'm confused daniel, you said the problem was you need to traverse the list. But your implementation knows the length.

Efficiency in sort depends on pre-existing knowledge of the type and collection.

I don't yet read Haskell so I'm inferring a lot.

There are 2 problems with my `foldMap'`.
They are caused by the following function calls:

- `length xs` and
- `splitAt halfLen xs`

The problem with `length xs` is that it only works for finite lists.
For example

```haskell
countingNums :: [Integer]
countingNums = [1..]
```

is a list of all positive integers.

If you try to compute its length, you'll take infinite time because you'll never reach the end of the list where you would stop and return how long it is.

We can still do useful things with this list, for example

```haskell
all even countingNums
```

immediately returns `False`: because `even 1` is `False`, it can stop without inspecting the rest of the list.

(Fun-fact: `all ??? infiniteList` will never return `True`. It will either return `False` when it finds the first `False` element, or it will run forever as it searches for a `False` element or the (non-existent) end of the list.)
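To make the short-circuiting concrete, here's a small (assumed, but standard) demonstration you can load in GHCi:

```haskell
countingNums :: [Integer]
countingNums = [1 ..]

-- take only ever forces the first five cells, so this terminates.
firstFive :: [Integer]
firstFive = take 5 countingNums

-- all stops at the first counterexample: even 1 is False.
noneEven :: Bool
noneEven = all even countingNums

-- length countingNums, by contrast, would loop forever.
```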

The second issue is about what it means to be "classical merge-sort".

My `foldMap'` implementation splits the list in half.
This is close to what you would do in a traditional language, but in those languages you would normally have either:

- A pointer to the beginning of the array + a length, or
- A start and stop index into that section of the array

The problem here is that `[a]` is a linked list in Haskell.
If I make a `left` and `right` pair of linked lists (which is what I do), I have to loop over each element in the left half of the list to copy it into a new, shorter list.
But this means that my `foldMap'` merge-sort as defined here will probably take O(n^2) time, because it keeps making copies of our linked list.

If I try using offsets into the list to avoid the copying, I'll suffer from the O(n) indexing time for random elements in a linked list, which is probably even worse.

Can I solve both of these and traverse the linked list in a way that avoids the indexing and `splitAt` time costs? Almost certainly yes, but it would dramatically increase the complexity of my answer.

Likewise, there are ways to avoid asking for the length of the list but my answer would get much more alien in exchange for better handling long/infinite linked lists.

p.s. The simplest way to fix the second problem (but still fail the infinite-list one) is to do a one-time conversion to a `Data.Vector`, use that for O(1)-time `splitAt`s, and then convert back to a linked list at the end.
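A rough sketch of that fix, assuming the `vector` package (the names here are mine): build the vector once in O(n), after which each `splitAt` is an O(1) slice rather than an O(n) copy.

```haskell
import qualified Data.Vector as V

-- Divide and conquer over a vector: slicing is O(1),
-- so the splitting no longer dominates the merge cost.
reduceV :: Monoid b => V.Vector b -> b
reduceV v
  | V.null v        = mempty
  | V.length v == 1 = V.head v
  | otherwise       = reduceV left <> reduceV right
  where
    (left, right) = V.splitAt (V.length v `quot` 2) v

-- One O(n) conversion in; the result comes back out of the monoid itself.
foldMapVec :: Monoid b => (a -> b) -> [a] -> b
foldMapVec f = reduceV . V.fromList . map f
```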


Here's another interesting monoid from chris penner:

https://chrispenner.ca/posts/monoid-sort

the funny thing is that it seems to essentially be an insertion sort

He mentions them in his post, but anybody who wants to see a really compelling use-case for monoids should look at finger trees.

That's the thing behind Data.Seq, right?

Data.Sequence rather

Yes, `Data.Sequence` is basically a finger tree where our monoid is `Sum` and we map each element to `1`.

Looks like he has a post on those too:

https://chrispenner.ca/posts/intro-to-finger-trees
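As a quick illustration (my own snippet, assuming the standard `containers` package): because the finger tree caches that size measure, `Data.Sequence` gets an O(1) `length` and an O(log n) `splitAt`, exactly the two operations that were expensive on linked lists above.

```haskell
import qualified Data.Sequence as Seq

-- Splitting a Seq in half: length is O(1) and splitAt is O(log n),
-- because subtree sizes are cached in the finger tree's measure.
halves :: Seq.Seq a -> (Seq.Seq a, Seq.Seq a)
halves s = Seq.splitAt (Seq.length s `quot` 2) s
```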

I'm not sure that `toSort` function justifies itself vs concat/join-collection with `sort`.

E.g. why not just define what `sort` means over the collection at that point? Is `toSort` in addition to `sort` in some way?

I guess this is more of an exploratory thing than a practical one :smile:

~~This is parallelizable! :)~~ nvm, got confused

You could do `Sort . sort`, right? Oh wait, no, the concat isn't going to work :thinking:

Oh no, that's fine


James Sully said:
> The specific implementation of `sort` you get out is related to the implementation of `foldMap`. […]

Huh, true, I guess I was assuming that `foldMap` works left to right, but associativity of monoids means we clearly don't necessarily need to.

As a corollary, because `foldMap'` should produce the same results as `foldMap` does, you can replace `foldMap`s with `foldMap'`s (even though `foldMap'` isn't as lazy as `foldMap` is) and you don't need to worry about your answers changing.


In addition to `foldMap`s for different performance profiles, another way to benefit from monoids is the famous Exponentiation by Squaring algorithm.

You can use this algorithm to solve many problems in lg n time, pretty quickly.

If you define
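The definition itself didn't survive in this log; presumably it was the naive linear-time version, something like (my reconstruction, not the original code):

```haskell
-- Naive power: a linear number of uses of <>.
pow :: Monoid a => a -> Int -> a
pow _ 0 = mempty
pow x n = x <> pow x (n - 1)
```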

So `pow x n` is `x <> ... <> x`, n times, then we can compute `pow x n` in lg n `<>`s by rewriting `pow` with the Exponentiation by Squaring algorithm instead of the naive one I wrote above.

The most famous use-cases for this are fast exponentiation in modular arithmetic and matrix powers.

You can also use it to write a lg n time Fibonacci function like I wrote in this reddit comment, or solve the Knight's Dialer in lg n time.
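A sketch of the squaring rewrite, plus a toy matrix-power Fibonacci in the same spirit (this is my illustration, not the code from the reddit comment):

```haskell
-- Exponentiation by squaring: O(log n) uses of <>.
pow :: Monoid a => a -> Int -> a
pow _ 0 = mempty
pow x n
  | even n    = half <> half
  | otherwise = x <> half <> half
  where half = pow x (n `quot` 2)

-- 2x2 integer matrices multiply associatively, with the identity
-- matrix as mempty, so they form a Monoid...
newtype M2 = M2 (Integer, Integer, Integer, Integer)

instance Semigroup M2 where
  M2 (a, b, c, d) <> M2 (e, f, g, h) =
    M2 (a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)

instance Monoid M2 where
  mempty = M2 (1, 0, 0, 1)

-- ...and powers of [[1,1],[1,0]] contain Fibonacci numbers,
-- giving fib in O(log n) multiplications.
fib :: Int -> Integer
fib n = case pow (M2 (1, 1, 1, 0)) n of M2 (_, b, _, _) -> b
```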

Side-note:
If you don't require the ability to compute the `0`th power of a value, we can relax the `Monoid a` restriction to a `Semigroup a` restriction.
That's because the only difference between `Monoid` and `Semigroup` is whether or not `mempty` needs to be defined, and we only need `mempty` for the `0` case.
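Incidentally, base already ships this as `stimes` in `Data.Semigroup` (with `mtimesDefault` for the `Monoid` version). A hand-rolled `Semigroup`-only sketch, where the exponent must be at least 1:

```haskell
-- Squaring with only a Semigroup constraint: no mempty,
-- so the exponent must be >= 1.
pow1 :: Semigroup a => a -> Int -> a
pow1 x 1 = x
pow1 x n
  | even n    = half <> half
  | otherwise = x <> half <> half
  where half = pow1 x (n `quot` 2)
```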