Welcome to the Functional Programming Zulip Chat Archive. You can join the chat here. Hi guys, I'm having trouble understanding what things I should do to optimize my matrix multiplication algorithm.

Here's the code:

```data Matrix e cols rows where
Empty :: Matrix e Void Void
One :: e -> Matrix e () ()
Junc :: Matrix e a rows -> Matrix e b rows -> Matrix e (Either a b) rows
Split :: Matrix e cols a -> Matrix e cols b -> Matrix e cols (Either a b)

comp :: (Num e) => Matrix e cr rows -> Matrix e cols cr -> Matrix e cols rows
comp Empty Empty            = Empty
comp (One a) (One b)        = One (a * b)
comp (Junc a b) (Split c d) = comp a c + comp b d         -- Divide-and-conquer law
comp (Split a b) c          = Split (comp a c) (comp b c) -- Split fusion law
comp c (Junc a b)           = Junc (comp c a) (comp c b)  -- Junc fusion law
```

This is for my LAoP matrix library. I'm doing some benchmarks and I think it can/should go faster even though it's O(n³) and cache-oblivious. Is there any black magic that can be done? I'm also compiling with the following flags:

```    - -threaded
- -rtsopts
- -with-rtsopts=-N
- -O2
``` Should I use StrictData on the module where the data type is defined or the module where the benchmarks are? `StrictData` puts strictness annotations implicitly on every field - that means that during construction of datatype, values put into it are forced to WHNF - if they're strict in fields too, evaluation continues recursively