Do we need to separate domain types and db access type? - Haskell

Welcome to the Functional Programming Zulip Chat Archive.

Rizary

Joel McCracken said:

yep. I agree with Kris fwiw, it's better to have your business logic in pure types, and have a monad transformer stack "at the edges". I'd like to keep PersistT out of things as much as possible just because of the principle of least power!

@Joel McCracken to avoid discussing a different topic in the same stream, I dug into this a little and found MaxGabriel's comment on this issue https://github.com/yesodweb/persistent/issues/1115. Looking at it, I'm not sure that what Kris said is true in all cases.

It has been reported that persistent encourages you to use your database types in your domain. We should investigate the documentation for persistent and ensure that is not the case. We should inst...
Georgi Lyubenov // googleson78

it may be possible to share the types sometimes, but I think it's much safer to always separate them by default

Georgi Lyubenov // googleson78

and only merge them if you have to optimise (although I don't imagine this deconstructing/constructing of types with one constructor really costs much in most cases)

Fintan Halpenny

I think another issue for said database was that the representation of the domain was tied to how you could formulate it in terms of the types the database could handle.
In reality, we could have had a better representation in the Haskell domain and have boundary conversions to/from the database representation.

Joel McCracken

Well, what Kris said is certainly an opinion; I don't think it's the only valid one. I do agree though, it's much easier to reason about pure code than about code that is interspersed with a monad transformer stack, so take that for what it's worth

Joel McCracken

There are two separate issues here though and I'm not sure which one is being discussed exactly; are we talking about having separate domain logic and persistence types, or are we discussing whether it makes sense to have PersistT interspersed throughout your business code?

Georgi Lyubenov // googleson78

we were discussing the domain vs persistence types, as far as I can tell

Georgi Lyubenov // googleson78

of course it's also better to not have SqlPersistT around in your business logic, the same way it's better not to have IO directly
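A minimal sketch of what "pure logic, effects at the edges" can look like. Everything here is hypothetical and only for illustration: the `Person` record, `haveBirthday`, and `celebrateBirthday` do not come from the thread or from persistent. The storage operations are passed in as plain functions over an abstract monad `m`; in a real application `m` might be `SqlPersistT IO`, but the business rule itself never mentions it.

```haskell
import Data.Int (Int64)

-- Hypothetical domain type, written by hand for this sketch.
data Person = Person
  { personId  :: Int64
  , personAge :: Int
  } deriving (Show, Eq)

-- Pure business rule: no database and no IO, so it can be tested directly.
haveBirthday :: Person -> Person
haveBirthday p = p { personAge = personAge p + 1 }

-- The "edge": loading and saving are supplied by the caller, so this code
-- works in any monad and does not depend on a particular persistence layer.
-- Returns False when no person with the given id was found.
celebrateBirthday
  :: Monad m
  => (Int64 -> m (Maybe Person))  -- how to load a person (assumed, caller-provided)
  -> (Person -> m ())             -- how to save a person (assumed, caller-provided)
  -> Int64
  -> m Bool
celebrateBirthday load save pid = do
  found <- load pid
  case found of
    Nothing -> pure False
    Just p  -> save (haveBirthday p) >> pure True
```

Because only a `Monad` constraint is required, the same function can be exercised in tests with an in-memory stand-in for the database and run in production against the real one.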

Joel McCracken

yeah so I mean there are several things here; generally the preferred domain type you might have is not going to be the same as the preferred type for persistence

Joel McCracken

https://github.com/yesodweb/persistent/issues/1115#issuecomment-690094112 @Georgi Lyubenov // googleson78 has a great example there

Joel McCracken

now, you can actually use them sometimes, but you just need to be aware that you have implicit coupling, and you may need to split them out at some point

Joel McCracken

would an example help illuminate the issue @Rizary ?

codygman

I bet an example would help. Separating domain types and persistent entities is a hard sell if many things end up like:

[persistLowerCase|
    PersonDBO
      age Int
|]

data Person = Person { _id :: Int64, age :: Int}

personDBOToPerson :: PersonDBO -> Person

If the two types end up being the same, it seems kind of pointless.

James King

It can seem like extra work. But it does give you some flexibility to change representation at the database layer without having to change your business logic.

It doesn't happen a lot in straight-forward CRUD web applications in my experience... but it can be nice for lazy loading related data, changing underlying representations for performance reasons, sharding, etc.

James King

I usually do this stuff in the database itself with physical tables being separated by views and materialized views. :shrug:

James King

You might also have some meta-information on the DB type that you could discharge to Either in the personDBOToPerson function... :thinking:

James King

That's pretty random. Never mind me. :sweat_smile:

Fintan Halpenny

Well, the Person example above is already broken. A Person shouldn't be able to have negative age :stuck_out_tongue_wink:

codygman

I think sometimes it can be good to separate, but for CRUD apps I can't make a good argument for separating all the time.

codygman

Right, so making smart constructors for person from personDbo justifies separation
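A minimal sketch of the smart-constructor idea, assuming a hand-written `PersonDBO` record as a stand-in for whatever persistent would generate (the field names `personDBOId` and `personDBOAge`, the `Age` newtype, and `AgeError` are all hypothetical). The conversion at the boundary is where a negative age gets rejected, so the rest of the code never sees one.

```haskell
import Data.Int (Int64)

-- Stand-in for the database-side record. Written by hand so the sketch has
-- no dependencies; the real generated type and field names may differ.
data PersonDBO = PersonDBO
  { personDBOId  :: Int64
  , personDBOAge :: Int
  } deriving (Show, Eq)

-- Domain-side age. In a real module the Age constructor would not be
-- exported, so mkAge would be the only way to build one.
newtype Age = Age Int
  deriving (Show, Eq, Ord)

data AgeError = NegativeAge Int
  deriving (Show, Eq)

-- Smart constructor: parse an Int into an Age, or explain why not.
mkAge :: Int -> Either AgeError Age
mkAge n
  | n < 0     = Left (NegativeAge n)
  | otherwise = Right (Age n)

data Person = Person
  { personId  :: Int64
  , personAge :: Age
  } deriving (Show, Eq)

-- Boundary conversion: the only place where database rows become Persons.
personDBOToPerson :: PersonDBO -> Either AgeError Person
personDBOToPerson dbo =
  Person (personDBOId dbo) <$> mkAge (personDBOAge dbo)
```

In GHCi, `personDBOToPerson (PersonDBO 1 30)` gives `Right` of a `Person`, while `personDBOToPerson (PersonDBO 2 (-3))` gives `Left (NegativeAge (-3))`. Once the two types differ like this, the conversion function is no longer a pointless copy.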

Fintan Halpenny

You might also want migration paths. If your DB representation changes, that doesn't necessarily mean your logic needs to

codygman

Thing is, the lazier thing is to say "eh that's not a big deal". I think using entities directly discourages type safety, esp in the "parse don't validate" sense

Rizary

Joel McCracken said:

would an example help illuminate the issue Rizary ?

it will if you don't mind!! thanks, always appreciate your response.

Joel McCracken

lemmie think on it; i think the issues would be best illustrated in a reasonably sizable example

Joel McCracken

one of the absolutely most useful techniques in Haskell is having your domain types be correct-by-construction, as referenced above

codygman

If your team isn't already sold on correct by construction or the default is "just use the entity", you'll need some pretty good motivating examples though.

codygman

I think I'll add the person example and how it can go wrong to the GitHub thread when I'm at a computer. Then after it's motivated show how in the domain type we parse instead of validating

codygman

I posted a pretty good sized comment with many examples talking about this in the issue:

https://github.com/yesodweb/persistent/issues/1115#issuecomment-690504252

Joel McCracken

Can I just say, i'm really glad this conversation is happening

Rizary

Thank you @codygman and @Joel McCracken for the discussion. Although sometimes I really wish I had a bigger project so I could learn more about this.

I really like lexi's blog post on "parse don't validate", and I think the way I directly access my entities goes against what that post advises

Joel McCracken

So I have an example -- it's not for a DB, but it is for another data layer.

Right now at work I am writing a system for pulling data in to then be reuploaded to our internal services. This thing should run on a regular basis, etc. The system acts on what is contained in a configuration file. Here is an example file:

api:
  version: "0.1.0"
init:
  name: "getFileForProcessing"
  module:
    tag: "file.getFileForProcessing"
    contents:
      fileLocation: "test/fixtures/appTest.csv"
      validate: True
  logLevel: error
  next: "mapFileForOutput"
actions:
- name: "makeOutput"
  module:
    tag:  "file.output"
    contents:
      destination: "test/fixtures/result2.out"
  next: log
- name: "log"
  module:
    tag:  "log"
    contents:
      level: error
      text: this is log output in worker
- name: "mapFileForOutput"
  module:
    tag:  "mapping.basic"
    contents:
      strict: False
      mappings:
        inputColumnName: output-column-name
    next: "makeOutput"
  logLevel: debug
  next: "makeOutput"

In this example, this process would read a file from the filesystem (presumably put there by another process)
and it would output it to the filesystem (to also be sent along to something that expects it there).
This is just to give you a simple idea of how the system works; it's not a realistic example, in reality we are
pulling data in via HTTP, modifying it, and then submitting it to graphql.

Anyway, so imagine this scenario:

api:
  version: "0.1.0"
init:
  name: "getFileForProcessing"
  module:
    tag: "file.getFileForProcessing"
    contents:
      fileLocation: "test/fixtures/appTest.csv"
      validate: True
  logLevel: error
  next: "mapFileForOutput"
actions: []

Notice that next does not match any action name. However, it IS valid YAML. And the data types that
the YAML parses into represent next as a Maybe Text, because it may or may not be there (the final
step in a process will not have a next value set).

The problem with this data representation is that it correctly parses this example. The only way this
issue is discovered is while the process is executing the steps, and it looks up the next step in actions, and
it finds that no action with that name exists. If we imagine that this is a very long running process, a user
might not get an alert that the error has occurred until hours after initiating the process with this config.

Now, we could write a post-parsing validation step that indicates if there is an error or not. But this still
leaves us with an annoying data model. Any time you want to figure out what the next step is in a process, you
have to do a lookup which might fail, which is just following the types. If we have a list of actions, then
looking for one with a matching name might not return any, or it might return multiple with the same name.

You can choose to ignore that possibility, but that is considered bad Haskell style. Instead, whenever possible
we try to make illegal states unrepresentable. So let's say that the data model is like this today:

data ConfigAction
  = ConfigAction
  { nextActionName :: Maybe Text
  ...
  }

We would rather have a type that looks like this:

data Action
  = Config
  { actionConfigAction :: ConfigAction
  , actionNextAction :: Maybe ConfigAction
  ...
  }

That way our code can use nextAction to access the next action; of course this might be the last one (hence the Maybe),
but this removes the possibility of there incorrectly being a named action that doesn't exist, or
multiple actions named the same thing.

So basically the idea is to "make illegal states unrepresentable", but also it's just more convenient to have datatypes that
map to your business rules. This is essentially the same as this https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
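The step from the name-based shape to the linked shape can be sketched roughly as follows. This is not the actual code from the system described above; it makes several simplifying assumptions, labeled here and in the comments: names are `String` rather than `Text`, the `init` step is treated as just another entry in the list, `ConfigAction` carries only a name and a next name, the constructor is called `Action`, and `actionNextAction` points at a resolved `Action`. The `actionName` field, `ResolveError`, and `resolveActions` are hypothetical names.

```haskell
import Control.Monad (foldM)
import qualified Data.Map as Map
import Data.Map (Map)

-- Raw shape, as it might come out of the YAML parser (simplified).
data ConfigAction = ConfigAction
  { actionName     :: String
  , nextActionName :: Maybe String
  } deriving (Show, Eq)

-- Resolved shape: the next step is a value, not a name to look up later.
data Action = Action
  { actionConfigAction :: ConfigAction
  , actionNextAction   :: Maybe Action
  }

data ResolveError
  = DuplicateName String       -- two actions share this name
  | UnknownNext String String  -- action name, and the next name it refers to
  deriving (Show, Eq)

-- Check names once, up front, then link every action to its successor.
-- Assumption: cycles are not detected; a cyclic config resolves to a
-- lazily built, infinitely nested Action.
resolveActions :: [ConfigAction] -> Either ResolveError (Map String Action)
resolveActions cfgs = do
  index <- foldM insertUnique Map.empty cfgs
  mapM_ (checkNext index) cfgs
  -- The lookups below cannot miss: checkNext already verified every name.
  let resolved = Map.map link index
      link cfg = Action cfg (nextActionName cfg >>= (`Map.lookup` resolved))
  pure resolved

insertUnique
  :: Map String ConfigAction
  -> ConfigAction
  -> Either ResolveError (Map String ConfigAction)
insertUnique m cfg
  | actionName cfg `Map.member` m = Left (DuplicateName (actionName cfg))
  | otherwise                     = Right (Map.insert (actionName cfg) cfg m)

checkNext :: Map String ConfigAction -> ConfigAction -> Either ResolveError ()
checkNext index cfg = case nextActionName cfg of
  Just n | n `Map.notMember` index -> Left (UnknownNext (actionName cfg) n)
  _                                -> Right ()
```

With this shape, the broken config from the scenario above (a first step whose next is "mapFileForOutput" and an empty list of actions) fails in `resolveActions` with an `UnknownNext` error before any step runs, instead of hours into the process. Code that executes the steps then only ever follows `actionNextAction`.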

Rizary

Interesting, I still need time to digest it, but that makes sense. One question to clarify my understanding: this nextAction helps us access the next action named in a ConfigAction's "next", right?