yep. I agree with Kris fwiw, it's better to have your business logic in pure types, and have a monad transformer stack "at the edges". I'd like to keep PersistT out of things as much as possible just because of the principle of least power!
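A minimal sketch of the "pure core, effects at the edges" idea from the message above. Everything here is hypothetical and not taken from the thread: the `Account` type, `withdraw`, and `withdrawIO` are invented for illustration, and an `IORef` stands in for the database so the example needs nothing beyond `base`. In a real persistent application the edge function would presumably live in the database monad instead.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical domain type (an assumption, not from the thread).
data Account = Account { owner :: String, balance :: Int }
  deriving (Eq, Show)

-- Pure business rule: no IO and no database monad in the signature,
-- so it can be tested and reasoned about in isolation.
withdraw :: Int -> Account -> Either String Account
withdraw amount acct
  | amount <= 0           = Left "amount must be positive"
  | amount > balance acct = Left "insufficient funds"
  | otherwise             = Right acct { balance = balance acct - amount }

-- The "edge": load, call the pure rule, store the result.
-- The IORef is a stand-in for a real database handle.
withdrawIO :: IORef Account -> Int -> IO (Either String Account)
withdrawIO ref amount = do
  acct <- readIORef ref
  case withdraw amount acct of
    Left err    -> pure (Left err)
    Right acct' -> do
      writeIORef ref acct'
      pure (Right acct')
```

Only `withdrawIO` knows that storage exists; swapping the `IORef` for a real backend would not touch `withdraw`. `newIORef` is imported so a store can be created when trying this in GHCi.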
Joel McCracken said:
@Joel McCracken to avoid discussing different topics in the same stream, I dug into this a little and found MaxGabriel's comment on this issue: https://github.com/yesodweb/persistent/issues/1115. Looking at it, I'm not sure that what Kris said is true in all cases.
it may be possible to share the types sometimes, but I think it's much safer to always separate them by default
and only merge them if you have to optimise (although I don't imagine this deconstructing/constructing of types with one constructor to really cost much in most cases)
I think another issue for said database was that the representation of the domain was tied to how you could formulate it in terms of the types the database could handle.
In reality, we could have had a better representation in the Haskell domain and have boundary conversions to/from the database representation.
Well, what Kris said is certainly an opinion; i don't think it's the only valid one. I do agree though, it's much easier to reason with pure code instead of code that is interspersed with a monad transformer stack, so take that for what it's worth
There are two separate issues here though and I'm not sure which one is being discussed exactly; are we talking about having separate domain logic and persistence types, or are we discussing if it makes sense to have PersistT interspersed throughout your business code?
we were discussing the domain vs persistence types, as far as I can tell
of course it's also better to not have SqlPersistT around in your business logic, the same way it's better not to have IO directly
yeah so i mean there are several things here; generally the preferred domain type you might have is not going to be the same as the preferred type for persistence
https://github.com/yesodweb/persistent/issues/1115#issuecomment-690094112 @Georgi Lyubenov // googleson78 has a great example there
now, you can actually use them sometimes, but you just need to be aware that you have implicit coupling, and you may need to split them out at some point
would an example help illuminate the issue @Rizary ?
I bet an example would help. Separating domain types and persistent entities is a hard sell if many things end up like:
If the two types end up being the same, it seems kind of pointless.
It can seem like extra work. But it does give you some flexibility to change representation at the database layer without having to change your business logic.
It doesn't happen a lot in straight-forward CRUD web applications in my experience... but it can be nice for lazy loading related data, changing underlying representations for performance reasons, sharding, etc.
I usually do this stuff in the database itself with physical tables being separated by views and materialized views. :shrug:
You might also have some meta-information on the DB type that you could discharge to Either in the personDBOToPerson function... :thinking:
That's pretty random. Never mind me. :sweat_smile:
Well, the Person example above is already broken. A Person shouldn't be able to have negative age :stuck_out_tongue_wink:
I think sometimes it can be good to separate, but for CRUD apps I can't make a good argument for separating all the time.
Right, so making smart constructors for person from personDbo justifies separation
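The original Person snippet is not preserved in this log, so the following is only a sketch of what "smart constructors for Person from PersonDBO" could look like. The function name `personDBOToPerson` and the negative-age problem come from the messages above; the field names, the `Age` newtype, `mkAge`, and `PersonError` are assumptions made for illustration.

```haskell
-- Database-facing type: mirrors the table, so any Int is accepted.
-- (Field names are assumptions; the thread's original code is missing.)
data PersonDBO = PersonDBO { dboName :: String, dboAge :: Int }
  deriving (Eq, Show)

-- Domain-side age. In a real module the Age constructor would not be
-- exported, so mkAge is the only way to obtain one.
newtype Age = Age Int
  deriving (Eq, Show)

data Person = Person { personName :: String, personAge :: Age }
  deriving (Eq, Show)

-- Hypothetical error type for the boundary conversion.
newtype PersonError = NegativeAge Int
  deriving (Eq, Show)

-- Smart constructor: the only place the "age >= 0" rule is checked.
mkAge :: Int -> Either PersonError Age
mkAge n
  | n < 0     = Left (NegativeAge n)
  | otherwise = Right (Age n)

-- Boundary conversion from the database representation to the domain.
personDBOToPerson :: PersonDBO -> Either PersonError Person
personDBOToPerson (PersonDBO name age) = Person name <$> mkAge age

-- Going back to the database type cannot fail.
personToPersonDBO :: Person -> PersonDBO
personToPersonDBO (Person name (Age n)) = PersonDBO name n
```

With this split, code that receives a `Person` never has to re-check the age; a row with a negative age is rejected once, at the boundary, as `Left (NegativeAge n)`.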
You might also want migration paths. If your DB representation changes, that doesn't necessarily mean your logic needs to
Thing is, the lazier thing is to say "eh that's not a big deal". I think using entities directly discourages type safety, esp in the "parse don't validate" sense
Joel McCracken said:
it will if you don't mind!! thanks, always appreciate your response.
lemmie think on it; i think the issues would be best illustrated in a reasonably sizable example
one of the absolutely most useful techniques in haskell is having your domain types be correct-by-construction, like is referenced above
If your team isn't already sold on correct by construction or the default is "just use the entity", you'll need some pretty good motivating examples though.
I think I'll add the person example and how it can go wrong to the GitHub thread when I'm at a computer. Then after it's motivated show how in the domain type we parse instead of validating
I posted a pretty good sized comment with many examples talking about this in the issue:
https://github.com/yesodweb/persistent/issues/1115#issuecomment-690504252
Can I just say, i'm really glad this conversation is happening
Thank you @codygman and @Joel McCracken for the discussion. Although sometimes I really wish I had a bigger project so I could learn more about this.
I really like lexi's blog post on "parse don't validate" and I think that the way I directly access my entities goes against what is advised in that post
So I have an example -- it's not for a DB, but it is for another data layer.
Right now at work I am writing a system for pulling data in to then be reuploaded to our internal services. This thing should run on a regular basis, etc. The system acts on what is contained in a configuration file. Here is an example file:
In this example, this process would read a file from the filesystem (presumably put there by another process)
and it would output it to the filesystem (to also be sent along to something that expects it there).
This is just to give you a simple idea of how the system works; it's not a realistic example, in reality we are
pulling data in via HTTP, modifying it, and then submitting it to graphql.
Anyway, so imagine this scenario:
notice that next does not match any action name. However, it IS valid yaml. And the way the data types that
the yaml parses into work is that next is a Maybe Text, because it may or may not be there (the final
step in a process will not have a next value set).
The problem with this data representation is that it correctly parses this example. The only way this
issue is discovered is while the process is executing the steps, and it looks up the next step in actions, and
it finds that no action with that name exists. If we imagine that this is a very long running process, a user
might not get an alert that the error has occurred until hours after initiating the process with this config.
Now, we could write a post-parsing validation step that indicates if there is an error or not. But this still
leaves us with an annoying data model. Any time you want to figure out what the next step is in a process, you
have to do a lookup which might fail, which is just following the types. If we have a list of actions, then
looking for one with a matching name might not return any, or it might return multiple with the same name.
You can choose to ignore that possibility, but that is considered bad Haskell style. Instead, whenever possible
we try to make illegal states unrepresentable. So let's say that the data model is like this today:
We would rather have a type that looks like this:
That way our code can use nextAction to access the next action; of course this might be the last one (hence the Maybe),
but this removes the possibility of there incorrectly being a named action that doesn't exist, or
multiple actions named the same thing.
So basically the idea is to "make illegal states unrepresentable", but also it's just more convenient to have datatypes that
map to your business rules. This is essentially the same as this https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
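The config file and both type definitions referred to above did not survive in this log, so here is a hedged reconstruction of the idea, not the actual code from that system. The names `next`, `nextAction`, and `ConfigAction` appear in the thread; `Action`, `ConfigError`, `resolveAll`, and `duplicates` are made up for this sketch, and `String` is used in place of `Text` to stay within `base`.

```haskell
import Data.List (nub)

-- What the YAML decodes into today: `next` is only a name, so a typo
-- still parses and is found much later, while the process is running.
data ConfigAction = ConfigAction
  { name :: String
  , next :: Maybe String
  } deriving (Eq, Show)

-- What the rest of the program would rather use: the next step is the
-- action itself, so a dangling or ambiguous name cannot be represented.
data Action = Action
  { actionName :: String
  , nextAction :: Maybe Action
  } deriving (Eq, Show)

-- Hypothetical error type for the parsing step.
data ConfigError
  = DuplicateName String
  | UnknownNext String String  -- action name, then the missing target
  | Cycle String
  deriving (Eq, Show)

-- Parse, don't validate: turn the raw config into resolved actions, or
-- report the first problem before any step has been executed.
resolveAll :: [ConfigAction] -> Either ConfigError [Action]
resolveAll cfgs = do
  case duplicates (map name cfgs) of
    (d : _) -> Left (DuplicateName d)
    []      -> Right ()
  mapM (go []) cfgs
  where
    -- `seen` holds the names already on the current chain, to catch cycles.
    go seen cfg
      | name cfg `elem` seen = Left (Cycle (name cfg))
      | otherwise = case next cfg of
          Nothing     -> Right (Action (name cfg) Nothing)
          Just target -> case filter ((== target) . name) cfgs of
            [found] -> Action (name cfg) . Just <$> go (name cfg : seen) found
            _       -> Left (UnknownNext (name cfg) target)

-- Names that occur more than once, each reported once.
duplicates :: [String] -> [String]
duplicates xs = nub [x | (x, i) <- zip xs [0 :: Int ..], x `elem` take i xs]
```

After `resolveAll` succeeds, code that walks the steps just pattern matches on `nextAction`; there is no lookup left that could fail. The cycle check is an extra assumption of this sketch, added so that resolution always terminates.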
Interesting, I still need time to digest it, but that makes sense. One question to clarify my understanding: this nextAction helps us access the next action in the "next" ConfigAction, right?
yessir!