Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rudimentary support for enum types #118

Merged
merged 5 commits into from
Jan 27, 2025
Merged

Conversation

zlondrej
Copy link
Contributor

@zlondrej zlondrej commented Jan 9, 2025

  • Added rudimentary support for enum types.
  • Grouped some parameters of migrateDatabase and checkDatabase into a DatabaseDefinitions record type,
    as it was getting confusing with 5 list parameters.
  • Removed stylish-haskell configuration file, since we format moved to fourmolu.

@zlondrej zlondrej requested review from arybczak and Raveline January 9, 2025 13:58
Copy link
Contributor

@Raveline Raveline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First and foremost: thank you for tackling this. It's a tricky feature, that is indeed lacking from this library.

I am somewhat not inclined to approve as is, on several basis.
1°) The PR does too many unrelated things, though I fully support dropping the stylish-haskell and I support the refactoring of database definition.
2°) The lack of support for enum update seems precarious (particularly when it comes to migrations).

Though I understand reluctancy to add
ALTER TYPE xxx ADD VALUE 'newvalue'
I would advocate adding at least support for
ALTER TYPE xxx RENAME TO yyy which would be invaluable when migrating enums.

checkDatabaseWithReport
options
DatabaseDefinitions
{ dbExtensions = _ -- We currently don't check extensions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no "we", here since this is a public library. If this is not supported, this shouldn't be listed.

Copy link
Contributor Author

@zlondrej zlondrej Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get rid of the comment or rephrase it, but I'd rather add extension checking than splitting DatabaseDefinitions into two different types with difference of one field being present/absent if this is really necessary for approval.

Especially considering that listing installed extensions is as easy as select extname::text from pg_catalog.pg_extension. We already perform the check during migration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Please get rid of the comment then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering whether I should do the actual checking, but it's yet another unrelated thing, which you already complained about.

unless (resultHasErrors report) $
tell =<< lift (checkInitialSetups tables)
where
checkInitialSetups :: [Table] -> m ValidationResult
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ghc lets you do it (and it should), it might be a good opportunity to remove the intermediary signatures here, I think this is clear enough as it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially just moved code, so I didn't made any changes that were not necessary. But I can do that.

@zlondrej
Copy link
Contributor Author

zlondrej commented Jan 16, 2025

Though I understand reluctancy to add
ALTER TYPE xxx ADD VALUE 'newvalue'
I would advocate adding at least support for
ALTER TYPE xxx RENAME TO yyy which would be invaluable when migrating enums.

I didn't introduce anything that wasn't needed at the moment, especially given that you can write raw queries in the migrations anyway.

@zlondrej
Copy link
Contributor Author

I am somewhat not inclined to approve as is, on several basis.
1°) The PR does too many unrelated things, though I fully support dropping the stylish-haskell and I support the refactoring of database definition.
2°) The lack of support for enum update seems precarious (particularly when it comes to migrations).

  1. I don't really understand this general obsession with PR "purity". Sometimes there's a trash lying (no longer used stylish-haskell configuration) around so you sweep it when you walk around it. Because let's be honest, people usually don't make PR's just for that. And the DB definition refactoring really was necessary at this point, because having 5 sequential list parameters was just confusing to use.
  2. Yeah, there are at least two enum update operations. Rename is straightforward, but add value has 2 optional parameters, which I'm not sure how to model (3 different functions, or 1 function with boolean for IF NOT EXIST and some Maybe EnumNeighbor for BEFORE | AFTER parameter). Could do that, but at the same time, I didn't find it too important since writing the raw SQL is still available.

ALTER TYPE name ADD VALUE [ IF NOT EXISTS ] new_enum_value [ { BEFORE | AFTER } neighbor_enum_value ]
ALTER TYPE name RENAME VALUE existing_enum_value TO new_enum_value

@Raveline
Copy link
Contributor

  1. I don't really understand this general obsession with PR "purity".

That's because you haven't had to do enough rollback in your life, lucky man that you are !

Joking aside, it's also a way to make reviewing easier.

On your second point, I think rename is the safest option (based on experience). In the meantime, ok for using RawSQL.

@zlondrej
Copy link
Contributor Author

I'm confused about what you mean by "rename is the safest option". It's one of two operations for managing enums, they are meant to do different thing.

@Raveline
Copy link
Contributor

Renaming used to be a very common way to update enums before ADD VALUE was added in 9.1 (or when you needed to run this outside of a transaction): you would rename an enumerated type to something "myenum_legacy" and recreate your type. It's still the way to go if you need to remove a value from the type by the way.

@arybczak
Copy link
Collaborator

arybczak commented Jan 24, 2025

It's still the way to go if you need to remove a value from the type by the way.

Well, apparently there's a hacky way to modify an existing enum type.

The PR looks alright to me. However, I'm still not sure if we want "proper" enums, specially since they aren't perfect and hpqtypes doesn't support them, so insertion and retrieval are going to be a pain. Yes, I know, you said you could add support for them to hpqtypes, but it's not going to be particularly easy (nor pleasant).

What's more, we already support serialization of enum Haskell types to readable values on the DB side via EnumAsTextEncoding, which "just works" (see KnownJobType in kontrakcja as an example). Is there any particular reason you didn't use it?

"Enum '"
<> enumName
<> "' has same values, but differs in order (database:"
<+> T.pack (show dbValues)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use just <> here (or mconcat) and add spaces where necessary? It's a bit confusing to read with different operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I only used <+> because it was already used in similar messages elsewhere.

, dbDomains = domains
, dbTables = tables
} = execWriterT $ do
(_, report) <- W.listen $ do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't particularly like usage of Writer here, but can't think of anything better.

@zlondrej
Copy link
Contributor Author

zlondrej commented Jan 24, 2025

What's more, we already support serialization of enum Haskell types to readable values on the DB side via EnumAsTextEncoding, which "just works" (see KnownJobType in kontrakcja as an example). Is there any particular reason you didn't use it?

I did use it, but the storage type for that is Text. The "support" for enum types can be achieved fairly easily by combining EnumAsTextEncoding with explicit casting.

@zlondrej
Copy link
Contributor Author

Renaming used to be a very common way to update enums before ADD VALUE was added in 9.1 (or when you needed to run this outside of a transaction): you would rename an enumerated type to something "myenum_legacy" and recreate your type. It's still the way to go if you need to remove a value from the type by the way.

Okay, but then you're renaming the entire type, not just one of the enum values. It's a different thing.
I'm also confused how this would work, if your column is of type myenum, you rename the type to myenum_legacy wouldn't that "change" the type of enum for the column to myenum_legacy as well?

And how does it achieve correct mapping between the old and new enums?

@arybczak
Copy link
Collaborator

I did use it, but the storage type for that is Text

I know, but is this really an issue?

@arybczak
Copy link
Collaborator

@marco44 any thoughts about simply using a text column instead of enum?

@marco44
Copy link

marco44 commented Jan 24, 2025

Both have their advantages

Enum is more compact and it enforces what's in the list efficiently. But you have this limitation on removing an element from the list. And as it's just a number in disguise, the stats collection is faster and more accurate.

Text stores any string, but check constraints to enforce the content are a pain: it takes more resources to enforce, and needs a very slow & resource heavy recheck each time you change your mind about the content of the list.

Indexes (btrees) are also faster on enums, of course... as it's much easier to sort (we're doing unicode collations…) and more compact

@arybczak
Copy link
Collaborator

@zlondrej Ok, if you think enum is better than text in your case, considered potential cons and found them not problematic, let's merge this.

@zlondrej zlondrej merged commit e9162e1 into master Jan 27, 2025
9 checks passed
@zlondrej zlondrej deleted the rudimentary-enum-support branch January 27, 2025 11:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants