Long primary keys cause problems with mysql newer version #577

lfrank · 2023-04-14T20:21:00Z

lfrank
Apr 14, 2023
Maintainer

Mysql 8 has a limit of 3072 characters for the "index" which, I believe, corresponds to the primary key.

This currently causes problems in
common_backup.py (which we can remove)
and the spikesorting pipeline.

For the spikesorting pipeline, we can likely limit each varchar primary key to 175 characters with no loss, so we might want to try that.

jsoules · 2023-04-14T21:22:25Z

jsoules
Apr 14, 2023

@lfrank, you're referring to the InnoDB limit? (https://dev.mysql.com/doc/refman/8.0/en/innodb-limits.html) I believe that's 3072 bytes, so around 190 characters depending on your encoding.

I don't have your database structure to hand and I'm not expert in DataJoint, but it occurs to me that you might also have a risk of running into this limitation if there are tables in your application which need to refer to the primary keys of multiple other tables. In particular, if you want to support a many-to-many relationship between table A and table B, you need a "join table" or "crosswalk table" which records the unique pairing of a row from A and a row from B. Many tools will make the primary key of such a join table by concatenating the keys of the two fact tables. Even without such a rule being used to create the join table's primary key, you'd still likely need an index to enforce the UNIQUE constraint on the pairing of columns.

As a concrete example, consider an "Experimenters" table and an "Experiments" table; if you want to record that K was part of the team that ran experiment D, your system would likely create an Experimenters_Experiments table which uses an index to enforce that the combination of K and D is unique.

In such an instance, you could hit the index size cap just by allowing a many-to-many relation between two tables whose text primary keys were a mere 100 characters. And you can imagine cases where you need to record facts about the unique combination of an even larger number of tables' records; so you could run into problems with the indexes used to enforce UNIQUE or FOREIGN KEY constraints in those cases as well if they require concatenating even modest-length text primary keys.

Depending on how much you're willing to restructure your data, this limitation may constitute a reasonable case for using a synthetic primary key (or surrogate key, ie an arbitrary value assigned by the database).

0 replies

lfrank · 2023-04-14T21:34:48Z

lfrank
Apr 14, 2023
Maintainer Author

@jsoules Agreed on all points, but I believe we're limited to a total of 16 primary key entries in all cases, including when we inherit keys from upstream tables. That's where the 175 came from, but we'd certainly need to check all that.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Long primary keys cause problems with mysql newer version #577

{{title}}

Replies: 2 comments

{{title}}

{{title}}

Select a reply

Long primary keys cause problems with mysql newer version #577

lfrank Apr 14, 2023 Maintainer

Replies: 2 comments

jsoules Apr 14, 2023

lfrank Apr 14, 2023 Maintainer Author

lfrank
Apr 14, 2023
Maintainer

jsoules
Apr 14, 2023

lfrank
Apr 14, 2023
Maintainer Author