Thread: Proposal: Better generation of values in GENERATED columns.
Hello, one of the most frustating things when I started with PostgreSQL was that IDENTITY columns are based on sequences that are completly disconnected from the table contents and manually imported data will lead to errors like 'duplicate key value violates unique constraint "xyz_pkey"'. I had to fight this initially with an insert trigger that always updates the sequences on each insert, or with client side code that updates the sequence when such an error occurs and then retries the insert. Even Microsoft Access did a better job at autogenerated primaries keys, and while I love the elegant design of PostgreSQL in many ways I believe we can do better here. I would like to implement a fallback solution that detects such errors and automatically updates the nextvalue of the sequence when the nextvalue is already used on insert. I believe this can be implemented without affecting performance negatively when one just does extra stuff in the error case, so I wouldn't do table scans when creating the insert initially. Any reasons why this isn't a good idea to try? Regards, Daniel Migowski
On 2019-08-26 13:28, Daniel Migowski wrote: > I would like to implement a fallback solution > that detects such errors and automatically updates the nextvalue of the > sequence when the nextvalue is already used on insert. ISTM that such a system would likely have many of the same problems as the often-proposed ignore-errors mode for COPY, namely that you can't catch errors and do something else other than rethrowing the error. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 2019-08-26 13:28, Daniel Migowski wrote: >> I would like to implement a fallback solution >> that detects such errors and automatically updates the nextvalue of the >> sequence when the nextvalue is already used on insert. > ISTM that such a system would likely have many of the same problems as > the often-proposed ignore-errors mode for COPY, namely that you can't > catch errors and do something else other than rethrowing the error. In principle you could probably use the same infrastructure used by ON CONFLICT to detect the unique-key violation. But ON CONFLICT is mighty complicated, and not very cheap either. I don't for one second believe Daniel's assertion that this could be done without a significant COPY performance hit. I'm also dubious that the right response to a duplicate key would be as simple as "try the next nextval() result". I for one wouldn't be satisfied with waiting for COPY to grind through a few thousand/million sequential nextval values before finding one that doesn't conflict with the existing table entries. The actually-sound advice for loading data that might have conflicting serial values is to do the equivalent of setval('sequence', max(existing keys) + 1) before you start loading. I wonder whether there's a way to make that simpler. regards, tom lane