Thread: Proposal: Better generation of values in GENERATED columns.

Proposal: Better generation of values in GENERATED columns.

From

Daniel Migowski

Date:

26 August 2019, 11:28:47

Hello,

one of the most frustating things when I started with PostgreSQL was 
that IDENTITY columns are based on sequences that are completly 
disconnected from the table contents and manually imported data will 
lead to errors like 'duplicate key value violates unique constraint 
"xyz_pkey"'.

I had to fight this initially with an insert trigger that always updates 
the sequences on each insert, or with client side code that updates the 
sequence when such an error occurs and then retries the insert.

Even Microsoft Access did a better job at autogenerated primaries keys, 
and while I love the elegant design of PostgreSQL in many ways I believe 
we can do better here. I would like to implement a fallback solution 
that detects such errors and automatically updates the nextvalue of the 
sequence when the nextvalue is already used on insert.

I believe this can be implemented without affecting performance 
negatively when one just does extra stuff in the error case, so I 
wouldn't do table scans when creating the insert initially.

Any reasons why this isn't a good idea to try?

Regards,
Daniel Migowski

Re: Proposal: Better generation of values in GENERATED columns.

From

Peter Eisentraut

Date:

26 August 2019, 17:18:55

On 2019-08-26 13:28, Daniel Migowski wrote:
> I would like to implement a fallback solution 
> that detects such errors and automatically updates the nextvalue of the 
> sequence when the nextvalue is already used on insert.

ISTM that such a system would likely have many of the same problems as
the often-proposed ignore-errors mode for COPY, namely that you can't
catch errors and do something else other than rethrowing the error.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Proposal: Better generation of values in GENERATED columns.

From

Tom Lane

Date:

26 August 2019, 17:46:41

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 2019-08-26 13:28, Daniel Migowski wrote:
>> I would like to implement a fallback solution 
>> that detects such errors and automatically updates the nextvalue of the 
>> sequence when the nextvalue is already used on insert.

> ISTM that such a system would likely have many of the same problems as
> the often-proposed ignore-errors mode for COPY, namely that you can't
> catch errors and do something else other than rethrowing the error.

In principle you could probably use the same infrastructure used by
ON CONFLICT to detect the unique-key violation.  But ON CONFLICT is
mighty complicated, and not very cheap either.  I don't for one second
believe Daniel's assertion that this could be done without a
significant COPY performance hit.

I'm also dubious that the right response to a duplicate key would be
as simple as "try the next nextval() result".  I for one wouldn't be
satisfied with waiting for COPY to grind through a few thousand/million
sequential nextval values before finding one that doesn't conflict with
the existing table entries.

The actually-sound advice for loading data that might have conflicting
serial values is to do the equivalent of

    setval('sequence', max(existing keys) + 1)

before you start loading.  I wonder whether there's a way to make that
simpler.

            regards, tom lane