Sorry for the long delay.
Let's analyze the scenario of fake data insertion. I want to create a million fake products, sometimes even 100 million (we're on MariaDB now and we plan to migrate to Postgres). My team uses fake data for performance tests and other use cases. There is literally no way to sanitize that many records by hand.
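Just to make that concrete, a minimal sketch of the kind of load we generate (Postgres syntax; the products table and its columns are only illustrative):

    -- Generate one million fake products for performance tests
    INSERT INTO products (name, price)
    SELECT 'fake product ' || g,
           round((random() * 1000)::numeric, 2)
    FROM generate_series(1, 1000000) AS g;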
Another scenario is translations. Even in production we have translation files for more than 20 languages and more than 2 thousand keys. That means we need to insert 40 thousand translation records in production.
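Roughly like this (again, the table and column names are made up):

    -- 20+ languages x 2,000+ keys = 40,000+ translation rows in one statement
    INSERT INTO translations (language_code, translation_key, value)
    SELECT l.code, k.key, ''
    FROM languages AS l
    CROSS JOIN translation_keys AS k;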
Another scenario is updating nested model values for a large hierarchical table, for example the categories table. Anytime a user changes a record in that table, we need to recalculate the nested model for all categories and bulk-update the results.
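A rough sketch of that recalculation (assuming a hypothetical categories table with id and parent_id, and illustrative depth/path columns standing in for whatever the real nested-model columns are):

    -- Recompute the hierarchy for every category and bulk-update in one statement
    WITH RECURSIVE tree AS (
        SELECT id, id::text AS path, 0 AS depth
        FROM categories
        WHERE parent_id IS NULL
        UNION ALL
        SELECT c.id, t.path || '/' || c.id, t.depth + 1
        FROM categories AS c
        JOIN tree AS t ON c.parent_id = t.id
    )
    UPDATE categories AS c
    SET depth = t.depth,
        path  = t.path
    FROM tree AS t
    WHERE c.id = t.id;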
The point is, the database schema is not in our hands. We don't know what rules exist on each table or which rules change, and it's neither practical nor feasible to spend resources on keeping our bulk-insertion logic in sync with the database changes.
It would be good design for Postgres to add a catch-all handler for each row and report accordingly: give it 1 million records, and it should give you back 1 million results.
Is there a problem in implementing this? After all, one expects the most advanced open-source database to support this real-world requirement.
Regards
Saeed
On Sun, 2025-02-09 at 16:00 +0330, me nefcanto wrote:
> @laurenz if I use `insert into` or the `merge` would I be able to bypass records
> with errors? Or would I fail there too? I mean there are lots of ways a record
> can be limited. Unique indexes, check constraints, foreign key constraints, etc.
> What happens in those cases?
With INSERT ... ON CONFLICT, you can only handle primary and unique key violations.
MERGE allows somewhat more freedom, but it too only checks rows against existing
rows.
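For illustration, minimal sketches of both (table and column names are made up):

    -- Skip rows that collide on a primary or unique key
    INSERT INTO products (sku, name)
    SELECT sku, name FROM staging_products
    ON CONFLICT (sku) DO NOTHING;

    -- MERGE (PostgreSQL 15+): insert only rows that don't match an existing one
    MERGE INTO products AS p
    USING staging_products AS s ON p.sku = s.sku
    WHEN NOT MATCHED THEN
        INSERT (sku, name) VALUES (s.sku, s.name);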
You won't find a command that ignores or handles arbitrary kinds of errors.
You have to figure out what kinds of errors you expect and handle them explicitly
by running queries against the data.
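For example, an anticipated foreign key or check constraint violation can be handled
up front by filtering the offending rows out (names again made up):

    -- Insert only rows whose category exists and whose price is valid;
    -- the rest stay in the staging table for inspection.
    INSERT INTO products (sku, name, category_id, price)
    SELECT s.sku, s.name, s.category_id, s.price
    FROM staging_products AS s
    WHERE s.price >= 0
      AND EXISTS (SELECT 1 FROM categories AS c WHERE c.id = s.category_id);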
I don't think that a catch-all handler that swallows every error would be very
useful. Normally, there are certain errors you want to tolerate, while others
should be considered unrecoverable and make the statement fail.
Yours,
Laurenz Albe