I have performed a comparison of different ways of implementing UPSERT in Postgres. Maybe it will be interesting not only to me, so I am sharing my results.
First of all, the initialization step:
create table jsonb_schemas(id serial, schema bytea primary key);
create unique index on jsonb_schemas(id);
insert into jsonb_schemas (schema) values ('some')
    on conflict(schema) do nothing returning id;
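Note that when the schema already exists, "on conflict(schema) do nothing" suppresses the insert and the returning clause yields no row, so a plain INSERT cannot be used to fetch the ID of an existing key. For illustration:

-- repeating the insert for an already-stored schema: the conflict is
-- suppressed and RETURNING produces no row
insert into jsonb_schemas (schema) values ('some')
    on conflict(schema) do nothing returning id;
-- (0 rows)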
Then I test the performance of getting the ID of an existing schema:
1. Use a PL/pgSQL function to avoid unneeded database modifications:
create function upsert(obj_schema bytea) returns integer as $$
declare
    obj_id integer;
begin
    select id from jsonb_schemas where schema = obj_schema into obj_id;
    if obj_id is null then
        insert into jsonb_schemas (schema) values (obj_schema)
            on conflict(schema) do nothing
            returning id into obj_id;
        if obj_id is null then
            select id from jsonb_schemas where schema = obj_schema into obj_id;
        end if;
    end if;
    return obj_id;
end;
$$ language plpgsql;
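The benchmark then reduces to repeatedly calling this function; the actual key values used in the measurements are not shown here, so this is just a sketch of the invocation:

-- returns the existing ID without modifying the table
select upsert('some'::bytea);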
Under parallel execution the PL/pgSQL variant can fail: the possible race conditions are not handled.
So maybe that is the reason why it is so fast.
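For completeness: a variant that also tolerates concurrent deletes would retry in a loop, in the spirit of the merge example in the PostgreSQL documentation (a sketch under a hypothetical name, not one of the measured variants):

create function upsert_safe(obj_schema bytea) returns integer as $$
declare
    obj_id integer;
begin
    loop
        select id from jsonb_schemas where schema = obj_schema into obj_id;
        if obj_id is not null then
            return obj_id;
        end if;
        insert into jsonb_schemas (schema) values (obj_schema)
            on conflict(schema) do nothing
            returning id into obj_id;
        if obj_id is not null then
            return obj_id;
        end if;
        -- neither statement produced an ID: the row appeared (making the
        -- insert conflict) and was deleted again before the re-select.
        -- This can only happen if rows may be deleted concurrently; retry.
    end loop;
end;
$$ language plpgsql;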
With this example I model a real use case where we need to map a long key (a JSON schema in this case) to a short identifier (a serial column in this case). Rows of jsonb_schemas are never updated: it is an append-only dictionary. Under this assumption, no race condition can happen with this PL/pgSQL implementation (nor with the other UPSERT implementations).
Yes, the performance depends on the possibilities: on whether you can implement optimistic or pessimistic locking (or on whether you know that there is no possibility of a race condition).
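A pessimistic variant could, for instance, serialize writers of the same key with a transaction-scoped advisory lock. A minimal sketch, assuming every writer goes through this function (hypothetical name; hashtextextended requires PostgreSQL 11+):

create function upsert_locked(obj_schema bytea) returns integer as $$
declare
    obj_id integer;
begin
    -- take a transaction-scoped advisory lock derived from the key, so
    -- concurrent callers with the same schema run one at a time; the
    -- lock is released automatically at commit or rollback
    perform pg_advisory_xact_lock(hashtextextended(encode(obj_schema, 'hex'), 0));
    select id from jsonb_schemas where schema = obj_schema into obj_id;
    if obj_id is null then
        insert into jsonb_schemas (schema) values (obj_schema)
            returning id into obj_id;
    end if;
    return obj_id;
end;
$$ language plpgsql;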