Thread: strange error sequence on parallel btree creation

strange error sequence on parallel btree creation

From: Alvaro Herrera

Hi

While trying out the progress report mechanism for btrees, I noticed
this strange chain of errors:

2019-01-29 15:51:55.928 -03 [43789] ERROR:  could not create unique index "a_generate_series_idx"
2019-01-29 15:51:55.928 -03 [43789] DETAIL:  Key (generate_series)=(152) is duplicated.
2019-01-29 15:51:55.928 -03 [43789] STATEMENT:  create unique index concurrently on a (generate_series);
2019-01-29 15:51:55.928 -03 [44634] ERROR:  could not create unique index "a_generate_series_idx"
2019-01-29 15:51:55.928 -03 [44634] DETAIL:  Key (generate_series)=(31339) is duplicated.
2019-01-29 15:51:55.928 -03 [44634] STATEMENT:  create unique index concurrently on a (generate_series);
2019-01-29 15:51:55.985 -03 [43670] LOG:  background worker "parallel worker" (PID 44634) exited with exit code 1


Note that those come from the same create index: the one on process
46299 must evidently be a parallel worker.  It's weird that two
processes report the index building error.  But even if it were correct,
the CONTEXT line in the other process is not okay ... precisely because
it's the parent.

What I did was

create table a as select * from generate_series(1, 1000000);
insert into a select * from generate_series(1, 80000000);
create index on a (generate_series);

The last command used the laptop disk, excessive use of which causes the whole
thing to stall for a few dozen seconds (I think it's because of the encryption,
but I'm not sure).  I then changed lc_messages to C, for pasting here, and
repeated with an external USB drive -- result: it fails cleanly (only one
ERROR).

-- 
Álvaro Herrera


Re: strange error sequence on parallel btree creation

From: Peter Geoghegan

Hi Álvaro,

On Wed, Jan 30, 2019 at 5:40 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Note that those come from the same create index: the one on process
> 46299 must evidently be a parallel worker.  It's weird that two
> processes report the index building error.  But even if it were correct,
> the CONTEXT line in the other process is not okay ... precisely because
> it's the parent.
>
> What I did was
>
> create table a as select * from generate_series(1, 1000000);
> insert into a select * from generate_series(1, 80000000);
> create index on a (generate_series);

I can see why you'd find that slightly confusing. I'm not sure what
can be done about this scenario specifically, though. It seems to come
down to how the parallel infrastructure works, which is not something
that I gave much input on.

Fundamentally, the parallel infrastructure wants to propagate all
errors that it received before parallel workers were shut down. I
think that that's probably the right thing to do. I'm not sure what
you mean by "But even if it were correct, the CONTEXT line in the
other process is not okay ... precisely because it's the parent".
Perhaps you can go into more detail on that. The CONTEXT line looks the
same as it would without this race.

In any case, I think that the chances of this happening in production
are pretty slim. The error messages each refer to specific, distinct
pairs of duplicates (duplicated values). It's probably necessary to
have an enormous number of duplicates for things to work out this way.
That's hardly typical.
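
As a rough illustration of the point above, here is a toy Python model (not
PostgreSQL's actual code) in which each worker sorts its own share of the
input and reports the first duplicate it sees, and the leader collects every
error it receives. The names first_duplicate and build_unique_index, the
strided split of the input, and the message text are all assumptions made for
this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def first_duplicate(share):
    """Sort one worker's share and return its first duplicated key, or None."""
    ordered = sorted(share)
    for prev, cur in zip(ordered, ordered[1:]):
        if prev == cur:
            return cur
    return None


def build_unique_index(values, nworkers=2):
    """Toy leader: split the input, run the workers, collect every error.

    The strided split (values[w::nworkers]) is a hypothetical stand-in for
    however the real scan hands out work to its participants.
    """
    shares = [values[w::nworkers] for w in range(nworkers)]
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        # pool.map returns results in worker order.
        duplicates = list(pool.map(first_duplicate, shares))
    # Every worker that hit a duplicate contributes its own error message.
    return [f"Key ({key}) is duplicated." for key in duplicates if key is not None]


# Scaled-down version of the table from the report: many duplicated keys.
table = list(range(1, 1001)) + list(range(1, 8001))
errors = build_unique_index(table)
for message in errors:
    print("ERROR:", message)
```

With this many duplicates, each worker finds a different duplicated key in its
own share, so the leader ends up with more than one error. With few or no
duplicates, at most one worker would normally have anything to report.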

--
Peter Geoghegan