Thread: BUG #6393: cluster sometimes fails under heavy concurrent write load

BUG #6393: cluster sometimes fails under heavy concurrent write load

From
maxim.boguk@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      6393
Logged by:          Maxim Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 9.0.6
Operating system:   Linux Ubuntu
Description:

I have a table under heavy write load on PostgreSQL 9.0.6, and sometimes (not
always, but with more than a 50% chance) I get the following error during CLUSTER:

db=# cluster public.enqueued_mail;
ERROR:  duplicate key value violates unique constraint
"pg_toast_119685646_index"
DETAIL:  Key (chunk_id, chunk_seq)=(119685590, 0) already exists.

The chunk_id is different each time.

No uncommon data types exist in the table.

I am currently working on creating a reproducible test case (it seems to
require 2-3 open write transactions on the table).

Re: BUG #6393: cluster sometimes fails under heavy concurrent write load

From
Alvaro Herrera
Date:
Excerpts from maxim.boguk's message of Tue Jan 10 23:00:59 -0300 2012:
> The following bug has been logged on the website:
>
> Bug reference:      6393
> Logged by:          Maxim Boguk
> Email address:      maxim.boguk@gmail.com
> PostgreSQL version: 9.0.6
> Operating system:   Linux Ubuntu
> Description:
>
> I have a table under heavy write load on PostgreSQL 9.0.6, and sometimes
> (not always, but with more than a 50% chance) I get the following error
> during CLUSTER:
>
> db=# cluster public.enqueued_mail;
> ERROR:  duplicate key value violates unique constraint
> "pg_toast_119685646_index"
> DETAIL:  Key (chunk_id, chunk_seq)=(119685590, 0) already exists.
>
> The chunk_id is different each time.
>
> No uncommon data types exist in the table.
>
> I am currently working on creating a reproducible test case (it seems to
> require 2-3 open write transactions on the table).

I don't see how this can be done at all, given that CLUSTER grabs an
exclusive lock on the table in question.  A better example illustrating
what you're really doing would be useful.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: BUG #6393: cluster sometimes fails under heavy concurrent write load

From
Maxim Boguk
Date:
Hi.

The test case:

The first stage - preparation:

CREATE TABLE test (id serial PRIMARY KEY, flag integer, value text);
CLUSTER test_pkey ON test;
INSERT INTO test
  SELECT nextval('test_id_seq') AS id,
         0 AS flag,
         (SELECT ARRAY(SELECT random() FROM generate_series(1,1000)))::text AS value
  FROM generate_series(1,1000) AS gs(id);
CLUSTER test;
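
For the test to exercise TOAST at all, the generated value strings must be
large enough (roughly 2 kB or more) to be pushed out of line. A quick sanity
check, not part of the original recipe (pg_toast_NNNNN below is a placeholder
for whatever name the first query returns):

-- Find the TOAST table backing "test":
SELECT reltoastrelid::regclass AS toast_table
FROM pg_class WHERE relname = 'test' AND relkind = 'r';
-- Substitute the name returned above; a nonzero count confirms that the
-- "value" column really is stored out of line as TOAST chunks:
SELECT count(DISTINCT chunk_id) AS toasted_values FROM pg_toast.pg_toast_NNNNN;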

The second stage - test:

Console one:
postgres@db10:~$ while true; do psql -d test_db -c "begin; update test set
flag=(random()*1000)::integer where id IN (select (random()*1000)::integer
from generate_series(1,30)); select pg_sleep(1); commit;"; done
COMMIT
COMMIT
COMMIT
COMMIT
COMMIT
COMMIT
COMMIT
COMMIT
COMMIT
COMMIT

Console two:
postgres@db10:~$ while true; do psql -d test_db -c "cluster test;"; done
CLUSTER
ERROR:  duplicate key value violates unique constraint
"pg_toast_119812558_index"
DETAIL:  Key (chunk_id, chunk_seq)=(119811577, 0) already exists.
ERROR:  duplicate key value violates unique constraint
"pg_toast_119812564_index"
DETAIL:  Key (chunk_id, chunk_seq)=(119811601, 0) already exists.
ERROR:  duplicate key value violates unique constraint
"pg_toast_119812570_index"
DETAIL:  Key (chunk_id, chunk_seq)=(119811494, 0) already exists.
ERROR:  duplicate key value violates unique constraint
"pg_toast_119812576_index"
DETAIL:  Key (chunk_id, chunk_seq)=(119811552, 0) already exists.
ERROR:  duplicate key value violates unique constraint
"pg_toast_119812582_index"
DETAIL:  Key (chunk_id, chunk_seq)=(119811525, 0) already exists.

PS: No such effect happens on 9.0.4.
I think it may be related to the TOAST-related changes in 9.0.6.

On Thu, Jan 12, 2012 at 2:24 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

> I don't see how this can be done at all, given that CLUSTER grabs an
> exclusive lock on the table in question.  A better example illustrating
> what you're really doing would be useful.



--
Maxim Boguk
Senior PostgreSQL DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
"If they can send one man to the moon... why can't they send them all?"

МойКруг: http://mboguk.moikrug.ru/
"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."

Re: BUG #6393: cluster sometimes fails under heavy concurrent write load

From
Tom Lane
Date:
Maxim Boguk <maxim.boguk@gmail.com> writes:
> PS: No such effect happens on 9.0.4.
> I think it may be related to the TOAST-related changes in 9.0.6.

Yes, this is a consequence of commit
44b6d53b467bfe848c34c7a8a174779bb2f43c39, which tried to preserve toast
OIDs when doing a table rewrite.  The problem is that CLUSTER needs to
copy both the old and new versions of an updated row, since the old one
is only "recently dead", and both of those versions are pointing at the
same TOAST item because the update didn't affect the toasted column.
So when we save those versions into the new table separately, we try to
write the same TOAST item twice.
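
(To make the failing sequence concrete, here is a minimal three-session
sketch against the test table from earlier in this thread; it is logically
equivalent to the looping test above, though untested in this exact form:)

-- Session 1: hold a snapshot open, without touching "test", so that
-- updated-away row versions stay "recently dead" instead of fully dead:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT txid_current();

-- Session 2: an UPDATE that leaves the toasted column alone; the new row
-- version reuses the old version's TOAST pointer (the same chunk_id):
UPDATE test SET flag = flag + 1 WHERE id = 1;

-- Session 3: CLUSTER must copy both versions of row 1, and with OID
-- preservation both copies carry the same chunk_id, colliding in the new
-- toast table's unique index on (chunk_id, chunk_seq):
CLUSTER test;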

We could fix this by having toast_save_datum, in the place where it
decides to adopt an old toast OID for the value (line 1345 in HEAD),
check to see if that OID already exists in the new table.  If so, it
could either just assume that the value matches, or we could add enough
code to verify that there's a match.  The latter seems like overkill,
and yet I'm not 100% comfortable with just assuming a collision is OK.
Comments?
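
(In SQL terms, the proposed check amounts to roughly the probe below against
the new heap's toast relation; this is only a conceptual sketch, since the
real test would be C code inside toast_save_datum, and pg_toast_NNNNN is a
placeholder:)

-- Before reusing the old value's OID, see whether a chunk with that
-- chunk_id has already been written to the new toast table, i.e. whether
-- an earlier version of the same row has already been copied:
SELECT EXISTS (SELECT 1 FROM pg_toast.pg_toast_NNNNN
               WHERE chunk_id = 119685590) AS value_already_saved;

If it already exists, the save is skipped and a pointer to the existing
item returned instead.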

            regards, tom lane

Re: BUG #6393: cluster sometimes fails under heavy concurrent write load

From
Tom Lane
Date:
I wrote:
> Yes, this is a consequence of commit
> 44b6d53b467bfe848c34c7a8a174779bb2f43c39, which tried to preserve toast
> OIDs when doing a table rewrite.  The problem is that CLUSTER needs to
> copy both the old and new versions of an updated row, since the old one
> is only "recently dead", and both of those versions are pointing at the
> same TOAST item because the update didn't affect the toasted column.
> So when we save those versions into the new table separately, we try to
> write the same TOAST item twice.

BTW, it strikes me that the preceding implementation had its own bug in
this area: in the same situation of having to copy multiple versions of
a row, it would create independent copies of the referenced TOAST value.
Not only is this space-wasteful, but the copies attached to the
already-dead row versions would never be reclaimed, at least not short
of another CLUSTER, because they would be inserted into the toast table
as live rows rather than deleted ones.

So preserving toast item OIDs and relying on them for duplicate
detection actually will improve matters in a way unrelated to the
original purpose of commit 44b6d53b467bfe848c34c7a8a174779bb2f43c39.
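
(A rough way to observe that duplication after a CLUSTER under the old
coding, with pg_toast_NNNNN again a placeholder: the extra copies attached
to dead row versions make the TOAST value count exceed what the live heap
rows account for:)

SELECT count(DISTINCT chunk_id) AS toast_values,
       (SELECT count(*) FROM test) AS live_heap_rows
FROM pg_toast.pg_toast_NNNNN;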

            regards, tom lane