Re: BUG #5412: test case produced, possible race condition. - Mailing list pgsql-bugs
From | Rusty Conover |
---|---|
Subject | Re: BUG #5412: test case produced, possible race condition. |
Date | |
Msg-id | 99EABEAB-D3C1-4EF7-A958-639317F8778C@infogears.com |
In response to | Re: BUG #5412: Crash in production SIGSEGV, equalTupleDescs (tupdesc1=0x7f7f7f7f, tupdesc2=0x966508c4) at tupdesc.c (Rusty Conover <rconover@infogears.com>) |
Responses | Re: BUG #5412: test case produced, possible race condition. |
List | pgsql-bugs |
Hi Heikki and everybody else,

It seems like this is a race condition caused by the system catalog cache not being locked properly. I've included a Perl script below that causes the crash on my box consistently.

The script forks two different types of processes:

#1 - begin transaction, create a few temp tables and analyze them in a transaction, commit (running in database foobar_1)

#2 - begin transaction, truncate table, insert records into table from select in a transaction, commit (running in database foobar_2)

I set up the process to have 10 instances of task #1 and 1 instance of task #2.

Running this script crashes postgres within seconds on my box.

If you change the parameters to, say, <6 instances of task #1, no crash happens, but if you have >7 the crash does happen.

The box that I'm running the script on has 8 cores, so CPU contention and some improper locking might cause some of the problem.

The specs of the box are:

Fedora release 10 (Cambridge)
Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
glibc-2.9-3.i686
Linux 2.6.27.30-170.2.82.fc10.i686.PAE #1 SMP Mon Aug 17 08:24:23 EDT 2009 i686 i686 i386 GNU/Linux
PostgreSQL: 8.4.3

I tried to reproduce it on one of my 16-core x64 boxes and the same crash doesn't occur. I also tried on a dual-core box and couldn't get a crash, but I haven't exhaustively tested different numbers of task #1 instances on those machines.

If the script doesn't cause a crash for you, please try changing the variable $total_job_1_children to a number greater than the number of CPU cores of the machine that you're running it on.
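As a rough sketch of that last suggestion, assuming a Linux box where /proc/cpuinfo is readable, something like the following could pick the value instead of hard-coding it. The helper names and the default margin of 3 (which happens to give 11 on an 8-core box) are made up for illustration and are not part of the test script itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count "processor : N" entries in /proc/cpuinfo-style text.
# Note: this counts logical processors, which may exceed physical cores.
sub count_processors_in_cpuinfo {
    my ($text) = @_;
    my $count = () = $text =~ /^processor\s*:/mg;
    return $count;
}

# Read /proc/cpuinfo (Linux only); fall back to 1 if it cannot be read.
sub detect_cpu_cores {
    my $path = '/proc/cpuinfo';
    open(my $fh, '<', $path) or return 1;
    local $/;                        # slurp the whole file at once
    my $text = <$fh>;
    close($fh);
    my $cores = count_processors_in_cpuinfo(defined $text ? $text : '');
    return $cores > 0 ? $cores : 1;
}

# Hypothetical rule of thumb: run a few more job1 children than there are
# cores so the children actually contend for CPU. The margin is an assumption.
sub suggested_job_1_children {
    my ($cores, $margin) = @_;
    $margin = 3 unless defined $margin;
    return $cores + $margin;
}

my $cores = detect_cpu_cores();
my $total_job_1_children = suggested_job_1_children($cores);
print "Detected $cores core(s); using $total_job_1_children job1 children\n";
```

The resulting number would simply replace the hard-coded assignment to $total_job_1_children; whether a margin of 3 is enough to trigger the crash on a given machine is untested.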
Any help would be appreciated and if I can be of further assistance please let me know,

Rusty
--
Rusty Conover
rconover@infogears.com
InfoGears Inc / GearBuyer.com / FootwearBuyer.com
http://www.infogears.com
http://www.gearbuyer.com
http://www.footwearbuyer.com

#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use POSIX ":sys_wait_h";

# Number of children for job1, create temp tables and analyze them
# The number of jobs here matters: (on my 8 core box you need to have some contention to get a failure)
# >11=fail
# 10=fail
# 9=fail
# 8=fail
# 7=fail
# <6 works,
my $total_job_1_children = 11;

# Number of children for job 2 run a truncate and insert query loop.
# we only need one of these jobs to be running really, because the truncate locks.
my $total_job_2_children = 1;

# Just need two databases on your machine, foobar_1 and foobar_2 are the defaults.
my $database_1_dsn = ['dbi:Pg:dbname=foobar_1', 'postgres'];
my $database_2_dsn = ['dbi:Pg:dbname=foobar_2', 'postgres'];

# Do some setup transactions.
if(1) {
  my $dbh = DBI->connect(@$database_2_dsn);
  $dbh->do("drop table foo_dest");
  $dbh->do("drop table foobar_source");
  $dbh->begin_work();
  eval {
    $dbh->do("create table foobar_source (id integer, name text, size integer)") || die("Failed to create foobar_source: " . $dbh->errstr());
    for(my $k = 0; $k < 3500; $k++) {
      $dbh->do("insert into foobar_source (id, name, size) values (?, 'test me', ?)", undef, $k, int(rand(400000))) || die("Failed to insert into foobar_source: " . $dbh->errstr());
    }
    $dbh->do("analyze foobar_source");
    $dbh->do("create table foo_dest (id integer, name text, size integer)");
  };
  if($@) {
    print "Error doing init of tables: " . $@ . "\n";
    $dbh->rollback();
    $dbh->disconnect();
    exit(0);
  }
  $dbh->commit();
  $dbh->disconnect();
}

my @child_pids;

for(my $i = 0; $i < $total_job_1_children; $i++) {
  print "Forking\n";
  my $pid = fork();
  if($pid == 0) {
    run_child('job1');
    exit(0);
  } else {
    push @child_pids, $pid;
  }
}

for(my $i = 0; $i < $total_job_2_children; $i++) {
  print "Forking\n";
  my $pid = fork();
  if($pid == 0) {
    run_child('job2');
    exit(0);
  } else {
    push @child_pids, $pid;
  }
}

foreach my $pid (@child_pids) {
  print "Waiting for $pid\n";
  waitpid($pid, 0);
  print "Got it\n";
}
exit(0);

sub run_child {
  my $job_type = shift;
  my $dsn;
  if($job_type eq 'job1') {
    $dsn = $database_1_dsn;
  } else {
    $dsn = $database_2_dsn;
  }

  my $dbh = DBI->connect(@$dsn);
  defined($dbh) || die("Failed to get connection to database");

  for(my $i = 0; $i < 400; $i++) {
    $dbh->begin_work();
    eval {
      if($job_type eq 'job1') {
        $dbh->{Warn} = 0;
        $dbh->do("create temp table c_products (id INTEGER NOT NULL, product_name_stemmed text, average_price numeric(12,2), cset_bitmap bit(437), gender text) WITHOUT OIDS ON COMMIT DROP");
        $dbh->do("create temp table c_products_oids (c_products_id INTEGER NOT NULL, oid INTEGER NOT NULL UNIQUE, price numeric(12,2) not null, product_name_stemmed text not null) WITHOUT OIDS ON COMMIT DROP");
        $dbh->{Warn} = 1;

        $dbh->do("analyze c_products");
        $dbh->do("analyze c_products_oids");
      } else {
        $dbh->do("truncate table foo_dest");
        $dbh->do("insert into foo_dest (id, name, size) select id, name, size from foobar_source");
      }
    };
    if($@) {
      print "Got error in job $job_type: $@\n";
      $dbh->rollback();
      $dbh->disconnect();
      exit(0);
    }
    $dbh->commit();
  }
  $dbh->disconnect();
  print "Child finished\n";
  return;
}