BUG #3033: performance issue with self join - Mailing list pgsql-bugs

From Luigi Tarenga
Subject BUG #3033: performance issue with self join
Date
Msg-id 200702192027.l1JKRWWk029032@wwwmaster.postgresql.org
Whole thread Raw
Responses Re: BUG #3033: performance issue with self join  (tomas@tuxteam.de)
List pgsql-bugs
The following bug has been logged online:

Bug reference:      3033
Logged by:          Luigi Tarenga
Email address:      luigi.tarenga@gmail.com
PostgreSQL version: 8.1.4
Operating system:   OpenBSD 4.0 amd64
Description:        performance issue with self join
Details:

hi,
my full version is PostgreSQL 8.1.4 on x86_64-unknown-openbsd4.0, compiled
by GCC cc (GCC) 3.3.5 (propolice)
installed by the OpenBSD port.
i'm not sure this is a bug but i found a strange
behavior during some custom benchmark.
i'm used to run a query like this one on a 14 row table:
select count(*)
from emp7 emp1,
     emp7 emp2,
     emp7 emp3,
     emp7 emp4,
     emp7 emp5,
     emp7 emp6,
     emp7 emp7_7
where emp7_7.deptno=10;

it should return:
   count
-----------
 105413504
(1 row)

and the first time it takes about 30 seconds on my pc.
the problem raise when i make lots of insert in the
same table and then delete new rows until i left the
original 14. if i rerun the select it seems to block
forever (i waited more then 10 minutes).

if i drop the table, recreate it with 14 rows and
rerun the select i can get the result in 30 seconds again.

i have a script to automatize all this procedure.
i use it this way:
./emp7.sh create
./emp7.sh run     #run the select and exit in 30 sec.
./emp7.sh ins20k  #run a 10k insert test
./emp7.sh run     #now it blocks forever (almost)!
./emp7.sh drop    #drop the table, you can restart the procedure from the
"create"
the script code:

#!/bin/sh

if [ "$1" = "create" ]; then
psql postgres <<EOF
create table emp7 (
  empno        int        not null,
  ename        char(10),
  job          char(9),
  mgr          int,
  hiredate     date,
  sal          float,
  comm         float,
  deptno       int
 );
create unique index pkemp7 on emp7(empno);
insert into emp7 (empno, ename, deptno) values(1,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(2,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(3,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(4,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(5,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(6,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(7,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(8,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(9,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(10,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(11,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(12,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(13,'SMITH',10);
insert into emp7 (empno, ename, deptno) values(14,'SMITH',10);
EOF
fi

if [ "$1" = "drop" ]; then
psql postgres <<EOF
DROP TABLE emp7 ;
EOF
fi

if [ "$1" = "run" ]; then
echo
echo ... benchmark query in corso ...
time psql postgres <<EOF
select count(*)
from emp7 emp1,
     emp7 emp2,
     emp7 emp3,
     emp7 emp4,
     emp7 emp5,
     emp7 emp6,
     emp7 emp7_7
where emp7_7.deptno=10;
EOF
fi

if [ "$1" = "ins10k" ]; then
cont=10000
{
  while [ cont -lt 20000 ] ; do
    echo "insert into emp7 (empno, ename, deptno)
values($cont,'SMITH2',10);"
    cont=$(($cont+1))
  done
}| time psql postgres > /dev/null

psql postgres <<EOF
delete from emp7 where ename = 'SMITH2';
EOF
fi

pgsql-bugs by date:

Previous
From: "Dr. Axel Reimann"
Date:
Subject: BUG #3030: Unable to query SQL_WCHAR column via ODBC
Next
From: "Craig White"
Date:
Subject: BUG #3032: Commit hung for days