Re: index problem - Mailing list pgsql-sql

From CoL
Subject Re: index problem
Date
Msg-id 3BCCA1B7.9020804@mportal.hu
Whole thread Raw
In response to Re: index problem  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
List pgsql-sql
Hi,

I did not make vacuum analyze ;), the vacuum  once now:

vacuumdb -Uuser -ddb -v -tprog_dgy_xy

NOTICE:  --Relation prog_dgy_xy--
NOTICE:  Pages 20935: Changed 0, reaped 0, Empty 0, New 0; Tup 921013: 
Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 178, MaxLen 184; 
Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 
2.71s/0.32u sec.
NOTICE:  Index progdgyxy_idx1: Pages 6679; Tuples 921013. CPU 
1.41s/1.40u sec.
NOTICE:  Index progdgyxy_idx2: Pages 2019; Tuples 921013. CPU 
0.28s/1.28u sec.

I make it with -z too.
So this table has more 921013 rows.
The query show the same as bellow. The version is 7.1.3.

-------------------------
One more interesting: the insering of these rows.
Postgres:
bash-2.04$ time  psql -q -Uuser -f prog_dgy_xy.dump  db
real    131m50.006s
user    3m21.838s
sys     1m20.963s

Mysql:
bash-2.04$ time cat prog_dgy_xy.dump | mysql -uuser -ppass db
real    24m50.137s
user    2m6.629s
sys     1m37.757s

the dump file was: insert into table (...) values (...);

I tried with copy, and to add begin; inserts; commit; , but the result 
with same time :(
[For Oracle 8.1.6 sqlloader it takes 450 sec ;) ]

---------------------------
The 2 table query, where prog_data has ~8800 rowsn and index on prog_id:
bash-2.04$ time echo "explain select distinct 
prog_id,prog_ftype,prog_fcasthour,prog_date from prog_dgy_xy,prog_data 
where pxygy_pid=prog_id " | psql -Uuser db
NOTICE:  QUERY PLAN:

Unique  (cost=7432549.69..7680455.07 rows=2479054 width=32)  ->  Sort  (cost=7432549.69..7432549.69 rows=24790538
width=32)       ->  Merge Join  (cost=148864.65..161189.33 rows=24790538 width=32)              ->  Index Scan using
prog_data_pkeyon prog_data 
 
(cost=0.00..701.12 rows=8872 width=28)              ->  Sort  (cost=148864.65..148864.65 rows=921013 width=4)
        ->  Seq Scan on prog_dgy_xy  (cost=0.00..30145.13 
 
rows=921013 width=4)

Time: !!!
real    2m3.620s

the same query with mysql (i did explain in mysql, and says it use the 
indexes):
real    0m1.998s !!!

I just askin why? and why just using the index on releation "=".
(same table, same index, vacuumed) (made the test more than twice)
It seams to be a 7.1.3 bug? i do not test yet with 7.1.2 but tomorrow i 
will.

CoL

Stephan Szabo wrote:

> On Mon, 15 Oct 2001, Szabo Zoltan wrote:
> 
> 
>>Hi,
>>
>>I have that:
>>
>>1)
>>db=> explain select pxygy_pid from prog_dgy_xy where pxygy_pid=12121;
>>NOTICE:  QUERY PLAN:
>>
>>Group  (cost=0.00..29970.34 rows=921 width=4)
>>   ->  Index Scan using progdgyxy_idx2 on prog_dgy_xy 
>>(cost=0.00..29947.32 rows=9210 width=4)
>>
>>than:
>>2)
>>db=> explain select pxygy_pid from prog_dgy_xy where pxygy_pid>12121;
>>NOTICE:  QUERY PLAN:
>>
>>Group  (cost=66927.88..67695.39 rows=30700 width=4)
>>   ->  Sort  (cost=66927.88..66927.88 rows=307004 width=4)
>>         ->  Seq Scan on prog_dgy_xy  (cost=0.00..32447.66 rows=307004 
>>width=4)
>>
>>I making some banchmarks on: oracle vs postgres vs mysql. And this is 
>>breaking me now;) Mysql and oracle width same table and index use that 
>>index on pxygy_pid;
>>I had vacuum before.
>>
> 
> I assume you mean you did a vacuum analyze (a plain vacuum isn't
> sufficient).  If you did just do a regular vacuum, do a vacuum analyze
> to get the updated statistics.
> 
> How many rows actually match pxygy_pid>12121?  Is 307000 rows a reasonable
> estimate?  How many rows are in the table?
> 
> 
> 





pgsql-sql by date:

Previous
From: Szabo Zoltan
Date:
Subject: Re: index problem
Next
From: "Anthony Bouvier"
Date:
Subject: Multiple Parameters to an Aggregate Function