Wrong rows number expected - Mailing list pgsql-performance

From Mendola Gaetano
Subject Wrong rows number expected
Date
Msg-id 012401c3530f$096eb150$10d4a8c0@mm.eutelsat.org
Whole thread Raw
List pgsql-performance
Hi all,
last week Josh Berkus point my attenction
( see post Wrong plan or what )
to the
fact that in this select:

select *
from user_logs ul,
     user_data ud,
     class_default cd
where
     ul.id_user = ud.id_user and
     ud.id_class = cd.id_class and
     cd.id_provider = 39;


The planner say:
QUERY PLAN
 Hash Join  (cost=265.64..32000.76 rows=40612 width=263) (actual
time=11074.21..11134.28 rows=10 loops=1)
   Hash Cond: ("outer".id_user = "inner".id_user)
   ->  Seq Scan on user_logs ul  (cost=0.00..24932.65 rows=1258965 width=48)
(actual time=0.02..8530.21 rows=1258966 loops=1)
   ->  Hash  (cost=264.81..264.81 rows=331 width=215) (actual
time=30.22..30.22 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..264.81 rows=331 width=215) (actual
time=29.95..30.20 rows=6 loops=1)
               ->  Seq Scan on class_default cd  (cost=0.00..1.39 rows=1
width=55) (actual time=0.08..0.10 rows=1 loops=1)
                     Filter: (id_provider = 39)
               ->  Index Scan using idx_user_data_class on user_data ud
(cost=0.00..258.49 rows=395 width=160) (actual time=29.82..29.96 rows=6
loops=1)
                     Index Cond: (ud.id_class = "outer".id_class)
 Total runtime: 11135.65 msec
(10 rows)


and the quantity reported in:
Hash Join  (cost=265.64..32000.76 rows=40612 width=263) (actual
time=11074.21..11134.28 rows=10 loops=1)

is wrong about the rows returned,
I did what Josh Berkus suggeted me:

1) Make sure you've VACUUM ANALYZED
2) Adjust the following postgresql.conf statistics:
a) effective_cache_size: increase to 70% of available (not used by other
processes) RAM.
b) random_page_cost: decrease, maybe to 2.
c) default_statistics_target: try increasing to 100
(warning: this will significantly increase the time required to do ANALYZE)


I pushed also default_statistics_target to 1000 but the plan remain the same
with an execution
of 11 secs but If I do the followin 3 equivalent query I obatin the same
result in olny fews ms:


SELECT id_class from class_default where id_provider = 39;
 id_class
----------
       48
(1 row)

SELECT id_user from user_data where id_class in ( 48 );
 id_user
---------
   10943
   10942
   10934
   10927
   10910
   10909
(6 rows)


SELECT * from user_logs where id_user in (
 10943,   10942,   10934,   10927,   10910,  10909
);



May I do something else ?


Thank you in advance
Gaetano Mendola












pgsql-performance by date:

Previous
From: Franco Bruno Borghesi
Date:
Subject: Re: index questions
Next
From: Andrew McMillan
Date:
Subject: Re: Tuning PostgreSQL