Re: horizontal partition - Mailing list pgsql-performance

From Gaetano Mendola
Subject Re: horizontal partition
Date
Msg-id cu5ehf$q6k$1@floppy.pyrenet.fr
Whole thread Raw
In response to Re: horizontal partition  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>
>>The issue here is that the planner is capable of "pushing down" the WHERE
>>criteria into the first view, but not into the second, "nested" view, and so
>>postgres materializes the UNIONed data set before perfoming the join.
>
>
>>Thing is, I seem to recall that this particular issue was something Tom fixed
>>a while ago.  Which is why I wanted to know what version Gaetano is using.
>
>
> It's still true that we can't generate a nestloop-with-inner-indexscan
> join plan if the inner side is anything more complex than a single table
> scan.  Since that's the only plan that gives you any chance of not
> scanning the whole partitioned table, it's rather a hindrance :-(
>
> It might be possible to fix this by treating the nestloop's join
> conditions as "push down-able" criteria, instead of the present rather
> ad hoc method for generating nestloop/indexscan plans.  It'd be quite
> a deal of work though, and I'm concerned about how slow the planner
> might run if we did do it like that.
>

I don't know if this will help my attempt to perform an horizontal
partition, if it do I think that it can solve lot of problems out there,
I tried the inheritance technique too:

The table user_logs is the original one, I created two tables extending this one:

CREATE TABLE user_logs_2003_h () inherits (user_logs);
CREATE TABLE user_logs_2002_h () inherits (user_logs);

I defined on this table the index already defined on user_logs.


And this is the result:

empdb=# explain analyze select * from user_logs where id_user = sp_id_user('kalman');
                                                                                QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Result  (cost=0.00..426.33 rows=335 width=67) (actual time=20.891..129.218 rows=98 loops=1)
    ->  Append  (cost=0.00..426.33 rows=335 width=67) (actual time=20.871..128.643 rows=98 loops=1)
          ->  Index Scan using idx_user_user_logs on user_logs  (cost=0.00..133.11 rows=66 width=67) (actual
time=20.864..44.594rows=3 loops=1) 
                Index Cond: (id_user = 4185)
          ->  Index Scan using idx_user_user_logs_2003_h on user_logs_2003_h user_logs  (cost=0.00..204.39 rows=189
width=67)(actual time=1.507..83.662 rows=95 loops=1) 
                Index Cond: (id_user = 4185)
          ->  Index Scan using idx_user_user_logs_2002_h on user_logs_2002_h user_logs  (cost=0.00..88.83 rows=80
width=67)(actual time=0.206..0.206 rows=0 loops=1) 
                Index Cond: (id_user = 4185)
  Total runtime: 129.500 ms
(9 rows)

that is good, but now look what happen in a view like this one:


create view to_delete AS
SELECT v.login,
        u.*
from  user_login v,
       user_logs u
where v.id_user = u.id_user;



empdb=# explain analyze select * from to_delete where login = 'kalman';
                                                                    QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
  Hash Join  (cost=4.01..65421.05 rows=143 width=79) (actual time=1479.738..37121.511 rows=98 loops=1)
    Hash Cond: ("outer".id_user = "inner".id_user)
    ->  Append  (cost=0.00..50793.17 rows=2924633 width=67) (actual time=21.391..33987.363 rows=2927428 loops=1)
          ->  Seq Scan on user_logs u  (cost=0.00..7195.22 rows=411244 width=67) (actual time=21.385..5641.307
rows=414039loops=1) 
          ->  Seq Scan on user_logs_2003_h u  (cost=0.00..34833.95 rows=2008190 width=67) (actual time=0.024..18031.218
rows=2008190loops=1) 
          ->  Seq Scan on user_logs_2002_h u  (cost=0.00..8764.00 rows=505199 width=67) (actual time=0.005..5733.554
rows=505199loops=1) 
    ->  Hash  (cost=4.00..4.00 rows=2 width=16) (actual time=0.195..0.195 rows=0 loops=1)
          ->  Index Scan using user_login_login_key on user_login v  (cost=0.00..4.00 rows=2 width=16) (actual
time=0.155..0.161rows=1 loops=1) 
                Index Cond: ((login)::text = 'kalman'::text)
  Total runtime: 37122.069 ms
(10 rows)



and how you can see this path is not applicable too :-(

Any other suggestion ?



Regards
Gaetano Mendola








pgsql-performance by date:

Previous
From: John A Meinel
Date:
Subject: Re: query produces 1 GB temp file
Next
From: Greg Stark
Date:
Subject: Re: query produces 1 GB temp file