horizontal partition - Mailing list pgsql-performance
From | Gaetano Mendola |
---|---|
Subject | horizontal partition |
Date | |
Msg-id | ctp64h$l4d$1@floppy.pyrenet.fr Whole thread Raw |
Responses |
Re: horizontal partition
|
List | pgsql-performance |
Hi all, I have a big table with ~ 10 Milion rows, and is a very pain administer it, so after years I convinced my self to partition it and replace the table usage ( only for reading ) with a view. Now my user_logs table is splitted in 4: user_logs user_logs_2002 user_logs_2003 user_logs_2004 and the view v_user_logs is builded on top of these tables: CREATE OR REPLACE VIEW v_user_logs AS SELECT * FROM user_logs UNION ALL SELECT * FROM user_logs_2002 UNION ALL SELECT * FROM user_logs_2003 UNION ALL SELECT * FROM user_logs_2004 ; the view is performing really well: empdb=# explain analyze select * from v_user_logs where id_user = sp_id_user('kalman'); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------- Subquery Scan v_user_logs (cost=0.00..895.45 rows=645 width=88) (actual time=17.039..2345.388 rows=175 loops=1) -> Append (cost=0.00..892.23 rows=645 width=67) (actual time=17.030..2344.195 rows=175 loops=1) -> Subquery Scan "*SELECT* 1" (cost=0.00..120.70 rows=60 width=67) (actual time=17.028..17.036 rows=1 loops=1) -> Index Scan using idx_user_user_logs on user_logs (cost=0.00..120.40 rows=60 width=67) (actual time=17.012..17.018rows=1 loops=1) Index Cond: (id_user = 4185) -> Subquery Scan "*SELECT* 2" (cost=0.00..475.44 rows=316 width=67) (actual time=49.406..1220.400 rows=79 loops=1) -> Index Scan using idx_user_user_logs_2004 on user_logs_2004 (cost=0.00..473.86 rows=316 width=67) (actualtime=49.388..1219.386 rows=79 loops=1) Index Cond: (id_user = 4185) -> Subquery Scan "*SELECT* 3" (cost=0.00..204.33 rows=188 width=67) (actual time=59.375..1068.806 rows=95 loops=1) -> Index Scan using idx_user_user_logs_2003 on user_logs_2003 (cost=0.00..203.39 rows=188 width=67) (actualtime=59.356..1067.934 rows=95 loops=1) Index Cond: (id_user = 4185) -> Subquery Scan "*SELECT* 4" (cost=0.00..91.75 rows=81 width=67) (actual time=37.623..37.623 rows=0 loops=1) -> Index Scan using idx_user_user_logs_2002 on user_logs_2002 (cost=0.00..91.35 rows=81 width=67) (actualtime=37.618..37.618 rows=0 loops=1) Index Cond: (id_user = 4185) Total runtime: 2345.917 ms (15 rows) the problem is now if this view is used in others views like this: CREATE OR REPLACE VIEW v_ua_user_login_logout_tmp AS SELECT u.login, ul.* FROM user_login u, v_user_logs ul WHERE u.id_user = ul.id_user ; empdb=# explain analyze select * from v_ua_user_login_logout_tmp where login = 'kalman'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=4.01..228669.81 rows=173 width=100) (actual time=1544.784..116490.363 rows=175 loops=1) Hash Cond: ("outer".id_user = "inner".id_user) -> Subquery Scan ul (cost=0.00..193326.71 rows=7067647 width=88) (actual time=5.677..108190.096 rows=7067831 loops=1) -> Append (cost=0.00..157988.47 rows=7067647 width=67) (actual time=5.669..77109.995 rows=7067831 loops=1) -> Subquery Scan "*SELECT* 1" (cost=0.00..8158.48 rows=362548 width=67) (actual time=5.666..3379.178 rows=362862loops=1) -> Seq Scan on user_logs (cost=0.00..6345.74 rows=362548 width=67) (actual time=5.645..1395.673 rows=362862loops=1) -> Subquery Scan "*SELECT* 2" (cost=0.00..93663.88 rows=4191588 width=67) (actual time=9.149..35094.798rows=4191580 loops=1) -> Seq Scan on user_logs_2004 (cost=0.00..72705.94 rows=4191588 width=67) (actual time=9.117..16531.486rows=4191580 loops=1) -> Subquery Scan "*SELECT* 3" (cost=0.00..44875.33 rows=2008233 width=67) (actual time=0.562..24017.680rows=2008190 loops=1) -> Seq Scan on user_logs_2003 (cost=0.00..34834.17 rows=2008233 width=67) (actual time=0.542..13224.265rows=2008190 loops=1) -> Subquery Scan "*SELECT* 4" (cost=0.00..11290.78 rows=505278 width=67) (actual time=7.100..3636.163 rows=505199loops=1) -> Seq Scan on user_logs_2002 (cost=0.00..8764.39 rows=505278 width=67) (actual time=6.446..1474.709rows=505199 loops=1) -> Hash (cost=4.00..4.00 rows=1 width=16) (actual time=0.083..0.083 rows=0 loops=1) -> Index Scan using user_login_login_key on user_login u (cost=0.00..4.00 rows=1 width=16) (actual time=0.064..0.066rows=1 loops=1) Index Cond: ((login)::text = 'kalman'::text) Total runtime: 116491.056 ms (16 rows) as you can see the index scan is not used anymore. Do you see any problem on this approach ? Regards Gaetano Mendola
pgsql-performance by date: