Per-table random_page_cost for tables that we know are always cached - Mailing list pgsql-hackers
| From | PFC |
|---|---|
| Subject | Per-table random_page_cost for tables that we know are always cached |
| Date | |
| Msg-id | op.t90b1fmucigqcu@apollo13.peufeu.com |
| Responses | Re: Per-table random_page_cost for tables that we know are always cached |
| | Re: Per-table random_page_cost for tables that we know are always cached |
| List | pgsql-hackers |
It started with this query:

```sql
EXPLAIN ANALYZE SELECT * FROM relations r JOIN nodes n ON (n.id=r.child_id) WHERE r.parent_id=16330;
```

```
 Hash Join  (cost=370.96..496.29 rows=543 width=273) (actual time=18.887..21.164 rows=543 loops=1)
   Hash Cond: (r.child_id = n.id)
   ->  Index Scan using relations_unique on relations r  (cost=0.00..111.75 rows=543 width=58) (actual time=0.022..0.519 rows=543 loops=1)
         Index Cond: (parent_id = 16330)
   ->  Hash  (cost=243.76..243.76 rows=10176 width=215) (actual time=18.830..18.830 rows=10176 loops=1)
         ->  Seq Scan on nodes n  (cost=0.00..243.76 rows=10176 width=215) (actual time=0.006..5.135 rows=10176 loops=1)
 Total runtime: 21.453 ms
```

```sql
SET enable_hashjoin TO 0;
EXPLAIN ANALYZE SELECT * FROM relations r JOIN nodes n ON (n.id=r.child_id) WHERE r.parent_id=16330;
```

```
 Nested Loop  (cost=0.00..514.50 rows=543 width=273) (actual time=0.037..4.412 rows=543 loops=1)
   ->  Index Scan using relations_unique on relations r  (cost=0.00..111.75 rows=543 width=58) (actual time=0.023..0.476 rows=543 loops=1)
         Index Cond: (parent_id = 16330)
   ->  Index Scan using nodes_pkey on nodes n  (cost=0.00..0.73 rows=1 width=215) (actual time=0.004..0.005 rows=1 loops=543)
         Index Cond: (n.id = r.child_id)
 Total runtime: 4.638 ms
```

In order to have it use the fast plan I must set random_page_cost to 1, which I absolutely don't want to do. Setting effective_cache_size to huge values has no effect. If I select a value of parent_id that has far fewer children, the index will be used, but in this case I think the threshold is misplaced; it should be slightly higher. Here about 5% of the rows are selected; the hash join only becomes better at about 15% because the table is cached. This is 8.3.
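A toy cost model (not the real planner, all constants are illustrative assumptions) shows why the crossover point between the two plans moves when random_page_cost changes. The seq-scan side costs roughly `pages * seq_page_cost + rows * cpu_tuple_cost`, which for the plan above gives 142 pages for nodes; the nested loop pays one index probe per outer row, and the probe cost scales with random_page_cost:

```python
# Toy model of the planner's choice, assuming each inner probe fetches one
# random heap page.  cpu_per_probe and cpu_per_row are made-up constants.

def nestloop_cost(rows, random_page_cost, cpu_per_probe=0.3):
    # one index probe per outer row, each touching ~1 random page
    return rows * (cpu_per_probe + random_page_cost)

def hashjoin_cost(table_pages, table_rows, seq_page_cost=1.0, cpu_per_row=0.01):
    # seq scan + hash build: roughly independent of how many rows match
    return table_pages * seq_page_cost + table_rows * cpu_per_row

pages, total_rows = 142, 10176   # nodes table, backed out of the seq scan estimate

crossovers = {}
for rpc in (4.0, 1.0):
    # smallest number of matching outer rows where the hash join wins
    crossovers[rpc] = next(r for r in range(1, total_rows)
                           if nestloop_cost(r, rpc) > hashjoin_cost(pages, total_rows))
print(crossovers)
```

Lowering random_page_cost pushes the crossover to a larger fraction of the table, which is exactly the effect wanted here, but only for this table; that is why a global random_page_cost = 1 is too blunt an instrument.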
Perhaps there is a need for a per-object setting (object = table, index, or partition) to alter the aggressiveness/laziness of page flushing and how long the pages for this object are kept in shared_buffers. This could then be used to modify random_page_cost on a per-table/index/partition basis.

Example: let's imagine a "cache priority" setting.

- "cache priority" set to the minimum means this table is mostly write-only.
- "cache priority" set to the default gives the current behaviour (which is correct in most cases).
- "cache priority" set to a high value tells Postgres: "I know this table/index/partition is small and often accessed rather randomly, so keep it in shared_buffers; purge it if you must, but otherwise keep it in memory and instead flush something else with a lower cache priority."

The optimizer could then use a much lower value of random_page_cost for objects whose "cache priority" is set highest, since it would know they stay cached.

An alternative would be for the background writer to keep some stats and do the work for us:

- begin bgwriter scan
- set up a hashtable of [relid => page count]
- for each page scanned, increment the "page count" for its relation (uses very little CPU)
- end bgwriter scan
- for each relation, compare the number of pages found in shared_buffers with the number of pages in the relation, and draw conclusions about how well cached the relation is
- update random_page_cost accordingly for that relation

This would not account for whatever is in the OS cache, though.
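The bgwriter-driven alternative above can be sketched as follows. This is a minimal model, not PostgreSQL code: it assumes shared_buffers is visible as a list of (relid, blocknum) buffer tags, and the linear interpolation between a "cold" and "fully cached" cost is just one possible policy:

```python
# Sketch of the proposed bgwriter bookkeeping: count how many pages of each
# relation sit in shared_buffers, then derive a per-relation random_page_cost.
from collections import Counter

def estimate_random_page_cost(buffer_tags, relation_sizes,
                              cold_cost=4.0, cached_cost=1.0):
    """buffer_tags: iterable of (relid, blocknum) for every buffered page.
    relation_sizes: {relid: total pages on disk}.
    Returns {relid: random_page_cost}, interpolated by cached fraction."""
    in_cache = Counter(relid for relid, _block in buffer_tags)
    costs = {}
    for relid, total_pages in relation_sizes.items():
        cached_fraction = min(1.0, in_cache[relid] / total_pages)
        # linear interpolation between the cold and fully-cached cost
        costs[relid] = cold_cost - (cold_cost - cached_cost) * cached_fraction
    return costs

# toy example: relation 1 is fully cached, relation 2 barely at all
tags = [(1, b) for b in range(100)] + [(2, b) for b in range(10)]
print(estimate_random_page_cost(tags, {1: 100, 2: 1000}))
```

As the post notes, any such estimate is blind to the OS page cache, so the derived cost would still be an upper bound on the true cost of a "random" read.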