Re: Implementing Incremental View Maintenance - Mailing list pgsql-hackers
| From | Konstantin Knizhnik | 
|---|---|
| Subject | Re: Implementing Incremental View Maintenance | 
| Date | |
| Msg-id | 024ba056-6b4f-b26c-af17-297c1bbca65a@postgrespro.ru Whole thread Raw | 
| In response to | Re: Implementing Incremental View Maintenance (Yugo NAGATA <nagata@sraoss.co.jp>) | 
| Responses | Re: Implementing Incremental View Maintenance Re: Implementing Incremental View Maintenance Re: Implementing Incremental View Maintenance | 
| List | pgsql-hackers | 
On 05.10.2020 12:16, Yugo NAGATA wrote: > Hi, > > Attached is the rebased patch (v18) to add support for Incremental > Materialized View Maintenance (IVM). It is able to be applied to > current latest master branch. > Thank you very much for this work. I consider incremental materialized views as "reincarnation" of OLAP hypercubes. There are two approaches of making OLAP queries faster: 1. speed up query execution (using JIT, columnar store, vector operations and parallel execution) 2. precalculate requested data Incremental materialize views make it possible to implement second approach. But how competitive it is? I do not know current limitations of incremental materialized views, but I checked that basic OLAP functionality: JOIN+GROUP_BY+AGGREGATION is supported. The patch is not applied to the current master because makeFuncCall prototype is changed, I fixed it by adding COAERCE_CALL_EXPLICIT. Then I did the following simple test: 1. Create pgbench database with scale 100. pgbench speed at my desktop is about 10k TPS: pgbench -M prepared -N -c 10 -j 4 -T 30 -P 1 postgres tps = 10194.951827 (including connections establishing) 2. Then I created incremental materialized view: create incremental materialized view teller_sums as select t.tid,sum(abalance) from pgbench_accounts a join pgbench_tellers t on a.bid=t.bid group by t.tid; SELECT 1000 Time: 20805.230 ms (00:20.805) 20 second is reasonable time, comparable with time of database initialization. Then obviously we see advantages of precalculated aggregates: postgres=# select * from teller_sums where tid=1; tid | sum -----+-------- 1 | -96427 (1 row) Time: 0.871 ms postgres=# select t.tid,sum(abalance) from pgbench_accounts a join pgbench_tellers t on a.bid=t.bid group by t.tid having t.tid=1 ; tid | sum -----+-------- 1 | -96427 (1 row) Time: 915.508 ms Amazing. Almost 1000 times difference! 3. Run pgbench once again: Ooops! Now TPS are much lower: tps = 141.767347 (including connections establishing) Speed of updates is reduced more than 70 times! Looks like we loose parallelism because almost the same result I get with just one connection. 4. Finally let's create one more view (it is reasonable to expect that analytics will run many different queries and so need multiple views). create incremental materialized view teller_avgs as select t.tid,avg(abalance) from pgbench_accounts a join pgbench_tellers t on a.bid=t.bid group by t.tid; It is great that not only simple aggregates like SUM are supported, but also AVG. But insertion speed now is reduced twice - 72TPS. I tried to make some profiling but didn't see something unusual: 16.41% postgres postgres [.] ExecInterpExpr 8.78% postgres postgres [.] slot_deform_heap_tuple 3.23% postgres postgres [.] ExecMaterial 2.71% postgres postgres [.] AllocSetCheck 2.33% postgres postgres [.] AllocSetAlloc 2.29% postgres postgres [.] slot_getsomeattrs_int 2.26% postgres postgres [.] ExecNestLoop 2.11% postgres postgres [.] MemoryContextReset 1.98% postgres postgres [.] tts_minimal_store_tuple 1.87% postgres postgres [.] heap_compute_data_size 1.78% postgres postgres [.] fill_val 1.56% postgres postgres [.] tuplestore_gettuple 1.44% postgres postgres [.] sentinel_ok 1.35% postgres postgres [.] heap_fill_tuple 1.27% postgres postgres [.] tuplestore_gettupleslot 1.17% postgres postgres [.] ExecQual 1.14% postgres postgres [.] tts_minimal_clear 1.13% postgres postgres [.] CheckOpSlotCompatibility 1.10% postgres postgres [.] base_yyparse 1.10% postgres postgres [.] heapgetpage 1.04% postgres postgres [.] heap_form_minimal_tuple 1.00% postgres postgres [.] slot_getsomeattrs So good news is that incremental materialized views really work. And bad news is that maintenance overhead is too large which significantly restrict applicability of this approach. Certainly in case of dominated read-only workload such materialized views can significantly improve performance. But unfortunately my dream that them allow to combine OLAP+OLPT is not currently realized. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pgsql-hackers by date: