Re: Parallel Seq Scan vs kernel read ahead - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Parallel Seq Scan vs kernel read ahead |
Date | |
Msg-id | CAA4eK1+Cu=2qHP0LWaB4jZB2tf7GJmruxapoPXv72UZmdVvQcw@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel Seq Scan vs kernel read ahead (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
RE: Parallel Seq Scan vs kernel read ahead
|
List | pgsql-hackers |
On Tue, Jul 21, 2020 at 8:06 AM k.jamison@fujitsu.com <k.jamison@fujitsu.com> wrote: > > Thank you for the advice. I repeated the test as per your advice and average of 3 runs > per worker/s planned. > It still shows the following similar performance results between Master and Patch V2. > I wonder why there's no difference though. > > The test on my machine is roughly like this: > > createdb test > psql -d test > create table t_heap as select generate_series(1, 100000000) i; > \q > > pg_ctl restart > psql -d test > SET track_io_timing = on; > SET max_parallel_workers_per_gather = 0; > SHOW max_parallel_workers_per_gather; > EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) from t_heap; > \timing > SELECT count(*) from t_heap; > > drop table t_heap; > \q > dropdb test > pg_ctl restart > > Below are the results. Again, almost no discernible difference between the master and patch. > Also, the results when max_parallel_workers_per_gather is more than 4 could be inaccurate > due to my machine's limitation of only having v4 CPUs. Even so, query planner capped it at > 6 workers. > > Query Planner I/O Timings (track_io_timing = on) in ms : > | Worker | I/O READ (Master) | I/O READ (Patch) | I/O WRITE (Master) | I/O WRITE (Patch) | > |--------|-------------------|------------------|--------------------|-------------------| > | 0 | "1,130.777" | "1,250.821" | "01,698.051" | "01,733.439" | > | 1 | "1,603.016" | "1,660.767" | "02,312.248" | "02,291.661" | > | 2 | "2,036.269" | "2,107.066" | "02,698.216" | "02,796.893" | > | 3 | "2,298.811" | "2,307.254" | "05,695.991" | "05,894.183" | > | 4 | "2,098.642" | "2,135.960" | "23,837.088" | "26,537.158" | > | 5 | "1,956.536" | "1,997.464" | "45,891.851" | "48,049.338" | > | 6 | "2,201.816" | "2,219.001" | "61,937.828" | "67,809.486" | > > Query Planner Execution Time (ms): > | Worker | QueryPlanner (Master) | QueryPlanner (Patch) | > |--------|-----------------------|----------------------| > | 0.000 | "40,454.252" | "40,521.578" | > | 1.000 | "21,332.067" | "21,205.068" | > | 2.000 | "14,266.756" | "14,385.539" | > | 3.000 | "11,597.936" | "11,722.055" | > | 4.000 | "12,937.468" | "13,439.247" | > | 5.000 | "14,383.083" | "14,782.866" | > | 6.000 | "14,671.336" | "15,507.581" | > > Based from the results above, the I/O latency increases as number of workers > also increase. Despite that, the query planner execution time is almost closely same > when 2 or more workers are used (14~11s). Same results between Master and Patch V2. > > As for buffers, same results are shown per worker (both Master and Patch). > | Worker | Buffers | > |--------|--------------------------------------------------| > | 0 | shared read=442478 dirtied=442478 written=442446 | > | 1 | shared read=442478 dirtied=442478 written=442414 | > | 2 | shared read=442478 dirtied=442478 written=442382 | > | 3 | shared read=442478 dirtied=442478 written=442350 | > | 4 | shared read=442478 dirtied=442478 written=442318 | > | 5 | shared read=442478 dirtied=442478 written=442286 | > | 6 | shared read=442478 dirtied=442478 written=442254 | > > > SQL Query Execution Time (ms) : > | Worker | SQL (Master) | SQL (Patch) | > |--------|--------------|--------------| > | 0 | "10,418.606" | "10,377.377" | > | 1 | "05,427.460" | "05,402.727" | > | 2 | "03,662.998" | "03,650.277" | > | 3 | "02,718.837" | "02,692.871" | > | 4 | "02,759.802" | "02,693.370" | > | 5 | "02,761.834" | "02,682.590" | > | 6 | "02,711.434" | "02,726.332" | > > The SQL query execution time definitely benefitted from previous run of query planner, > so the results are faster. But again, both Master and Patched have almost the same results. > Nonetheless, the execution time is almost consistent when > max_parallel_workers_per_gather is 2 (default) and above. > > I am definitely missing something. Perhaps I think I could not understand why there's no > I/O difference between the Master and Patched (V2). Or has it been already improved > even without this patch? > I don't think it is strange that you are not seeing much difference because as per the initial email by Thomas this patch is not supposed to give benefits on all systems. I think we wanted to check that the patch should not regress performance in cases where it doesn't give benefits. I think it might be okay to run with a higher number of workers than you have CPUs in the system as we wanted to check if such cases regress as shown by Soumyadeep above [1]. Can you once try with 8 and or 10 workers as well? [1] - https://www.postgresql.org/message-id/CADwEdoqirzK3H8bB%3DxyJ192EZCNwGfcCa_WJ5GHVM7Sv8oenuA%40mail.gmail.com -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: