Thread: How does Postgres decide if to use additional workers?
Hi all, I have the following table with an added BTREE-index on "captured_at". > CREATE TABLE datagram > ( > id bigserial NOT NULL, > src_re integer NOT NULL, > src_clt integer NOT NULL, > src_meter integer NOT NULL, > captured_at timestamp with time zone NOT NULL, > captured_rssi smallint NOT NULL, > oms_status smallint NOT NULL, > oms_enc bytea, > oms_dec bytea > ); I have a query containing the following condition: > AND "datagram"."captured_at" BETWEEN (CAST('2020-08-28T10:34:32.855+02:00' AS TIMESTAMP WITH TIME ZONE) - CAST('P5D' ASINTERVAL)) AND (CAST('2020-08-28T10:34:32.855+02:00' AS TIMESTAMP WITH TIME ZONE) + CAST('P0D' AS INTERVAL)) When that query is applied to the above table with it's index on "captured_at", that results in the following query plan. The important thing to note is that NO parallel workers a re used. > -> Hash Join (cost=246164.35..2004405.07 rows=11323 width=51) (actual time=93.802..5776.755 rows=104607 loops=1) > Hash Cond: (meter.meter_bcd = meter_bcd.id) > -> Hash Join (cost=246019.19..2003889.83 rows=68494 width=37) (actual time=93.067..5744.787 rows=104607 loops=1) > Hash Cond: (datagram.src_meter = meter.id) > -> Index Scan using idx_datagram_captured_at_btree on datagram (cost=0.57..1756571.73 rows=495033 width=20) (actualtime=0.054..5451.417 rows=514369 loops=1) > Index Cond: ((captured_at >= ('2020-08-28 10:34:32.855+02'::timestamp with time zone - '5 days'::interval))AND (captured_at <= ('2020-08-28 10:34:32.855+02'::timestamp with time zone + '00:00:00'::interval))) For various reasons I tested the above table as a partitioned one as well, with individual partitions containing the rows of one year only. The important thing to note is that I simply kept the same index on "captured_at" like before, though the query plan looks different now: > Workers Planned: 2 > Workers Launched: 2 > -> Hash Join (cost=245966.53..272335.67 rows=5419 width=51) (actual time=625.846..1560.103 rows=34869 loops=3) > Hash Cond: (datagram_y2020_h2.src_meter = meter.id) > -> Parallel Append (cost=4.19..25430.72 rows=236911 width=20) (actual time=2.827..863.298 rows=171456 loops=3) > Subplans Removed: 23 > -> Parallel Index Scan using datagram_y2020_h2_captured_at_idx on datagram_y2020_h2 (cost=0.44..24051.22 rows=236888width=20) (actual time=2.826..848.388 rows=171456 loops=3) > Index Cond: ((captured_at >= ('2020-08-28 10:34:32.855+02'::timestamp with time zone - '5 days'::interval))AND (captured_at <= ('2020-08-28 10:34:32.855+02'::timestamp with time zone + '00:00:00'::interval))) It seems that only because of a different number of rows per individual table, additional workers are used. Though, in the past I had all of those "captured_at" in one table as well, only with far less columns and for that table additional workers have been used, pretty much like is the case now: > Workers Planned: 2 > Workers Launched: 2 > -> Hash Join (cost=264793.42..1666293.23 rows=4332 width=51) (actual time=96.080..638.802 rows=34869 loops=3) > Hash Cond: (oms_rec.meter = meter.id) > -> Nested Loop (cost=1.14..1400747.39 rows=189399 width=20) (actual time=0.145..496.366 rows=171456 loops=3) > -> Hash (cost=264709.53..264709.53 rows=6620 width=39) (actual time=95.521..95.528 rows=40044 loops=3) > Buckets: 65536 (originally 8192) Batches: 1 (originally 1) Memory Usage: 3016kB > -> Parallel Index Scan using idx_clt_rec_captured_at on clt_rec (cost=0.57..14853.95 rows=189399 width=24)(actual time=0.098..81.556 rows=171456 loops=3) > -> Index Scan using pk_oms_rec on oms_rec (cost=0.57..7.32 rows=1 width=12) (actual time=0.002..0.002 rows=1loops=514369) > -> Hash Join (cost=145.59..264709.53 rows=6620 width=39) (actual time=9.883..86.390 rows=40044 loops=3) > Index Cond: (id = clt_rec.oms_rec) > Index Cond: ((captured_at >= ('2020-08-28 10:34:32.855+02'::timestamp with time zone - '5 days'::interval))AND (captured_at <= ('2020-08-28 10:34:32.855+02'::timestamp with time zone + '00:00:00'::interval))) > Hash Cond: (meter.meter_bcd = meter_bcd.id) So, based on which facts does Postgres decide if to use aadditional workers or not? Can I see those decisions explained somewhere? I don't see anything in the query plan. Thanks! Mit freundlichen Grüßen Thorsten Schöning -- Thorsten Schöning AM-SoFT IT-Service - Bitstore Hameln GmbH i.G. ist ein Mitglied der Bitstore Gruppe - Ihr Full-Service-Dienstleister fürIT und TK E-Mail: Thorsten.Schoening@AM-SoFT.de Web: http://www.AM-SoFT.de/ Telefon: 05151- 9468- 0 Telefon: 05151- 9468-55 Fax: 05151- 9468-88 Mobil: 0178-8 9468-04 AM-SoFT IT-Service - Bitstore Hameln GmbH i.G., Brandenburger Str. 7c, 31789 Hameln AG Hannover HRB neu - Geschäftsführer: Janine Galonska Für Rückfragen stehe ich Ihnen sehr gerne zur Verfügung. Mit freundlichen Grüßen Thorsten Schöning Tel: 05151 9468 0 Fax: 05151 9468 88 Mobil: Webseite: https://www.am-soft.de AM-Soft IT-Service - Bitstore Hameln GmbH i.G. ist ein Mitglied der Bitstore Gruppe - Ihr Full-Service-Dienstleister fürIT und TK AM-Soft IT-Service - Bitstore Hameln GmbH i.G. Brandenburger Str. 7c 31789 Hameln Tel: 05151 9468 0 Bitstore IT-Consulting GmbH Zentrale - Berlin Lichtenberg Frankfurter Allee 285 10317 Berlin Tel: 030 453 087 80 CBS IT-Service - Bitstore Kaulsdorf UG Tel: 030 453 087 880 1 Büro Dallgow-Döberitz Tel: 03322 507 020 Büro Kloster Lehnin Tel: 033207 566 530 PCE IT-Service - Bitstore Darmstadt UG Darmstadt Tel: 06151 392 973 0 Büro Neuruppin Tel: 033932 606 090 ACI EDV Systemhaus Dresden GmbH Dresden Tel: 0351 254 410 Das Systemhaus - Bitstore Magdeburg GmbH Magdeburg Tel: 0391 636 651 0 Allerdata.IT - Bitstore Wittenberg GmbH Wittenberg Tel: 03491 876 735 7 Büro Liebenwalde Tel: 033054 810 00 HSA - das Büro - Bitstore Altenburg UG Altenburg Tel: 0344 784 390 97 Bitstore IT – Consulting GmbH NL Piesteritz Piesteritz Tel: 03491 644 868 6 Solltec IT-Services - Bitstore Braunschweig UG Braunschweig Tel: 0531 206 068 0 MF Computer Service - Bitstore Gütersloh GmbH Gütersloh Tel: 05245 920 809 3 Firmensitz: MF Computer Service - Bitstore Gütersloh GmbH, Gildestraße 25, 33442 Herzebrock-Clarholz Geschäftsführer Janine Galonska
De: "Thorsten Schöning" <tschoening@am-soft.de>
Para: "pgsql-general" <pgsql-general@lists.postgresql.org>
Enviadas: Terça-feira, 9 de fevereiro de 2021 12:52:02
Assunto: How does Postgres decide if to use additional workers?
Para: "pgsql-general" <pgsql-general@lists.postgresql.org>
Enviadas: Terça-feira, 9 de fevereiro de 2021 12:52:02
Assunto: How does Postgres decide if to use additional workers?
So, based on which facts does Postgres decide if to use aadditional
workers or not? Can I see those decisions explained somewhere? I don't
see anything in the query plan. Thanks!
Mit freundlichen Grüßen
Thorsten Schöning
workers or not? Can I see those decisions explained somewhere? I don't
see anything in the query plan. Thanks!
Mit freundlichen Grüßen
Thorsten Schöning
Hi!
I think you should look into min_parallel_table_scan_size and min_parallel_index_scan_size GUCs.
> On Feb 9, 2021, at 10:52 AM, Thorsten Schöning <tschoening@am-soft.de> wrote: > > So, based on which facts does Postgres decide if to use aadditional > workers or not? Can I see those decisions explained somewhere? I don't > see anything in the query plan. Thanks! Hi Thorsten, This is an interesting topic for me too. here’s a formula for the max number of workers that Postgres will consider for atable. Of course, most queries use more than just one table, and I don’t yet understand how Postgres handles the situationwhere the formula suggests multiple workers for some tables and a single worker for others. There was some conversation about this on the performance mailing list in June. Here’s a link to the message that containsthe formula; there’s more items of interest in the whole thread: https://www.postgresql.org/message-id/89423FD3-0F13-447D-8D9E-EB1722150F94%40americanefficient.com Hope this helps Philip