Re: [HACKERS] [GSoC] Push-based query executor discussion - Mailing list pgsql-hackers

From Arseny Sher
Subject Re: [HACKERS] [GSoC] Push-based query executor discussion
Date
Msg-id 87r31pxouh.fsf@ispras.ru
In response to Re: [HACKERS] [GSoC] Push-based query executor discussion  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
I will share the actual benchmarks and the code to give it another
chance and to give an idea of what the result looks like. So far I have
implemented the suggested changes for the seqscan, hash, hashjoin, limit,
hashed aggregation and in-memory sort nodes. That is enough to test the
q1, q3, q4, q5, q10, q12 and q14 queries from the TPC-H set. Since my goal
was just to estimate the performance benefits, there are several
restrictions:
* ExecReScan is not supported
* only CMD_SELECT operations currently work
* only forward direction is supported.
SRFs, subplans and parallel execution are not supported either, because
the corresponding nodes are not yet implemented.


Here you can see the results:

+-----+-----------+---------+----------+
|query|reversed, s|master, s|speedup, %|
+-----+-----------+---------+----------+
|q01  |128.53     |138.94   |8.1       |
+-----+-----------+---------+----------+
|q03  |61.53      |67.29    |9.36      |
+-----+-----------+---------+----------+
|q04  |86.27      |95.95    |11.22     |
+-----+-----------+---------+----------+
|q05  |54.44      |56.82    |4.37      |
+-----+-----------+---------+----------+
|q10  |55.44      |59.88    |8.01      |
+-----+-----------+---------+----------+
|q12  |69.59      |76.65    |10.15     |
+-----+-----------+---------+----------+


'reversed' is Postgres with the push-based executor, 'master' is the
current master branch. 24 runs were conducted and the median of them was
taken. Speedup in % is (master - reversed) / reversed * 100. The scale
factor of the TPC-H database was 40. We use doubles as floating point
types instead of numerics. Only q1 here is fully supported, meaning that
the planner would choose this plan anyway, even if all other nodes were
implemented. For the other queries the planner normally also uses Index
Scan, Nested Loop Semi Join, bitmap scans and Materialize, which are not
yet reversed.
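
For anyone who wants to recheck the last column of the table, here is a
minimal sketch of the speedup formula above in C. The function name
speedup_percent is made up for this illustration and is not part of the
patches; the inputs are the median timings in seconds.

```c
#include <assert.h>

/*
 * Speedup in percent, as defined in the text:
 *     (master - reversed) / reversed * 100
 * Both arguments are median run times in seconds.
 * Hypothetical helper, not taken from the patches.
 */
static double
speedup_percent(double master_s, double reversed_s)
{
    assert(reversed_s > 0.0);   /* avoid dividing by zero */
    return (master_s - reversed_s) / reversed_s * 100.0;
}
```

E.g. speedup_percent(138.94, 128.53) corresponds to the q01 row. Note
that the difference is divided by the 'reversed' time, not by the
'master' time.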

postgresql.conf was

shared_buffers = 128GB
maintenance_work_mem = 1GB
work_mem = 8GB
effective_cache_size = 128GB

max_wal_senders = 0
max_parallel_workers_per_gather = 0  # disable parallelism

# disable not yet implemented nodes
set enable_indexscan TO off;
set enable_indexonlyscan TO off;
set enable_material TO off;
set enable_bitmapscan TO off;
set enable_nestloop TO off;
set enable_sort TO off;

Patches are attached, they apply cleanly on 767ce36ff36747.

While in some places the patches introduce a kind of ugliness, which is
described in commit messages and commits (e.g. heapam.h now must know
about PlanState *), I think in others this approach can make the
architecture a bit cleaner. Specifically, we can now cleanly separate the
logic for handling tuples from the inner and outer sides (see hashjoin),
and also separate the logic for handling NULL tuples. I haven't added the
latter yet, but the idea is that the node below always knows when it is
done, so it can directly call its parent's function for handling null
tuples instead of keeping an extra 'if' in the generic
execProcNode/pushTuple.
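
To illustrate the idea of a separate end-of-stream call, here is a toy
sketch in C. It is not code from the patches: ToyNode, push_tuple,
finish and the other names are invented for this example, and tuples are
plain ints. The point is only that the child calls one function per real
tuple and a different function when it is done, so the per-tuple path
has no NULL check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy node, NOT the PlanState used by the patches. */
typedef struct ToyNode ToyNode;
struct ToyNode
{
    ToyNode *parent;
    /* Called by the child for every real tuple; returns 0 to stop early. */
    int     (*push_tuple) (ToyNode *self, int tuple);
    /* Called by the child exactly once, when it has no more tuples. */
    void    (*finish) (ToyNode *self);
    long     state;     /* remaining limit, or running sum */
    int      done;      /* set once finish() has been called */
};

/* Sink node: accumulates the sum of all tuples pushed into it. */
static int
sum_push(ToyNode *self, int tuple)
{
    self->state += tuple;
    return 1;
}

static void
sum_finish(ToyNode *self)
{
    self->done = 1;
}

/* Limit-like node: forwards at most 'state' tuples to its parent. */
static int
limit_push(ToyNode *self, int tuple)
{
    if (self->state <= 0)
        return 0;
    self->state--;
    if (!self->parent->push_tuple(self->parent, tuple))
        return 0;
    return self->state > 0;     /* ask the child to stop once exhausted */
}

static void
limit_finish(ToyNode *self)
{
    self->done = 1;
    self->parent->finish(self->parent);     /* propagate end-of-stream */
}

/* Scan-like driver: pushes tuples upward, then signals the end once. */
static void
scan_run(ToyNode *parent, const int *tuples, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!parent->push_tuple(parent, tuples[i]))
            break;
    parent->finish(parent);
}

/* Builds scan -> limit -> sum and returns the resulting sum. */
static long
toy_limited_sum(const int *tuples, size_t n, long limit)
{
    ToyNode sum = {NULL, sum_push, sum_finish, 0, 0};
    ToyNode lim = {&sum, limit_push, limit_finish, limit, 0};

    scan_run(&lim, tuples, n);
    assert(sum.done && lim.done);
    return sum.state;
}
```

In a pull-based loop the parent would have to test every returned tuple
for NULL; here that test is replaced by the single finish() call made by
the node that knows it is done.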

--
Arseny Sher
