Re: Yet another vectorized engine - Mailing list pgsql-hackers

From Hubert Zhang
Subject Re: Yet another vectorized engine
Date
Msg-id CAB0yrekS3ydO3NgRToj18xB1Ez0zCrMsBoMK+rD2JXyKUyjJqA@mail.gmail.com
In response to Re: Yet another vectorized engine  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: Yet another vectorized engine  (Hubert Zhang <hzhang@pivotal.io>)
List pgsql-hackers


On Wed, Feb 26, 2020 at 7:59 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 26.02.2020 13:11, Hubert Zhang wrote:


and with JIT:

 13.88%  postgres  postgres             [.] tts_buffer_heap_getsomeattrs
  7.15%  postgres  vectorize_engine.so  [.] vfloat8_accum
  6.03%  postgres  postgres             [.] HeapTupleSatisfiesVisibility
  5.55%  postgres  postgres             [.] bpchareq
  4.42%  postgres  vectorize_engine.so  [.] VExecStoreColumns
  4.19%  postgres  postgres             [.] hashbpchar
  4.09%  postgres  vectorize_engine.so  [.] vfloat8pl


I also tested Q1 with your latest code. The result of the vectorized version is still slow.
PG13 native: 38 secs
PG13 Vec: 30 secs
PG13 JIT: 23 secs
PG13 JIT+Vec: 27 secs


It is strange that your results are much slower than mine and the profile is very different.
Which postgres configuration are you using?


./configure CFLAGS="-O3 -g -march=native" --prefix=/usr/local/pgsql/ --disable-cassert --enable-debug --with-llvm
I also use `PGXS := $(shell $(PG_CONFIG) --pgxs)` to compile vectorized_engine, so it shares the same compile configuration.

My perf result is as follows. There are three parts:
1. lookup_hash_entry (43.5%): this part is not vectorized yet.
It is vectorized in some sense: lookup_hash_entry performs the bulk of the hash lookups and passes an array with the results of those lookups to the aggregate transition functions.
It would be possible to significantly increase the speed of HashAgg if we stored data in the order of the grouping attributes and used RLE (run-length encoding) to perform just one
hash lookup per group of values. But that requires the creation of special partitions (as is done in Vertica and VOPS).


Yes, Vertica's partitions need to be pre-sorted on user-defined columns, so for TPC-H Q1 on Postgres we cannot make that assumption. And my Q1 plan uses HashAgg instead of GroupAgg based on cost.
 
2. scan part: fetch_input_tuple (36%)
3. vadvance_aggregates part (20%)
I also profiled the PG96 vectorized version and got similar perf results, and the running times of vectorized PG96 and PG13 are also similar. But PG13 itself is much faster than PG96. So I wonder whether we have merged all the latest executor code of PG13 into the vectorized PG13 branch?

Sorry, I do not understand the question. vectorize_executor contains patched versions of nodeSeqscan and nodeAgg from the standard executor.
When porting to PG13, I took the latest version of nodeAgg and tried to apply your patches to it. That was not always possible, and I had to rewrite a lot of places. As for nodeSeqscan, I took the old version from vectorize_executor and ported it to PG13.


It is strange that I am not seeing lookup_hash_entry in the profile in my case.


So you already have the PG13 nodeAgg; that is good.
Yes, it is strange. Hash table probing is always the costly part.
My perf command is `perf record --call-graph dwarf -p <pid>`.
Could you share your lineitem schema and Q1 query?
My schema and Q1 query are:
CREATE TABLE lineitem (
        l_orderkey BIGINT NOT NULL,
        l_partkey INTEGER NOT NULL,
        l_suppkey INTEGER NOT NULL,
        l_linenumber INTEGER NOT NULL,
        l_quantity double precision NOT NULL,
        l_extendedprice double precision NOT NULL,
        l_discount double precision NOT NULL,
        l_tax double precision NOT NULL,
        l_returnflag CHAR(1) NOT NULL,
        l_linestatus CHAR(1) NOT NULL,
        l_shipdate DATE NOT NULL,
        l_commitdate DATE NOT NULL,
        l_receiptdate DATE NOT NULL,
        l_shipinstruct CHAR(25) NOT NULL,
        l_shipmode CHAR(10) NOT NULL,
        l_comment VARCHAR(44) NOT NULL
);
select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(l_discount) as count_order
from
    lineitem
where
    l_shipdate <= date '1998-12-01' - interval '106 day'
group by
    l_returnflag,
    l_linestatus
;


--
Thanks

Hubert Zhang
