Re: [HACKERS] rewrite HeapSatisfiesHOTAndKey - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: [HACKERS] rewrite HeapSatisfiesHOTAndKey
Date
Msg-id CABOikdMUQQs4BnJ4Ws-ObOEDh8vhNp13Y1caK_i8seSHKPjbhw@mail.gmail.com
In response to Re: [HACKERS] rewrite HeapSatisfiesHOTAndKey  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] rewrite HeapSatisfiesHOTAndKey  (Amit Kapila <amit.kapila16@gmail.com>)
Re: [HACKERS] rewrite HeapSatisfiesHOTAndKey  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers


On Tue, Jan 3, 2017 at 9:33 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Jan 2, 2017 at 1:36 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Okay, but I think if we know how much is the additional cost in
> > average and worst case, then we can take a better call.
>
> Yeah.  We shouldn't just rip out optimizations that are inconvenient
> without doing some test of what the impact is on the cases where those
> optimizations are likely to matter.  I don't think it needs to be
> anything incredibly laborious and if there's no discernable impact,
> great.  


So I performed some tests to measure whether this causes any noticeable regression. I used the following simple schema:

DROP TABLE IF EXISTS testtab;
CREATE UNLOGGED TABLE testtab (
    col1 integer,
    col2 text,
    col3 float,
    col4 text,
    col5 text,
    col6 char(30),
    col7 text,
    col8 date,
    col9 text,
    col10 text
);
INSERT INTO testtab
    SELECT generate_series(1,100000),
        md5(random()::text),
        random(),
        md5(random()::text),
        md5(random()::text),
        md5(random()::text)::char(30),
        md5(random()::text),
        now(),
        md5(random()::text),
        md5(random()::text);
CREATE INDEX testindx ON testtab (col1, col2, col3, col4, col5, col6, col7, col8, col9);

I used a rather wide UNLOGGED table with an index on the first nine columns, as suggested by Amit. Also, the table has a reasonable number of rows, but not more than what shared buffers (set to 512MB for these tests) can hold. This should make the test mostly CPU bound.

A transaction then updates the second column in the table. So the refactored patch will do heap_getattr() on more columns than master does while checking whether a HOT update is possible, before giving up. I believe this setup tests something close to the worst case, though maybe I could have tuned some other configuration parameters.

\set value random(1, 100000)
UPDATE testtab SET col2 = md5(random()::text) WHERE col1 = :value;

I ran this script with pgbench using -c1 and -c8 -j4, and the results are:

1-client
        Master          Refactored
Run1    8774.089935     8979.068604
Run2    8509.2661       8943.613575
Run3    8879.484019     8950.994425


8-clients
        Master          Refactored
Run1    22520.422448    22672.798871
Run2    21967.812303    22022.969747
Run3    22305.073223    21909.945623


So at best there is some improvement with the patch, though I don't see any reason why it should improve performance. The results with more clients look almost identical, probably because the bottleneck shifts somewhere else. For all these tests, the table was dropped and recreated in every iteration, so I don't think there was any error in testing. It might be a good idea for someone else to repeat the tests to confirm the improvement that I noticed.
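
For anyone who wants to put a rough number on the comparison, the per-configuration means and the relative difference can be worked out with a short script. This is only a sketch: the figures are copied from the two result tables in this mail, and the names RESULTS and relative_change are made up for illustration.

```python
from statistics import mean

# Figures copied verbatim from the result tables in this mail.
RESULTS = {
    "1-client": {
        "master": [8774.089935, 8509.2661, 8879.484019],
        "refactored": [8979.068604, 8943.613575, 8950.994425],
    },
    "8-clients": {
        "master": [22520.422448, 21967.812303, 22305.073223],
        "refactored": [22672.798871, 22022.969747, 21909.945623],
    },
}


def relative_change(master_runs, refactored_runs):
    """Return the percent change of the refactored mean relative to the master mean."""
    base = mean(master_runs)
    return (mean(refactored_runs) - base) / base * 100.0


# Print the mean of each column and the relative change per configuration.
for label, runs in RESULTS.items():
    pct = relative_change(runs["master"], runs["refactored"])
    print(
        f"{label}: master={mean(runs['master']):.2f} "
        f"refactored={mean(runs['refactored']):.2f} change={pct:+.2f}%"
    )
```

With only three runs per configuration this says nothing about statistical significance; it just summarises the tables, showing a gain of a few percent for one client and a difference well under one percent for eight clients.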

Apart from this, I also ran "make check" multiple times and couldn't find any significant difference in the average time.

I will leave it to Alvaro's judgement whether it's worth committing the patch now, or later when he or another committer looks at committing WARM/indirect indexes, because without either of those patches this change probably does not bring much value, if we ignore the slight improvement we see here.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
