Re: [HACKERS] GSoC 2017: Foreign Key Arrays - Mailing list pgsql-hackers

From Mark Rofail
Subject Re: [HACKERS] GSoC 2017: Foreign Key Arrays
Date
Msg-id CAJvoCutyJSuOp_jdJBqmQ=dGonEX_xN1ZwFpEyzmO4FaPhx0DQ@mail.gmail.com
In response to Re: [HACKERS] GSoC 2017: Foreign Key Arrays  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: [HACKERS] GSoC 2017: Foreign Key Arrays  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Tue, 10 Apr 2018 at 4:17 pm, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> I meant for the GIN operator. (Remember, these are two patches, and each
> of them needs its own tests.)
Yes, you are right. I have been treating the code as a single patch for so long that I almost forgot.

> True.  So my impression from the numbers you posted last time is that
> you need to run each measurement case several times, and provide
> averages/stddevs/etc for the resulting numbers, and see about outliers
> (maybe throw them away, or maybe they indicate some problem in the test
> or in the code); then we can make an informed decision about whether the
> variations between the several different scenarios are real improvements
> (or pessimizations) or just measurement noise.
I'd rather throw away the previous results and start over with new performance tests. However, as I showed you, this was my first time writing performance tests. If there is something I could use as a reference, it would help me a great deal.
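
As a starting point for the repeated measurements described above, here is a minimal sketch in Python. It is an illustration under stated assumptions, not an established PostgreSQL benchmarking procedure: the helper names time_runs and summarize are hypothetical, the callable passed to time_runs stands in for whatever executes one test query, and the outlier rule (modified z-score based on the median absolute deviation, with a cutoff of 3.5) is just one common convention.

```python
import statistics
import time


def time_runs(fn, repeats=10):
    """Call fn() `repeats` times and return wall-clock durations in milliseconds.

    `fn` is a placeholder for whatever runs a single measurement case,
    for example a function that executes one query against a test database.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def summarize(samples, threshold=3.5):
    """Return mean, sample standard deviation, median, and flagged outliers.

    Outliers are flagged with the modified z-score, which is based on the
    median absolute deviation (MAD) and is therefore not distorted by the
    outliers themselves. The 3.5 cutoff is a convention, not a rule.
    """
    median = statistics.median(samples)
    mad = statistics.median([abs(x - median) for x in samples])
    if mad == 0:
        # All samples (or at least half of them) are identical;
        # the score is undefined, so flag nothing.
        outliers = []
    else:
        outliers = [
            x for x in samples
            if abs(0.6745 * (x - median) / mad) > threshold
        ]
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "median": median,
        "outliers": outliers,
    }
```

Each scenario would get its own list of samples from time_runs, and the summaries can then be compared side by side. A flagged outlier is only a prompt to look closer: it may be noise to discard, or it may point at a problem in the test or in the code.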

> In particular: it seemed to me that you decided to throw away the idea
> of the new GIN operator without sufficient evidence that it was
> unnecessary.
I have to admit to that. In my defence, though, @> is also GIN-indexable, so in theory the only performance difference between 'anyarray @>> anyelement' and 'anyarray @> ARRAY[anyelement]' is the overhead of constructing the array with ARRAY[].

I apologise; I should have gathered more evidence to support my claims.

Regards

On Tue, Apr 10, 2018 at 4:17 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Mark Rofail wrote:
> On Tue, Apr 10, 2018 at 3:59 PM, Alvaro Herrera <alvherre@alvh.no-ip.org>
> wrote:
> >
> > documentation to it and a few extensive tests to ensure it works well);
>
> I think the existing regression tests verify that the patch works as
> expected, correct?

I meant for the GIN operator. (Remember, these are two patches, and each
of them needs its own tests.)

> We need more *exhaustive* tests to test performance, not functionality.

True.  So my impression from the numbers you posted last time is that
you need to run each measurement case several times, and provide
averages/ stddevs/etc for the resulting numbers, and see about outliers
(maybe throw them away, or maybe they indicate some problem in the test
or in the code); then we can make an informed decision about whether the
variations between the several different scenarios are real improvements
(or pessimizations) or just measurement noise.

In particular: it seemed to me that you decided to throw away the idea
of the new GIN operator without sufficient evidence that it was
unnecessary.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
