Home > mailing lists

Re: Progress on fast path sorting, btree index creation time - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Progress on fast path sorting, btree index creation time
Date	January 31, 2012 15:47:36
Msg-id	CA+TgmoY+2ZTt82nzp+GX6OevQkEpWv5KFhn8yzxGDWJnUwp9kQ@mail.gmail.com Whole thread
In response to	Re: Progress on fast path sorting, btree index creation time (Peter Geoghegan <peter@2ndquadrant.com>)
Responses	Re: Progress on fast path sorting, btree index creation time
List	pgsql-hackers

Tree view

On Fri, Jan 27, 2012 at 3:33 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> Patch is attached. I have not changed the duplicate functions. This is
> because I concluded that it was the lesser of two evils to have to get
> the template to generate both declarations in the header file, and
> definitions in the .c file - that seemed particularly obscure. We're
> never going to have to expose/duplicate any more comparators anyway.
> Do you agree?

Not really.  You don't really need macros to generate the prototypes;
you could just write them out longhand.

I think there's a mess of naming confusion in here, though, as perhaps
best illlustrated by this macro definition:

#define TEMPLATE_QSORT_ARG_HEAP(TYPE, COMPAR)                               \
DO_TEMPLATE_QSORT_ARG(TYPE, COMPAR, inlheap,                                \       SING_ADDITIONAL_CODE,
TYPE##inlheapcomparetup_inline)              \

DO_TEMPLATE_QSORT_ARG(TYPE, COMPAR, regheap,                                \
MULT_ADDITIONAL_CODE(TYPE##regheapAppFunc),                        \           TYPE##regheapcomparetup_inline)

The idea here is that when we have only a single sort key, we include
SING_ADDITIONAL_CODE in the tuple comparison function, whereas when we
have more than one, we instead include MULT_ADDITIONAL_CODE.  Right
there, I think we have a naming problem, because abbreviating "single"
to "sing" and multiple to "mult" is less than entirely clear.  For a
minute or two I was trying to figure out whether our sorting code was
musically inclined, and I'm a native english speaker.  But then we
switch to another set of terminology completely for the generated
functions: inlheap for the single-key case, and regheap for the
multiple-key case.   I find that even more confusing.

I think we ought to get rid of this:

+typedef enum TypeCompar
+{
+       TYPE_COMP_OTHER,
+       TYPE_COMP_INT4,
+       TYPE_COMP_INT8,
+       TYPE_COMP_FLOAT4,
+       TYPE_COMP_FLOAT8,
+       TYPE_COMP_FULL_SPECIALISATION
+} TypeCompar;

Instead, just modify SortSupportData to have two function pointers as
members, one for the single-key case and another for the multiple-key
case, and have the sortsupport functions initialize them to the actual
functions that should be called.  The layer of indirection, AFAICS,
serves no purpose.

> It's pretty easy to remove a specialisation at any time - just remove
> less than 10 lines of code. It's also pretty difficult to determine,
> with everyone's absolute confidence, where the right balance lies.
> Perhaps the sensible thing to do is to not be so conservative in what
> we initially commit, while clearly acknowledging that we may not have
> the balance right, and that it may have to change. We then have the
> entire beta part of the cycle in which to decide to roll back from
> that position, without any plausible downside. If, on the other hand,
> we conservatively lean towards fewer specialisations in the initial
> commit, no one will complain about the lack of an improvement in
> performance that they never had.

Eh, really?  Typically when we do something good, the wolves are
howling at the door to make it work in more cases.

> I think that possibly the one remaining blocker to tentatively
> committing this with all specialisations intact is that I haven't
> tested this on Windows, as I don't currently have access to a Windows
> development environment. I have set one up before, but it's a huge
> pain. Can anyone help me out?

This doesn't strike me as terribly OS-dependent, unless by that we
mean compiler-dependent.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-hackers by date:

From: Peter Geoghegan
Date: 31 January 2012, 15:42:47
Subject: Re: Issues with C++ exception handling in an FDW

From: Merlin Moncure
Date: 31 January 2012, 15:49:54
Subject: Re: JSON for PG 9.2

Re: Progress on fast path sorting, btree index creation time - Mailing list pgsql-hackers

Previous

Next