Re: Highly Efficient Custom Sorting - Mailing list pgsql-performance

From Robert Haas
Subject Re: Highly Efficient Custom Sorting
Msg-id AANLkTik51nNmr6E_fu5lJtwCmvRhl3frzoMsgvB1-qf2@mail.gmail.com
In response to Re: Highly Efficient Custom Sorting  (Eliot Gable <egable+pgsql-performance@gmail.com>)
Responses Re: Highly Efficient Custom Sorting  (Eliot Gable <egable+pgsql-performance@gmail.com>)
List pgsql-performance
On Sat, Jul 3, 2010 at 4:17 PM, Eliot Gable
<egable+pgsql-performance@gmail.com> wrote:
> Read RFC 2782 on random weighted load balancing of SRV records inside DNS.

It may be asking a bit much to expect people here to read an RFC to
figure out how to help you solve this problem, but...

> I've looked through the documentation on how to re-write this in C, but I
> cannot seem to find good documentation on working with the input array
> (which is an array of a complex type). I also don't see good documentation
> for working with the complex type. I found stuff that talks about
> constructing a complex type in C and returning it. However, I'm not sure how
> to take an input complex type and deconstruct it into something I can work
> with in C. Also, the memory context management stuff is not entirely clear.

...there's no question that writing things in C is a lot more work,
and takes some getting used to.  Still, it's fast, so maybe worth it,
especially since you already know C++, and will therefore mostly just
need to learn the PostgreSQL coding conventions.  The best thing to do
is probably to look at some of the existing examples within the
backend code.  Most of the datatype code is in src/backend/utils/adt.
You might want to look at arrayfuncs.c (perhaps array_ref() or
array_map()); and also rowtypes.c (perhaps record_cmp()).

> Specifically, how do I go about preserving the pointers to the data that I
> allocate in multi-call memory context so that they still point to the data
> on the next call to the function for the next result row? Am I supposed to
> set up some global variables to do that, or am I supposed to take a
> different approach? If I need to use global variables, then how do I deal
> with concurrency?

Global variables would be a bad idea, not so much because of
concurrency (each backend is its own process, so nothing else touches
them) as because they won't get cleaned up properly.  Again, the best
thing to do is to look at existing examples, like array_unnest() in
src/backend/utils/adt/arrayfuncs.c.  The short answer is that you
probably want to compute all your results on the first call and stash
them in the FuncCallContext (funcctx->user_fctx), allocated in
funcctx->multi_call_memory_ctx so that they survive between calls; then
on subsequent calls just return one row per call.
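
To make the shape of that concrete, here is a rough standalone sketch of
the "compute everything on the first call, hand back one row per call"
pattern.  It is a plain-C model, not backend code: CallContext,
SortState and next_row() are hypothetical names invented for
illustration, and malloc/free stand in for the multi-call memory
context.  In a real set-returning function the same roles are played by
SRF_IS_FIRSTCALL(), SRF_FIRSTCALL_INIT(), SRF_PERCALL_SETUP(),
SRF_RETURN_NEXT() and SRF_RETURN_DONE(), and the sort here is only a
placeholder for whatever ordering you actually need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the backend's FuncCallContext. */
typedef struct CallContext {
    int   call_cntr;   /* rows handed back so far */
    int   max_calls;   /* total rows to hand back */
    void *user_fctx;   /* our own state, kept alive between calls */
} CallContext;

/* Hypothetical per-query state: every result row, computed up front. */
typedef struct SortState {
    int *rows;
} SortState;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/*
 * Returns 1 and stores the next row in *out, or returns 0 once all rows
 * have been handed back.  *ctxp must be NULL before the first call; it
 * carries the state from one call to the next, so no globals are needed.
 */
int next_row(CallContext **ctxp, const int *input, int n, int *out)
{
    CallContext *ctx;
    SortState   *st;

    assert(ctxp != NULL && out != NULL && n >= 0);

    if (*ctxp == NULL) {
        /* First call: do all the work once and stash the results. */
        ctx = malloc(sizeof *ctx);
        st = malloc(sizeof *st);
        if (ctx == NULL || st == NULL) {
            free(ctx);
            free(st);
            return 0;
        }
        st->rows = malloc((size_t) (n > 0 ? n : 1) * sizeof *st->rows);
        if (st->rows == NULL) {
            free(st);
            free(ctx);
            return 0;
        }
        for (int i = 0; i < n; i++)
            st->rows[i] = input[i];
        qsort(st->rows, (size_t) n, sizeof *st->rows, cmp_int);

        ctx->call_cntr = 0;
        ctx->max_calls = n;
        ctx->user_fctx = st;
        *ctxp = ctx;
    }

    /* Every call, including the first: return one row if any remain. */
    ctx = *ctxp;
    st = ctx->user_fctx;
    if (ctx->call_cntr < ctx->max_calls) {
        *out = st->rows[ctx->call_cntr++];
        return 1;
    }

    /* Done: release everything.  The backend does this for you when the
     * multi-call memory context is destroyed. */
    free(st->rows);
    free(st);
    free(ctx);
    *ctxp = NULL;
    return 0;
}
```

The caller just keeps calling next_row() with the same handle until it
returns 0.  The point is that the state hangs off a context the caller
passes back in, which is what funcctx->user_fctx gives you in the
backend.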

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
