Datum values consistency within one query - Mailing list pgsql-hackers

From Paul Ramsey
Subject Datum values consistency within one query
Date
Msg-id CACowWR3JEgEQmWJNbRK6UyPcMHdsa8UHKW7i_OTLMfv05JaV2w@mail.gmail.com
Whole thread Raw
Responses Re: Datum values consistency within one query  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Imagine a function that was going to take a table of, say, images, and
so something to them over and over, like:

  SELECT pixel_count(img), clr_name, img_name
  FROM images img
  CROSS JOIN colors clr

When you run this function, you find that a great amount of time is
being spend in the decompression/detoasting routines, so you think: I
have a nested loop here, driven on the 'img' side, if I can avoid
re-loading the big image object over and over I can make things
faster.

Getting the datum value is really fast, so I can have a cache that
keeps the latest detoasted object around, and update it when the datum
changes, and store the cache information in the parent context. Like
so:

struct {
    Datum d;
    bytea *ba;
} DatumCache;

PG_FUNCTION_INFO_V1(pixel_count);
Datum pixel_count(PG_FUNCTION_ARGS)
{
    Datum d = PG_GETARG_DATUM(0);
    DatumCache *dcache = fcinfo->flinfo->fn_extra;
    bytea *ba;

    if (!dcache)
    {
        dcache = MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
sizeof(DatumCache));
        fcinfo->flinfo->fn_extra = dcache;
    }

    if (dcache->d != d)
    {
        if (dcache->ba) pfree(dcache->ba);
        MemoryContext old_context =
MemoryContextSwitchTo(fcinfo->flinfo->fn_mcxt);
        dcache->ba = PG_GETARG_BYTEA_P_COPY(0);
        MemoryContextSwitchTo(old_context);
    }

    ba = dcache->ba;

    /* now do things with ba here */
}

Now, notwithstanding any concerns about the particularities of my
example (I've found order-of-magnitude improvements on PostGIS
workloads avoiding the detoasting overhead this way) is my core
assumption correct: within the context of a single SQL statement, will
the Datum values for a particular object remain constant?

They *seem* to, in the examples I'm running. But do they always?



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Add A Glossary
Next
From: Tom Lane
Date:
Subject: Re: [PATCH] Fix for slow GIN index queries when "gin_fuzzy_search_limit" setting is relatively small for large tables