Thread: Expanded Objects and Order By

Expanded Objects and Order By

From: Paul Ramsey
I'm starting to think this might not actually be my mistake, but be a
very narrow issue w/ the expanded object code.

So, I've added support for converting postgis in-memory objects into
expanded outputs, and have overwritten the usual
lwgeom_to_gserialized() function with one that creates an expanded
object. I haven't done anything to actually handle expanded objects on
input, but as I understand it, that's fine, everything is supposed to
work as normal when handed expanded objects. And thus far, that has
been true, almost all regression tests work fine.

But here's my narrow case. The query below works fine with the old
lwgeom_to_gserialized() that produces standard flat varlena outputs,
but does the following with expanded outputs:

SELECT gid
FROM test_table
ORDER BY st_distance(geom, 'POINT(-305 998.5)'::geometry) LIMIT 5;

ERROR:  could not find pathkey item to sort

Tracing through the debugger, I see the lwgeom_to_gserialized()
function get hit once for the geometry literal, via parse_analyze().
Shortly after that, the error shows up.

If I change the query ever so slightly, adding a geometry output column:

SELECT gid, geom
FROM test_table
ORDER BY st_distance(geom, 'POINT(-305 998.5)'::geometry) LIMIT 5;

It runs through to completion, five records returned. As long as the
query includes a geometry column in the output list, it works. A
function that uses a geometry but doesn't actually output it will
still fail with the same pathkey error. This will fail, for example:

SELECT gid, ST_AsText(geom)
FROM test_table
ORDER BY st_distance(geom, 'POINT(-305 998.5)'::geometry) LIMIT 5;

So I'm wondering what I should debug next. My code is hardly being
touched at all at this point, and I'm very confident it's memory
clean, etc (the flattener raises hell if the output is shorter or
longer than the expected allocation).

Any pointers as to what to look for, much appreciated.

P.



Re: Expanded Objects and Order By

From: Tom Lane
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> So, I've added support for converting postgis in-memory objects into
> expanded outputs, and have overwritten the usual
> lwgeom_to_gserialized() function with one that creates an expanded
> object. I haven't done anything to actually handle expanded objects on
> input, but as I understand it, that's fine, everything is supposed to
> work as normal when handed expanded objects. And thus far, that has
> been true, almost all regression tests work fine.

Hmm ... you do need to be able to flatten them again.  In the given
example, the parser is going to want to form a Const node whose Datum
value is a geometry object, and that Const node needs to be copiable
by copyObject(), which means datumCopy() has to work, and if you look
at that it will exercise EOH_get_flat_size/EOH_flatten_into when the
input routine originally produced an expanded object.
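
(Purely for illustration, a minimal sketch of the two callbacks an expanded
object must supply so that datumCopy() can flatten it; the ExpandedGeom
struct and the two serialization helpers are invented names here, only
ExpandedObjectHeader and ExpandedObjectMethods are the real API from
utils/expandeddatum.h.)

#include "postgres.h"
#include "utils/expandeddatum.h"

typedef struct ExpandedGeom
{
    ExpandedObjectHeader hdr;       /* standard header, must come first */
    void       *geom;               /* in-memory geometry representation */
} ExpandedGeom;

static Size
geom_get_flat_size(ExpandedObjectHeader *eohptr)
{
    ExpandedGeom *eg = (ExpandedGeom *) eohptr;

    /* must report exactly the number of bytes flatten_into will write */
    return geom_serialized_size(eg->geom);      /* hypothetical helper */
}

static void
geom_flatten_into(ExpandedObjectHeader *eohptr,
                  void *result, Size allocated_size)
{
    ExpandedGeom *eg = (ExpandedGeom *) eohptr;

    /* must fill the buffer completely and set its varlena length header */
    geom_serialize_into(eg->geom, result, allocated_size);  /* hypothetical */
}

static const ExpandedObjectMethods geom_methods =
{
    geom_get_flat_size,
    geom_flatten_into
};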

The error message is very strange; it's hard to see how toying with the
internal representation of Consts could cause that.  I think the
hypothesis of a memory clobber is stronger than you give it credit for,
especially since you see the behavior changing for seemingly unrelated
reasons.  Valgrind might be a useful tool here.
        regards, tom lane



Re: Expanded Objects and Order By

From: Paul Ramsey
I have a size/flatten callback setup (and they are very careful not to write outside their boundaries), so that’s all OK.

Since you’re not seeing anything “aha” in the error pattern, I’ll go back to the mats on memory… is there a good page on valgrinding postgresql? I thought the memory manager papered over things so much that valgrind couldn’t see what was going on under the covers.

Thanks!

P

> On Jan 18, 2016, at 8:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> So, I've added support for converting postgis in-memory objects into
>> expanded outputs, and have overwritten the usual
>> lwgeom_to_gserialized() function with one that creates an expanded
>> object. I haven't done anything to actually handle expanded objects on
>> input, but as I understand it, that's fine, everything is supposed to
>> work as normal when handed expanded objects. And thus far, that has
>> been true, almost all regression tests work fine.
>
> Hmm ... you do need to be able to flatten them again.  In the given
> example, the parser is going to want to form a Const node whose Datum
> value is a geometry object, and that Const node needs to be copiable
> by copyObject(), which means datumCopy() has to work, and if you look
> at that it will exercise EOH_get_flat_size/EOH_flatten_into when the
> input routine originally produced an expanded object.
>
> The error message is very strange; it's hard to see how toying with the
> internal representation of Consts could cause that.  I think the
> hypothesis of a memory clobber is stronger than you give it credit for,
> especially since you see the behavior changing for seemingly unrelated
> reasons.  Valgrind might be a useful tool here.
>
>             regards, tom lane




Re: Expanded Objects and Order By

From: Tom Lane
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> Since you’re not seeing anything “aha” in the error pattern, I’ll go back
> to the mats on memory… is there a good page on valgrinding postgresql?

https://wiki.postgresql.org/wiki/Valgrind

> I thought the memory manager papered over things so much that valgrind
> couldn’t see what was going on under the covers.

That used to be true, but it's a lot better now.
        regards, tom lane



Re: Expanded Objects and Order By

From: Paul Ramsey
Thank the Maker, it is reproducible: returning an expanded header in the _in function is not appreciated in a very narrow number of cases.

Edit arrayfuncs.c:array_in(), change the return line thus:

    // PG_RETURN_ARRAYTYPE_P(retval);
    PG_RETURN_DATUM(expand_array(PointerGetDatum(retval),
                                 CurrentMemoryContext,
                                 my_extra));

And here is a small test case that exercises it:

CREATE TABLE orderby_expanded (
    id integer,
    a integer[]
);

INSERT INTO orderby_expanded (id, a) VALUES (1, ARRAY[1,2]);

-- works
SELECT id, a FROM orderby_expanded
ORDER BY a = '{1,2}'::integer[];

-- ERROR:  could not find pathkey item to sort
SELECT id FROM orderby_expanded
ORDER BY a = '{1,2}'::integer[];

-- works
SELECT id FROM orderby_expanded
ORDER BY a = ARRAY[1,2];




Re: Expanded Objects and Order By

From: Tom Lane
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> Thank the Maker, it is reproducible: returning an expanded header in the
> _in function is not appreciated in a very narrow number of cases.

Thanks for finding a test case!  I'll check into it shortly.
        regards, tom lane



Re: Expanded Objects and Order By

From: Tom Lane
I wrote:
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> Thank the Maker, it is reproducible: returning an expanded header in
>> the _in function is not appreciated in a very narrow number of cases.

> Thanks for finding a test case!  I'll check into it shortly.

So the short answer is that the planner assumes, not unreasonably,
that copying a parse node with copyObject() should produce a node
that's still equal() to the original.  (The proximate cause of the
"could not find pathkey item to sort" error is failure to match
an ORDER BY expression to an equivalence-class item that was made
from it earlier.)

In the case here, coerce_type() produces a Const node whose constvalue
is pointing at the expanded array returned by array_in().  Nothing
further happens to that node until the planner tries to copy it ---
and copyObject will produce a flattened array value, cf datumCopy().
Even if copying made a new expanded object, that object would not be
equal() to the original node according to datumIsEqual's very
simplistic comparisons.
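
(For context, a rough paraphrase — not a verbatim quote of datum.c — of how
a by-reference datum gets copied, which is why the planner's copy of an
expanded Const can never be pointer- or byte-identical to the original:)

#include "postgres.h"
#include "utils/datum.h"
#include "utils/expandeddatum.h"

/* sketch of the by-reference copy logic, paraphrasing datumCopy() */
static Datum
copy_byref_datum(Datum value, bool typByVal, int typLen)
{
    Datum       res;

    if (typLen == -1 &&
        VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(value)))
    {
        /* expanded object: flatten into a fresh, standard-header varlena */
        ExpandedObjectHeader *eoh = DatumGetEOHP(value);
        Size        resultsize = EOH_get_flat_size(eoh);
        char       *resultptr = (char *) palloc(resultsize);

        EOH_flatten_into(eoh, (void *) resultptr, resultsize);
        res = PointerGetDatum(resultptr);
    }
    else
    {
        /* anything else by-reference: copy the existing bytes verbatim */
        Size        realSize = datumGetSize(value, typByVal, typLen);
        char       *resultptr = (char *) palloc(realSize);

        memcpy(resultptr, DatumGetPointer(value), realSize);
        res = PointerGetDatum(resultptr);
    }
    return res;
}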

Now that I've seen this, I'm suspicious that we probably have a lot
of other related bugs --- for example, short-header and non-short-header
varlena values would also not be seen as equal().

What I'm thinking is that we had best apply pg_detoast_datum() to any
varlena value being put into a Const node.  That will do nothing in the
normal case where the value already is a simple standard-header varlena
value, but will fix any cases where we might be trying to put a toasted,
expanded, etc value into a Const.  Generally it'd be a bad idea to have a
toast reference in a Const anyhow; for example, there is code at the end
of evaluate_expr() that does the equivalent of this when making a Const
from the result of constant-simplifying an expression.  That got put in
there as the result of hard experience; but now it seems we did not go
deep enough.
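
(Sketch of the shape of that idea, not an actual patch — the variable names
here just echo makeConst()'s parameters:)

    /*
     * Force any varlena Const value into plain, untoasted, unexpanded
     * form before it is stored in the node.
     */
    if (!constisnull && constlen == -1)
        constvalue = PointerGetDatum(
            pg_detoast_datum((struct varlena *) DatumGetPointer(constvalue)));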

An alternative idea would be to make datumIsEqual smarter; but given the
other ways in which it's used I doubt we want it to be trying to do
detoasting.
        regards, tom lane



Re: Expanded Objects and Order By

From: Tom Lane
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> Thank the Maker, it is reproducible: returning an expanded header in the
> _in function is not appreciated in a very narrow number of cases.

BTW, on further poking around: if you'd had RANDOMIZE_ALLOCATED_MEMORY
enabled, returning an expanded object from an input function would have
failed this test in stringTypeDatum():

#ifdef RANDOMIZE_ALLOCATED_MEMORY
    /*
     * For pass-by-reference data types, repeat the conversion to see if the
     * input function leaves any uninitialized bytes in the result.  We can
     * only detect that reliably if RANDOMIZE_ALLOCATED_MEMORY is enabled, so
     * we don't bother testing otherwise.  The reason we don't want any
     * instability in the input function is that comparison of Const nodes
     * relies on bytewise comparison of the datums, so if the input function
     * leaves garbage then subexpressions that should be identical may not
     * get recognized as such.  See pgsql-hackers discussion of 2008-04-04.
     */
    if (string && !typform->typbyval)
    {
        Datum        result2;

        result2 = OidInputFunctionCall(typinput, string,
                                       typioparam, atttypmod);
        if (!datumIsEqual(result, result2, typform->typbyval, typform->typlen))
            elog(WARNING, "type %s has unstable input conversion for \"%s\"",
                 NameStr(typform->typname), string);
    }
#endif

The pointer values in the two objects could not be equal so datumIsEqual
would certainly fail.  So one way of looking at this is that the case
indeed should be considered unsupported.

However, what I'm realizing from this discussion is that we need to have a
stronger convention on what we put into Const nodes.  Repeatable results
from the input function are certainly necessary, but we *also* need to
forbid short-header, compressed, or externally toasted values.  Otherwise
we will have Consts that aren't very Const (if the external value goes
away) or fail to compare equal to other Consts that they reasonably should
compare equal to.

Seen in that light, applying pg_detoast_datum() to all varlena values
that are going into Consts is sensible, and it'll take care of the
expanded-object variant of the issue as well.

I've not coded the fix yet, but it looks like the thing to do is to
put "if not-null and typlen==-1 then pg_detoast_datum()" logic into
makeConst() and coerce_type(), which so far as I can find are the only
places that build new Const nodes.  Also, the above-quoted
repeatable-results check has to be applied *after* the detoast step,
else it'll still be a hazard for expanded objects.  However, coerce_type()
is the only caller of stringTypeDatum(), so what we can do is move the
repeatable-results check over to coerce_type() and have it repeat
pg_detoast_datum() as well as the input-function call proper.
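
(Again only a sketch of the intended shape, with made-up local variables —
the point being that both conversions get detoasted the same way before the
datumIsEqual() cross-check runs:)

    /* convert twice, detoast both results identically, then compare */
    Datum   val  = OidInputFunctionCall(typinput, string, typioparam, atttypmod);
    Datum   val2 = OidInputFunctionCall(typinput, string, typioparam, atttypmod);

    if (typlen == -1)
    {
        val  = PointerGetDatum(PG_DETOAST_DATUM(val));
        val2 = PointerGetDatum(PG_DETOAST_DATUM(val2));
    }

    if (!datumIsEqual(val, val2, typbyval, typlen))
        elog(WARNING, "type has unstable input conversion for \"%s\"", string);
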
        regards, tom lane