RelOptInfo -> Relation

From: Robert Haas
Date:
Is there some good reason why get_relation_info() copies all of the
data that will be needed later in planning into the RelOptInfo instead
of just storing a pointer to the Relation directly into the
RelOptInfo?  Of course, if we did that, then it would have to leave
the relation open instead of closing it, and similarly for the
indexes, but so what?  As far as I understand it, we're keeping a lock
on the relation, so none of this stuff should be able to change under
it, and pointing to data is faster than copying it.

I think that this problem has gradually gotten worse as we've added
features to the system.  Between 7.4 and master, we've added the
following stuff to what this function needs to propagate into the
RelOptInfo: tablespace, attr_needed, attr_width, allvisfrac,
rel_parallel_workers, statlist, serverid, fdwroutine, fkey_list,
part_scheme, boundinfo, nparts, partexprs.  That's a fair amount of
stuff, and IndexOptInfo includes more things now, too.  The whole
thing seems pretty inefficient.  The relcache stores a list of
statistics object OIDs, and get_relation_statistics() then builds a
StatisticExtInfo for each one.  That, however, means that the
StatisticExtInfo list is getting rebuilt for every query.
If the RelOptInfo could point directly into the relcache, then we
could build that stuff once when the relcache entry was created, or
maybe on demand, and then keep it forever.  But if we did that today
it wouldn't work too well: we'd still have to copy the result from the
relcache into the RelOptInfo.  The just-copy-it approach is what
set_relation_partition_info does for rel->boundinfo, and that
similarly seems like a waste of cycles.  I have a feeling we're all
cargo-culting each new feature into a system that we may have
outgrown, but maybe there's some reason behind this that I'm missing.
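To make the copy-versus-point tradeoff concrete, here is a self-contained toy sketch. The struct and function names are invented for illustration and are not the real planner or relcache structures:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a relcache entry and a RelOptInfo. */
typedef struct CacheEntry {
    int  nparts;
    int *part_bounds;            /* nparts entries */
} CacheEntry;

typedef struct PlannerInfoCopy {
    int  nparts;
    int *part_bounds;            /* deep copy: survives cache rebuilds */
} PlannerInfoCopy;

typedef struct PlannerInfoPtr {
    const CacheEntry *entry;     /* borrowed: cheap, but only safe while pinned */
} PlannerInfoPtr;

/* Copy approach: O(n) work per query, but no lifetime coupling. */
static void fill_by_copy(PlannerInfoCopy *dst, const CacheEntry *src)
{
    dst->nparts = src->nparts;
    dst->part_bounds = malloc(sizeof(int) * src->nparts);
    memcpy(dst->part_bounds, src->part_bounds, sizeof(int) * src->nparts);
}

/* Pointer approach: O(1), but the entry must stay open and stable. */
static void fill_by_pointer(PlannerInfoPtr *dst, const CacheEntry *src)
{
    dst->entry = src;
}
```

The copy pays per-query allocation and memcpy cost but owns its data; the pointer is nearly free but is only safe for as long as the cache entry is held open and does not move.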

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: RelOptInfo -> Relation

From: Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Is there some good reason why get_relation_info() copies all of the
> data that will be needed later in planning into the RelOptInfo instead
> of just storing a pointer to the Relation directly into the
> RelOptInfo?  Of course, if we did that, then it would have to leave
> the relation open instead of closing it, and similarly for the
> indexes, but so what?  As far as I understand it, we're keeping a lock
> on the relation, so none of this stuff should be able to change under
> it, and pointing to data is faster than copying it.

We are not keeping a lock strong enough to prevent the relcache entry
from changing altogether, e.g. an ANALYZE could commit midway through
and change, say, the relpages/reltuples info.  The consequences of
that for consistent planner behavior are not clear to me, but they're
probably not something to dismiss out of hand.
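A minimal sketch of why copying the volatile scalars at plan start matters (the types and names here are invented, not PostgreSQL's): the snapshot pins one consistent view even if an ANALYZE commits mid-plan.

```c
#include <assert.h>

/* Hypothetical: volatile per-relation stats that ANALYZE may update. */
typedef struct SharedRelStats {
    double relpages;
    double reltuples;
} SharedRelStats;

typedef struct PlannerSnapshot {
    double pages;
    double tuples;
} PlannerSnapshot;

/* Copying at plan start gives one consistent view for the whole plan. */
static PlannerSnapshot snapshot_stats(const SharedRelStats *live)
{
    PlannerSnapshot s = { live->relpages, live->reltuples };
    return s;
}
```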

Also, there's a big difference between the entry's contents being
logically unchanging and being physically unchanging: we could get a
relcache inval event that forces a rebuild, and only very limited parts of
the entry are guaranteed not to move around if that happens.  There's a
general rule that code outside the relcache shouldn't keep pointers to
most parts of a relcache entry's infrastructure, and for those parts where
we do provide a guarantee (like the tupdesc), it doesn't come for free.
Maybe the planner's uses of the info wouldn't be any more dangerous than
anything else's, but again it's not clear.

I'm disinclined to monkey with the way this works without someone
presenting hard evidence that it creates enough of a performance problem
to be worth spending a significant amount of time changing it.  Up to
now I don't think I've ever noticed plancat.c being a large fraction
of planner runtime, though of course that might depend on the test case.

            regards, tom lane


Re: RelOptInfo -> Relation

From: Robert Haas
Date:
On Fri, Feb 2, 2018 at 7:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We are not keeping a lock strong enough to prevent the relcache entry
> from changing altogether, e.g. an ANALYZE could commit midway through
> and change, say, the relpages/reltuples info.  The consequences of
> that for consistent planner behavior are not clear to me, but they're
> probably not something to dismiss out of hand.

Right -- stuff that could actually change during planning should
probably be copied, but things like the partition key and partition
bounds are annoyingly large to copy and cannot change during planning.

> Also, there's a big difference between the entry's contents being
> logically unchanging and being physically unchanging: we could get a
> relcache inval event that forces a rebuild, and only very limited parts of
> the entry are guaranteed not to move around if that happens.  There's a
> general rule that code outside the relcache shouldn't keep pointers to
> most parts of a relcache entry's infrastructure, and for those parts where
> we do provide a guarantee (like the tupdesc), it doesn't come for free.
> Maybe the planner's uses of the info wouldn't be any more dangerous than
> anything else's, but again it's not clear.

A good number of bits of the relcache entry defend against this sort
of hazard by building a new entry, comparing it to the old one, and
swapping in the new one only if there is a difference.  This technique
is used for the tupledesc, for rules, for policies, and for the
partition key and descriptor, and I suspect it could be profitably
used for other things that we don't currently bother to cache in the
relcache because each query would have to copy/rebuild them anyway.
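The build-compare-swap technique can be sketched in miniature like this. This is a simplified model, not the actual relcache code; the names are invented:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cached sub-structure (e.g. a tupledesc). */
typedef struct Desc {
    int natts;
    int attlen[8];
} Desc;

static bool desc_equal(const Desc *a, const Desc *b)
{
    return a->natts == b->natts &&
           memcmp(a->attlen, b->attlen, sizeof(int) * a->natts) == 0;
}

/*
 * On invalidation, rebuild from the catalogs (here: 'rebuilt'), but swap
 * the new copy in only if it actually differs; otherwise keep the old one
 * so that pointers held elsewhere remain valid.
 */
static Desc *maybe_swap(Desc *old, Desc *rebuilt)
{
    if (desc_equal(old, rebuilt))
    {
        free(rebuilt);      /* logically unchanged: discard the rebuild */
        return old;
    }
    free(old);
    return rebuilt;
}
```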

> I'm disinclined to monkey with the way this works without someone
> presenting hard evidence that it creates enough of a performance problem
> to be worth spending a significant amount of time changing it.  Up to
> now I don't think I've ever noticed plancat.c being a large fraction
> of planner runtime, though of course that might depend on the test case.

Well, I got concerned about this looking at Amit's patch for faster
partition pruning.  It tried to work around this problem by reopening
the relation elsewhere in the planner, but I didn't think you'd like
that approach, either.  (I don't.)  We can certainly handle that
problem in the same way we've handled previous ones - viz. just copy
more stuff - but at some point that's going to start costing enough to
matter.  And I'm quite sure that the only reason we have to do that
copying is that we close the relation, because we've already done the
work to keep the relevant structures stable as long as the relation is
open and locked.  The executor is already relying on that guarantee.
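The open-and-locked stability guarantee can be modeled roughly like this (again a toy, not the real relcache refcounting code): while an entry is pinned, an invalidation rebuilds it in place rather than destroying it, so the holder's pointer stays valid.

```c
#include <assert.h>

/* Hypothetical cache entry with a reference count, as the relcache keeps. */
typedef struct PinnedEntry {
    int refcount;
    int generation;   /* bumped on each in-place rebuild */
} PinnedEntry;

static void entry_open(PinnedEntry *e)  { e->refcount++; }
static void entry_close(PinnedEntry *e) { e->refcount--; }

/*
 * On an invalidation event, a pinned entry is rebuilt in place (so its
 * address stays valid for holders of the pin); an unpinned entry could
 * be freed and reallocated, so pointers to it must not be retained.
 */
static void invalidate(PinnedEntry *e)
{
    if (e->refcount > 0)
        e->generation++;  /* rebuild in place: pointer stays valid */
    /* else: the entry may be destroyed entirely (not modeled here) */
}
```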

If we're going to have to change this at some point (and I bet we
are), I'd rather do it before people jam even more stuff into the
current system than wait until it gets totally out of hand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: RelOptInfo -> Relation

From: Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Feb 2, 2018 at 7:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm disinclined to monkey with the way this works without someone
>> presenting hard evidence that it creates enough of a performance problem
>> to be worth spending a significant amount of time changing it.  Up to
>> now I don't think I've ever noticed plancat.c being a large fraction
>> of planner runtime, though of course that might depend on the test case.

> If we're going to have to change this at some point (and I bet we
> are), I'd rather do it before people jam even more stuff into the
> current system than wait until it gets totally out of hand.

While I'm prepared to believe that this *could* be a problem, I repeat
that you've offered no hard evidence that it actually *is* a problem,
or might become one in the future.  We could easily expend a significant
amount of effort here for no real return.

            regards, tom lane


Re: RelOptInfo -> Relation

From: Robert Haas
Date:
On Tue, Feb 6, 2018 at 2:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we're going to have to change this at some point (and I bet we
>> are), I'd rather do it before people jam even more stuff into the
>> current system than wait until it gets totally out of hand.
>
> While I'm prepared to believe that this *could* be a problem, I repeat
> that you've offered no hard evidence that it actually *is* a problem,
> or might become one in the future.  We could easily expend a significant
> amount of effort here for no real return.

Sure.  The point of the email was to ask whether there was some
consideration of which I had not thought which makes this utterly
unthinkable; the answer at least so far seems to be "no".

I think it will be interesting to see what happens with the faster
partition pruning patch.  If it doesn't run into a problem with this,
then there's no urgent need to do anything here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company