Thread: tick buildfarm failure

tick buildfarm failure

From
Stephen Frost
Date:
All,
 I've been keeping an eye on tick as it failed a day-or-so ago and it looks to be related to RLS.  Using a local
CLFAGS="-DCLOBBER_CACHE_ALWAYS-DRANDOMIZE_ALLOCATED_MEMORY" build, I was able to see the regression tests failing in
check_role_for_policy()due to a pretty clear reset of the memory used for the policies. 
 Looking through relcache.c, which I have to admit to not being as familiar with as some, the issue becomes rather
apparent(I believe)- RelationClearRelation() hasn't been set up to handle the RLS policies specially, as it does for
rulesand tupledesc.  Fair enough, it's reasonably straight-forward to add an equalPolicies() and handle the policies
thesame way- but I'm left wondering if that's actually *safe* to do? 
 To be a bit more clear- why is it safe to change the contents if the equal() function returns 'false'?  I'm guessing
theanswer is "locking"- whereby such a change would imply a lock was acquired on the relation by another backend, which
shouldn'tbe possible if we're currently working with it, but wanted to double-check that.  I also wonder if we might be
betteroff with a way to identify that nothing has actually been changed due to the locks we hold and avoid rebuilding
therelation cache entry entirely.. 
     Thanks!
    Stephen

Re: tick buildfarm failure

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
>   To be a bit more clear- why is it safe to change the contents if the
>   equal() function returns 'false'?  I'm guessing the answer is
>   "locking"- whereby such a change would imply a lock was acquired on
>   the relation by another backend, which shouldn't be possible if we're
>   currently working with it, but wanted to double-check that.

Right.  If that were to fail, it would be the fault of something else
not having gotten sufficient lock for whatever it had been doing: either
lack of exclusive lock for an RLS update, or lack of an access lock to
examine the cached data.  The reason we want to make the equal() check
rather than just summarily replace the data is that code that *does*
hold AccessShare might reasonably expect the data to not move while it's
looking at it. 
>   I also wonder if we might be better off with a way to identify that
>   nothing has actually been changed due to the locks we hold and avoid
>   rebuilding the relation cache entry entirely..

I am entirely disinclined to reinvent relcache as part of the RLS patch.
You've got enough problems.
        regards, tom lane



Re: tick buildfarm failure

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> >   To be a bit more clear- why is it safe to change the contents if the
> >   equal() function returns 'false'?  I'm guessing the answer is
> >   "locking"- whereby such a change would imply a lock was acquired on
> >   the relation by another backend, which shouldn't be possible if we're
> >   currently working with it, but wanted to double-check that.
>
> Right.  If that were to fail, it would be the fault of something else
> not having gotten sufficient lock for whatever it had been doing: either
> lack of exclusive lock for an RLS update, or lack of an access lock to
> examine the cached data.  The reason we want to make the equal() check
> rather than just summarily replace the data is that code that *does*
> hold AccessShare might reasonably expect the data to not move while it's
> looking at it.

Ok, great, thanks for the confirmation.  Yes- the RLS code wasn't
anticipating a change happening underneath it such as this (as other
places didn't appear worried about same, I didn't expect to see it be an
issue).  No problem, I'll definitely address it.

> >   I also wonder if we might be better off with a way to identify that
> >   nothing has actually been changed due to the locks we hold and avoid
> >   rebuilding the relation cache entry entirely..
>
> I am entirely disinclined to reinvent relcache as part of the RLS patch.

Apologies- I hadn't intend to even suggest that but rather to speculate
about this possibility for future improvements.
Thanks again!
    Stephen