Re: Temporary tables versus wraparound... again - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Temporary tables versus wraparound... again
Date
Msg-id CAM-w4HPSygr7AL+CmmkOFLx4NO41ZAfLoSK4ZG0CDO=SrqvPEA@mail.gmail.com
Whole thread Raw
In response to Re: Temporary tables versus wraparound... again  (Andres Freund <andres@anarazel.de>)
Responses Re: Temporary tables versus wraparound... again
List pgsql-hackers
On Wed, 5 Apr 2023 at 13:42, Andres Freund <andres@anarazel.de> wrote:

>
> Somehow it doesn't feel right to use vac_update_relstats() in
> heapam_handler.c.
>
> I also don't like that your patch references
> heapam_relation_nontransactional_truncate in AddNewRelationTuple() - we
> shouldn't add more comments piercing tableam than necessary.

I'm really puzzled because this does look like it was in the last
patch on the mailing list archive. But it's definitely not the code I
have here. I guess I did some cleanup that I never posted, so sorry.

I've attached patches using GetOldestNonRemovableTransactinId() and it
seems to have fixed the race condition here. At least I can't
reproduce it any more.



> Not if you determine a relation specific xmin, and the relation is not a
> shared relation.
>
> ISTM that the problem here really is that you're relying on RecentXmin, rather
> than computing something more accurate. Why not use
> GetOldestNonRemovableTransactionId(rel) - It's a bit more expensive, but I
> don't think it'll matter compared to the cost of truncating the relation.

I am a bit nervous about the overhead here because if your transaction
touched *any* temporary tables then this gets called for *every*
temporary table with ON COMMIT DELETE. That could be a lot and it's
not obvious to users that having temporary tables will impose an
overhead even if they're not actually using them.

So I went ahead and used GetOldestNonRemovableTransactionId and tried
to do some profiling. But this is on a cassert enabled build with -O0
so it's not serious profiling. I can repeat it on a real build if it
matters. But it's been a long time since I've read gprof output. This
is for -F PreCommit_on_commit_actions so the percentages are as a
percent of just the precommit cleanup:

index % time    self  children    called     name
                0.00    0.00   10102/10102       CommitTransaction (1051)
[1]    100.0    0.01   31.47   10102         PreCommit_on_commit_actions [1]
                0.01   31.43   10100/10100       heap_truncate [2]
                0.00    0.03 1005050/1005260     lappend_oid [325]
-----------------------------------------------
                0.01   31.43   10100/10100       PreCommit_on_commit_actions [1]
[2]     99.9    0.01   31.43   10100         heap_truncate [2]
                0.09   27.30 1005050/1005050     heap_truncate_one_rel [3]
                0.20    3.57 1005050/6087120     table_open <cycle 1> [465]
                0.01    0.22 1005050/6045137     table_close [48]
                0.00    0.03 1005050/1017744     lappend [322]
                0.01    0.00   10100/10100       heap_truncate_check_FKs [425]
-----------------------------------------------
                0.09   27.30 1005050/1005050     heap_truncate [2]
[3]     87.0    0.09   27.30 1005050         heap_truncate_one_rel [3]
                0.02   12.23 1005050/1005050     RelationTruncateIndexes [5]
                0.06   10.08 1005050/1005050     ResetVacStats [7]
                0.03    4.89 1005050/1005050
table_relation_nontransactional_truncate [12]

I think this is saying that more than half the time is being spent
just checking for indexes. There were no indexes on these temporary
tables. Does not having any indexes cause the relcache treat it as a
cache miss every time?

                0.06   10.08 1005050/1005050     heap_truncate_one_rel [3]
[7]     32.2    0.06   10.08 1005050         ResetVacStats [7]
                0.02    3.83 1005050/1005250     SearchSysCacheCopy [16]
                0.20    3.57 1005050/6087120     table_open <cycle 1> [465]
                0.01    2.02 1005050/1005050     heap_inplace_update [35]
                0.01    0.22 1005050/6045137     table_close [48]
                0.00    0.20 1005050/1005150
GetOldestNonRemovableTransactionId [143]
                0.00    0.01 1005050/1005150     GetOurOldestMultiXactId [421]
                0.00    0.00 1005050/1008750     ObjectIdGetDatum [816]

I guess this means GetOldestNonRemovableTransactionId is not the main
cost in ResetVacStats though I don't understand why the syscache would
be so slow.

I think there's a facility for calculating the Horizons and then
reusing them for a while but I don't see how to use that here. It
would be appropriate I think.


>
> > Honestly I'm glad I wrote the test because it was hard to know whether
> > my code was doing anything at all without it (and it wasn't in the
> > first cut...) But I don't think there's much value in having it be in
> > the regression suite. We don't generally write tests to ensure that a
> > specific internal implementation behaves in the specific way it was
> > written to.
>
> To me it seems important to test that your change actually does what it
> intends to. Possibly the test needs to be relaxed some, but I do think we want
> tests for the change.
>
> Greetings,
>
> Andres Freund



--
greg

Attachment

pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: Clean up hba.c of code freeing regexps
Next
From: Thomas Munro
Date:
Subject: Re: Direct I/O