Hi,
I just saw this got committed and wanted to briefly play with it. It works
nicely!
Except that at first I tried this in a debugging build, and was briefly rather
dismayed by the performance. It was really slow. But it's not really related
to repack / the patches here.
Turns out that it is to a good part due to
heap_insert()
->CacheInvalidateHeapTuple()
->CacheInvalidateHeapTupleCommon()
->AssertCouldGetRelation()
not being cheap and running a *lot*.
Admittedly it's way worse if you build with -O0, which I tend to do to make
debugging easier.
In that config, the assert single-handled increases the time for a repack by
35% or so.
Noah, is there any reason we need to do the AssertCouldGetRelation() before
the !IsCatalogRelation(relation)? Given that the goal is to make
RelationGetRelid() safe, it doesn't seem there is?
It's totally valid to not have done so initially, this is a quite complicated
feature:
I saw this is using individual heap_insert()s during the
heapam_relation_copy_for_cluster(). Doing individual WAL logged inserts isn't
exactly cheap or efficient from a WAL volume perspective...
Is there anything other than round tuits preventing us from using
multi_insert?
That actually would also reduce the cost in the REPACK decoding worker, due to
having to parse far fewer WAL records.
Greetings,
Andres Freund