Re: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c) - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c) |
Date | |
Msg-id | dce5799d-3a15-8ede-0b43-12838d61fc85@enterprisedb.com Whole thread Raw |
In response to | Re: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c) (Ranier Vilela <ranier.vf@gmail.com>) |
List | pgsql-hackers |
On 12/29/23 14:53, Ranier Vilela wrote: > > > Em sex., 29 de dez. de 2023 às 10:33, Tomas Vondra > <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>> > escreveu: > > > > On 12/29/23 12:53, Ranier Vilela wrote: > > Em qui., 28 de dez. de 2023 às 22:16, Tomas Vondra > > <tomas.vondra@enterprisedb.com > <mailto:tomas.vondra@enterprisedb.com> > <mailto:tomas.vondra@enterprisedb.com > <mailto:tomas.vondra@enterprisedb.com>>> > > escreveu: > > > > > > > > On 12/27/23 12:37, Ranier Vilela wrote: > > > Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra > > > <tomas.vondra@enterprisedb.com > <mailto:tomas.vondra@enterprisedb.com> > > <mailto:tomas.vondra@enterprisedb.com > <mailto:tomas.vondra@enterprisedb.com>> > > <mailto:tomas.vondra@enterprisedb.com > <mailto:tomas.vondra@enterprisedb.com> > > <mailto:tomas.vondra@enterprisedb.com > <mailto:tomas.vondra@enterprisedb.com>>>> > > > escreveu: > > > > > > > > > > > > On 12/26/23 19:10, Ranier Vilela wrote: > > > > Hi, > > > > > > > > The commit b437571 > > > <http://b437571714707bc6466abde1a0af5e69aaade09c > <http://b437571714707bc6466abde1a0af5e69aaade09c> > > <http://b437571714707bc6466abde1a0af5e69aaade09c > <http://b437571714707bc6466abde1a0af5e69aaade09c>> > > > <http://b437571714707bc6466abde1a0af5e69aaade09c > <http://b437571714707bc6466abde1a0af5e69aaade09c> > > <http://b437571714707bc6466abde1a0af5e69aaade09c > <http://b437571714707bc6466abde1a0af5e69aaade09c>>>> I > > > > think has an oversight. > > > > When allocate memory and initialize private spool in > function: > > > > _brin_leader_participate_as_worker > > > > > > > > The behavior is the bs_spool (heap and index fields) > > > > are left empty. > > > > > > > > The code affected is: > > > > buildstate->bs_spool = (BrinSpool *) > > palloc0(sizeof(BrinSpool)); > > > > - buildstate->bs_spool->heap = buildstate->bs_spool->heap; > > > > - buildstate->bs_spool->index = > buildstate->bs_spool->index; > > > > + buildstate->bs_spool->heap = heap; > > > > + buildstate->bs_spool->index = index; > > > > > > > > Is the fix correct? > > > > > > > > > > Thanks for noticing this. > > > > > > You're welcome. > > > > > > > > > Yes, I believe this is a bug - the assignments > > > are certainly wrong, it leaves the fields set to NULL. > > > > > > I wonder how come this didn't fail during testing. > Surely, if > > the leader > > > participates as a worker, the tuplesort_begin_index_brin > shall > > be called > > > with heap/index being NULL, leading to some failure > during the > > sort. But > > > maybe this means we don't actually need the heap/index > fields, > > it's just > > > a copy of TuplesortIndexArg, but BRIN does not need that > > because we sort > > > the tuples by blkno, and we don't need the descriptors > for that. > > > > > > Unfortunately I can't test on Windows, since I can't build with > > meson on > > > Windows. > > > > > > > > > In any case, the _brin_parallel_scan_and_build does not > > actually need > > > the separate heap/index arguments, those are already in > the spool. > > > > > > Yeah, for sure. > > > > > > > > > I'll try to figure out if we want to simplify the > tuplesort or > > remove > > > the arguments from _brin_parallel_scan_and_build. > > > > > > > Here is a patch simplifying the BRIN parallel create code a > little bit. > > As I suspected, we don't need the heap/index in the spool at > all, and we > > don't need to pass it to tuplesort_begin_index_brin either - > we only > > need blkno, and we have that in the datum1 field. This also > means we > > don't need TuplesortIndexBrinArg. > > > > With Windows 10, msvc 2022, compile end pass ninja test. > > > > But, if you allow me, I would like to try another approach to > > simplification. > > Instead of increasing the arguments in the call, wouldn't it be better > > to decrease them > > and this way all arguments will be passed in the registers instead > of on > > a stack? > > > > If this was beneficial, we'd be passing everything through structs and > not as explicit arguments. But we don't. If you're arguing it's > beneficial in this case, it'd be good to see it demonstrated. > > Please see the https://www.agner.org/optimize/optimizing_cpp.pdf > <https://www.agner.org/optimize/optimizing_cpp.pdf> > Excerpt: > "Use 64-bit mode > Parameter transfer is more efficient in 64-bit mode than in 32-bit mode, > and more efficient in 64-bit Linux than in 64-bit Windows. In 64-bit > Linux, the first six integer parameters and the first eight floating > point parameters are transferred in registers, totaling up to fourteen > register parameters. In 64-bit Windows, the first four parameters are > transferred in registers, regardless of whether they are integers or > floating point numbers." > > With function: > _brin_parallel_scan_and_build(buildstate, buildstate->bs_spool, > brinshared, sharedsort, heapRel, indexRel, sortmem, false); > We have: > Linux -> six first parameters in registers and two parameters in stack > Windows -> four parameters in registers and four parameters in stack > I suggested you demonstrate this actually makes a difference in practice. Quoting a document is not that. Also, that document is about C++, and while C and C++ are very close, I wouldn't be surprised if there were differences. Furthermore, that section talks about integer/floating point arguments, while we're dealing with pointers, and it's not clear if that changes something (the document has a separate section about pointers/references, which suggests pointers and integers are not 100% the same thing). And finally, I haven't tried disassembling the code, but I'd be quite surprised if these things were not heavily dependent on the compiler and/or optimization level. > > > bs_spool may well contain this data and will probably be useful in the > > future. > > > > I made a v1 version, based on your patch, for your consideration. > > > > I did actually consider doing it this way yesterday, but I don't like > this approach. I don't believe it's faster (and even if it was, the > difference is going to be negligible), and parameters hidden in some > struct increase the cognitive load. I like explicit arguments. > > Personally I prefer data in structs, of course, > always thinking about size and alignment, to optimize loading. > As I said, I think this is quite irrelevant because we'll call the function maybe 10-times during the whole index build. With millions of other function calls. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: