Re: Custom tuplesorts for extensions - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Custom tuplesorts for extensions |
Date | |
Msg-id | 20220706150139.sidv4zcu7ipcklth@awork3.anarazel.de Whole thread Raw |
In response to | Re: Custom tuplesorts for extensions (Maxim Orlov <orlovmg@gmail.com>) |
Responses |
Re: Custom tuplesorts for extensions
|
List | pgsql-hackers |
Hi, I think this needs to be evaluated for performance... I agree with the nearby comment that the commits need a bit of justification at least to review them. On 2022-06-23 15:12:27 +0300, Maxim Orlov wrote: > From 03b78cdade3b86a0e97723721fa1d0bd64d0c7df Mon Sep 17 00:00:00 2001 > From: Alexander Korotkov <akorotkov@postgresql.org> > Date: Tue, 21 Jun 2022 13:28:27 +0300 > Subject: [PATCH v2 1/6] Remove Tuplesortstate.copytup Yea. I was recently complaining about the pointlessness of copytup. > From 1d78e271b22d7c6a1557defbe15ea5039ff28510 Mon Sep 17 00:00:00 2001 > From: Alexander Korotkov <akorotkov@postgresql.org> > Date: Tue, 21 Jun 2022 14:03:13 +0300 > Subject: [PATCH v2 2/6] Tuplesortstate.getdatum1 method Hm. This adds a bunch of indirect function calls were there previously weren't. > From 494d46dcf938e5f824a498e38ce77782751208e1 Mon Sep 17 00:00:00 2001 > From: Alexander Korotkov <akorotkov@postgresql.org> > Date: Tue, 21 Jun 2022 14:13:56 +0300 > Subject: [PATCH v2 3/6] Put abbreviation logic into puttuple_common() There's definitely a lot of redundancy removed... But the list of branches in puttuple_common() grew. Perhaps we instead can add a few flags to putttuple_common() that determine whether abbreviation should happen, so that we only do the work necessary for the "type" of sort? > From ee2dd46b07d62e13ed66b5a38272fb5667c943f3 Mon Sep 17 00:00:00 2001 > From: Alexander Korotkov <akorotkov@postgresql.org> > Date: Wed, 22 Jun 2022 00:14:51 +0300 > Subject: [PATCH v2 4/6] Move freeing memory away from writetup() Seems to do more than just moving freeing around? > @@ -1973,8 +1963,13 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull) > static void > puttuple_common(Tuplesortstate *state, SortTuple *tuple) > { > + MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext); > + > Assert(!LEADER(state)); > > + if (tuple->tuple != NULL) > + USEMEM(state, GetMemoryChunkSpace(tuple->tuple)); > + Adding even more branches into common code... > From 3a0e1fa7c7e4da46a86f7d5b9dd0392549f3b460 Mon Sep 17 00:00:00 2001 > From: Alexander Korotkov <akorotkov@postgresql.org> > Date: Wed, 22 Jun 2022 18:11:26 +0300 > Subject: [PATCH v2 5/6] Reorganize data structures Hard to know what this is trying to achieve. > -struct Tuplesortstate > +struct TuplesortOps > { > - TupSortStatus status; /* enumerated value as shown above */ > - int nKeys; /* number of columns in sort key */ > - int sortopt; /* Bitmask of flags used to setup sort */ > - bool bounded; /* did caller specify a maximum number of > - * tuples to return? */ > - bool boundUsed; /* true if we made use of a bounded heap */ > - int bound; /* if bounded, the maximum number of tuples */ > - bool tuples; /* Can SortTuple.tuple ever be set? */ > - int64 availMem; /* remaining memory available, in bytes */ > - int64 allowedMem; /* total memory allowed, in bytes */ > - int maxTapes; /* max number of input tapes to merge in each > - * pass */ > - int64 maxSpace; /* maximum amount of space occupied among sort > - * of groups, either in-memory or on-disk */ > - bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk > - * space, false when it's value for in-memory > - * space */ > - TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */ > MemoryContext maincontext; /* memory context for tuple sort metadata that > * persists across multiple batches */ > MemoryContext sortcontext; /* memory context holding most sort data */ > MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */ > - LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */ > > /* > * These function pointers decouple the routines that must know what kind To me it seems odd to have memory contexts and similar things in a datastructure calls *Ops. > From b06bcb5f3666f0541dfcc27c9c8462af2b5ec9e0 Mon Sep 17 00:00:00 2001 > From: Alexander Korotkov <akorotkov@postgresql.org> > Date: Wed, 22 Jun 2022 21:48:05 +0300 > Subject: [PATCH v2 6/6] Split tuplesortops.c I strongly suspect this will cause a slowdown. There was potential for cross-function optimization that's now removed. Greetings, Andres Freund
pgsql-hackers by date: