Thread: Sort support for macaddr8
Peter and I implemented this small (attached) patch to extend
abbreviated key compare sort to macaddr8 datatype (currently supported
for macaddr).
I tried checking to see if there is a performance difference using the
attached DDL based on src/test/regress/sql/macaddr8.sql. I found
that the sort function is only exercised when creating an index (not,
for example, when doing some type of aggregation).
With the patch applied to current master and using the DDL attached,
the timing for creating the index hovered around 20 ms for master and
15 ms for the patched version.
Machine and version specs: PostgreSQL 12beta1 on x86_64-pc-linux-gnu
compiled by gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, 64-bit
I was just wondering what the accepted way to test and share
performance numbers is for a small patch like this. Is sharing DDL
enough? Do I need to use pg_bench?
--
abbreviated key compare sort to macaddr8 datatype (currently supported
for macaddr).
I tried checking to see if there is a performance difference using the
attached DDL based on src/test/regress/sql/macaddr8.sql. I found
that the sort function is only exercised when creating an index (not,
for example, when doing some type of aggregation).
With the patch applied to current master and using the DDL attached,
the timing for creating the index hovered around 20 ms for master and
15 ms for the patched version.
Machine and version specs: PostgreSQL 12beta1 on x86_64-pc-linux-gnu
compiled by gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, 64-bit
I think that that seems like an improvement. I was thinking of
registering this patch for the next commitfest. Is that okay?I was just wondering what the accepted way to test and share
performance numbers is for a small patch like this. Is sharing DDL
enough? Do I need to use pg_bench?
--
Melanie Plageman
Attachment
On 6/3/19 3:23 PM, Melanie Plageman wrote: > Peter and I implemented this small (attached) patch to extend > abbreviated key compare sort to macaddr8 datatype (currently supported > for macaddr). Am I going cross-eyed, or would the memset be serving more of a purpose if it were in the SIZEOF_DATUM != 8 branch? Regards, -Chap
On Mon, Jun 3, 2019 at 1:17 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > I tried checking to see if there is a performance difference using the > attached DDL based on src/test/regress/sql/macaddr8.sql. I found > that the sort function is only exercised when creating an index (not, > for example, when doing some type of aggregation). As you know, it's a bit weird that we're proposing adding sort support with abbreviated keys for a type that is 8 bytes, since you'd expect it to also be pass-by-value on most platforms, which largely defeats the purpose of having abbreviated keys (though sort support could still make sense, for the same reason it makes sense to have it for int8). However, macaddr8 isn't actually pass-by-value, and it seems too late to do anything about that now, so abbreviated keys actually make sense. > With the patch applied to current master and using the DDL attached, > the timing for creating the index hovered around 20 ms for master and > 15 ms for the patched version. I would expect a sufficiently large sort to execute in about half the time with abbreviation, based on previous experience. However, I think that this patch can be justified in a relatively straightforward way. It extends sort support for macaddr to macaddr8, since these two types are almost identical in every other way. We should still validate the performance out of an abundance of caution, but I would be very surprised if there was much difference between the macaddr and macaddr8 cases. In short, users should not be surprised by the big gap in performance between macaddr and macaddr8. It's worth being consistent there. > I think that that seems like an improvement. I was thinking of > registering this patch for the next commitfest. Is that okay? Definitely, yes. > I was just wondering what the accepted way to test and share > performance numbers is for a small patch like this. Is sharing DDL > enough? Do I need to use pg_bench? I've always used custom microbenchmarks for stuff like this. Repeatedly executing a particular query and taking the median execution time as representative seems to be the best approach. -- Peter Geoghegan
On Mon, Jun 3, 2019 at 2:03 PM Chapman Flack <chap@anastigmatix.net> wrote: > Am I going cross-eyed, or would the memset be serving more of a purpose > if it were in the SIZEOF_DATUM != 8 branch? No, it wouldn't -- that's the correct place for it with the macaddr type. However, it isn't actually necessary to memset() at the equivalent point for macaddr8, since we cannot "run out of bytes from the authoritative representation" that go in the Datum/abbreviated key. I suppose that the memset() should simply be removed, since it is superfluous here. -- Peter Geoghegan
On 6/3/19 5:03 PM, Chapman Flack wrote: > On 6/3/19 3:23 PM, Melanie Plageman wrote: >> Peter and I implemented this small (attached) patch to extend >> abbreviated key compare sort to macaddr8 datatype (currently supported >> for macaddr). > > Am I going cross-eyed, or would the memset be serving more of a purpose > if it were in the SIZEOF_DATUM != 8 branch? It looks like a copy-pasto coming from mac.c, where the size of the thing to be compared isn't itself 8 bytes. With sizeof(macaddr) being 6, that original code may have had these cases in mind: - SIZEOF_DATUM is something smaller than 6 (likely 4). The whole key doesn't fit, but that's ok, because abbreviated "equality" just means to recheck with the authoritative routine. - SIZEOF_DATUM is exactly 6. Probably not a thing. - SIZEOF_DATUM is anything larger than 6 (likely 8). Needs the memset. Also, in this case, abbreviated "equality" could be taken as true equality, never needing the authoritative fallback. For macaddr8, the cases morph into these: - SIZEOF_DATUM is something smaller than 8 (likely 4). Ok; it's just an abbreviation. - SIZEOF_DATUM is exactly 8. Now an actual thing, even likely. - SIZEOF_DATUM is larger than 8. Our flying cars run postgres, and we need the memset to make sure they don't crash. This leaves me with a couple of questions: 1. (This one seems like a bug.) In the little-endian case, if SIZEOF_DATUM is smaller than the type, I'm not convinced by doing the DatumBigEndianToNative() after the memcpy(). Seems like that's too late to make sure the most-significant bytes got copied. 2. (This one seems like an API opportunity.) If it becomes common to add abbreviation support for smallish types such that (as here, when SIZEOF_DATUM >= 8), an abbreviated "equality" result is in fact authoritative, would it be worthwhile to have some way for the sort support routine to announce that fact to the caller? That could spare the caller the effort of re-checking with the authoritative routine. It could also (by making the equality case less costly) end up changing the weight assigned to the cardinality estimate in deciding whether to abbrev.. Regards, -Chap
On 6/3/19 5:59 PM, Chapman Flack wrote: > On 6/3/19 5:03 PM, Chapman Flack wrote: > 1. (This one seems like a bug.) In the little-endian case, if > SIZEOF_DATUM is smaller than the type, I'm not convinced by doing > the DatumBigEndianToNative() after the memcpy(). Seems like that's > too late to make sure the most-significant bytes got copied. Wait, I definitely was cross-eyed for that one. It's the abbreviated copy whose endianness varies. Never mind. Regards, -Chap
On Mon, Jun 3, 2019 at 2:59 PM Chapman Flack <chap@anastigmatix.net> wrote: > 1. (This one seems like a bug.) In the little-endian case, if > SIZEOF_DATUM is smaller than the type, I'm not convinced by doing > the DatumBigEndianToNative() after the memcpy(). Seems like that's > too late to make sure the most-significant bytes got copied. Uh, when else would you do it? Before the memcpy()? > 2. (This one seems like an API opportunity.) If it becomes common to > add abbreviation support for smallish types such that (as here, > when SIZEOF_DATUM >= 8), an abbreviated "equality" result is in fact > authoritative, would it be worthwhile to have some way for the sort > support routine to announce that fact to the caller? That could > spare the caller the effort of re-checking with the authoritative > routine. It's possible that that would make sense, but I don't think that this patch needs to do that. There is at least one pre-existing cases that does this -- macaddr. -- Peter Geoghegan
Greetings, * Melanie Plageman (melanieplageman@gmail.com) wrote: > Peter and I implemented this small (attached) patch to extend > abbreviated key compare sort to macaddr8 datatype (currently supported > for macaddr). Nice. > I think that that seems like an improvement. I was thinking of > registering this patch for the next commitfest. Is that okay? Sure. > I was just wondering what the accepted way to test and share > performance numbers is for a small patch like this. Is sharing DDL > enough? Do I need to use pg_bench? Detailed (enough... doesn't need to include timing of every indivudal query, but something like the average timing across 5 runs or similar would be good) results posted to this list, with enough information about how to reproduce the tests, would be the way to go. The idea is to let others also test and make sure that they come up with similar results to yours, and if they don't, ideally have enough information to narrow down what's happening / what's different. Thanks! Stephen
Attachment
On Mon, Jun 3, 2019 at 2:39 PM Peter Geoghegan <pg@bowt.ie> wrote:
As you know, it's a bit weird that we're proposing adding sort support
with abbreviated keys for a type that is 8 bytes, since you'd expect
it to also be pass-by-value on most platforms, which largely defeats
the purpose of having abbreviated keys (though sort support could
still make sense, for the same reason it makes sense to have it for
int8). However, macaddr8 isn't actually pass-by-value, and it seems
too late to do anything about that now, so abbreviated keys actually
make sense.
so, if what you say is true and it is either not worth it or
potentially a breaking change to make macaddr8 pass-by-value and
adding abbreviated sort support has the potential to avoid "pointer
chasing" and guarantee equivalent relative performance for macaddr8
and macaddr, then that seems worth it.
With regard to macaddr8_abbrev_convert() and memset(), I attached a patch
with the memset() removed, since it is superfluous here.
macaddr8_cmp_internal() already existed before this patch and I noticed
that it explicitly returns int32 whereas the return type of
macaddr_cmp_internal() is just specified as an int. I was wondering why.
I also noticed that the prototype for macaddr8_cmp_internal() was not
at the top of the file with the other static function prototypes. I
added it there, but I wasn't sure if there was some reason that it was
like that to begin with.
potentially a breaking change to make macaddr8 pass-by-value and
adding abbreviated sort support has the potential to avoid "pointer
chasing" and guarantee equivalent relative performance for macaddr8
and macaddr, then that seems worth it.
With regard to macaddr8_abbrev_convert() and memset(), I attached a patch
with the memset() removed, since it is superfluous here.
macaddr8_cmp_internal() already existed before this patch and I noticed
that it explicitly returns int32 whereas the return type of
macaddr_cmp_internal() is just specified as an int. I was wondering why.
I also noticed that the prototype for macaddr8_cmp_internal() was not
at the top of the file with the other static function prototypes. I
added it there, but I wasn't sure if there was some reason that it was
like that to begin with.
--
Melanie Plageman
Attachment
On Tue, Jun 04, 2019 at 01:49:23PM -0400, Stephen Frost wrote: >Greetings, > >* Melanie Plageman (melanieplageman@gmail.com) wrote: >> Peter and I implemented this small (attached) patch to extend >> abbreviated key compare sort to macaddr8 datatype (currently supported >> for macaddr). > >Nice. > >> I think that that seems like an improvement. I was thinking of >> registering this patch for the next commitfest. Is that okay? > >Sure. > >> I was just wondering what the accepted way to test and share >> performance numbers is for a small patch like this. Is sharing DDL >> enough? Do I need to use pg_bench? > >Detailed (enough... doesn't need to include timing of every indivudal >query, but something like the average timing across 5 runs or similar >would be good) results posted to this list, with enough information >about how to reproduce the tests, would be the way to go. > >The idea is to let others also test and make sure that they come up with >similar results to yours, and if they don't, ideally have enough >information to narrow down what's happening / what's different. > Yeah, there's no "approved way" to do performance tests the contributors would have to follow. That applies both to tooling and how detailed the data need/should be. Ultimately, the goal is to convince others (and yourself) that the change is an improvement. Does a simple pgbench script achieve that? Cool, use that. Do you need something more complex? Sure, do a shell script or something like that. As long as others can reasonably reproduce your tests, it's fine. For me, the most critical part of benchmarking a change is deciding what to test - which queries, data sets, what amounts of data, config, etc. For example, the data set you used has ~12k rows. Does the behavior change with 10x or 100x that? It probably does not make sense to go above available RAM (the I/O costs are likely to make everything else mostly irrelevant), but CPU caches may matter a lot. Different work_mem (and maintenance_work_mem) values may be useful too. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Jun-03, Peter Geoghegan wrote: > As you know, it's a bit weird that we're proposing adding sort support > with abbreviated keys for a type that is 8 bytes, since you'd expect > it to also be pass-by-value on most platforms, which largely defeats > the purpose of having abbreviated keys (though sort support could > still make sense, for the same reason it makes sense to have it for > int8). However, macaddr8 isn't actually pass-by-value, and it seems > too late to do anything about that now, so abbreviated keys actually > make sense. I'm not sure I understand why you say it's too late to change now. Surely the on-disk representation doesn't actually change, so it is not impossible to change? And you make it sound like doing that change is worthwhile, performance-wise. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2019-06-04 17:37:35 -0400, Alvaro Herrera wrote: > On 2019-Jun-03, Peter Geoghegan wrote: > > As you know, it's a bit weird that we're proposing adding sort support > > with abbreviated keys for a type that is 8 bytes, since you'd expect > > it to also be pass-by-value on most platforms, which largely defeats > > the purpose of having abbreviated keys (though sort support could > > still make sense, for the same reason it makes sense to have it for > > int8). However, macaddr8 isn't actually pass-by-value, and it seems > > too late to do anything about that now, so abbreviated keys actually > > make sense. > > I'm not sure I understand why you say it's too late to change now. > Surely the on-disk representation doesn't actually change, so it is not > impossible to change? And you make it sound like doing that change is > worthwhile, performance-wise. Yea, I don't immediately see a problem with doing that on a major version boundary. Obviously that'd only be possible for sizeof(Datum) == 8 == sizeof(macaddr8) platforms, but that's the vast majority these days. And generally, I think it's just about *always* worth to go for a pass-by-value for the cases where that doesn't imply space wastage. I think it might be worthwhile to optimize things so that all typlen > 0 && typlen <= sizeof(Datum) are allowed for byval datums. SELECT typname, typlen FROM pg_type WHERE typlen > 0 AND typlen <= 8 AND NOT typbyval; ┌──────────┬────────┐ │ typname │ typlen │ ├──────────┼────────┤ │ tid │ 6 │ │ macaddr │ 6 │ │ macaddr8 │ 8 │ └──────────┴────────┘ (3 rows) Seems like adding byval support for sizes outside of 1/2/4/8 bytes would be worthwhile for tid alone. Not sure whether there's extensions with signifcant use that have fixed-width types <= 8 bytes. Greetings, Andres Freund
Hi, On 2019-06-04 14:55:16 -0700, Andres Freund wrote: > On 2019-06-04 17:37:35 -0400, Alvaro Herrera wrote: > I think it might be worthwhile to optimize things so that all typlen > 0 > && typlen <= sizeof(Datum) are allowed for byval datums. Maybe even just deprecate specifying byval at CREATE TYPE time, and instead automatically infer it from the type length. We've had a number of blunders around this, and I can't really see any reason for specifying byval = false when we internally could treat it as a byval. Greetings, Andres Freund
On Tue, Jun 4, 2019 at 2:55 PM Andres Freund <andres@anarazel.de> wrote: > Yea, I don't immediately see a problem with doing that on a major > version boundary. Obviously that'd only be possible for sizeof(Datum) == > 8 == sizeof(macaddr8) platforms, but that's the vast majority these > days. And generally, I think it's just about *always* worth to go for a > pass-by-value for the cases where that doesn't imply space wastage. It would be faster to do it that way, I think. You would need a more complicated comparator than a classic abbreviated comparator (i.e. a 3-way unsigned int comparator) that way, but it would very probably be faster on balance. I'm glad to hear that it isn't *obviously* a problem from a compatibility perspective -- I really wasn't sure about that, since retrofitting a type to be pass-by-value like this is something that may never have been attempted before now (at least not since we started to care about pg_upgrade). > I think it might be worthwhile to optimize things so that all typlen > 0 > && typlen <= sizeof(Datum) are allowed for byval datums. > > SELECT typname, typlen FROM pg_type WHERE typlen > 0 AND typlen <= 8 AND NOT typbyval; > ┌──────────┬────────┐ > │ typname │ typlen │ > ├──────────┼────────┤ > │ tid │ 6 │ > │ macaddr │ 6 │ > │ macaddr8 │ 8 │ > └──────────┴────────┘ > (3 rows) This is half the reason why I ended up implementing itemptr_encode() to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some years back -- TID is 6 bytes wide, which doesn't have the necessary macro support within postgres.h. There is no reason why that couldn't be added for the benefit of both TID and macaddr types, though it probably wouldn't be worth it. And, as long as we're not going to those lengths, there may be some value in keeping the macaddr8 code in line with macaddr code -- the two types are currently almost the same (the glaring difference is the lack of macaddr8 sort support). We'll need to draw the line somewhere, and that is likely to be a bit arbitrary. This was what I meant by "weird". -- Peter Geoghegan
Hi, On 2019-06-04 15:10:07 -0700, Peter Geoghegan wrote: > On Tue, Jun 4, 2019 at 2:55 PM Andres Freund <andres@anarazel.de> wrote: > > Yea, I don't immediately see a problem with doing that on a major > > version boundary. Obviously that'd only be possible for sizeof(Datum) == > > 8 == sizeof(macaddr8) platforms, but that's the vast majority these > > days. And generally, I think it's just about *always* worth to go for a > > pass-by-value for the cases where that doesn't imply space wastage. > > It would be faster to do it that way, I think. You would need a more > complicated comparator than a classic abbreviated comparator (i.e. a > 3-way unsigned int comparator) that way, but it would very probably be > faster on balance. I'd be surprised if it weren't. > I'm glad to hear that it isn't *obviously* a problem from a > compatibility perspective -- I really wasn't sure about that, since > retrofitting a type to be pass-by-value like this is something that > may never have been attempted before now (at least not since we > started to care about pg_upgrade). Obviously we have to test it, but I don't really see any compat problems. Both have the same size on disk, after all. We couldn't make such a change in a minor version, as DatumGetMacaddr*, DatumGetItemPointer obviously need to change, but it ought to otherwise be transparent. It would, I think, be different if we still supported v0 calling conventions, but we don't... > This is half the reason why I ended up implementing itemptr_encode() > to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some > years back -- TID is 6 bytes wide, which doesn't have the necessary > macro support within postgres.h. There is no reason why that couldn't > be added for the benefit of both TID and macaddr types, though it > probably wouldn't be worth it. I think we should definitely do that. It seems not unlikely that other people want to write new fixed width types, and we shouldn't have them deal with all this complexity unnecessarily. Greetings, Andres Freund
On Tue, Jun 4, 2019 at 3:23 PM Andres Freund <andres@anarazel.de> wrote: > > This is half the reason why I ended up implementing itemptr_encode() > > to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some > > years back -- TID is 6 bytes wide, which doesn't have the necessary > > macro support within postgres.h. There is no reason why that couldn't > > be added for the benefit of both TID and macaddr types, though it > > probably wouldn't be worth it. > > I think we should definitely do that. It seems not unlikely that other > people want to write new fixed width types, and we shouldn't have them > deal with all this complexity unnecessarily. On second thought, maybe there is something to be said for being exhaustive here. It seems like there is a preference for making macaddr8 pass-by-value instead of adding abbreviated keys support to macaddr8, and possibly doing the same with the original macaddr type. Do you think that you'll be able to work on the project with this expanded scope, Melanie? -- Peter Geoghegan
On Tue, Jun 4, 2019 at 3:50 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Tue, Jun 4, 2019 at 3:23 PM Andres Freund <andres@anarazel.de> wrote:
> > This is half the reason why I ended up implementing itemptr_encode()
> > to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
> > years back -- TID is 6 bytes wide, which doesn't have the necessary
> > macro support within postgres.h. There is no reason why that couldn't
> > be added for the benefit of both TID and macaddr types, though it
> > probably wouldn't be worth it.
>
> I think we should definitely do that. It seems not unlikely that other
> people want to write new fixed width types, and we shouldn't have them
> deal with all this complexity unnecessarily.
On second thought, maybe there is something to be said for being
exhaustive here.
It seems like there is a preference for making macaddr8 pass-by-value
instead of adding abbreviated keys support to macaddr8, and possibly
doing the same with the original macaddr type.
Do you think that you'll be able to work on the project with this
expanded scope, Melanie?
I can take on making macaddr8 pass-by-value
I tinkered a bit last night and got in/out mostly working (I think).
I'm not sure about macaddr, TID, and user-defined types.
I tinkered a bit last night and got in/out mostly working (I think).
I'm not sure about macaddr, TID, and user-defined types.
--
Melanie Plageman
On 2019-Jun-05, Melanie Plageman wrote: > I can take on making macaddr8 pass-by-value > I tinkered a bit last night and got in/out mostly working (I think). > I'm not sure about macaddr, TID, and user-defined types. Yeah, let's see what macaddr8 looks like, and we can move from there -- I suppose adapting for macaddr would not be terribly different, but we don't have to do both in a single commit. I don't expect that TID would necessarily be similar since we have lots of bespoke code for that in lots of places; it might not affect anything (it should not!) but then it might. No reason not to move forward incrementally. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2019-06-05 09:18:34 -0700, Melanie Plageman wrote: > On Tue, Jun 4, 2019 at 3:50 PM Peter Geoghegan <pg@bowt.ie> wrote: > > > On Tue, Jun 4, 2019 at 3:23 PM Andres Freund <andres@anarazel.de> wrote: > > > > This is half the reason why I ended up implementing itemptr_encode() > > > > to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some > > > > years back -- TID is 6 bytes wide, which doesn't have the necessary > > > > macro support within postgres.h. There is no reason why that couldn't > > > > be added for the benefit of both TID and macaddr types, though it > > > > probably wouldn't be worth it. > > > > > > I think we should definitely do that. It seems not unlikely that other > > > people want to write new fixed width types, and we shouldn't have them > > > deal with all this complexity unnecessarily. > > > > On second thought, maybe there is something to be said for being > > exhaustive here. > > > > It seems like there is a preference for making macaddr8 pass-by-value > > instead of adding abbreviated keys support to macaddr8, and possibly > > doing the same with the original macaddr type. > > > > Do you think that you'll be able to work on the project with this > > expanded scope, Melanie? > > > > > I can take on making macaddr8 pass-by-value > I tinkered a bit last night and got in/out mostly working (I think). > I'm not sure about macaddr, TID, and user-defined types. I'd much rather see this tackled in a general way than fiddling with individual datatypes. I think we should: 1) make fetch_att(), store_att_byval() etc support datums of any length between 1 and <= sizeof(Datum). Probably also convert them to inline functions. There's a few more functions to be adjusted, but not many, I think. 2) Remove ability to pass PASSEDBYVALUE to CREATE TYPE, but instead compute whether attyval is possible, solely based on INTERNALLENGTH (when INTERNALLENGTH > 0 obviously). 3) Fix the fallout, by fixing a few of the Datum<->type conversion functions affected by this change. That'll require a bit of work, but not too much. We should write those conversion routines in a way that'll keep them working for the scenarios where the type is actually passable by value, and not (required for > 4 byte datums). Greetings, Andres Freund
On 2019-Jun-05, Andres Freund wrote: > I'd much rather see this tackled in a general way than fiddling with > individual datatypes. I think we should: > > 1) make fetch_att(), store_att_byval() etc support datums of any length > between 1 and <= sizeof(Datum). Probably also convert them to inline > functions. There's a few more functions to be adjusted, but not many, > I think. Does this mean that datatypes that are >4 and <=8 bytes need to handle both cases? Is it possible for them to detect the current environment? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On June 5, 2019 12:14:42 PM PDT, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >On 2019-Jun-05, Andres Freund wrote: > >> I'd much rather see this tackled in a general way than fiddling with >> individual datatypes. I think we should: >> >> 1) make fetch_att(), store_att_byval() etc support datums of any >length >> between 1 and <= sizeof(Datum). Probably also convert them to >inline >> functions. There's a few more functions to be adjusted, but not >many, >> I think. > >Does this mean that datatypes that are >4 and <=8 bytes need to handle >both cases? Is it possible for them to detect the current environment? Well, the conversion macros need to know. You can look at float8 for an example of the difference - it's pretty centralized.We should provide a few helper macros to abstract that away. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Andres Freund <andres@anarazel.de> writes: > On June 5, 2019 12:14:42 PM PDT, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> Does this mean that datatypes that are >4 and <=8 bytes need to handle >> both cases? Is it possible for them to detect the current environment? > Well, the conversion macros need to know. You can look at float8 for an example of the difference - it's pretty centralized.We should provide a few helper macros to abstract that away. FWIW, I disagree with Andres on this being a reasonable way to proceed. The fact that we support both pass-by-value and pass-by-ref for float8 and int8 is because those are important data types that are worth taking extra pains to optimize. It's a very long way from there to insisting that every datatype between 5 and 8 bytes long must get the same treatment, and even further to Andres' apparent position that we should force third-party types to do it whether they care about micro-optimization or not. And I *entirely* fail to get the point of adding such support for datatypes of 5 or 7 bytes. No such types exist, or are on the horizon AFAIK. Lastly, I don't think adding additional allowed widths of pass-by-value types would be cost-free, because it would add cycles to the inner loops of the tuple forming and deforming functions. (No, I don't believe that JIT makes that an ignorable concern.) I'm not really sure that either macaddr or macaddr8 are used widely enough to justify expending optimization effort on them. But if they are, let's just do that, not move the goal posts into the next stadium. regards, tom lane
Hi, On 2019-06-06 13:39:50 -0400, Tom Lane wrote: > Lastly, I don't think adding additional allowed widths of pass-by-value > types would be cost-free, because it would add cycles to the inner loops > of the tuple forming and deforming functions. I'm not sure I quite buy that. I think that we have branches over a fixed number of lengths is largely unnecessary. att_addlength_pointer() doesn't care - it just uses the length. And I think we should just consider doing the same for fetch_att(). E.g. by using memcpy(). That'd also have the advantage that we'd not be *forced* to rely alignment of byval types. The only reason we actually need that is the heaptuple-to-struct mapping for catalogs. Outside of that we don't have pointers to individual byval tuples, and waste a fair bit of padding due to that. Additionally we'd get rid of needing separate versions for SIZEOF_DATUM != 8/not. > (No, I don't believe that JIT makes that an ignorable concern.) Obviously not. Greetings, Andres Freund