Thread: Fortify float4 and float8 regression tests by ordering test results
Hi, hackers! I noticed that SELECT results in float4 and float8 tests lack ORDER BY clauses. This makes test results depend on the current heap/MVCC implementation. If I try to run the float8 test on a table created with a different access method provided by an extension, I'm getting results ordered differently. It's not a big problem, but propose a simple fix for the tests. It just adds ORDER BY 1 to all relevant float4 and floa8 queries. I don't have a strong opinion about backpatching this, but as the patch changes only regression tests, it's maybe also worth backpatching. Regards, Pavel Borisov, Supabase.
Attachment
Re: Fortify float4 and float8 regression tests by ordering test results
From
Aleksander Alekseev
Date:
Hi Pavel, > It's not a big problem, but propose a simple fix for the tests. It > just adds ORDER BY 1 to all relevant float4 and floa8 queries. Thanks for the patch. That's a good change IMO. > I don't > have a strong opinion about backpatching this, but as the patch > changes only regression tests, it's maybe also worth backpatching. I don't see a reason for backpatching this but I'm not strongly opposed to the idea either. -- Best regards, Aleksander Alekseev
Pavel Borisov <pashkin.elfe@gmail.com> writes: > It's not a big problem, but propose a simple fix for the tests. It > just adds ORDER BY 1 to all relevant float4 and floa8 queries. You do realize that this problem is hardly confined to the float4 and float8 tests? To a first approximation, a table AM that fails to preserve insertion order would break every single one of our regression tests. (I speak from experience, as Salesforce had to deal with this when I was there...) The reason why we don't simply add ORDER BY to everything is explained at [1]: You might wonder why we don't order all the regression test queries explicitly to get rid of this issue once and for all. The reason is that that would make the regression tests less useful, not more, since they'd tend to exercise query plan types that produce ordered results to the exclusion of those that don't. At some point we will probably have to think through what we want to do about running the regression tests with table AMs that don't wish to preserve ordering as much as heapam does. It's going to need careful balancing of multiple concerns, and I don't think "blindly slap ORDER BY everywhere" is going to be an acceptable answer. regards, tom lane [1] https://www.postgresql.org/docs/devel/regress-evaluation.html#REGRESS-EVALUATION-ORDERING-DIFFERENCES
Hi, Tom! On Tue, 22 Apr 2025 at 18:23, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Pavel Borisov <pashkin.elfe@gmail.com> writes: > > It's not a big problem, but propose a simple fix for the tests. It > > just adds ORDER BY 1 to all relevant float4 and floa8 queries. > > You do realize that this problem is hardly confined to the > float4 and float8 tests? To a first approximation, a table AM > that fails to preserve insertion order would break every single > one of our regression tests. (I speak from experience, as > Salesforce had to deal with this when I was there...) > > The reason why we don't simply add ORDER BY to everything is > explained at [1]: > > You might wonder why we don't order all the regression test > queries explicitly to get rid of this issue once and for all. The > reason is that that would make the regression tests less useful, > not more, since they'd tend to exercise query plan types that > produce ordered results to the exclusion of those that don't. > > At some point we will probably have to think through what we > want to do about running the regression tests with table AMs that > don't wish to preserve ordering as much as heapam does. It's going > to need careful balancing of multiple concerns, and I don't think > "blindly slap ORDER BY everywhere" is going to be an acceptable > answer. > I agree we might need to elaborate different AM support in regression tests. Also, I agree that these are not the only tests to be fixed. What I hardly agree with, is that we suppose it's right for regression tests to require some fixed results ordering for so simple cases. Maybe for more complicated plans it's useful, but for the float tests mentioned in the patch it's this requirement is a total loss IMO. I understand your sentiment against blindly slapping order by's, but I don't see a useful alternative for this time. Also a large number of tests in PG were already fortified with ORDER BY 1. Regards, Pavel Borisov Supabase
On Tue, 22 Apr 2025 at 18:40, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > Hi, Tom! > > On Tue, 22 Apr 2025 at 18:23, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Pavel Borisov <pashkin.elfe@gmail.com> writes: > > > It's not a big problem, but propose a simple fix for the tests. It > > > just adds ORDER BY 1 to all relevant float4 and floa8 queries. > > > > You do realize that this problem is hardly confined to the > > float4 and float8 tests? To a first approximation, a table AM > > that fails to preserve insertion order would break every single > > one of our regression tests. (I speak from experience, as > > Salesforce had to deal with this when I was there...) > > > > The reason why we don't simply add ORDER BY to everything is > > explained at [1]: > > > > You might wonder why we don't order all the regression test > > queries explicitly to get rid of this issue once and for all. The > > reason is that that would make the regression tests less useful, > > not more, since they'd tend to exercise query plan types that > > produce ordered results to the exclusion of those that don't. > > > > At some point we will probably have to think through what we > > want to do about running the regression tests with table AMs that > > don't wish to preserve ordering as much as heapam does. It's going > > to need careful balancing of multiple concerns, and I don't think > > "blindly slap ORDER BY everywhere" is going to be an acceptable > > answer. > > > > I agree we might need to elaborate different AM support in regression tests. > Also, I agree that these are not the only tests to be fixed. > > What I hardly agree with, is that we suppose it's right for regression > tests to require some fixed results ordering for so simple cases. > Maybe for more complicated plans it's useful, but for the float tests > mentioned in the patch it's this requirement is a total loss IMO. > > I understand your sentiment against blindly slapping order by's, but I > don't see a useful alternative for this time. Also a large number of > tests in PG were already fortified with ORDER BY 1. I forgot to mention that it's not only a question of INSERTs ordering. Extension AM can have some internal ordering e.g. index-organized tables. And besides SELECT query results following internal AM ordering just being allowed, more importantly they are good performance-wise and justify table AM extensibility. Regards, Pavel.
Re: Fortify float4 and float8 regression tests by ordering test results
From
Alexander Korotkov
Date:
On Tue, Apr 22, 2025 at 5:57 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> On Tue, 22 Apr 2025 at 18:40, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> > On Tue, 22 Apr 2025 at 18:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Pavel Borisov <pashkin.elfe@gmail.com> writes:
> > > > It's not a big problem, but propose a simple fix for the tests. It
> > > > just adds ORDER BY 1 to all relevant float4 and floa8 queries.
> > >
> > > You do realize that this problem is hardly confined to the
> > > float4 and float8 tests? To a first approximation, a table AM
> > > that fails to preserve insertion order would break every single
> > > one of our regression tests. (I speak from experience, as
> > > Salesforce had to deal with this when I was there...)
> > >
> > > The reason why we don't simply add ORDER BY to everything is
> > > explained at [1]:
> > >
> > > You might wonder why we don't order all the regression test
> > > queries explicitly to get rid of this issue once and for all. The
> > > reason is that that would make the regression tests less useful,
> > > not more, since they'd tend to exercise query plan types that
> > > produce ordered results to the exclusion of those that don't.
> > >
> > > At some point we will probably have to think through what we
> > > want to do about running the regression tests with table AMs that
> > > don't wish to preserve ordering as much as heapam does. It's going
> > > to need careful balancing of multiple concerns, and I don't think
> > > "blindly slap ORDER BY everywhere" is going to be an acceptable
> > > answer.
> > >
> >
> > I agree we might need to elaborate different AM support in regression tests.
> > Also, I agree that these are not the only tests to be fixed.
> >
> > What I hardly agree with, is that we suppose it's right for regression
> > tests to require some fixed results ordering for so simple cases.
> > Maybe for more complicated plans it's useful, but for the float tests
> > mentioned in the patch it's this requirement is a total loss IMO.
> >
> > I understand your sentiment against blindly slapping order by's, but I
> > don't see a useful alternative for this time. Also a large number of
> > tests in PG were already fortified with ORDER BY 1.
>
> I forgot to mention that it's not only a question of INSERTs ordering.
> Extension AM can have some internal ordering e.g. index-organized
> tables. And besides SELECT query results following internal AM
> ordering just being allowed, more importantly they are good
> performance-wise and justify table AM extensibility.
+1,
I'd like to add that float4.out not only assumes that insert-ordering is preserved (this could be more-or-less portable between table AMs). It also assumes the way UPDATE moves updated rows. That seems quite heap-specific. You can see in the following fragment, updated rows jump to the bottom.
-- test the unary float4abs operator
SELECT f.f1, @f.f1 AS abs_f1 FROM FLOAT4_TBL f;
f1 | abs_f1
---------------+---------------
0 | 0
1004.3 | 1004.3
-34.84 | 34.84
1.2345679e+20 | 1.2345679e+20
1.2345679e-20 | 1.2345679e-20
(5 rows)
UPDATE FLOAT4_TBL
SET f1 = FLOAT4_TBL.f1 * '-1'
WHERE FLOAT4_TBL.f1 > '0.0';
SELECT * FROM FLOAT4_TBL;
f1
----------------
0
-34.84
-1004.3
-1.2345679e+20
-1.2345679e-20
(5 rows)
------
Regards,
Alexander Korotkov
Supabase
> On Tue, 22 Apr 2025 at 18:40, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> > On Tue, 22 Apr 2025 at 18:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Pavel Borisov <pashkin.elfe@gmail.com> writes:
> > > > It's not a big problem, but propose a simple fix for the tests. It
> > > > just adds ORDER BY 1 to all relevant float4 and floa8 queries.
> > >
> > > You do realize that this problem is hardly confined to the
> > > float4 and float8 tests? To a first approximation, a table AM
> > > that fails to preserve insertion order would break every single
> > > one of our regression tests. (I speak from experience, as
> > > Salesforce had to deal with this when I was there...)
> > >
> > > The reason why we don't simply add ORDER BY to everything is
> > > explained at [1]:
> > >
> > > You might wonder why we don't order all the regression test
> > > queries explicitly to get rid of this issue once and for all. The
> > > reason is that that would make the regression tests less useful,
> > > not more, since they'd tend to exercise query plan types that
> > > produce ordered results to the exclusion of those that don't.
> > >
> > > At some point we will probably have to think through what we
> > > want to do about running the regression tests with table AMs that
> > > don't wish to preserve ordering as much as heapam does. It's going
> > > to need careful balancing of multiple concerns, and I don't think
> > > "blindly slap ORDER BY everywhere" is going to be an acceptable
> > > answer.
> > >
> >
> > I agree we might need to elaborate different AM support in regression tests.
> > Also, I agree that these are not the only tests to be fixed.
> >
> > What I hardly agree with, is that we suppose it's right for regression
> > tests to require some fixed results ordering for so simple cases.
> > Maybe for more complicated plans it's useful, but for the float tests
> > mentioned in the patch it's this requirement is a total loss IMO.
> >
> > I understand your sentiment against blindly slapping order by's, but I
> > don't see a useful alternative for this time. Also a large number of
> > tests in PG were already fortified with ORDER BY 1.
>
> I forgot to mention that it's not only a question of INSERTs ordering.
> Extension AM can have some internal ordering e.g. index-organized
> tables. And besides SELECT query results following internal AM
> ordering just being allowed, more importantly they are good
> performance-wise and justify table AM extensibility.
+1,
I'd like to add that float4.out not only assumes that insert-ordering is preserved (this could be more-or-less portable between table AMs). It also assumes the way UPDATE moves updated rows. That seems quite heap-specific. You can see in the following fragment, updated rows jump to the bottom.
-- test the unary float4abs operator
SELECT f.f1, @f.f1 AS abs_f1 FROM FLOAT4_TBL f;
f1 | abs_f1
---------------+---------------
0 | 0
1004.3 | 1004.3
-34.84 | 34.84
1.2345679e+20 | 1.2345679e+20
1.2345679e-20 | 1.2345679e-20
(5 rows)
UPDATE FLOAT4_TBL
SET f1 = FLOAT4_TBL.f1 * '-1'
WHERE FLOAT4_TBL.f1 > '0.0';
SELECT * FROM FLOAT4_TBL;
f1
----------------
0
-34.84
-1004.3
-1.2345679e+20
-1.2345679e-20
(5 rows)
------
Regards,
Alexander Korotkov
Supabase
Alexander Korotkov <aekorotkov@gmail.com> writes: > I'd like to add that float4.out not only assumes that insert-ordering is > preserved (this could be more-or-less portable between table AMs). It also > assumes the way UPDATE moves updated rows. That seems quite > heap-specific. You can see in the following fragment, updated rows jump to > the bottom. I'd be willing to consider a policy that we don't want to depend on exactly where UPDATE moves rows to. The proposed patch is not that, however. regards, tom lane
Re: Fortify float4 and float8 regression tests by ordering test results
From
Alexander Korotkov
Date:
On Tue, Apr 22, 2025 at 7:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alexander Korotkov <aekorotkov@gmail.com> writes: > > I'd like to add that float4.out not only assumes that insert-ordering is > > preserved (this could be more-or-less portable between table AMs). It also > > assumes the way UPDATE moves updated rows. That seems quite > > heap-specific. You can see in the following fragment, updated rows jump to > > the bottom. > > I'd be willing to consider a policy that we don't want to depend on > exactly where UPDATE moves rows to. The proposed patch is not that, > however. OK, that makes sense for me. ------ Regards, Alexander Korotkov Supabase
Hi! On Tue, 22 Apr 2025 at 20:22, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Tue, Apr 22, 2025 at 7:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Alexander Korotkov <aekorotkov@gmail.com> writes: > > > I'd like to add that float4.out not only assumes that insert-ordering is > > > preserved (this could be more-or-less portable between table AMs). It also > > > assumes the way UPDATE moves updated rows. That seems quite > > > heap-specific. You can see in the following fragment, updated rows jump to > > > the bottom. > > > > I'd be willing to consider a policy that we don't want to depend on > > exactly where UPDATE moves rows to. The proposed patch is not that, > > however. > > OK, that makes sense for me. Thanks for this input! This was my first intention to fix only the test that was affected by UPDATE-order specifics, broke when runnung on an extension AM. Maybe I was too liberal and added ORDER BY's more than needed. I definitely agree to the proposal. Please find attached v2 that fixes only UPDATE-specific part of float4/float8 test.
Attachment
Pavel Borisov <pashkin.elfe@gmail.com> writes: > On Tue, 22 Apr 2025 at 20:22, Alexander Korotkov <aekorotkov@gmail.com> wrote: >> On Tue, Apr 22, 2025 at 7:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Alexander Korotkov <aekorotkov@gmail.com> writes: >>>> I'd like to add that float4.out not only assumes that insert-ordering is >>>> preserved (this could be more-or-less portable between table AMs). It also >>>> assumes the way UPDATE moves updated rows. That seems quite >>>> heap-specific. You can see in the following fragment, updated rows jump to >>>> the bottom. >>> I'd be willing to consider a policy that we don't want to depend on >>> exactly where UPDATE moves rows to. The proposed patch is not that, >>> however. >> OK, that makes sense for me. > Thanks for this input! > This was my first intention to fix only the test that was affected by > UPDATE-order specifics, broke when runnung on an extension AM. > Maybe I was too liberal and added ORDER BY's more than needed. > I definitely agree to the proposal. On further reflection, this policy makes plenty of sense because even heapam is not all that stable about row order after UPDATE. With tables large enough to draw the attention of autovacuum, we've had to use ORDER BY or other techniques to stabilize the results, because the timing of background autovacuums affected where rows got put. These tests are different only because the tables are small enough and the updates few enough that autovacuum doesn't get involved. I think it's reasonable that at a project-policy level we should not consider that an excuse. For example, a change in autovacuum's rules could probably break these tests even with heapam. On the other hand, as I mentioned earlier, a table AM that doesn't reproduce original insertion order would break a whole lot more tests than these. So that's a bridge I'd rather not cross, at least not without a lot of consideration. BTW, you forgot to update expected/float4-misrounded-input.out. regards, tom lane
Hi, Tom! On Tue, 22 Apr 2025 at 21:13, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Pavel Borisov <pashkin.elfe@gmail.com> writes: > > On Tue, 22 Apr 2025 at 20:22, Alexander Korotkov <aekorotkov@gmail.com> wrote: > >> On Tue, Apr 22, 2025 at 7:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >>> Alexander Korotkov <aekorotkov@gmail.com> writes: > >>>> I'd like to add that float4.out not only assumes that insert-ordering is > >>>> preserved (this could be more-or-less portable between table AMs). It also > >>>> assumes the way UPDATE moves updated rows. That seems quite > >>>> heap-specific. You can see in the following fragment, updated rows jump to > >>>> the bottom. > > >>> I'd be willing to consider a policy that we don't want to depend on > >>> exactly where UPDATE moves rows to. The proposed patch is not that, > >>> however. > > >> OK, that makes sense for me. > > > Thanks for this input! > > This was my first intention to fix only the test that was affected by > > UPDATE-order specifics, broke when runnung on an extension AM. > > Maybe I was too liberal and added ORDER BY's more than needed. > > I definitely agree to the proposal. > > On further reflection, this policy makes plenty of sense because even > heapam is not all that stable about row order after UPDATE. With > tables large enough to draw the attention of autovacuum, we've had > to use ORDER BY or other techniques to stabilize the results, because > the timing of background autovacuums affected where rows got put. > These tests are different only because the tables are small enough > and the updates few enough that autovacuum doesn't get involved. > I think it's reasonable that at a project-policy level we should > not consider that an excuse. For example, a change in autovacuum's > rules could probably break these tests even with heapam. > > On the other hand, as I mentioned earlier, a table AM that doesn't > reproduce original insertion order would break a whole lot more > tests than these. So that's a bridge I'd rather not cross, > at least not without a lot of consideration. > > BTW, you forgot to update expected/float4-misrounded-input.out. Added in v3. Thanks for a mention! Regards, Pavel Borisov
Attachment
Pavel Borisov <pashkin.elfe@gmail.com> writes: > On Tue, 22 Apr 2025 at 21:13, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> BTW, you forgot to update expected/float4-misrounded-input.out. > Added in v3. Thanks for a mention! v3 LGTM, pushed. regards, tom lane
On Tue, 22 Apr 2025 at 22:25, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Pavel Borisov <pashkin.elfe@gmail.com> writes: > > On Tue, 22 Apr 2025 at 21:13, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> BTW, you forgot to update expected/float4-misrounded-input.out. > > > Added in v3. Thanks for a mention! > > v3 LGTM, pushed. Thank you! Regards, Pavel