Thread: SELECT query results are different depending on whether tablestatistics are available.
SELECT query results are different depending on whether tablestatistics are available.
From
James Brauman
Date:
I've ran into a bit of a head scratching situation and was hoping that someone with more knowledge that I could help me understand the behaviour I'm seeing. I'm running on PostgreSQL 12.2. I have a SELECT query that returns different results depending on whether statistics for the table have been collected or not.The query uses several CTEs and returns a single integer. This integer changes depending on whether the table has been analyzed. As far as I can tell I am not using any 'volatile' functions in my SELECT query. It took me a while to find a way to reproduce the issue. How I eventually reproduced it was: -- Delete all statistics. DELETE FROM pg_statistic; -- Truncate table and insert values into table. TRUNCATE TABLE target_table; INSERT INTO target_table (...) VALUES (...); -- The results of the SELECT are different depending on whether ANALYZE is called. ANALYZE target_table; -- Run select query (involving several CTEs). SELECT ...; I haven't generated a minimal test case yet, but I did notice that if all CTEs in the SELECT query are defined using AS NOT MATERIALIZED the results are always the same regardless of whether the table has been ANALYZED yet. Could anyone share knowledge about why this is happening? Thanks, James Brauman
Re: SELECT query results are different depending on whether tablestatistics are available.
From
"David G. Johnston"
Date:
On Wed, May 27, 2020 at 8:09 PM James Brauman <james.brauman@envato.com> wrote:
-- Run select query (involving several CTEs).
SELECT ...;
I haven't generated a minimal test case yet, but I did notice that if
all CTEs in the SELECT query are defined using AS NOT MATERIALIZED the
results are always the same regardless of whether the table has been
ANALYZED yet.
Could anyone share knowledge about why this is happening?
A likely scenario is you are missing an ORDER BY in a location where you are depending on deterministic row ordering and its changing out from underneath you.
David J.
Re: SELECT query results are different depending on whether tablestatistics are available.
From
James Brauman
Date:
Thanks for the help David, the query was missing a column in a PARTITION BY expression. On Thu, May 28, 2020 at 1:14 PM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Wed, May 27, 2020 at 8:09 PM James Brauman <james.brauman@envato.com> wrote: >> >> -- Run select query (involving several CTEs). >> SELECT ...; >> >> I haven't generated a minimal test case yet, but I did notice that if >> all CTEs in the SELECT query are defined using AS NOT MATERIALIZED the >> results are always the same regardless of whether the table has been >> ANALYZED yet. >> >> Could anyone share knowledge about why this is happening? > > > A likely scenario is you are missing an ORDER BY in a location where you are depending on deterministic row ordering andits changing out from underneath you. > > David J. >