Thread: SELECT query results are different depending on whether tablestatistics are available.

I've ran into a bit of a head scratching situation and was hoping that
someone with more knowledge that I could help me understand the
behaviour I'm seeing.

I'm running on PostgreSQL 12.2.

I have a SELECT query that returns different results depending on
whether statistics for the table have been collected or not.The query
uses several CTEs and returns a single integer. This integer changes
depending on whether the table has been analyzed.

As far as I can tell I am not using any 'volatile' functions in my SELECT query.

It took me a while to find a way to reproduce the issue. How I
eventually reproduced it was:

-- Delete all statistics.
DELETE FROM pg_statistic;

-- Truncate table and insert values into table.
TRUNCATE TABLE target_table;
INSERT INTO target_table (...)
VALUES
(...);

-- The results of the SELECT are different depending on whether
ANALYZE is called.
ANALYZE target_table;

-- Run select query (involving several CTEs).
SELECT ...;

I haven't generated a minimal test case yet, but I did notice that if
all CTEs in the SELECT query are defined using AS NOT MATERIALIZED the
results are always the same regardless of whether the table has been
ANALYZED yet.

Could anyone share knowledge about why this is happening?

Thanks,
James Brauman



On Wed, May 27, 2020 at 8:09 PM James Brauman <james.brauman@envato.com> wrote:
-- Run select query (involving several CTEs).
SELECT ...;

I haven't generated a minimal test case yet, but I did notice that if
all CTEs in the SELECT query are defined using AS NOT MATERIALIZED the
results are always the same regardless of whether the table has been
ANALYZED yet.

Could anyone share knowledge about why this is happening?

A likely scenario is you are missing an ORDER BY in a location where you are depending on deterministic row ordering and its changing out from underneath you.

David J.

Thanks for the help David, the query was missing a column in a
PARTITION BY expression.

On Thu, May 28, 2020 at 1:14 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Wed, May 27, 2020 at 8:09 PM James Brauman <james.brauman@envato.com> wrote:
>>
>> -- Run select query (involving several CTEs).
>> SELECT ...;
>>
>> I haven't generated a minimal test case yet, but I did notice that if
>> all CTEs in the SELECT query are defined using AS NOT MATERIALIZED the
>> results are always the same regardless of whether the table has been
>> ANALYZED yet.
>>
>> Could anyone share knowledge about why this is happening?
>
>
> A likely scenario is you are missing an ORDER BY in a location where you are depending on deterministic row ordering
andits changing out from underneath you.
 
>
> David J.
>