Thread: Performance differential when 0 values present vs when 1 values present. Planner return 52k rows when 0 expected.

Dear mailing list

We are investigating a strange performance issue with our database.

Our use case is a sensor reading database where we have sensor location 
(called channels), parameter settings (called valueseries) and reading 
(called datavalues). Datavalues is partitioned per month.
Like this: channel <-(FK)- valueseries <-(FK)- datavalues

We have a query that returns the latest sensor reading. When we have no 
readings in datavalues the query is SLOW, when we have 1 or more 
readings in datavalues the query is FAST. (slow being ~1second, fast 
~5ms) It isn't the slowness that is the main problem, but rather the odd 
behaviour.

The query that is giving us issues is the following, channel 752433 has 
NO values, 752431 has values.
(Channel 752433 only has valueseries 752434)

select * from datavalue
where dataview in ( select id from valueseries where channel = 
%channel_idx%)
ORDER BY VALUETIMESTAMP DESC
FETCH FIRST ROW only;
        
Running explain analyze shows strange numbers, 52'000 rows are being 
returned but there are no rows there.

For channel 752433
->  Index Scan Backward using 
datavalue_2022_03_valuetimestamp_dataview_idx on datavalue_2022_03 
datavalue_6  (cost=0.42..7166.19 rows=119673 width=226) (actual 
time=0.008..32.831 rows=119601 loops=1)
->  Index Scan Backward using 
datavalue_2022_04_valuetimestamp_dataview_idx on datavalue_2022_04 
datavalue_7  (cost=0.29..4002.79 rows=52499 width=227) (actual 
time=0.011..15.005 rows=52499 loops=1)
    
For channel 752431
->  Index Scan Backward using 
datavalue_2022_03_valuetimestamp_dataview_idx on datavalue_2022_03 
datavalue_6  (cost=0.42..7166.19 rows=119673 width=226) (actual 
time=0.008..0.008 rows=1 loops=1)
->  Index Scan Backward using 
datavalue_2022_04_valuetimestamp_dataview_idx on datavalue_2022_04 
datavalue_7  (cost=0.29..4002.79 rows=52499 width=227) (actual 
time=0.011..0.011 rows=1 loops=1)        
    
Inserting even a single row changes the offending rows to the expected 
values:

insert into maclient.datavalue (dataview, valuetimestamp, datavalue) 
values (752434, '2022-03-01 00:00:00', 234);
->  Index Scan Backward using 
datavalue_2022_03_valuetimestamp_dataview_idx on datavalue_2022_03 
datavalue_6  (cost=0.42..7166.19 rows=119673 width=226) (actual 
time=0.006..0.006 rows=1 loops=1)
                
Full explain analyze on https://paste.depesz.com/s/ZwJ
with buffers and track_io_timing: https://paste.depesz.com/s/Ss

Disabling indexscan (set enable_indexscan=false;) hides the problem, it 
does not show up with a bitmap index scan.
Running autovacuum analyze doesn't seem to help, the results are the same.


SELECT version();
"PostgreSQL 14.2, compiled by Visual C++ build 1914, 64-bit"


Can anyone explain what is going on here.
* Why is the database returning 52'000+ rows when it should be returning 0?
* Is my query badly formulated?
* Is there something wrong with the indexes and I need to rebuild them?
    
We are stumped, and would greatly appreciate any input.

Regards
Emil



Emil Iggland <emil.iggland@metrima.com> writes:
> The query that is giving us issues is the following, channel 752433 has 
> NO values, 752431 has values.
> (Channel 752433 only has valueseries 752434)

> select * from datavalue
> where dataview in ( select id from valueseries where channel = 
> %channel_idx%)
> ORDER BY VALUETIMESTAMP DESC
> FETCH FIRST ROW only;
        
> Running explain analyze shows strange numbers, 52'000 rows are being 
> returned but there are no rows there.

> For channel 752433
> ->  Index Scan Backward using 
> datavalue_2022_03_valuetimestamp_dataview_idx on datavalue_2022_03 
> datavalue_6  (cost=0.42..7166.19 rows=119673 width=226) (actual 
> time=0.008..32.831 rows=119601 loops=1)

You've got the wrong column order (for this query anyway) in that
index.  It'd work a lot better if dataview were the first column;
or at least, it wouldn't tempt the planner to try this unstably-
performing plan.  It's trying to use the index ordering to satisfy
the ORDER BY, which works great as long as it finds a dataview
match in some reasonably recent index entry.  Otherwise, it's
going to crawl the whole index to discover that there's no match.

            regards, tom lane



 > You've got the wrong column order (for this query anyway) in that
 > index.  It'd work a lot better if dataview were the first column;
I might be misunderstanding you, but I assume that you are suggesting an 
index on (dataview, valuetimestamp).
We have that index, it is the primary key. For some reason it isn't 
being selected.

I can understand that it has to go through the whole index, potentially 
even the whole table, but I do not why it takes so long.

Even a query that should take equally long (probably longer) is 
substantially faster:

explain (analyze, buffers)
select valuetimestamp from datavalue
where valuetimestamp <> '1965-01-07 05:50:59';

Completes in less than 500ms using a sequential scan,

...
->  Seq Scan on datavalue_2022_04 datavalue_7  (cost=0.00..1450.39 
rows=56339 width=8) (actual time=0.013..5.988 rows=56109 loops=1)"
    Filter: (valuetimestamp <> '1965-01-07 05:50:59'::timestamp without 
time zone)
    Buffers: shared hit=742 read=4
...
Planning Time: 0.781 ms
Execution Time: 394.408 ms


while the original query takes over 1 second.
...
->  Index Scan Backward using 
datavalue_2022_04_valuetimestamp_dataview_idx on datavalue_2022_04 
datavalue_7  (cost=0.29..4292.48 rows=56351 width=227) (actual 
time=0.166..17.340 rows=56109 loops=1)
    Buffers: shared hit=42013 read=278
...
Planning Time: 0.964 ms
Execution Time: 1291.509 ms

I do not understand how looking at every value in the index and 
returning none be slower than looking at every table in the table and 
returning none. If it takes 500ms to return every value in the table via 
a sequential scan, then it should take less via an index scan.


In case we never solve it, and someone else runs into similiar problems, 
we (hopefully temporarily) worked around it by reformulating the query 
to use a lateral join:

EXPLAIN (analyze, buffers)
SELECT dv.* FROM valueseries vs
LEFT JOIN LATERAL (
    SELECT * FROM datavalue dv WHERE dv.dataview = vs.id
    ORDER BY VALUETIMESTAMP
    FETCH FIRST 1 ROWS ONLY
) dv ON TRUE
where vs.channel = 752433

This causes it to use the correct index:
->  Index Scan using datavalue_2022_01_pkey on datavalue_2022_01 dv_4 
(cost=0.42..2951.17 rows=1032 width=228) (actual time=0.034..0.034 
rows=0 loops=1)
    Index Cond: (dataview = vs.id)
    Buffers: shared read=3
...
Planning Time: 1.169 ms
Execution Time: 0.524 ms


Regards
Emil


On 2022-04-25 18:00, Tom Lane wrote:
> Emil Iggland <emil.iggland@metrima.com> writes:
>> The query that is giving us issues is the following, channel 752433 has
>> NO values, 752431 has values.
>> (Channel 752433 only has valueseries 752434)
> 
>> select * from datavalue
>> where dataview in ( select id from valueseries where channel =
>> %channel_idx%)
>> ORDER BY VALUETIMESTAMP DESC
>> FETCH FIRST ROW only;
>         
>> Running explain analyze shows strange numbers, 52'000 rows are being
>> returned but there are no rows there.
> 
>> For channel 752433
>> ->  Index Scan Backward using
>> datavalue_2022_03_valuetimestamp_dataview_idx on datavalue_2022_03
>> datavalue_6  (cost=0.42..7166.19 rows=119673 width=226) (actual
>> time=0.008..32.831 rows=119601 loops=1)
> 
> You've got the wrong column order (for this query anyway) in that
> index.  It'd work a lot better if dataview were the first column;
> or at least, it wouldn't tempt the planner to try this unstably-
> performing plan.  It's trying to use the index ordering to satisfy
> the ORDER BY, which works great as long as it finds a dataview
> match in some reasonably recent index entry.  Otherwise, it's
> going to crawl the whole index to discover that there's no match.
> 
>             regards, tom lane



On Wed, 27 Apr 2022 at 19:54, Emil Iggland <emil.iggland@metrima.com> wrote:
>
>  > You've got the wrong column order (for this query anyway) in that
>  > index.  It'd work a lot better if dataview were the first column;

> I might be misunderstanding you, but I assume that you are suggesting an
> index on (dataview, valuetimestamp).
> We have that index, it is the primary key. For some reason it isn't
> being selected.

I don't think that index can be used for your original query. It could
only be used if "channel" is unique in "valueseries" and you'd written
the query as:

select * from datavalue
where dataview = (select id from valueseries where channel = 752433)
ORDER BY VALUETIMESTAMP DESC
FETCH FIRST ROW only;

that would allow a backwards index scan using the (dataview,
valuetimestamp) index.  Because you're using the IN clause to possibly
look for multiple "dataview" values matching the given "channel", the
index range scan does not have a single point to start at.  What
you've done with the LATERAL query allows the index to be scanned once
for each "valueseries" row with a "channel" value matching your WHERE
clause.

I guess "channel" must not be the primary key to "valueseries" and
that's why you use an IN().

The above query would return an error if multiple rows were returned
by the subquery.

David



 > I don't think that index can be used for your original query. It could
 > only be used if "channel" is unique in "valueseries" and you'd written
 > the query as:

Thanks! That explanation I can understand, now I know how to avoid this 
in future.

 > I guess "channel" must not be the primary key to "valueseries" and
 > that's why you use an IN().
Correct. We create a new valueseries in some circumstances, so multiple 
valueseries can point to the same channel.




On 2022-04-27 10:22, David Rowley wrote:
> On Wed, 27 Apr 2022 at 19:54, Emil Iggland <emil.iggland@metrima.com> wrote:
>>
>>   > You've got the wrong column order (for this query anyway) in that
>>   > index.  It'd work a lot better if dataview were the first column;
> 
>> I might be misunderstanding you, but I assume that you are suggesting an
>> index on (dataview, valuetimestamp).
>> We have that index, it is the primary key. For some reason it isn't
>> being selected.
> 
> I don't think that index can be used for your original query. It could
> only be used if "channel" is unique in "valueseries" and you'd written
> the query as:
> 
> select * from datavalue
> where dataview = (select id from valueseries where channel = 752433)
> ORDER BY VALUETIMESTAMP DESC
> FETCH FIRST ROW only;
> 
> that would allow a backwards index scan using the (dataview,
> valuetimestamp) index.  Because you're using the IN clause to possibly
> look for multiple "dataview" values matching the given "channel", the
> index range scan does not have a single point to start at.  What
> you've done with the LATERAL query allows the index to be scanned once
> for each "valueseries" row with a "channel" value matching your WHERE
> clause.
> 
> I guess "channel" must not be the primary key to "valueseries" and
> that's why you use an IN().
> 
> The above query would return an error if multiple rows were returned
> by the subquery.
> 
> David