Thread: Fast distinct not working as expected

Fast distinct not working as expected

From: Franck Routier
Hi,

We are using a single-column index on a huge table to try to make a quick
'select distinct' on the column.

This used to work fine, but... it does not anymore. We don't know what
happened.

Here are the facts:

- request:
SELECT  dwhinv___rfovsnide::varchar FROM dwhinv WHERE dwhinv___rfovsnide
> '201212_cloture' ORDER BY dwhinv___rfovsnide LIMIT 1

- Plan :
Limit  (cost=0.00..1.13 rows=1 width=12) (actual time=5798.915..5798.916
rows=1 loops=1)
  ->  Index Scan using vsn_idx on dwhinv  (cost=0.00..302591122.05
rows=267473826 width=12) (actual time=5798.912..5798.912 rows=1 loops=1)
        Index Cond: ((dwhinv___rfovsnide)::text > '201212_cloture'::text)
Total runtime: 5799.141 ms

- default_statistics_target = 200;

- postgresql Version 8.4

- Index used :
CREATE INDEX vsn_idx
  ON dwhinv
  USING btree
  (dwhinv___rfovsnide);

There are 26 distinct values in the column.
This query used to take a few milliseconds at most. The index has been
freshly recreated.

What could be the problem?

Franck


Re: Fast distinct not working as expected

From: Jeff Janes
On Thu, Apr 17, 2014 at 8:11 AM, Franck Routier <franck.routier@axege.com> wrote:
> Hi,
>
> We are using a single-column index on a huge table to try to make a quick
> 'select distinct' on the column.
>
> This used to work fine, but... it does not anymore. We don't know what
> happened.
>
> Here are the facts:
>
> - request:
> SELECT  dwhinv___rfovsnide::varchar FROM dwhinv WHERE dwhinv___rfovsnide
> > '201212_cloture' ORDER BY dwhinv___rfovsnide LIMIT 1

That is not equivalent to a distinct.  There must be more to it than that.
 

> - Plan :
> Limit  (cost=0.00..1.13 rows=1 width=12) (actual time=5798.915..5798.916
> rows=1 loops=1)
>   ->  Index Scan using vsn_idx on dwhinv  (cost=0.00..302591122.05
> rows=267473826 width=12) (actual time=5798.912..5798.912 rows=1 loops=1)
>         Index Cond: ((dwhinv___rfovsnide)::text > '201212_cloture'::text)
> Total runtime: 5799.141 ms


My best guess would be that the index got stuffed full of entries for rows that are not visible, either because they are not yet committed, or have been deleted but are not yet vacuumable.  Do you have any long-lived transactions?
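
For what it's worth, on 8.4 such transactions show up in pg_stat_activity; a minimal check, assuming the 8.4 column names (procpid and current_query were renamed in later releases), would be:

SELECT procpid, usename, xact_start, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'  -- sessions holding an open transaction
ORDER BY xact_start;                           -- oldest first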
 

> - postgresql Version 8.4

Newer versions have better diagnostic tools.  An EXPLAIN (ANALYZE, BUFFERS) would be nice, especially with track_io_timing on.
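
On a 9.2-or-later server that would look something like the sketch below (EXPLAIN (ANALYZE, BUFFERS) needs 9.0+; track_io_timing needs 9.2+ and superuser rights):

SET track_io_timing = on;   -- adds per-node I/O timings to the output
EXPLAIN (ANALYZE, BUFFERS)
SELECT dwhinv___rfovsnide::varchar FROM dwhinv
WHERE dwhinv___rfovsnide > '201212_cloture'
ORDER BY dwhinv___rfovsnide LIMIT 1;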

Cheers,

Jeff

Re: Fast distinct not working as expected

From: Franck Routier
Hi,

> That is not equivalent to a distinct.  There must be more to it than that.
Indeed, this query is used in a loop:

CREATE OR REPLACE FUNCTION small_distinct(IN tablename character varying, IN fieldname character varying, IN sample anyelement DEFAULT ''::character varying)
  RETURNS SETOF anyelement AS
$BODY$
BEGIN
   -- grab the smallest value, then repeatedly index-jump to the next larger one
   EXECUTE 'SELECT '||fieldName||' FROM '||tableName||' ORDER BY '||fieldName
      ||' LIMIT 1' INTO sample;
   WHILE sample IS NOT NULL LOOP
      RETURN NEXT sample;
      EXECUTE 'SELECT '||fieldName||' FROM '||tableName
         ||' WHERE '||fieldName||' > $1 ORDER BY ' || fieldName || ' LIMIT 1'
         INTO sample USING sample;
   END LOOP;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100
  ROWS 1000;
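
For the record, a typical call, assuming the varchar default of the sample parameter matches the column type, is:

SELECT * FROM small_distinct('dwhinv', 'dwhinv___rfovsnide');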


Since we have the problem, some iterations of the query are still quick (< 1 ms), but others are long (> 5 s).
 




> My best guess would be that the index got stuffed full of entries for rows that are not visible, either because they are not yet committed, or have been deleted but are not yet vacuumable.  Do you have any long-lived transactions?
There has been a delete on the table (about 20% of the records). Then a manual VACUUM.
We have recreated the index, but it did not help.

In the explain analyze output, the index scan begins at 5798.912. What can be happening before that?
Index Scan using vsn_idx on dwhinv  (cost=0.00..302591122.05
rows=267473826 width=12) (actual time=5798.912..5798.912 rows=1 loops=1)
(Notice the delay is not in the planning itself, as a plain EXPLAIN is instantaneous.)

 

>> - postgresql Version 8.4

> Newer versions have better diagnostic tools.  An EXPLAIN (ANALYZE, BUFFERS) would be nice, especially with track_io_timing on.
Yep, we certainly would like to, but this is a remote prod box, with no access to an online upgrade source, and no upgrade planned for now :-((

Regards,
Franck

Re: Fast distinct not working as expected

From: Jeff Janes
On Thu, Apr 17, 2014 at 10:17 AM, Franck Routier <franck.routier@axege.com> wrote:


>> My best guess would be that the index got stuffed full of entries for rows that are not visible, either because they are not yet committed, or have been deleted but are not yet vacuumable.  Do you have any long-lived transactions?
> There has been a delete on the table (about 20% of the records). Then a manual VACUUM.
> We have recreated the index, but it did not help.

If there are any open transactions (even ones that have never touched this particular table) which started before the delete was committed, then the vacuum was obliged to keep those deleted records around, in case that open transaction happens to start caring about them.  I assume that the deleted rows were not randomly distributed, but rather were concentrated in the exact range you are now inspecting.

The reindex was also obliged to carry those deleted, but not yet vacuumable, rows along to the new index.
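
A manual VACUUM VERBOSE makes this visible, since it reports the dead rows it was forced to keep:

VACUUM VERBOSE dwhinv;
-- watch the output for "N dead row versions cannot be removed yet";
-- a large N means an old snapshot is still pinning them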

 

> In the explain analyze output, the index scan begins at 5798.912. What can be happening before that?

The index scan reports its first *visible* row at 5798.912.  Before that, it was probably digging through thousands or millions of deleted rows, laboriously verifying that they are not visible to it (but still theoretically visible to someone else).
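
If the pgstattuple contrib module happens to be installed (it ships with 8.4's contrib), the dead space can be measured directly; note that it scans the whole table:

SELECT dead_tuple_count, dead_tuple_percent
FROM pgstattuple('dwhinv');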

It could be blocked on a lock or something, or you could have really horrible IO contention that takes 5 seconds to read two blocks.  But I think the other scenario is more likely.

By the way, many people don't like silent cross-posting, as then we end up unknowingly answering a question here that was already answered elsewhere.




Cheers,

Jeff

Re: Fast distinct not working as expected

From: Franck Routier
Hi,

On 17/04/2014 20:17, Jeff Janes wrote:


> If there are any open transactions (even ones that have never touched this particular table) which started before the delete was committed, then the vacuum was obliged to keep those deleted records around, in case that open transaction happens to start caring about them.  I assume that the deleted rows were not randomly distributed, but rather were concentrated in the exact range you are now inspecting.
>
> The reindex was also obliged to carry those deleted, but not yet vacuumable, rows along to the new index.
OK, I understand now.
 

>> In the explain analyze output, the index scan begins at 5798.912. What can be happening before that?

> The index scan reports its first *visible* row at 5798.912.  Before that, it was probably digging through thousands or millions of deleted rows, laboriously verifying that they are not visible to it (but still theoretically visible to someone else).
Very clear explanation, thank you.

> It could be blocked on a lock or something, or you could have really horrible IO contention that takes 5 seconds to read two blocks.  But I think the other scenario is more likely.
Yes, I also think so.

> By the way, many people don't like silent cross-posting, as then we end up unknowingly answering a question here that was already answered elsewhere.
Yes, sorry for that. I don't like it either; this was posted by a colleague of mine, one of those young foolish guys who prefer web interfaces to plain mailing lists... Gonna scold him :-)

Best regards,
Franck

Re: Fast distinct not working as expected

From: Franck Routier
I have found the problem, using this query (found here: http://stackoverflow.com/questions/3312929/postgresql-idle-in-transaction-diagnosis-and-reading-pg-locks):
select pg_class.relname, pg_locks.transactionid, pg_locks.mode,
       pg_locks.granted as "g", pg_stat_activity.current_query,
       pg_stat_activity.query_start,
       age(now(),pg_stat_activity.query_start) as "age",
       pg_stat_activity.procpid
from pg_stat_activity,pg_locks
left outer join pg_class on (pg_locks.relation = pg_class.oid)  
where pg_locks.pid=pg_stat_activity.procpid
and pg_stat_activity.procpid = <AN IDLE TRANSACTION PROCESS>
order by query_start;

And indeed, we constantly have idle transactions. They all use the same dummy table, a dual-table substitute containing only one column, and one row.
We use this table with tomcat-jdbc-pool to check connection health with 'select 1 from dual' (we don't use 'select 1' for portability reasons, to work with Oracle also).
And these transactions are never committed. So we have a bunch of running transactions, constantly running and recreated by tomcat-jdbc-pool. Some of them run for hours.
This seems to significantly impact the ability of postgresql to vacuum... and thus to keep efficient indexes!
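
A quick way to see how long the oldest transaction has been open, and therefore how far back vacuum is being held off:

SELECT procpid, age(now(), xact_start) AS xact_age, current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL  -- only sessions with an open transaction
ORDER BY xact_start           -- oldest transaction first
LIMIT 5;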

Changing the configuration of tomcat-jdbc-pool to 'select 1 from dual; commit;' seems to resolve the problem.

I'm going to ask on the tomcat-jdbc-pool mailing list whether this is OK.

Thanks a lot for your help.

Franck

Re: Fast distinct not working as expected

From: Merlin Moncure
On Fri, Apr 18, 2014 at 3:07 AM, Franck Routier
<franck.routier@axege.com> wrote:
> I have found the problem, using this query (found here
> http://stackoverflow.com/questions/3312929/postgresql-idle-in-transaction-diagnosis-and-reading-pg-locks)
>
> select pg_class.relname, pg_locks.transactionid, pg_locks.mode,
>        pg_locks.granted as "g", pg_stat_activity.current_query,
>        pg_stat_activity.query_start,
>        age(now(),pg_stat_activity.query_start) as "age",
>        pg_stat_activity.procpid
> from pg_stat_activity,pg_locks
> left outer join pg_class on (pg_locks.relation = pg_class.oid)
> where pg_locks.pid=pg_stat_activity.procpid
> and pg_stat_activity.procpid = <AN IDLE TRANSACTION PROCESS>
> order by query_start;
>
>
> And indeed, we constantly have idle transactions. They all use the same
> dummy table, a dual-table substitute containing only one column, and one
> row.
> We use this table with tomcat-jdbc-pool to check connection health with
> 'select 1 from dual' (we don't use 'select 1' for portability reasons, to
> work with Oracle also).
> And these transactions are never committed. So we have a bunch of running
> transactions, constantly running and recreated by tomcat-jdbc-pool. Some of
> them run for hours.
> This seems to significantly impact the ability of postgresql to vacuum...
> and thus to keep efficient indexes!

It affects a lot of other things too.  All locks held by those
transactions are still held.  Failure to release transactions is a
major application error that can and will explode the database.  It's
similar in severity to a memory leak.  The basic rule of thumb is that
transactions should be held for the shortest possible duration --
especially those that write to the database.
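
For what it's worth, PostgreSQL 9.6 and later (so not the 8.4 server discussed
here) can enforce this server-side:

-- terminate any session that sits idle inside a transaction too long (9.6+)
SET idle_in_transaction_session_timeout = '5min';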

merlin