On Tue, 2 Apr 2019 at 20:00, tank.zhang <6220104@qq.com> wrote:
> 2、 Adding a DISTINCT response time was very slow
>
> qis3_dp2=# SELECT COUNT(DISTINCT SVIN) AS CHECKCARNUM FROM
> QIS_CARPASSEDSTATION A WHERE 1=1 AND A.SSTATIONCD = 'VQ3_LYG' AND A.SLINENO
> IN ( '1F' , '2F' , '3F' ) AND A.DWORKDATE >=
> TO_DATE('2017-02-11','YYYY-MM-DD') AND A.DWORKDATE <=
> TO_DATE('2019-03-11','YYYY-MM-DD');
> checkcarnum
> -------------
> 1071367
> (1 row)
That's because of how DISTINCT is implemented within an aggregate
function in PostgreSQL. Internally within the aggregate code in the
executor, a sort is performed on the entire input to the aggregate
node. The planner is currently unable to make use of any indexes that
provide pre-sorted input.
One way to work around this would be to perform the DISTINCT and
COUNT(*) in separate stages using a subquery.
From your original query, something like:
SELECT COUNT(SVIN) AS CHECKCARNUM,SMTOC
FROM (
SELECT SMTOC,SVIN
FROM QIS_CARPASSEDSTATION A
WHERE 1=1 AND A.SSTATIONCD = 'VQ3_LYG'
AND A.SLINENO IN ( '1F' , '2F' , '3F' )
AND A.DWORKDATE >= TO_DATE('2017-02-11','YYYY-MM-DD')
AND A.DWORKDATE <= TO_DATE('2019-03-11','YYYY-MM-DD')
GROUP BY SMTOC,SVIN
) A GROUP BY SMTOC;
An index something like:
CREATE INDEX ON QIS_CARPASSEDSTATION (SMTOC, SVIN, SSTATIONCD, DWORKDATE);
Should help speed up the subquery and provide pre-sorted input to the
outer aggregate. If you like, you could add SLINENO to the end of the
index to allow an index-only scan which may result in further
performance improvements.
Without the index, you're forced to sort, but at least it's just one
sort instead of two.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services