Home > mailing lists

Re: faster way to calculate top "tags" for a "resource" based on a column - Mailing list pgsql-general

From	Jonathan Vanasco
Subject	Re: faster way to calculate top "tags" for a "resource" based on a column
Date	October 7, 2014 15:42:52
Msg-id	F6D7998D-44FD-495E-8900-CCDE561999F1@2xlp.com Whole thread Raw
In response to	Re: faster way to calculate top "tags" for a "resource" based on a column (Marc Mamin <M.Mamin@intershop.de>)
List	pgsql-general

Tree view

On Oct 7, 2014, at 10:02 AM, Marc Mamin wrote:

> Hi,
> it seems to me that your subquery may deliver duplicate ids.
> And with the selectivity of your example, I would expect an index usage
> instead of a table scan. You may check how up to date your statistics are
> and try to raise the statistic target on the column resource_2_tag.tag_id.
> Also try a CTE form for your query:

It shouldn't be able to deliver duplicate ids.

=> SELECT COUNT(*) FROM (SELECT DISTINCT id FROM resource WHERE resource_attribute1_id = 614) AS foo;
count -------  5184

=> SELECT COUNT(*) FROM (SELECT id FROM resource WHERE resource_attribute1_id = 614) AS foo;
count -------  5184

However, adding in the DISTINCT drastically changed the query plan, and did give a speedup.

Your comment made me focus on the notion of a Table Scan. I assumed it did the seq scan - and there would not be much
savingsotherwise - because the table is just 2 ids. 

I was wrong.

I noticed that I never put a PRIMARY KEY constraint on that table.

So i tried adding a PRIMARY KEY constraint, then running vacuum analyze...

And that solved all my problems.

the original query ended up being the fastest at 260ms ( down from 1760 )

Join - 260ms
Subquery w/DISTINCT - 300ms
CTE - 330
CTE w/DISTINCT - 345ms
Subquery (no DISTINCT) - 1500ms

pgsql-general by date:

From: Sergey Konoplev
Date: 07 October 2014, 15:30:22
Subject: Re: Converting char to varchar automatically

From: Tom Lane
Date: 07 October 2014, 17:16:52
Subject: Re: Converting char to varchar automatically

Re: faster way to calculate top "tags" for a "resource" based on a column - Mailing list pgsql-general

Previous

Next