Home > mailing lists

Re: seq vs index scan in join query - Mailing list pgsql-general

From	Laurenz Albe
Subject	Re: seq vs index scan in join query
Date	November 29, 2017 20:17:18
Msg-id	1511975838.2309.12.camel@cybertec.at Whole thread Raw
In response to	seq vs index scan in join query (Emanuel Alvarez <ema@abductedcow.com.ar>)
Responses	Re: seq vs index scan in join query
List	pgsql-general

Tree view

Emanuel Alvarez wrote:
> the problematic query looks like this:
> 
> SELECT keywords.strategy_id, results.position, results.created_at FROM results
>   JOIN  keywords ON results.keyword_id = keywords.id
>   WHERE results.account_id = 1
>      AND results.created_at >= '2017-10-25 00:00:00.000000'
>      AND results.created_at <= '2017-11-10 23:59:59.999999';
> 
> 
> as you can see in the query plan [1] a sequential scan is preferred.
> as we understand it, this happens because the number of rows returned
> from results is too large. if we reduce this number by either
> selecting a smaller created_at range, or another account_id with fewer
> keywords, the planner falls back to an index scan, confirming that the
> number of rows returned from results has a direct influence in this
> choice.
> 
> on the other hand, if we disable sequential scans (SET enable_seqscan
> = 0), we see than not only the query runs faster but the cost seems to
> be lower, as seen in the query plan [2].

The optimizer is right here.

Even though your second execution without sequential scans ran faster,
it is worse.

That is because the execution with the sequential scan touched
26492  + 80492 = 106984 blocks, while the second execution touched
311301 + 48510 = 359811 blocks, more than three times as many.

The second execution was just lucky because most of these blocks were
already cached, and it had to read only half as many blocks from disk.

If you repeat the execution a couple of times, you should see that
the execution using the sequential scans becomes faster.


You can boost performance even more by increasing work_mem
so that the hash can be created in memory.

Yours,
Laurenz Albe

pgsql-general by date:

From: Steven Lembark
Date: 29 November 2017, 20:03:17
Subject: Re: large numbers of inserts out of memory strategy

From: Andres Freund
Date: 29 November 2017, 20:46:40
Subject: Re: seq vs index scan in join query

Re: seq vs index scan in join query - Mailing list pgsql-general

Previous

Next