Thread: Why the planner does not use index for a large amount of data?

Why the planner does not use index for a large amount of data?

From
hmidi slim
Date:
Hi,
When I used explain I found that the query planner use a seq scan to execute a query on a table containing about 2 millions rows.However I'm creating an index.Why does the planner uses seq scan in place of index scan?Does the execution of index scan is slower with table containing a huge amount of data?

Re: Why the planner does not use index for a large amount of data?

From
Melvin Davidson
Date:


On Wed, Dec 6, 2017 at 9:37 AM, hmidi slim <hmidi.slim2@gmail.com> wrote:
Hi,
When I used explain I found that the query planner use a seq scan to execute a query on a table containing about 2 millions rows.However I'm creating an index.Why does the planner uses seq scan in place of index scan?Does the execution of index scan is slower with table containing a huge amount of data?

>Why does the planner uses seq scan in place of index scan?

It is strongly suggested that you provide the PostgreSQL version and O/S when posing questions to this mail list.
In addition, your question cannot be answered unless you also provide
A. The explain plan in question.
B. The structure of all indexes involved
C, Have you run ANALYZE on all tables involved?



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: Why the planner does not use index for a large amount of data?

From
Melvin Davidson
Date:


On Wed, Dec 6, 2017 at 9:37 AM, hmidi slim <hmidi.slim2@gmail.com> wrote:
Hi,
When I used explain I found that the query planner use a seq scan to execute a query on a table containing about 2 millions rows.However I'm creating an index.Why does the planner uses seq scan in place of index scan?Does the execution of index scan is slower with table containing a huge amount of data?

>Why does the planner uses seq scan in place of index scan?

It is strongly suggested that you provide the PostgreSQL version and O/S when posing questions to this mail list.
In addition, your question cannot be answered unless you also provide
A. The explain plan in question.
B. The structure of all indexes involved
C, Have you run ANALYZE on all tables involved?



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: Why the planner does not use index for a large amount of data?

From
"David G. Johnston"
Date:
On Wed, Dec 6, 2017 at 7:37 AM, hmidi slim <hmidi.slim2@gmail.com> wrote:
Hi,
When I used explain I found that the query planner use a seq scan to execute a query on a table containing about 2 millions rows.However I'm creating an index.Why does the planner uses seq scan in place of index scan?Does the execution of index scan is slower with table containing a huge amount of data?

​Please avoid posting to multiple lists at once.

An index doesn't contain visibility information so every record located in the index must also be checked on the storage table to determine if it is visible to the current transaction and thus is valid to be returned.  If it does need to be returned the table itself is also needed to get the rest of the information.  Thus the index scan itself involves additional non-value-added (NVA) effort so far as the query is concerned.  If a large fraction (I've seen estimates of 10%) of the table is estimated to be returned the additional cost involved with scanning the entire table will less than the NVA cost of walking through the index and then fetching records from the table anyway.

Also: "Fetching rows separately is much more expensive than reading them sequentially."


David J.

Re: Why the planner does not use index for a large amount of data?

From
"David G. Johnston"
Date:
On Wed, Dec 6, 2017 at 7:37 AM, hmidi slim <hmidi.slim2@gmail.com> wrote:
Hi,
When I used explain I found that the query planner use a seq scan to execute a query on a table containing about 2 millions rows.However I'm creating an index.Why does the planner uses seq scan in place of index scan?Does the execution of index scan is slower with table containing a huge amount of data?

​Please avoid posting to multiple lists at once.

An index doesn't contain visibility information so every record located in the index must also be checked on the storage table to determine if it is visible to the current transaction and thus is valid to be returned.  If it does need to be returned the table itself is also needed to get the rest of the information.  Thus the index scan itself involves additional non-value-added (NVA) effort so far as the query is concerned.  If a large fraction (I've seen estimates of 10%) of the table is estimated to be returned the additional cost involved with scanning the entire table will less than the NVA cost of walking through the index and then fetching records from the table anyway.

Also: "Fetching rows separately is much more expensive than reading them sequentially."


David J.

Re: Why the planner does not use index for a large amount of data?

From
Jim Finnerty
Date:
Hi Hmidi,

    Your description omits some important information, such as whether you
defined your index to include all columns needed by the request, but the
short answer is 'yes', if you retrieve enough rows, then it will be faster
to read the entire heap instead of reading the necessary parts of the index
and all the referenced pages of the heap, in index order.  The decision is
more complex than just this in PostgreSQL due to MVCC, the visibility bits
on each page, and the alternative of doing a bitmap heap scan when the
selectivity is moderate.  The planner is designed to consider all these
factors, cost them, and choose a minimum cost plan.

    If you're convinced that the planner has not picked the fastest plan,
you can either try to force the plan by setting the enable_* guc variables,
or you can use the pg_hint_plan extension (which sets the enable_* variables
on your behalf), to get the plan you want, and then measure the performance
of the query with each alternative.

    /Jim F



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html