Thread: usage of indexes for inner joins

usage of indexes for inner joins

From
"Jan Theodore Galkowski"
Date:
I fear this has been asked many times about PostgreSQL, and I have read
the docs about how indexes are supposed to be defined and used, but I
don't understand why the engine and optimizer are doing what they do in
the simplest of situations.  Is it that its tuning is heavily data
dependent?

My case of interest is more complicated, but I decided to create a toy
case to try to understand.  Here it is:


  -- Table "foo" DDL

  CREATE TABLE "public"."foo"(

  "projectid" int4 NOT NULL ,

  "uid" int4 NOT NULL ,

  "name" varchar(254) NOT NULL ,

  "ver" varchar(127) NOT NULL ,

  "startdate" date NOT NULL ,

  "enddate" date NOT NULL ,

  "status" varchar(254) NOT NULL ,

  "percentdone" numeric(7,2) NOT NULL ,

  "championuid" int4 NOT NULL ,

  "pmuid" int4 NOT NULL ,

  PRIMARY KEY ("projectid")

  )  WITHOUT OIDS;


  -- Table "bignum" DDL

  CREATE TABLE "public"."bignum"(

  "thing" numeric(100) NOT NULL

  )  WITHOUT OIDS;

  CREATE INDEX "t" ON "public"."bignum" USING btree ("thing");


Running

    EXPLAIN ANALYZE SELECT A.* FROM bignum  B, foo  A WHERE A.projectid
    = B.thing;

yields:

    Nested Loop  (cost=0.00..15.51 rows=1 width=407) (actual time=0.041..0.041 rows=0 loops=1)
      Join Filter: ((a.projectid)::numeric = b.thing)
      ->  Seq Scan on bignum b  (cost=0.00..1.01 rows=1 width=16) (actual time=0.024..0.027 rows=1 loops=1)
      ->  Seq Scan on foo a  (cost=0.00..11.80 rows=180 width=407) (actual time=0.003..0.003 rows=0 loops=1)
    Total runtime: 0.169 ms

Like *how* *come*?  There are indexes on both columns of the join.  Is
it the NUMERIC datatype messing things up?  Unlikely, as I've seen the
same with INTEGERs.

If it is data dependent (these tables are presently empty), any
suggestions as to how to tune a database for unknown mixes of data?

This is run on the Windows version of PG, but I'm seeing the same kind
of thing on Linux.

Thanks.

Re: usage of indexes for inner joins

From
Tom Lane
Date:
"Jan Theodore Galkowski" <bayesianlogic@acm.org> writes:
>     Total runtime: 0.169 ms

> Like *how* *come*?

You have a problem with 0.1 ms runtime?

But to correct your obvious misunderstanding: yes, the plan depends on
the table size, as well it should.

            regards, tom lane

Re: usage of indexes for inner joins

From
"Ben Trewern"
Date:
A sequential scan of an empty table is going to be faster than an index scan,
so the database uses the sequential scan.  Put some data in the tables (some
thousands or millions of records) and then see if it uses an index scan.
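As a rough sketch of that experiment (the generate_series load, the 100,000-row
count, and the projectid value are illustrative, not from the original post):

    -- Fill foo with enough rows that skipping most of the table pays off,
    -- then refresh the planner's statistics.
    INSERT INTO foo
    SELECT i, i, 'name ' || i, '1.0', current_date, current_date,
           'open', 0.00, i, i
    FROM generate_series(1, 100000) AS i;

    ANALYZE foo;

    -- A selective lookup should now typically show an Index Scan using the
    -- primary key on foo instead of a Seq Scan.
    EXPLAIN ANALYZE SELECT * FROM foo WHERE projectid = 12345;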

Ben

""Jan Theodore Galkowski"" <bayesianlogic@acm.org> wrote in message
news:1190954508.31020.1213039025@webmail.messagingengine.com...
>I fear this has been asked many times about PostgreSQL, and I have read
> the docs about how indexes are supposed to be defined and used, but I
> don't understand why the engine and optimizer is doing what it does in
> the simplest of situations.  Is it that its tuning is heavily data
> dependent?
>
> My case of interest is more complicated, but I decided to create a toy
> case to try to understand.  Here it is:
>
>
>  -- Table "foo" DDL
>
>  CREATE TABLE "public"."foo"(
>
>  "projectid" int4 NOT NULL ,
>
>  "uid" int4 NOT NULL ,
>
>  "name" varchar(254) NOT NULL ,
>
>  "ver" varchar(127) NOT NULL ,
>
>  "startdate" date NOT NULL ,
>
>  "enddate" date NOT NULL ,
>
>  "status" varchar(254) NOT NULL ,
>
>  "percentdone" numeric(7,2) NOT NULL ,
>
>  "championuid" int4 NOT NULL ,
>
>  "pmuid" int4 NOT NULL ,
>
>  PRIMARY KEY ("projectid")
>
>  )  WITHOUT OIDS;
>
>
>  -- Table "bignum" DDL
>
>  CREATE TABLE "public"."bignum"(
>
>  "thing" numeric(100) NOT NULL
>
>  )  WITHOUT OIDS;
>
>  CREATE INDEX "t" ON "public"."bignum" USING btree ("thing");
>
>
> Running
>
>    EXPLAIN ANALYZE SELECT A.* FROM bignum  B, foo  A WHERE A.projectid
>    = B.thing;
>
> yields:
>
>    Nested Loop  (cost=0.00..15.51 rows=1 width=407) (actual
>    time=0.041..0.041 rows=0 loops=1)
>
>      Join Filter: ((a.projectid)::numeric = b.thing)  ->
>
>        Seq Scan on bignum b (cost=0.00..1.01 rows=1 width=16) (actual
>        time=0.024..0.027 rows=1 loops=1)  ->
>
>        Seq Scan on foo a  (cost=0.00..11.80 rows=180 width=407) (actual
>        time=0.003..0.003 rows=0 loops=1)
>
>    Total runtime: .169 ms ;
>
> Like *how* *come*?  There are indexes on both columns of the join.  Is
> it the NUMERIC datatype messing things up?  Unlikely, as I've seen the
> same with INTEGERs.
>
> If it is data dependent (these tables are presently empty), any
> suggestions as to how to tune a database for unknown mixes of data?
>
> This is run on the Windows version of PG, but I'm seeing the same kind
> of thing on Linux.
>
> Thanks.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>



Re: usage of indexes for inner joins

From
"Scott Marlowe"
Date:
On 9/27/07, Jan Theodore Galkowski <bayesianlogic@acm.org> wrote:
> I fear this has been asked many times about PostgreSQL, and I have read
> the docs about how indexes are supposed to be defined and used, but I
> don't understand why the engine and optimizer are doing what they do in
> the simplest of situations.  Is it that its tuning is heavily data
> dependent?

This has in fact been discussed many times on the mailing list.
Searching the archives will probably return lots of discussions.

> Like *how* *come*?  There are indexes on both columns of the join.  Is
> it the NUMERIC datatype messing things up?  Unlikely, as I've seen the
> same with INTEGERs.

PostgreSQL doesn't store "visibility" information in indexes.  This
means that once you find an entry in an index, you still have to check
whether it's visible to the current transaction, and that information is
stored only in the tables.  There are lots of discussions in the archives
of why that is as well; basically, race conditions make it impossible to
update the table and index concurrently without ugly locking issues
popping up.

So, in pgsql, whether there's an index or not, the db has to hit the
table in the end.
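For instance (a sketch; on releases of this era the plan is typically a plain
Seq Scan even though projectid is indexed, because every row's visibility has
to be checked in the heap anyway):

    -- Counting rows can't be answered from the index alone: each row's
    -- visibility must be checked in the table itself.
    EXPLAIN ANALYZE SELECT count(*) FROM foo;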

> If it is data dependent (these tables are presently empty), any
> suggestions as to how to tune a database for unknown mixes of data?

Not exactly; it's range (i.e. selectivity) dependent.  If you had a
selective enough WHERE clause, then PostgreSQL would choose an index
over a sequential scan.

Your biggest mistake here is thinking the simple solution (use
indexes) is always best.  PostgreSQL uses a cost-based planner that
tries to decide ahead of time which plan is going to be fastest.  The
real answer is to give it good information, i.e. run ANALYZE frequently
enough and keep a high enough statistics target for the column(s)
you're using.

That means pgsql is paying attention to how big your tables are, as
well as what values are in them and what percentage of the rows you're
going to get back.

Use explain analyze to see the differences between what the planner
expects and what it gets.  Like this part of your explain analyze
output:


       Seq Scan on foo a  (cost=0.00..11.80 rows=180 width=407) (actual
       time=0.003..0.003 rows=0 loops=1)

Note that the planner expected 180 rows but got 0; that's a sign of
poor stats.  Run ANALYZE and you should see something closer to a
match between expected and actual rows.
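For example (the statistics target of 200 is purely illustrative):

    -- Raise the sample size for a column whose distribution the planner
    -- keeps misjudging, then refresh the statistics.
    ALTER TABLE foo ALTER COLUMN projectid SET STATISTICS 200;
    ANALYZE foo;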

Also, try putting some real data in your db, and using a where clause
(unless you really are gonna grab every single row every time...)

Re: usage of indexes for inner joins

From
"Scott Marlowe"
Date:
On 10/1/07, Jan Theodore Galkowski <bayesianlogic@acm.org> wrote:
> Scott,
>
> i didn't think this belonged in the general list, but the example i gave
> for discussion was a toy, for illustration.  i could not very well post
> the actual example for many reasons, including proprietary ones and,
> given this is how things work, because the 1.5 million row table in
> question is its own smallest description.

This is the exact kind of question that belongs on -general.  But it
does get asked a lot, especially by people coming from other
databases.

> while indexes are being used on that table, there's a companion table
> which is much smaller -- a "mere" 75000 rows -- which is suffering a
> sequential scan, and i was trying to eliminate those.

Well, don't think of sequential scans as plain bad.  Sometimes they're
the best choice, sometimes they're not.

Now, if an index scan is provably and noticeably faster than the
sequential scan, then the planner is making the wrong decision.  Have
you tried running your query with

set enable_seqscan=off;

to see how it behaves?  I've found many somewhat slow queries got
really fast or really slow when I did that.

Note that you shouldn't blindly run a query all the time with that
setting, as there are many instances where seqscan is the right
answer.  Also, your explain cost estimates will all be way off.
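A minimal sketch of that test, assuming it's done in a single session so
nothing else is affected:

    -- Discourage sequential scans for this session only, compare the plan
    -- and timing, then put the setting back.
    SET enable_seqscan = off;
    EXPLAIN ANALYZE SELECT A.* FROM bignum B, foo A WHERE A.projectid = B.thing;
    RESET enable_seqscan;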

> perhaps it is true that ANALYZE isn't being done often enough.  perhaps
> VACUUMs aren't being done often enough either.  we're leery of
> scheduling repeated VACUUMs, having encountered a case where the VACUUM
> took over an hour to complete.

Run "analyze verbose" on your db and see what it says about number of
page slots needed versus used.  that will help you tell if you're
vacuuming enough.

How long vacuum takes isn't really that important.  What is important
is how much of an impact it's having on the system.  There are several
vacuum parameters in postgresql.conf that can lower the I/O impact
vacuum has on your system while increasing its run time.
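For example, a sketch using the session-level cost-based vacuum delay settings
(the values are illustrative, not tuned recommendations):

    -- Throttle this manual VACUUM: once enough page cost accumulates it
    -- sleeps briefly, trading a longer run time for less I/O pressure.
    SET vacuum_cost_delay = 10;    -- milliseconds; 0 disables the throttling
    SET vacuum_cost_limit = 200;   -- cost accumulated before each sleep
    VACUUM VERBOSE foo;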

Vacuum full is another story.  Think of it as a recovery tool, not a
periodic maintenance tool.

> it may, too, be because the tables use user-defined types heavily and
> the original UPDATE involved a SELECT ... IN ... having a GROUP BY with
> a few references to columns deep within user-defined types.

Hard to say without the query and its explain analyze output.  It's
common for user-defined functions to produce planner estimates that
are way off; user-defined types, not so much.  But the more complex
the query, the more likely it is that the planner will make a bad
estimate of the number of rows somewhere and choose a bad method.

>  that
> wouldn't have been my choice, but, then, they were given to me to work,
> not my design.  in fact, PG is the first relational database
> implementation i've used that offered such things in a big way.

Extensibility is quite a useful tool.

> i also don't understand some other things, which are surprising, like
> why some UPDATEs take so much longer when wrapped in a BEGIN
> TRANSACTION-
> COMMIT than when having the transaction at a statement level.

that is strange.  I'd expect that maybe you've got something happening
with the transaction waiting on other transactions, so that it's not
so much running hard as just tapping its toe waiting for the other
transaction to commit or roll back.
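One quick way to check for that while the slow UPDATE is running (a sketch
against the standard pg_locks view):

    -- Any rows here are lock requests still waiting on another transaction.
    SELECT locktype, relation::regclass, pid, mode, granted
    FROM pg_locks
    WHERE NOT granted;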

> I come from an Oracle, DB2, Informix world, and in my experience
> plans for queries are more stable.   i have loitered in and around MySQL
> for a while.   i'm not surprised there's a learning curve with PG.  i am
> surprised it breaks so markedly with mainstay database experience.

Oh, I've seen Oracle get stupid due to lack of proper statistics as
well.  You likely had a good DBA who kept all that stuff hidden from
you, though.  But PostgreSQL and mainline db experience are often
incompatible.  The very things that people are used to using to make
other dbs fast (forcing index usage, for instance) can make PostgreSQL
noticeably slower.

You might find that a partial index helps in some circumstances.  If
you have a query whose WHERE clause looks at a field that has one value
99% of the time and another value 1% of the time, you can index that 1%
only, and an index scan on it will be ultra quick.  The standard case
for that is a boolean field.

create table test (id int, info text, btest bool);
-- load 100,000 rows, roughly 1% with btest = true and the rest false
insert into test
  select i, 'info ' || i, (i % 100 = 0)
  from generate_series(1, 100000) as i;
create index test_btest_true on test (btest) where btest IS TRUE;
analyze test;
explain analyze select * from test where btest is true;

Generally, PostgreSQL offers different ways to solve the same problems
as other databases, and knowing those ways can really help you
troubleshoot and fix poorly performing queries.

Re: usage of indexes for inner joins

From
"Jan Theodore Galkowski"
Date:
thanks for all your useful comments.  i will study all of them.

a couple of inline comments below, just for clarification to the group,
marked with asterisks.

On Mon, 1 Oct 2007 13:13:23 -0500, "Scott Marlowe"
<scott.marlowe@gmail.com> said:
> Now, if an index scan is provably and noticeably faster than the
> sequential scan, then the planner is making the wrong decision.  Have
> you tried running your query with
>
> set enable_seqscan=off;

***actually, yes.  the engine just ignored it.***
>
> > i also don't understand some other things, which are surprising,
> > like why some UPDATEs take so much longer when wrapped in a BEGIN
> > TRANSACTION- COMMIT than when having the transaction at a statement
> > level.
>
> that is strange.  I'd expect that maybe you've got something happening
> with the transaction waiting on other transactions, so that it's not
> so much running hard as just tapping its toe waiting for the other
> transaction to commit or roll back.

*** yes, i thought it was odd, too.  there wasn't anything else in that
transaction, and the table was set up for an experiment.  of course, the
experiment was one of those "UPDATE foo SET x = 1 + x WHERE y < k"
things. ***
>
> Oh, I've seen Oracle get stupid due to lack of proper statistics as
> well.  You likely had a good DBA who kept all that stuff hidden from
> you, though.

No comment, actually, since I worked right alongside the DBA, and
sometimes did things myself.
