Thread: Dissapearing indexes, what's that all about?

Dissapearing indexes, what's that all about?

From
Daniel ?erud
Date:
Hey people,
have a very strange problem now...
It's that when my database with the script i'm using a
explain select * from acc where username='britta' yields an
index scan. Very nice. But after some time being up the
same thing yields a SEQUENTIAL scan. What's happening
folks? =) No more than update's select's and vacuums has
been invoked.

Thanks! *hug*

Daniel Åkerud


Re: Dissapearing indexes, what's that all about?

From
"Len Morgan"
Date:
>Hey people,
>have a very strange problem now...
>It's that when my database with the script i'm using a
>explain select * from acc where username='britta' yields an
>index scan. Very nice. But after some time being up the
>same thing yields a SEQUENTIAL scan. What's happening
>folks? =) No more than update's select's and vacuums has
>been invoked.

Are you doing vacuum or vacuum analyze?  I believe only the 'vacuum analyze'
function cleans up the indexing statistics.

len


Re: Re: Dissapearing indexes, what's that all about?

From
Daniel ?erud
Date:

> >Hey people,
> >have a very strange problem now...
> >It's that when my database with the script i'm using a
> >explain select * from acc where username='britta' yields
an
> >index scan. Very nice. But after some time being up the
> >same thing yields a SEQUENTIAL scan. What's happening
> >folks? =) No more than update's select's and vacuums has
> >been invoked.
>
> Are you doing vacuum or vacuum analyze?  I believe only
the 'vacuum analyze'
> function cleans up the indexing statistics.
>
> len
>

Please look at this:
I just ran an interesting little test here...
watch this:

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan using acc_username_key on acc...

VACUUM;

explain select * from acc where username='britta';

Seq Scan on acc

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan

VACUUM ANALYZE

explain select * from acc where username='britta';

Seq scan

What the wakk is goin on here people? =)

version: PostgreSQL 7.0.0 on i686-pc-linux-gnu, compiled by
gcc 2.95.2

Thanks

Daniel Åkerud


RE: Re: Dissapearing indexes, what's that all about?

From
Mike Mascari
Date:
What are the costs associated with the EXPLAIN output? Perhaps a sequential scan is *faster* then an index scan.

Mike Mascari
mascarm@mascari.com

-----Original Message-----
From:    Daniel ?erud [SMTP:zilch@home.se]
Sent:    Sunday, April 01, 2001 12:31 PM
To:    pgsql-general@postgresql.org
Subject:    Re: Re: [GENERAL] Dissapearing indexes, what's that all about?



> >Hey people,
> >have a very strange problem now...
> >It's that when my database with the script i'm using a
> >explain select * from acc where username='britta' yields
an
> >index scan. Very nice. But after some time being up the
> >same thing yields a SEQUENTIAL scan. What's happening
> >folks? =) No more than update's select's and vacuums has
> >been invoked.
>
> Are you doing vacuum or vacuum analyze?  I believe only
the 'vacuum analyze'
> function cleans up the indexing statistics.
>
> len
>

Please look at this:
I just ran an interesting little test here...
watch this:

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan using acc_username_key on acc...

VACUUM;

explain select * from acc where username='britta';

Seq Scan on acc

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan

VACUUM ANALYZE

explain select * from acc where username='britta';

Seq scan

What the wakk is goin on here people? =)

version: PostgreSQL 7.0.0 on i686-pc-linux-gnu, compiled by
gcc 2.95.2

Thanks

Daniel Akerud


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster


Re: RE: Re: Dissapearing indexes, what's that all about?

From
Daniel ?erud
Date:
after a refresh database the explain yields:
index scan using xXxX (cost=0.00..8.14 rows=10 width=147)
after a vacuum + vacuum analyze the explain yields:
seq scan on acc xXxX Ä(cost=0.00..1.23 rows=1 width=147)

humm, seems you are right here... but why is it choosing a
index scan in the first place then?

> What are the costs associated with the EXPLAIN output?
Perhaps a sequential scan is *faster* then an index scan.
>
> Mike Mascari
> mascarm@mascari.com
>
> -----Original Message-----
> From:    Daniel ?erud [SMTP:zilch@home.se]
> Sent:    Sunday, April 01, 2001 12:31 PM
> To:    pgsql-general@postgresql.org
> Subject:    Re: Re: [GENERAL] Dissapearing indexes,
what's that all about?
>
>
>
> > >Hey people,
> > >have a very strange problem now...
> > >It's that when my database with the script i'm using a
> > >explain select * from acc where username='britta'
yields
> an
> > >index scan. Very nice. But after some time being up the
> > >same thing yields a SEQUENTIAL scan. What's happening
> > >folks? =) No more than update's select's and vacuums
has
> > >been invoked.
> >
> > Are you doing vacuum or vacuum analyze?  I believe only
> the 'vacuum analyze'
> > function cleans up the indexing statistics.
> >
> > len
> >
>
> Please look at this:
> I just ran an interesting little test here...
> watch this:
>
> <DROP DB><CREATE DB><RUN DB SCRIPT>
>
> explain select * from acc where username='britta';
>
> Index scan using acc_username_key on acc...
>
> VACUUM;
>
> explain select * from acc where username='britta';
>
> Seq Scan on acc
>
> <DROP DB><CREATE DB><RUN DB SCRIPT>
>
> explain select * from acc where username='britta';
>
> Index scan
>
> VACUUM ANALYZE
>
> explain select * from acc where username='britta';
>
> Seq scan
>
> What the wakk is goin on here people? =)
>
> version: PostgreSQL 7.0.0 on i686-pc-linux-gnu, compiled
by
> gcc 2.95.2
>
> Thanks
>
> Daniel Akerud
>
>
> ---------------------------(end of broadcast)-------------
--------------
> TIP 4: Don't 'kill -9' the postmaster
>
>



RE: RE: Re: Dissapearing indexes, what's that all about?

From
Mike Mascari
Date:
If I recall correctly, when the optimizer was changed (greatly enhanced),
there was a debate about what the default behavior should be. The problem
was that a large number of users would populate they're database after
index creation and see sluggishness because the statistics had not yet been
updated vs. the much smaller number of users that would suffer at the hands
of an index scan against a table that would be better served with a
sequential scan. I *think* the result of assuming 0 rows in a newly created
table, until the next vacuum, would yield a significant increase in
mailing-list traffic complaints to the tune of:

"Why isn't PostgreSQL using my index?"

followed by the usual

"Did you run VACUUM ANALYZE?"

So an assumption of 1000 rows was made,  with 10 rows matching your WHERE
clause.

Mike Mascari
mascarm@mascari.com

-----Original Message-----
From:    Daniel ?erud [SMTP:zilch@home.se]
Sent:    Sunday, April 01, 2001 12:43 PM
To:    pgsql-general@postgresql.org
Subject:    Re: RE: Re: [GENERAL] Dissapearing indexes, what's that all about?


after a refresh database the explain yields:
index scan using xXxX (cost=0.00..8.14 rows=10 width=147)
after a vacuum + vacuum analyze the explain yields:
seq scan on acc xXxX A(cost=0.00..1.23 rows=1 width=147)

humm, seems you are right here... but why is it choosing a
index scan in the first place then?

> What are the costs associated with the EXPLAIN output?
Perhaps a sequential scan is *faster* then an index scan.
>
> Mike Mascari
> mascarm@mascari.com


Re: RE: RE: Re: Dissapearing indexes, what's that all about?

From
Daniel ?erud
Date:
I appriciate all the help i've gotten here...
anyway,
creating a table:

CREATE TABLE index_with (
  id SERIAL,
  name text
);

CREATE INDEX name_index ON index_with(name);

and filling it with 10000 rows made out of

$pwgen 8 10000 > data [enter]

and then running VACUUM and VACUUM ANALYZE
still yields a sequential scan doing a
select * from index_with where name > 'm';
namely
seq scan on index_with (cost=0.00..189 rows 5170 width=16)

Sorry to bother ýou guys this much.

Daniel Åkerud

> If I recall correctly, when the optimizer was changed
(greatly enhanced),
> there was a debate about what the default behavior should
be. The problem
> was that a large number of users would populate they're
database after
> index creation and see sluggishness because the
statistics had not yet been
> updated vs. the much smaller number of users that would
suffer at the hands
> of an index scan against a table that would be better
served with a
> sequential scan. I *think* the result of assuming 0 rows
in a newly created
> table, until the next vacuum, would yield a significant
increase in
> mailing-list traffic complaints to the tune of:
>
> "Why isn't PostgreSQL using my index?"
>
> followed by the usual
>
> "Did you run VACUUM ANALYZE?"
>
> So an assumption of 1000 rows was made,  with 10 rows
matching your WHERE
> clause.
>
> Mike Mascari
> mascarm@mascari.com
>
> -----Original Message-----
> From:    Daniel ?erud [SMTP:zilch@home.se]
> Sent:    Sunday, April 01, 2001 12:43 PM
> To:    pgsql-general@postgresql.org
> Subject:    Re: RE: Re: [GENERAL] Dissapearing indexes,
what's that all about?
>
>
> after a refresh database the explain yields:
> index scan using xXxX (cost=0.00..8.14 rows=10 width=147)
> after a vacuum + vacuum analyze the explain yields:
> seq scan on acc xXxX A(cost=0.00..1.23 rows=1 width=147)
>
> humm, seems you are right here... but why is it choosing a
> index scan in the first place then?
>
> > What are the costs associated with the EXPLAIN output?
> Perhaps a sequential scan is *faster* then an index scan.
> >
> > Mike Mascari
> > mascarm@mascari.com
>
>
> ---------------------------(end of broadcast)-------------
--------------
> TIP 2: you can get off all lists at once with the
unregister command
>     (send "unregister YourEmailAddressHere" to
majordomo@postgresql.org)
>



Re: Dissapearing indexes, what's that all about?

From
Tom Lane
Date:
Mike Mascari <mascarm@mascari.com> writes:
> I *think* the result of assuming 0 rows in a newly created
> table, until the next vacuum, would yield a significant increase in
> mailing-list traffic complaints to the tune of:
> "Why isn't PostgreSQL using my index?"
> followed by the usual
> "Did you run VACUUM ANALYZE?"
> So an assumption of 1000 rows was made,  with 10 rows matching your WHERE
> clause.

Yup, exactly.  The initial default statistics are set (with malice
aforethought) to provoke an indexscan.  After you VACUUM, the optimizer
knows how large the table really is (ie, tiny), and so it decides that
looking at the index is a waste of time, it might as well just scan the
table.  Load up some more data, VACUUM again, and you'll probably see an
indexscan used.

> after a refresh database the explain yields:
> index scan using xXxX (cost=0.00..8.14 rows=10 width=147)
> after a vacuum + vacuum analyze the explain yields:
> seq scan on acc xXxX A(cost=0.00..1.23 rows=1 width=147)

BTW, comparing those two cost numbers is pretty pointless since they are
based on different information about the size of the table.

            regards, tom lane

Re: Re: Dissapearing indexes, what's that all about?

From
Daniel ?erud
Date:
Cool!!
Can you guess where the limit is?
ten thousand is not enought obviously, and putting 10000
more in there takes 10 minutes... even on a clean database.
That is another problem however, bevuase on a slower
machine it takes 13 seconds

> Mike Mascari <mascarm@mascari.com> writes:
> > I *think* the result of assuming 0 rows in a newly
created
> > table, until the next vacuum, would yield a significant
increase in
> > mailing-list traffic complaints to the tune of:
> > "Why isn't PostgreSQL using my index?"
> > followed by the usual
> > "Did you run VACUUM ANALYZE?"
> > So an assumption of 1000 rows was made,  with 10 rows
matching your WHERE
> > clause.
>
> Yup, exactly.  The initial default statistics are set
(with malice
> aforethought) to provoke an indexscan.  After you VACUUM,
the optimizer
> knows how large the table really is (ie, tiny), and so it
decides that
> looking at the index is a waste of time, it might as well
just scan the
> table.  Load up some more data, VACUUM again, and you'll
probably see an
> indexscan used.
>
> > after a refresh database the explain yields:
> > index scan using xXxX (cost=0.00..8.14 rows=10
width=147)
> > after a vacuum + vacuum analyze the explain yields:
> > seq scan on acc xXxX A(cost=0.00..1.23 rows=1 width=147)
>
> BTW, comparing those two cost numbers is pretty pointless
since they are
> based on different information about the size of the
table.
>
>             regards, tom lane
>



Re: Dissapearing indexes, what's that all about?

From
Tom Lane
Date:
Daniel ?erud <zilch@home.se> writes:
> and filling it with 10000 rows made out of
> $pwgen 8 10000 > data [enter]
> and then running VACUUM and VACUUM ANALYZE
> still yields a sequential scan doing a
> select * from index_with where name > 'm';
> namely
> seq scan on index_with (cost=0.00..189 rows 5170 width=16)

So?  You're asking it to retrieve over half of the table (or at least
the planner estimates so, and I don't see any evidence here that its
estimate is wildly off).  An indexscan would still be a loser in this
scenario.

If you want to see an indexscan with an inequality query, try giving
it a reasonably tight range.  Probably

select * from index_with where name > 'm' and name < 'n';

would use the index in this example.

            regards, tom lane

Re: Re: Dissapearing indexes, what's that all about?

From
Daniel ?erud
Date:
Wohooo,
deluxe :-)

THANKS EVERYBODY!!

Can't see the logic behind that though
The jump in the b-tree must save about 5000 checks... half
the table??

Thanks!

Daniel Åkerud

> Daniel ?erud <zilch@home.se> writes:
> > and filling it with 10000 rows made out of
> > $pwgen 8 10000 > data [enter]
> > and then running VACUUM and VACUUM ANALYZE
> > still yields a sequential scan doing a
> > select * from index_with where name > 'm';
> > namely
> > seq scan on index_with (cost=0.00..189 rows 5170
width=16)
>
> So?  You're asking it to retrieve over half of the table
(or at least
> the planner estimates so, and I don't see any evidence
here that its
> estimate is wildly off).  An indexscan would still be a
loser in this
> scenario.
>
> If you want to see an indexscan with an inequality query,
try giving
> it a reasonably tight range.  Probably
>
> select * from index_with where name > 'm' and name < 'n';
>
> would use the index in this example.
>
>             regards, tom lane
>



Re: Dissapearing indexes, what's that all about?

From
Tom Lane
Date:
Daniel ?erud <zilch@home.se> writes:
> Can't see the logic behind that though
> The jump in the b-tree must save about 5000 checks... half
> the table??

CPUs are fast.  Disks are slow.  If you think about CPU time rather
than disk accesses, you will usually draw the wrong conclusions.

Even more to the point, disks do not like random access.  A seqscan
can typically fetch four or more sequential blocks from disk in the
time it takes an indexscan to fetch one block on a random-access basis.

When you do the math it turns out seqscan wins unless you are fetching
just a small percentage of the rows.

            regards, tom lane