Thread: why postgresql is so slow?

why postgresql is so slow?

From

"Jinqiang Han"

Date:

20 September 2003, 06:32:22

hello, all.

I have a table about 2 million rows. when I run "select * from table1" in psql, it will take me about 10 minutes to get
theresult. I wonder if postgresql can immediately return result like db2.
 

After that I create a index on  a column named id. The time executing "selct * from table1 where id=10" in psql is much
fasterthan that of executing "select * from table1 where id <10" in psql. why?
 

By the way I'm using postgresql 7.3.

Who can tell me the reason. 3x.

Jinqiang Han

Re: why postgresql is so slow?

From

Richard Huxton

Date:

20 September 2003, 07:09:47

On Saturday 20 September 2003 10:38, Jinqiang Han wrote:
> hello, all.

This isn't really a hackers question - perhaps try the "general","sql" etc 
lists in future. This list is for questions about the source-code of PG.

> I have a table about 2 million rows. when I run "select * from table1" in
> psql, it will take me about 10 minutes to get the result. I wonder if
> postgresql can immediately return result like db2.

PG is returning *all* the rows you requested. If you want one row at a time, 
perhaps use a cursor.

> After that I create a index on  a column named id. The time executing
> "selct * from table1 where id=10" in psql is much faster than that of
> executing "select * from table1 where id <10" in psql. why?

Because there are more records returned? You don't supply enough information. 
If you'd like to repost to the general list, along with details of the tables 
and the output of EXPLAIN ANALYSE for your queries, I'll be glad to go into 
detail there.
--  Richard Huxton Archonet Ltd

Re: why postgresql is so slow?

From

Jeff

Date:

20 September 2003, 07:40:15

Jinqiang Han wrote:

> hello, all.
> 
> I have a table about 2 million rows. when I run "select * from table1" in
> psql, it will take me about 10 minutes to get the result. I wonder if
> postgresql can immediately return result like db2.

If you're executing in psql, it's probably trying to load the whole result
set into RAM at once, and then try to format it, and then pipe it through a
pager (i.e. more). That's not really a test of PostgreSQL, I would
recommend using a cursor or something so that psql doesn't give you a false
impression. 

> 
> After that I create a index on  a column named id. The time executing
> "selct * from table1 where id=10" in psql is much faster than that of
> executing "select * from table1 where id <10" in psql. why?
> 

Well, that depends. First, turn on stats collecting and run "VACUUM
ANALYZE". That will collect some data about your data which helps the
planner make a good choice.

My initial guess is that the second query is not using the index. That may
happen because if "id" is an INT column, than negatives are allowed, and
the planner figures that "id<10" only narrows it down to about half the
possible rows (in which case it would choose sequential scan). However, we
humans know that the "id" column is really a positive INT (or at least I
assume it is, based on your implication that "id<10" should be about as
fast as "id=10"), therefore "id<10" really means a max 10 rows out of 2M
(which means it should do an index scan). Turning on stats and running
VACUUM ANALYZE will give the planner the information that it needs to make
an intelligent choice.

If that doesn't help you, give us the table schema and run:
EXPLAIN ANALYZE SELECT * FROM table1 WHERE id=10
and then run:
EXPLAIN ANALYZE SELECT * FROM table1 WHERE id<10
and give us the output. 

Then run "VACUUM ANALYZE" every once in a while (depending on how fast your
data changes), like every night for instance. 

Regards,       Jeff Davis

Re: why postgresql is so slow?

From

Christopher Browne

Date:

20 September 2003, 11:40:16

Jeff <jdavis-pgsql@empires.org> wrote:
> Then run "VACUUM ANALYZE" every once in a while (depending on how
> fast your data changes), like every night for instance.

Consider VACUUM and ANALYZE somewhat separately.

You need to ANALYZE any time the distribution of the data changes.

You need to VACUUM any time a lot of data is invalidated by UPDATEs or
DELETEs.  (And when you VACUUM, it's cheap to throw in an ANALYZE.)

That turns into a variety of policies depending on what the pattern of
activity for a particular table is:
- If a table only ever gets INSERTs, you need only ever ANALYZE it,  as VACUUM should be useless.  (Caveat:  If INSERTs
couldget  rolled back due to other processing nuking transactions, that  leaves dead tuples for VACUUM to work
with...)
- A table that sees enormous numbers of updates (for instance, a list  of account balances) should be vacuumed REALLY
frequently.
- Any time you do a VACUUM, you might as well also do an ANALYZE.

pg_autovacuum applies these policies, so it's a reasonable thing to
use in 7.3/7.4...
-- 
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://www3.sympatico.ca/cbbrowne/
"One  of the  most dangerous things  in   the universe is  an ignorant
people with real   grievances. That  is   nowhere near as   dangerous,
however, as  an informed and  intelligent society with grievances. The
damage that vengeful intelligence can wreak, you cannot even imagine."
-- Miles Teg, Heretics of Dune

Re: why postgresql is so slow?

From

Neil Conway

Date:

20 September 2003, 12:35:05

On Sat, 2003-09-20 at 06:14, Jeff wrote:
> Well, that depends. First, turn on stats collecting and run "VACUUM
> ANALYZE". That will collect some data about your data which helps the
> planner make a good choice.

The statistics collector and the statistics collected by ANALYZE have
nothing to do with one another -- stats collector stats aren't used by
the query planner. So just running ANALYZE or VACUUM ANALYZE should be
sufficient.

-Neil