Re: PLEASE GOD HELP US! - Mailing list pgsql-admin

From Uwe C. Schroeder
Subject Re: PLEASE GOD HELP US!
Date
Msg-id 200410010948.52351.uwe@oss4u.com
Whole thread Raw
In response to PLEASE GOD HELP US!  ("Shane | SkinnyCorp" <shanew@skinnycorp.com>)
List pgsql-admin

On Friday 01 October 2004 09:04 am, you wrote:

>         SQL:  SELECT * FROM thread_listing AS t ORDER BY t.status=5
> DESC,t.lastreply desc LIMIT 25 OFFSET 0
>         Num Rows:    25
>         Affected Rows:    0
>         Exec Time:  9.1602659225464

EXPLAIN is your friend here.
Well, the above query is fine for MySQL, which is a fast reader, but I bet it
doesn't use an index at all. You're selecting ALL rows from the table, sorting
ALL of them, and only retrieving 25. That means that with 1 million rows in
the table, postgres has to sort 1 million rows just to return the latest 25.
Maybe you want to rethink the approach. Databases are designed to pull
specific records out of a large set, and that is what most databases are good
at. Design the query so that a WHERE clause isolates the records you want.
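
To see what the planner actually does, run EXPLAIN ANALYZE on the query
exactly as quoted (table and column names taken from your post):

explain analyze
select * from thread_listing as t
order by t.status=5 desc, t.lastreply desc
limit 25 offset 0;

If the plan shows a Seq Scan feeding a Sort over the whole table, no index
is being used.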
In the above query you could already put an index on thread_listing.status and
write:

select * from thread_listing where status=5
order by lastreply desc limit 25 offset 0

That's still not ideal, but it only takes the rows with status=5 into
consideration. If only 20% of your records have status=5, it no longer sorts
1 million records, but only 200,000.
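
For example, something along these lines (the index name is arbitrary, and a
composite index is only one way to do it):

create index thread_listing_status_lastreply_idx
    on thread_listing (status, lastreply);

With that index the planner can read the matching rows in lastreply order
within status=5 and stop after 25, instead of sorting everything. A partial
index (create index thread_listing_lastreply_part_idx on thread_listing
(lastreply) where status=5) does the same job if status=5 is the only value
you ever filter on.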

Your problem is not the speed of the DB, it's bad query design (and possibly
bad database design that doesn't use keys and indexes properly). Indexes make
write operations slower, but in your case that doesn't matter: nobody will
post 1000 threads in a second, while many people may want to read them, and
that's where an index speeds reads up dramatically.
Example: I have a data warehouse on postgres. One particular operation stores
32 records to one table and afterwards reads back the records just stored.
Without indexes the read took about 20 seconds, because a lot of joins to
other tables were involved. After I properly indexed the keys used to join to
the other tables, the execution time went down to about a second. And the
database isn't even optimized for my application; it's a vanilla
out-of-the-box configuration in postgresql.conf (except for increased shared
memory to allow more simultaneous connections). I bet I could get the
execution time below half a second if I cared to.
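
For reference, that kind of change is just a couple of lines in
postgresql.conf (the values below are illustrative, not the ones from my
setup):

# postgresql.conf
max_connections = 200     # allow more simultaneous connections
shared_buffers = 2000     # shared memory buffers; raise along with max_connections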


UC

--
Open Source Solutions 4U, LLC    2570 Fleetwood Drive
Phone:  +1 650 872 2425        San Bruno, CA 94066
Cell:   +1 650 302 2405        United States
Fax:    +1 650 872 2417

