Re: Slow count(*) again... - Mailing list pgsql-performance
From | Reid Thompson |
---|---|
Subject | Re: Slow count(*) again... |
Date | |
Msg-id | 4CB206E0.6080707@ateb.com Whole thread Raw |
In response to | Re: Slow count(*) again... (Reid Thompson <reid.thompson@ateb.com>) |
List | pgsql-performance |
On 10/10/2010 11:02 AM, Reid Thompson wrote: >>>> On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel<neil.whelchel@gmail.com> >>>> >> On the other hand, I copied a table out of one of my production servers that >> has about 60,000 rows with 6 columns (numeric, numeric, bool, bool, timestamp, >> text). The first numeric column has numbers evenly spread between 0 and 100 >> and it is indexed. I put the table in a pair of database servers both running >> on the same physical hardware. One server is Postgres, the other is a popular >> server (I am not mentioning names here). on Postgres: SELECT count(*) FROM >> table where column>50; takes about 8 seconds to run. The other database server >> took less than one second (about 25 ms) as it is using the index (I assume) to >> come up with the results. It is true that this is not a fair test because both >> servers were tested with their default settings, and the defaults for Postgres >> are much more conservative, however, I don't think that any amount of settings >> tweaking will bring them even in the same ball park. There has been discussion >> about the other server returning an incorrect count because all of the indexed >> rows may not be live at the time. This is not a problem for the intended use, >> that is why I suggested another function like estimate(*). It's name suggests >> that the result will be close, not 100% correct, which is plenty good enough >> for generating a list of results pages in most cases. I am faced with a very >> serious problem here. If the query to make a list of pages takes say 6 seconds >> and it takes another 6 seconds to generate a page of results, the customer is >> waiting 12 seconds. This is not going to work. If count made a quick estimate, >> say less than a second, and it took 6 seconds to come up with the actual >> results, I could live with that. Or if coming up with the window of results >> via (OFFSET and LIMIT) and returned the total number of rows that would have >> matched the query, then I would still have everything I need to render the >> page in a reasonable time. I really think that this needs to be addressed >> somewhere. It's not like I am the only one that does this. You see it nearly >> everywhere a long list of results is (expected to be) returned in a web site. >> Among the people I work with, this seems to be the most mentioned reason that >> they claim that they don't use Postgres for their projects. t anyway. > > How big is your DB? > How fast is your disk access? > Any chance disks/RAM can be addressed? > > My disk access is pitiful... > first run, 2.3 million rows.. 0m35.38s, subsequent runs.. real 0m2.55s > > rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting > count > --------- > 2340704 > (1 row) > > > real 0m35.38s > user 0m0.25s > sys 0m0.03s > > subsequent runs.... (count changes due to inserts.) > > rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting > count > --------- > 2363707 > (1 row) > > > real 0m2.70s > user 0m0.27s > sys 0m0.02s > rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting > count > --------- > 2363707 > (1 row) > > > real 0m2.55s > user 0m0.26s > sys 0m0.02s > rthompso@hw-prod-repdb1> time psql -c "select count(*) from my_production_table" reporting > count > --------- > 2363707 > (1 row) > > > real 0m2.50s > user 0m0.26s > sys 0m0.02s > > reporting=# SELECT pg_size_pretty(pg_total_relation_size('my_production_table')); > pg_size_pretty > ---------------- > 1890 MB > (1 row) > > forgot to note, my table schema is significantly larger. rthompso@hw-prod-repdb1> time psql -c "\d my_production_table_201010" reporting Table "public.my_production_table_201010" Column | Type | Modifiers -----------------------------+-----------------------------+---------------------------------------------------------------- | integer | not null default nextval('my_production_table_parent_id_seq'::regclass) | character varying(20) | | character(1) | | character varying(32) | | character varying(32) | | character varying(20) | | character varying(5) | | character varying(5) | | date | | character(1) | | character varying(32) | | character varying(32) | | character varying(32) | | character varying(2) | | character varying(10) | | character varying(10) | | character varying(32) | | character varying(7) | | character varying(10) | | character varying(2) | | character varying(9) | | character varying(9) | | character varying(9) | | character varying(10) | | character varying(32) | | character varying(32) | | character varying(20) | | character varying(5) | | character varying(5) | | character varying(32) | | character varying(32) | | character varying(32) | | character varying(2) | | character varying(10) | | character varying(10) | | character varying(10) | | character varying(10) | | integer | | character varying(2) | | character varying(32) | | character varying(32) | | integer | | integer | | text | | character varying(3) | | date | | date | | date | | integer | | integer | | integer | | integer | | character varying(6) | | character varying(10) | | character varying(32) | | character varying(32) | | character varying(32) | | character varying(10) | | character varying(6) | | character varying(8) | | boolean | | character(1) | | date | | integer | | date | | character varying(11) | | character varying(4) | | character(1) | | date | | character varying(5) | | character varying(20) | | date | | character(1) | | character(1) | | character varying(2) | | text | | integer | | integer | | timestamp without time zone | default now() | timestamp without time zone | | character varying(64) | | character varying(64) | | character varying(64) | Indexes: "my_production_table_201010_pkey" PRIMARY KEY, btree (id) "my_production_table_201010_date_idx" btree (xxxxdate), tablespace "indexspace" "my_production_table_201010_epatient_idx" btree (storeid, xxxxxxxxxxxxx), tablespace "indexspace" "my_production_table_201010_medicationname_idx" btree (xxxxxxxxxxxxxx), tablespace "indexspace" "my_production_table_201010_ndc_idx" btree (xxx), tablespace "indexspace" Check constraints: "my_production_table_201010_filldate_check" CHECK (xxxxdate >= '2010-10-01'::date AND xxxxdate < '2010-11-01'::date) Foreign-key constraints: "my_production_table_201010_pkgfileid_fkey" FOREIGN KEY (pkgfileid) REFERENCES my_production_tablefiles(id) Inherits: my_production_table_parent
pgsql-performance by date: