Thread: Performance of query
I have two tables in Postgres 9.2 on a Linux server with 8GB of RAM. The first table has 60 million records:
CREATE TABLE table1
(
id integer,
update date,
company character(35),
address character(35),
city character(20),
state character(2),
zip character(9),
phone character(10),
fips character(5),
tract character(6),
block character(4),
status character(1),
pre_title character(2),
contact character(35),
title character(20),
pstat character(1),
id integer NOT NULL,
pkone character(2),
pktwo character(2),
pkthree character(2),
pkfour character(2),
centract character(15),
CONSTRAINT table1_pkey PRIMARY KEY (id ),
CONSTRAINT fipsc FOREIGN KEY (fips)
REFERENCES fips (fips) MATCH FULL
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT statec FOREIGN KEY (state)
REFERENCES state (state) MATCH FULL
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT tractc FOREIGN KEY (centract)
REFERENCES tract (centract) MATCH FULL
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT zipc FOREIGN KEY (zip)
REFERENCES zip (zip) MATCH FULL
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
ALTER TABLE table1
OWNER TO postgres;
-- Index: statidx2
-- DROP INDEX statidx2;
CREATE INDEX statidx2
ON table1
USING btree
(state COLLATE pg_catalog."default" );
The second table just has the 51 state records:
CREATE TABLE state
(
state character(2) NOT NULL,
state_name character(15),
CONSTRAINT state_pkey PRIMARY KEY (state )
)
WITH (
OIDS=FALSE
);
ALTER TABLE state
OWNER TO postgres;
-- Index: stateidx
-- DROP INDEX stateidx;
CREATE UNIQUE INDEX stateidx
ON state
USING btree
(state COLLATE pg_catalog."default" );
When I run this query:
select state.state, count(table1.id) from state,table1 where table1.state = state.state group by state.state
It takes almost 4 minutes with this output from explain:
"HashAggregate (cost=7416975.58..7416976.09 rows=51 width=7) (actual time=284891.955..284891.964 rows=51 loops=1)"
" -> Hash Join (cost=2.15..7139961.94 rows=55402728 width=7) (actual time=0.049..269049.678 rows=60057057 loops=1)"
" Hash Cond: (busbase.state = state.state)"
" -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728 width=7) (actual time=0.004..250046.673 rows=60057057 loops=1)"
" -> Hash (cost=1.51..1.51 rows=51 width=3) (actual time=0.032..0.032 rows=51 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 2kB"
" -> Seq Scan on state (cost=0.00..1.51 rows=51 width=3) (actual time=0.003..0.012 rows=51 loops=1)"
"Total runtime: 284892.024 ms"
I've tried playing around with the settings in the config file for shared_buffers, work_mem, etc restarting Postgres each time and nothing seems to help.
Thanks for any help.
On 03/22/2013 12:46 PM, Cindy Makarowsky wrote: > I've tried playing around with the settings in the config file for > shared_buffers, work_mem, etc restarting Postgres each time and nothing > seems to help. Well, you're summarizing 55 million rows on an unindexed table: " -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728 width=7) (actual time=0.004..250046.673 rows=60057057 loops=1)" ... that's where your time is going. My only suggestion would be to create a composite index which matches the group by condition on table1, and vacuum freeze the whole table so that you can use index-only scan on 9.2. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
But, I do have an index on Table1 on the state field which is in my group by condition:
I have vacuumed the table too.
CREATE INDEX statidx2
ON table1
USING btree
(state COLLATE pg_catalog."default" );
On Fri, Mar 22, 2013 at 5:13 PM, Josh Berkus <josh@agliodbs.com> wrote:
On 03/22/2013 12:46 PM, Cindy Makarowsky wrote:Well, you're summarizing 55 million rows on an unindexed table:
> I've tried playing around with the settings in the config file for
> shared_buffers, work_mem, etc restarting Postgres each time and nothing
> seems to help.... that's where your time is going.
" -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728
width=7) (actual time=0.004..250046.673 rows=60057057 loops=1)"
My only suggestion would be to create a composite index which matches
the group by condition on table1, and vacuum freeze the whole table so
that you can use index-only scan on 9.2.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Hi,
there is something mixed..
your index is on table1....
Explain Analyze reports about table called: busbase....
Kind Regards,
Misa
2013/3/22 Cindy Makarowsky <cindymakarowsky@gmail.com>
But, I do have an index on Table1 on the state field which is in my group by condition:I have vacuumed the table too.CREATE INDEX statidx2ON table1USING btree(state COLLATE pg_catalog."default" );On Fri, Mar 22, 2013 at 5:13 PM, Josh Berkus <josh@agliodbs.com> wrote:On 03/22/2013 12:46 PM, Cindy Makarowsky wrote:Well, you're summarizing 55 million rows on an unindexed table:
> I've tried playing around with the settings in the config file for
> shared_buffers, work_mem, etc restarting Postgres each time and nothing
> seems to help.... that's where your time is going.
" -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728
width=7) (actual time=0.004..250046.673 rows=60057057 loops=1)"
My only suggestion would be to create a composite index which matches
the group by condition on table1, and vacuum freeze the whole table so
that you can use index-only scan on 9.2.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
I changed the name of the table for the post but forgot to change it in the results of the explain. Table1 is busbase.
On Fri, Mar 22, 2013 at 6:25 PM, Misa Simic <misa.simic@gmail.com> wrote:
Hi,there is something mixed..your index is on table1....Explain Analyze reports about table called: busbase....Kind Regards,Misa2013/3/22 Cindy Makarowsky <cindymakarowsky@gmail.com>But, I do have an index on Table1 on the state field which is in my group by condition:I have vacuumed the table too.CREATE INDEX statidx2ON table1USING btree(state COLLATE pg_catalog."default" );On Fri, Mar 22, 2013 at 5:13 PM, Josh Berkus <josh@agliodbs.com> wrote:On 03/22/2013 12:46 PM, Cindy Makarowsky wrote:Well, you're summarizing 55 million rows on an unindexed table:
> I've tried playing around with the settings in the config file for
> shared_buffers, work_mem, etc restarting Postgres each time and nothing
> seems to help.... that's where your time is going.
" -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728
width=7) (actual time=0.004..250046.673 rows=60057057 loops=1)"
My only suggestion would be to create a composite index which matches
the group by condition on table1, and vacuum freeze the whole table so
that you can use index-only scan on 9.2.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
On Friday, March 22, 2013, Cindy Makarowsky wrote:
I have two tables in Postgres 9.2 on a Linux server with 8GB of RAM. The first table has 60 million records:
You have over 40GB of data in that table, so there is no way you are going to get it into 8GB RAM without some major reorganization.
company character(35),address character(35),city character(20),contact character(35),title character(20),
All of those fixed width fields are probably taking up needless space, and in your case, space is time. Varchar would probably be better. (And probably longer maximum lengths as well. Most people don't need more than 35 characters for their addresses, but the people who do are going to be cheesed off when you inform them that you deem their address to be unreasonable. Unless your mailing labels only hold 35 characters)
When I run this query:select state.state, count(table1.id) from state,table1 where table1.state = state.state group by state.state
The join to the "state" table is not necessary. Between the foreign key and the primary key, you know that every state exists, and that every state exists only once. But, that will not solve your problem, as the join to the state table is not where the time goes.
" -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728 width=7) (actual time=0.004..250046.673 rows=60057057 loops=1)"
Assuming that your cost parameters are all default, this means you have
(6378172.28 - 0.01* 55402728)/1 = 5.8e6 pages, or 44.4 GB of table. That is, less than 10 tuples per page.
Tightly packed, you should be able to hold over 30 tuples per page. You are probably not vacuuming aggressively enough, or you were not doing so in the past and never did a "vacuum full" to reclaim the bloated space.
In any event, your sequential scan is running at 181 MB/s. Is this what you would expect given your IO hardware?
I've tried playing around with the settings in the config file for shared_buffers, work_mem, etc restarting Postgres each time and nothing seems to help.
How fast do you think it should run? How fast do you need it to run? This seems like the type of query that would get run once per financial quarter, or maybe once per day on off-peak times.
Cheers,
Jeff
Hi Jeff,
It seems my previous mail has not showed up in the list... copied/pasted again belloew
However, you said something important:
"The join to the "state" table is not necessary. Between the foreign key and the primary key, you know that every state exists, and that every state exists only once. But, that will not solve your problem, as the join to the state table is not where the time goes."
I think it is something what planner could/should be "aware off"... and discard the join
" Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual time=38.424..41992.070 rows=60057057 loops=1)"
" Merge Cond: (state.state = busbase.state)"
this part from bellow plan would save significant time if planner didn't decide to take this step at all ....
Kind regards,
Misa
"
Hi Cindy
TBH - I don't know...
I have added this to list so maybe someone else can help...
To recap:
from start situation (table structure and indexes are in the first mail in this thread)
EXPLAIN ANALYZE
SELECT busbase.state AS c0, count(busbase.id) AS m0 FROM busbase INNER JOIN state USING (state)
GROUP BY busbase.state
says:
"HashAggregate (cost=7416975.58..7416976.09 rows=51 width=7) (actual time=285339.465..285339.473 rows=51 loops=1)"
" -> Hash Join (cost=2.15..7139961.94 rows=55402728 width=7) (actual time=0.066..269527.934 rows=60057057 loops=1)"
" Hash Cond: (busbase.state = state.state)"
" -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728 width=7) (actual time=0.022..251029.307 rows=60057057 loops=1)"
" -> Hash (cost=1.51..1.51 rows=51 width=3) (actual time=0.028..0.028 rows=51 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 2kB"
" -> Seq Scan on state (cost=0.00..1.51 rows=51 width=3) (actual time=0.003..0.019 rows=51 loops=1)"
"Total runtime: 285339.516 ms"
on created composite index
CREATE INDEX comp_statidx2
ON busbase
USING btree
(state, id );
we got:
"GroupAggregate (cost=0.00..2610570.81 rows=51 width=3) (actual time=98.923..51033.888 rows=51 loops=1)"
" -> Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual time=38.424..41992.070 rows=60057057 loops=1)"
" Merge Cond: (state.state = busbase.state)"
" -> Index Only Scan using state_pkey on state (cost=0.00..13.02 rows=51 width=3) (actual time=0.008..0.148 rows=51 loops=1)"
" Heap Fetches: 51"
" -> Index Only Scan using comp_statidx2 on busbase (cost=0.00..1559558.68 rows=60057056 width=3) (actual time=38.408..12883.575 rows=60057057 loops=1)"
" Heap Fetches: 0"
"Total runtime: 51045.648 ms"
Question is - is it possible to improve it more?
"
I assume there are reasons not to throw away join to state. May be it still can be done as the last thing. This should help further:
SELECT counts.* FROM (
SELECT busbase.state AS state, count(busbase.id) AS m0 FROM busbase
GROUP BY busbase.state ) AS counts
INNER JOIN state USING (state)
Regards,
Roman Konoval
On Sun, Mar 24, 2013 at 12:27 AM, Misa Simic <misa.simic@gmail.com> wrote:
Hi Jeff,It seems my previous mail has not showed up in the list... copied/pasted again belloewHowever, you said something important:"The join to the "state" table is not necessary. Between the foreign key and the primary key, you know that every state exists, and that every state exists only once. But, that will not solve your problem, as the join to the state table is not where the time goes."I think it is something what planner could/should be "aware off"... and discard the join" Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual time=38.424..41992.070 rows=60057057 loops=1)"" Merge Cond: (state.state = busbase.state)"this part from bellow plan would save significant time if planner didn't decide to take this step at all ....Kind regards,Misa"Hi CindyTBH - I don't know...I have added this to list so maybe someone else can help...To recap:from start situation (table structure and indexes are in the first mail in this thread)EXPLAIN ANALYZESELECT busbase.state AS c0, count(busbase.id) AS m0 FROM busbase INNER JOIN state USING (state)GROUP BY busbase.statesays:"HashAggregate (cost=7416975.58..7416976.09 rows=51 width=7) (actual time=285339.465..285339.473 rows=51 loops=1)"" -> Hash Join (cost=2.15..7139961.94 rows=55402728 width=7) (actual time=0.066..269527.934 rows=60057057 loops=1)"" Hash Cond: (busbase.state = state.state)"" -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728 width=7) (actual time=0.022..251029.307 rows=60057057 loops=1)"" -> Hash (cost=1.51..1.51 rows=51 width=3) (actual time=0.028..0.028 rows=51 loops=1)"" Buckets: 1024 Batches: 1 Memory Usage: 2kB"" -> Seq Scan on state (cost=0.00..1.51 rows=51 width=3) (actual time=0.003..0.019 rows=51 loops=1)""Total runtime: 285339.516 ms"on created composite indexCREATE INDEX comp_statidx2ON busbaseUSING btree(state, id );we got:"GroupAggregate (cost=0.00..2610570.81 rows=51 width=3) (actual time=98.923..51033.888 rows=51 loops=1)"" -> Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual time=38.424..41992.070 rows=60057057 loops=1)"" Merge Cond: (state.state = busbase.state)"" -> Index Only Scan using state_pkey on state (cost=0.00..13.02 rows=51 width=3) (actual time=0.008..0.148 rows=51 loops=1)"" Heap Fetches: 51"" -> Index Only Scan using comp_statidx2 on busbase (cost=0.00..1559558.68 rows=60057056 width=3) (actual time=38.408..12883.575 rows=60057057 loops=1)"" Heap Fetches: 0""Total runtime: 51045.648 ms"Question is - is it possible to improve it more?"
On Sat, Mar 23, 2013 at 3:27 PM, Misa Simic <misa.simic@gmail.com> wrote:
Hi Jeff,It seems my previous mail has not showed up in the list... copied/pasted again belloewHowever, you said something important:"The join to the "state" table is not necessary. Between the foreign key and the primary key, you know that every state exists, and that every state exists only once. But, that will not solve your problem, as the join to the state table is not where the time goes."I think it is something what planner could/should be "aware off"... and discard the join
I thought that this was on the To Do list (http://wiki.postgresql.org/wiki/Todo) but if it is, I can't find it.
I think the main concern was that it might add substantial planning time to all queries, even ones that would not benefit from it. I don't know if there is a way to address this concern, other then to implement it and see what happens.
...
EXPLAIN ANALYZESELECT busbase.state AS c0, count(busbase.id) AS m0 FROM busbase INNER JOIN state USING (state)GROUP BY busbase.state
In the original email, the table definition listed "id" twice, once with a not null constraint. If it is truly not null, then this count could be replaced with count(1), in which case the original index on (state) would be sufficient, the composite on (count, id) would not be necessary. (Yes, this is another thing the planner could, in theory, recognize on your behalf)
Based on the use of column aliases which are less meaningful than the original column names were, I'm assuming that this is generated SQL that you have no control over?
on created composite indexCREATE INDEX comp_statidx2ON busbaseUSING btree(state, id );
we got:"GroupAggregate (cost=0.00..2610570.81 rows=51 width=3) (actual time=98.923..51033.888 rows=51 loops=1)"" -> Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual time=38.424..41992.070 rows=60057057 loops=1)"" Merge Cond: (state.state = busbase.state)"" -> Index Only Scan using state_pkey on state (cost=0.00..13.02 rows=51 width=3) (actual time=0.008..0.148 rows=51 loops=1)"" Heap Fetches: 51"" -> Index Only Scan using comp_statidx2 on busbase (cost=0.00..1559558.68 rows=60057056 width=3) (actual time=38.408..12883.575 rows=60057057 loops=1)"" Heap Fetches: 0""Total runtime: 51045.648 ms"
I don't understand why you are getting a merge join rather than a hash join. Nor why there is such a big difference between the actual time of the index only scan and of the merge join itself. I would think the two should be about equal. Perhaps I just don't understand the semantics of reported actual time for merge joins.
During normal operations, how much of busbase is going to be all_visible at any given time? If that table sees high turnover, this plan might not work well on the production system.
Cheers,
Jeff
I basically don't have any control over the generated select statement. I'm using Mondrian and that is the select statement that gets passed to Postgres. You're right that if you remove the count(id), the query is faster but I can't do that since the select statement is being executed from Mondrian.
On Mon, Mar 25, 2013 at 2:18 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Sat, Mar 23, 2013 at 3:27 PM, Misa Simic <misa.simic@gmail.com> wrote:Hi Jeff,It seems my previous mail has not showed up in the list... copied/pasted again belloewHowever, you said something important:"The join to the "state" table is not necessary. Between the foreign key and the primary key, you know that every state exists, and that every state exists only once. But, that will not solve your problem, as the join to the state table is not where the time goes."I think it is something what planner could/should be "aware off"... and discard the joinI thought that this was on the To Do list (http://wiki.postgresql.org/wiki/Todo) but if it is, I can't find it.I think the main concern was that it might add substantial planning time to all queries, even ones that would not benefit from it. I don't know if there is a way to address this concern, other then to implement it and see what happens....EXPLAIN ANALYZESELECT busbase.state AS c0, count(busbase.id) AS m0 FROM busbase INNER JOIN state USING (state)GROUP BY busbase.stateIn the original email, the table definition listed "id" twice, once with a not null constraint. If it is truly not null, then this count could be replaced with count(1), in which case the original index on (state) would be sufficient, the composite on (count, id) would not be necessary. (Yes, this is another thing the planner could, in theory, recognize on your behalf)Based on the use of column aliases which are less meaningful than the original column names were, I'm assuming that this is generated SQL that you have no control over?on created composite indexCREATE INDEX comp_statidx2ON busbaseUSING btree(state, id );we got:"GroupAggregate (cost=0.00..2610570.81 rows=51 width=3) (actual time=98.923..51033.888 rows=51 loops=1)"" -> Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual time=38.424..41992.070 rows=60057057 loops=1)"" Merge Cond: (state.state = busbase.state)"" -> Index Only Scan using state_pkey on state (cost=0.00..13.02 rows=51 width=3) (actual time=0.008..0.148 rows=51 loops=1)"" Heap Fetches: 51"" -> Index Only Scan using comp_statidx2 on busbase (cost=0.00..1559558.68 rows=60057056 width=3) (actual time=38.408..12883.575 rows=60057057 loops=1)"" Heap Fetches: 0""Total runtime: 51045.648 ms"I don't understand why you are getting a merge join rather than a hash join. Nor why there is such a big difference between the actual time of the index only scan and of the merge join itself. I would think the two should be about equal. Perhaps I just don't understand the semantics of reported actual time for merge joins.During normal operations, how much of busbase is going to be all_visible at any given time? If that table sees high turnover, this plan might not work well on the production system.Cheers,Jeff