Thread: Question about postmaster's CPU usage
Hi all,
When I use postgres and issue a simple sequential scan for a table inventory using query "select * from inventory;", I can see from "top" that postmaster is using 100% CPU, which limits the query execution time. My question is that, why CPU is the bottleneck here and what is postmaster doing? Is there any way to improve the performance? Thanks!
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15401 postgres 20 0 4371m 517m 515m R 99.8 3.2 0:30.14 postmaster
Query: select * from inventory;
explain analyze select * from inventory;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
------------
Seq Scan on inventory (cost=0.00..180937.00 rows=11745000 width=16) (actual time=0.005..1030.403 rows=117450
00 loops=1)
Total runtime: 1750.889 ms
(2 rows)
Table "public.inventory"
Column | Type | Modifiers
----------------------+---------+-----------
inv_date_sk | integer | not null
inv_item_sk | integer | not null
inv_warehouse_sk | integer | not null
inv_quantity_on_hand | integer |
Indexes:
"inventory_pkey" PRIMARY KEY, btree (inv_date_sk, inv_item_sk, inv_warehouse_sk)
When I use postgres and issue a simple sequential scan for a table inventory using query "select * from inventory;", I can see from "top" that postmaster is using 100% CPU, which limits the query execution time. My question is that, why CPU is the bottleneck here and what is postmaster doing? Is there any way to improve the performance? Thanks!
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15401 postgres 20 0 4371m 517m 515m R 99.8 3.2 0:30.14 postmaster
Query: select * from inventory;
explain analyze select * from inventory;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
------------
Seq Scan on inventory (cost=0.00..180937.00 rows=11745000 width=16) (actual time=0.005..1030.403 rows=117450
00 loops=1)
Total runtime: 1750.889 ms
(2 rows)
Table "public.inventory"
Column | Type | Modifiers
----------------------+---------+-----------
inv_date_sk | integer | not null
inv_item_sk | integer | not null
inv_warehouse_sk | integer | not null
inv_quantity_on_hand | integer |
Indexes:
"inventory_pkey" PRIMARY KEY, btree (inv_date_sk, inv_item_sk, inv_warehouse_sk)
kelphet xiong <kelphet@gmail.com> wrote: > When I use postgres and issue a simple sequential scan for a > table inventory using query "select * from inventory;", I can see > from "top" that postmaster is using 100% CPU, which limits the > query execution time. My question is that, why CPU is the > bottleneck here and what is postmaster doing? Is there any way to > improve the performance? Thanks! > explain analyze select * from inventory; > > Seq Scan on inventory (cost=0.00..180937.00 rows=11745000 width=16) (actual time=0.005..1030.403 rows=11745000 loops=1) > Total runtime: 1750.889 ms So it is reading and returning 11.7 million rows in about 1 second, or about 88 nanoseconds (billionths of a second) per row. You can't be waiting for a hard drive for many of those reads, or it would take a lot longer, so the bottleneck is the CPU pushing the data around in RAM. I'm not sure why 100% CPU usage would surprise you. Are you wondering why the CPU works on the query straight through until it is done, rather than taking a break periodically and letting the unfinished work sit there? -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Mar 28, 2013 at 02:03:42PM -0700, Kevin Grittner wrote: > kelphet xiong <kelphet@gmail.com> wrote: > > > When I use postgres and issue a simple sequential scan for a > > table inventory using query "select * from inventory;", I can see > > from "top" that postmaster is using 100% CPU, which limits the > > query execution time. My question is that, why CPU is the > > bottleneck here and what is postmaster doing? Is there any way to > > improve the performance? Thanks! > > > explain analyze select * from inventory; > > > > Seq Scan on inventory (cost=0.00..180937.00 rows=11745000 width=16) (actual time=0.005..1030.403 rows=11745000 loops=1) > > Total runtime: 1750.889 ms > > So it is reading and returning 11.7 million rows in about 1 second, > or about 88 nanoseconds (billionths of a second) per row. You > can't be waiting for a hard drive for many of those reads, or it > would take a lot longer, so the bottleneck is the CPU pushing the > data around in RAM. I'm not sure why 100% CPU usage would surprise > you. Are you wondering why the CPU works on the query straight > through until it is done, rather than taking a break periodically > and letting the unfinished work sit there? > > -- > Kevin Grittner > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > Alternatively, purchase a faster CPU if CPU is the bottleneck as it is in this case or partition the work into parallel queuries that can each use a processor. Regards, Ken
On Mar 28, 2013 9:07 PM, "kelphet xiong" <kelphet@gmail.com> wrote:
> explain analyze select * from inventory;
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------
> ------------
> Seq Scan on inventory (cost=0.00..180937.00 rows=11745000 width=16) (actual time=0.005..1030.403 rows=117450
> 00 loops=1)
> Total runtime: 1750.889 ms
> (2 rows)
A large fraction of that time, if not most is due to timing overhead. You can try the same query without timing by using explain (analyze on, timing off) select * from inventory;
Regards,
Ants Aasma
Thanks a lot for replies from Kevin, Ken, and Ants Aasma. I really aappreciate your suggestions and comments.
My server configuration is two physical quad-core CPUs with hyper-threading enabled.
Each CPU is Intel(R) Xeon(R) CPU E5620@2.40GHz. Physical memory is 16GB.
I set shared_buffers as 4GB, effective_cache_size as 10GB and inventory table is around 500MB.
From the information provided by top command, although the row for postmaster shows that postmaster is using 100%CPU,
the total CPU user time for the whole server never goes beyond 6.6%us.
I guess it is because postgres only uses a single thread to read
the data or “pushing the data around in RAM” according to Kevin’s statement.
Then my question is actually why postgres can not use the remaining 93.4%CPU.
Btw, I also tried the command suggested by Ants Aasma, but got an error:
explain (analyze on, timing off) select * from inventory;
ERROR: syntax error at or near "analyze"
LINE 1: explain (analyze on, timing off) select * from inventory;
^
Thanks!
Best regards
Kelphet Xiong
On Thu, Mar 28, 2013 at 2:03 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
kelphet xiong <kelphet@gmail.com> wrote:
> When I use postgres and issue a simple sequential scan for a
> table inventory using query "select * from inventory;", I can see
> from "top" that postmaster is using 100% CPU, which limits the
> query execution time. My question is that, why CPU is the
> bottleneck here and what is postmaster doing? Is there any way to
> improve the performance? Thanks!
> explain analyze select * from inventory;> Seq Scan on inventory (cost=0.00..180937.00 rows=11745000 width=16) (actual time=0.005..1030.403 rows=11745000 loops=1)
>
> Total runtime: 1750.889 ms
So it is reading and returning 11.7 million rows in about 1 second,
or about 88 nanoseconds (billionths of a second) per row. You
can't be waiting for a hard drive for many of those reads, or it
would take a lot longer, so the bottleneck is the CPU pushing the
data around in RAM. I'm not sure why 100% CPU usage would surprise
you. Are you wondering why the CPU works on the query straight
through until it is done, rather than taking a break periodically
and letting the unfinished work sit there?
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Mar 30, 2013 at 11:00 PM, Kelphet Xiong <kelphet@gmail.com> wrote: > I guess it is because postgres only uses a single thread to read > the data or “pushing the data around in RAM” according to Kevin’s statement. > Then my question is actually why postgres can not use the remaining > 93.4%CPU. postgres can use an arbitrary amount of threads to read data, but only one per database connection. > Btw, I also tried the command suggested by Ants Aasma, but got an error: > > explain (analyze on, timing off) select * from inventory; > ERROR: syntax error at or near "analyze" > > LINE 1: explain (analyze on, timing off) select * from inventory; > > ^ Ability to manipulate timing was added in 9.2. merlin