Thread: Join with lower/upper limits doesn't scale well

Join with lower/upper limits doesn't scale well

From
Craig James
Date:
I have the same schema in two different databases.  In "smalldb", the two tables of interest have about 430,000 rows;
in "bigdb", the two tables each contain about 5.5 million rows.  I'm processing the data, and for various reasons it
works out well to process it in 100,000-row chunks.  However, it turns out that for the big schema, selecting 100,000
rows is the longest single step of the processing.

Below is the explain/analyze output of the query from each database.  Since both tables are indexed on the joined
columns, I don't understand why the big table should be so much slower -- I hoped this would scale well, or at least
O(log(N)), not O(N).

What's going on here?  I don't know if I'm reading this right, but it looks like the sort is taking all the time, but
that doesn't make sense because in both cases it's sorting 100,000 rows.

Thanks,
Craig


bigdb=> explain analyze
bigdb->   select r.row_num, m.molkeys from my_rownum r
bigdb->   join my_molkeys m on (r.version_id = m.version_id)
bigdb->   where r.row_num >= 100000 AND r.row_num < 200000
bigdb->   order by r.row_num;

 Sort  (cost=431000.85..431248.23 rows=98951 width=363) (actual time=46306.748..46417.448 rows=100000 loops=1)
   Sort Key: r.row_num
   ->  Hash Join  (cost=2583.59..422790.68 rows=98951 width=363) (actual time=469.010..45752.131 rows=100000 loops=1)
         Hash Cond: ("outer".version_id = "inner".version_id)
         ->  Seq Scan on my_molkeys m  (cost=0.00..323448.30 rows=5472530 width=363) (actual time=11.243..33299.933 rows=5472532 loops=1)
         ->  Hash  (cost=2336.21..2336.21 rows=98951 width=8) (actual time=442.260..442.260 rows=100000 loops=1)
               ->  Index Scan using i_chm_rownum_row_num on my_rownum r  (cost=0.00..2336.21 rows=98951 width=8) (actual time=47.551..278.736 rows=100000 loops=1)
                     Index Cond: ((row_num >= 100000) AND (row_num < 200000))
 Total runtime: 46543.163 ms


smalldb=> explain analyze
smalldb->   select r.row_num, m.molkeys from my_rownum r
smalldb->   join my_molkeys m on (r.version_id = m.version_id)
smalldb->   where r.row_num >= 100000 AND r.row_num < 200000
smalldb->   order by r.row_num;

 Sort  (cost=43598.23..43853.38 rows=102059 width=295) (actual time=4097.180..4207.733 rows=100000 loops=1)
   Sort Key: r.row_num
   ->  Hash Join  (cost=2665.09..35107.41 rows=102059 width=295) (actual time=411.635..3629.756 rows=100000 loops=1)
         Hash Cond: ("outer".version_id = "inner".version_id)
         ->  Seq Scan on my_molkeys m  (cost=0.00..23378.90 rows=459590 width=295) (actual time=8.563..2011.455 rows=459590 loops=1)
         ->  Hash  (cost=2409.95..2409.95 rows=102059 width=8) (actual time=402.867..402.867 rows=100000 loops=1)
               ->  Index Scan using i_chm_rownum_row_num_8525 on my_rownum r  (cost=0.00..2409.95 rows=102059 width=8) (actual time=37.122..242.528 rows=100000 loops=1)
                     Index Cond: ((row_num >= 100000) AND (row_num < 200000))
 Total runtime: 4333.501 ms



Table "bigdb.my_rownum"
   Column   |  Type   | Modifiers
------------+---------+-----------
 version_id | integer |
 parent_id  | integer |
 row_num    | integer |
Indexes:
    "i_chm_rownum_row_num" UNIQUE, btree (row_num)
    "i_chm_rownum_version_id" UNIQUE, btree (version_id)
    "i_chm_rownum_parent_id" btree (parent_id)



Table "bigdb.my_molkeys"
   Column   |  Type   | Modifiers
------------+---------+-----------
 version_id | integer |
 molkeys    | text    |
Indexes:
    "i_chm_molkeys_version_id" UNIQUE, btree (version_id)

Re: Join with lower/upper limits doesn't scale well

From
Gregory Stark
Date:
"Craig James" <craig_james@emolecules.com> writes:

> Below is the explain/analyze output of the query from each database. Since
> both tables are indexed on the joined columns, I don't understand why the
> big table should be so much slower -- I hoped this would scale well, or at
> least O(log(N)), not O(N).
...
> Sort  (cost=431000.85..431248.23 rows=98951 width=363) (actual time=46306.748..46417.448 rows=100000 loops=1)
>   Sort Key: r.row_num
>   ->  Hash Join  (cost=2583.59..422790.68 rows=98951 width=363) (actual time=469.010..45752.131 rows=100000 loops=1)
>         Hash Cond: ("outer".version_id = "inner".version_id)
>         ->  Seq Scan on my_molkeys m  (cost=0.00..323448.30 rows=5472530 width=363) (actual time=11.243..33299.933 rows=5472532 loops=1)
>         ->  Hash  (cost=2336.21..2336.21 rows=98951 width=8) (actual time=442.260..442.260 rows=100000 loops=1)
>               ->  Index Scan using i_chm_rownum_row_num on my_rownum r  (cost=0.00..2336.21 rows=98951 width=8) (actual time=47.551..278.736 rows=100000 loops=1)
>                     Index Cond: ((row_num >= 100000) AND (row_num < 200000))
> Total runtime: 46543.163 ms

It looks like most of the time is being spent in the sequential scan of
my_molkeys: at least 33 seconds out of the 46 is spent there. The actual sort is
taking under a second (the hash join finishes after 45.7s and the sort finishes
after 46.4s).

The rest of the time (about 13s) is actually being spent in the hash join,
which makes me think it's overflowing work_mem and having to process the join
in two batches. You might be able to speed it up by raising work_mem for this
query (you can set work_mem locally using SET).
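
For example, something along these lines (the 64 MB figure is only an
illustration, not a recommendation; on 8.1 work_mem is an integer number of
kilobytes):

  set work_mem = 65536;        -- 64 MB, for this session only
  explain analyze
    select r.row_num, m.molkeys from my_rownum r
    join my_molkeys m on (r.version_id = m.version_id)
    where r.row_num >= 100000 AND r.row_num < 200000
    order by r.row_num;
  reset work_mem;              -- back to the configured default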

The row_num where clause only narrows down the set of rows coming from the
my_rownum table. If you want sublinear performance, you would have to provide
some way for Postgres to narrow down the rows from my_molkeys without actually
having to read them all in.

With the query as is, the only way I can see that happening would be if you had
an index on "my_molkeys(version_id)" and "my_rownum(version_id) WHERE row_num
between 100000 and 200000". Then it could do a merge join between two index
scans.
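
Concretely, my_molkeys(version_id) is already covered by i_chm_molkeys_version_id,
so the missing piece would be a partial index roughly like this (the index name
is made up, and it only helps queries on this particular row_num window):

  create index i_chm_rownum_version_id_100k
      on my_rownum (version_id)
      where row_num >= 100000 and row_num < 200000;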

Note that even then I'm surprised the optimizer is bothering with the index
for these queries, at least for the 400k case. Do you have enable_seqscan=off
or random_page_cost dialled way down?
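
A quick way to check both:

  show enable_seqscan;
  show random_page_cost;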

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Equivalent queries produce different plans

From
Craig James
Date:
The two queries below produce different plans.

select r.version_id, r.row_num, m.molkeys from my_rownum r
join my_molkeys m on (r.version_id = m.version_id)
where r.version_id >= 3200000
and   r.version_id <  3300000
order by r.version_id;


select r.version_id, r.row_num, m.molkeys from my_rownum r
join my_molkeys m on (r.version_id = m.version_id)
where r.version_id >= 3200000
and   r.version_id <  3300000
and   m.version_id >= 3200000
and   m.version_id <  3300000
order by r.version_id;

I discovered this while looking at the plans for the first query.  It seemed to be ignoring the fact that it could push
the "between" condition along to the second table, since the condition and the join are on the same indexed columns.
So, I added a redundant condition, and bingo, it was a lot faster.  In the analysis shown below, the timings (about 1.0
and 1.5 seconds respectively) are for a "hot" database that's been queried a couple of times.  In real life on a "cold"
database, the times are more like 10 seconds and 21 seconds, so it's quite significant.

Thanks,
Craig



db=> explain analyze
db-> select r.version_id, r.row_num, m.molkeys from my_rownum r
db-> join my_molkeys m on (r.version_id = m.version_id)
db-> where r.version_id >= 3200000
db-> and   r.version_id <  3300000
db-> order by r.version_id;

 Sort  (cost=264979.51..265091.06 rows=44620 width=366) (actual time=1424.126..1476.048 rows=46947 loops=1)
   Sort Key: r.version_id
   ->  Nested Loop  (cost=366.72..261533.64 rows=44620 width=366) (actual time=41.649..1186.331 rows=46947 loops=1)
         ->  Bitmap Heap Scan on my_rownum r  (cost=366.72..41168.37 rows=44620 width=8) (actual time=41.616..431.783 rows=46947 loops=1)
               Recheck Cond: ((version_id >= 3200000) AND (version_id < 3300000))
               ->  Bitmap Index Scan on i_chm_rownum_version_id_4998  (cost=0.00..366.72 rows=44620 width=0) (actual time=21.244..21.244 rows=46947 loops=1)
                     Index Cond: ((version_id >= 3200000) AND (version_id < 3300000))
         ->  Index Scan using i_chm_molkeys_version_id on my_molkeys m  (cost=0.00..4.93 rows=1 width=362) (actual time=0.009..0.010 rows=1 loops=46947)
               Index Cond: ("outer".version_id = m.version_id)
 Total runtime: 1534.638 ms
(10 rows)


db=> explain analyze
db-> select r.version_id, r.row_num, m.molkeys from my_rownum r
db-> join my_molkeys m on (r.version_id = m.version_id)
db-> where r.version_id >= 3200000
db-> and r.version_id <    3300000
db-> and m.version_id >=   3200000
db-> and m.version_id <    3300000
db-> order by r.version_id;

 Sort  (cost=157732.20..157732.95 rows=298 width=366) (actual time=985.383..1037.423 rows=46947 loops=1)
   Sort Key: r.version_id
   ->  Hash Join  (cost=41279.92..157719.95 rows=298 width=366) (actual time=502.875..805.402 rows=46947 loops=1)
         Hash Cond: ("outer".version_id = "inner".version_id)
         ->  Index Scan using i_chm_molkeys_version_id on my_molkeys m  (cost=0.00..115717.85 rows=47947 width=362) (actual time=0.023..117.270 rows=46947 loops=1)
               Index Cond: ((version_id >= 3200000) AND (version_id < 3300000))
         ->  Hash  (cost=41168.37..41168.37 rows=44620 width=8) (actual time=502.813..502.813 rows=46947 loops=1)
               ->  Bitmap Heap Scan on my_rownum r  (cost=366.72..41168.37 rows=44620 width=8) (actual time=41.621..417.508 rows=46947 loops=1)
                     Recheck Cond: ((version_id >= 3200000) AND (version_id < 3300000))
                      ->  Bitmap Index Scan on i_chm_rownum_version_id_4998  (cost=0.00..366.72 rows=44620 width=0) (actual time=21.174..21.174 rows=46947 loops=1)
                           Index Cond: ((version_id >= 3200000) AND (version_id < 3300000))
 Total runtime: 1096.031 ms
(12 rows)

Re: Equivalent queries produce different plans

From
Craig James
Date:
Sorry, I forgot to mention: This is 8.1.4, with a fairly ordinary configuration on a 4 GB system.

Craig


Craig James wrote:
> The two queries below produce different plans.



Re: Equivalent queries produce different plans

From
Tom Lane
Date:
Craig James <craig_james@emolecules.com> writes:
> The two queries below produce different plans.

> select r.version_id, r.row_num, m.molkeys from my_rownum r
> join my_molkeys m on (r.version_id = m.version_id)
> where r.version_id >= 3200000
> and   r.version_id <  3300000
> order by r.version_id;

> select r.version_id, r.row_num, m.molkeys from my_rownum r
> join my_molkeys m on (r.version_id = m.version_id)
> where r.version_id >= 3200000
> and   r.version_id <  3300000
> and   m.version_id >= 3200000
> and   m.version_id <  3300000
> order by r.version_id;

Yeah, the planner does not make any attempt to infer implied
inequalities, so it will not generate the last two clauses for you.
There is machinery in there to infer implied *equalities*, which
is cheaper (fewer operators to consider) and much more useful across
typical queries such as multiway joins on the same keys.  I'm pretty
dubious that it'd be worth the cycles to search for implied
inequalities.
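
(To make the distinction concrete: given an equality such as

  select r.version_id, r.row_num, m.molkeys from my_rownum r
  join my_molkeys m on (r.version_id = m.version_id)
  where r.version_id = 3250000;

the planner derives m.version_id = 3250000 through the join clause on its own,
whereas the pair of range clauses has to be spelled out for both tables by hand,
exactly as in your second query.  The constant 3250000 here is just an arbitrary
value inside your range.)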

            regards, tom lane

pg_restore causes 100% CPU

From
Craig James
Date:
Here's an oddity.  I have 10 databases, each with about a dozen connections to Postgres (about 120 connections total),
and at midnight they're all idle.  These are mod_perl programs (like a FastCGI -- they stay connected so they're ready
for instant service).  So using "ps -ef" and grep, we find one of the databases looks like this:

postgres 22708  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(46915) idle
postgres 22709  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(46916) idle
postgres 22710  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(46917) idle
postgres 22711  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(46918) idle
postgres 22712  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(46919) idle
postgres 22724  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(42440) idle
postgres 22725  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(42441) idle
postgres 22726  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(42442) idle
postgres 22727  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(42443) idle
postgres 22728  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(42444) idle
postgres 22731  7619  0 Jul11 ?        00:00:06 postgres: chemmega chemmega 192.168.10.192(42447) idle

Now here's the weird thing.  I'm running a pg_restore of a database (on the order of 4GB compressed, maybe 34M rows of
ordinary data, and 15M rows in one BLOB table that's typically 2K per blob).  When I do this, ALL of the postgres
backends start working at about 1% CPU apiece.  This means that the 120 "idle" postgres backends are together using
almost 100% of one CPU on top of the 100% CPU being used by pg_restore.  See the output of top(1) below.

Is this normal?  All I can guess at is that something's going on in shared memory that every Postgres backend has to
respond to.

Thanks,
Craig



Tasks: 305 total,   1 running, 304 sleeping,   0 stopped,   0 zombie
Cpu(s): 33.5% us,  1.5% sy,  0.0% ni, 57.8% id,  6.6% wa,  0.2% hi,  0.4% si
Mem:   4151456k total,  4011020k used,   140436k free,    10096k buffers
Swap:  2104504k total,    94136k used,  2010368k free,  3168596k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6681 postgres  16   0  217m 188m 161m D 50.4  4.6   4:29.30 postmaster
 1577 root      10  -5     0    0    0 S  1.0  0.0 108:01.97 md0_raid1
 8487 postgres  15   0  187m 8704 4996 S  1.0  0.2   0:06.56 postmaster
 8506 postgres  15   0  187m 8604 4892 S  1.0  0.2   0:06.37 postmaster
 8507 postgres  15   0  187m 8708 5004 S  1.0  0.2   0:06.42 postmaster
 8512 postgres  15   0  187m 8612 4904 S  1.0  0.2   0:06.65 postmaster
 8751 postgres  15   0  187m  10m 7520 S  1.0  0.3   0:07.95 postmaster
 8752 postgres  15   0  187m  10m 7492 S  1.0  0.3   0:07.84 postmaster
14053 postgres  15   0  187m 8752 5044 S  1.0  0.2   0:06.53 postmaster
16515 postgres  15   0  187m 8156 4452 S  1.0  0.2   0:06.33 postmaster
25351 postgres  15   0  187m 9772 6064 S  1.0  0.2   0:06.75 postmaster
25387 postgres  15   0  187m 8444 4752 S  1.0  0.2   0:06.45 postmaster
25425 postgres  15   0  187m 9.8m 6340 S  1.0  0.2   0:06.75 postmaster
30626 postgres  15   0  187m 8472 4792 S  1.0  0.2   0:06.52 postmaster
30628 postgres  15   0  187m 8536 4840 S  1.0  0.2   0:06.50 postmaster
30630 postgres  15   0  187m 8524 4844 S  1.0  0.2   0:06.49 postmaster
30637 postgres  15   0  187m 8692 4880 S  1.0  0.2   0:06.25 postmaster
31679 postgres  15   0  187m 8544 4860 S  1.0  0.2   0:06.39 postmaster
31681 postgres  15   0  187m 8528 4848 S  1.0  0.2   0:06.25 postmaster
 1751 postgres  15   0  187m 8432 4748 S  1.0  0.2   0:06.26 postmaster
11620 postgres  15   0  187m 8344 4644 S  1.0  0.2   0:06.23 postmaster
11654 postgres  15   0  187m 8316 4624 S  1.0  0.2   0:06.36 postmaster
19173 postgres  15   0  187m 9372 5668 S  1.0  0.2   0:06.49 postmaster
19670 postgres  15   0  187m 9236 5528 S  1.0  0.2   0:06.29 postmaster
20380 postgres  15   0  187m 8656 4956 S  1.0  0.2   0:06.20 postmaster
20649 postgres  15   0  187m 8280 4584 S  1.0  0.2   0:06.16 postmaster
22731 postgres  15   0  187m 8408 4700 S  1.0  0.2   0:06.03 postmaster
11045 postgres  15   0  185m  71m  68m S  0.7  1.8   0:19.35 postmaster
 6408 postgres  15   0  187m  11m 7520 S  0.7  0.3   0:07.89 postmaster
 6410 postgres  15   0  187m  10m 7348 S  0.7  0.3   0:07.53 postmaster
 6411 postgres  15   0  187m  10m 7380 S  0.7  0.3   0:07.83 postmaster
 6904 postgres  15   0  187m 8644 4788 S  0.7  0.2   0:06.15 postmaster
 6905 postgres  15   0  187m 8288 4596 S  0.7  0.2   0:06.15 postmaster
 6906 postgres  15   0  187m 8488 4764 S  0.7  0.2   0:06.18 postmaster
 6907 postgres  15   0  187m 8580 4856 S  0.7  0.2   0:06.37 postmaster
 7049 postgres  15   0  187m 8488 4800 S  0.7  0.2   0:06.07 postmaster
 7054 postgres  15   0  187m 8376 4672 S  0.7  0.2   0:06.28 postmaster
 7188 postgres  15   0  187m 8588 4868 S  0.7  0.2   0:06.39 postmaster
 7190 postgres  15   0  187m 8832 5120 S  0.7  0.2   0:06.52 postmaster
 7191 postgres  15   0  187m 8632 4916 S  0.7  0.2   0:06.48 postmaster
 7192 postgres  15   0  187m 8884 5176 S  0.7  0.2   0:06.55 postmaster
 8511 postgres  15   0  187m 8612 4904 S  0.7  0.2   0:06.39 postmaster
 8513 postgres  15   0  187m 8776 5064 S  0.7  0.2   0:06.60 postmaster
 8750 postgres  15   0  187m  10m 7220 S  0.7  0.3   0:07.72 postmaster
 8768 postgres  15   0  187m  10m 7508 S  0.7  0.3   0:07.77 postmaster
 8769 postgres  15   0  187m  10m 7448 S  0.7  0.3   0:07.81 postmaster
 8775 postgres  15   0  187m  10m 7064 S  0.7  0.3   0:07.72 postmaster
 8782 postgres  15   0  187m  10m 7316 S  0.7  0.3   0:07.84 postmaster
13947 postgres  15   0  187m 8500 4780 S  0.7  0.2   0:06.36 postmaster
13949 postgres  15   0  187m 8536 4824 S  0.7  0.2   0:06.36 postmaster
13951 postgres  15   0  187m 8504 4804 S  0.7  0.2   0:06.35 postmaster
14041 postgres  15   0  187m 8548 4828 S  0.7  0.2   0:06.45 postmaster
14046 postgres  15   0  187m 8560 4812 S  0.7  0.2   0:06.39 postmaster
14052 postgres  15   0  187m 8744 5024 S  0.7  0.2   0:06.54 postmaster
14055 postgres  15   0  187m 8580 4868 S  0.7  0.2   0:06.52 postmaster
14061 postgres  15   0  187m 8464 4760 S  0.7  0.2   0:06.45 postmaster
14092 postgres  15   0  187m 8624 4920 S  0.7  0.2   0:06.52 postmaster
16358 postgres  15   0  187m 8284 4596 S  0.7  0.2   0:06.54 postmaster
16367 postgres  15   0  187m 8392 4568 S  0.7  0.2   0:06.24 postmaster



Re: pg_restore causes 100% CPU

From
Tom Lane
Date:
Craig James <craig_james@emolecules.com> writes:
> Now here's the weird thing.  I'm running a pg_restore of a database
> (on the order of 4GB compressed, maybe 34M rows of ordinary data, and
> 15M rows in one BLOB table that's typically 2K per blob).  When I do
> this, ALL of the postgres backends start working at about 1% CPU
> apiece.

It's not surprising that they'd all start eating some CPU, if that's a
schema restore and not just bulk data loading.  Any catalog change is
going to broadcast "shared cache inval" messages that all the backends
have to process to make sure they get rid of any now-obsolete cached
catalog information.
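
(So if the schema already exists on the target database, a data-only restore
should generate far fewer catalog changes, and hence far less invalidation
traffic, than a schema-plus-data restore -- something like

  pg_restore --data-only -d chemmega chemmega.dump

where chemmega.dump stands in for whatever your archive file is called.)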

> This means that the 120 "idle" postgres backends are together
> using almost 100% of one CPU on top of the 100% CPU being used by
> pg_restore.  See the output of top(1) below.

Perhaps you need to try to cut down the number of idle processes ...

I don't think anyone's ever spent any time trying to micro-optimize
the shared cache inval code paths.  It could be we could cut your
1% figure some, if we were willing to put effort into that.  But it's
not going to go to zero.

            regards, tom lane