too much context switching? - Mailing list pgsql-general

From Rodrigo Leroux
Subject too much context switching?
Date
Msg-id 3DF125D9.710A24B0@edusoft.com
Whole thread Raw
List pgsql-general
Has anyone seen the following behavior? Do you know of any tips to
prevent it?

I am running postgres v7.2.1 in a box with the following specs:

CPU: dual processor, PentiumIII at 1.4GHz, 512Kb cache
RAM: 4Gb
OS: Redhat Linux 7.3 (kernel 2.4.18-10bigmem #1 SMP)

My problem is that a particular query takes almost twice the time when
running simultaneously from two psql sessions than when running
independently:

The query takes 25 seconds when running independently (all data is
already in memory).  During the 25 seconds, cpu usage for one processor
is close to 100% (remember this is a dual-processor box):

  procs                      memory    swap          io
system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us
sy  id
 1  0  0      0 2524432  19208 1263476   0   0     0     0  107    45
46   5  49
 1  0  0      0 2524432  19208 1263476   0   0     0     0  105    43
50   1  50
 1  0  0      0 2524432  19208 1263476   0   0     0     0  106    42
47   3  50
 1  0  0      0 2524432  19208 1263476   0   0     0     0  106    48
45   5  50
 1  0  0      0 2524432  19208 1263476   0   0     0     0  107    47
47   4  50
 ...

When I run the same query simultaneously from 2 psql sessions, each one
takes 42 seconds...  During the 42 seconds, cpu usage for each processor

is at 90%.  vmstat shows that there is heavy context switching going on:

  procs                      memory    swap          io
system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us
sy  id
 2  0  0      0 2524640  19152 1263476   0   0     0     0  108 101112
73  18   9
 2  0  0      0 2524632  19152 1263476   0   0     0     0  105 100582
70  15  15
 2  0  0      0 2524624  19152 1263476   0   0     0     0  106 102268
70  18  12
 2  0  0      0 2524612  19160 1263476   0   0     0    24  107 100828
72  16  12
 1  0  0      0 2524612  19160 1263476   0   0     0     0  108 102490
67  21  12
...

Several other long-running queries that I've analyzed do leverage
successfuly both processors (running simultaneously takes just a bit
more than running alone).  The problem query, in particular, seems to be

doing index lookups inside a long loop...  I have tried crafting similar

queries (to trigger index lookups inside a loop) and all perform very
poorly when running simultaneously.  I'm not 100% sure, but it seems
that the problem is that my box spends more time context switching than
doing actual work.  Has anyone experienced the same problem?  Is it
related to the index lookups and context switching?  Is there anything I

can do at the os or pg config level to prevent this?  Am I totally
missing something?

Here's the explain analyze output:

test9=# explain analyze select count(*) from assign_category_2,
import_assign
 where assign_category_2.import_assign_id < '100000000000' and
assign_category_2
.import_assign_id = import_assign.import_assign_id and
assign_category_2.score_1
 > 2 and exists (select * from summary_student_course where
assign_category_2.st
udent_user_id = summary_student_course.student_user_id);
NOTICE:  QUERY PLAN:

Aggregate  (cost=945960.80..945960.80 rows=1 width=16) (actual
time=25390.94..25
390.94 rows=1 loops=1)
  ->  Merge Join  (cost=0.00..945043.06 rows=367095 width=16) (actual
time=0.14.
.24576.64 rows=719968 loops=1)
        ->  Index Scan using acat2_assign_cat_unique_idx on
assign_category_2  (
cost=0.00..932082.14 rows=367095 width=8) (actual time=0.11..18621.33
rows=71996
8 loops=1)
              SubPlan
                ->  Index Scan using sumry_stu_crs_stu_idx on
summary_student_co
urse  (cost=0.00..3.06 rows=4 width=92) (actual time=0.02..0.02 rows=1
loops=728
608)
        ->  Index Scan using import_assign_pkey on import_assign
(cost=0.00..68
36.26 rows=247296 width=8) (actual time=0.02..2534.51 rows=730177
loops=1)
Total runtime: 25391.16 msec

Thanks!

-rodrigo


pgsql-general by date:

Previous
From: CSN
Date:
Subject: createlang fails
Next
From: Tom Lane
Date:
Subject: Re: Another planner bug with subqueries