Re: Wierd context-switching issue on Xeon - Mailing list pgsql-performance
| From | Tom Lane |
|---|---|
| Subject | Re: Wierd context-switching issue on Xeon |
| Date | |
| Msg-id | 1407.1082419316@sss.pgh.pa.us Whole thread Raw |
| In response to | Re: Wierd context-switching issue on Xeon (Josh Berkus <josh@agliodbs.com>) |
| Responses |
Re: Wierd context-switching issue on Xeon
Re: Wierd context-switching issue on Xeon Re: Wierd context-switching issue on Xeon Re: Wierd context-switching issue on Xeon Re: Wierd context-switching issue on Xeon Re: Wierd context-switching issue on Xeon |
| List | pgsql-performance |
Here is a test case. To set up, run the "test_setup.sql" script once;
then launch two copies of the "test_run.sql" script. (For those of
you with more than two CPUs, see whether you need one per CPU to make
trouble, or whether two test_runs are enough.) Check that you get a
nestloops-with-index-scans plan shown by the EXPLAIN in test_run.
In isolation, test_run.sql should do essentially no syscalls at all once
it's past the initial ramp-up. On a machine that's functioning per
expectations, multiple copies of test_run show a relatively low rate of
semop() calls --- a few per second, at most --- and maybe a delaying
select() here and there.
What I actually see on Josh's client's machine is a context swap storm:
"vmstat 1" shows CS rates around 170K/sec. strace'ing the backends
shows a corresponding rate of semop() syscalls, with a few delaying
select()s sprinkled in. top(1) shows system CPU percent of 25-30
and idle CPU percent of 16-20.
I haven't bothered to check how long the test_run query takes, but if it
ends while you're still examining the behavior, just start it again.
Note the test case assumes you've got shared_buffers set to at least
1000; with smaller values, you may get some I/O syscalls, which will
probably skew the results.
regards, tom lane
drop table test_data;
create table test_data(f1 int);
insert into test_data values (random() * 100);
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
insert into test_data select random() * 100 from test_data;
create index test_index on test_data(f1);
vacuum verbose analyze test_data;
checkpoint;
-- force nestloop indexscan plan
set enable_seqscan to 0;
set enable_mergejoin to 0;
set enable_hashjoin to 0;
explain
select count(*) from test_data a, test_data b, test_data c
where a.f1 = b.f1 and b.f1 = c.f1;
select count(*) from test_data a, test_data b, test_data c
where a.f1 = b.f1 and b.f1 = c.f1;
pgsql-performance by date: