Sun vs a P2. Interesting results. - Mailing list pgsql-performance

From Jeff
Subject Sun vs a P2. Interesting results.
Date
Msg-id Pine.BSF.4.44.0308260832010.18703-100000@torgo.978.org
Whole thread Raw
Responses Re: Sun vs a P2. Interesting results.
Re: Sun vs a P2. Interesting results.
Re: Sun vs a P2. Interesting results.
List pgsql-performance
Here's an interesting situation, and I think it may be just that Sun
stinks.

I was recently given the go ahead to switch from Informix to Postgres on
one of our properties.  (I had dozens of performance comparisons showing
how slow Informix was compared to it and my boss seeing me struggle trying
to make it run fast while Postgres, nearly out of the box, was simply
spanking it.).


Well, in order to facilitate things we were going to run pg on a 4 cpu
(ultrasparc ii 400Mhz) sun with 4gb of memory (also the current informix
box. It isn't very loaded). Now I know FreeBSD/Linux is preferred (and
where I do a lot of development and testing).   But check this out for
interesting results.

The Hardware:
Machine A:  4 CPU Sun Ultrasparc II 400Mhz, 4GB mem, 20GB RAID5, Solaris 8
(32 bit mode)

Machine B:  2 CPU Pentium II, 450Mhz, 512MB mem, 18GB RAID0 (2 old scsi
disks) Linux 2.4.18 (Stock redhat 8.0)

The software: PG 7.3.4 compiled myself.  (Reading specs from
/opt/sfw/lib/gcc-lib/sparc-sun-solaris2.9/2.95.3/specs gcc version 2.95.3
20010315 (release) (The solaris 8 box has no compilers, could this be the
issue?)  and (Reading specs from
/usr/lib/gcc-lib/i386-redhat-linux/3.2/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/u
sr/share/info --enable-shared --enable-threads=posix --disable-checking
--host=i
386-redhat-linux --with-system-zlib --enable-__cxa_atexit
Thread model: posix
gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7))

Ok. Maybe the compiler (I'll try installing a newer gcc for sun later
today).

The PG.conf:
shared_buffers = 2000
sort_mem = 8192
effective_cache_size = 32000
tcpip_sockets = true

The Schema:

userprofile:

userkey       | character varying(128)   |
 displayname   | character varying(128)   |
 displayname_v | boolean                  | default 'f'
 name          | character varying(128)   |
 name_v        | boolean                  | default 'f'
 email         | character varying(128)   |
 email_v       | boolean                  | default 'f'
 gender        | character varying(1)     |
 gender_v      | boolean                  | default 'f'
 country       | character varying(64)    |
 country_v     | boolean                  | default 'f'
 zip           | character varying(10)    |
 zip_v         | boolean                  | default 'f'
 city          | character varying(128)   |
 city_v        | boolean                  | default 'f'
 state         | character varying(10)    |
 state_v       | boolean                  | default 'f'
 lang          | character varying(2)     |
 lang_v        | boolean                  | default 'f'
 url           | character varying(255)   |
url_v         | boolean                  | default 'f'
 phone         | character varying(64)    |
 phone_v       | boolean                  | default 'f'
 phonemobile   | character varying(64)    |
 phonemobile_v | boolean                  | default 'f'
 phonefax      | character varying(64)    |
 phonefax_v    | boolean                  | default 'f'
 dob           | timestamp with time zone |
 dob_v         | boolean                  | default 'f'
 interests_v   | boolean                  | default 'f'
 description   | character varying(255)   |
 description2  | character varying(255)   |
 description_v | boolean                  | default 'f'

(Yes, I kknow it isn't good - a lot of it is because it is the same schema
I had to use on informix. Convienantly you cannot do much with a textblob
on infomrix, so I have to use big varchar's, but that is a fiffernt
story).

The magic query:

select userkey, dob, email, gender, country from imuserprofile
where  gender_v  and gender='m'
and  country_v and country = 'br'
and dob_v = 't'
and dob >= 'now'::timestamptz - '29 years'::interval
and dob <= 'now'::timestamptz - '18 years'::interval
order by dob asc
limit 20
offset 100

(Page 5 of male brazillians, 18-29)

Now the P2 runs this in about 0.3 seconds, and hte sun box runs it in 1
second.
Here's the explain analyze's on each:

P2:
Limit  (cost=2484.52..2484.57 rows=20 width=67) (actual
time=377.32..377.41 row
s=20 loops=1)
   ->  Sort  (cost=2484.27..2484.74 rows=186 width=67) (actual
time=377.02..377.
21 rows=121 loops=1)
         Sort Key: dob
         ->  Seq Scan on userprofile  (cost=0.00..2477.28 rows=186
width=67) (
actual time=0.15..350.93 rows=1783 loops=1)
               Filter: (gender_v AND (gender = 'm'::character varying) AND
count
ry_v AND (country = 'br'::character varying) AND (dob_v = true) AND (dob
>= '197
4-08-26 07:13:15.903437-04'::timestamp with time zone) AND (dob <=
'1985-08-26 0
7:13:15.903437-04'::timestamp with time zone))
 Total runtime: 378.21 msec
(6 rows)

Sun:
Limit  (cost=2521.19..2521.24 rows=20 width=67) (actual
time=1041.14..1041.20 r
ows=20 loops=1)
   ->  Sort  (cost=2520.94..2521.39 rows=178 width=67) (actual
time=1040.96..104
1.08 rows=121 loops=1)
         Sort Key: dob
         ->  Seq Scan on userprofile  (cost=0.00..2514.28 rows=178
width=67) (
actual time=0.37..1014.50 rows=1783 loops=1)
               Filter: (gender_v AND (gender = 'm'::character varying) AND
count
ry_v AND (country = 'br'::character varying) AND (dob_v = true) AND (dob
>= '197
4-08-26 08:21:52.158181-04'::timestamp with time zone) AND (dob <=
'1985-08-26 0
8:21:52.158181-04'::timestamp with time zone))
 Total runtime: 1042.54 msec
(6 rows)

They are loaded with the exact same dataset - 53k rows, ~10MB
Notice the estimates are roughly the same, but the execution time is
different.

I don't think it is the IO system, since 10MB will be cached by the OS and
iostat reports no activity on the disks (when running the query many
times over and over and in parellel).  it is a simple query..

Could it just be that the sun sucks? (And for the record - same schema,
nearly same query (modified for datetime syntax) on informix runs in 3
seconds).




--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/



pgsql-performance by date:

Previous
From: JM
Date:
Subject: Best tweak for fast results.. ?
Next
From: Richard Huxton
Date:
Subject: Re: Best tweak for fast results.. ?