Re: Joel's Performance Issues WAS : Opteron vs Xeon - Mailing list pgsql-performance

From Mischa Sandberg
Subject Re: Joel's Performance Issues WAS : Opteron vs Xeon
Msg-id 1114203230.4269645e44160@webmail.telus.net
In response to Re: Joel's Performance Issues WAS : Opteron vs Xeon  (Alvaro Herrera <alvherre@dcc.uchile.cl>)
List pgsql-performance
Quoting Alvaro Herrera <alvherre@dcc.uchile.cl>:

> One further question is: is this really a meaningful test?  I mean, in
> production are you going to query 300000 rows regularly?  And is the
> system always going to be used by only one user?  I guess the question
> is if this big select is representative of the load you expect in
> production.

While there may be some far-out queries that nobody would try,
you might be surprised at what becomes the norm for queries
as soon as the engine feasibly supports them. SQL is used for
warehouse and OLAP apps, as a data queue, and as the coordinator
or bridge for (non-SQL) replication apps. In all of these,
you see large updates, large result sets and volatile tables
("large" to me means over 20% of a table and over 1M rows).

To answer your specific question: yes, every 30 mins,
in a data redistribution app that makes a 1M-row query
and writes ~1000 individual update files covering overlapping sets of rows.
It's the kind of operation SQL doesn't do well,
so you have to rely on one big query to get the data out.
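To make the shape of that concrete, here is a minimal sketch of the "one big query, many overlapping output sets" pattern. The table name, columns, and destination predicates are all hypothetical (the original post names none of them), and sqlite3 stands in for PostgreSQL purely so the example is self-contained; the point is that the rows come out of the database once, and the overlapping per-destination subsets are carved out client-side:

```python
import sqlite3

def export_overlapping_sets(conn, dests):
    """Run one big SELECT, then split the rows into per-destination
    subsets. Destinations may overlap, so a row can land in several
    output sets -- pulling the data once avoids ~1000 separate queries.
    'items' and the predicates in dests are illustrative names only.
    """
    rows = conn.execute("SELECT id, payload FROM items ORDER BY id").fetchall()
    # Each destination gets every row its predicate accepts; in the real
    # app each of these lists would be written out as an update file.
    return {name: [r for r in rows if pred(r)] for name, pred in dests.items()}
```

In the real application each subset would be streamed to a file (and with PostgreSQL you would typically use a server-side cursor rather than `fetchall()` for a 1M-row result), but the structure is the same: one pass over the data, many overlapping consumers.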

My 2c
--
"Dreams come true, not free." -- S.Sondheim, ITW

