Re: Need for speed 2

From: Merlin Moncure
Subject: Re: Need for speed 2
Msg-id: 6EE64EF3AB31D5448D0007DD34EEB3417DD1D2@Herge.rcsinc.local
In response to: Need for speed 2  (Ulrich Wisser)
Responses: What *_mem to increase when running CLUSTER  (Andrew Lazarus)
List: pgsql-performance


> Putting pg_xlog on the IDE drives gave about 10% performance
> improvement. Would faster disks give more performance?
>
> What my application does:
>
> Every five minutes a new logfile will be imported. Depending on the
> source of the request it will be imported in one of three "raw click"
> tables. (data from two months back, to be able to verify customer
> complains)
> For reporting I have a set of tables. These contain data from the last
> two years. My app deletes all entries from today and reinserts updated
> data calculated from the raw data tables.
>
> The queries contain no joins only aggregates. I have several indexes to
> speed different kinds of queries.
>
> My problems occur when one users does a report that contains to much old
> data. In that case all cache mechanisms will fail and disc io is the
> limiting factor.

It seems like you are pushing the limit of what the server can handle.
This means either:
1. an expensive server upgrade, or
2. making the software more efficient.

Since you sound I/O bound, you can tackle 1. by a. adding more memory or
b. increasing I/O throughput.

Unfortunately, you already have a pretty decent server (for x86), so a.
means a 64-bit platform and b. means more expensive hard drives.  The
archives are full of information about this...

Is your data well normalized?  You can do tricks like:
if a table has fields a,b,c,d,e,f where a is the primary key, and d,e,f
are not frequently queried or are often missing, move d,e,f to a
separate table.

Well-normalized structures are generally more cache efficient.  Do you
have lots of repeating and/or empty data values in your tables?
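
As a rough sketch of the a..f split described above (the table names
t_main and t_extra and the column types are made up for illustration,
not taken from your schema), it could look something like this:

```sql
-- Hypothetical names and types, mirroring the a..f example.
-- The frequently queried columns stay in a narrow table, so more of
-- its rows fit in each cached page.
CREATE TABLE t_main (
    a int4 PRIMARY KEY,
    b int4 NOT NULL,
    c int4 NOT NULL
);

-- The rarely queried or mostly missing columns move out.
-- A row exists here only when at least one of d, e, f has a value.
CREATE TABLE t_extra (
    a int4 PRIMARY KEY REFERENCES t_main (a) ON DELETE CASCADE,
    d text,
    e text,
    f text
);

-- Common report queries touch only the narrow table:
SELECT b, count(*) FROM t_main GROUP BY b;

-- The occasional query that needs d, e, f joins them back in:
SELECT m.a, m.b, x.d
FROM t_main m
LEFT JOIN t_extra x USING (a);
```

Whether this pays off depends on how often d,e,f are really needed; if
most queries end up doing the join, the split probably costs more than
it saves.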

Make your indexes and data as small as possible to reduce pressure on
the cache.  Here are just a few tricks:
1. use int2/int4 instead of numeric
2. know when to use char and when to use varchar
3. use functional indexes to reduce index expression complexity.  This
can give extreme benefits if you can, for example, reduce a two-field
index to a boolean.
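
To illustrate tricks 1 and 3, here is a minimal sketch.  The orders
table, its columns and the index name are hypothetical, and the
boolean expression is just one possible way to collapse a two-field
condition; adapt it to whatever your reports actually filter on.

```sql
-- All table, column and index names below are hypothetical.

-- Trick 1: int2 and int4 are fixed width (2 and 4 bytes), while
-- numeric is variable length.  pg_column_size() reports the bytes a
-- value occupies, so you can compare them on your own installation.
SELECT pg_column_size(1::int2)    AS int2_bytes,
       pg_column_size(1::int4)    AS int4_bytes,
       pg_column_size(1::numeric) AS numeric_bytes;

-- Trick 3: an expression index that reduces a condition over two
-- fields to a single boolean.
CREATE TABLE orders (
    id          int4 PRIMARY KEY,
    ordered_qty int4 NOT NULL,
    shipped_qty int4 NOT NULL
);

-- Instead of a two-column index on (ordered_qty, shipped_qty):
CREATE INDEX orders_fulfilled_idx
    ON orders ((shipped_qty >= ordered_qty));

-- The query should repeat the indexed expression as written so the
-- planner has a chance to match it; EXPLAIN shows whether it does.
EXPLAIN
SELECT count(*)
FROM orders
WHERE shipped_qty >= ordered_qty;
```

The planner may still prefer a sequential scan if the boolean is not
selective enough, so check the EXPLAIN output against real data before
dropping the wider index.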

Merlin

