Re: RAID arrays and performance - Mailing list pgsql-performance

From Matthew Wakeling
Subject Re: RAID arrays and performance
Date
Msg-id alpine.DEB.1.10.0809191618030.23198@aragorn.flymine.org
Whole thread Raw
In response to Re: RAID arrays and performance  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Fri, 19 Sep 2008, Tom Lane wrote:
> Your example shows the IN-list as being sorted, but I wonder whether you
> actually are sorting the items in practice?  If not, you might try that
> to improve locality of access to the index.

Well, like I said, we generally don't have the luxury of dictating the
order of entries in the data source. However, the IN list itself is sorted
- more to do with making the logs readable and the tests reproducable than
for performance.

However, I have been looking at changing the order of the input data. This
particular data source is a 29GB xml file, and I wrote a quick program
which sorts that by one key in 40 minutes, which will hopefully allow
later data sources (which are easier to sort) to take advantage of spacial
locality in the table. However, that key is not the same one as the one
used in the query above, hence why I say we can't really dictate the order
of the entries. There's another complication which I won't go into.

> Also, parsing/planning time could be part of your problem here with 1000
> things to look at.  Can you adjust your client code to use a prepared
> query?  I'd try
>     SELECT * FROM table WHERE field = ANY($1::text[])
> (or whatever the field datatype actually is) and then push the list
> over as a single parameter value using array syntax.  You might find
> that it scales to much larger IN-lists that way.

Yes, that is a useful suggestion. However, I am fairly clear that the
system is disk seek-bound at the moment, so it probably wouldn't make a
massive improvement. It would also unfortunately require changing a lot of
our code. Worth doing at some point.

Matthew

--
"Interwoven alignment preambles are not allowed."
If you have been so devious as to get this message, you will understand
it, and you deserve no sympathy.  -- Knuth, in the TeXbook

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: RAID arrays and performance
Next
From: Mark Mielke
Date:
Subject: Re: RAID arrays and performance