Re: perf problem with huge table

From: jesper@krogh.cc
Subject: Re: perf problem with huge table
Msg-id: 6864e5279b3489ff8f8a48fed69c2ad1.squirrel@shrek.krogh.cc
In response to: perf problem with huge table  (rama)
List: pgsql-performance


> Hi all,
>
> i am trying to move my app from M$sql to PGsql, but i need a bit of help
> :)

Apart from all the other good advice about partitioning the dataset and
such, there is another aspect to keep in mind. When you have a large
dataset and your queries become IO-bound, the "tuple density" is going to
hit you in two ways. Your dataset seems to have a natural clustering around
time, which is also what you would use for the partitioning. That also
means that if you roughly have that clustering of the data on disk, the
tuples you need to satisfy a query will sit on the same page, or on pages
close to it.
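For illustration, stock PostgreSQL can physically reorder a table along
such a natural clustering with CLUSTER (the table, column, and index names
below are made up, not from the original post):

```sql
-- Hypothetical example: physically order the heap by the time column,
-- so tuples that are queried together also sit on the same pages.
CREATE INDEX logtable_ts_idx ON logtable (ts);
CLUSTER logtable USING logtable_ts_idx;
-- Note: CLUSTER is a one-off reorder; rows inserted afterwards are not
-- kept in clustered order, so it needs re-running periodically.
```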

The cost of checking visibility for a tuple is to some degree a function
of the "tuple size", so anything you can do to increase the tuple
density will most likely benefit speed in two ways:

* You increase the likelihood that the next tuple is in the same page, so
  fetching it doesn't result in a random I/O seek.
* You increase the total number of tuples sitting in your system cache
  for the same amount of pages (memory), so fewer of them result in a
  random I/O seek.

So .. if you are carrying around columns you "don't really need", then
throw them away (that could be columns that can trivially be computed
from other columns), but you need to do your own testing here. To
stress the first point, here is a sample run on a fairly old desktop with
one SATA drive.

testtable has an "id integer" and a "data" column holding 486 bytes of text.
testtable2 has an "id integer" and a "data integer".

Both are filled with 10M tuples; PG was restarted and the OS caches dropped
beforehand to simulate a "totally disk-bound system".
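The exact DDL wasn't posted, but a minimal sketch of how such a pair of
tables could be set up (column sizes and row counts per the description
above; the repeat() padding is my assumption for reaching ~486 bytes):

```sql
-- Hypothetical reconstruction of the test setup described above.
CREATE TABLE testtable  (id integer, data text);
CREATE TABLE testtable2 (id integer, data integer);

-- Fill both with 10M rows; repeat() pads the text column to 486 bytes,
-- making testtable's tuples far wider (lower tuple density per page).
INSERT INTO testtable
  SELECT i, repeat('x', 486) FROM generate_series(1, 10000000) AS i;
INSERT INTO testtable2
  SELECT i, i FROM generate_series(1, 10000000) AS i;
```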

testdb=# select count(id) from testtable where id > 8000000 and id < 8500000;
 count
--------
 499999
(1 row)

Time: 7909.464 ms
testdb=# select count(id) from testtable2 where id > 8000000 and id < 8500000;
 count
--------
 499999
(1 row)

Time: 2149.509 ms

In this sample the narrow table is about 4 times faster, even though the
query never touches the "data" column. (On better hardware you'll most
likely see a similar effect.)

If the columns are needed, you can push the less frequently used columns
out to a 1:1 relation. That gives you some administrative overhead, but
it lets you decide at query time whether you want to pay the extra random
seeks to access that data.
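A sketch of that 1:1 split (the table and column names here are made up
for illustration, not from the original schema):

```sql
-- Hot, frequently-read columns stay in the narrow main table...
CREATE TABLE orders (
    id         integer PRIMARY KEY,
    created_at timestamp,
    amount     numeric
);
-- ...cold, rarely-read columns move to a 1:1 side table.
CREATE TABLE orders_detail (
    id    integer PRIMARY KEY REFERENCES orders(id),
    notes text
);

-- Pay the extra random seeks only when the wide columns are needed:
SELECT o.id, o.amount, d.notes
  FROM orders o
  JOIN orders_detail d USING (id)
 WHERE o.created_at > now() - interval '1 day';
```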

You have the same picture the "other way around": if your queries are
accessing data sitting in TOAST, you'll be paying a "double random IO"
cost for getting the tuple. So it is definitely a tradeoff that should be
made with care.
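For completeness, stock PostgreSQL already lets you nudge per-column TOAST
behaviour with ALTER TABLE ... SET STORAGE (table and column names are
again illustrative):

```sql
-- Force the column out-of-line into the TOAST table, uncompressed
-- (every read of it then pays the extra random IO described above):
ALTER TABLE orders ALTER COLUMN notes SET STORAGE EXTERNAL;

-- Prefer keeping the value inline in the main heap tuple, compressed:
ALTER TABLE orders ALTER COLUMN notes SET STORAGE MAIN;
```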

I've monkeypatched my own PG using this patch to toy around with criteria
for sending the "less frequently used data" to a TOAST table:
http://article.gmane.org/gmane.comp.db.postgresql.devel.general/135158/match=

Google "vertical partitioning" for more; that is basically what this is.

(I believe this could benefit my own application, so I'm also
trying to drum up some interest in the area.)

--
Jesper
