Re: "Slow" query or just "Bad hardware"? - Mailing list pgsql-performance

From PFC
Subject Re: "Slow" query or just "Bad hardware"?
Msg-id op.t8ootiakcigqcu@apollo13.peufeu.com
In response to "Slow" query or just "Bad hardware"?  ("Jesper Krogh" <jesper@krogh.cc>)
List pgsql-performance
Hm, so this table has 10 million entries and it does not fit in 32 GB of
RAM?
Could you investigate:
- the average size of the rows in both tables
- a quick description of your table columns, in particular the average
size of your TEXT fields, especially the large one(s) like comments etc.
(don't bother about INTs unless you have like 50 int columns)
- which fields get TOASTed and which don't, and how many TOASTed fields
this query touches; each one can add 1 seek per field per fetched row if
they're not cached
- what other stuff in your database is using those gigabytes of RAM
(indexes which are used often do count)
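
Something like this would answer most of that (just a sketch; I'm assuming
the big table is called members and the large field is comments, adjust
the names to yours):

  -- total size on disk (heap + indexes + TOAST) vs. the heap alone
  SELECT pg_size_pretty(pg_total_relation_size('members')) AS total_size,
         pg_size_pretty(pg_relation_size('members'))       AS heap_size;

  -- average row size, and average size of the suspected big field
  SELECT avg(pg_column_size(m.*))        AS avg_row_bytes,
         avg(pg_column_size(m.comments)) AS avg_comments_bytes
    FROM members m;

  -- storage strategy per column: x = extended, e = external, m = main, p = plain
  SELECT attname, attstorage
    FROM pg_attribute
   WHERE attrelid = 'members'::regclass
     AND attnum > 0 AND NOT attisdropped;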

I would tend to think that you are not going to display 200 kilobytes of
text on your listing webpage; most likely you show something like 100 or
200 bytes of text from each row, right? If that is the case, 10M rows *
200 bytes = 2 GB to keep cached in RAM, plus overhead, so it should be fast.

You may want to split your table in two: one part holding the fields
which are often used in bulk for searches and listings (especially when
you list 200 rows), and the other holding the large fields which are only
displayed on the "show details" page.

Note that one (or several) large text fields will not kill your
performance: postgres will store them out of line (TOAST) for you without
you needing to ask, so your main table stays small and well cached. Of
course, if you grab that large 10 kB text field 200 times just to display
its first 80 characters followed by "..." on your listing page, then
you're screwed ;) that's one of the things to avoid (see the sketch below).
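
One cheap way around it (a sketch; the table and column names here are
mine, not yours): precompute the excerpt into a small, non-TOASTed column
and only touch the big field on the details page.

  -- hypothetical names, adjust to your schema
  ALTER TABLE members ADD COLUMN comments_excerpt text;

  -- populate once, then keep it up to date with a trigger or in the application
  UPDATE members
     SET comments_excerpt = substring(comments from 1 for 80) || '...';

  -- the listing page reads only the small column; the TOASTed field stays on disk
  SELECT member_id, username, comments_excerpt
    FROM members
   ORDER BY member_id DESC
   LIMIT 200;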

However, if your "comments" field is small enough that PG doesn't want to
TOAST it offline (say, 500 bytes), but still represents the bulk of your
table size (for instance you have just a few INTs beside that that you
want to quickly search on) then you may tell postgres to store the large
fields offline (EXTERNAL, check the docs), and also please enable
automatic compression.
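
For example (assumed names again):

  -- EXTENDED = compress, and move out of line if the row is still too big
  -- (it is already the default for text); EXTERNAL = out of line, no compression
  ALTER TABLE members ALTER COLUMN comments SET STORAGE EXTENDED;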

If however you have something like 200 INT columns, or a few dozen small
TEXTs, or just lots of columns, TOAST is no help. In this case you must
fight bloat by identifying which columns of your table need to be accessed
often (for searches, listings, reporting, etc.) and which are not accessed
often (i.e. details page only, monthly reports, etc.). If you are lucky,
the columns in the first group will form a much smaller subset of your
gigabytes of data. Then you split your table in two (vertically), so the
small table stays small.

EXAMPLE on a community site:

- members table, huge, search is slow, joining to the forum tables to get
a user's name is horribly slow because the cache is full and it seeks
- push members' profiles and other data that only show up on the details
page to a second table: the main members table becomes much smaller, fits
in RAM now, search is fast, and joins to members are also fast
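
A minimal sketch of that split (all names here are assumptions, adapt to
your schema; I'm assuming a posts table with a member_id column):

  -- hot table: small rows, stays cached, used by searches and listings
  CREATE TABLE members (
      member_id serial PRIMARY KEY,
      username  text NOT NULL,
      joined_on date NOT NULL
  );

  -- cold table: bulky fields, only read on the "show details" page
  CREATE TABLE member_profiles (
      member_id integer PRIMARY KEY REFERENCES members(member_id),
      bio       text,
      signature text,
      avatar    bytea
  );

  -- the forum listing now joins only against the small, well-cached table
  SELECT p.post_id, m.username
    FROM posts p
    JOIN members m USING (member_id)
   ORDER BY p.post_id DESC
   LIMIT 200;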

Words to remember: working set ;)

