Thread: Sanity checking big select performance

Sanity checking big select performance

From: Jeff Chen
Hi friends!

I'd love to get a sanity check on whether a fat select query I'm doing makes sense given the infrastructure that we have.

We have 3 big tables that we typically join together for certain queries: a ~40 million row photos table, a ~20 million row users table, and a ~50 million row photo_to_album table that maps photos to albums.

We like to display real time analytics, which often results in a query like:

select (random aggregations)
from
photo_to_album join photos on photos.id = photo_to_album.photo_id
join users on users.id = photos.user_id
where
photo_to_album.album_id = <something>
and
photos.created_at between <some dates>
and <other junk>

We have indexes on all of the joins, and the where clauses.
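
Roughly, the indexes in play look like this (paraphrasing; the exact
definitions may differ a bit):

    create index on photo_to_album (album_id);
    create index on photo_to_album (photo_id);
    create index on photos (user_id);
    create index on photos (created_at);
    -- photos.id and users.id are the primary keys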

One of these queries that should be targeting something like 300K photos takes 38 seconds to run (with an aggregate/nested loop taking effectively all of that time), and then upon second execution with a warm cache, 4 seconds.

Also worryingly, it spikes read IOPS to almost 1500/sec during the time and write IOPS 200/sec.  When not running the query, steady level read iops basically nil, write hovers around 50-100.

This also increases the queue depth from basically 0 up to 6.  Keeping the queue depth high seems to cause timeouts in other queries.  The CPU is barely if at all affected, hovering around 20%.  Memory also barely affected.

We have a RDS Postgres database, m3.2xlarge with 2000 Provisioned IOPS and 400GB storage.  This translates to 8 virtual CPUs, 30GiB memory, and all SSD drives.

Several questions here:

1) Is that level of IOPS normal?
2) Is it bad that the level of iops can queue requests that screw up the whole database even if it's just select queries?  Especially when the CPU and Memory are still plentiful?
3) What is up with the huge difference between cold and warm cache?

Any help is appreciated!

- jzc

Re: Sanity checking big select performance

From: Tomas Vondra
On 28.10.2014 22:15, Jeff Chen wrote:
> Hi friends!
>
> I'd love to get a sanity check on whether a fat select query I'm doing
> makes sense given the infrastructure that we have.
>
> We have 3 big tables that we typically join together for certain
> queries: a ~40 million row photos table, a ~20 million row users table,
> and a ~50 million row photo_to_album table that maps photos to albums.

So how much data is it? Does it fit within RAM (after loading into DB,
with all the indexes)?
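
Something like this would tell you (the sizes include indexes and TOAST):

    select relname, pg_size_pretty(pg_total_relation_size(relid)) as total_size
      from pg_catalog.pg_statio_user_tables
     where relname in ('photos', 'users', 'photo_to_album');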

> We like to display real time analytics, which often results in a query like:
>
> select (random aggregations )
> from
> photo_to_album join photos on photos.id = photo_to_album.photo_id
> join users on users.id = photos.user_id
> where
> photo_to_album.album_id = <something>
> and
> photos.created_at between <some dates>
> and <other junk>
>
> We have indexes on all of the joins, and the where clauses.

Can we get EXPLAIN (and ideally EXPLAIN ANALYZE) for such queries?
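
That is, something along these lines (with your actual aggregates and
values filled in):

    explain (analyze, buffers)
    select (random aggregations)
      from photo_to_album
      join photos on photos.id = photo_to_album.photo_id
      join users on users.id = photos.user_id
     where photo_to_album.album_id = <something>
       and photos.created_at between <some dates>;

The BUFFERS part shows how much of the data came from shared buffers
vs. disk, which is exactly what we need to see here.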

> One of these queries that should be targeting something like 300K
> photos takes 38 seconds to run (with an aggregate/nested loop taking
> effectively all of that time), and then upon second execution with a
> warm cache, 4 seconds.

Well, if you're hitting disk, it's going to be slow. As you observed,
after loading it into page cache, it's much faster.

> Also worryingly, it spikes read IOPS to almost 1500/sec during the time
> and write IOPS 200/sec.  When not running the query, steady level read
> iops basically nil, write hovers around 50-100.
>
> This also increases the queue depth from basically 0 up to 6.  Keeping
> the queue depth high seems to cause timeouts in other queries.  The CPU
> is barely if at all affected, hovering around 20%.  Memory also barely
> affected.

20% is ~2 CPU cores (as you have 8 of them).

> We have a RDS Postgres database, m3.2xlarge with 2000 Provisioned IOPS
> and 400GB storage.  This translates to 8 virtual CPUs, 30GiB memory, and
> all SSD drives.

AFAIK there are two PostgreSQL major versions supported on RDS - 9.1 and
9.3. Which one are you using?

Also, can you list the values of some basic parameters (shared_buffers,
work_mem)? We don't know what the defaults are on RDS, nor whether you
have modified them.
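
You can get them straight from the database, e.g.:

    select name, setting, unit
      from pg_settings
     where name in ('shared_buffers', 'work_mem',
                    'effective_cache_size', 'maintenance_work_mem');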

> Several questions here:
>
> 1) Is that level of IOPS normal?

Ummmmm, why wouldn't it be? Each IO request works with 16 KB (on EBS),
and you're reading/writing a certain amount of data.
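
Roughly: 1500 read IOPS * 16 KB ≈ 24 MB/s of random reads, plus 200
write IOPS * 16 KB ≈ 3 MB/s of writes - not surprising for a query
touching ~300k rows scattered over tables this large.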

> 2) Is it bad that the level of iops can queue requests that screw up the
> whole database even if it's just select queries?  Especially when the
> CPU and Memory are still plentiful?

You're saturating a particular resource. If you hit the I/O wall, you
can't make use of the CPU / memory. The fact that it slows down your
queries is somewhat expected.

Is it bad? Well, if you need to minimize impact on other queries, then
probably yes.

> 3) What is up with the huge difference between cold and warm cache?

I don't understand why you're surprised by this. On m3.2xlarge with
EBS-Optimized networking (i.e. 1 Gbit dedicated to EBS) you get up to
~120 MB/s, except that you provisioned 2000 IOPS, which at 16 KB per
request is ~32 MB/s. Memory is orders of magnitude faster, hence the
difference.

regards
Tomas


Re: Sanity checking big select performance

From: Kevin Grittner
Jeff Chen <jeff@pixleeteam.com> wrote:

> One of these queries that should be targeting something like 300K
> photos takes 38 seconds to run (with an aggregate/nested loop
> taking effectively all of that time),

With the seek time of commodity disk drives typically being 9ms, a
naive approach using random access to join to 300k rows on a single
thread with no caching would take 45 minutes; so the fact that you
are seeing much better than that implies some benefit from cache,
some sequential scanning, faster drives, or concurrent access to
multiple spindles.
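
(That is, 300,000 random reads * 9 ms per seek = 2,700 seconds, i.e.
45 minutes.)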

> and then upon second execution with a warm cache, 4 seconds.

This shows that it is faster to access data in RAM than on disk,
and that your data wasn't already all in cache (most likely because
it doesn't all fit in RAM).

> Also worryingly, it spikes read IOPS to almost 1500/sec during
> the time and write IOPS 200/sec.  When not running the query,
> steady level read iops basically nil, write hovers around 50-100.

The reads are just another symptom of not having the data fully
cached.  The writes are more interesting.  The two obvious
possibilities are that the query needed to use work files (for
sorts or hash tables) or that you were accessing a fair amount of
data which had not been vacuumed since it was last modified.  To
help improve performance for the first, you might want to consider
increasing work_mem (although this will reduce RAM available for
caching).  To improve performance for the second you might want to
make autovacuum more aggressive.
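
For example (the values are purely illustrative, and on RDS the
server-wide settings are changed through the parameter group rather
than postgresql.conf):

    -- try a larger work_mem for this session only, before
    -- committing to a global change
    set work_mem = '64MB';

    -- make autovacuum visit the big tables more often
    alter table photos set (autovacuum_vacuum_scale_factor = 0.02);
    alter table photo_to_album set (autovacuum_vacuum_scale_factor = 0.02);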

To get more specific advice, you may want to read this page and
follow the advice there:

https://wiki.postgresql.org/wiki/SlowQueryQuestions

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company