Thread: Auto-clustering?
Hi all,
I have a table that in the typical case holds two-minute sample data for a few thousand sources. We often need to report on these data for a particular source over a particular time period, and we're finding this query tends to get a bit slow.
The structure of the table:
Table "public.sample"
Column | Type | Modifiers
-------------------+--------------------------+-------------------------------------------------
client | integer | not null
aggregateid | bigint | not null
sample | bigint | not null default nextval('samplekey'::regclass)
customer | integer |
period | integer | not null
starttime | integer | not null
duration | integer | not null
ip | text |
tariff | integer |
bytessentrate | bigint |
bytessent | bigint |
bytesreceived | bigint |
packets | integer | not null
queuetype | integer | not null default 0
collection | integer |
bytesreceivedrate | bigint |
greatestrate | bigint |
invalidated | timestamp with time zone |
Indexes:
"sample_pkey" PRIMARY KEY, btree (sample)
"sample_collection_starttime_idx" btree (collection, starttime)
"sample_customer_starttime_idx" btree (customer, starttime)
"sample_sample_idx" btree (client, sample)
Foreign-key constraints:
"sample_client_fkey" FOREIGN KEY (client) REFERENCES client(client)
fc=# explain analyse select collection, period, tariff, sum(bytesSent), sum(bytesReceived), sum(packets), max(sample), (starttime / 3600) * 3600 as startchunk from sample_20101001 where starttime between 1287493200 and 1290171599 and collection=128 and ip = '10.9.125.207' group by startchunk, tariff, collection, period;
                                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=34959.01..34959.03 rows=1 width=44) (actual time=67047.850..67047.850 rows=0 loops=1)
   ->  Bitmap Heap Scan on sample_20101001  (cost=130.56..34958.91 rows=5 width=44) (actual time=67047.847..67047.847 rows=0 loops=1)
         Recheck Cond: ((collection = 128) AND (starttime >= 1287493200) AND (starttime <= 1290171599))
         Filter: (ip = '10.9.125.207'::text)
         ->  Bitmap Index Scan on sample_20101001_collection_starttime_idx  (cost=0.00..130.56 rows=9596 width=0) (actual time=9806.115..9806.115 rows=6830 loops=1)
               Index Cond: ((collection = 128) AND (starttime >= 1287493200) AND (starttime <= 1290171599))
 Total runtime: 67048.201 ms
(7 rows)
I figure at most there should only be ~20,000 rows to be read from disk, and I expect that the index is doing a pretty good job of making sure only the rows that need reading are read. Inclusion of the ip in the query is almost redundant, as most of the time an ip has its own collection... My suspicion is that the rows we're interested in are very sparsely distributed on disk, so we're having to read too many pages for the query...
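A quick way I could check how scattered those rows are (just a sketch using the standard pg_stats view; correlation near zero would mean the matching rows are spread all over the heap, near 1 or -1 that they're already well ordered):

select attname, correlation
from pg_stats
where tablename = 'sample_20101001'
  and attname in ('collection', 'starttime');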
All of the queries on this table report on a single collection, so ideally a collection's data would all be stored in the same part of the disk... or at least clumped together. This can be achieved using CLUSTER; however, as far as I know there's no automated, non-cronesque means of clustering, and having the table become unusable during the CLUSTER is not ideal.
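For reference, the manual route would be something like this (a sketch only; the CLUSTER ... USING form assumes a reasonably recent server, and it takes an exclusive lock for the duration):

-- one-off physical reorder on the (collection, starttime) index; subsequent
-- bare "CLUSTER sample;" runs (e.g. from cron) will reuse the same index
CLUSTER sample USING sample_collection_starttime_idx;
ANALYZE sample;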
I've considered partitioning, but I don't think that's going to give the effect I need. Apparently partitioning only scales to a few dozen child tables, so that would only buy one order of magnitude of improvement for significant complexity.
Are there any other options?
Cheers!
--Royce
2010/12/17 Royce Ausburn <royce@inomial.com>
how about (auto)vacuuming?
you can test this suspicion in a very simple way (a minimal sketch follows the list):
- create a test table (like yours, including indexes and constraints, but with no data)
- insert into test select * from yours order by <the columns you want clustered>
- analyze the test table
- test the query on the new table
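For example (a sketch only; LIKE copies defaults, check constraints and indexes but not foreign keys, and the ORDER BY lists whatever you want physically grouped):

create table sample_test (like sample including defaults including constraints including indexes);
insert into sample_test select * from sample order by collection, starttime;
analyze sample_test;
-- now run the reporting query against sample_test and compare timings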
If the new query is much faster, and if you have intensive random UPD/DEL/INS activity, periodic CLUSTER could be a good idea...
but it depends on actual usage patterns (SELECT/modify ratio, types of updates, and so on).
cron is a way of automation, isn't it :-)
regarding partitioning: I guess it starts to make sense around 10M rows or 10 GB in one table.
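A quick way to see where the table sits relative to those thresholds (just a sketch; the table name is taken from the query plan above):

select pg_size_pretty(pg_total_relation_size('sample_20101001')) as total_size,
       (select reltuples::bigint from pg_class where relname = 'sample_20101001') as approx_rows;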
regarding clustering: it does not help with index bloat.
and finally, you did not specify which PostgreSQL version you are using.
cheers,
Filip
2010/12/17 Filip Rembiałkowski <filip.rembialkowski@gmail.com>:
> regarding clustering: it does not help with index bloat.

I'm almost sure it does, CLUSTER re-creates all indexes from scratch after copying the tuples.

Regards,
Marti
you are right, I must have missed it...
Table "public.u"
Column | Type | Modifiers
--------+-----------------------------+-----------
id | integer |
t | timestamp without time zone |
d | text |
Indexes:
"u_d" btree (d)
"u_id" btree (id)
"u_t" btree (t)
filip@filip=# select oid, relname, pg_Relation_size(oid) from pg_class where relname in('u','u_id','u_t','u_d');
oid | relname | pg_relation_size
-------+---------+------------------
64283 | u | 15187968
64289 | u_id | 6758400
64290 | u_t | 6086656
64291 | u_d | 16482304
filip@filip=# CLUSTER u USING u_t;
CLUSTER
filip@filip=# select oid, relname, pg_Relation_size(oid) from pg_class where relname in('u','u_id','u_t','u_d');
oid | relname | pg_relation_size
-------+---------+------------------
64283 | u | 12115968
64289 | u_id | 3391488
64290 | u_t | 3391488
64291 | u_d | 8216576
(4 rows)
So CLUSTER is effectively CLUSTER + REINDEX... nice.
Table "public.u"
Column | Type | Modifiers
--------+-----------------------------+-----------
id | integer |
t | timestamp without time zone |
d | text |
Indexes:
"u_d" btree (d)
"u_id" btree (id)
"u_t" btree (t)
filip@filip=# select oid, relname, pg_Relation_size(oid) from pg_class where relname in('u','u_id','u_t','u_d');
oid | relname | pg_relation_size
-------+---------+------------------
64283 | u | 15187968
64289 | u_id | 6758400
64290 | u_t | 6086656
64291 | u_d | 16482304
filip@filip=# CLUSTER u USING u_t;
CLUSTER
filip@filip=# select oid, relname, pg_Relation_size(oid) from pg_class where relname in('u','u_id','u_t','u_d');
oid | relname | pg_relation_size
-------+---------+------------------
64283 | u | 12115968
64289 | u_id | 3391488
64290 | u_t | 3391488
64291 | u_d | 8216576
(4 rows)
So CLUSTER is effectively CLUSTER + REINDEX... nice.
> fc=# explain analyse select collection, period, tariff, sum(bytesSent), sum(bytesReceived), sum(packets), max(sample), (starttime / 3600) * 3600 as startchunk from sample_20101001 where starttime between 1287493200 and 1290171599 and collection=128 and ip = '10.9.125.207' group by startchunk, tariff, collection, period;

If CLUSTER locks bother you, and you don't do UPDATEs, you might consider doing something like this:

- accumulate the rows in a "recent" table
- every hour, INSERT INTO archive SELECT * FROM recent ORDER BY (your cluster fields)
- DELETE FROM recent the rows you just inserted
- VACUUM recent

The cluster in your archive table will not be perfect, but at least all rows from 1 source in 1 hour will be stored close together. But clustering doesn't need to be perfect either; if you get 100x better locality, that's already good!

Now, if you have a huge amount of data but never query it with a precision exceeding 1 hour, you might consider creating an aggregate table where, at the end of every hour, you only store sum(), min(), max() of the last hour's data, grouped by the fields you want. You could also use a trigger, but that would generate a huge amount of UPDATEs. For the above query you'd do:

INSERT INTO stats_by_hour (columns...)
SELECT collection, ip, period, tariff, sum(bytesSent), sum(bytesReceived), sum(packets), max(sample), (starttime / 3600) * 3600 as startchunk
FROM sample_20101001
WHERE starttime > some value
GROUP BY collection, ip, period, tariff, startchunk;

Then you can run aggregates against this much smaller table instead.
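A rough sketch of that hourly rotation (the "recent" and "archive" table names are only illustrative, and VACUUM has to run outside the transaction):

begin;
insert into sample_archive
  select * from sample_recent
  where starttime < extract(epoch from date_trunc('hour', now()))::integer
  order by collection, starttime;   -- the order you want clustered on disk
delete from sample_recent
  where starttime < extract(epoch from date_trunc('hour', now()))::integer;  -- same cutoff: now() is fixed within the transaction
commit;
vacuum sample_recent;   -- reclaim the space of the rows just deleted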
Royce Ausburn wrote:
> All of the queries on this table report on a single collection, so ideally a collection's data would all be stored in the same part of the disk... or at least clumped together. This can be achieved using CLUSTER; however, as far as I know there's no automated, non-cronesque means of clustering, and having the table become unusable during the CLUSTER is not ideal.

If the lock level used by CLUSTER is a problem for you, you could consider the pg_reorg contrib module. AFAIK, it does similar work as CLUSTER but allows concurrent read and write activity on the table.

Regards.
Philippe.
On 17/12/2010, at 8:27 PM, Filip Rembiałkowski wrote:
> 2010/12/17 Royce Ausburn <royce@inomial.com>
>> Hi all,
>> I have a table that in the typical case holds two-minute sample data for a few thousand sources. We often need to report on these data for a particular source over a particular time period, and we're finding this query tends to get a bit slow.
>
> how about (auto)vacuuming?

A key piece of information I left out: we almost never update rows in this table.

>> I figure at most there should only be ~20,000 rows to be read from disk, and I expect that the index is doing a pretty good job of making sure only the rows that need reading are read. Inclusion of the ip in the query is almost redundant, as most of the time an ip has its own collection... My suspicion is that the rows we're interested in are very sparsely distributed on disk, so we're having to read too many pages for the query...
>
> you can test this suspicion in a very simple way:
> - create a test table (like yours, including indexes and constraints, but with no data)
> - insert into test select * from yours order by <the columns you want clustered>
> - analyze the test table
> - test the query on the new table
> If the new query is much faster, and if you have intensive random UPD/DEL/INS activity, periodic CLUSTER could be a good idea...
> but it depends on actual usage patterns (SELECT/modify ratio, types of updates, and so on).

Good idea! This vastly improves query times.

> and finally, you did not specify which PostgreSQL version you are using.
--Royce
On 17/12/2010, at 9:20 PM, Pierre C wrote:

>> fc=# explain analyse select collection, period, tariff, sum(bytesSent), sum(bytesReceived), sum(packets), max(sample), (starttime / 3600) * 3600 as startchunk from sample_20101001 where starttime between 1287493200 and 1290171599 and collection=128 and ip = '10.9.125.207' group by startchunk, tariff, collection, period;
>
> If CLUSTER locks bother you, and you don't do UPDATEs, you might consider doing something like this:
>
> - accumulate the rows in a "recent" table
> - every hour, INSERT INTO archive SELECT * FROM recent ORDER BY (your cluster fields)
> - DELETE FROM recent the rows you just inserted
> - VACUUM recent
>
> The cluster in your archive table will not be perfect, but at least all rows from 1 source in 1 hour will be stored close together. But clustering doesn't need to be perfect either; if you get 100x better locality, that's already good!

That's a really decent idea and can slot in perfectly well with how the application already works! We have existing DBAO code that handles monthly tables; it'll happily pop data into a recent table... In fact we can probably tolerate having a "today" table. Thanks!

--Royce