Thread: Partial indexes instead of partitions
Hi all,

I have a very big table (2000 inserts per second; I have to store 20 days of data). The table has 2 indexes, on columns with almost-random values. Since keeping those two indexes up to date can't be done (updating 2 indexes with random values 2000 times per second on such a huge table is impossible), I thought that partitioning was the way to go. Now I think I have 2 options:

a) create 480 partitions, 1 for each hour of the day, with 2 indexes on each partition
b) create 20 partitions, and create 24*2 partial indexes on the current partition; then the next day (overnight) create 2 global indexes for the table and drop the 24*2 partial indexes...

I thought about option b) because I don't like the fact that the planner takes "ages" to plan a query when there are 480 partitions; with option b) I would have:

19 partitions with 2 indexes each
1 partition (the "current day" one) with 24*2 partial indexes

Does it make sense? Has anyone ever used partial indexes instead of partitions?
On 11 June 2010 13:00, Leonardo F <m_lists@yahoo.it> wrote:
> a) create 480 partitions, 1 for each hour of the day, with 2 indexes on
> each partition
> b) create 20 partitions, and create 24*2 partial indexes on the current
> partition; then the next day (overnight) create 2 global indexes for the
> table and drop the 24*2 partial indexes...
>
> I thought about option b) because I don't like the fact that the planner
> takes "ages" to plan a query when there are 480 partitions; with option
> b) I would have:
>
> 19 partitions with 2 indexes each
> 1 partition (the "current day" one) with 24*2 partial indexes

Could you please explain the reason for so many partitions?

In case b) you will face a huge overhead related to the necessity of checking all the data in the table every time a new index is created (it doesn't matter that it is partial).

--
Sergey Konoplev
Blog: http://gray-hemp.blogspot.com /
Linkedin: http://ru.linkedin.com/in/grayhemp /
JID/GTalk: gray.ru@gmail.com / Skype: gray-hemp / ICQ: 29353802
> Could you please explain the reason for so many partitions?

Because otherwise there would be tons of rows in each partition, and randomly "updating" the indexes over that many rows 2000 times per second isn't doable (the indexes get so big that it would be like writing a multi-GB file randomly).

> In case b) you will face a huge overhead related to the necessity of
> checking all the data in the table every time a new index is created

I would create the table with all the indexes already in place; but only the index covering the "current timestamp of the inserted row" would be updated; the others wouldn't be touched.
On 11 June 2010 16:29, Leonardo F <m_lists@yahoo.it> wrote:
>> Could you please explain the reason for so many partitions?
>
> Because otherwise there would be tons of rows in each partition, and
> randomly "updating" the indexes over that many rows 2000 times per
> second isn't doable (the indexes get so big that it would be like
> writing a multi-GB file randomly)
>
>> In case b) you will face a huge overhead related to the necessity of
>> checking all the data in the table every time a new index is created
>
> I would create the table with all the indexes already in place; but only
> the index covering the "current timestamp of the inserted row" would be
> updated; the others wouldn't be touched.

Well, the situation is still ambiguous, so: is it possible to provide the table and index definitions? And it would be great if you could describe the queries you are going to run on this table, or just provide the SQL.
> Well, the situation is still ambiguous, so: is it possible to provide
> the table and index definitions? And it would be great if you could
> describe the queries you are going to run on this table, or just
> provide the SQL.

Sure! Basically what I'm trying to do is to partition the indexes on the table where the data is being inserted into smaller indexes, but without using partitions: I would use partial indexes. "Historic" data will have just the big indexes...

Say that I want to store 1G rows: 100M per day, 10 days. I would have 10 tables, 9 of them with 2 big indexes (the indexes on the 2 columns that are going to be used in queries together with the timestamp), and the latest one with 24*2 smaller indexes (so that insertion will still be fast), to be dropped overnight after the 2 big indexes have been created... Then a new table is created (for the new day's data) with the small indexes, and the oldest table is dropped (as I said, I won't store more than 10 days).

This is "pseudo SQL":

CREATE TABLE master (
    ts    timestamp,
    key1  bigint,       -- populated with almost-random values
    key2  bigint,       -- populated with almost-random values
    data1 varchar(20),
    [...]
);

CREATE TABLE master_01 (
    CHECK ( ts >= DATE '2006-03-01' AND ts < DATE '2006-03-02' )
) INHERITS (master);
CREATE INDEX master_01_ix1 ON master_01 (key1);
CREATE INDEX master_01_ix2 ON master_01 (key2);

CREATE TABLE master_02 (
    CHECK ( ts >= DATE '2006-03-02' AND ts < DATE '2006-03-03' )
) INHERITS (master);
CREATE INDEX master_02_ix1 ON master_02 (key1);
CREATE INDEX master_02_ix2 ON master_02 (key2);

[10 tables like the above...]

With this config, insertion into "today's" table will be slow at the end of the day, because updating the 2 indexes will be very slow (they will have become very large).
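Since this scheme uses inheritance partitioning, inserts into master have to be routed to the current day's child table somehow; a minimal sketch of a routing trigger (the function and trigger names here are made up for illustration, and master_10 stands for the current day's partition) could look like:

    -- Hypothetical routing trigger for the inheritance scheme above.
    CREATE OR REPLACE FUNCTION master_insert_trigger()
    RETURNS trigger AS $$
    BEGIN
        IF NEW.ts >= DATE '2006-03-10' AND NEW.ts < DATE '2006-03-11' THEN
            INSERT INTO master_10 VALUES (NEW.*);
        ELSE
            RAISE EXCEPTION 'ts out of range: %', NEW.ts;
        END IF;
        RETURN NULL;  -- the row has already been stored in the child table
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER insert_master_trigger
        BEFORE INSERT ON master
        FOR EACH ROW EXECUTE PROCEDURE master_insert_trigger();

At 2000 inserts per second a per-row trigger adds measurable overhead, so it may be cheaper for the application to insert directly into the current child table and skip the trigger entirely.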
So I thought I could create, on "today's table", instead of the 2 indexes on the whole table, something like:

CREATE INDEX master_10_1_ix1 ON master_10 (key1)
    WHERE (ts >= TIMESTAMP '2006-03-10 00:00' AND ts < TIMESTAMP '2006-03-10 01:00');
(same thing for the second indexed column)

CREATE INDEX master_10_2_ix1 ON master_10 (key1)
    WHERE (ts >= TIMESTAMP '2006-03-10 01:00' AND ts < TIMESTAMP '2006-03-10 02:00');
(same thing for the second indexed column)

[the other 22 index definitions like the above, one per hour...]

That is, the table where data is being inserted (ts will always be ascending, so I will always insert data into the latest table) will have multiple small indexes. Then, at night, the small indexes would be dropped after one big index has been created (since no more rows will be inserted into that table, I don't care if the index is big).

So, a query like:

select * from master where key1=938479
    and ts between "now() - 10 minutes" and now()

would use the proper index on "today's" table; a query like:

select * from master where key1=938479
    and ts between "3 days ago" and "2 days ago"

would use the indexes in the tables "today minus 2 days" and "today minus 3 days".
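The overnight swap described above could be sketched roughly like this (a sketch only; the hourly index names are assumed to follow a master_10_<hour>_ix<n> pattern, and note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block):

    -- Overnight maintenance on yesterday's partition (master_10):
    -- build the two full-table indexes without blocking readers...
    CREATE INDEX CONCURRENTLY master_10_ix1 ON master_10 (key1);
    CREATE INDEX CONCURRENTLY master_10_ix2 ON master_10 (key2);

    -- ...then drop the 24*2 hourly partial indexes.
    DROP INDEX master_10_1_ix1;
    DROP INDEX master_10_1_ix2;
    -- [...and so on for the remaining hours...]
    DROP INDEX master_10_24_ix1;
    DROP INDEX master_10_24_ix2;

Since no new rows are being written to yesterday's table, the concurrent builds only have to compete with the current day's insert load for I/O.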
On 11 June 2010 17:15, Leonardo F <m_lists@yahoo.it> wrote:
> Basically what I'm trying to do is to partition the indexes on the table
> where the data is being inserted into smaller indexes, but
> without using partitions: I would use partial indexes.
> "Historic" data will have just the big indexes...

Well, you can estimate whether it's worth bothering with index partitioning. For "selects" you should compare log_M(N), where

N - number of records
M - (base) number of records per b-tree node (in one 8k page)

for a whole-table partition versus an index partition, but I do not think the difference would be great. For "inserts" I do not see a reason why index partitioning would be better, because AFAIK the b-tree would behave exactly the same in both cases.

> That is, the table where data is being inserted (ts will always be
> ascending, so I will always insert data into the latest table)
> will have multiple small indexes.
> Then, at night, the small indexes would be dropped after one big
> index has been created (since no more rows will be inserted into that
> table, I don't care if the index is big).
>
> So, a query like:
> select * from master where key1=938479
>     and ts between "now() - 10 minutes" and now()

You should explicitly state the index conditions and the partition conditions here, otherwise they will not be used:

SELECT * FROM master
WHERE
    -- For the table partition
    ts >= '2006-03-10' AND ts < '2006-03-11' AND
    -- For the index partition
    ts >= '2006-03-10 01:00' AND ts < '2006-03-10 02:00' AND
    -- Target conditions
    key1 = 938479 AND
    ts BETWEEN now() - interval '10 minutes' AND now();

Furthermore, I would suggest using this index:

CREATE INDEX master_10_2_ix1 ON master_10 (key1, ts)
    WHERE ts >= '2006-03-10 01:00' AND ts < '2006-03-10 02:00';

if you want the "Target conditions" to work in an optimal way.

> a query like:
> select * from master where key1=938479
>     and ts between "3 days ago" and "2 days ago"

You cannot use BETWEEN here, because it is equal to "ts >= ... AND ts <= ...", not "ts >= ... AND ts < ..." as specified in the table definition. See above.
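The log_M(N) comparison can be checked with a quick calculation. Assuming roughly 400 index entries per 8k page, and comparing a single index over the full 1G rows against a 100M-row per-day index (the numbers from earlier in the thread):

    -- Rough b-tree depth estimate: log_M(N) with M ~ 400 entries per page.
    -- PostgreSQL's log(b, x) takes the base as its first argument.
    SELECT log(400, 1e9::numeric) AS depth_whole_table,    -- ~3.5 levels
           log(400, 1e8::numeric) AS depth_day_partition;  -- ~3.1 levels

The tree is only about half a level shallower, so the select-side benefit of index partitioning really is small; the case for partitioning here rests on insert-side cache locality, not tree depth.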
> For "inserts" I do not see a reason why index partitioning would be
> better, because AFAIK the b-tree would behave exactly the same in both
> cases.

No: when the index gets very big, inserting random values gets very slow.

But still, my approach doesn't work, because I thought PostgreSQL was able to "merge" different partial indexes (using BitmapOr/BitmapAnd) when the WHERE condition matches multiple partial index predicates... but that's (I guess) not that easy to do (basically the partial index predicates would have to be checked like the constraints in the inheritance mechanism, and Or/And bitmapped as happens in the "regular" partitioning method).

That is, having a table with:

CREATE TABLE test (
    a timestamp without time zone,
    b integer
);

CREATE INDEX test1idx ON test (b)
    WHERE a >= '2008-03-10 14:00:00' AND a < '2008-03-10 16:00:00';
CREATE INDEX test2idx ON test (b)
    WHERE a >= '2008-03-10 16:00:00' AND a < '2008-03-10 18:00:00';

the select:

select * from test
    where a > '2008-03-10 15:00:00' and a < '2008-03-10 17:00:00'
    and b = 100;

should use a BitmapOr between index scans on test1idx and test2idx... But I think that's a complicated thing for the planner... even though it doesn't sound that different from what the planner does for partition pruning...
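One manual workaround (a sketch, not something the planner does for you) is to split the straddling time range yourself, so that each branch's WHERE clause falls entirely inside one partial index predicate:

    -- Each branch's range is implied by exactly one partial index
    -- predicate, so the planner can use test1idx and test2idx respectively.
    SELECT * FROM test
    WHERE b = 100
      AND a > '2008-03-10 15:00:00' AND a < '2008-03-10 16:00:00'
    UNION ALL
    SELECT * FROM test
    WHERE b = 100
      AND a >= '2008-03-10 16:00:00' AND a < '2008-03-10 17:00:00';

Because the two ranges don't overlap, the UNION ALL returns the same rows as the original straddling query, just without deduplication overhead.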
On 14 June 2010 13:24, Leonardo F <m_lists@yahoo.it> wrote:
>> For "inserts" I do not see a reason why index partitioning would be
>> better, because AFAIK the b-tree would behave exactly the same in both
>> cases.
>
> No: when the index gets very big, inserting random values gets
> very slow.

Hm, interesting. Could you please provide some references/links?
On Mon, Jun 14, 2010 at 5:24 AM, Leonardo F <m_lists@yahoo.it> wrote:
>> For "inserts" I do not see a reason why index partitioning would be
>> better, because AFAIK the b-tree would behave exactly the same in both
>> cases.
>
> No: when the index gets very big, inserting random values gets
> very slow.

Do you have any empirical evidence for this being a real problem, or are you simply guessing? I have tables with 500m+ rows, on commodity hardware (4 SATA disks in RAID 10), and inserts to the indexes on those tables remain quite acceptable from a performance standpoint.

--
- David T. Wilson
david.t.wilson@gmail.com
On Mon, Jun 14, 2010 at 7:27 AM, David Wilson <david.t.wilson@gmail.com> wrote:
> On Mon, Jun 14, 2010 at 5:24 AM, Leonardo F <m_lists@yahoo.it> wrote:
>>> For "inserts" I do not see a reason why index partitioning would be
>>> better, because AFAIK the b-tree would behave exactly the same in
>>> both cases.
>>
>> No: when the index gets very big, inserting random values gets
>> very slow.
>
> Do you have any empirical evidence for this being a real problem, or are
> you simply guessing? I have tables with 500m+ rows, on commodity
> hardware (4 SATA disks in RAID 10), and inserts to the indexes on those
> tables remain quite acceptable from a performance standpoint.

Can you define acceptable? IIRC the OP is looking for 20,000+ inserts / sec.

--
Peter Hunsberger
On Mon, Jun 14, 2010 at 8:38 AM, Peter Hunsberger <peter.hunsberger@gmail.com> wrote:
> Can you define acceptable? IIRC the OP is looking for 20,000+ inserts /
> sec.

He's actually only looking for 2k inserts/sec. With a battery-backed controller I can sustain that, yes. That's also on commodity hardware (the whole system was under $2k a year and a half ago).
On Mon, Jun 14, 2010 at 08:27:49AM -0400, David Wilson wrote:
> On Mon, Jun 14, 2010 at 5:24 AM, Leonardo F <m_lists@yahoo.it> wrote:
>>> For "inserts" I do not see a reason why index partitioning would be
>>> better, because AFAIK the b-tree would behave exactly the same in
>>> both cases.
>>
>> No: when the index gets very big, inserting random values gets
>> very slow.
>
> Do you have any empirical evidence for this being a real problem, or are
> you simply guessing?

Just guessing here as well, but when you're inserting uniformly distributed "random" values, it should slow down quite a lot. You may happen to be lucky in your distribution and keep the upper nodes of the tree in cache, but the more uniform the distribution, the less this is going to happen. The larger the index and the more uniform the distribution, the more time is going to be spent pulling blocks off the disk. AFAIU the OP is trying to give the cache a chance of doing some useful work by partitioning by time, so it's forced to go to disk less.

Slightly more usefully for the OP: have you considered a couple of "levels" in your hierarchy? Maybe bi-hourly (~15 million records?) within the current day, moving them over into a "day" table at night (or whenever is better). That would also be a good time to cluster the data, if that would help.

--
Sam  http://samason.me.uk/
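The two-level idea could be sketched roughly like this (the table names are hypothetical; note that CLUSTER rewrites the table under an exclusive lock, so it only fits if the nightly window allows it):

    -- Nightly roll-up: fold yesterday's bi-hourly tables into one day table.
    CREATE TABLE master_day_2006_03_10 (LIKE master INCLUDING DEFAULTS);

    INSERT INTO master_day_2006_03_10
    SELECT * FROM master_bihourly_00
    UNION ALL
    SELECT * FROM master_bihourly_02;
    -- [...remaining bi-hourly tables...]

    CREATE INDEX master_day_2006_03_10_ix1 ON master_day_2006_03_10 (key1, ts);
    CREATE INDEX master_day_2006_03_10_ix2 ON master_day_2006_03_10 (key2, ts);

    -- Optionally rewrite the table in key1 order for better locality.
    CLUSTER master_day_2006_03_10 USING master_day_2006_03_10_ix1;

    DROP TABLE master_bihourly_00;
    DROP TABLE master_bihourly_02;
    -- [...and so on...]

The trade-off versus the partial-index approach is exactly the one Leonardo raises next: this re-copies a full day of data every night.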
> AFAIU the OP is trying to give the cache a chance of doing some useful
> work by partitioning by time, so it's forced to go to disk less.

Exactly.

> have you considered a couple of "levels" in your hierarchy? Maybe
> bi-hourly (~15 million records?) within the current day, moving them
> over into a "day" table at night

I was going for the partitioned-index approach because it would avoid re-copying the data into another table. My idea was:

1) create partial indexes on today's table
2) at night, create a whole index (not partial) on yesterday's table
3) drop the partial indexes on yesterday's table

But this doesn't work, because partial indexes aren't "appended" the way partitioned tables are... That is, if I have one index covering half the table and another covering the other half, and I query data over the "intersection", I'll always get a plain table scan, where I would expect the planner to do an append of the results of 2 index scans...

Would this be something that could be added to the TODO list? It doesn't look that different from what table partitioning/pruning does...

Thank you everybody for your replies anyway!