Re: Dynamic Partitioning using Segment Visibility Maps - Mailing list pgsql-hackers

From Richard Huxton
Subject Re: Dynamic Partitioning using Segment Visibility Maps
Date
Msg-id 477E119F.1090102@archonet.com
Whole thread Raw
In response to Re: Dynamic Partitioning using Segment Visibility Maps  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Simon Riggs wrote:
> On Fri, 2008-01-04 at 10:22 +0000, Richard Huxton wrote:
>> Simon Riggs wrote:
>>> We would keep a dynamic visibility map at *segment* level, showing which
>>> segments have all rows as 100% visible. No freespace map data would be
>>> held at this level. 
>> Small dumb-user question.
>>
>> I take it you've considered some more flexible consecutive-run-of-blocks 
>> unit of flagging rather than file-segments. That obviously complicates 
>> the tracking but means you can cope with infrequent updates as well as 
>> mark most of the "most recent" segment for log-style tables.
> 
> I'm writing the code to abstract that away, so yes.
> 
> Now you mention it, it does seem straightforward to have a table storage
> parameter for partition size, which defaults to 1GB. The partition size
> is simply a number of consecutive blocks, as you say.
> 
> The smaller the partition size the greater the overhead of managing it.

Oh, obviously, but with smaller partition sizes this also becomes useful 
for low-end systems as well as high-end ones. Skipping 80% of a seq-scan 
on a date-range query is a win for even small (by your standards) 
tables. I shouldn't be surprised if the sensible-number-of-partitions 
remained more-or-less constant as you scaled the hardware, but the 
partition size grew.

> Also I've been looking at read-only tables and compression, as you may
> know. My idea was that in the future we could mark segments as either
> - read-only 
> - compressed
> - able to be shipped off to hierarchical storage
> 
> Those ideas work best if the partitioning is based around the physical
> file sizes we use for segments.

I can see why you've chosen file segments. It certainly makes things easier.

Hmm - thinking about the date-range scenario above, it occurs to me that   for seq-scan purposes the correct partition
sizedepends upon the 
 
data value you are interested in. What I want to know is what blocks Jan 
07 covers (or rather what blocks it doesn't) rather than knowing blocks 
1-9999999 cover 2005-04-12 to 2007-10-13. Of course that means that 
you'd eventually want different partition sizes tracking visibility for 
different columns (e.g. id, timestamp).

I suspect the same would be true for read-only/compressed/archived 
flags, but I can see how they are tightly linked to physical files 
(particularly the last two).

--   Richard Huxton  Archonet Ltd


pgsql-hackers by date:

Previous
From: Glyn Astill
Date:
Subject: Problem with PgTcl auditing function on trigger
Next
From: Alvaro Herrera
Date:
Subject: Re: timestamp typedefs