Thread: Named vs Unnamed Partitions
Hi, IMO, the lengthy discussion about Segment Exclusion and Segment Visibility Maps has long turned into a discussion about partitioning in general. I'm thankful for all the new insights it has brought me and I want to continue sharing my view on things. What's following is highly theoretical and has brainstorming characteristics. You've been warned. There are two very distinct ways to handle partitioning. For now, I'm calling them named and unnamed partitioning. Let's have a closer look at both options from a users point of view. I'm using Andrew's pseudo DDL example from the above mentioned thread: ALTER TABLE foo SET read_only='t' WHERE created_on < '2007-01-01'; Given all tuples were read-writeable before, that implicitly created two partitions. Giving them names could look like that: ALTER TABLE foo SPLIT INTO old_foos AND new_foos; AT created_on < '2007-01-01'ALTER PARTITION old_foos SET READONLY; Instead of only setting the read-only property, one could also set an alternative table space for the partition: ALTER TABLE foo SET TABLE SPACE large_but_slow_storage WHERE created_on < '2007-01-01'; vs: ALTER PARTITION old_foos SET TABLE SPACE large_but_slow_storage; Please also note, that neither variant is limited to range partitioning. You can theoretically partition by pretty much anything, for example with a WHERE clause like: ..WHERE (id % 5) < 2 The primary difference I see between these two ways to declare partitions is, that the former only modifies tuple properties (read-only, storage location), while the later also tells the database *why* it has to modify them. That has several different effects. First, newly inserted tuples are treated differently. For unnamed partitions, there must be defaults, like read-writable and a default table space. With named partitions, you define split points, so I guess one expects newly inserted tuples to end up in the right partition automatically. Unnamed partitioning could be equally automatic when letting a function decide, where to insert the new tuple. Second, repartitioning must be treated differently. With unnamed partitioning, the admin must first adjust the defaults (if required) and then move the existing tuple properties accordingly. With named partitions, the admin only needs to adjust the split point and the database system knows what it has to do. And third, but IMO most importantly: to be able to optimize queries, the database system has to know split points, so it can exclude partitions or segments from scanning. Obviously, with named partitions, it always knows them. Otherwise, you'll have to maintain some information about the tuples in your partitions, as Simon does with the min/max tuples. As soon as required, it could also maintain additional min/max values, i.e. for (id % 5) for the above example. I hope to have shown the most relevant aspects. To conclude, I'd say that named partitioning is closer to manually managed partitioning, as already known and often used. While unnamed partitioning is closer to automated partitioning, where the DBA does *not need* to have names for partitions, which is a pretty new and interesting idea to me. Regards Markus
"Markus Schiltknecht" <markus@bluegap.ch> writes: > There are two very distinct ways to handle partitioning. For now, I'm calling > them named and unnamed partitioning. I had most of a draft email written which I seem to have lost in a reboot. To a large degree I was on the same line of thought as you. The whole point of partitioning is to give the DBA a short-hand to allow him or her to describe certain properties of the data to the database. The "named" approach is to let the DBA create objects which can then have various properties attached to them. So you can create a bucket for each month or for each financial account or whatever. Then you can attach properties to the buckets such as what tablespace to store them in, or whether to treat them as read-only or offline. The naming is precisely the useful part in that it is how the DBA associates the properties with chunks of data. Without naming the DBA would have to specify the same ranges every time he wants to change the properties. He might do a "SET read_only WHERE created_on < '2000-01-01'" one day then another "SET tablespace tsslow WHERE created_on < '2000-01-01'" and then later again do "SET offline WHERE created_on < '2000-01-01'" I have to admit I always found it kludgy to have objects named invoices_2000_JAN and invoices_2000_FEB and so on. It's kind of an meta denormalization. But so is specifying where clauses repeatedly. If you don't have a first-class object which you can refer to to attach properties to, and instead are forced to redefine it repeatedly for each use then there's nothing stopping you from creating overlapping or even conflicting sets of properties. What's the database to do if you tell it something like: ALTER TABLE foo SET tablespace tsslow WHERE created_on < '2000-01-01' ALTER TABLE foo SET tablespace tsfast WHERE updated_on > '2006-01-01' Maybe you know that no record older than 2000 will be updated now but the database doesn't. As Markus describes too the behaviour *before* you've attached any particular properties to a partition is interesting too. A big benefit of partitioning is being able to load whole partitions or drop whole partitions of data which were not in any way special prior to needing to be archived. Effectively the named objects are the DBA's way of telling the database "this chunk of data here, keep it all in one place because I'll be doing something en masse to it (such as dropping it) at some later date". -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
On Wed, 2008-01-09 at 02:25 +0000, Gregory Stark wrote: > "Markus Schiltknecht" <markus@bluegap.ch> writes: > > > There are two very distinct ways to handle partitioning. For now, I'm calling > > them named and unnamed partitioning. > The naming is precisely the useful part in that it is how the DBA associates > the properties with chunks of data. Why does giving something a name help partition exclusion? > Without naming the DBA would have to specify the same ranges every time he > wants to change the properties. He might do a "SET read_only WHERE created_on > < '2000-01-01'" one day then another "SET tablespace tsslow WHERE created_on < > '2000-01-01'" and then later again do "SET offline WHERE created_on < > '2000-01-01'" > > I have to admit I always found it kludgy to have objects named > invoices_2000_JAN and invoices_2000_FEB and so on. It's kind of an meta > denormalization. But so is specifying where clauses repeatedly. The idea for using the WHERE clauses was to specifically avoid naming. In most cases the table is divided into old read only and newer data. So there is one split point that make it easy to use a simple WHERE clause. If you guys really want names, we can have names, but I think I want to see a case where the storage characteristics of the table are so complex we can only make sense of it by naming particular chunks. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs wrote: >> I have to admit I always found it kludgy to have objects named >> invoices_2000_JAN and invoices_2000_FEB and so on. It's kind of an meta >> denormalization. But so is specifying where clauses repeatedly. > > The idea for using the WHERE clauses was to specifically avoid naming. I understand, and I'm all for avoiding needless, kludgy names. As I pointed out, knowledge of split points might be important for the database system. Maybe we can store the split point without the need for names? Dunno. > If you guys really want names, we can have names, but I think I want to > see a case where the storage characteristics of the table are so complex > we can only make sense of it by naming particular chunks. Well, assuming you only have to deal with one split point, that's certainly true. However, there are people using more than two table spaces, thus obviously needing more split points. Can we name the split points, rather than the partitions? Regards Markus
"Simon Riggs" <simon@2ndquadrant.com> writes: > On Wed, 2008-01-09 at 02:25 +0000, Gregory Stark wrote: > >> Without naming the DBA would have to specify the same ranges every time he >> wants to change the properties. He might do a "SET read_only WHERE created_on >> < '2000-01-01'" one day then another "SET tablespace tsslow WHERE created_on < >> '2000-01-01'" and then later again do "SET offline WHERE created_on < >> '2000-01-01'" >> >> I have to admit I always found it kludgy to have objects named >> invoices_2000_JAN and invoices_2000_FEB and so on. It's kind of an meta >> denormalization. But so is specifying where clauses repeatedly. > > The idea for using the WHERE clauses was to specifically avoid naming. > > In most cases the table is divided into old read only and newer data. So > there is one split point that make it easy to use a simple WHERE clause. > > If you guys really want names, we can have names, but I think I want to > see a case where the storage characteristics of the table are so complex > we can only make sense of it by naming particular chunks. Perhaps a good analogy is indexes. Index names are themselves kind of redundant and people usually use names which encode up most of the information of the definition. But the reason you need names for indexes is so that you can refer to them later to drop them, rebuild them, change their properties such as tablespace, fill factor, etc? You could imagine imposing on users that they should restate the index definition every time they want to change the fill factor or tablespace but I'm sure you could see the downsides with that approach. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
On Wed, 2008-01-09 at 16:20 +0100, Markus Schiltknecht wrote: > Simon Riggs wrote: > >> I have to admit I always found it kludgy to have objects named > >> invoices_2000_JAN and invoices_2000_FEB and so on. It's kind of an meta > >> denormalization. But so is specifying where clauses repeatedly. > > > > The idea for using the WHERE clauses was to specifically avoid naming. > > I understand, and I'm all for avoiding needless, kludgy names. > > As I pointed out, knowledge of split points might be important for the > database system. Maybe we can store the split point without the need for > names? Dunno. > > > If you guys really want names, we can have names, but I think I want to > > see a case where the storage characteristics of the table are so complex > > we can only make sense of it by naming particular chunks. > > Well, assuming you only have to deal with one split point, that's > certainly true. However, there are people using more than two table > spaces, thus obviously needing more split points. > > Can we name the split points, rather than the partitions? So far, I've been looking at partition exclusion as the most important feature for the VLDB use case. You seem to have moved straight on to what I've regarded as later features for partitioning. From my side, if I can't make SE work then most of the other features seem moot, even though I personally regard them as important also. With that in mind, can I clarify what you're thinking, please? 1) the things you've been discussing are so important I should do them first, which would necessarily require named chunks of tables 2) the things you've been discussing are essential requirements of partitioning and we could never consider it complete until they are also included and we must therefore talk about them now to check that its all possible before we do anything on SE 3) doing SE first is right, I'm just thinking ahead 4) the topics aren't really linked and I'm suggesting doing development on them in parallel or... Sorry if that seems blunt, I'm just not clear where we're going. I have to think about implementability, planning and priorities if I'm to get it done. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Wed, 2008-01-09 at 15:53 +0000, Gregory Stark wrote: > "Simon Riggs" <simon@2ndquadrant.com> writes: > Perhaps a good analogy is indexes. Index names are themselves kind of > redundant and people usually use names which encode up most of the information > of the definition. > > But the reason you need names for indexes is so that you can refer to them > later to drop them, rebuild them, change their properties such as tablespace, > fill factor, etc? > > You could imagine imposing on users that they should restate the index > definition every time they want to change the fill factor or tablespace but > I'm sure you could see the downsides with that approach. Which is exactly what we do with DROP FUNCTION... You'd run these things as often as you run ALTER TABLE SET tablespace, so it doesn't seem a problem. When I delete all rows WHERE some_date < 'cut-off date' on a segment boundary value that would delete all segments that met the criteria. The following VACUUM will then return those segments to be read-write, where they can then be refilled with new incoming data. The only command we would have to run is the DELETE, everything else is automatic. If we have named chunks, then you'd have to specifically reset the boundary conditions on the named chunk after deletion before the chunk could be reused. That all becomes DDL, which means additional code to be written, bugs to be fixed, as well as the table locking required. Seems like a lot just for some occasional convenience. So not convinced of the need for named sections of tables yet. It all seems like detail, rather than actually what we want for managing large tables. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Hi, Simon Riggs wrote: > With that in mind, can I clarify what you're thinking, please? Sure, I can try to clarify: > 2) the things you've been discussing are essential requirements of > partitioning and we could never consider it complete until they are also > included and we must therefore talk about them now to check that its all > possible before we do anything on SE I thought so, but am slowly dropping that point of view. In favor of something like: hey, if you manage to do it all automatically, cool, go for it! > 3) doing SE first is right, I'm just thinking ahead Yes, SE certainly has merit. Combine it with some sort of maintained CLUSTERing order and it's worth doing, IMO. I'm not convinced about dynamic partitioning being able to generally replace explicit partitioning anytime soon. > Sorry if that seems blunt, I'm just not clear where we're going. Well, implicit or automatic partitioning is still a pretty new concept to me, but I'm slowly beginning to like it. Thank you for pointing me at it. Regards Markus
Hi, Simon Riggs wrote: > When I delete all rows WHERE some_date < 'cut-off date' on a segment > boundary value that would delete all segments that met the criteria. The > following VACUUM will then return those segments to be read-write, where > they can then be refilled with new incoming data. The only command we > would have to run is the DELETE, everything else is automatic. Agreed, that would be very nice. > So not convinced of the need for named sections of tables yet. It all > seems like detail, rather than actually what we want for managing large > tables. What do you think about letting the database system know the split point vs it having to find optimal split points automatically? Read-write vs. read-only is as good start, but can that concept be expanded to automatically choosing hash partitioning between storage systems, for example? Or more generally: can the database system gather enough information about the storage systems to take a decision as good as or better than the DBA? Regards Markus
On Wed, 2008-01-09 at 17:30 +0100, Markus Schiltknecht wrote: > Simon Riggs wrote: > > With that in mind, can I clarify what you're thinking, please? > > Sure, I can try to clarify: > > > 2) the things you've been discussing are essential requirements of > > partitioning and we could never consider it complete until they are also > > included and we must therefore talk about them now to check that its all > > possible before we do anything on SE > > I thought so, but am slowly dropping that point of view. In favor of > something like: hey, if you manage to do it all automatically, cool, go > for it! > > > 3) doing SE first is right, I'm just thinking ahead > > Yes, SE certainly has merit. Combine it with some sort of maintained > CLUSTERing order and it's worth doing, IMO. > > I'm not convinced about dynamic partitioning being able to generally > replace explicit partitioning anytime soon. In all cases, no. But do you think it would work well for the specific databases you've used partitioning on? Would it be possible to check? > > Sorry if that seems blunt, I'm just not clear where we're going. > > Well, implicit or automatic partitioning is still a pretty new concept > to me, but I'm slowly beginning to like it. Thank you for pointing me at it. OK, thanks. I'll write up what I've learned in last few days into a new version of the proposal and put it on the Wiki. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
markus@bluegap.ch (Markus Schiltknecht) writes: > Simon Riggs wrote: >> With that in mind, can I clarify what you're thinking, please? > > Sure, I can try to clarify: > >> 2) the things you've been discussing are essential requirements of >> partitioning and we could never consider it complete until they are also >> included and we must therefore talk about them now to check that its all >> possible before we do anything on SE > > I thought so, but am slowly dropping that point of view. In favor of > something like: hey, if you manage to do it all automatically, cool, > go for it! > >> 3) doing SE first is right, I'm just thinking ahead > > Yes, SE certainly has merit. Combine it with some sort of maintained > CLUSTERing order and it's worth doing, IMO. My suspicion is that if this gets added in with "maintained CLUSTER order," we *lose* all of the exclusions aside from the ones directly established by the CLUSTER order. That is, the CLUSTER ordering winds up preventing other "natural" patterns from emerging, with the result that SE winds up being of pretty limited usefulness. > I'm not convinced about dynamic partitioning being able to generally > replace explicit partitioning anytime soon. It also seems to me that explicit partitioning would make this form of dynamic partitioning less useful. Suppose there are 4 more or less uniformly used partitions; if you're splitting the data evenly across 4x the partitions, then that means that each segment will tend to have ranges ~4x as wide, which makes SE rather less of a "win." (Relax the assumption of uniform distributions, and that just changes the weights...) -- output = reverse("ofni.sesabatadxunil" "@" "enworbbc") http://cbbrowne.com/info/advocacy.html "It seems that perfection is attained not when nothing is left to add, but when nothing is left to be taken away." -- Antoine de Saint-Exupery.
On Wed, 2008-01-09 at 18:04 +0100, Markus Schiltknecht wrote: > > So not convinced of the need for named sections of tables yet. It all > > seems like detail, rather than actually what we want for managing large > > tables. > > What do you think about letting the database system know the split point > vs it having to find optimal split points automatically? For me, managing the table's files can be separate from the chunking that allows partition exclusion. Managing the table's files must be a manual operation. We can't infer the presence of a new tablespace etc.. Those files would need less than 10 zones or chunks, usually just one. The chunking to allow partition exclusion can still be automatic, allowing a much finer grain of partition. If we restrict the actions allowed to be just - mark read-only then for read-only segments (only) - migrate tablespaces - compress read-only segments - mark as off-line (not fully convinced we need this yet) then it seems straightforward to allow this to occur by a WHERE clause only, since the constraints on a segment will be fixed in place when it is read-only. This also allows those operations to take place without holding locks for any length of time, since they are all just metadata ops or copying read only data to another place. The WHERE clause approach might easily allow more than 2 chunks and they need not be logically contiguous. So the phrase split point doesn't really fit because it implies a one dimensional viewpoint, but I'm happy for you to give it a name. If we want to perform manipulations on non-read-only chunks then we need named or numbered partitions, locking, DDL etc.. That seems like too much functionality for what we really need. I really am still open on that point, but I would like to see a good description of a use case that really needs it, rather than just saying "of course we do". Which is exactly where *I* started, even as recently as 3 weeks ago now. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Hi, Le Wednesday 09 January 2008 19:27:41 Simon Riggs, vous avez écrit : > The WHERE clause approach might easily allow more than 2 chunks and they > need not be logically contiguous. So the phrase split point doesn't > really fit because it implies a one dimensional viewpoint, but I'm happy > for you to give it a name. Maybe that's only me but I'm not yet clear, after reading this thread and the previous one, whether or not Segment Exclusion would allow for multi-level partitioning. I have a use case at the moment, where we load logs-like data from several server to a central one (batch loading), the central table having an extra "server" column to identify each tuple origin. Will SE technique be able to see that this table would be better partitionned by server AND date? That's what I would have done if it was easier to do with constraint exclusion (did only date partitioning), as the reporting queries will always have some server (stats by services, each service being installed on 1 or more servers) and date restrictions. Please note I'd be happy to have this use case handled by explicitly specifying the partitioning system I want PostgreSQL to use, and more than happy if you answer me than an automatic transparent code is able to optimize the data on disk for my need without me bothering about partitions, their names and "split points"... > If we want to perform manipulations on non-read-only chunks then we need > named or numbered partitions, locking, DDL etc.. That seems like too > much functionality for what we really need. I really am still open on > that point, but I would like to see a good description of a use case > that really needs it, rather than just saying "of course we do". Which > is exactly where *I* started, even as recently as 3 weeks ago now. I like Markus ideas proposing to have SE at work inside partitions or tables, partitions being another kind of relations holding data. Then the DBA who needs to explicitly manage partitions to save faster tablespace for live data is able to tell that to the system and benefit fully from it. Regards, -- dim
On Wed, 2008-01-09 at 21:29 +0100, Dimitri Fontaine wrote: > Le Wednesday 09 January 2008 19:27:41 Simon Riggs, vous avez écrit : > > The WHERE clause approach might easily allow more than 2 chunks and they > > need not be logically contiguous. So the phrase split point doesn't > > really fit because it implies a one dimensional viewpoint, but I'm happy > > for you to give it a name. > > Maybe that's only me but I'm not yet clear, after reading this thread and the > previous one, whether or not Segment Exclusion would allow for multi-level > partitioning. > > I have a use case at the moment, where we load logs-like data from several > server to a central one (batch loading), the central table having an > extra "server" column to identify each tuple origin. Will SE technique be > able to see that this table would be better partitionned by server AND date? No, but it will be able to handle partitioning on other columns that provide a degree of differentiation that you possibly hadn't considered at design time. > That's what I would have done if it was easier to do with constraint exclusion > (did only date partitioning), as the reporting queries will always have some > server (stats by services, each service being installed on 1 or more servers) > and date restrictions. Hmm, well if you found declaring the partitions a problem with constraint exclusion it's not going to be any easier using other declarative approaches. So it will work with what you currently use. You can always use constraint exclusion to separate out the servers and then segment exclusion to handle the date range. > Please note I'd be happy to have this use case handled by explicitly > specifying the partitioning system I want PostgreSQL to use, and more than > happy if you answer me than an automatic transparent code is able to optimize > the data on disk for my need without me bothering about partitions, their > names and "split points"... > > > If we want to perform manipulations on non-read-only chunks then we need > > named or numbered partitions, locking, DDL etc.. That seems like too > > much functionality for what we really need. I really am still open on > > that point, but I would like to see a good description of a use case > > that really needs it, rather than just saying "of course we do". Which > > is exactly where *I* started, even as recently as 3 weeks ago now. > > I like Markus ideas proposing to have SE at work inside partitions or tables, > partitions being another kind of relations holding data. Then the DBA who > needs to explicitly manage partitions to save faster tablespace for live data > is able to tell that to the system and benefit fully from it. OK, thanks, -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Wed, Jan 09, 2008 at 08:51:30PM +0000, Simon Riggs wrote: > > That's what I would have done if it was easier to do with constraint exclusion > > (did only date partitioning), as the reporting queries will always have some > > server (stats by services, each service being installed on 1 or more servers) > > and date restrictions. > > Hmm, well if you found declaring the partitions a problem with > constraint exclusion it's not going to be any easier using other > declarative approaches. I disagree (although it is unreasonable for me to do so without posting syntax -- it's coming). Proper grammar for partition support means running a single DDL command. The user does not have to line up table generation with rules (or triggers) and check constraints. As such, I believe it to be much much easier. Thanks, Gavin
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wed, 9 Jan 2008 23:52:09 +0100 Gavin Sherry <swm@alcove.com.au> wrote: te restrictions. > > > > Hmm, well if you found declaring the partitions a problem with > > constraint exclusion it's not going to be any easier using other > > declarative approaches. > > I disagree (although it is unreasonable for me to do so without > posting syntax -- it's coming). Proper grammar for partition support > means running a single DDL command. The user does not have to line up > table generation with rules (or triggers) and check constraints. As > such, I believe it to be much much easier. +1 .... http://www.databasedesign-resource.com/oracle-partitions.html I am not saying I like Oracle's syntax (I don't) but: http://dev.mysql.com/tech-resources/articles/mysql_5.1_partitioning_with_dates.html and: http://www.databasejournal.com/features/mssql/article.php/3456991 Or worlds above us in usability. Sincerely, Joshua D. Drake - -- The PostgreSQL Company: Since 1997, http://www.commandprompt.com/ Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD' -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFHhVoAATb/zqfZUUQRAp9IAJ4+LQ+zHOgD1wpblH/q1OwF4+1W3QCdFaLU hlb5uRrbK7Z+oRCLMi+SNJs= =cmIs -----END PGP SIGNATURE-----
Hi, Simon Riggs wrote: > On Wed, 2008-01-09 at 18:04 +0100, Markus Schiltknecht wrote: >> What do you think about letting the database system know the split point >> vs it having to find optimal split points automatically? > > For me, managing the table's files can be separate from the chunking > that allows partition exclusion. Agreed. So in your terms, my question is: who does the chunking? What you are proposing with SE is that the database does the chunking automatically (based on spontaneous ordering). That's a novel and interesting idea to me. > Managing the table's files must be a manual operation. We can't infer > the presence of a new tablespace etc.. Sure. However, letting the database do the chunking (i.e. define split points), but leaving it up to the user to decide where to put those chunks certainly doesn't work. So, there are only two options: either we let the user choose split points manually, or else we tell the database those missing pieces of information and rely on automatisms. If I understand correctly, you are stating, that in the case of read-only vs read-writable, the database has enough information to make good decisions. > Those files would need less than > 10 zones or chunks, usually just one. Agreed. > The WHERE clause approach might easily allow more than 2 chunks and they > need not be logically contiguous. So the phrase split point doesn't > really fit because it implies a one dimensional viewpoint, but I'm happy > for you to give it a name. I consider read-only vs. read-writable to be pretty one dimensional. And the storage is logically organized in contiguous blocks. So there are split points between segments with differing read-only property, according to my definition. Regards Markus