Re: help with plug-in function for additional (partition/shard) visibility checks - Mailing list pgsql-hackers
From | Hannu Krosing |
---|---|
Subject | Re: help with plug-in function for additional (partition/shard) visibility checks |
Date | |
Msg-id | 1314966960.3187.17.camel@hvost Whole thread Raw |
In response to | Re: help with plug-in function for additional (partition/shard) visibility checks (PostgreSQL - Hans-Jürgen Schönig<postgres@cybertec.at>) |
Responses |
Re: help with plug-in function for additional (partition/shard) visibility checks
|
List | pgsql-hackers |
On Fri, 2011-09-02 at 14:01 +0200, PostgreSQL - Hans-Jürgen Schönig wrote: > hello … > > i have been thinking about this issue for quite a while ... > given your idea i am not sure how this can work at all. > > consider: > begin; > insert 1 > insert 2 > commit > > assume this ends up in the same node, > now you split it into two … > 1 and 2 will have exactly the same visibility to and transaction. > i am not sure how you can get this right without looking at the data. It has to consider the data when determining visibility, that's the whole point of the plug-in . The idea is, that each row "belongs" to a certain partition, as determined by some function over it's fields. Most often this function is hash of primary key OR-ed by a bitmap representing cluster size and AND-ed by bitmap for partition(s) stored in this database. when you split the parition, then some row's don't belong in the old partition database anymore (and if you did a full copy, then the other half dont belong to the new one), so they should be handled as invisible / deleted. As this can be only done by looking at the tuple data, this needs an additional visibility function. And as this is only needed for partitioned databases, it makes sense to implement it as a plogin, so it would not wast cycles on non-partitioned databases > alternative idea: what if the proxy would add / generate a filter by > looking at the data? > a quick idea would be that once you split you add a simple directive > such as "FILTER GENERATOR $1" or so to the PL/proxy code. > it would then behind the scene arrange the filter passed on. > what do you think? Hmm. I'm not sure I understand what you are trying to say. Can you elaborate a little ? > > regards, > > hans > > > > On Sep 1, 2011, at 10:13 AM, Hannu Krosing wrote: > > > Hallow hackers > > > > I have the following problem to solve and would like to get advice on > > the best way to do it. > > > > The problem: > > > > When growing a pl/proxy based database cluster, one of the main > > operations is splitting a partition. The standard flow is as follows: > > > > 1) make a copy of the partitions table(s) to another database > > 2) reconfigure pl/proxy to use 2 partitions instead of one > > > > The easy part is making a copy of all or half of the table to another > > database. The hard part is fast deletion (i mean milliseconds, > > comparable to TRUNCATE) the data that should not be in a partition (so > > that RUN ON ALL functions will continue to return right results). > > > > It would be relatively easy, if we still had the RULES for select > > available for plain tables, but even then the eventual cleanup would > > usually mean at least 3 passes of disk writes (set xmax, write deleted > > flag, vacuum and remove) > > > > What I would like to have is possibility for additional visibility > > checks, which would run some simple C function over tuple data (usually > > hash(fieldval) + and + or ) and return visibility (is in this partition) > > as a result. It would be best if this is run at so low level that also > > vacuum would use it and can clean up the foreign partition data in one > > pass, without doing the delete dance first. > > > > So finally the QUESTION : > > > > where in code would be the best place to check for this so that > > > > 1) both regular queries and VACUUM see it > > 2) the tuple data (and not only system fields or just xmin/xmax) would > > be available for the function to use > > > > > > -- > > ------- > > Hannu Krosing > > PostgreSQL Unlimited Scalability and Performance Consultant > > 2ndQuadrant Nordic > > PG Admin Book: http://www.2ndQuadrant.com/books/ > > > > > > -- > > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > > To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-hackers > > > > -- > Cybertec Schönig & Schönig GmbH > Gröhrmühlgasse 26 > A-2700 Wiener Neustadt, Austria > Web: http://www.postgresql-support.de > >
pgsql-hackers by date: