Hello hackers,
I have the following problem to solve and would like to get advice on
the best way to do it.
The problem:
When growing a pl/proxy based database cluster, one of the main
operations is splitting a partition. The standard flow is as follows:
1) make a copy of the partition's table(s) in another database
2) reconfigure pl/proxy to use two partitions instead of one
The easy part is making a copy of all or half of the table in another
database. The hard part is fast deletion (I mean milliseconds,
comparable to TRUNCATE) of the data that should no longer be in a
partition (so that RUN ON ALL functions continue to return correct
results).
It would be relatively easy if we still had RULES for SELECT
available for plain tables, but even then the eventual cleanup would
usually mean at least 3 passes of disk writes (set xmax, write the
deleted flag, vacuum and remove).
What I would like to have is the possibility of additional visibility
checks, which would run some simple C function over the tuple data
(usually hash(fieldval) plus some AND and OR operations) and return
visibility (is the tuple in this partition) as a result. It would be
best if this ran at such a low level that vacuum would also use it and
could clean up the foreign partition data in one pass, without doing
the delete dance first.
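To illustrate the kind of check I mean, here is a minimal standalone
sketch. Everything in it is an assumption made for illustration: the
names PartitionFilter and tuple_in_partition are hypothetical, the hash
is a stand-in (32-bit FNV-1a) rather than whatever the cluster's RUN ON
expression really uses, and it assumes the partition count is a power
of two so that a bit mask selects the partition. It is not existing
PostgreSQL or pl/proxy code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in hash (32-bit FNV-1a). ASSUMPTION: a real check would have to
 * use exactly the same hash as the pl/proxy routing expression.
 */
static uint32_t example_hash(const char *key, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char) key[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Hypothetical per-table filter describing which hash values belong here. */
typedef struct PartitionFilter {
    uint32_t mask;      /* partition count - 1 (count is a power of two) */
    uint32_t part_no;   /* number of this partition, 0 .. mask */
} PartitionFilter;

/*
 * Returns true if a tuple with this key value belongs to the partition,
 * i.e. the tuple should be treated as visible. A false result marks data
 * that belongs to another partition and could be reclaimed.
 */
bool tuple_in_partition(const PartitionFilter *f, const char *key, size_t len)
{
    assert((f->mask & (f->mask + 1u)) == 0);    /* mask must be 2^n - 1 */
    assert(f->part_no <= f->mask);

    return (example_hash(key, len) & f->mask) == f->part_no;
}
```

Before the split the filter would be {mask = 0, part_no = 0}, which
accepts every tuple. After the split the two copies would get
{mask = 1, part_no = 0} and {mask = 1, part_no = 1}, so each key is
accepted by exactly one of them, and changing the filter is the only
"delete" that has to happen immediately.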
So finally, the QUESTION:
Where in the code would be the best place to add such a check, so that
1) both regular queries and VACUUM see it
2) the tuple data (and not only system fields or just xmin/xmax) is
available for the function to use?
--
-------
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/