pg_reorg in core? - Mailing list pgsql-hackers

From Michael Paquier
Subject pg_reorg in core?
Date
Msg-id CAB7nPqTGmNUFi+W6F1iwmf7J-o6sY+xxo6Yb=mkUVYT-CG-B5A@mail.gmail.com
Responses Re: pg_reorg in core?  (Josh Kupershmidt <schmiddy@gmail.com>)
Re: pg_reorg in core?  (Hitoshi Harada <umi.tanuki@gmail.com>)
List pgsql-hackers
Hi all,

During the last PGCon, I heard that some community members would be interested in having pg_reorg directly in core.
Just to recall, pg_reorg is a tool developed by NTT that allows reorganizing a table without holding an exclusive lock on it for the duration of the operation.
The technique it uses is to create a temporary copy of the table to be reorganized with a CREATE TABLE AS whose definition changes depending on whether the table is reorganized in VACUUM FULL or CLUSTER fashion.
Then it follows this mechanism:
- Create triggers that capture all the DML occurring on the table in an intermediate log table
- Create indexes on the temporary table based on what the user wishes
- Apply the logs registered during the index creation
- Swap the names of the freshly created table and the old table
- Drop the useless objects

The code is hosted on pgFoundry here: http://pgfoundry.org/projects/reorg/.
I am also maintaining a fork on GitHub in sync with pgFoundry here: https://github.com/michaelpq/pg_reorg.

So, do you guys think it is worth adding functionality like pg_reorg to core or not?

If yes, I think the code of pg_reorg is going to need some modifications to make it more consistent with the contrib modules that rely only on EXTENSION.
For the time being, pg_reorg is divided into two parts, a binary and a library.
The library part is the SQL portion of pg_reorg, containing a set of C functions that are called by the binary part. It has recently been extended to support CREATE EXTENSION.
The binary part provides a pg_reorg command in charge of calling the set of functions created by the library part; it is just a wrapper around the library part that controls the creation and deletion of the objects.
It is also in charge of deleting the temporary objects through a callback if an error occurs.

With the binary command, it is possible to reorganize a single table or a whole database; reorganizing a database simply loops over each table of that database.

My idea is to remove the binary part and to rely only on the library part, making pg_reorg a single extension with only system functions like other contrib modules.
What is missing to do that is a function that could be used as an entry point for table reorganization, something like pg_reorg_table(tableoid) and pg_reorg_table(tableoid, text).
All the functionality of pg_reorg could then be reproduced:
- pg_reorg_table(tableoid) for a VACUUM FULL reorganization
- pg_reorg_table(tableoid, NULL) for a CLUSTER reorganization if the table has a CLUSTER key
- pg_reorg_table(tableoid, columnname) for a CLUSTER reorganization based on a chosen column.

Is it worth the shot?

Regards,
-- 
Michael Paquier
http://michael.otacoo.com
