Thread: GSoC proposal - "make an unlogged table logged"
Hi all,
Is the TODO item "make an unlogged table logged" [1] a good GSoC project?
Regards,
[1] http://www.postgresql.org/message-id/AANLkTinENZbRxdCwOHkqBba2BHUbfY8_C5JwRXLodxcX@mail.gmail.com
--
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote: > Is the TODO item "make an unlogged table logged" [1] a good GSoC project? I'm pretty sure we found some problems in that design that we couldn't figure out how to solve. I don't have a pointer to the relevant -hackers discussion off-hand, but I think there was one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote: > On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello > <fabriziomello@gmail.com> wrote: > > Is the TODO item "make an unlogged table logged" [1] a good GSoC project? > > I'm pretty sure we found some problems in that design that we couldn't > figure out how to solve. I don't have a pointer to the relevant > -hackers discussion off-hand, but I think there was one. ISTR the discussion going something along the lines of "we'd have to WAL log the entire table to do that, and if we have to do that, what's the point?". Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Robert Haas (robertmhaas@gmail.com) wrote: >> On Mon, Mar 3, 2014 at 11:28 AM, Fabr�zio de Royes Mello >> <fabriziomello@gmail.com> wrote: >>> Is the TODO item "make an unlogged table logged" [1] a good GSoC project? >> I'm pretty sure we found some problems in that design that we couldn't >> figure out how to solve. I don't have a pointer to the relevant >> -hackers discussion off-hand, but I think there was one. > ISTR the discussion going something along the lines of "we'd have to WAL > log the entire table to do that, and if we have to do that, what's the > point?". IIRC, the reason you'd have to do that is to make the table contents appear on slave servers. If you don't consider replication then it might seem easier. regards, tom lane
On 03/03/2014 05:22 PM, Tom Lane wrote: > Stephen Frost <sfrost@snowman.net> writes: ... >> ISTR the discussion going something along the lines of "we'd have to WAL >> log the entire table to do that, and if we have to do that, what's the >> point?". > IIRC, the reason you'd have to do that is to make the table contents > appear on slave servers. If you don't consider replication then it might > seem easier. So switch on logging and then perform CLUSTER/VACUUM FULL ? Should this work, or is something extra needed ? Cheers -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
On 2014-03-03 12:08:26 -0500, Stephen Frost wrote: > * Robert Haas (robertmhaas@gmail.com) wrote: > > On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello > > <fabriziomello@gmail.com> wrote: > > > Is the TODO item "make an unlogged table logged" [1] a good GSoC project? > > > > I'm pretty sure we found some problems in that design that we couldn't > > figure out how to solve. I don't have a pointer to the relevant > > -hackers discussion off-hand, but I think there was one. > > ISTR the discussion going something along the lines of "we'd have to WAL > log the entire table to do that, and if we have to do that, what's the > point?". I don't see that as a particularly problematic problem. The primary reason to want to convert a unlogged to a logged table probably is that it's easier to do so than to recreate the table + dependencies. Also the overhead of logging full pages will be noticeably smaller than the overhead of adding all rows individually, even if using heap_multi_insert(). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote: > Is the TODO item "make an unlogged table logged" [1] a good GSoC project? Another interesting project around unlogged tables would be to make it possible to have unlogged indexes on fully-logged tables. That is something that there was some discussion of before, that might be easier. FWIW, I don't think that TODO page is a very good resource for finding a starter project. Picking a good project is a skill in and of itself. A lot of that stuff is aspirational, either because it's difficult, or, more often, because it's difficult relative to the rewards, which can be quite low. To be honest, if I have what I imagine to be a great idea for a project, I don't put it on that page. Maybe I should, but I don't, and I don't think that is uncommon. This is not because I'm particularly guarded about sharing the information. Why do you think that hash indexes still aren't WAL-logged after all these years (a project that someone made noise about recently in relation to GSoC), even though that's generally considered to be a SMOP? -- Peter Geoghegan
On 2014-03-03 12:44:26 -0800, Peter Geoghegan wrote: > On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello > <fabriziomello@gmail.com> wrote: > > Is the TODO item "make an unlogged table logged" [1] a good GSoC project? > > Another interesting project around unlogged tables would be to make it > possible to have unlogged indexes on fully-logged tables. That is > something that there was some discussion of before, that might be > easier. I'd actually say it's harder because it requires modifying the catalog or transparently introducing hacks similar to what unlogged matviews are doing, to make sure the index is marked invalid after a crash restart. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Mar 3, 2014 at 2:40 PM, Hannu Krosing <hannu@2ndquadrant.com> wrote:
>
> On 03/03/2014 05:22 PM, Tom Lane wrote:
> > Stephen Frost <sfrost@snowman.net> writes:
> ...
> >> ISTR the discussion going something along the lines of "we'd have to WAL
> >> log the entire table to do that, and if we have to do that, what's the
> >> point?".
> > IIRC, the reason you'd have to do that is to make the table contents
> > appear on slave servers. If you don't consider replication then it might
> > seem easier.
> So switch on logging and then perform CLUSTER/VACUUM FULL ?
>
> Should this work, or is something extra needed ?
>
6) drop table foo; -- the old foo table
7) alter table tmp_foo rename to foo;
8) end transactionI know it's very ugly, but works... and works for standbys too... :-)
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Mon, Mar 3, 2014 at 5:44 PM, Peter Geoghegan <pg@heroku.com> wrote:
>
> On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello
> <fabriziomello@gmail.com> wrote:
> > Is the TODO item "make an unlogged table logged" [1] a good GSoC project?
>
> Another interesting project around unlogged tables would be to make it
> possible to have unlogged indexes on fully-logged tables. That is
> something that there was some discussion of before, that might be
> easier.
>
> FWIW, I don't think that TODO page is a very good resource for finding
> a starter project. Picking a good project is a skill in and of itself.
> A lot of that stuff is aspirational, either because it's difficult,
> or, more often, because it's difficult relative to the rewards, which
> can be quite low. To be honest, if I have what I imagine to be a great
> idea for a project, I don't put it on that page. Maybe I should, but I
> don't, and I don't think that is uncommon. This is not because I'm
> particularly guarded about sharing the information.
>
Share your ideas, please!
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
<div dir="ltr"><div class="gmail_extra">On Mon, Mar 3, 2014 at 5:47 PM, Andres Freund <<a href="mailto:andres@2ndquadrant.com">andres@2ndquadrant.com</a>>wrote:<br />><br />> On 2014-03-03 12:44:26 -0800,Peter Geoghegan wrote:<br /> > > On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello<br />> > <<ahref="mailto:fabriziomello@gmail.com">fabriziomello@gmail.com</a>> wrote:<br />> > > Is the TODO item "makean unlogged table logged" [1] a good GSoC project?<br /> > ><br />> > Another interesting project aroundunlogged tables would be to make it<br />> > possible to have unlogged indexes on fully-logged tables. That is<br/>> > something that there was some discussion of before, that might be<br /> > > easier.<br />><br />>I'd actually say it's harder because it requires modifying the catalog<br />> or transparently introducing hackssimilar to what unlogged matviews are<br />> doing, to make sure the index is marked invalid after a crash restart.<br/> ><br /><br />Unlogged matviews was disallowed [1].<br /><br />[1] <a href="http://www.postgresql.org/message-id/E1UZNrD-0008Eh-83@gemulon.postgresql.org">http://www.postgresql.org/message-id/E1UZNrD-0008Eh-83@gemulon.postgresql.org</a><br /><br/>--<br />Fabrízio de Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/> >> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
<div dir="ltr"><div class="gmail_extra"><br />On Mon, Mar 3, 2014 at 2:42 PM, Andres Freund <<a href="mailto:andres@2ndquadrant.com">andres@2ndquadrant.com</a>>wrote:<br />><br />> On 2014-03-03 12:08:26 -0500,Stephen Frost wrote:<br /> > > * Robert Haas (<a href="mailto:robertmhaas@gmail.com">robertmhaas@gmail.com</a>)wrote:<br />> > > On Mon, Mar 3, 2014 at 11:28 AM,Fabrízio de Royes Mello<br />> > > <<a href="mailto:fabriziomello@gmail.com">fabriziomello@gmail.com</a>>wrote:<br /> > > > > Is the TODO item "makean unlogged table logged" [1] a good GSoC project?<br />> > ><br />> > > I'm pretty sure we foundsome problems in that design that we couldn't<br /> > > > figure out how to solve. I don't have a pointerto the relevant<br />> > > -hackers discussion off-hand, but I think there was one.<br />> ><br />>> ISTR the discussion going something along the lines of "we'd have to WAL<br /> > > log the entire tableto do that, and if we have to do that, what's the<br />> > point?".<br />><br />> I don't see that as aparticularly problematic problem. The primary<br />> reason to want to convert a unlogged to a logged table probablyis that<br /> > it's easier to do so than to recreate the table + dependencies. Also the<br />> overhead oflogging full pages will be noticeably smaller than the<br />> overhead of adding all rows individually, even if using<br/> > heap_multi_insert().<br />><br /><br /></div><div class="gmail_extra">Do you know some similar in thesource code?<br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">Regards,<br /></div><div class="gmail_extra"><br/>--<br />Fabrízio de Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/> >> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
On 2014-03-04 01:10:50 -0300, Fabrízio de Royes Mello wrote: > Today I do something like that: > > 1) create unlogged table tmp_foo ... > 2) populate 'tmp_foo' table (ETL scripts or whatever) > 3) start transaction > 4) lock table tmp_foo in access exclusive mode > 5) update pg_class set relpersistence = 'p' where oid = 'tmp_foo':regclass > 6) drop table foo; -- the old foo table > 7) alter table tmp_foo rename to foo; > 8) end transaction > 9) run pg_repack in table 'foo' > > I know it's very ugly, but works... and works for standbys too... :-) No, it doesn't work. It just may happen to not fail loudly/visibly in some cases. You're absolutely risking corruption of this *and* other relations when doing so. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost <sfrost@snowman.net> wrote: > * Robert Haas (robertmhaas@gmail.com) wrote: >> On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello >> <fabriziomello@gmail.com> wrote: >> > Is the TODO item "make an unlogged table logged" [1] a good GSoC project? >> >> I'm pretty sure we found some problems in that design that we couldn't >> figure out how to solve. I don't have a pointer to the relevant >> -hackers discussion off-hand, but I think there was one. > > ISTR the discussion going something along the lines of "we'd have to WAL > log the entire table to do that, and if we have to do that, what's the > point?". No, not really. The issue is more around what happens if we crash part way through. At crash recovery time, the system catalogs are not available, because the database isn't consistent yet and, anyway, the startup process can't be bound to a database, let alone every database that might contain unlogged tables. So the sentinel that's used to decide whether to flush the contents of a table or index is the presence or absence of an _init fork, which the startup process obviously can see just fine. The _init fork also tells us what to stick in the relation when we reset it; for a table, we can just reset to an empty file, but that's not legal for indexes, so the _init fork contains a pre-initialized empty index that we can just copy over. Now, to make an unlogged table logged, you've got to at some stage remove those _init forks. But this is not a transactional operation. If you remove the _init forks and then the transaction rolls back, you've left the system an inconsistent state. If you postpone the removal until commit time, then you have a problem if it fails, particularly if it works for the first file but fails for the second. And if you crash at any point before you've fsync'd the containing directory, you have no idea which files will still be on disk after a hard reboot. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-03-04 09:47:08 -0500, Robert Haas wrote: > On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost <sfrost@snowman.net> wrote: > > * Robert Haas (robertmhaas@gmail.com) wrote: > >> On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello > >> <fabriziomello@gmail.com> wrote: > >> > Is the TODO item "make an unlogged table logged" [1] a good GSoC project? > >> > >> I'm pretty sure we found some problems in that design that we couldn't > >> figure out how to solve. I don't have a pointer to the relevant > >> -hackers discussion off-hand, but I think there was one. > > > > ISTR the discussion going something along the lines of "we'd have to WAL > > log the entire table to do that, and if we have to do that, what's the > > point?". > > No, not really. The issue is more around what happens if we crash > part way through. At crash recovery time, the system catalogs are not > available, because the database isn't consistent yet and, anyway, the > startup process can't be bound to a database, let alone every database > that might contain unlogged tables. So the sentinel that's used to > decide whether to flush the contents of a table or index is the > presence or absence of an _init fork, which the startup process > obviously can see just fine. The _init fork also tells us what to > stick in the relation when we reset it; for a table, we can just reset > to an empty file, but that's not legal for indexes, so the _init fork > contains a pre-initialized empty index that we can just copy over. > > Now, to make an unlogged table logged, you've got to at some stage > remove those _init forks. But this is not a transactional operation. > If you remove the _init forks and then the transaction rolls back, > you've left the system an inconsistent state. If you postpone the > removal until commit time, then you have a problem if it fails, > particularly if it works for the first file but fails for the second. > And if you crash at any point before you've fsync'd the containing > directory, you have no idea which files will still be on disk after a > hard reboot. Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
<div dir="ltr"><div class="gmail_extra"><br />On Tue, Mar 4, 2014 at 3:31 AM, Andres Freund <<a href="mailto:andres@2ndquadrant.com">andres@2ndquadrant.com</a>>wrote:<br />><br />> On 2014-03-04 01:10:50 -0300,Fabrízio de Royes Mello wrote:<br /> > > Today I do something like that:<br />> ><br />> > 1) createunlogged table tmp_foo ...<br />> > 2) populate 'tmp_foo' table (ETL scripts or whatever)<br />> > 3) starttransaction<br />> > 4) lock table tmp_foo in access exclusive mode<br /> > > 5) update pg_class set relpersistence= 'p' where oid = 'tmp_foo':regclass<br />> > 6) drop table foo; -- the old foo table<br />> >7) alter table tmp_foo rename to foo;<br />> > 8) end transaction<br /> > > 9) run pg_repack in table 'foo'<br/>> ><br />> > I know it's very ugly, but works... and works for standbys too... :-)<br />><br />>No, it doesn't work. It just may happen to not fail loudly/visibly in<br /> > some cases. You're absolutely riskingcorruption of this *and* other<br />> relations when doing so.<br />><br /><br /></div><div class="gmail_extra">Wellthis already works for some time, but you are correct, exists the risk of corruption!<br /><br />Butin my case if all run without any interrupt the relation is switched to logged. I do some checks before and after, andif something happens with this process we cleanup everything and start from the beginning.<br /></div><div class="gmail_extra"><br/></div><div class="gmail_extra">Maybe I must run CLUSTER inside the transaction block after updatepg_class and execute DROP and RENAME after, in a second phase. Maybe this way is more secure. Is it?<br /><br />Ifsome crash occurs and PostgreSQL restart I check if the unlogged table 'tmp_foo' exists and then I drop it.<br /></div><divclass="gmail_extra"><br /></div><div class="gmail_extra">Regards,<br /><br /></div><div class="gmail_extra">--<br />Fabrízio de Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/> >> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
On Tue, Mar 4, 2014 at 11:50 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>
> On 2014-03-04 09:47:08 -0500, Robert Haas wrote:
> > On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > > * Robert Haas (robertmhaas@gmail.com) wrote:
> > >> On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
> > >> <fabriziomello@gmail.com> wrote:
> > >> > Is the TODO item "make an unlogged table logged" [1] a good GSoC project?
> > >>
> > >> I'm pretty sure we found some problems in that design that we couldn't
> > >> figure out how to solve. I don't have a pointer to the relevant
> > >> -hackers discussion off-hand, but I think there was one.
> > >
> > > ISTR the discussion going something along the lines of "we'd have to WAL
> > > log the entire table to do that, and if we have to do that, what's the
> > > point?".
> >
> > No, not really. The issue is more around what happens if we crash
> > part way through. At crash recovery time, the system catalogs are not
> > available, because the database isn't consistent yet and, anyway, the
> > startup process can't be bound to a database, let alone every database
> > that might contain unlogged tables. So the sentinel that's used to
> > decide whether to flush the contents of a table or index is the
> > presence or absence of an _init fork, which the startup process
> > obviously can see just fine. The _init fork also tells us what to
> > stick in the relation when we reset it; for a table, we can just reset
> > to an empty file, but that's not legal for indexes, so the _init fork
> > contains a pre-initialized empty index that we can just copy over.
> >
> > Now, to make an unlogged table logged, you've got to at some stage
> > remove those _init forks. But this is not a transactional operation.
> > If you remove the _init forks and then the transaction rolls back,
> > you've left the system an inconsistent state. If you postpone the
> > removal until commit time, then you have a problem if it fails,
> > particularly if it works for the first file but fails for the second.
> > And if you crash at any point before you've fsync'd the containing
> > directory, you have no idea which files will still be on disk after a
> > hard reboot.
>
> Can't that be solved by just creating the permanent relation in a new
> relfilenode? That's equivalent to a rewrite, yes, but we need to do that
> for anything but wal_level=minimal anyway.
>
Regards,
[1] http://www.postgresql.org/message-id/263033.9223.qm@web29013.mail.ird.yahoo.com
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-03-04 09:47:08 -0500, Robert Haas wrote: >> On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost <sfrost@snowman.net> wrote: >> > * Robert Haas (robertmhaas@gmail.com) wrote: >> >> On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello >> >> <fabriziomello@gmail.com> wrote: >> >> > Is the TODO item "make an unlogged table logged" [1] a good GSoC project? >> >> >> >> I'm pretty sure we found some problems in that design that we couldn't >> >> figure out how to solve. I don't have a pointer to the relevant >> >> -hackers discussion off-hand, but I think there was one. >> > >> > ISTR the discussion going something along the lines of "we'd have to WAL >> > log the entire table to do that, and if we have to do that, what's the >> > point?". >> >> No, not really. The issue is more around what happens if we crash >> part way through. At crash recovery time, the system catalogs are not >> available, because the database isn't consistent yet and, anyway, the >> startup process can't be bound to a database, let alone every database >> that might contain unlogged tables. So the sentinel that's used to >> decide whether to flush the contents of a table or index is the >> presence or absence of an _init fork, which the startup process >> obviously can see just fine. The _init fork also tells us what to >> stick in the relation when we reset it; for a table, we can just reset >> to an empty file, but that's not legal for indexes, so the _init fork >> contains a pre-initialized empty index that we can just copy over. >> >> Now, to make an unlogged table logged, you've got to at some stage >> remove those _init forks. But this is not a transactional operation. >> If you remove the _init forks and then the transaction rolls back, >> you've left the system an inconsistent state. If you postpone the >> removal until commit time, then you have a problem if it fails, >> particularly if it works for the first file but fails for the second. >> And if you crash at any point before you've fsync'd the containing >> directory, you have no idea which files will still be on disk after a >> hard reboot. > > Can't that be solved by just creating the permanent relation in a new > relfilenode? That's equivalent to a rewrite, yes, but we need to do that > for anything but wal_level=minimal anyway. Yes, that would work. I've tended to view optimizing away the relfilenode copy as an indispensable part of this work, but that might be wrongheaded. It would certainly be a lot easier to make this happen if we didn't insist on that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-03-04 12:54:02 -0500, Robert Haas wrote: > On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-03-04 09:47:08 -0500, Robert Haas wrote: > > Can't that be solved by just creating the permanent relation in a new > > relfilenode? That's equivalent to a rewrite, yes, but we need to do that > > for anything but wal_level=minimal anyway. > > Yes, that would work. I've tended to view optimizing away the > relfilenode copy as an indispensable part of this work, but that might > be wrongheaded. It would certainly be a lot easier to make this > happen if we didn't insist on that. I think it'd already much better than today's situation, and it's a required codepath for wal_level > logical anyway. So even if somebody wants to make this work without the full copy for minimal, it'd still be a required codepath. So I am perfectly ok with a patch just adding that. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
<div dir="ltr"><div class="gmail_extra"><br />On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund <<a href="mailto:andres@2ndquadrant.com">andres@2ndquadrant.com</a>>wrote:<br />><br />> On 2014-03-04 12:54:02 -0500,Robert Haas wrote:<br /> > > On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund <<a href="mailto:andres@2ndquadrant.com">andres@2ndquadrant.com</a>>wrote:<br />> > > On 2014-03-04 09:47:08 -0500,Robert Haas wrote:<br />> > > Can't that be solved by just creating the permanent relation in a new<br />> > > relfilenode? That's equivalent to a rewrite, yes, but we need to do that<br />> > > for anythingbut wal_level=minimal anyway.<br />> ><br />> > Yes, that would work. I've tended to view optimizingaway the<br /> > > relfilenode copy as an indispensable part of this work, but that might<br />> >be wrongheaded. It would certainly be a lot easier to make this<br />> > happen if we didn't insist on that.<br/>><br /> > I think it'd already much better than today's situation, and it's a<br />> required codepathfor wal_level > logical anyway. So even if somebody<br />> wants to make this work without the full copy forminimal, it'd still be<br /> > a required codepath. So I am perfectly ok with a patch just adding that.<br />><br/><br /></div><div class="gmail_extra">Then is this a good idea for a GSoC project ?<br /></div><div class="gmail_extra"><br/>I don't know very well this internals, but I am willing to learn and I think the GSoC is a goodopportunity.<br /><br /></div><div class="gmail_extra">Any of you are willing to mentoring this project?<br /></div><divclass="gmail_extra"><br /></div><div class="gmail_extra">Regards,<br /><br /></div><div class="gmail_extra">--<br/>Fabrízio de Royes Mello<br /> Consultoria/Coaching PostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/> >> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
On Tue, Mar 4, 2014 at 5:00 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
>
>
> On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >
> > On 2014-03-04 12:54:02 -0500, Robert Haas wrote:
> > > On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > > > On 2014-03-04 09:47:08 -0500, Robert Haas wrote:
> > > > Can't that be solved by just creating the permanent relation in a new
> > > > relfilenode? That's equivalent to a rewrite, yes, but we need to do that
> > > > for anything but wal_level=minimal anyway.
> > >
> > > Yes, that would work. I've tended to view optimizing away the
> > > relfilenode copy as an indispensable part of this work, but that might
> > > be wrongheaded. It would certainly be a lot easier to make this
> > > happen if we didn't insist on that.
> >
> > I think it'd already much better than today's situation, and it's a
> > required codepath for wal_level > logical anyway. So even if somebody
> > wants to make this work without the full copy for minimal, it'd still be
> > a required codepath. So I am perfectly ok with a patch just adding that.
> >
>
> Then is this a good idea for a GSoC project ?
>
> I don't know very well this internals, but I am willing to learn and I think the GSoC is a good opportunity.
>
> Any of you are willing to mentoring this project?
>
Hi all,
I written the proposal to this feature, so I would like to know if someone can review.
<proposal>
** Add to PostgreSQL the capacity to making an “Unlogged” table “Logged” **
Introduction
This project will allow to change an “unlogged” table (that doesn’t create transaction logs - WAL files) and it’s dependencies to a “logged” table, in other words, a regular table that create WAL files. To make this happen we'll introduce a new SQL syntax:
ALTER TABLE name SET LOGGED;
Benefits to the PostgreSQL Community
The “unlogged” tables feature was introduced by 9.1 version, and provide better write performance than regular tables (logged), but are not crash-safe. Their contents are automatically discarded (cleared) in a case of a server crash, and their contents do not propagate to replication slaves, either.
With the capacity of turning an “unlogged” table in a logged table will allow us have the better of two features, in other words, we can use an "unlogged" table to run a bulk load a thousands of lines (ETL scripts) and get better performance, and then change it to a "logged" table to get durability of loaded data.
Deliverables
This project will be splitted into 2 (two) deliverables:
1) Allow change an “unlogged” table to “logged” when “wal_level = minimal” (without propagate their contents to replication slaves)
2) Allow change an “unlogged” table to “logged” when “wal_level != minimal” (propagating their contents to replication slaves)
Project Schedule
until May 19:
* create a website to the project (wiki.postgresql.org)
* create a public repository to the project (github.com/fabriziomello)
* read what has already been discussed by the community about the project (http://wiki.postgresql.org/wiki/Todo)
* discuss with the community the best "design" to the feature
* learn about some PostgreSQL internals:
. physical storage for relations (src/backend/catalog/storage.c)
. transaction system (src/backend/access/transam/xact.c)
. two-phase commit (src/backend/access/transam/twophase.c)
. table commands (src/backend/commands/tablecmds.c)
. grammar (src/backend/parser/gram.y)
May 19 - June 23
* evaluate with the mentor and community if is a good start point use the already sent patch (http://www.postgresql.org/message-id/263033.9223.qm@web29013.mail.ird.yahoo.com)
* implementation of the first deliverable:
. change the grammar of PostgreSQL to support “ALTER TABLE … SET LOGGED”
. implement the routines to change an "unlogged" table to "logged" when "wal_level = minimal"
* write documentation and the test cases
* submit this first deliverable to the commitfest 2014/06 (https://commitfest.postgresql.org/action/commitfest_view?id=22)
June 23 - June 27
* review with the Mentor of the work done until now
June 27 - August 18
* implementation of the second deliverable (wal_level != minimal)
* write documentation and the test cases
* submit this second deliverable to the commitfest 2014/09 (webpage don’t created yet)
August 18 - August 22
* final review with the Mentor of all work done.
About the proponent
Fabrízio de Royes Mello
e-mail: fabriziomello@gmail.com
twitter: @fabriziomello
github: http://github.com/fabriziomello
linkedin: http://linkedin.com/in/fabriziomello
Currently I help people and teams to take the full potential of relational databases, especially PostgreSQL, helping teams to design the structure of the database (modeling), build physical architecture (database schema), programming (procedural languages), SQL (usage, tuning, best practices), optimization and orchestration of instances in production too. I perform a volunteer work for Brazilian Community of PostgreSQL (www.postgresql.org.br), supporting mailing lists, organizing events (pgbr.postgresql.org.br) and some admin tasks. And also I help a little the PostgreSQL Global Development Group (PGDG) in the implementation of some features and review of patches (git.postgresql.org).
Introduction
This project will allow to change an “unlogged” table (that doesn’t create transaction logs - WAL files) and it’s dependencies to a “logged” table, in other words, a regular table that create WAL files. To make this happen we'll introduce a new SQL syntax:
ALTER TABLE name SET LOGGED;
Benefits to the PostgreSQL Community
The “unlogged” tables feature was introduced by 9.1 version, and provide better write performance than regular tables (logged), but are not crash-safe. Their contents are automatically discarded (cleared) in a case of a server crash, and their contents do not propagate to replication slaves, either.
With the capacity of turning an “unlogged” table in a logged table will allow us have the better of two features, in other words, we can use an "unlogged" table to run a bulk load a thousands of lines (ETL scripts) and get better performance, and then change it to a "logged" table to get durability of loaded data.
Deliverables
This project will be splitted into 2 (two) deliverables:
1) Allow change an “unlogged” table to “logged” when “wal_level = minimal” (without propagate their contents to replication slaves)
2) Allow change an “unlogged” table to “logged” when “wal_level != minimal” (propagating their contents to replication slaves)
Project Schedule
until May 19:
* create a website to the project (wiki.postgresql.org)
* create a public repository to the project (github.com/fabriziomello)
* read what has already been discussed by the community about the project (http://wiki.postgresql.org/wiki/Todo)
* discuss with the community the best "design" to the feature
* learn about some PostgreSQL internals:
. physical storage for relations (src/backend/catalog/storage.c)
. transaction system (src/backend/access/transam/xact.c)
. two-phase commit (src/backend/access/transam/twophase.c)
. table commands (src/backend/commands/tablecmds.c)
. grammar (src/backend/parser/gram.y)
May 19 - June 23
* evaluate with the mentor and community if is a good start point use the already sent patch (http://www.postgresql.org/message-id/263033.9223.qm@web29013.mail.ird.yahoo.com)
* implementation of the first deliverable:
. change the grammar of PostgreSQL to support “ALTER TABLE … SET LOGGED”
. implement the routines to change an "unlogged" table to "logged" when "wal_level = minimal"
* write documentation and the test cases
* submit this first deliverable to the commitfest 2014/06 (https://commitfest.postgresql.org/action/commitfest_view?id=22)
June 23 - June 27
* review with the Mentor of the work done until now
June 27 - August 18
* implementation of the second deliverable (wal_level != minimal)
* write documentation and the test cases
* submit this second deliverable to the commitfest 2014/09 (webpage don’t created yet)
August 18 - August 22
* final review with the Mentor of all work done.
About the proponent
Fabrízio de Royes Mello
e-mail: fabriziomello@gmail.com
twitter: @fabriziomello
github: http://github.com/fabriziomello
linkedin: http://linkedin.com/in/fabriziomello
Currently I help people and teams to take the full potential of relational databases, especially PostgreSQL, helping teams to design the structure of the database (modeling), build physical architecture (database schema), programming (procedural languages), SQL (usage, tuning, best practices), optimization and orchestration of instances in production too. I perform a volunteer work for Brazilian Community of PostgreSQL (www.postgresql.org.br), supporting mailing lists, organizing events (pgbr.postgresql.org.br) and some admin tasks. And also I help a little the PostgreSQL Global Development Group (PGDG) in the implementation of some features and review of patches (git.postgresql.org).
</proposal>
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Wed, Mar 5, 2014 at 7:42 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote: > On Tue, Mar 4, 2014 at 5:00 PM, Fabrízio de Royes Mello > <fabriziomello@gmail.com> wrote: >> On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund <andres@2ndquadrant.com> >> wrote: >> > >> > On 2014-03-04 12:54:02 -0500, Robert Haas wrote: >> > > On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund <andres@2ndquadrant.com> >> > > wrote: >> > > > On 2014-03-04 09:47:08 -0500, Robert Haas wrote: >> > > > Can't that be solved by just creating the permanent relation in a >> > > > new >> > > > relfilenode? That's equivalent to a rewrite, yes, but we need to do >> > > > that >> > > > for anything but wal_level=minimal anyway. >> > > >> > > Yes, that would work. I've tended to view optimizing away the >> > > relfilenode copy as an indispensable part of this work, but that might >> > > be wrongheaded. It would certainly be a lot easier to make this >> > > happen if we didn't insist on that. >> > >> > I think it'd already much better than today's situation, and it's a >> > required codepath for wal_level > logical anyway. So even if somebody >> > wants to make this work without the full copy for minimal, it'd still be >> > a required codepath. So I am perfectly ok with a patch just adding that. >> > >> >> Then is this a good idea for a GSoC project ? >> >> I don't know very well this internals, but I am willing to learn and I >> think the GSoC is a good opportunity. >> >> Any of you are willing to mentoring this project? >> > > I written the proposal to this feature, so I would like to know if someone > can review. I think this isn't a good design. Per the discussion between Andres and I, I think that I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 6 March 2014 19:42, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Mar 5, 2014 at 7:42 PM, Fabrízio de Royes Mello<fabriziomello@gmail.com> wrote:
> On Tue, Mar 4, 2014 at 5:00 PM, Fabrízio de Royes Mello
> <fabriziomello@gmail.com> wrote:
>> On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund <andres@2ndquadrant.com>
>> wrote:
>> >
>> > On 2014-03-04 12:54:02 -0500, Robert Haas wrote:
>> > > On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund <andres@2ndquadrant.com>
>> > > wrote:
>> > > > On 2014-03-04 09:47:08 -0500, Robert Haas wrote:
>> > > > Can't that be solved by just creating the permanent relation in a
>> > > > new
>> > > > relfilenode? That's equivalent to a rewrite, yes, but we need to do
>> > > > that
>> > > > for anything but wal_level=minimal anyway.
>> > >
>> > > Yes, that would work. I've tended to view optimizing away the
>> > > relfilenode copy as an indispensable part of this work, but that might
>> > > be wrongheaded. It would certainly be a lot easier to make this
>> > > happen if we didn't insist on that.
>> >
>> > I think it'd already much better than today's situation, and it's a
>> > required codepath for wal_level > logical anyway. So even if somebody
>> > wants to make this work without the full copy for minimal, it'd still be
>> > a required codepath. So I am perfectly ok with a patch just adding that.
>> >
>>
>> Then is this a good idea for a GSoC project ?
>>
>> I don't know very well this internals, but I am willing to learn and I
>> think the GSoC is a good opportunity.
>>
>> Any of you are willing to mentoring this project?
>>
>> I written the proposal to this feature, so I would like to know if someoneI think this isn't a good design. Per the discussion between Andres
> can review.
and I, I think that I think you should do is make ALTER TABLE .. SET
LOGGED work just like VACUUM FULL, with the exception that it will set
a different relpersistence for the new relfilenode. If you do it that
way, this will be less efficient, but much simpler, and you might
actually finish it in one summer.
Sounds like a plan. Would there be any stretch-goals for this work, or is there not really anything else that could be done?
Thom
<div dir="ltr"><br />On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas <<a href="mailto:robertmhaas@gmail.com">robertmhaas@gmail.com</a>>wrote:<br />><br />><br />> I think this isn'ta good design. Per the discussion between Andres<br /> > and I, I think that I think you should do is make ALTERTABLE .. SET<br />> LOGGED work just like VACUUM FULL, with the exception that it will set<br />> a differentrelpersistence for the new relfilenode. If you do it that<br /> > way, this will be less efficient, but muchsimpler, and you might<br />> actually finish it in one summer.<br />><br /><br />Do it like 'VACUUM FULL' forany wal_level?<br /><br />Regards,<br /><br />--<br />Fabrízio de Royes Mello<br /> Consultoria/Coaching PostgreSQL<br/>>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br />>> Blog sobreTI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br /> >> Perfil Linkedin:<a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br />>> Twitter:<a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div>
On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote: > On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> I think this isn't a good design. Per the discussion between Andres >> and I, I think that I think you should do is make ALTER TABLE .. SET >> LOGGED work just like VACUUM FULL, with the exception that it will set >> a different relpersistence for the new relfilenode. If you do it that >> way, this will be less efficient, but much simpler, and you might >> actually finish it in one summer. >> > > Do it like 'VACUUM FULL' for any wal_level? Yep. Anything else appears to be a research problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Mar 6, 2014 at 5:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello
> <fabriziomello@gmail.com> wrote:
> > On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> I think this isn't a good design. Per the discussion between Andres
> >> and I, I think that I think you should do is make ALTER TABLE .. SET
> >> LOGGED work just like VACUUM FULL, with the exception that it will set
> >> a different relpersistence for the new relfilenode. If you do it that
> >> way, this will be less efficient, but much simpler, and you might
> >> actually finish it in one summer.
> >>
> >
> > Do it like 'VACUUM FULL' for any wal_level?
>
> Yep. Anything else appears to be a research problem.
>
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
<div dir="ltr"><div class="gmail_extra"><br />On Thu, Mar 6, 2014 at 5:05 PM, Fabrízio de Royes Mello <<a href="mailto:fabriziomello@gmail.com">fabriziomello@gmail.com</a>>wrote:<br />><br />><br />> On Thu, Mar 6,2014 at 5:04 PM, Robert Haas <<a href="mailto:robertmhaas@gmail.com">robertmhaas@gmail.com</a>> wrote:<br /> >><br />> > On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello<br />> > <<a href="mailto:fabriziomello@gmail.com">fabriziomello@gmail.com</a>>wrote:<br />> > > On Thu, Mar 6, 2014 at 4:42PM, Robert Haas <<a href="mailto:robertmhaas@gmail.com">robertmhaas@gmail.com</a>> wrote:<br /> > > >>I think this isn't a good design. Per the discussion between Andres<br />> > >> and I, I think thatI think you should do is make ALTER TABLE .. SET<br />> > >> LOGGED work just like VACUUM FULL, with theexception that it will set<br /> > > >> a different relpersistence for the new relfilenode. If you do itthat<br />> > >> way, this will be less efficient, but much simpler, and you might<br />> > >>actually finish it in one summer.<br /> > > >><br />> > ><br />> > > Do it like 'VACUUMFULL' for any wal_level?<br />> ><br />> > Yep. Anything else appears to be a research problem.<br />>><br />><br />> I'll change the proposal. Thanks a lot!<br /> ><br /><br /></div><div class="gmail_extra">Onelast question. <br /><br />Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too?Thinking in a scope of one GSoC, of course.<br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">Regards,<br/></div><div class="gmail_extra"><br />--<br />Fabrízio de Royes Mello<br />Consultoria/CoachingPostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/> >> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/>>> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/> >> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
<div dir="ltr"><div class="gmail_extra">On Thu, Mar 6, 2014 at 5:04 PM, Robert Haas <<a href="mailto:robertmhaas@gmail.com">robertmhaas@gmail.com</a>>wrote:<br />><br />> On Thu, Mar 6, 2014 at 2:52 PM,Fabrízio de Royes Mello<br /> > <<a href="mailto:fabriziomello@gmail.com">fabriziomello@gmail.com</a>> wrote:<br/>> > On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas <<a href="mailto:robertmhaas@gmail.com">robertmhaas@gmail.com</a>>wrote:<br /> > >> I think this isn't a good design. Per the discussion between Andres<br />> >> and I, I think that I think you should do is make ALTER TABLE.. SET<br />> >> LOGGED work just like VACUUM FULL, with the exception that it will set<br /> > >>a different relpersistence for the new relfilenode. If you do it that<br />> >> way, this will be lessefficient, but much simpler, and you might<br />> >> actually finish it in one summer.<br /> > >><br/>> ><br />> > Do it like 'VACUUM FULL' for any wal_level?<br />><br />> Yep. Anything elseappears to be a research problem.<br />><br /><br />Updated proposal:<br /><br /><proposal><br /><br />** Addto PostgreSQL the capacity to making an “Unlogged” table “Logged” **<br /><br />Introduction<br /><br />This project willallow to change an “unlogged” table (that doesn’t create transaction logs - WAL files) and it’s dependencies to a “logged”table, in other words, a regular table that create WAL files. To make this happen we'll introduce a new SQL syntax:<br/><br />ALTER TABLE name SET LOGGED;<br /><br /><br />Benefits to the PostgreSQL Community<br /><br />The “unlogged”tables feature was introduced by 9.1 version, and provide better write performance than regular tables (logged),but are not crash-safe. Their contents are automatically discarded (cleared) in a case of a server crash, and theircontents do not propagate to replication slaves, either.<br /> With the capacity of turning an “unlogged” table in alogged table will allow us have the better of two features, in other words, we can use an "unlogged" table to run a bulkload a thousands of lines (ETL scripts) and get better performance, and then change it to a "logged" table to get durabilityof loaded data.<br /><br /><br />Deliverables<br /><br />This project has just one deliverable at the end. Thedeliverable will be the implementation of the routines that transform an “unlogged” table to “logged”, using the samealgorithm of the “vacuum full”, with the exception that it will set a different “relpersistence” for the new “relfilenode”.<br/><br /><br />Project Schedule<br /><br />until May 19:<br />* create a website to the project (<a href="http://wiki.postgresql.org">wiki.postgresql.org</a>)<br/>* create a public repository to the project (<a href="http://github.com/fabriziomello">github.com/fabriziomello</a>)<br/> * read what has already been discussed by the communityabout the project (<a href="http://wiki.postgresql.org/wiki/Todo">http://wiki.postgresql.org/wiki/Todo</a>)<br />*as already discussed in pgsql-hackers mailing list this feature will be implemented similar to “vacuum full”, with theexception that it will set a differente “relpersistence” for the new “relfilenode”<br /> * learn about some PostgreSQLinternals:<br /> . grammar (src/backend/parser/gram.y)<br /> . vacuum full (src/backend/commands/[vacuum.c |cluster.c])<br /><br />May 19 - June 23<br />* implementation of the first prototype:<br /> . change the grammar of PostgreSQLto support “ALTER TABLE … SET LOGGED”<br /> . implement and/or adapt the routines to change an "unlogged" tableto "logged" (similar to “vacuum full”)<br />* write documentation and the test cases<br />* submit this first prototypeto the commitfest 2014/06 (<a href="https://commitfest.postgresql.org/action/commitfest_view?id=22">https://commitfest.postgresql.org/action/commitfest_view?id=22</a>)<br /><br/>June 23 - June 27<br />* review with the Mentor of the work done until now<br /><br />June 27 - August 18<br />* dothe adjustments based on the community feedback during the commitfest 2014/06<br />* submit to the commitfest 2014/09 forfinal evaluation and maybe will be committed to 9.5 version (webpage don’t created yet)<br /><br />August 18 - August22<br />* final review with the Mentor of all work done.<br /><br /><br />About the proponent<br /><br />Fabrízio deRoyes Mello<br />e-mail: <a href="mailto:fabriziomello@gmail.com">fabriziomello@gmail.com</a><br /> twitter: @fabriziomello<br/>github: <a href="http://github.com/fabriziomello">http://github.com/fabriziomello</a><br />linkedin: <ahref="http://linkedin.com/in/fabriziomello">http://linkedin.com/in/fabriziomello</a><br /><br /> Currently I help peopleand teams to take the full potential of relational databases, especially PostgreSQL, helping teams to design the structureof the database (modeling), build physical architecture (database schema), programming (procedural languages), SQL(usage, tuning, best practices), optimization and orchestration of instances in production too. I perform a volunteerwork for Brazilian Community of PostgreSQL (<a href="http://www.postgresql.org.br">www.postgresql.org.br</a>), supportingmailing lists, organizing events (<a href="http://pgbr.postgresql.org.br">pgbr.postgresql.org.br</a>) and someadmin tasks. And also I help a little the PostgreSQL Global Development Group (PGDG) in the implementation of some featuresand review of patches (<a href="http://git.postgresql.org">git.postgresql.org</a>).<br /><br /></proposal><br/><br />--<br />Fabrízio de Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira:<a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br />>> Blog sobre TI: <a href="http://fabriziomello.blogspot.com">http://fabriziomello.blogspot.com</a><br/> >> Perfil Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a></div></div>
Fabrízio de Royes Mello <fabriziomello@gmail.com> writes: > Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too? > Thinking in a scope of one GSoC, of course. I think it's basically the same thing. You might hope to optimize it; but you have to create (rather than remove) an init fork, and there's no way to do that in exact sync with the commit. So for safety I think you have to copy the data into a new relfilenode. regards, tom lane
On Fri, Mar 7, 2014 at 12:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:
> > Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too?
> > Thinking in a scope of one GSoC, of course.
>
> I think it's basically the same thing. You might hope to optimize it;
> but you have to create (rather than remove) an init fork, and there's
> no way to do that in exact sync with the commit. So for safety I think
> you have to copy the data into a new relfilenode.
>
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On Fri, Mar 7, 2014 at 12:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:I think it's basically the same thing. You might hope to optimize it;
> Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too?
> Thinking in a scope of one GSoC, of course.
but you have to create (rather than remove) an init fork, and there's
no way to do that in exact sync with the commit. So for safety I think
you have to copy the data into a new relfilenode.
Hi all,
In the GSoC proposal page [1] I received some suggestions to strech goals:
* "ALTER TABLE name SET UNLOGGED". This is essentially the reverse of the core proposal, which is "ALTER TABLE name SET LOGGED". Yes, I think that should definitely be included. It would be weird to have SET LOGGED but not SET UNLOGGED.
* Allow unlogged indexes on logged tables.
* Implement "ALTER TABLE name SET LOGGED" without rewriting the whole table, when wal_level = minimal.
* Allow unlogged materialized views.
* "ALTER TABLE name SET UNLOGGED". This is essentially the reverse of the core proposal, which is "ALTER TABLE name SET LOGGED". Yes, I think that should definitely be included. It would be weird to have SET LOGGED but not SET UNLOGGED.
* Allow unlogged indexes on logged tables.
* Implement "ALTER TABLE name SET LOGGED" without rewriting the whole table, when wal_level = minimal.
* Allow unlogged materialized views.
Comments?
[1] http://www.google-melange.com/gsoc/proposal/review/student/google/gsoc2014/fabriziomello/5629499534213120
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://frabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Blog sobre TI: http://frabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On 2014-04-01 13:37:57 -0300, Fabrízio de Royes Mello wrote: > In the GSoC proposal page [1] I received some suggestions to strech goals: > > * "ALTER TABLE name SET UNLOGGED". This is essentially the reverse of the > core proposal, which is "ALTER TABLE name SET LOGGED". Yes, I think that > should definitely be included. It would be weird to have SET LOGGED but not > SET UNLOGGED. Yes, that makes sense. > * Allow unlogged indexes on logged tables. I don't think it's realistic to build the infrastructure necessary for that as part of gsoc. The reasons have been explained somewhere in this thread. > * Implement "ALTER TABLE name SET LOGGED" without rewriting the whole > table, when wal_level = minimal. Yea, maybe. > * Allow unlogged materialized views. I don't think that's realistic either. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 03/07/2014 05:36 AM, Tom Lane wrote: > Fabrízio de Royes Mello <fabriziomello@gmail.com> writes: >> Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too? >> Thinking in a scope of one GSoC, of course. > > I think it's basically the same thing. You might hope to optimize it; > but you have to create (rather than remove) an init fork, and there's > no way to do that in exact sync with the commit. You just have to include that information with the commit WAL record, no? - Heikki
On 3/4/14, 8:50 AM, Andres Freund wrote: > Can't that be solved by just creating the permanent relation in a new > relfilenode? That's equivalent to a rewrite, yes, but we need to do that > for anything but wal_level=minimal anyway. Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again into the relationitself? -- Jim C. Nasby, Data Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: > On 3/4/14, 8:50 AM, Andres Freund wrote: > >Can't that be solved by just creating the permanent relation in a new > >relfilenode? That's equivalent to a rewrite, yes, but we need to do that > >for anything but wal_level=minimal anyway. > > Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again into therelation itself? Yes. But as I said, that's unavoidable for anything but wal_level=minimal. If somebody wants to put in the additional nontrivial work to make it work faster with wal_level=minimal, they can do so. But the other case is more general and needs to be done anyway. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Apr 1, 2014 at 1:40 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>
> On 2014-04-01 13:37:57 -0300, Fabrízio de Royes Mello wrote:
> > In the GSoC proposal page [1] I received some suggestions to strech goals:
> >
> > * "ALTER TABLE name SET UNLOGGED". This is essentially the reverse of the
> > core proposal, which is "ALTER TABLE name SET LOGGED". Yes, I think that
> > should definitely be included. It would be weird to have SET LOGGED but not
> > SET UNLOGGED.
>
> Yes, that makes sense.
>
> > * Allow unlogged indexes on logged tables.
>
> I don't think it's realistic to build the infrastructure necessary for
> that as part of gsoc. The reasons have been explained somewhere in this
> thread.
>
> > * Implement "ALTER TABLE name SET LOGGED" without rewriting the whole
> > table, when wal_level = minimal.
>
> Yea, maybe.
>
> > * Allow unlogged materialized views.
>
> I don't think that's realistic either.
>
Thanks Andres for your comments.>
> On 2014-04-01 13:37:57 -0300, Fabrízio de Royes Mello wrote:
> > In the GSoC proposal page [1] I received some suggestions to strech goals:
> >
> > * "ALTER TABLE name SET UNLOGGED". This is essentially the reverse of the
> > core proposal, which is "ALTER TABLE name SET LOGGED". Yes, I think that
> > should definitely be included. It would be weird to have SET LOGGED but not
> > SET UNLOGGED.
>
> Yes, that makes sense.
>
> > * Allow unlogged indexes on logged tables.
>
> I don't think it's realistic to build the infrastructure necessary for
> that as part of gsoc. The reasons have been explained somewhere in this
> thread.
>
> > * Implement "ALTER TABLE name SET LOGGED" without rewriting the whole
> > table, when wal_level = minimal.
>
> Yea, maybe.
>
> > * Allow unlogged materialized views.
>
> I don't think that's realistic either.
>
Anyway I added this "additional goals" to proposal and as Heikki said:
"It's actually nice to have several separate goals like this, it means that if you finish the task faster than expected, you can move on to the next goal, and if one task takes longer than expected so that you don't finish all the work, we'll still have something useful."
I hope you can help me in some way with this project too. :-)
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
On 04/01/2014 08:58 PM, Andres Freund wrote: > On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: >> On 3/4/14, 8:50 AM, Andres Freund wrote: >>> Can't that be solved by just creating the permanent relation in a new >>> relfilenode? That's equivalent to a rewrite, yes, but we need to do that >>> for anything but wal_level=minimal anyway. >> >> Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again into therelation itself? > > Yes. But as I said, that's unavoidable for anything but > wal_level=minimal. Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... SET LOGGED. There's no fundamental reason you need to rewrite the heap, too. I understand that it might be difficult to do, because of the way the system catalogs work, but it's worthy goal. - Heikki
On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: > On 04/01/2014 08:58 PM, Andres Freund wrote: > >On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: > >>On 3/4/14, 8:50 AM, Andres Freund wrote: > >>>Can't that be solved by just creating the permanent relation in a new > >>>relfilenode? That's equivalent to a rewrite, yes, but we need to do that > >>>for anything but wal_level=minimal anyway. > >> > >>Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again into therelation itself? > > > >Yes. But as I said, that's unavoidable for anything but > >wal_level=minimal. > > Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... > SET LOGGED. There's no fundamental reason you need to rewrite the heap, too. > I understand that it might be difficult to do, because of the way the system > catalogs work, but it's worthy goal. I don't think that's realistic to achieve due to the issues described in http://archives.postgresql.org/message-id/CA%2BTgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0%2BaLg%3Dx2Q%40mail.gmail.com I don't think it's worthwile to make the feature much more complex, just to address this. perfect is the enemy of good and all that. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 04/03/2014 01:44 PM, Andres Freund wrote: > On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: >> On 04/01/2014 08:58 PM, Andres Freund wrote: >>> On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: >>>> On 3/4/14, 8:50 AM, Andres Freund wrote: >>>>> Can't that be solved by just creating the permanent relation in a new >>>>> relfilenode? That's equivalent to a rewrite, yes, but we need to do that >>>>> for anything but wal_level=minimal anyway. >>>> >>>> Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again intothe relation itself? >>> >>> Yes. But as I said, that's unavoidable for anything but >>> wal_level=minimal. >> >> Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... >> SET LOGGED. There's no fundamental reason you need to rewrite the heap, too. >> I understand that it might be difficult to do, because of the way the system >> catalogs work, but it's worthy goal. > > I don't think that's realistic to achieve due to the issues described in > http://archives.postgresql.org/message-id/CA%2BTgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0%2BaLg%3Dx2Q%40mail.gmail.com To which I replied here: http://www.postgresql.org/message-id/533AF9D7.7010503@vmware.com. Please reply to that sub-thread with any problems you see. I might be missing something, but I really don't see any insurmountable problem here. > I don't think it's worthwile to make the feature much more complex, just > to address this. perfect is the enemy of good and all that. We should do the trivial implementation first, sure. But that ought to be trivial. Now is the time to discuss how to do the more optimal thing. If we can come up with a feasible design on that, Fabrizio will have time to do that as part of the GSoC. - Heikki
On 2014-04-01 20:39:35 +0300, Heikki Linnakangas wrote: > On 03/07/2014 05:36 AM, Tom Lane wrote: > >Fabrízio de Royes Mello <fabriziomello@gmail.com> writes: > >>Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too? > >>Thinking in a scope of one GSoC, of course. > > > >I think it's basically the same thing. You might hope to optimize it; > >but you have to create (rather than remove) an init fork, and there's > >no way to do that in exact sync with the commit. > > You just have to include that information with the commit WAL record, no? Sure, it's possible to do that. But that seems like complicating generic paths more than I'd like for a minor feature. Especially as the unlinking of the files would need to happen somewhere in RecordTransactionCommit(). After the XLogFlush(), but before unsetting MyPgXact->delayChkpt. That's a crit section, right? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 04/01/2014 08:39 PM, Heikki Linnakangas wrote: > On 03/07/2014 05:36 AM, Tom Lane wrote: >> Fabrízio de Royes Mello <fabriziomello@gmail.com> writes: >>> Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too? >>> Thinking in a scope of one GSoC, of course. >> >> I think it's basically the same thing. You might hope to optimize it; >> but you have to create (rather than remove) an init fork, and there's >> no way to do that in exact sync with the commit. > > You just have to include that information with the commit WAL record, no? No-one's replied yet, but perhaps the worry is that after you've written the commit record, you have to go ahead with removing/creating the init fork, and that is seen as too risky. If a creat() or unlink() call fails, that will have to be a PANIC, and crash recovery will likewise have to PANIC if the forks still cannot be removed/created. My first thought is that that seems ok. It's unlikely that an unlink() of a small file in the data directory would fail. Creation could be done with a temporary name first and renamed into place, to avoid running out of disk space in the critical section. If that's not acceptable, one idea off the top of my head is to somehow stamp the init forks when making an unlogged table logged, with the XID of the transcation. Crash recovery could then check the clog to see if the transaction committed, and ignore any init fork files belonging to committed transactions. (Same in reverse when making a logged table unlogged). Currently, we reset unlogged relations before replaying the WAL. That would have to be delayed until end of WAL replay, because otherwise we don't know if the transaction committed or not. Although if we go with the stamping approach, we could still reset unstamped files at the beginning of recovery. - Heikki
On 2014-04-03 14:26:50 +0300, Heikki Linnakangas wrote: > On 04/01/2014 08:39 PM, Heikki Linnakangas wrote: > >On 03/07/2014 05:36 AM, Tom Lane wrote: > >>Fabrízio de Royes Mello <fabriziomello@gmail.com> writes: > >>>Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too? > >>>Thinking in a scope of one GSoC, of course. > >> > >>I think it's basically the same thing. You might hope to optimize it; > >>but you have to create (rather than remove) an init fork, and there's > >>no way to do that in exact sync with the commit. > > > >You just have to include that information with the commit WAL record, no? > > No-one's replied yet That might be because it was a month after the initial discussion, and at least I'd temporarily lost track of the thread ;) > , but perhaps the worry is that after you've written the > commit record, you have to go ahead with removing/creating the init fork, > and that is seen as too risky. If a creat() or unlink() call fails, that > will have to be a PANIC, and crash recovery will likewise have to PANIC if > the forks still cannot be removed/created. That's part of the worry, yes. It's also creeping code dealing with unlogged relations into a fairly critical place (RecordTransactionCommit()) where it really doesn't seem to belong. > My first thought is that that seems ok. It's unlikely that an unlink() of a > small file in the data directory would fail. Creation could be done with a > temporary name first and renamed into place, to avoid running out of disk > space in the critical section. I continue to feel that that's far too much impact for a minor feature. Even if it could be made work reliably, it'll be a fair amount of seldomly used infrastructure. > If that's not acceptable, one idea off the top of my head is to somehow > stamp the init forks when making an unlogged table logged, with the XID of > the transcation. Crash recovery could then check the clog to see if the > transaction committed, and ignore any init fork files belonging to committed > transactions. (Same in reverse when making a logged table unlogged). I've thought about that - after all, the logical decoding stuff uses that trick in some places - but it has the grave disadvantage that it requires a full directory scan to fully remove a relation. That seems to be a heavy price. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: > On 04/01/2014 08:58 PM, Andres Freund wrote: > >On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: > >>On 3/4/14, 8:50 AM, Andres Freund wrote: > >>>Can't that be solved by just creating the permanent relation in a new > >>>relfilenode? That's equivalent to a rewrite, yes, but we need to do that > >>>for anything but wal_level=minimal anyway. > >> > >>Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again into therelation itself? > > > >Yes. But as I said, that's unavoidable for anything but > >wal_level=minimal. > > Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... > SET LOGGED. There's no fundamental reason you need to rewrite the > heap, too. As another point: What's the advantage of that? The amount of writes will be the same, no? It doesn't seem to be all that interesting that a second filenode exists temporarily? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 04/03/2014 02:41 PM, Andres Freund wrote: > On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: >> On 04/01/2014 08:58 PM, Andres Freund wrote: >>> On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: >>>> On 3/4/14, 8:50 AM, Andres Freund wrote: >>>>> Can't that be solved by just creating the permanent relation in a new >>>>> relfilenode? That's equivalent to a rewrite, yes, but we need to do that >>>>> for anything but wal_level=minimal anyway. >>>> >>>> Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again intothe relation itself? >>> >>> Yes. But as I said, that's unavoidable for anything but >>> wal_level=minimal. >> >> Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... >> SET LOGGED. There's no fundamental reason you need to rewrite the >> heap, too. > > As another point: What's the advantage of that? The amount of writes > will be the same, no? It doesn't seem to be all that interesting that > a second filenode exists temporarily? Surely it's cheaper to read the whole relation and copy it to just WAL, than to read the whole relation and write it both the WAL and another file. (Maybe it's not worth the trouble to avoid it - but that depends on whether we come up with a good design..) - Heikki
On 2014-04-03 15:02:27 +0300, Heikki Linnakangas wrote: > On 04/03/2014 02:41 PM, Andres Freund wrote: > >On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: > >>On 04/01/2014 08:58 PM, Andres Freund wrote: > >>>On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: > >>>>On 3/4/14, 8:50 AM, Andres Freund wrote: > >>>>>Can't that be solved by just creating the permanent relation in a new > >>>>>relfilenode? That's equivalent to a rewrite, yes, but we need to do that > >>>>>for anything but wal_level=minimal anyway. > >>>> > >>>>Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again intothe relation itself? > >>> > >>>Yes. But as I said, that's unavoidable for anything but > >>>wal_level=minimal. > >> > >>Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... > >>SET LOGGED. There's no fundamental reason you need to rewrite the > >>heap, too. > > > >As another point: What's the advantage of that? The amount of writes > >will be the same, no? It doesn't seem to be all that interesting that > >a second filenode exists temporarily? > > Surely it's cheaper to read the whole relation and copy it to just WAL, than > to read the whole relation and write it both the WAL and another file. I have to admit I was thinking of the WAL replay case ;). But we'll actually have to write all dirty s_b, change the persistency tags and such anyway because there's no LSN interlock with checkpoints. That seems pretty ugly as well, and once again, avoidable by a rewrite. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > No-one's replied yet, but perhaps the worry is that after you've written > the commit record, you have to go ahead with removing/creating the init > fork, and that is seen as too risky. If a creat() or unlink() call > fails, that will have to be a PANIC, and crash recovery will likewise > have to PANIC if the forks still cannot be removed/created. > My first thought is that that seems ok. No, it isn't. No filesystem operation should *ever* be thought to be guaranteed to succeed. I also concur with Andres' complaint that this feature is not worth adding complication to the core transaction commit path for. regards, tom lane
On Thu, Apr 3, 2014 at 7:26 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 04/01/2014 08:39 PM, Heikki Linnakangas wrote: >> On 03/07/2014 05:36 AM, Tom Lane wrote: >>> Fabrízio de Royes Mello <fabriziomello@gmail.com> >>> writes: >>>> Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" >>>> too? >>>> Thinking in a scope of one GSoC, of course. >>> >>> I think it's basically the same thing. You might hope to optimize it; >>> but you have to create (rather than remove) an init fork, and there's >>> no way to do that in exact sync with the commit. >> >> You just have to include that information with the commit WAL record, no? > > No-one's replied yet, but perhaps the worry is that after you've written the > commit record, you have to go ahead with removing/creating the init fork, > and that is seen as too risky. If a creat() or unlink() call fails, that > will have to be a PANIC, and crash recovery will likewise have to PANIC if > the forks still cannot be removed/created. Yeah, that's the concern. If I may digress for a moment, unlogged materialized views are not supported. This is because we have this facility where if a materialized view hasn't been populated yet, you get an error when you try to scan it. If we allowed unlogged materialized views, then they'd get reset to empty rather than to not-populated, because the not-populated status is stored in the catalog, not the filesystem. I still wish we'd never added the notion of populated in the first place, but Kevin felt it was essential, so we ended up here. Anyway, the idea that I had for fixing the unlogged materialized view case was to add a new 64-bit integer to the control file that gets bumped every time we start crash recovery, and which also gets recorded in pg_class. The value 0 would be reserved, and all pg_class entries for non-unlogged relations would store 0. For unlogged relations, we could check whether the value in pg_class equals the current value; if not, the relation should be viewed as not-populated. This is not too far from a solution from the problem we need to solve here. If we want to make an unlogged relation logged, we can go ahead and remove the init forks right away, knowing that the pg_class update changing relpersistence and this new value won't take effect until commit. If the system meanwhile crashes, a backend connected to the relevant database has enough state to recognize that the relation is in this in-between state. Before we can again use that relation, we need to rebuild the init fork and reset it. Of course, it's not too clear exactly how that state cleanup happens; as one option, we could just require the user to run a manual TRUNCATE. This would not be totally without precedent, because CREATE INDEX CONCURRENTLY leaves crap behind that the user has to reindex or drop, but it's clearly not ideal. Another option would be to try to make autovacuum put things right, but that of course might not happen right away. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company