Thread: Acclerating INSERT/UPDATE using UPS
Hello PostgreSQL Hackers, I have made a modification of PostgreSQL which accelerates INSERT/UPDATE using UPS. The name of the software is "Sigres",and the philosophy is considering a battery supplied memory as a persistent device instead of a disk. You can downloadSigres from http://sourceforge.jp/projects/sigres/ . In the maximum case, Sigres is 7 times faster than PostgreSQL default (fsync=on) in my environment (CoreDuo 2.66GHz, UDMA/133),and it is also 10% faster than PostgreSQL without fsync (fsync=off). The magic lies in usually skipping XLogWrite() and ignoring WALWriteLock. The exceptions are XLogWrite() calls from AdvanceXLInsertBuffer().In addition, in XLogFileClose() issue_xlog_fsync() before close(). (In this point, Sigres is differentfrom just simply setting fsync=off.) Although I think Sigres can be considered as one of the future directions of PostgreSQL, I do not know whether this softwarecan be accepted or not. Could you please give me some comments ? Best Regards, -- Hideyuki Kawashima Assistant Professor, University of Tsukuba
Hideyuki Kawashima wrote: > Hello PostgreSQL Hackers, > > I have made a modification of PostgreSQL which accelerates INSERT/UPDATE using UPS. The name of the software is "Sigres",and the philosophy is considering a battery supplied memory as a persistent device instead of a disk. You can downloadSigres from http://sourceforge.jp/projects/sigres/ . > > In the maximum case, Sigres is 7 times faster than PostgreSQL default (fsync=on) in my environment (CoreDuo 2.66GHz, UDMA/133),and it is also 10% faster than PostgreSQL without fsync (fsync=off). Interesting and what happens when the UPS fails? My main concern is that one of the purposes of PostgreSQL is data integrity. I am all for every performance enhancement we can achieve, that does *not* sacrifice that. Sincerely, Joshua D. Drake > > The magic lies in usually skipping XLogWrite() and ignoring WALWriteLock. The exceptions are XLogWrite() calls from AdvanceXLInsertBuffer().In addition, in XLogFileClose() issue_xlog_fsync() before close(). (In this point, Sigres is differentfrom just simply setting fsync=off.) > > Although I think Sigres can be considered as one of the future directions of PostgreSQL, I do not know whether this softwarecan be accepted or not. Could you please give me some comments ? > > Best Regards, > > -- Hideyuki Kawashima > Assistant Professor, University of Tsukuba > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Joshua, Thanks for your comments ! Right. As you pointed out, Sigres cannot recover if UPS fails. Therefore, I think currently Sigres cannot be used for mission critical applications. Sigres keeps data integrity *only when* UPS works. On the other hand, some users such as researchers of sensor networks would like store and analyze 10^6 Hz sensor data insertions in real-time, and obviously research usages are not mission critical. Thus Sigres may be accepted for researchers, but I have no prospect now since I have just started to distributing Sigres for research institutes in Japan. BTW, Joshua, could you please let me know or give me any pointers for the reason why fsync=off option exists on PostgreSQL although PostgreSQL developers does not allow sacrificing data integrity ? If the reason is famous and clear in the community, I am sorry for bothering you. -- Hideyuki Joshua D. Drake wrote: > Hideyuki Kawashima wrote: > >> Hello PostgreSQL Hackers, >> >> I have made a modification of PostgreSQL which accelerates INSERT/UPDATE using UPS. The name of the software is "Sigres",and the philosophy is considering a battery supplied memory as a persistent device instead of a disk. You can downloadSigres from http://sourceforge.jp/projects/sigres/ . >> >> In the maximum case, Sigres is 7 times faster than PostgreSQL default (fsync=on) in my environment (CoreDuo 2.66GHz, UDMA/133),and it is also 10% faster than PostgreSQL without fsync (fsync=off). >> > > Interesting and what happens when the UPS fails? My main concern is that > one of the purposes of PostgreSQL is data integrity. I am all for every > performance enhancement we can achieve, that does *not* sacrifice that. > > Sincerely, > > Joshua D. Drake > > >> The magic lies in usually skipping XLogWrite() and ignoring WALWriteLock. The exceptions are XLogWrite() calls from AdvanceXLInsertBuffer().In addition, in XLogFileClose() issue_xlog_fsync() before close(). (In this point, Sigres is differentfrom just simply setting fsync=off.) >> >> Although I think Sigres can be considered as one of the future directions of PostgreSQL, I do not know whether this softwarecan be accepted or not. Could you please give me some comments ? >> >> Best Regards, >> >> -- Hideyuki Kawashima >> Assistant Professor, University of Tsukuba >> >> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 1: if posting/reading through Usenet, please send an appropriate >> subscribe-nomail command to majordomo@postgresql.org so that your >> message can get through to the mailing list cleanly >> >> > > >
> BTW, Joshua, could you please let me know or give me any pointers for > the reason why fsync=off option exists on PostgreSQL although PostgreSQL A couple of reasons that I can think of. One would be data loads or restoring from backup. Another would be on data that you can afford to throw away. > developers does not allow sacrificing data integrity ? > If the reason is famous and clear in the community, I am sorry for > bothering you. No bother at all! We invite all good ideas and I am glad to see more from our eastern community participate. Another option you might want to look at to further give yourself a boost in performance is full_page_writes. Joshua D. Drake > > > -- Hideyuki > > Joshua D. Drake wrote: >> Hideyuki Kawashima wrote: >> >>> Hello PostgreSQL Hackers, >>> >>> I have made a modification of PostgreSQL which accelerates INSERT/UPDATE using UPS. The name of the software is "Sigres",and the philosophy is considering a battery supplied memory as a persistent device instead of a disk. You can downloadSigres from http://sourceforge.jp/projects/sigres/ . >>> >>> In the maximum case, Sigres is 7 times faster than PostgreSQL default (fsync=on) in my environment (CoreDuo 2.66GHz,UDMA/133), and it is also 10% faster than PostgreSQL without fsync (fsync=off). >>> >> Interesting and what happens when the UPS fails? My main concern is that >> one of the purposes of PostgreSQL is data integrity. I am all for every >> performance enhancement we can achieve, that does *not* sacrifice that. >> >> Sincerely, >> >> Joshua D. Drake >> >> >>> The magic lies in usually skipping XLogWrite() and ignoring WALWriteLock. The exceptions are XLogWrite() calls from AdvanceXLInsertBuffer().In addition, in XLogFileClose() issue_xlog_fsync() before close(). (In this point, Sigres is differentfrom just simply setting fsync=off.) >>> >>> Although I think Sigres can be considered as one of the future directions of PostgreSQL, I do not know whether this softwarecan be accepted or not. Could you please give me some comments ? >>> >>> Best Regards, >>> >>> -- Hideyuki Kawashima >>> Assistant Professor, University of Tsukuba >>> >>> >>> >>> ---------------------------(end of broadcast)--------------------------- >>> TIP 1: if posting/reading through Usenet, please send an appropriate >>> subscribe-nomail command to majordomo@postgresql.org so that your >>> message can get through to the mailing list cleanly >>> >>> >> >> > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Joshua, I appreciate your quick & informative reply. And, I also really appreciate your kind comments. Since I have joined this ML 3 hours ago, I tried to be polite and slightly nervous. But I was relieved by your message. Major topic. The full_page_writes option is already ignored. In Sigres, bgWriter does not call CreatCheckPoint periodically. Sigres calls CreateCheckPoint only when bgWriter calls ShutdownXLOG. Thus also from this point, durable UPS is required for Sigres. The reason why I made the Sigres is, the advances of recent non volatile memories. Just now we do not usually use non volatile memories. But in the near future, situation would change. I think if a non volatile memories can be considered as a persistence device, PostgreSQL WAL mechanism should be modified. However, I do not use such devices usually. Thus I made Sigres which requires UPS. Currently I have just ignored XLogWrite and WALWriteLock, but a friend of mine (a Japanese great hacker of PostgreSQL) has more idea to improve WAL if a battery supplied memory can be considered as a persistent device. Regards, -- Hideyuki Joshua D. Drake wrote: >> BTW, Joshua, could you please let me know or give me any pointers for >> the reason why fsync=off option exists on PostgreSQL although PostgreSQL >> > > A couple of reasons that I can think of. One would be data loads or > restoring from backup. Another would be on data that you can afford to > throw away. > > > >> developers does not allow sacrificing data integrity ? >> If the reason is famous and clear in the community, I am sorry for >> bothering you. >> > > No bother at all! We invite all good ideas and I am glad to see more > from our eastern community participate. > > Another option you might want to look at to further give yourself a > boost in performance is full_page_writes. > > Joshua D. Drake > > > > >> -- Hideyuki >> >> Joshua D. Drake wrote: >> >>> Hideyuki Kawashima wrote: >>> >>> >>>> Hello PostgreSQL Hackers, >>>> >>>> I have made a modification of PostgreSQL which accelerates INSERT/UPDATE using UPS. The name of the software is "Sigres",and the philosophy is considering a battery supplied memory as a persistent device instead of a disk. You can downloadSigres from http://sourceforge.jp/projects/sigres/ . >>>> >>>> In the maximum case, Sigres is 7 times faster than PostgreSQL default (fsync=on) in my environment (CoreDuo 2.66GHz,UDMA/133), and it is also 10% faster than PostgreSQL without fsync (fsync=off). >>>> >>>> >>> Interesting and what happens when the UPS fails? My main concern is that >>> one of the purposes of PostgreSQL is data integrity. I am all for every >>> performance enhancement we can achieve, that does *not* sacrifice that. >>> >>> Sincerely, >>> >>> Joshua D. Drake >>> >>> >>> >>>> The magic lies in usually skipping XLogWrite() and ignoring WALWriteLock. The exceptions are XLogWrite() calls fromAdvanceXLInsertBuffer(). In addition, in XLogFileClose() issue_xlog_fsync() before close(). (In this point, Sigres isdifferent from just simply setting fsync=off.) >>>> >>>> Although I think Sigres can be considered as one of the future directions of PostgreSQL, I do not know whether thissoftware can be accepted or not. Could you please give me some comments ? >>>> >>>> Best Regards, >>>> >>>> -- Hideyuki Kawashima >>>> Assistant Professor, University of Tsukuba >>>> >>>> >>>> >>>> ---------------------------(end of broadcast)--------------------------- >>>> TIP 1: if posting/reading through Usenet, please send an appropriate >>>> subscribe-nomail command to majordomo@postgresql.org so that your >>>> message can get through to the mailing list cleanly >>>> >>>> >>>> >>> >>> >> ---------------------------(end of broadcast)--------------------------- >> TIP 6: explain analyze is your friend >> >> > > > -- 筑波大学大学院システム情報工学研究科 講師 川島英之 〒305-8573 つくば市天王台1-1-1 TEL: 029-853-5322 #2月より所属が変わりました
Hideyuki Kawashima wrote: > Joshua, :) > The reason why I made the Sigres is, the advances of recent non volatile > memories. Just now we do not usually use non volatile memories. But in > the near future, situation would change. I think if a non volatile > memories can be considered as a persistence device, PostgreSQL WAL > mechanism should be modified. > However, I do not use such devices usually. Thus I made Sigres which > requires UPS. This is actually very interesting. We (www.commandprompt.com) have had several customers ask us how we can make PostgreSQL more reasonable within a flash environment. I agree with you that in the future you will see many such databases including PostgreSQL living on these devices. Tom? What do you think? Is there some room for movement here within the postgresql.conf to make something like sigres usable within PostgreSQL proper? > > Currently I have just ignored XLogWrite and WALWriteLock, but a friend > of mine (a Japanese great hacker of PostgreSQL) has more idea to improve > WAL if a battery supplied memory can be considered as a persistent device. > We are coming up very quickly on a feature freeze for the next version of PostgreSQL. If... we can has something out quickly enough and in a thought out fashion, the hackers may be willing to accept a patch for 8.3.. If not there is always 8.4.. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
I think it would be great to have this kind of functionality built into postgres (optional and disabled by default of course) I use postgres mainly for its querying and concurrency features (a 10x increase in insert/update speed would be phenomenal) I know most people need 100% data integrity but as Hideyuki pointed out we all don't need 100%. On our project, when the power is out, we aren't receiving data anyways... just my two cents. on a side note, would putting the wal on a tmpfs partition give you something similar?
--
Gene Hart
cell: 443-604-2679
On 2/10/07, Joshua D. Drake <jd@commandprompt.com> wrote:
Hideyuki Kawashima wrote:
> Joshua,
:)
> The reason why I made the Sigres is, the advances of recent non volatile
> memories. Just now we do not usually use non volatile memories. But in
> the near future, situation would change. I think if a non volatile
> memories can be considered as a persistence device, PostgreSQL WAL
> mechanism should be modified.
> However, I do not use such devices usually. Thus I made Sigres which
> requires UPS.
This is actually very interesting. We ( www.commandprompt.com) have had
several customers ask us how we can make PostgreSQL more reasonable
within a flash environment.
I agree with you that in the future you will see many such databases
including PostgreSQL living on these devices.
Tom? What do you think? Is there some room for movement here within the
postgresql.conf to make something like sigres usable within PostgreSQL
proper?
>
> Currently I have just ignored XLogWrite and WALWriteLock, but a friend
> of mine (a Japanese great hacker of PostgreSQL) has more idea to improve
> WAL if a battery supplied memory can be considered as a persistent device.
>
We are coming up very quickly on a feature freeze for the next version
of PostgreSQL. If... we can has something out quickly enough and in a
thought out fashion, the hackers may be willing to accept a patch for
8.3.. If not there is always 8.4..
Sincerely,
Joshua D. Drake
--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
--
Gene Hart
cell: 443-604-2679
Joshua, I appreciate your great suggestion! It is great honor for me if Sigres will be merged to PostgreSQL. Since the changes of Sigres from PostgreSQL-8.2.1 are not many, and moreover, all of changes are surrounded with #ifdef SIGRES --- #endif, incorporating Sigres into PostgreSQL would be easy. However, Sigres modifies WAL which is the most important point of DBMS on stability. Although I myself could not find any bugs in Sigres, I am really afraid of it. It a bug exists on Sigres, it puts everyone to huge inconvenience... Therefore, before incorporating Sigres into PostgreSQL, the code must be checked, and the behaviors of Sigres must be checked carefully. Since PostgreSQL is a famous and wide spread software, I strongly want to avoid losing its great reputation. Unfortunately in Japan, I do not know any WAL hackers except for a friend of mine, and he is too busy to check Sigres. So, if pgsql-hackers checks Sigres, I am really happy. Best Regards, -- Hideyuki Joshua D. Drake wrote: > Hideyuki Kawashima wrote: > >> Joshua, >> > > :) > > >> The reason why I made the Sigres is, the advances of recent non volatile >> memories. Just now we do not usually use non volatile memories. But in >> the near future, situation would change. I think if a non volatile >> memories can be considered as a persistence device, PostgreSQL WAL >> mechanism should be modified. >> However, I do not use such devices usually. Thus I made Sigres which >> requires UPS. >> > > This is actually very interesting. We (www.commandprompt.com) have had > several customers ask us how we can make PostgreSQL more reasonable > within a flash environment. > > I agree with you that in the future you will see many such databases > including PostgreSQL living on these devices. > > Tom? What do you think? Is there some room for movement here within the > postgresql.conf to make something like sigres usable within PostgreSQL > proper? > > >> Currently I have just ignored XLogWrite and WALWriteLock, but a friend >> of mine (a Japanese great hacker of PostgreSQL) has more idea to improve >> WAL if a battery supplied memory can be considered as a persistent device. >> >> > > We are coming up very quickly on a feature freeze for the next version > of PostgreSQL. If... we can has something out quickly enough and in a > thought out fashion, the hackers may be willing to accept a patch for > 8.3.. If not there is always 8.4.. > > Sincerely, > > Joshua D. Drake > > > > > -- 筑波大学大学院システム情報工学研究科 講師 川島英之 〒305-8573 つくば市天王台1-1-1 TEL: 029-853-5322 #2月より所属が変わりました
Hideyuki Kawashima wrote: > Joshua, > > I appreciate your great suggestion! > It is great honor for me if Sigres will be merged to PostgreSQL. > Since the changes of Sigres from PostgreSQL-8.2.1 are not many, > and moreover, all of changes are surrounded with #ifdef SIGRES --- #endif, > incorporating Sigres into PostgreSQL would be easy. The best way is to create a patch against -head and submit that patch with a complete description of why, and what. If you have test cases that show the improvement all the better. I would suggest though if you are going to submit the patch that you take a look at how you could disable/enable the feature within the postgresql.conf via a guc. Sincerely, Joshua D. Drake > > However, Sigres modifies WAL which is the most important point of DBMS > on stability. > Although I myself could not find any bugs in Sigres, I am really afraid > of it. It a bug exists on Sigres, it puts everyone to huge > inconvenience... Therefore, before incorporating Sigres into PostgreSQL, > the code must be checked, and the behaviors of Sigres must be checked > carefully. Since PostgreSQL is a famous and wide spread software, I > strongly want to avoid losing its great reputation. Unfortunately in > Japan, I do not know any WAL hackers except for a friend of mine, and he > is too busy to check Sigres. So, if pgsql-hackers checks Sigres, I am > really happy. > > Best Regards, > > -- Hideyuki > > Joshua D. Drake wrote: >> Hideyuki Kawashima wrote: >> >>> Joshua, >>> >> :) >> >> >>> The reason why I made the Sigres is, the advances of recent non volatile >>> memories. Just now we do not usually use non volatile memories. But in >>> the near future, situation would change. I think if a non volatile >>> memories can be considered as a persistence device, PostgreSQL WAL >>> mechanism should be modified. >>> However, I do not use such devices usually. Thus I made Sigres which >>> requires UPS. >>> >> This is actually very interesting. We (www.commandprompt.com) have had >> several customers ask us how we can make PostgreSQL more reasonable >> within a flash environment. >> >> I agree with you that in the future you will see many such databases >> including PostgreSQL living on these devices. >> >> Tom? What do you think? Is there some room for movement here within the >> postgresql.conf to make something like sigres usable within PostgreSQL >> proper? >> >> >>> Currently I have just ignored XLogWrite and WALWriteLock, but a friend >>> of mine (a Japanese great hacker of PostgreSQL) has more idea to improve >>> WAL if a battery supplied memory can be considered as a persistent device. >>> >>> >> We are coming up very quickly on a feature freeze for the next version >> of PostgreSQL. If... we can has something out quickly enough and in a >> thought out fashion, the hackers may be willing to accept a patch for >> 8.3.. If not there is always 8.4.. >> >> Sincerely, >> >> Joshua D. Drake >> >> >> >> >> > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
kawasima@cs.tsukuba.ac.jp (Hideyuki Kawashima) wrote: > Joshua, > > I appreciate your quick & informative reply. And, I also really > appreciate your kind comments. Since I have joined this ML 3 hours > ago, I tried to be polite and slightly nervous. But I was relieved > by your message. Your idea sounds interesting; there is likely to be a considerable resistance to mechanisms, however, that would be likely to make PostgreSQL less robust. Be aware, also, that in a public forum like this, people are sometimes less gentle than Joshua. The fundamental trouble with this mechanism is that a power outage can instantly turn a database into crud. One may try to mitigate that problem by supporting the memory device with multiple power supplies *and* multiple UPSes. But there is a not-inconsiderable risk that people will fail to read warnings, deploy databases in a way that leaves them exposed to total loss, and then lay blame at this community's feet. I'm sure you can understand why the community might resist that... -- output = reverse("moc.liamg" "@" "enworbbc") http://linuxdatabases.info/info/internet.html Babbage's Rule: "No man's cipher is worth looking at unless the inventor has himself solved a very difficult cipher" (The Codebreakers by Kahn, 2nd ed, pg 765)
A long time ago, in a galaxy far, far away, kawasima@cs.tsukuba.ac.jp (Hideyuki Kawashima) wrote: > I appreciate your great suggestion! > It is great honor for me if Sigres will be merged to PostgreSQL. > Since the changes of Sigres from PostgreSQL-8.2.1 are not many, > and moreover, all of changes are surrounded with #ifdef SIGRES --- #endif, > incorporating Sigres into PostgreSQL would be easy. You should consider submitting a patch for this against CVS HEAD. And actually, I'd think it a better idea to define a GUC variable and use that to control whether Sigres is active or not. At the more sophisticated end of the spectrum, you might set things up so that it could be activated/deactivated at runtime by a superuser. At the less sophisticated end, it might need to be configured in postgresql.conf... -- output = ("cbbrowne" "@" "acm.org") http://linuxfinances.info/info/ If you've done six impossible things this morning, why not round it off with breakfast at Milliways, the Restaurant at the End of the Universe?
Gene, Thanks for your comments ! > On our project, when the power is out, we aren't receiving data anyways... just my two cents. I am sorry, but I do not know how I can appropriately answer since I do not understand what you would like to ask me in this sentence. I understand that you have a project with UPS, and when the power failure occurs on UPS, suddenly your system cannot get data. Is my understanding right ? If so, then I agree with you and the reliability and durability of UPS is mandatory for your system. > on a side note, would putting the wal on a tmpfs partition give you something similar? There are two differences between Sigres method and tmpfs with UPS method. 1: XLogWrite Even if you use tmpfs, your system executes XLogWrite which includes write(). Since write() is heavy system call, Sigres would be slightly faster than tmpfs method. 2: XLogWriteLock Even if you use tmpfs, your system lock/release XLogWriteLock while Sigres ignores. Although the frequency of XLogWriteLock accesses is lower than XLogInsertLock, ignoring XLogWriteLock improves the performance especially in many core environment. Best Regards, -- Hideyuki Gene wrote: > I think it would be great to have this kind of functionality built > into postgres (optional and disabled by default of course) I use > postgres mainly for its querying and concurrency features (a 10x > increase in insert/update speed would be phenomenal) I know most > people need 100% data integrity but as Hideyuki pointed out we all > don't need 100%. On our project, when the power is out, we aren't > receiving data anyways... just my two cents. on a side note, would > putting the wal on a tmpfs partition give you something similar? > > On 2/10/07, *Joshua D. Drake* <jd@commandprompt.com > <mailto:jd@commandprompt.com>> wrote: > > Hideyuki Kawashima wrote: > > Joshua, > > :) > > > The reason why I made the Sigres is, the advances of recent non > volatile > > memories. Just now we do not usually use non volatile memories. > But in > > the near future, situation would change. I think if a non volatile > > memories can be considered as a persistence device, PostgreSQL WAL > > mechanism should be modified. > > However, I do not use such devices usually. Thus I made Sigres which > > requires UPS. > > This is actually very interesting. We ( www.commandprompt.com > <http://www.commandprompt.com>) have had > several customers ask us how we can make PostgreSQL more reasonable > within a flash environment. > > I agree with you that in the future you will see many such databases > including PostgreSQL living on these devices. > > Tom? What do you think? Is there some room for movement here > within the > postgresql.conf to make something like sigres usable within PostgreSQL > proper? > > > > > Currently I have just ignored XLogWrite and WALWriteLock, but a > friend > > of mine (a Japanese great hacker of PostgreSQL) has more idea to > improve > > WAL if a battery supplied memory can be considered as a > persistent device. > > > > We are coming up very quickly on a feature freeze for the next version > of PostgreSQL. If... we can has something out quickly enough and in a > thought out fashion, the hackers may be willing to accept a patch for > 8.3.. If not there is always 8.4.. > > Sincerely, > > Joshua D. Drake > > > > > -- > > === The PostgreSQL Company: Command Prompt, Inc. === > Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 > Providing the most comprehensive PostgreSQL solutions since 1997 > http://www.commandprompt.com/ > > Donate to the PostgreSQL Project: > http://www.postgresql.org/about/donate > PostgreSQL Replication: http://www.commandprompt.com/products/ > > > ---------------------------(end of > broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match > > > > > -- > Gene Hart > cell: 443-604-2679
On Feb 10, 2007, at 9:33 PM, Christopher Browne wrote: > The fundamental trouble with this mechanism is that a power outage can > instantly turn a database into crud. > > One may try to mitigate that problem by supporting the memory device > with multiple power supplies *and* multiple UPSes. Ask me about the time a year ago that I had a 24x7 database, with two power supplies connected to two UPSes on independent mains circuits, dropped dead because one UPS was overloaded (more than one server connected to it, apparently too much) and the other UPS was simply dead (undetected zombie UPS), when a catastrophic power failure killed both of the generator backed mains circuits. I wasn't pleased, but it happened nonetheless. A UPS is not a 100% guarantee of anything. They fail more often than they should. No amount of paranoia guarantees uptime. That said, I see plenty of use for loosening restrictions on databases where the contents do not matter and a little loss is acceptable. Cheers, J. Andrew Rogers jrogers@neopolitan.com
Gene <genekhart@gmail.com> writes: > ... just my two cents. on a side note, would putting the wal on a > tmpfs partition give you something similar? Indeed, I'm wondering why one needs to hack the Postgres core to throw away data integrity guarantees; there are plenty of ways to do that already :-(. Hideyuki-san has not explained exactly what integrity assumptions he wants to make or not make. I'm surely willing to listen to supporting a different set of assumptions than we currently use, but I'd like to see a clear explanation of what assumptions are being made and why they represent a useful case. regards, tom lane
Christopher Browne wrote: > kawasima@cs.tsukuba.ac.jp (Hideyuki Kawashima) wrote: >> Joshua, >> >> I appreciate your quick & informative reply. And, I also really >> appreciate your kind comments. Since I have joined this ML 3 hours >> ago, I tried to be polite and slightly nervous. But I was relieved >> by your message. > > Your idea sounds interesting; there is likely to be a considerable > resistance to mechanisms, however, that would be likely to make > PostgreSQL less robust. > > Be aware, also, that in a public forum like this, people are sometimes > less gentle than Joshua. > > The fundamental trouble with this mechanism is that a power outage can > instantly turn a database into crud. Correct, that is certainly a problem. However, I think the interesting opportunity here is in devices that don't typically loose power. A PDA for example. The PostgreSQL footprint is actually quite small, and PDAs are getting larger and larger in capacity. Heck, they even have 32GB SD now. In the near future I believe we can expect to see always on, mini laptops as well. From an deployable application perspective, this could be a big deal. We are already starting to see very large traction in the Win32 desktop app arena. > > One may try to mitigate that problem by supporting the memory device > with multiple power supplies *and* multiple UPSes. > > But there is a not-inconsiderable risk that people will fail to read > warnings, deploy databases in a way that leaves them exposed to total > loss, and then lay blame at this community's feet. I'm sure you can > understand why the community might resist that... I certainly can, but a feature left off by default and documented thoroughly can mitigate a lot of that. Heck if we really wanted to we could even make it a custom build; --with-lazy-wal ;) Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Tom Lane wrote: > Gene <genekhart@gmail.com> writes: >> ... just my two cents. on a side note, would putting the wal on a >> tmpfs partition give you something similar? > > Indeed, I'm wondering why one needs to hack the Postgres core to throw > away data integrity guarantees; there are plenty of ways to do that > already :-(. Under a Linux or FreeBSD environment sure... but what about where a good portion of the memory *is* flash? We have a customer right now where they have a device that has 64 megs of ram and 512 megs of flash. The system itself considers it total ram. They run over the 64 megs almost on boot. Allowing postgresql to be less aggressive on writes would help them quite a bit. > Hideyuki-san has not explained exactly what integrity > assumptions he wants to make or not make. I'm surely willing to listen > to supporting a different set of assumptions than we currently use, but > I'd like to see a clear explanation of what assumptions are being made > and why they represent a useful case. Absolutely agreed there. Sincerely, Joshua D. Drake > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
"Joshua D. Drake" <jd@commandprompt.com> writes: > Tom Lane wrote: >> I'd like to see a clear explanation of what assumptions are being made >> and why they represent a useful case. > Absolutely agreed there. Just to be clear: I believe our current assumptions can be stated as "Postgres will not lose data if the kernel and disk drive do not lose data that they have acknowledged as being successfully fsync'd." This is independent of any questions about Postgres bugs or measures that we take to limit the impact of our bugs --- it's about what our extent of responsibility is. I think that Hideyuki-san is proposing a different contract for data integrity, and I want to understand what that contract is and why someone would want it. regards, tom lane
Christopher, Thanks for your comments and I am sorry for my bad response because of my business. In the first place, criticisms are welcome since without criticisms and discussions, I cannot improve my idea. Christopher, thanks to your comments, I understood why this community does not incline to accept Sigres approach. Reason 1: Sigres method may make PostgreSQL less robust. Reason 2: There is a not-inconsiderable risk that people will fail to read warnings Let me write my opinions for the above. Opinion for Reason 1: Right. Apparently Sigres method makes PostgreSQL less robust. Thus I propose to use Sigres method as an option like fsync=off. In my understanding, fsync=off option is ambiguous. I think unfortunately it is unclear when exactly the option should be used. Even if one uses UPS, the option is dangerous because the final issue_xlog_sync called by XLogWrite called by CreateCheckPoint called by ShutdownXLOG called by bgWriter process at postgres shutdown is not executed. (Sorry for the exact but wired descriptions). Without using UPS, the option never be used. On the other hand, *IF* UPS is reliable, Sigres assures synchronization of wal records to a disk because Sigres executes issue_xlog_fsync at each XLogFileClose() and at shutdown. That is, the question would be expressed as "Can an administrator have a reliable UPS system ?" If an administrator is confident on his/her UPS system, the administrator may select Sigres method. In summary, fsync=off is ambiguous option, and Sigres can be used *IF* an administrator is confident on his/her UPS system. Opinion for Reason 2: Right. I think it is easy to avoid the risk. Sigres should not be used in default such as fsync=off. If my opinion lacks in common sense, I am sorry. Regards, -- Hideyuki Christopher Browne wrote: > kawasima@cs.tsukuba.ac.jp (Hideyuki Kawashima) wrote: > >> Joshua, >> >> I appreciate your quick & informative reply. And, I also really >> appreciate your kind comments. Since I have joined this ML 3 hours >> ago, I tried to be polite and slightly nervous. But I was relieved >> by your message. >> > > Your idea sounds interesting; there is likely to be a considerable > resistance to mechanisms, however, that would be likely to make > PostgreSQL less robust. > > Be aware, also, that in a public forum like this, people are sometimes > less gentle than Joshua. > > The fundamental trouble with this mechanism is that a power outage can > instantly turn a database into crud. > > One may try to mitigate that problem by supporting the memory device > with multiple power supplies *and* multiple UPSes. > > But there is a not-inconsiderable risk that people will fail to read > warnings, deploy databases in a way that leaves them exposed to total > loss, and then lay blame at this community's feet. I'm sure you can > understand why the community might resist that... >
Christopher, Thanks for your advices ! OK, I will submit a patch as soon as possible. In the patch, Sigres will be activated by postgresql.conf. Best Regards, -- Hideyuki A long time ago, in a galaxy far, far away, kawasima@cs.tsukuba.ac.jp (Hideyuki Kawashima) wrote: > > I appreciate your great suggestion! > > It is great honor for me if Sigres will be merged to PostgreSQL. > > Since the changes of Sigres from PostgreSQL-8.2.1 are not many, > > and moreover, all of changes are surrounded with #ifdef SIGRES --- #endif, > > incorporating Sigres into PostgreSQL would be easy. You should consider submitting a patch for this against CVS HEAD. And actually, I'd think it a better idea to define a GUC variable and use that to control whether Sigres is active or not. At the more sophisticated end of the spectrum, you might set things up so that it could be activated/deactivated at runtime by a superuser. At the less sophisticated end, it might need to be configured in postgresql.conf... -- output = ("cbbrowne" "@" "acm.org") http://linuxfinances.info/info/ If you've done six impossible things this morning, why not round it off with breakfast at Milliways, the Restaurant at the End of the Universe? ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Andrew, Your experience is really informative. Thanks for giving me such a really precious information. Since I am a researcher, I rarely faces on real troubles. Andrew, I agree with you. UPS is not reliable all the time. On the other hand, however, disks or raids are also not reliable all the time. Moreover, on the earth, there is NO device which completely assures durability of the ACID transaction property. Even if disks and power supplies are perfect, the system would be destroyed by an earthquake (thus, disaster recovery techniques are recently popular). Thus I think the problem is not in a device. It is in the management of system administrators. (However, this is just my opinion and it is not objective one required by large questionnaire). On the other hand, speed is required. My first motivation of this work is a friend of mine who is working at a financial firm. He strongly required performance of UPDATE/INSERT on the database servers in his company (the database was a major commercial product). I think in such a rich environment, strong power management facilities may be expected though this is just my one thought). In summary, I think the problem lies in system management rather than device itself. Regards, -- Hideyuki J. Andrew Rogers wrote: > > On Feb 10, 2007, at 9:33 PM, Christopher Browne wrote: >> The fundamental trouble with this mechanism is that a power outage can >> instantly turn a database into crud. >> >> One may try to mitigate that problem by supporting the memory device >> with multiple power supplies *and* multiple UPSes. > > > Ask me about the time a year ago that I had a 24x7 database, with two power supplies connected to two UPSes on independentmains circuits, dropped dead because one UPS was overloaded (more than one server connected to it, apparentlytoo much) and the other UPS was simply dead (undetected zombie UPS), when a catastrophic power failure killed bothof the generator backed mains circuits. > > I wasn't pleased, but it happened nonetheless. A UPS is not a 100% guarantee of anything. They fail more often than theyshould. No amount of paranoia guarantees uptime. > > That said, I see plenty of use for loosening restrictions on databases where the contents do not matter and a little lossis acceptable. > > Cheers, > > J. Andrew Rogers > jrogers@neopolitan.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > >
Tom, Thanks for your comments ! Let me answer to your beneficial mail. > Indeed, I'm wondering why one needs to hack the Postgres core to throw > away data integrity guarantees; The motivation of my work is accelerating INSERT/UPDATE without losing data integrity guarantees. In the first place, I thought *IF* a battery supplied memory can be considered as a persistent device, Sigres method works. However, my assumption seems not to be accepted unfortunately. Without sending my first email, I could not know your informative comments. Thank you ! > there are plenty of ways to do that > already :-(. Could you please let me know or give me some pointers for the important information ? For ubiquitous computing researchers or sensor data processing researchers, that is a great concern. Since INSERT/UPDATE is too heavy, several universities have developed stream processing engines (SPE) which do not store any data. Also I have already published more than 4 international conference papers in this topic. Moreover, I have been developing a brand-new DBMS from scratch in my research project. I am all ears ! > Hideyuki-san has not explained exactly what integrity > assumptions he wants to make or not make. I'm surely willing to listen > to supporting a different set of assumptions than we currently use, but > I'd like to see a clear explanation of what assumptions are being made > and why they represent a useful case. In the first place, I thought Sigres method works *IF* UPSs are reliable. This was my assumption. However, I am in the process of understanding that unfortunately UPSs may not be reliable usually through this discussion :-( On the other hand, however, there not negligible people who hopes to dramatically accelerate INSERT/UPDATE such as financial companies or sensor data processing research groups such as our team. For example, In our team, the default postgresql is too slow and thus sensor data insertions delays over an hour. The purpose of our team is detecting real-world events in real-time, but it does not work because of bad performance. In such a case, Sigres is useful if I can believe UPS. ..... I am not confident whether my explanation is clear or not. If unclear, sorry. Best Regards, -- Hideyuki Tom Lane wrote: > Gene <genekhart@gmail.com> writes: >> ... just my two cents. on a side note, would putting the wal on a >> tmpfs partition give you something similar? > > Indeed, I'm wondering why one needs to hack the Postgres core to throw > away data integrity guarantees; there are plenty of ways to do that > already :-(. Hideyuki-san has not explained exactly what integrity > assumptions he wants to make or not make. I'm surely willing to listen > to supporting a different set of assumptions than we currently use, but > I'd like to see a clear explanation of what assumptions are being made > and why they represent a useful case. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate > > --
Tom and all, I apologize destroying the thread information with this topic. Unfortunately my basic smtp server does not work now and I am writing all the responses via gmail tonight... Tom, Thanks for teaching me about the development assumption of PostgreSQL. The assumption and my direction are different since the assumption considers only disk drive as persistent device while my assumption is battery-supplied memory. Honestly speaking, I am not sure whether there is a person who accepts my assumption (i.e. battery-supplied memory as persistent device) or not. And I am not sure whether my approach can be integrated into PostgreSQL since some accept and some reject. So, anyway I will write a patch and submit. Then I leave to this community the decision of accept/reject. If someone has interest on Sigres, please download it from http://sourceforge.jp/projects/sigres/ and try it. I will continue to accelerate Sigres more anyway under my assumption. Since I believe "time always wins in transaction processing" as Jim Gray told last June in ACM SIGMOD 2006 keynote talk, I wish UPS will be reliable or nice non volatile memories such as MRAM will appear in the near future. Finally, this discussion was really beneficial for me. I would like to say thank you for everyone who gave me great information. Thank you all ! Best Regards, -- Hideyuki Tom Lane wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: >> Tom Lane wrote: >>> I'd like to see a clear explanation of what assumptions are being made >>> and why they represent a useful case. > >> Absolutely agreed there. > > Just to be clear: I believe our current assumptions can be stated as > "Postgres will not lose data if the kernel and disk drive do not lose > data that they have acknowledged as being successfully fsync'd." > This is independent of any questions about Postgres bugs or measures > that we take to limit the impact of our bugs --- it's about what our > extent of responsibility is. I think that Hideyuki-san is proposing > a different contract for data integrity, and I want to understand what > that contract is and why someone would want it. > > regards, tom lane >
Christopher Browne wrote: > kawasima@cs.tsukuba.ac.jp (Hideyuki Kawashima) wrote: >> Joshua, >> >> I appreciate your quick & informative reply. And, I also really >> appreciate your kind comments. Since I have joined this ML 3 hours >> ago, I tried to be polite and slightly nervous. But I was relieved >> by your message. > > Your idea sounds interesting; there is likely to be a considerable > resistance to mechanisms, however, that would be likely to make > PostgreSQL less robust. > > Be aware, also, that in a public forum like this, people are sometimes > less gentle than Joshua. > > The fundamental trouble with this mechanism is that a power outage can > instantly turn a database into crud. I can think of a few places where I don't care about the data if the power is lost - * Web-based session data A lot of web sites have separate session-only databases. If the database goes down, we have to truncate the tables anyways when it comes back up. * Reporting slaves We have replication slaves setup for internal (staff-only) reporting. Often a lot of temp and summary tables as well. If the data is lost, don't care. Its a reporting database. Re-syncing from another slave is no biggie for total data loss. Less a concern given the speed increase of the data it creates as well as data coming in from the master. * Front-end cache slaves Same type of situation as the reporting slaves. Basic front-end cache that replicates data to take load off the master. The slaves still have to do all the same insert/updates, but this means they'll spend less time in locks. They crash, point the apps to the master or another slave while you fix it. Weslee
Tom Lane wrote: > Gene <genekhart@gmail.com> writes: > > ... just my two cents. on a side note, would putting the wal on a > > tmpfs partition give you something similar? > > Indeed, I'm wondering why one needs to hack the Postgres core to throw > away data integrity guarantees; there are plenty of ways to do that > already :-(. Hideyuki-san has not explained exactly what integrity > assumptions he wants to make or not make. I'm surely willing to listen > to supporting a different set of assumptions than we currently use, but > I'd like to see a clear explanation of what assumptions are being made > and why they represent a useful case. I am unsure why Sigres is significantly faster than just creating a file system on the permanent memory and putting xlog on there. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
I was curious to see how postgres would perform with wal on a tmpfs vs disk here are some numbers I got from pgbench. Letme know if I did something stupid, this is the first time I've used pgbench. The wal on tmpfs method is not significantlyfaster. <br /><br />[[ WAL ON TMPFS ]]<br />pgbench -i -s 10 -U postgres -d benchmark<br />...<br />pgbench-Upostgres -s 10 -c 10 -t 10000 benchmark<br />starting vacuum...end.<br />transaction type: TPC-B (sort of)<br/>scaling factor: 10<br />number of clients: 10 <br />number of transactions per client: 10000<br />number of transactionsactually processed: 100000/100000<br />tps = 5817.693724 (including connections establishing)<br />tps = 5825.646441(excluding connections establishing)<br /><br />[[ WAL ON EXT2 14 U320 DRIVE RAID10 WITH BBU (same as data) ]]<br/>pgbench -Upostgres -s 10 -c 10 -t 10000 benchmark<br />starting vacuum...end.<br />transaction type: TPC-B (sort of)<br/>scaling factor: 10<br />number of clients: 10 <br />number of transactions per client: 10000<br />number of transactionsactually processed: 100000/100000<br />tps = 5653.187997 (including connections establishing)<br />tps = 5660.554438(excluding connections establishing)<br /><br />pgbench -Upostgres -s 100 -c 10 -t 10000 benchmark<br />startingvacuum...end.<br />transaction type: TPC-B (sort of)<br />scaling factor: 100<br />number of clients: 10<br />numberof transactions per client: 10000<br />number of transactions actually processed: 100000/100000 <br />tps = 5536.019864(including connections establishing)<br />tps = 5543.834350 (excluding connections establishing)<br /><br />
On 2/12/07, Gene <genekhart@gmail.com> wrote:
tmpfs uses virtual memory, so it is likely hitting the disk. I would try and repeat your cases with ramfs, which uses real memory. Testing shows that just putting pg_xlog on ramfs yields a 30% boost in performance over disk with a stock configuration. So there is room for improvement, I am sure.
I was curious to see how postgres would perform with wal on a tmpfs vs disk here are some numbers I got from pgbench. Let me know if I did something stupid, this is the first time I've used pgbench. The wal on tmpfs method is not significantly faster.
tmpfs uses virtual memory, so it is likely hitting the disk. I would try and repeat your cases with ramfs, which uses real memory. Testing shows that just putting pg_xlog on ramfs yields a 30% boost in performance over disk with a stock configuration. So there is room for improvement, I am sure.
Gene <genekhart@gmail.com> writes: > I was curious to see how postgres would perform with wal on a tmpfs vs disk > here are some numbers I got from pgbench. Let me know if I did something > stupid, this is the first time I've used pgbench. The wal on tmpfs method is > not significantly faster. This comparison is not very useful because you were using battery-backed write cache, which gives pretty much all the performance improvement that is to be looked for in this area. Try it against a plain vanilla disk drive (that's not lying about write complete) and you'll find the maximum TPS rate is closely related to the disk's rotation rate. At the same time though, the existence of BBWC solutions makes me wonder why we need another. regards, tom lane
Bruce, Thanks for your comments, and let me answer to your question. Sigres is *not* significantly faster than just creating a file system on the permanent memory and putting xlog on there. Sigres is slightly faster than the case because each backend does not call XLogWrite while bgWriter does. -- Hideyuki Bruce Momjian wrote: > Tom Lane wrote: > >> Gene <genekhart@gmail.com> writes: >> >>> ... just my two cents. on a side note, would putting the wal on a >>> tmpfs partition give you something similar? >>> >> Indeed, I'm wondering why one needs to hack the Postgres core to throw >> away data integrity guarantees; there are plenty of ways to do that >> already :-(. Hideyuki-san has not explained exactly what integrity >> assumptions he wants to make or not make. I'm surely willing to listen >> to supporting a different set of assumptions than we currently use, but >> I'd like to see a clear explanation of what assumptions are being made >> and why they represent a useful case. >> > > I am unsure why Sigres is significantly faster than just creating a file > system on the permanent memory and putting xlog on there. > >
Hideyuki Kawashima wrote: > Bruce, > > Thanks for your comments, and let me answer to your question. > Sigres is *not* significantly faster than just creating a file system on > the permanent memory and putting xlog on there. > Sigres is slightly faster than the case because each backend does not > call XLogWrite while bgWriter does. The question then is how much faster is it, and is that worth adding extra code to improve it. --------------------------------------------------------------------------- > > -- Hideyuki > > Bruce Momjian wrote: > > Tom Lane wrote: > > > >> Gene <genekhart@gmail.com> writes: > >> > >>> ... just my two cents. on a side note, would putting the wal on a > >>> tmpfs partition give you something similar? > >>> > >> Indeed, I'm wondering why one needs to hack the Postgres core to throw > >> away data integrity guarantees; there are plenty of ways to do that > >> already :-(. Hideyuki-san has not explained exactly what integrity > >> assumptions he wants to make or not make. I'm surely willing to listen > >> to supporting a different set of assumptions than we currently use, but > >> I'd like to see a clear explanation of what assumptions are being made > >> and why they represent a useful case. > >> > > > > I am unsure why Sigres is significantly faster than just creating a file > > system on the permanent memory and putting xlog on there. > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Gene, I think tmpfs/ramfs method would increase the performance of PostgreSQL when workload is write-intensive. Does pgbench issue write intensive queries ? Since I do not know the detailed queries issued in pgbench, I am happy if you teach me. Regards, -- Hideyuki Gene wrote: > I was curious to see how postgres would perform with wal on a tmpfs vs > disk here are some numbers I got from pgbench. Let me know if I did > something stupid, this is the first time I've used pgbench. The wal on > tmpfs method is not significantly faster. > > [[ WAL ON TMPFS ]] > pgbench -i -s 10 -U postgres -d benchmark > ... > pgbench -Upostgres -s 10 -c 10 -t 10000 benchmark > starting vacuum...end. > transaction type: TPC-B (sort of) > scaling factor: 10 > number of clients: 10 > number of transactions per client: 10000 > number of transactions actually processed: 100000/100000 > tps = 5817.693724 (including connections establishing) > tps = 5825.646441 (excluding connections establishing) > > [[ WAL ON EXT2 14 U320 DRIVE RAID10 WITH BBU (same as data) ]] > pgbench -Upostgres -s 10 -c 10 -t 10000 benchmark > starting vacuum...end. > transaction type: TPC-B (sort of) > scaling factor: 10 > number of clients: 10 > number of transactions per client: 10000 > number of transactions actually processed: 100000/100000 > tps = 5653.187997 (including connections establishing) > tps = 5660.554438 (excluding connections establishing) > > pgbench -Upostgres -s 100 -c 10 -t 10000 benchmark > starting vacuum...end. > transaction type: TPC-B (sort of) > scaling factor: 100 > number of clients: 10 > number of transactions per client: 10000 > number of transactions actually processed: 100000/100000 > tps = 5536.019864 (including connections establishing) > tps = 5543.834350 (excluding connections establishing) >
Bruce, Let me answer to your question. Question 1: How much faster is it ? To answer to your question, I measured execution times of INSERT operations on concurrent accesses on dual-core x 2 CPU (each 2.80GHz), In the experiment, each client issues 5000 INSERT operations concurrently. The following shows the average times (seconds) of 5 measurements. #Cli: The number of concurrent clients P-D: PostgreSQL Default (i.e. usual file system) P-T: PostgreSQL tmpfs S-D: Sigres Default (i.e. usual file system) S-T: Sigres tmpfs P-T/S-T Improve ratio of S-T to P-T The result shows S-T is 10% to 18% faster than P-T. Thus my answer to your question is 10% to 18% when concurrency is from1 to 100. #Cli P-D P-T S-D S-T P-T/S-T 1 1.72 0.50 0.46 0.45 1.10 2 2.87 0.62 0.58 0.54 1.15 3 3.08 0.89 0.77 0.77 1.15 4 3.14 0.98 0.86 0.84 1.16 5 3.31 1.23 1.09 1.07 1.15 6 3.57 1.44 1.31 1.27 1.14 7 3.91 1.68 1.51 1.48 1.14 8 4.49 1.89 1.71 1.67 1.13 9 4.78 2.21 1.9 1.92 1.15 10 5.33 2.47 2.22 2.14 1.15 20 11.50 5.66 5.16 4.86 1.16 50 32.96 16.54 14.92 13.97 1.18100 79.60 43.71 39.55 38.38 1.14 Question 2: Is that worth adding extra code to improve it ? Yes, I think it is worth. It is because in the case of commercial DBMS, only 5% improvement is achieved with version-up. BTW, I and a friend of mine try to design & implement a parallel access way to the wal buffer on a shared memory. I think this is promising direction since WALInsertLock is more frequently issued than WALWriteLock, and the number of CPU-cores will increase definitely. -- Hideyuki Bruce Momjian wrote: > Hideyuki Kawashima wrote: > >> Bruce, >> >> Thanks for your comments, and let me answer to your question. >> Sigres is *not* significantly faster than just creating a file system on >> the permanent memory and putting xlog on there. >> Sigres is slightly faster than the case because each backend does not >> call XLogWrite while bgWriter does. >> > > The question then is how much faster is it, and is that worth adding > extra code to improve it. > > --------------------------------------------------------------------------- > > > >> -- Hideyuki >> >> Bruce Momjian wrote: >> >>> Tom Lane wrote: >>> >>> >>>> Gene <genekhart@gmail.com> writes: >>>> >>>> >>>>> ... just my two cents. on a side note, would putting the wal on a >>>>> tmpfs partition give you something similar? >>>>> >>>>> >>>> Indeed, I'm wondering why one needs to hack the Postgres core to throw >>>> away data integrity guarantees; there are plenty of ways to do that >>>> already :-(. Hideyuki-san has not explained exactly what integrity >>>> assumptions he wants to make or not make. I'm surely willing to listen >>>> to supporting a different set of assumptions than we currently use, but >>>> I'd like to see a clear explanation of what assumptions are being made >>>> and why they represent a useful case. >>>> >>>> >>> I am unsure why Sigres is significantly faster than just creating a file >>> system on the permanent memory and putting xlog on there. >>> >>> >>> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 1: if posting/reading through Usenet, please send an appropriate >> subscribe-nomail command to majordomo@postgresql.org so that your >> message can get through to the mailing list cleanly >> > > -- Hideyuki Kawashima (Ph.D), University of Tsukuba, Graduate School of Systems and Information Engineering Assistant Professor, TEL: +81-29-853-5322
Tom, In my experience, in last April, a BBWC solution did not accelerate PostgreSQL well. The device which I tried was i-ram by gigabyte (http://techreport.com/reviews/2006q1/gigabyte-iram/index.x?pg=1 ). The i-ram showed only a little performance improvement compared to PostgreSQL with fsync to disk. (However, in then case of PostgreSQL fsync=off, the performance improvement was great). Thus I think Sigres is better than BBWC, to the best of my knowledge. However, I do not know other BBWC technologies such as HP smart array E200 controller. (http://h18004.www1.hp.com/products/servers/proliantstorage/arraycontrollers/smartarraye200/index.html) So, I am sorry if I describe wrong conclusion. Best Regards, -- Hideyuki Tom Lane wrote: > Gene <genekhart@gmail.com> writes: > >> I was curious to see how postgres would perform with wal on a tmpfs vs disk >> here are some numbers I got from pgbench. Let me know if I did something >> stupid, this is the first time I've used pgbench. The wal on tmpfs method is >> not significantly faster. >> > > This comparison is not very useful because you were using battery-backed > write cache, which gives pretty much all the performance improvement > that is to be looked for in this area. Try it against a plain vanilla > disk drive (that's not lying about write complete) and you'll find the > maximum TPS rate is closely related to the disk's rotation rate. > > At the same time though, the existence of BBWC solutions makes me wonder > why we need another. > > regards, tom lane > > > -- Hideyuki Kawashima (Ph.D), University of Tsukuba, Graduate School of Systems and Information Engineering Assistant Professor, TEL: +81-29-853-5322
Joshua, I revised. Now Sigres can be activated by setting "sigres = on" in postgresql.conf. You can download the version (0.1.2) from http://sourceforge.jp/projects/sigres . And, I attach the diff between PostgreSQL-8.2.1 and Sigres-0.1.2 to this mail. Thanks for your comments. -- Hideyuki Joshua D. Drake wrote: > Hideyuki Kawashima wrote: > >> Joshua, >> >> I appreciate your great suggestion! >> It is great honor for me if Sigres will be merged to PostgreSQL. >> Since the changes of Sigres from PostgreSQL-8.2.1 are not many, >> and moreover, all of changes are surrounded with #ifdef SIGRES --- #endif, >> incorporating Sigres into PostgreSQL would be easy. >> > > The best way is to create a patch against -head and submit that patch > with a complete description of why, and what. If you have test cases > that show the improvement all the better. > > I would suggest though if you are going to submit the patch that you > take a look at how you could disable/enable the feature within the > postgresql.conf via a guc. > > Sincerely, > > Joshua D. Drake > > >> However, Sigres modifies WAL which is the most important point of DBMS >> on stability. >> Although I myself could not find any bugs in Sigres, I am really afraid >> of it. It a bug exists on Sigres, it puts everyone to huge >> inconvenience... Therefore, before incorporating Sigres into PostgreSQL, >> the code must be checked, and the behaviors of Sigres must be checked >> carefully. Since PostgreSQL is a famous and wide spread software, I >> strongly want to avoid losing its great reputation. Unfortunately in >> Japan, I do not know any WAL hackers except for a friend of mine, and he >> is too busy to check Sigres. So, if pgsql-hackers checks Sigres, I am >> really happy. >> >> Best Regards, >> >> -- Hideyuki >> >> Joshua D. Drake wrote: >> >>> Hideyuki Kawashima wrote: >>> >>> >>>> Joshua, >>>> >>>> >>> :) >>> >>> >>> >>>> The reason why I made the Sigres is, the advances of recent non volatile >>>> memories. Just now we do not usually use non volatile memories. But in >>>> the near future, situation would change. I think if a non volatile >>>> memories can be considered as a persistence device, PostgreSQL WAL >>>> mechanism should be modified. >>>> However, I do not use such devices usually. Thus I made Sigres which >>>> requires UPS. >>>> >>>> >>> This is actually very interesting. We (www.commandprompt.com) have had >>> several customers ask us how we can make PostgreSQL more reasonable >>> within a flash environment. >>> >>> I agree with you that in the future you will see many such databases >>> including PostgreSQL living on these devices. >>> >>> Tom? What do you think? Is there some room for movement here within the >>> postgresql.conf to make something like sigres usable within PostgreSQL >>> proper? >>> >>> >>> >>>> Currently I have just ignored XLogWrite and WALWriteLock, but a friend >>>> of mine (a Japanese great hacker of PostgreSQL) has more idea to improve >>>> WAL if a battery supplied memory can be considered as a persistent device. >>>> >>>> >>>> >>> We are coming up very quickly on a feature freeze for the next version >>> of PostgreSQL. If... we can has something out quickly enough and in a >>> thought out fashion, the hackers may be willing to accept a patch for >>> 8.3.. If not there is always 8.4.. >>> >>> Sincerely, >>> >>> Joshua D. Drake >>> >>> >>> >>> >>> >>> > > > -- Hideyuki Kawashima (Ph.D), University of Tsukuba, Graduate School of Systems and Information Engineering Assistant Professor, TEL: +81-29-853-5322 Only in sigres-0.1.2: CHANGELOG Only in postgresql-8.2.1: README Only in sigres-0.1.2: README.pgsql Only in sigres-0.1.2: README.sigres diff -c -r postgresql-8.2.1/src/backend/access/transam/xlog.c sigres-0.1.2/src/backend/access/transam/xlog.c *** postgresql-8.2.1/src/backend/access/transam/xlog.c Fri Dec 1 03:29:11 2006 --- sigres-0.1.2/src/backend/access/transam/xlog.c Fri Feb 16 09:45:05 2007 *************** *** 10,15 **** --- 10,18 ---- * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.258 2006/11/30 18:29:11 tgl Exp $ * *------------------------------------------------------------------------- + * + * Extended to Sigres by Hideyuki Kawashima (kawasima@cs.tsukuba.ac.jp) + * */ #include "postgres.h" *************** *** 142,147 **** --- 145,154 ---- bool XLOG_DEBUG = false; #endif + /* Entities are in "globals.c", for SIGRES */ + extern int BgWriterPid; + extern bool enableSigres; + /* * XLOGfileslop is used in the code as the allowed "fuzz" in the number of * preallocated XLOG segments --- we try to have at least XLOGfiles advance *************** *** 474,480 **** static bool XLogCheckBuffer(XLogRecData *rdata, bool doPageWrites, XLogRecPtr *lsn, BkpBlock *bkpb); static bool AdvanceXLInsertBuffer(bool new_segment); ! static void XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch); static int XLogFileInit(uint32 log, uint32 seg, bool *use_existent, bool use_lock); static bool InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath, --- 481,487 ---- static bool XLogCheckBuffer(XLogRecData *rdata, bool doPageWrites, XLogRecPtr *lsn, BkpBlock *bkpb); static bool AdvanceXLInsertBuffer(bool new_segment); ! static void _XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch); static int XLogFileInit(uint32 log, uint32 seg, bool *use_existent, bool use_lock); static bool InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath, *************** *** 510,515 **** --- 517,529 ---- XLogRecPtr *minRecoveryLoc); static void rm_redo_error_callback(void *arg); + #define XLogWrite(WriteRqst, flexible, xlog_switch) \ + do { \ + if (enableSigres == false) \ + _XLogWrite(WriteRqst, flexible, xlog_switch); \ + else if ((IsUnderPostmaster && BgWriterPid == MyProcPid) || !IsUnderPostmaster) \ + _XLogWrite(WriteRqst, flexible, xlog_switch); \ + } while (0) /* * Insert an XLOG record having the specified RMID and info bytes, *************** *** 727,746 **** (LogwrtRqst.Write.xrecoff >= LogwrtResult.Write.xrecoff + XLogCtl->XLogCacheByte / 2)) { ! if (LWLockConditionalAcquire(WALWriteLock, LW_EXCLUSIVE)) ! { ! /* ! * Since the amount of data we write here is completely optional ! * anyway, tell XLogWrite it can be "flexible" and stop at a ! * convenient boundary. This allows writes triggered by this ! * mechanism to synchronize with the cache boundaries, so that in ! * a long transaction we'll basically dump alternating halves of ! * the buffer array. ! */ ! LogwrtResult = XLogCtl->Write.LogwrtResult; ! if (XLByteLT(LogwrtResult.Write, LogwrtRqst.Write)) ! XLogWrite(LogwrtRqst, true, false); ! LWLockRelease(WALWriteLock); } } --- 741,762 ---- (LogwrtRqst.Write.xrecoff >= LogwrtResult.Write.xrecoff + XLogCtl->XLogCacheByte / 2)) { ! if (enableSigres == false) { ! if (LWLockConditionalAcquire(WALWriteLock, LW_EXCLUSIVE)) ! { ! /* ! * Since the amount of data we write here is completely optional ! * anyway, tell XLogWrite it can be "flexible" and stop at a ! * convenient boundary. This allows writes triggered by this ! * mechanism to synchronize with the cache boundaries, so that in ! * a long transaction we'll basically dump alternating halves of ! * the buffer array. ! */ ! LogwrtResult = XLogCtl->Write.LogwrtResult; ! if (XLByteLT(LogwrtResult.Write, LogwrtRqst.Write)) ! XLogWrite(LogwrtRqst, true, false); ! LWLockRelease(WALWriteLock); ! } } } *************** *** 890,897 **** RecPtr.xlogid -= 1; RecPtr.xrecoff = XLogFileSize; } ! ! LWLockAcquire(WALWriteLock, LW_EXCLUSIVE); LogwrtResult = XLogCtl->Write.LogwrtResult; if (!XLByteLE(RecPtr, LogwrtResult.Flush)) { --- 906,913 ---- RecPtr.xlogid -= 1; RecPtr.xrecoff = XLogFileSize; } ! if (enableSigres == false) ! LWLockAcquire(WALWriteLock, LW_EXCLUSIVE); LogwrtResult = XLogCtl->Write.LogwrtResult; if (!XLByteLE(RecPtr, LogwrtResult.Flush)) { *************** *** 901,908 **** FlushRqst.Flush = RecPtr; XLogWrite(FlushRqst, false, false); } ! LWLockRelease(WALWriteLock); ! END_CRIT_SECTION(); return RecPtr; --- 917,924 ---- FlushRqst.Flush = RecPtr; XLogWrite(FlushRqst, false, false); } ! if (enableSigres == false) ! LWLockRelease(WALWriteLock); END_CRIT_SECTION(); return RecPtr; *************** *** 1016,1024 **** XLogCtlWrite *Write = &XLogCtl->Write; XLogwrtRqst FlushRqst; XLogRecPtr OldSegEnd; ! ! LWLockAcquire(WALWriteLock, LW_EXCLUSIVE); ! /* * Flush through the end of the page containing XLOG_SWITCH, and * perform end-of-segment actions (eg, notifying archiver). --- 1032,1039 ---- XLogCtlWrite *Write = &XLogCtl->Write; XLogwrtRqst FlushRqst; XLogRecPtr OldSegEnd; ! if (enableSigres == false) ! LWLockAcquire(WALWriteLock, LW_EXCLUSIVE); /* * Flush through the end of the page containing XLOG_SWITCH, and * perform end-of-segment actions (eg, notifying archiver). *************** *** 1067,1075 **** } Write->LogwrtResult = LogwrtResult; ! ! LWLockRelease(WALWriteLock); ! updrqst = false; /* done already */ } else --- 1082,1089 ---- } Write->LogwrtResult = LogwrtResult; ! if (enableSigres == false) ! LWLockRelease(WALWriteLock); updrqst = false; /* done already */ } else *************** *** 1376,1382 **** WriteRqst.Write = OldPageRqstPtr; WriteRqst.Flush.xlogid = 0; WriteRqst.Flush.xrecoff = 0; ! XLogWrite(WriteRqst, false, false); LWLockRelease(WALWriteLock); Insert->LogwrtResult = LogwrtResult; } --- 1390,1396 ---- WriteRqst.Write = OldPageRqstPtr; WriteRqst.Flush.xlogid = 0; WriteRqst.Flush.xrecoff = 0; ! _XLogWrite(WriteRqst, false, false); /* SIGRES */ LWLockRelease(WALWriteLock); Insert->LogwrtResult = LogwrtResult; } *************** *** 1463,1469 **** * Must be called with WALWriteLock held. */ static void ! XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch) { XLogCtlWrite *Write = &XLogCtl->Write; bool ispartialpage; --- 1477,1483 ---- * Must be called with WALWriteLock held. */ static void ! _XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch) { XLogCtlWrite *Write = &XLogCtl->Write; bool ispartialpage; *************** *** 1581,1587 **** if (XLOG_DEBUG) elog(LOG, "time for a checkpoint, signaling bgwriter"); #endif ! RequestCheckpoint(false, true); } } } --- 1595,1602 ---- if (XLOG_DEBUG) elog(LOG, "time for a checkpoint, signaling bgwriter"); #endif ! if (enableSigres == false) ! RequestCheckpoint(false, true); } } } *************** *** 1861,1871 **** * section. However, calls from bufmgr.c are not within critical sections * and so we will not force a restart for a bad LSN on a data page. */ ! if (XLByteLT(LogwrtResult.Flush, record)) ! elog(InRecovery ? WARNING : ERROR, ! "xlog flush request %X/%X is not satisfied --- flushed only to %X/%X", ! record.xlogid, record.xrecoff, ! LogwrtResult.Flush.xlogid, LogwrtResult.Flush.xrecoff); } /* --- 1876,1887 ---- * section. However, calls from bufmgr.c are not within critical sections * and so we will not force a restart for a bad LSN on a data page. */ ! if (enableSigres == false) ! if (XLByteLT(LogwrtResult.Flush, record)) ! elog(InRecovery ? WARNING : ERROR, ! "xlog flush request %X/%X is not satisfied --- flushed only to %X/%X", ! record.xlogid, record.xrecoff, ! LogwrtResult.Flush.xlogid, LogwrtResult.Flush.xrecoff); } /* *************** *** 2342,2347 **** --- 2358,2366 ---- #endif #endif /* NOT_USED */ + if (enableSigres == true) + issue_xlog_fsync(); + if (close(openLogFile)) ereport(PANIC, (errcode_for_file_access(), *************** *** 5420,5425 **** --- 5439,5447 ---- void ShutdownXLOG(int code, Datum arg) { + if (enableSigres == true) + enableFsync = true; + ereport(LOG, (errmsg("shutting down"))); *************** *** 5428,5433 **** --- 5450,5458 ---- ShutdownSUBTRANS(); ShutdownMultiXact(); + if (enableSigres == true) + XLogFileClose(); + ereport(LOG, (errmsg("database system is shut down"))); } *************** *** 6183,6190 **** * have different checkpoint positions and hence different history * file names, even if nothing happened in between. */ ! RequestCheckpoint(true, false); ! /* * Now we need to fetch the checkpoint record location, and also its * REDO pointer. The oldest point in WAL that would be needed to --- 6208,6215 ---- * have different checkpoint positions and hence different history * file names, even if nothing happened in between. */ ! if (enableSigres == false) ! RequestCheckpoint(true, false); /* * Now we need to fetch the checkpoint record location, and also its * REDO pointer. The oldest point in WAL that would be needed to diff -c -r postgresql-8.2.1/src/backend/commands/dbcommands.c sigres-0.1.2/src/backend/commands/dbcommands.c *** postgresql-8.2.1/src/backend/commands/dbcommands.c Mon Nov 6 07:42:08 2006 --- sigres-0.1.2/src/backend/commands/dbcommands.c Fri Feb 16 10:35:01 2007 *************** *** 59,64 **** --- 59,66 ---- static void remove_dbtablespaces(Oid db_id); static bool check_db_file_conflict(Oid db_id); + /* For Sigres */ + extern bool enableSigres; /* * CREATE DATABASE *************** *** 504,510 **** * Perhaps if we ever implement CREATE DATABASE in a less cheesy way, * we can avoid this. */ ! RequestCheckpoint(true, false); /* * Close pg_database, but keep lock till commit (this is important to --- 506,513 ---- * Perhaps if we ever implement CREATE DATABASE in a less cheesy way, * we can avoid this. */ ! if (enableSigres == false) ! RequestCheckpoint(true, false); /* * Close pg_database, but keep lock till commit (this is important to diff -c -r postgresql-8.2.1/src/backend/postmaster/bgwriter.c sigres-0.1.2/src/backend/postmaster/bgwriter.c *** postgresql-8.2.1/src/backend/postmaster/bgwriter.c Sat Dec 2 04:55:28 2006 --- sigres-0.1.2/src/backend/postmaster/bgwriter.c Fri Feb 16 11:18:06 2007 *************** *** 42,48 **** *------------------------------------------------------------------------- */ #include "postgres.h" - #include <signal.h> #include <time.h> #include <unistd.h> --- 42,47 ---- *************** *** 63,68 **** --- 62,75 ---- #include "utils/memutils.h" #include "utils/resowner.h" + #include "debug.h" + #include <sys/types.h> + #include <sys/stat.h> + #include <fcntl.h> + + /* For Sigres */ + extern int BgWriterPid; /* Entity is in "globals.c" */ + extern bool enableSigres; /* Entity is in "globals.c" */ /*---------- * Shared memory area for communication between bgwriter and backends *************** *** 171,176 **** --- 178,193 ---- BgWriterShmem->bgwriter_pid = MyProcPid; am_bg_writer = true; + if (enableSigres == true) + { + int fd; + + if ((fd = open(BGWRITER_PIDFILE, O_WRONLY|O_TRUNC|O_CREAT, 0644)) == -1) ERR; + if (write(fd, &MyProcPid, sizeof(int)) == -1) ERR; + if (close(fd) == -1) ERR; + BgWriterPid = MyProcPid; + } + /* * If possible, make this process a group leader, so that the postmaster * can signal any child processes too. (bgwriter probably never has *************** *** 321,326 **** --- 338,379 ---- */ PG_SETMASK(&UnBlockSig); + if (enableSigres == true) { + for (;;) { + if (!PostmasterIsAlive(true)) + exit(1); + + /* + * Process any requests or signals received recently. + */ + AbsorbFsyncRequests(); + + if (got_SIGHUP) + { + got_SIGHUP = false; + ProcessConfigFile(PGC_SIGHUP); + } + if (shutdown_requested) + { + /* + * From here on, elog(ERROR) should end with exit(1), not send + * control back to the sigsetjmp block above + */ + ExitOnAnyError = true; + /* Close down the database */ + ShutdownXLOG(0, 0); + DumpFreeSpaceMap(0, 0); + + remove(BGWRITER_PIDFILE); /* SIGRES */ + + /* Normal exit from the bgwriter is here */ + proc_exit(0); /* done */ + } + + usleep(100000L); + } + } + /* * Loop forever */ *************** *** 506,512 **** } } - /* -------------------------------- * signal handler routines * -------------------------------- --- 559,564 ---- diff -c -r postgresql-8.2.1/src/backend/postmaster/postmaster.c sigres-0.1.2/src/backend/postmaster/postmaster.c *** postgresql-8.2.1/src/backend/postmaster/postmaster.c Thu Jan 4 09:58:01 2007 --- sigres-0.1.2/src/backend/postmaster/postmaster.c Fri Feb 16 09:48:10 2007 *************** *** 122,127 **** --- 122,129 ---- #include "storage/spin.h" #endif + /* For Sigres */ + extern bool enableSigres; /* Entity is in "globals.c" */ /* * List of active backends (or child processes anyway; we don't actually *************** *** 960,965 **** --- 962,970 ---- */ StartupPID = StartupDataBase(); + if (enableSigres == true) + enableFsync = false; + status = ServerLoop(); /* *************** *** 2841,2846 **** --- 2846,2854 ---- char protobuf[32]; int i; + if (enableSigres == true) + enableFsync = false; + /* * Don't want backend to be able to see the postmaster random number * generator state. We have to clobber the static random_seed *and* start diff -c -r postgresql-8.2.1/src/backend/utils/init/globals.c sigres-0.1.2/src/backend/utils/init/globals.c *** postgresql-8.2.1/src/backend/utils/init/globals.c Wed Oct 4 09:30:02 2006 --- sigres-0.1.2/src/backend/utils/init/globals.c Fri Feb 16 10:35:00 2007 *************** *** 109,111 **** --- 109,115 ---- bool VacuumCostActive = false; int GinFuzzySearchLimit = 0; + + /* Sigres Parameters */ + int BgWriterPid = 0; /* Must be extern */ + bool enableSigres = false; diff -c -r postgresql-8.2.1/src/backend/utils/misc/guc.c sigres-0.1.2/src/backend/utils/misc/guc.c *** postgresql-8.2.1/src/backend/utils/misc/guc.c Wed Nov 29 23:50:07 2006 --- sigres-0.1.2/src/backend/utils/misc/guc.c Fri Feb 16 09:30:42 2007 *************** *** 521,526 **** --- 521,534 ---- true, NULL, NULL }, { + {"sigres", PGC_SIGHUP, WAL_SETTINGS, + gettext_noop("Ignoring XLogWrite to accelerate INSERT/UPDATE operations with UPS ."), + gettext_noop("Without using UPS, enabling sigres makes your system dangerous.") + }, + &enableSigres, + false, NULL, NULL + }, + { {"zero_damaged_pages", PGC_SUSET, DEVELOPER_OPTIONS, gettext_noop("Continues processing past damaged page headers."), gettext_noop("Detection of a damaged page header normally causes PostgreSQL to " diff -c -r postgresql-8.2.1/src/backend/utils/misc/postgresql.conf.sample sigres-0.1.2/src/backend/utils/misc/postgresql.conf.sample *** postgresql-8.2.1/src/backend/utils/misc/postgresql.conf.sample Tue Nov 21 10:23:37 2006 --- sigres-0.1.2/src/backend/utils/misc/postgresql.conf.sample Fri Feb 16 09:22:08 2007 *************** *** 171,176 **** --- 171,183 ---- #archive_timeout = 0 # force a logfile segment switch after this # many seconds; 0 is off + # - Accelerating WAL *IF* you have UPS - + #sigres = off # turns sigres mode on or off + # in default, sigres == off + # without UPS, you never turn on sigres + # if sigres == on, then, + # both of "fsync" and "full_page_writes" + # are turned off. #--------------------------------------------------------------------------- # QUERY TUNING diff -c -r postgresql-8.2.1/src/bin/psql/sql_help.h sigres-0.1.2/src/bin/psql/sql_help.h *** postgresql-8.2.1/src/bin/psql/sql_help.h Sat Jan 6 14:22:23 2007 --- sigres-0.1.2/src/bin/psql/sql_help.h Sat Feb 10 23:57:54 2007 *************** *** 3,9 **** * *** generated from the DocBook documentation. * * generated by ! * /usr/local/bin/suidperl create_help.pl ../../../doc/src/sgml/ref sql_help.h * */ --- 3,9 ---- * *** generated from the DocBook documentation. * * generated by ! * /usr/bin/perl create_help.pl ../../../doc/src/sgml/ref sql_help.h * */ Only in sigres-0.1.2/src/include: debug.h diff -c -r postgresql-8.2.1/src/include/miscadmin.h sigres-0.1.2/src/include/miscadmin.h *** postgresql-8.2.1/src/include/miscadmin.h Fri Oct 20 03:32:47 2006 --- sigres-0.1.2/src/include/miscadmin.h Fri Feb 16 09:53:40 2007 *************** *** 202,207 **** --- 202,208 ---- extern bool allowSystemTableMods; extern DLLIMPORT int work_mem; extern DLLIMPORT int maintenance_work_mem; + extern bool enableSigres; /* SIGRES */ extern int VacuumCostPageHit; extern int VacuumCostPageMiss; *************** *** 324,326 **** --- 325,328 ---- extern void process_local_preload_libraries(void); #endif /* MISCADMIN_H */ + diff -c -r postgresql-8.2.1/src/include/postgres.h sigres-0.1.2/src/include/postgres.h *** postgresql-8.2.1/src/include/postgres.h Fri Jul 14 01:49:18 2006 --- sigres-0.1.2/src/include/postgres.h Fri Feb 16 10:38:45 2007 *************** *** 544,546 **** --- 544,552 ---- typedef int4 aclitem; /* PHONY definition for catalog use only */ #endif /* POSTGRES_H */ + + /* ---------------------------------------------------------------- + * Section 5: Sigres extention + * ---------------------------------------------------------------- + */ + #define BGWRITER_PIDFILE "/tmp/bgwriter.pid" diff -c -r postgresql-8.2.1/src/interfaces/ecpg/preproc/pgc.c sigres-0.1.2/src/interfaces/ecpg/preproc/pgc.c *** postgresql-8.2.1/src/interfaces/ecpg/preproc/pgc.c Sat Jan 6 14:22:23 2007 --- sigres-0.1.2/src/interfaces/ecpg/preproc/pgc.c Sat Feb 10 23:57:54 2007 *************** *** 1,9 **** #line 2 "pgc.c" ! /* A lexical scanner generated by flex */ /* Scanner skeleton version: * $Header: /home/daffy/u0/vern/flex/RCS/flex.skl,v 2.91 96/09/10 16:58:48 vern Exp $ - * $FreeBSD: src/usr.bin/lex/flex.skl,v 1.8 2004/01/06 19:03:44 nectar Exp $ */ #define FLEX_SCANNER --- 1,8 ---- #line 2 "pgc.c" ! /* A lexical scanner generated by flex*/ /* Scanner skeleton version: * $Header: /home/daffy/u0/vern/flex/RCS/flex.skl,v 2.91 96/09/10 16:58:48 vern Exp $ */ #define FLEX_SCANNER *************** *** 11,16 **** --- 10,16 ---- #define YY_FLEX_MINOR_VERSION 5 #include <stdio.h> + #include <unistd.h> /* cfront 1.2 defines "c_plusplus" instead of "__cplusplus" */ *************** *** 24,30 **** #ifdef __cplusplus #include <stdlib.h> - #include <unistd.h> /* Use prototypes in function declarations. */ #define YY_USE_PROTOS --- 24,29 ---- *************** *** 136,141 **** --- 135,149 ---- #define unput(c) yyunput( c, yytext_ptr ) + /* Some routines like yy_flex_realloc() are emitted as static but are + not called by all lexers. This generates warnings in some compilers, + notably GCC. Arrange to suppress these. */ + #ifdef __GNUC__ + #define YY_MAY_BE_UNUSED __attribute__((unused)) + #else + #define YY_MAY_BE_UNUSED + #endif + /* The following is because we cannot portably get our hands on size_t * (without autoconf's help, which isn't available because we want * flex-generated scanners to compile on their own). *************** *** 242,248 **** YY_BUFFER_STATE yy_scan_bytes YY_PROTO(( yyconst char *bytes, int len )); static void *yy_flex_alloc YY_PROTO(( yy_size_t )); ! static void *yy_flex_realloc YY_PROTO(( void *, yy_size_t )); static void yy_flex_free YY_PROTO(( void * )); #define yy_new_buffer yy_create_buffer --- 250,256 ---- YY_BUFFER_STATE yy_scan_bytes YY_PROTO(( yyconst char *bytes, int len )); static void *yy_flex_alloc YY_PROTO(( yy_size_t )); ! static void *yy_flex_realloc YY_PROTO(( void *, yy_size_t )) YY_MAY_BE_UNUSED; static void yy_flex_free YY_PROTO(( void * )); #define yy_new_buffer yy_create_buffer *************** *** 1340,1346 **** * Note that xcstart must appear before operator, as explained above! * Also whitespace (comment) must appear before operator. */ ! #line 1344 "pgc.c" /* Macros after this point can all be overridden by user definitions in * section 1. --- 1348,1354 ---- * Note that xcstart must appear before operator, as explained above! * Also whitespace (comment) must appear before operator. */ ! #line 1352 "pgc.c" /* Macros after this point can all be overridden by user definitions in * section 1. *************** *** 1488,1494 **** YY_DECL { register yy_state_type yy_current_state; ! register char *yy_cp, *yy_bp; register int yy_act; #line 324 "pgc.l" --- 1496,1502 ---- YY_DECL { register yy_state_type yy_current_state; ! register char *yy_cp = NULL, *yy_bp = NULL; register int yy_act; #line 324 "pgc.l" *************** *** 1499,1505 **** token_start = NULL; ! #line 1503 "pgc.c" if ( yy_init ) { --- 1507,1513 ---- token_start = NULL; ! #line 1511 "pgc.c" if ( yy_init ) { *************** *** 1559,1565 **** yy_find_action: yy_current_state = *--yy_state_ptr; yy_lp = yy_accept[yy_current_state]; - goto find_rule; /* avoid `defined but not used' warning */ find_rule: /* we branch to this label when backing up */ for ( ; ; ) /* until we find what rule we matched */ { --- 1567,1572 ---- *************** *** 2961,2967 **** #line 1181 "pgc.l" YY_FATAL_ERROR( "flex scanner jammed" ); YY_BREAK ! #line 2965 "pgc.c" case YY_END_OF_BUFFER: { --- 2968,2974 ---- #line 1181 "pgc.l" YY_FATAL_ERROR( "flex scanner jammed" ); YY_BREAK ! #line 2972 "pgc.c" case YY_END_OF_BUFFER: { *************** *** 3337,3342 **** --- 3344,3350 ---- #endif /* ifndef YY_NO_UNPUT */ + #ifndef YY_NO_INPUT #ifdef __cplusplus static int yyinput() #else *************** *** 3410,3416 **** return c; } ! #ifdef YY_USE_PROTOS void yyrestart( FILE *input_file ) --- 3418,3424 ---- return c; } ! #endif /* YY_NO_INPUT */ #ifdef YY_USE_PROTOS void yyrestart( FILE *input_file ) *************** *** 3521,3531 **** } - #ifndef YY_ALWAYS_INTERACTIVE - #ifndef YY_NEVER_INTERACTIVE - extern int isatty YY_PROTO(( int )); - #endif - #endif #ifdef YY_USE_PROTOS void yy_init_buffer( YY_BUFFER_STATE b, FILE *file ) --- 3529,3534 ---- diff -c -r postgresql-8.2.1/src/interfaces/libpq/libpq.rc sigres-0.1.2/src/interfaces/libpq/libpq.rc *** postgresql-8.2.1/src/interfaces/libpq/libpq.rc Sat Jan 6 14:22:19 2007 --- sigres-0.1.2/src/interfaces/libpq/libpq.rc Fri Feb 16 11:25:08 2007 *************** *** 1,8 **** #include <winver.h> VS_VERSION_INFO VERSIONINFO ! FILEVERSION 8,2,1,7006 ! PRODUCTVERSION 8,2,1,7006 FILEFLAGSMASK 0x3fL FILEFLAGS 0 FILEOS VOS__WINDOWS32 --- 1,8 ---- #include <winver.h> VS_VERSION_INFO VERSIONINFO ! FILEVERSION 8,2,1,7047 ! PRODUCTVERSION 8,2,1,7047 FILEFLAGSMASK 0x3fL FILEFLAGS 0 FILEOS VOS__WINDOWS32
Christopher Browne wrote: > A long time ago, in a galaxy far, far away, kawasima@cs.tsukuba.ac.jp (Hideyuki Kawashima) wrote: >> I appreciate your great suggestion! >> It is great honor for me if Sigres will be merged to PostgreSQL. >> Since the changes of Sigres from PostgreSQL-8.2.1 are not many, >> and moreover, all of changes are surrounded with #ifdef SIGRES --- #endif, >> incorporating Sigres into PostgreSQL would be easy. > > You should consider submitting a patch for this against CVS HEAD. > > And actually, I'd think it a better idea to define a GUC variable and > use that to control whether Sigres is active or not. > > At the more sophisticated end of the spectrum, you might set things up > so that it could be activated/deactivated at runtime by a superuser. > > At the less sophisticated end, it might need to be configured in > postgresql.conf... Whatever happen with this? Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Joshua D. Drake wrote: > Christopher Browne wrote: > > A long time ago, in a galaxy far, far away, kawasima@cs.tsukuba.ac.jp (Hideyuki Kawashima) wrote: > >> I appreciate your great suggestion! > >> It is great honor for me if Sigres will be merged to PostgreSQL. > >> Since the changes of Sigres from PostgreSQL-8.2.1 are not many, > >> and moreover, all of changes are surrounded with #ifdef SIGRES --- #endif, > >> incorporating Sigres into PostgreSQL would be easy. > > > > You should consider submitting a patch for this against CVS HEAD. > > > > And actually, I'd think it a better idea to define a GUC variable and > > use that to control whether Sigres is active or not. > > > > At the more sophisticated end of the spectrum, you might set things up > > so that it could be activated/deactivated at runtime by a superuser. > > > > At the less sophisticated end, it might need to be configured in > > postgresql.conf... > > Whatever happen with this? I would like to see more analysis about why Sigres is faster than an in-memory file system. I think the idea was that locking was reduced but I am unclear on why locking is different in the two cases. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Mon, 2007-02-26 at 09:44 -0500, Bruce Momjian wrote: > Joshua D. Drake wrote: > > Christopher Browne wrote: > > > A long time ago, in a galaxy far, far away, kawasima@cs.tsukuba.ac.jp (Hideyuki Kawashima) wrote: > > >> I appreciate your great suggestion! > > >> It is great honor for me if Sigres will be merged to PostgreSQL. > > >> Since the changes of Sigres from PostgreSQL-8.2.1 are not many, > > >> and moreover, all of changes are surrounded with #ifdef SIGRES --- #endif, > > >> incorporating Sigres into PostgreSQL would be easy. > > > > > > You should consider submitting a patch for this against CVS HEAD. > > > > > > And actually, I'd think it a better idea to define a GUC variable and > > > use that to control whether Sigres is active or not. > > > > > > At the more sophisticated end of the spectrum, you might set things up > > > so that it could be activated/deactivated at runtime by a superuser. > > > > > > At the less sophisticated end, it might need to be configured in > > > postgresql.conf... > > > > Whatever happen with this? > > I would like to see more analysis about why Sigres is faster than an > in-memory file system. I think the idea was that locking was reduced > but I am unclear on why locking is different in the two cases. Reading through the design, I see the following - bgwriter performs XLogWrite, not each backend - WAL fsync is only performed when WAL file fills - no checkpoints are performed until shutdown This is approximately the same performance as fsync=off, with a big boost because we never do checkpoints either (i.e. 10% as stated by the OP). There is a slight gain because of reduced lock contention, but I wouldn't expect that to be major. This comes back to the idea of deferred fsync we've spoken about a few times, to provide "MySQL mode" performance and reduced robustness guarantees. The idea of having a special backend perform the XLogWrites rather than the individual backends is already a TODO item, even if this implementation is somewhat more extreme in its approach to fsync delay. The actual potential data loss is similar to an archive recovery though, so I do like that touch - i.e. we might lose everything in the current WAL file. IMHO, it shouldn't be the bgwriter that fsyncs WAL because it cannot clean shared_buffers and fsync the WAL concurrently with any effectiveness. Perhaps such infrequent fsyncs mean thats not a problem. Not checkpointing at all is not a good plan, since this will lead to an enormous build up of WAL files and a very long recovery time if the system does fail. Overall, these choices effect the whole system whereas what I think we need is something that can be applied to just certain transactions or tables. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > Reading through the design, I see the following > - bgwriter performs XLogWrite, not each backend > - WAL fsync is only performed when WAL file fills > - no checkpoints are performed until shutdown > Not checkpointing at all is not a good plan, since this will lead to an > enormous build up of WAL files and a very long recovery time if the > system does fail. Indeed, that seems utterly unacceptable; moreover it's hard to see how you get a noticeable performance win, compared to a sane checkpoint interval of some-fraction-of-an-hour. regards, tom lane
Simon, > Not checkpointing at all is not a good plan, since this will lead to an > enormous build up of WAL files and a very long recovery time if the > system does fail. I appreciate your detailed comments. Following your comments, I revised the problem. Sigres-0.1.3 does checkpointings. In summary, the features of Sigres-0.1.3 are as follows. 0: 10% faster than conventional PostgreSQL under tmpfs. 1: Checkpointings are continually executed. 2: Sigres mode is in default (the mode can be turned off via postgresql.conf). 3: issue_xlog_sync is called only by bgwriter (continually, via createcheckpoint) 4: The entity of XLogWrite (_XLogWrite in my code) is called by both backends and a bgwriter. For each backend, _XLogWrite is called only via AdvanceXLInsertBuffer. For a bgwriter, _XLogWrite is called via CreateCheckPoint. Please try it if you have interest. http://sourceforge.jp/projects/sigres/ Again, I really appreciate beneficial comments from this community ! Regards, -- Hideyuki -- Hideyuki Kawashima (Ph.D.) University of Tsukuba Assistant Professor
Joshua D. Drake wrote: > From an deployable application perspective, this could be a big deal. We > are already starting to see very large traction in the Win32 desktop app > arena. > There seem to be a few overlapping proposals in terms of reducing various guarantees in the name of performance. As more and more options are added that affect integrity (fsync, full page writes, commit nowait, sigres) it might be nice to outline and compare the approaches, and particularly to describe clearly the failure scenarios and how they are significantly different from one another. One potentially needs to track an increasing number of ways in which items might be set which reduce certain guarantees on data integrity which is unpleasant. If a setting is wrong on a performance knob, no problem, when there are complaints things are slow you can go through and adjust them. The same is not true of data consistency. When the complaint comes it is usually too late to fiddle with knobs. I'm just thinking some caution should be exercised in adding too many of them in the first place. I happen to love COMMIT NOWAIT though, for many, this replaces fsync=off. - August
I am still unclear why sigres is better than a temporary file system. I relize your patch is faster, but what is about your patch that makes it faster. And if we were going to add such capability, we would name it based on what it does, rather than on a 'sigres' mode. --------------------------------------------------------------------------- Hideyuki Kawashima wrote: > Simon, > > > Not checkpointing at all is not a good plan, since this will lead to an > > enormous build up of WAL files and a very long recovery time if the > > system does fail. > > I appreciate your detailed comments. > Following your comments, I revised the problem. > Sigres-0.1.3 does checkpointings. > > In summary, the features of Sigres-0.1.3 are as follows. > 0: 10% faster than conventional PostgreSQL under tmpfs. > 1: Checkpointings are continually executed. > 2: Sigres mode is in default (the mode can be turned off via postgresql.conf). > 3: issue_xlog_sync is called only by bgwriter (continually, via > createcheckpoint) > 4: The entity of XLogWrite (_XLogWrite in my code) is called by both > backends and a bgwriter. > For each backend, _XLogWrite is called only via AdvanceXLInsertBuffer. > For a bgwriter, _XLogWrite is called via CreateCheckPoint. > > Please try it if you have interest. > http://sourceforge.jp/projects/sigres/ > > Again, I really appreciate beneficial comments from this community ! > > Regards, > > -- Hideyuki > > -- > Hideyuki Kawashima (Ph.D.) > University of Tsukuba > Assistant Professor > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
I will write a technical document about Sigres in a week. Hideyuki Bruce Momjian wrote: > I am still unclear why sigres is better than a temporary file system. I > relize your patch is faster, but what is about your patch that makes it > faster. > > And if we were going to add such capability, we would name it based on > what it does, rather than on a 'sigres' mode. > > --------------------------------------------------------------------------- > > Hideyuki Kawashima wrote: > >> Simon, >> >> >>> Not checkpointing at all is not a good plan, since this will lead to an >>> enormous build up of WAL files and a very long recovery time if the >>> system does fail. >>> >> I appreciate your detailed comments. >> Following your comments, I revised the problem. >> Sigres-0.1.3 does checkpointings. >> >> In summary, the features of Sigres-0.1.3 are as follows. >> 0: 10% faster than conventional PostgreSQL under tmpfs. >> 1: Checkpointings are continually executed. >> 2: Sigres mode is in default (the mode can be turned off via postgresql.conf). >> 3: issue_xlog_sync is called only by bgwriter (continually, via >> createcheckpoint) >> 4: The entity of XLogWrite (_XLogWrite in my code) is called by both >> backends and a bgwriter. >> For each backend, _XLogWrite is called only via AdvanceXLInsertBuffer. >> For a bgwriter, _XLogWrite is called via CreateCheckPoint. >> >> Please try it if you have interest. >> http://sourceforge.jp/projects/sigres/ >> >> Again, I really appreciate beneficial comments from this community ! >> >> Regards, >> >> -- Hideyuki >> >> -- >> Hideyuki Kawashima (Ph.D.) >> University of Tsukuba >> Assistant Professor >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 3: Have you checked our extensive FAQ? >> >> http://www.postgresql.org/docs/faq >> > > -- Hideyuki Kawashima (Ph.D), University of Tsukuba, Graduate School of Systems and Information Engineering Assistant Professor, TEL: +81-29-853-5322