Thread: Re: [HACKERS] Priorities for 6.6
I obtained Oracle for Linux and did some comparisons with current PostgreSQL 6.5 using the test suite I mentioned before, which is good for creating heavy loads. With default postmaster settings (postmaster -S -i), PostgreSQL was several times slower than Oracle. With -F (postmaster -S -i -o '-F'), however, PostgreSQL was much faster than with the default settings. Yes, this is well-known behavior of PostgreSQL. Without -F, PostgreSQL does fsync() every time a transaction is committed, and that is the performance bottleneck. I observed the disk-activity LED almost always on while running PostgreSQL without -F. With -F, however, there is a chance that we lose committed data if the computer crashes. On the other hand, the LED was on only every few seconds while running Oracle. I heard that Oracle has a "REDO log file" and that a log record is written there when a transaction is committed. If so, Oracle apparently does not issue sync() or fsync() every time a transaction gets committed. I don't know how Oracle guarantees that the log is written to disk without sync() or fsync() at commit time, but something like it seems to be one of the most important techniques for enhancing the performance of PostgreSQL. Does anybody have an idea on this? --- Tatsuo Ishii
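For illustration, the commit-time behavior just described amounts to roughly the following minimal C sketch; the log file handling, record format, and the disable_fsync flag are illustrative stand-ins for the -F switch, not PostgreSQL's actual code:

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static int disable_fsync = 0;    /* stands in for the -F switch */

    /* Append a commit record and, unless -F is given, force it to disk
     * before telling the client the transaction committed. */
    int
    commit_transaction(int logfd, unsigned int xid)
    {
        char record[32];

        snprintf(record, sizeof(record), "COMMIT %u\n", xid);
        if (write(logfd, record, strlen(record)) < 0)
            return -1;

        /* This per-transaction fsync() is the disk wait observed above;
         * with -F it is skipped, trading crash safety for speed. */
        if (!disable_fsync && fsync(logfd) < 0)
            return -1;

        return 0;
    }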
At 08:50 AM 6/5/99 +0900, Tatsuo Ishii wrote: > >On the other hand, the LED was on only every few seconds while running >Oracle. I heard that Oracle has a "REDO log file" and that a log record is written >there when a transaction is committed. If so, Oracle apparently >does not issue sync() or fsync() every time a transaction gets >committed. I don't know how Oracle guarantees that the log is written to >disk without sync() or fsync() at commit time, but something like it >seems to be one of the most important techniques for enhancing >the performance of PostgreSQL. >Does anybody have an idea on this? It's a well-known bug in the current Oracle release for Linux: the redo log is supposed to be fsynch'd on commit. Oracle does fsynch on other Unices. It will be interesting to see if the upcoming 8.1.5 (or "8i", "i" for internet, as it's called) will have the bug fixed. This still won't cause a lot of disk thrashing in a recommended Oracle installation, as the redo log should be on a separate spindle from any db spindles, and Oracle grabs the entire file when the db is created in order to increase the odds that the file will be one sequential series of blocks (of course, real Oracle studs use raw disks, in which case the db can guarantee serial block writing). There's a separate daemon hanging around that writes dirty database pages to the disk at its leisure. Of course, if I've understood past postings to this list, Postgres also fsynch's after read-only selects, and my own experience would seem to confirm it (putting a string of selects in a transaction makes the disk get quiet, just as it does with inserts). I can guarantee that Oracle NEVER does that :) - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, and other goodies at http://donb.photo.net
Don Baccus <dhogaza@pacifier.com> writes: > Of course, if I've understood past postings to this list, > Postgres also fsynch's after read-only selects, too, I recently learned something about this that I hadn't understood before. When a tuple is written out during an insert/update transaction, it is marked as not definitely committed (since of course Postgres can't know whether you'll abort the transaction later). The ID of the transaction that wrote it is stored with it. Subsequently, whenever the tuple is scanned, the backend has to go to the "transaction log" to see if that transaction has been committed yet --- if not, it ignores the tuple. As soon as the transaction is known to be committed, the next operation that visits that tuple will mark it as "known committed", so as to avoid future consultations of the transaction log. This happens *even if the current operation is a select*. That is why selects can cause disk writes in Postgres. Similar things happen when a tuple is replaced or deleted, of course. In short, if you load a bunch of tuples into a table, the first select after the load can run a lot slower than you might expect, because it'll be writing back most or all of the pages it touches. But that penalty doesn't affect every select, only the first one to scan a newly-written tuple. regards, tom lane
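A minimal sketch of the check Tom describes, with simplified stand-ins for the real structures (the hint bit and the transaction-log lookup are modeled on his description; the types and names here are invented for illustration):

    #include <stdbool.h>

    #define XMIN_COMMITTED 0x01     /* hint bit: inserter known committed */

    typedef struct {
        unsigned int xmin;          /* ID of the inserting transaction */
        unsigned int infomask;      /* per-tuple hint bits */
    } Tuple;

    /* Stand-in for the real transaction-log lookup. */
    static bool
    xact_did_commit(unsigned int xid)
    {
        (void) xid;
        return true;
    }

    /* Visibility check: consult the log only until the hint bit is set.
     * The side effect is the point -- the first scan after commit sets
     * the bit and dirties the page, so even a SELECT can cause a write. */
    static bool
    tuple_is_committed(Tuple *tup, bool *page_dirtied)
    {
        if (tup->infomask & XMIN_COMMITTED)
            return true;            /* fast path: no log consultation */

        if (xact_did_commit(tup->xmin))
        {
            tup->infomask |= XMIN_COMMITTED;   /* remember for next time */
            *page_dirtied = true;              /* page must be written back */
            return true;
        }
        return false;               /* still in progress, or aborted */
    }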
At 11:31 AM 6/5/99 -0400, Tom Lane wrote: >In short, if you load a bunch of tuples into a table, the first select >after the load can run a lot slower than you might expect, because it'll >be writing back most or all of the pages it touches. But that penalty >doesn't affect every select, only the first one to scan a newly-written >tuple. While I don't doubt your analysis is correct for the case you've uncovered, it doesn't explain why surrounding a bunch of selects with a begin/end block greatly decreases disk activity for tables that don't change. I'm pulling out "select" lists (html <select>) from small tables of counties, states, and countries for the project I'm working on. The two countries, for instance, are "USA" and "CA", and the table's not been updated in two months :). I'm building a form and doing very simple "select * from county_locales" type selects, then building a <select> list containing all of the possible values (not as many as you might think; this project involves only the Pacific Northwest). There are several of these selects executed for each form. Without the transaction block, there's a lot of disk activity. With it, much less. I can go pull out the begin/end blocks; they're conditionalized in my Tcl scripts based on a "postgres" predicate, so they'll disappear if I migrate the database to another engine. Maybe I'll have time this afternoon if you'd like me to confirm; I'm going to a brunch right now... - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, and other goodies at http://donb.photo.net
> In short, if you load a bunch of tuples into a table, the first select > after the load can run a lot slower than you might expect, because it'll > be writing back most or all of the pages it touches. But that penalty > doesn't affect every select, only the first one to scan a newly-written > tuple. I have removed this from the TODO list: * Prevent fsync in SELECT-only queries -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Don Baccus <dhogaza@pacifier.com> writes: > While I don't doubt your analysis is correct for the case you've > uncovered, it doesn't explain why surrounding a bunch of selects > with a begin/end block greatly decreases disk activity for tables > that don't change. Hmm, I'm not sure why that should be, either. Anyone? regards, tom lane
Tom Lane wrote: > > Don Baccus <dhogaza@pacifier.com> writes: > > While I don't doubt your analysis is correct for the case you've > > uncovered, it doesn't explain why surrounding a bunch of selects > > with a begin/end block greatly decreases disk activity for tables > > that don't change. > > Hmm, I'm not sure why that should be, either. Anyone? From a recent discussion I remember that every block that is read in is marked as dirty, regardless of whether it is modified or not. It is not a genuine bug (as it only slows things down instead of getting wrong results), but still a misfeature. It is most likely an ancient quickfix for some execution path that failed to set the dirty mark when it should have. --------------------- Hannu
Hannu Krosing <hannu@trust.ee> writes: >> Hmm, I'm not sure why that should be, either. Anyone? > From a recent discussion I remember that every block that is read > in is marked as dirty, regardless of whether it is modified or not. No, that was me claiming that, on the basis of a profile I had taken that showed an unreasonably large number of writes --- but the case I was profiling was a selective UPDATE on a table that had just been loaded. When I repeated the test, the number of writes decreased to the right ballpark. I am not sure what effect Don is seeing, but I don't think it's quite as dumb a mistake as that... regards, tom lane
At 04:58 PM 6/5/99 -0400, Tom Lane wrote: >I am not sure what effect Don is seeing, but I don't think it's quite >as dumb a mistake as that... If you want, I can wait until the 6.5 release is out, then play some more to make sure I can make the disk thrash with old tables. This certainly isn't the kind of thing that deserves rush treatment. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, and other goodies at http://donb.photo.net
At 11:51 PM 6/5/99 +0300, Hannu Krosing wrote: >It is not a genuine bug (as it only slows things down instead of >getting wrong results), but still a misfeature. Well, it depends on how one defines "bug", I suppose :) In the strictest sense you're correct, yet for real-world use, particularly in environments with high traffic, it's a killer. >It is most likely an ancient quickfix for some execution path that >failed to set the dirty mark when it should have. Yep, I remember this from the earlier conversation, too. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, and other goodies at http://donb.photo.net
> While I don't doubt your analysis is correct for the case you've > uncovered, it doesn't explain why surrounding a bunch of selects > with a begin/end block greatly decreases disk activity for tables > that don't change. I'm pulling out "select" lists (html <select>) > from small tables of counties, states, countries for the project > I'm working on. The two countries, for instance, are "USA" and > "CA" and the table's not been updated in two months :). I'm > building a form and doing very simple "select * from county_locales" > type selects, then building a <select> list containing all of the > possible values (not as many as you might think, this project > involves only the Pacific Northwest). There are several of > these selects executed for each form. Without the transaction > block, there's a lot of disk activity. With it, much less. > > I can go pull out the begin/end blocks, they're conditionalized > in my Tcl scripts based on a "postgres" predicate so they'll > disappear if I migrate the database to another engine. Maybe > I'll have time this afternoon, if you'd like me to confirm, I'm > going to a brunch right now... PostgreSQL writes into pg_log each time a transaction gets committed, even a read-only one. Once any file writes have happened in the transaction, fsync() is forced at commit time. That is probably why you observe less disk activity when you surround some selects with begin/end blocks. By the way, may I ask another question regarding Oracle? You mentioned that the magic of no-fsync in Oracle is actually a bug. Ok, I understand. I also heard that Oracle does some kind of redo-log buffering. Does this mean certain committed data might be lost if the system crashed before the buffered data is written to disk? --- Tatsuo Ishii
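To make the two client patterns concrete, here is what they look like through libpq (error handling trimmed; county_locales is from Don's example, state_locales is invented for the sketch). Wrapping N SELECTs in one transaction means one commit, and so one pg_log write, instead of N:

    #include <libpq-fe.h>

    /* Outside a transaction block each statement commits by itself, so
     * each SELECT ends with its own pg_log write (and fsync without -F). */
    void
    selects_autocommit(PGconn *conn)
    {
        PQclear(PQexec(conn, "SELECT * FROM county_locales"));
        PQclear(PQexec(conn, "SELECT * FROM state_locales"));
    }

    /* Wrapped in BEGIN/END the same statements share one commit, hence
     * one pg_log write -- the quieter disk reported above. */
    void
    selects_one_xact(PGconn *conn)
    {
        PQclear(PQexec(conn, "BEGIN"));
        PQclear(PQexec(conn, "SELECT * FROM county_locales"));
        PQclear(PQexec(conn, "SELECT * FROM state_locales"));
        PQclear(PQexec(conn, "END"));
    }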
> By the way, may I ask another question regarding Oracle? You mentioned > that the magic of no-fsync in Oracle is actually a bug. Ok, I understand. I > also heard that Oracle does some kind of redo-log buffering. Does > this mean certain committed data might be lost if the system crashed > before the buffered data is written to disk? That is my guess. Informix does that. No one runs with non-buffered logging. They run with buffered logging, which may lose transactions for a few seconds or minutes before a crash. I think we need that, and it should be the default, but few people agree with me. I have some schemes to do this. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
> I think we need that, and it should be the default, but few people agree > with me. I have some schemes to do this. I believe you're absolutely right. To most people, performance matters more than safety in a system breakdown. After all, we're talking Linux, FreeBSD and other systems here. And if people worry, they can buy UPSes, duplicate hardware and stuff. It's extremely rare for the hardware to fail. To counter this, I think PostgreSQL needs some roll-forward mechanism. Maybe that's what Vadim means with savepoints? Now that we're at the Enterprise end, I could add that companies need hot backup. And if you include the parallelizing server, I believe the commercial community will be served very well. I was at a seminar last week where Oracle bragged about 8i. Maybe PostgreSQL some time in the future could have hooks for other languages? I know there's a PL-thing and a C-thing, but I would personally like a Perl interface.
On a personal note, I hope that outer joins and views on unions will get attention in 6.6. Ryan Bradetich is working on views. Maybe I can get my wish in 6.6? If Tom's idea about removing the 8K tuple limit and Bruce's idea about relaxed sync'ing make it into the next release, it should be version 7.0 in my opinion.
Kaare Rasmussen wrote: > > I was at a seminar last week where Oracle bragged about 8i. Maybe > PostgreSQL some time in the future could have hooks for other > languages? I know there's a PL-thing and a C-thing, but I would > personally like a Perl interface. The hooks are already in place, thanks to Jan. He did Tcl first and PL/pgSQL only after that. It should be quite possible to add others without too much work. Hannu
Tom Lane wrote: > > Don Baccus <dhogaza@pacifier.com> writes: > > Of course, if I've understood past postings to this list, > > Postgres also fsynch's after read-only selects, too, > > I recently learned something about this that I hadn't understood before. > When a tuple is written out during an insert/update transaction, it is > marked as not definitely committed (since of course Postgres can't know > whether you'll abort the transaction later). The ID of the transaction > that wrote it is stored with it. Subsequently, whenever the tuple is > scanned, the backend has to go to the "transaction log" to see if that > transaction has been committed yet --- if not, it ignores the tuple. > > As soon as the transaction is known to be committed, the next operation > that visits that tuple will mark it as "known committed", so as to avoid > future consultations of the transaction log. This happens *even if the > current operation is a select*. That is why selects can cause disk > writes in Postgres. Right. But we could avoid fsync for such write operations, i.e. do the write call but not fsync. This will not avoid real disk writes, but the select will not wait for them. Vadim
Tom Lane wrote: > > Don Baccus <dhogaza@pacifier.com> writes: > > While I don't doubt your analysis is correct for the case you've > > uncovered, it doesn't explain why surrounding a bunch of selects > > with a begin/end block greatly decreases disk activity for tables > > that don't change. > > Hmm, I'm not sure why that should be, either. Anyone? pg_log fsync for read-only xactions... And more than that: commit fsyncs ALL dirty buffers in the pool, even ones dirtied not by the xaction being committed! Vadim
Bruce Momjian wrote: > > > In short, if you load a bunch of tuples into a table, the first select > > after the load can run a lot slower than you might expect, because it'll > > be writing back most or all of the pages it touches. But that penalty > > doesn't affect every select, only the first one to scan a newly-written > > tuple. > > I have removed this from the TODO list: > > * Prevent fsync in SELECT-only queries When a selecting (i.e. read-only) transaction commits, it changes pg_log - we can obviously avoid this! There's no sense in storing the commit/abort status of read-only xactions! Vadim
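What Vadim suggests could look roughly like this on the commit path; the flag and the pg_log routines are hypothetical names, since the real bookkeeping would live wherever the executor marks writes:

    #include <stdbool.h>

    /* Hypothetical stand-ins for the real pg_log routines. */
    extern void pg_log_set_status(unsigned int xid, bool committed);
    extern void pg_log_flush(void);

    static bool xact_wrote_data = false;   /* hypothetical flag, set by any
                                            * insert/update/delete */

    /* Commit-path sketch: a read-only transaction skips the pg_log
     * update (and therefore the fsync) entirely, since nothing ever
     * needs its commit/abort status. */
    void
    record_commit(unsigned int xid)
    {
        if (!xact_wrote_data)
            return;                        /* nothing to make durable */

        pg_log_set_status(xid, true);
        pg_log_flush();
    }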
Hannu Krosing wrote: > > Tom Lane wrote: > > > > Don Baccus <dhogaza@pacifier.com> writes: > > > While I don't doubt your analysis is correct for the case you've > > > uncovered, it doesn't explain why surrounding a bunch of selects > > > with a begin/end block greatly decreases disk activity for tables > > > that don't change. > > > > Hmm, I'm not sure why that should be, either. Anyone? > > From a recent discussion I remember that every block that is read > in is marked as dirty, regardless of whether it is modified or not. No! Vadim
Kaare Rasmussen wrote: > > > I think we need that, and it should be the default, but few people agree > > with me. I have some schemes to do this. I remember this, Bruce. But I would like to see it implemented in the right way. I'm not happy with the "two sync() in postmaster" idea. We have to implement a Shared Catalog Cache (SCC), mark all dirtied relation files there and then just fsync() these files before the fsync() of pg_log. > To counter this, I think PostgreSQL needs some roll-forward mechanism. > Maybe that's what Vadim means with savepoints? Now that we're at the No. Savepoints are short-term things, living only for the duration of a xaction. Vadim
> The hooks are already in place, thanks to Jan. > He did Tcl first and PL/pgSQL only after that. > It should be quite possible to add others without too much work. Explain a bit more - I'd like to have a Perl interface. Does it have to be added by some of the clever postgresql hackers? A non-C-speaking individual like me can't do it, right?
> > > In short, if you load a bunch of tuples into a table, the first select > > after the load can run a lot slower than you might expect, because it'll > > be writing back most or all of the pages it touches. But that penalty > > doesn't affect every select, only the first one to scan a newly-written > > tuple. > > I have removed this from the TODO list: > > * Prevent fsync in SELECT-only queries I think this entry should stay. In fact, there is a write on every transaction that commits/aborts, even one that doesn't modify any data. pg_log is written for SELECT-only transactions too. I'm nearly 99.5% sure that not fsync()'ing those transactions would not hurt reliability, and we ought to work it out. This might be one reason why surrounding a bunch of SELECT statements with BEGIN/END speeds up PostgreSQL in non -F mode. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
> > Tom Lane wrote: > > > > Don Baccus <dhogaza@pacifier.com> writes: > > > While I don't doubt your analysis is correct for the case you've > > > uncovered, it doesn't explain why surrounding a bunch of selects > > > with a begin/end block greatly decreases disk activity for tables > > > that don't change. > > > > Hmm, I'm not sure why that should be, either. Anyone? > > From a recent discussion I remember that every block that is read > in is marked as dirty, regardless of whether it is modified or not. > > It is not a genuine bug (as it only slows things down instead of > getting wrong results), but still a misfeature. > > It is most likely an ancient quickfix for some execution path that > failed to set the dirty mark when it should have. Can't believe that this is true - uhhhhhh! If it is, then it's surely a severe BUG! Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
> > > By the way, may I ask another question regarding Oracle? You mentioned > > that the magic of no-fsync in Oracle is actually a bug. Ok, I understand. I > > also heard that Oracle does some kind of redo-log buffering. Does > > this mean certain committed data might be lost if the system crashed > > before the buffered data is written to disk? > > That is my guess. Informix does that. No one runs with non-buffered > logging. They run with buffered logging, which may lose transactions > for a few seconds or minutes before a crash. > > I think we need that, and it should be the default, but few people agree > with me. I have some schemes to do this. The major problem in this area is that with the given model of telling which tuples are committed, no one can guarantee a consistent PostgreSQL database in the case of a non-fsynced crash. You might lose some tuples and might get some outdated ones back. But which ones depends on subsequently executed SELECTs, and it all doesn't have anything to do with transaction boundaries or with the order in which transactions committed. As I understand Oracle, the entire reliability depends on the redo logs. If a crash is too bad, you can always restore the last backup and recover from that. The database crash recovery will roll forward until the last COMMIT that occurs in the redo log (except for point-in-time recovery). Someone can live with the case that the last COMMITs (sorted by time) cannot be recovered. But no one can live with a database that's left corrupt. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
At 12:22 PM 6/6/99 +0900, Tatsuo Ishii wrote: >By the way, may I ask another question regarding Oracle? You mentioned >that the magic of no-fsync in Oracle is actually a bug. Ok, I understand. I >also heard that Oracle does some kind of redo-log buffering. Does >this mean certain committed data might be lost if the system crashed >before the buffered data is written to disk? Not sure, actually; I'm by no means an Oracle expert, I was just passing along information gleaned from the Oracle/Linux newsgroup. You can access this via the main Oracle website: go to the Oracle Technology Network and register, much as you did to download your developer's copy of the db engine. Some very experienced Oracle types hang out there. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, and other goodies at http://donb.photo.net
It would be cool to have a Perl interface to postgres internals! Unfortunately I'm not a C programmer. I think Edmund could do this. Regards, Oleg On Sun, 6 Jun 1999, Kaare Rasmussen wrote: > Date: Sun, 6 Jun 1999 16:35:14 +0200 (CEST) > From: Kaare Rasmussen <kar@webline.dk> > To: pgsql-hackers@postgreSQL.org > Subject: Re: [HACKERS] Priorities for 6.6 > > > The hooks are already in place, thanks to Jan. > > He did Tcl first and PL/pgSQL only after that. > > It should be quite possible to add others without too much work. > > Explain a bit more - I'd like to have a Perl interface. Does it have to be > added by some of the clever postgresql hackers? A non-C-speaking > individual like me can't do it, right? > > _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
> > It would be cool to have a Perl interface to postgres internals! > Unfortunately I'm not a C programmer. I think Edmund could do this. Now I've got a sequence overflow counting the requests for something like PL/Perl! Even if I can hardly believe it, could it be that there are too many Perl users vs. Perl developers? Several times I have clearly said that I don't know much more about Perl than that it's YASL (Yet Another Scripting Language). But ... 1. I've designed and implemented the PL interface into the PostgreSQL function manager, including the CREATE/DROP PROCEDURAL LANGUAGE statements. 2. I've added the procedural languages PL/Tcl and PL/pgSQL, which are described in the programmer's manual and which have both been used out in the world since v6.4. 3. I've offered help for building PL/Perl several times now. But only "it would be nice if someone else ..." requests are coming in. Some years ago I searched for a general-purpose scripting language. I found Tcl, which has a graphical user interface (Tk) and is supported on the platforms I need (all UN*X and (for sake) Windows-NT/95). Since then I've created a couple of Tcl interfaces to things it cannot do by default (like SAP remote function calls - not available as open source, so don't ask for it :-( ). The simplicity with which it interfaces to foreign things is why it's called the "Tool Command Language". This simplicity gave me the power to create PL/Tcl. PL/pgSQL was my answer to requests for a native language that doesn't depend on anything else installed on a PostgreSQL target system. To explain point 3 in detail: I still feel responsible for the function manager's procedural language interface -- since I created it. BUT I WOULDN'T LEARN PERL PLUS ITS API ONLY TO PROVIDE PL/Perl TO THE {U|LOO}SER-COMMUNITY! That would mean getting responsible for one more thing I don't need for myself. If there's (only one) Perl PROGRAMMER out in the world reading this, who does see a (however small) possibility to explain how to integrate a Perl interpreter into PostgreSQL, RESPOND!!!!!!!!!!! I'll let y'all know about the responses I get. Even if I don't expect a single one from which a PL/Perl could result. Maybe Perl isn't the scripting language someone should choose because it is too limited in its capabilities - remember that real programmers don't use pascal... - maybe real programmers wouldn't ever use Perl... Maybe - (please don't) - Jan > > Regards, > > Oleg > > > On Sun, 6 Jun 1999, Kaare Rasmussen wrote: > > > Date: Sun, 6 Jun 1999 16:35:14 +0200 (CEST) > > From: Kaare Rasmussen <kar@webline.dk> > > To: pgsql-hackers@postgreSQL.org > > Subject: Re: [HACKERS] Priorities for 6.6 > > > > > The hooks are already in place, thanks to Jan. > > > He did Tcl first and PL/pgSQL only after that. > > > It should be quite possible to add others without too much work. > > > > Explain a bit more - I'd like to have a Perl interface. Does it have to be > > added by some of the clever postgresql hackers? A non-C-speaking > > individual like me can't do it, right? -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
Jan Wieck wrote: > > > > > It would be cool to have a Perl interface to postgres internals! > > Unfortunately I'm not a C programmer. I think Edmund could do this. > > Now I've got a sequence overflow counting the requests for > something like PL/Perl! > [personal view on perl and tcl deleted] Jan: I've been looking for a project to get me active in the postgresql community after lurking since before it was (officially) PostgreSQL. I will do the PL/Perl interface. Perl is a great integration tool. This can be seen from the enormous growth in its use in such areas as CGI programming. And embedding functionality into perl from other sources is rarely a hard problem. But embedding perl in other applications is not as easy as it could be. -- Mark Hollomon mhh@nortelnetworks.com
Oh, I see. SELECT is a transaction, so it flushes pg_log. Re-added to TODO. > Bruce Momjian wrote: > > > > > In short, if you load a bunch of tuples into a table, the first select > > > after the load can run a lot slower than you might expect, because it'll > > > be writing back most or all of the pages it touches. But that penalty > > > doesn't affect every select, only the first one to scan a newly-written > > > tuple. > > > > I have removed this from the TODO list: > > > > * Prevent fsync in SELECT-only queries > > When a selecting (i.e. read-only) transaction commits, > it changes pg_log - we can obviously avoid this! > There's no sense in storing the commit/abort status of read-only xactions! > > Vadim > -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Added to TODO list. > Jan Wieck wrote: > > > > > > > > It would be cool to have a Perl interface to postgres internals! > > > Unfortunately I'm not a C programmer. I think Edmund could do this. > > > > Now I've got a sequence overflow counting the requests for > > something like PL/Perl! > > > [personal view on perl and tcl deleted] > > Jan: > > I've been looking for a project to get me active in the postgresql > community after lurking since before it was (officially) PostgreSQL. > > I will do the PL/Perl interface. > > Perl is a great integration tool. This can be seen from the enormous > growth in its use in such areas as CGI programming. And embedding > functionality into perl from other sources is rarely a hard problem. > > But embedding perl in other applications is not as easy as it could be. > > -- > > Mark Hollomon > mhh@nortelnetworks.com > > -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
> Kaare Rasmussen wrote: > > > > > I think we need that, and it should be the default, but few people agree > > > with me. I have some schemes to do this. > > I remember this, Bruce. But I would like to see it implemented > in the right way. I'm not happy with the "two sync() in postmaster" idea. > We have to implement a Shared Catalog Cache (SCC), mark all dirtied > relation files there and then just fsync() these files before > the fsync() of pg_log. I see. You want to use the shared catalog cache to flag relations that have been modified, and fsync those before the fsync of pg_log. Another idea is to send a signal to each backend that has marked a bit in shared memory saying it has written to a relation, and have the signal handler fsync all its dirty relations, set a finished bit, and have the postmaster then fsync pg_log. The shared catalog cache still requires the postmaster to open every relation that is marked as dirty in order to fsync it, which could be a performance problem. Now, if we could pass file descriptors between processes, that would make things easy. I think BSD can do it, but I don't believe it is portable. My idea would be:

    backend     1   2   3   4   5   6   7
    dirtied:    1   2   3   4   5   6   7
    fsync'ed:

Each backend sets its 'dirtied' bit when it modifies a relation. Every 5 seconds, the postmaster scans the dirtied list and sends a signal to each backend that has dirtied. Each backend fsyncs its relations, then sets its fsync'ed bit. When all have signaled fsynced, the postmaster can update pg_log on disk. Another issue is that now that we update the transaction status as part of SELECT, pg_log is not the only representation of committed status. Of course, we have to prevent flushing of pg_log by the OS, perhaps by making a copy of the last two pages of pg_log beforehand and removing it afterwards. If a backend starts up and sees that pg_log copy file, it puts it in place of the current last two pages of pg_log. Also, for 6.6, I am going to add system table indexes so all cache lookups use indexes. I am unsure what a shared catalog cache is going to do that the buffer cache doesn't already do. Perhaps if we just flushed the system table cache buffers less frequently, there would be no need for a shared system cache. Basically, this fsync() thing is killing performance, and I think we can come up with a smart solution if we discuss the options. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
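A shared-memory sketch of the bookkeeping Bruce describes, with all names hypothetical; the signal-safety problem is ignored here and picked up by Tom further down:

    #include <stdbool.h>
    #include <string.h>

    #define MAX_BACKENDS 64

    extern void signal_backend(int slot); /* hypothetical: tells backend
                                           * `slot` to fsync its dirty
                                           * relations, then set fsynced[] */
    extern void flush_pg_log(void);       /* hypothetical */

    /* One slot per backend; would live in shared memory. */
    typedef struct {
        volatile bool dirtied[MAX_BACKENDS];   /* backend wrote a relation */
        volatile bool fsynced[MAX_BACKENDS];   /* backend finished fsyncing */
    } SyncTable;

    /* Backend side: set our bit whenever we modify a relation. */
    void
    mark_dirty(SyncTable *st, int my_slot)
    {
        st->dirtied[my_slot] = true;
    }

    /* Postmaster side, every few seconds: ask dirty backends to fsync,
     * wait for all of them, then flush pg_log once for everybody. */
    void
    checkpoint_pass(SyncTable *st)
    {
        int i;

        for (i = 0; i < MAX_BACKENDS; i++)
            if (st->dirtied[i])
                signal_backend(i);

        for (i = 0; i < MAX_BACKENDS; i++)
            while (st->dirtied[i] && !st->fsynced[i])
                ;                          /* busy-wait, for illustration only */

        flush_pg_log();
        memset((void *) st->dirtied, 0, sizeof(st->dirtied));
        memset((void *) st->fsynced, 0, sizeof(st->fsynced));
    }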
> The major problem in this area is that with the given model > of telling which tuples are committed, no one can guarantee a > consistent PostgreSQL database in the case of a non-fsynced > crash. You might lose some tuples and might get some > outdated ones back. But which ones depends on subsequently executed > SELECTs, and it all doesn't have anything to do > with transaction boundaries or with the order in which > transactions committed. > > As I understand Oracle, the entire reliability depends on the > redo logs. If a crash is too bad, you can always restore > the last backup and recover from that. The database crash > recovery will roll forward until the last COMMIT that occurs > in the redo log (except for point-in-time recovery). > > Someone can live with the case that the last COMMITs > (sorted by time) cannot be recovered. But no one can live > with a database that's left corrupt. Yes, I 100% agree. We have to bring the database back to a consistent state where only the last few transactions are not done at all, and all previous ones are completely done. See my previous post on methods and issues. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote: > > update pg_log on disk. Another issue is that now that we update the > transaction status as part of SELECT, pg_log is not the only We shouldn't update pg_log for read-only xactions. > representation of committed status. > > Of course, we have to prevent flushing of pg_log by the OS, perhaps by making a > copy of the last two pages of pg_log beforehand and removing it afterwards. > If a backend starts up and sees that pg_log copy file, it puts it in > place of the current last two pages of pg_log. Keep the two last pg_log pages in shmem, lock them, copy, unlock, write the copy to pg_log. > Also, for 6.6, I am going to add system table indexes so all cache > lookups use indexes. I am unsure what a shared catalog cache is going to > do that the buffer cache doesn't already do. Perhaps if we just flushed the > system table cache buffers less frequently, there would be no need for a > shared system cache. I would like to see ntuples and npages in pg_class kept up-to-date. Now we do an fseek for each heap_insert and for each heap_beginscan. And note that we have to open() system relation files even if the pages are in the buffer pool. Vadim
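Vadim's lock/copy/unlock idea in miniature; the locking primitives, the page size, and the shmem layout are placeholders:

    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define LOG_PAGE_SIZE 8192

    extern void lock_log_pages(void);     /* placeholder lock primitives */
    extern void unlock_log_pages(void);

    /* Would live in shared memory: the two most recent pg_log pages. */
    static char shmem_log_pages[2][LOG_PAGE_SIZE];

    /* Snapshot the pages under the lock, then do the slow write() outside
     * it, so backends updating transaction status never block on I/O. */
    void
    flush_log_tail(int logfd, off_t offset)
    {
        char copy[2][LOG_PAGE_SIZE];

        lock_log_pages();
        memcpy(copy, shmem_log_pages, sizeof(copy));
        unlock_log_pages();

        pwrite(logfd, copy, sizeof(copy), offset);
    }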
> Bruce Momjian wrote: > > > > update pg_log on disk. Another issue is that now that we update the > > transaction status as part of SELECT, pg_log is not the only > > We shouldn't update pg_log for read-only xactions. No, I was saying we mark those SELECTed rows as being part of committed transactions. When we SELECT a row, we look at pg_log to see if it is committed, and mark that row as part of a committed transaction so we don't have to check pg_log again. We can't do that with the system we are envisioning until we put pg_log on disk as a committed transaction. Could be tricky, though: having two copies of pg_log in memory, one disk copy and one active copy, and using the disk copy for row xact status updates, would do the trick. > > > representation of committed status. > > > > Of course, we have to prevent flushing of pg_log by the OS, perhaps by making a > > copy of the last two pages of pg_log beforehand and removing it afterwards. > > If a backend starts up and sees that pg_log copy file, it puts it in > > place of the current last two pages of pg_log. > > Keep the two last pg_log pages in shmem, lock them, copy, unlock, > write the copy to pg_log. Yes, much better. Control what gets to disk by not updating the file at all. > > > Also, for 6.6, I am going to add system table indexes so all cache > > lookups use indexes. I am unsure what a shared catalog cache is going to > > do that the buffer cache doesn't already do. Perhaps if we just flushed the > > system table cache buffers less frequently, there would be no need for a > > shared system cache. > > I would like to see ntuples and npages in pg_class kept up-to-date. > Now we do an fseek for each heap_insert and for each heap_beginscan. > And note that we have to open() system relation files even > if the pages are in the buffer pool. Why do we have to open system tables if they are already in the buffer cache? I guess in case we need to write them out, or fault in another page. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote: > > > > > > Also, for 6.6, I am going to add system table indexes so all cache > > > lookups use indexes. I am unsure what a shared catalog cache is going to > > > do that the buffer cache doesn't already do. Perhaps if we just flushed the > > > system table cache buffers less frequently, there would be no need for a > > > shared system cache. > > > > I would like to see ntuples and npages in pg_class kept up-to-date. > > Now we do an fseek for each heap_insert and for each heap_beginscan. > > And note that we have to open() system relation files even > > if the pages are in the buffer pool. > > Why do we have to open system tables if they are already in the buffer cache? I > guess in case we need to write them out, or fault in another page. Just because of ... heap_open()->RelationBuildDesc() does it. Maybe we could delay smgropen? But in any case, note that the big guys have a shared catalog cache, and this is not because they don't have a good buffer pool -:) Keeping a page in the pool for just a single row is not good. "Oracle itself accesses the data dictionary frequently during the parsing of SQL statements. This access is essential to the continuing operation of Oracle. See Chapter 8, "The Data Dictionary," for more information on the data dictionary. ... Caching of the Data Dictionary for Fast Access Because Oracle constantly accesses the data dictionary during database operation to validate user access and to verify the state of objects, much of the data dictionary information is cached in the SGA. All information is stored in memory using the LRU (least recently used) algorithm. Information typically kept in the caches is that required for parsing." Vadim
> Just because of ... heap_open()->RelationBuildDesc() does it. > Maybe we could delay smgropen? > > But in any case, note that the big guys have a shared catalog cache, > and this is not because they don't have a good buffer pool -:) > Keeping a page in the pool for just a single row is not good. > > "Oracle itself accesses the data dictionary frequently during > the parsing of SQL statements. This access is essential to the > continuing operation of Oracle. See Chapter 8, "The Data Dictionary," > for more information on the data dictionary. > > ... > > Caching of the Data Dictionary for Fast Access > > Because Oracle constantly accesses the data dictionary during database > operation to validate user access and to verify the state of objects, > much of the data dictionary information is cached in the SGA. All > information is stored in memory using the LRU (least recently > used) algorithm. Information typically kept in the caches is that > required for parsing." I agree we need it. I just think a better fsync buys us more, and seeing how hard a shared catalog cache may be, it may be good to get fsync faster first. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
My apologies for the apparent triple posting of my earlier message regarding stored procedures. The mailing list sent me mail notifying me that the message had bounced because I was not subscribed, so I tried to subscribe and then sent it again. Somehow, all three eventually got through. - K Kristofer Munn * http://www.munn.com/~kmunn/ * ICQ# 352499 * AIM: KrMunn
"Mark Hollomon" <mhh@nortelnetworks.com> writes: > I've been looking for a project to get me active in the postgresql > community after lurking since before it was (officially) PostgreSQL. > > I will do the PL/Perl interface. Great! Glad to hear it. regards, tom lane
Bruce Momjian <maillist@candle.pha.pa.us> writes: > ... Another idea > is to send a signal to each backend that has marked a bit in shared > memory saying it has written to a relation, and have the signal handler > fsync all its dirty relations, set a finished bit, and have the > postmaster then fsync pg_log. I do not think it's practical to expect any useful work to happen inside a signal handler. The signal could come at any moment, such as when data structures are being updated and are in a transient invalid state. Unless you are willing to do a lot of fooling around with blocking & unblocking the signal, about all the handler can safely do is set a flag variable that will be examined somewhere in the backend main loop. However, if enough information is available in shared memory, perhaps the postmaster could do this scan/update/flush all by itself? > Of course, we have to prevent flushing of pg_log by the OS, perhaps by making a > copy of the last two pages of pg_log beforehand and removing it afterwards. > If a backend starts up and sees that pg_log copy file, it puts it in > place of the current last two pages of pg_log. It seems to me that one or so disk writes per transaction is not all that big a cost. Does it take much more than one write to update pg_log, and if so, why? regards, tom lane
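The safe pattern Tom describes is the classic one: the handler only sets a flag of type sig_atomic_t, and the real work happens at a well-defined point in the main loop. A minimal sketch (the signal number and the loop body are illustrative):

    #include <signal.h>

    static volatile sig_atomic_t fsync_requested = 0;

    /* About the only thing a handler can safely do: set a flag. */
    static void
    fsync_request_handler(int signo)
    {
        (void) signo;
        fsync_requested = 1;
    }

    int
    main(void)
    {
        signal(SIGUSR2, fsync_request_handler);

        for (;;)                        /* backend main loop */
        {
            /* ... read and execute one command ... */

            if (fsync_requested)        /* examined at a safe point */
            {
                fsync_requested = 0;
                /* fsync dirty relations here, then set the finished bit */
            }
        }
    }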
Tom Lane wrote: > > "Mark Hollomon" <mhh@nortelnetworks.com> writes: > > I've been looking for a project to get me active in the postgresql > > community after lurking since before it was (officially) PostgreSQL. > > > > I will do the PL/Perl interface. > > Great! Glad to hear it. And me! Vadim
Mark Hollomon wrote: > > Jan Wieck wrote: > > > > > > > > It would be cool to have a Perl interface to postgres internals! > > > Unfortunately I'm not a C programmer. I think Edmund could do this. > > > > Now I've got a sequence overflow counting the requests for > > something like PL/Perl! > > > [personal view on perl and tcl deleted] Really sorry for that. It's not my favorite behaviour to talk dirty about something I don't know. But after asking kindly several times, I thought making someone angry could work - and it did :-) > > Jan: > > I've been looking for a project to get me active in the postgresql > community after lurking since before it was (officially) PostgreSQL. > > I will do the PL/Perl interface. That's a word! I'll contact you by private mail. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
According to Vadim Mikheev: > > "Mark Hollomon" <mhh@nortelnetworks.com> writes: > > > I've been looking for a project to get me active in the postgresql > > > community after lurking since before it was (officially) PostgreSQL. > > > > > > I will do the PL/Perl interface. > > > > Great! Glad to hear it. > > And me! It would be really nice if the client/server interface could be fixed up to remove the tuple size limits by the time the embedded perl interface is added. I think preloading some perl functions to get arbitrarily processed output back from a select would be handy for a lot of uses, and even better if we didn't have to worry about the size of the returned "record". Les Mikesell les@mcs.com
Bruce Momjian <maillist@candle.pha.pa.us> writes: > ... Another idea > is to send a signal to each backend that has marked a bit in shared > memory saying it has written to a relation, and have the signal handler > fsync all its dirty relations, set a finished bit, and have the > postmaster then fsync pg_log. One other problem with signals is that things get complicated if PostgreSQL ever moves to a multi-threading model. -- ===================================================================== | JAVA must have been developed in the wilds of West Virginia. | | After all, why else would it support only single inheritance?? | ===================================================================== | Finger geek@cmu.edu for my public key. | =====================================================================
> Really sorry for that. It's not my favorite behaviour to talk > dirty about something I don't know. But after asking kindly > several times, I thought making someone angry could work - and > it did :-) I've never seen your post before. But then again, I've not been on hackers for very long.
> If there's (only one) Perl PROGRAMMER out in the world > reading this, who does see a (however small) possibility to > explain how to integrate a Perl interpreter into PostgreSQL, > RESPOND!!!!!!!!!!! Well, I'm a Perl programmer. I don't have a clue about how to integrate Perl into PostgreSQL, but it should be possible. I'd like to help. I know Perl, but nothing about the internals of PostgreSQL, and I don't code C. > Maybe Perl isn't the scripting language someone should choose > because it is too limited in its capabilities - remember Too limited? Perl? You're joking. > that real programmers don't use pascal... - maybe real > programmers wouldn't ever use Perl... Then I'm no real programmer. But then again, I program in any language that is needed.
On 07-Jun-99 Kaare Rasmussen wrote: >> If there's (only one) Perl PROGRAMMER out in the world >> reading this, who does see a (however small) possibility to >> explain how to integrate a Perl interpreter into PostgreSQL, >> RESPOND!!!!!!!!!!! Easy. See attachment. --- Dmitry Samersoff, dms@wplus.net, ICQ:3161705 http://devnull.wplus.net * There will come soft rains ...
Dmitry Samersoff wrote: > On 07-Jun-99 Kaare Rasmussen wrote: > >> If there's (only one) Perl PROGRAMMER out in the world > >> reading this, who does see a (however small) possibility to > >> explain how to integrate a Perl interpreter into PostgreSQL, > >> RESPOND!!!!!!!!!!! > > Easy. > See attachment. > > Content-Disposition: attachment; filename="loadmail.pl" Dmitry, it's well known that a Perl script can access a PostgreSQL database. But that's not the thing we're looking for. For building PL/Perl, the Perl INTERPRETER must be INSIDE the backend. Only if it is part of the PostgreSQL backend can it have access to the SPI. It will not work to spawn off an external interpreter which then contacts the database back via Pg. Thus, there must be some way to link a shared object against the Perl libraries, and at the time the PostgreSQL database backend loads our shared object, to call functions in the Perl library. The attachment you've sent is simply a Perl script that does some db access. Nice, but not the point. Please show us how easy it is to do what we want. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
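For the record, the embedding side of Perl is small and follows the documented perlembed recipe; the hard part is the glue to the fmgr and SPI, which this sketch deliberately leaves out (function names other than the perl_* API calls are illustrative):

    #include <EXTERN.h>
    #include <perl.h>

    static PerlInterpreter *plperl_interp;

    /* Start one interpreter inside the backend, once, when the PL
     * handler is first loaded. */
    void
    plperl_init(void)
    {
        char *embedding[] = { "", "-e", "0" };   /* no script file: we feed
                                                  * it code strings */

        plperl_interp = perl_alloc();
        perl_construct(plperl_interp);
        perl_parse(plperl_interp, NULL, 3, embedding, NULL);
        perl_run(plperl_interp);
    }

    /* Evaluate one function body.  A real PL/Perl handler would wrap
     * the source in a sub, pass the arguments, and hand results back
     * through the fmgr/SPI -- that glue is the actual work. */
    void
    plperl_eval(char *src)
    {
        perl_eval_pv(src, TRUE);    /* TRUE: croak on error */
    }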
Kaare Rasmussen wrote: > > > If there's (only one) Perl PROGRAMMER out in the world > > reading this, who does see a (however small) possibility to > > explain how to integrate a Perl interpreter into PostgreSQL, > > RESPOND!!!!!!!!!!! > > Well, I'm a Perl programmer. I don't have a clue about how to integrate > Perl into PostgreSQL, but it should be possible. > > I'd like to help. I know Perl, but nothing about the internals of > PostgreSQL, and I don't code C. That's the point, and why I wrote PROGRAMMER in capitals (REAL PROGRAMMERS STILL THINK AND TALK IN CAPITALS). It's not Perl script-writing that's needed at this stage. We need programmers who are familiar with the C API of Perl and could easily write things like the Pg package. And even that's still not enough knowledge. > > > Maybe Perl isn't the scripting language someone should choose > > because it is too limited in its capabilities - remember > > Too limited? Perl? You're joking. No, I wasn't joking. Up to now there's only one person who said that what we need is possible, but that it wouldn't be as easy as it should be. I wasn't talking about how powerful the Perl language is. And I know that it's easy to integrate anything into Perl. But after all, it's the Perl script that has the entire control. This time, the Perl interpreter has to become a silly little working slave. Being quiet until it's called, and quiet again after having served one function call, until the big master PostgreSQL calls it again. This flexibility requires a really good design of the interpreter's internals. And that's what I'm addressing here. > > > that real programmers don't use pascal... - maybe real > > programmers wouldn't ever use Perl... > > Then I'm no real programmer. But then again, I program in any language > that is needed. You aren't - you're a script writer, and that's today's quiche eater :-) The term "Real Programmer" is something any hacker should know! From the top of the article "Real Programmers Don't Use Pascal": <<Back in the good old days -- the "Golden Era" of computers, it was easy to separate the men from the boys (sometimes called "Real Men" and "Quiche Eaters" in the literature). During this period, the Real Men were the ones that understood computer programming, and the Quiche Eaters were the ones that didn't. ...>> Take a look at http://burks.bton.ac.uk/burks/foldoc/33/86.htm and follow the links - enjoy. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
Hello! On Tue, 8 Jun 1999, Jan Wieck wrote: > This time, the Perl interpreter has to become a silly little > working slave. Being quiet until it's called, and quiet > again after having served one function call, until the big > master PostgreSQL calls it again. > > This flexibility requires a really good design of the > interpreter's internals. And that's what I'm addressing here. I know exactly 1 (one) program that incorporates (embeds) a Perl interpreter - it is the editor VIM (the well-known vi clone from www.vim.org). I think anyone who wants to learn how to embed perl may start by looking into the vim sources. Once I tried to compile vim+perl myself, but perl didn't work. I am a perl-hater, so this was probably the reason. Another example - mod_perl for Apache - is a rather bad one, as mod_perl is too big, overcomplicated and too perlish :) VIM can also be compiled with a builtin Python interpreter, and I had no problem compiling and using vim+python. Python is well known for its extending and embedding capabilities. mod_python (it is called PyApache) is a small and elegant example of how to embed python, but of course it is not as powerful as mod_perl (one cannot touch Apache internals from mod_python, though the author led PyApache development this way). Yes, I am biased toward Python, but I cannot say "I recommend embedding Python to construct PL/Python" - I have no time to lead the development, and I doubt there are many pythoners here (D'Arcy?). > Jan > > -- > > #======================================================================# > # It's easier to get forgiveness for being wrong than for being right. # > # Let's break this rule - forgive me. # > #========================================= wieck@debis.com (Jan Wieck) # Oleg. ---- Oleg Broytmann http://members.xoom.com/phd2/ phd2@earthling.net Programmers don't die, they just GOSUB without RETURN.
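By comparison, Python's embedding handshake really is compact, per its standard embedding documentation (version details and all SPI glue aside; the print string is obviously illustrative):

    #include <Python.h>

    /* The whole embedding lifecycle: initialize once, run code,
     * finalize at backend exit.  A PL/Python handler would build and
     * call function objects instead of running bare strings, but the
     * lifecycle is this simple. */
    void
    plpython_demo(void)
    {
        Py_Initialize();
        PyRun_SimpleString("print 'hello from inside the backend'");
        Py_Finalize();
    }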
On 08-Jun-99 Jan Wieck wrote: > Kaare Rasmussen wrote: > >> >> > If there's (only one) Perl PROGRAMMER out in the world >> > reading this, who does see a (however small) possibility to >> > explain how to integrate a Perl interpreter into PostgreSQL, >> > RESPOND!!!!!!!!!!! >> >> Well, I'm a Perl programmer. I don't have a clue about how to integrate >> Perl into PostgreSQL, but it should be possible. >> >> I'd like to help. I know Perl, but nothing about the internals of >> PostgreSQL, and I don't code C. > > That's the point, and why I wrote PROGRAMMER in capitals (REAL > PROGRAMMERS STILL THINK AND TALK IN CAPITALS). It's not Perl > script-writing that's needed at this stage. We need > programmers who are familiar with the C API of Perl and > could easily write things like the Pg package. And even that's > still not enough knowledge. Ok! I have no problem writing a C package like Pg.pm for use with Perl 5.x. However, IMHO all tasks requiring such packages can be done better using C++ and STL. --- Dmitry Samersoff, dms@wplus.net, ICQ:3161705 http://devnull.wplus.net * There will come soft rains ...
Oleg Broytmann wrote: > > Hello! > > VIM can also be compiled with a builtin Python interpreter, and I had no > problem compiling and using vim+python. Python is well known for its > extending and embedding capabilities. mod_python (it is called PyApache) is > a small and elegant example of how to embed python, but of course it is not > as powerful as mod_perl (one cannot touch Apache internals from mod_python, > though the author led PyApache development this way). Actually, about 1.5 years ago it used to allow access to the internals, but seemingly nobody used it, and so it was thrown out in later versions. > Yes, I am biased toward Python, but I cannot say "I recommend embedding > Python to construct PL/Python" - I have no time to lead the development, > and I doubt there are many pythoners here (D'Arcy?). I have contemplated it several times, but to be really useful we would first need a nice interface for returning "tables" from PL functions. I suspect this is not something trivial to add? With it I could use PL/Python to make all kinds of external objects - mailboxes (both local and POP/IMAP/NNTP), conf files (/etc/passwd, pg_hba.conf), DNS/LDAP/... queries, or any other nice things available through existing python modules - available to postgres queries. ----------------- Hannu
I re-read Real Programmers Don't Use Pascal ---- it beats writing this damned proposal I'm working on. If the line about Real Programmers using goto's is anything to go by, then you should nominate Vadim as the v6.5 Real Programmer. You should _see_ all those gotos in nbtree.c! He's done a fine job with them, too. Bernie
> > I re-read Real Programmers Don't Use Pascal ---- it beats writing > this damned proposal I'm working on. If the line about Real Programmers > using goto's is anything to go by, then you should nominate Vadim as the > v6.5 Real Programmer. You should _see_ all those gotos in nbtree.c! > He's done a fine job with them, too. Vadim is surely one of the real programmers in our project. It's not only that he isn't afraid of using GOTOs. He also knows very well how to speed things up by complicating the code :-) Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
> Perl language is. And I know that it's easy to integrate > anything into Perl. But after all, it's the Perl script that > has the entire control. I don't think so, but then again I'm only speculating. If the Apache people can embed an entire Perl interpreter in their Web server, shouldn't it be possible for PostgreSQL? Or maybe the Apache people are REALLY REAL PROGRAMMERS? :-) > This flexibility requires a really good design of the > interpreter's internals. And that's what I'm addressing here. As I said, I don't code C. I haven't got the time to learn it right now, nor the time to learn PostgreSQL's internals. If my offer to help with any Perl question is below your standards, I'm sorry. > You aren't - you're a script writer, and that's today's quiche > eater :-) That remark shows that you know nothing about Perl. But it's okay; be ignorant in your own little way ;-] Btw. How do you define script writing as opposed to programming?
Kaare Rasmussen wrote: > shouldn't it be possible for PostgreSQL? Or maybe the Apache people are > REALLY REAL PROGRAMMERS? :-) There are some. > > You aren't - you're a script writer and that's today's quiche > > eater :-) > > That remark shows that you know nothing about Perl. But it's okay; be > ignorant in your own little way ;-] Kaare, I would never really flame on a list like this. And I personally prefer scripts wherever possible. Sometimes I can't resist writing some humor - it's just that my kind of humor is a little hard to understand. But real programmers don't care if other humans understand them, as long as their computers do. But even then, they have a programming problem, and totally don't care any more about humans and their communication problems. > > Btw., how do you define script writing as opposed to programming?

#define SCRIPT_WRITING "modern form of quiche eating"
#define PROGRAMMING (((READABILITY_OF_WRITTEN_CODING <= 0 && \
                       ABLE_TO_WRITE_FORTRAN_STYLE_IN_C > 4) || \
                      USES_GOTO_EXCESSIVELY) ? TRUE : FALSE)

Disclaimer: This entire message isn't serious! Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
Jan Wieck wrote: > > > > > I re-read "Real Programmers Don't Use Pascal" ---- it beats writing > > this damned proposal I'm working on. If the line about Real Programmers > > using gotos is anything to go by, then you should nominate Vadim as the > > v6.5 Real Programmer. You should _see_ all those gotos in nbtree.c! > > He's done a fine job with them too. Unfortunately, I use loops in 9 cases out of 10, so it seems I have no chance to win the nomination -:( Though, you should _see_ the gotos in heapam.c - maybe there is still some chance for me. -:) Actually, I just don't think that breaks in loops are always better than gotos. > Vadim is surely one of the real programmers in our project. > It's not only that he isn't afraid of using GOTOs. He also Like someone who wasn't afraid to use siglongjmp in elog.c. > knows very well how to speed things up by complicating the > code :-) -:) Vadim
> I would never really flame on a list like this. And I Who's flaming? I'm just tickling your funny bone. Maybe you'll end up believing yourself if nobody tells you otherwise ;-}
> Actually, I just don't think that breaks in loops are always better > than gotos. > > > Vadim is surely one of the real programmers in our project. > > It's not only that he isn't afraid of using GOTOs. He also > > Like someone who wasn't afraid to use siglongjmp in elog.c. There are much better ones in the PL handlers! memcpy()s mangling sigjmp_bufs between the sigsetjmp()/siglongjmp() stuff. > > > knows very well how to speed things up by complicating the > > code :-) > > -:) > > Vadim > Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
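The trick being joked about is real, and small enough to show. The pattern: a nested handler memcpy()s the active sigjmp_buf aside, installs its own with sigsetjmp() so it can catch an elog(ERROR)-style siglongjmp(), does its cleanup, then restores the saved buffer and re-throws. A self-contained sketch - the names Warn_restart and fake_elog_error mirror the backend's convention, but everything here is hypothetical:

    #include <setjmp.h>
    #include <stdio.h>
    #include <string.h>

    static sigjmp_buf Warn_restart;     /* where an "elog(ERROR)" jumps to */

    static void
    fake_elog_error(const char *msg)
    {
        fprintf(stderr, "ERROR: %s\n", msg);
        siglongjmp(Warn_restart, 1);
    }

    static void
    call_with_local_handler(void (*fn)(void))
    {
        sigjmp_buf save_restart;

        /* save the caller's jump target aside */
        memcpy(&save_restart, &Warn_restart, sizeof(save_restart));

        if (sigsetjmp(Warn_restart, 1) != 0)
        {
            /* an error longjmp'ed back here: clean up, restore, re-throw */
            memcpy(&Warn_restart, &save_restart, sizeof(save_restart));
            fprintf(stderr, "local cleanup done, propagating error\n");
            siglongjmp(Warn_restart, 1);
        }

        fn();

        /* normal exit: put the caller's jump target back */
        memcpy(&Warn_restart, &save_restart, sizeof(save_restart));
    }

    static void
    failing_fn(void)
    {
        fake_elog_error("something broke");
    }

    int
    main(void)
    {
        if (sigsetjmp(Warn_restart, 1) != 0)
        {
            fprintf(stderr, "top level caught the error\n");
            return 1;
        }
        call_with_local_handler(failing_fn);
        return 0;
    }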
Jan Wieck wrote: > > > Actually, I just don't think that breaks in loops are always better > > than gotos. > > > > > Vadim is surely one of the real programmers in our project. > > > It's not only that he isn't afraid of using GOTOs. He also > > > > Like someone who wasn't afraid to use siglongjmp in elog.c. > > There are much better ones in the PL handlers! memcpy()s > mangling sigjmp_bufs between the sigsetjmp()/siglongjmp() stuff. Wow! Voodoo! I really like such things -:) This is truly the way Real Programmers follow -:) Vadim
Hey, why don't you just overwrite the jmp instruction with a nop.... On Thu, 10 Jun 1999, Vadim Mikheev wrote: > Jan Wieck wrote: > > > > > Actually, I just don't think that breaks in loops are always better > > > than gotos. > > > > > > > Vadim is surely one of the real programmers in our project. > > > > It's not only that he isn't afraid of using GOTOs. He also > > > > > > Like someone who wasn't afraid to use siglongjmp in elog.c. > > > > There are much better ones in the PL handlers! memcpy()s > > mangling sigjmp_bufs between the sigsetjmp()/siglongjmp() stuff. > > Wow! Voodoo! > I really like such things -:) > This is truly the way Real Programmers follow -:) > > Vadim > A.J. (james@fsck.co.uk) Ignorance is not knowing. Stupidity is the active pursuit of ignorance.
> > > Hey, why don't you just overwrite the jmp instruction with a nop.... > Hmmmm - this would require that the code segment is writable, which it isn't on most modern systems. But the shared objects are usually compiled with -fPIC (position independent code), so it should be possible to copy the code segment part of the PL handlers into a malloc()'ed area to get it into writable memory and execute it there through function pointers... Nice idea, we'll try it with the upcoming PL/Perl handler. On second thought, there may be another tricky way to handle it all. Copy the entire Perl interpreter into malloc()'ed memory and modify its calls to malloc() and free(), redirecting them to private ones. Then we have total control over its allocations, can create an image copy of it into another area after every few successful calls, and in the case of a transaction abort reset it to the last valid state by restoring the copy. On third thought, we could also do it the Microsoft way. Hook into the kernel's virtual memory control and trace every first write operation into a page. At that time we copy the old page's state to somewhere else. This will save some allocated memory, because we only need restorable copies of the pages modified since the last save cycle. It requires hacking down ways to get around access restrictions so the postmaster is able to patch the OS kernel at startup (only requires root permissions so /dev/kmem can get opened for writing), but since this is definitely the best way to do it, it's worth the effort. The result of this work will then become the base for more changes. If the postmaster is already patching the kernel, it can also take over process scheduling to optimize the system for PostgreSQL performance, and we could get rid of these damned SYSV IPC semaphores. Finally the postmaster will control a new type of block cache, mapping parts of the relations into virtual memory pages of the backends on demand, avoiding SYSV shared memory too. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
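Buried in the satire, the "second thought" is a recognizable technique: confine all of the interpreter's state to one private arena, snapshot the arena after each successful call, and copy the snapshot back on transaction abort. A toy sketch - the arena size and the hook names are invented purely for illustration:

    #include <stdlib.h>
    #include <string.h>

    #define ARENA_SIZE (4 * 1024 * 1024)    /* arbitrary toy size */

    static char *arena;        /* all interpreter allocations live in here */
    static char *snapshot;     /* last known-good image of the arena */

    void
    arena_init(void)
    {
        arena = calloc(1, ARENA_SIZE);
        snapshot = calloc(1, ARENA_SIZE);
        memcpy(snapshot, arena, ARENA_SIZE);    /* baseline image */
    }

    void
    arena_commit(void)    /* after each successful call into the interpreter */
    {
        memcpy(snapshot, arena, ARENA_SIZE);
    }

    void
    arena_abort(void)     /* on transaction abort: roll the arena back */
    {
        memcpy(arena, snapshot, ARENA_SIZE);
    }

The catch, of course, is that anything pointing into the arena from outside survives the rollback and may now dangle - one of several reasons the idea stays a thought experiment.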
Then again, I never coded assembler on a modern system...... it was fun, though.... Cheers for a great database! If you have to delay 6.5 longer, do it... it's better to have something stable. James On Thu, 10 Jun 1999, Jan Wieck wrote: > > > > > > Hey, why don't you just overwrite the jmp instruction with a nop.... > > > > Hmmmm - this would require that the code segment is writable, > which it isn't on most modern systems. > > But the shared objects are usually compiled with -fPIC > (position independent code), so it should be possible to copy > the code segment part of the PL handlers into a malloc()'ed > area to get it into writable memory and execute it there through > function pointers... > > Nice idea, we'll try it with the upcoming PL/Perl handler. > > On second thought, there may be another tricky way to handle > it all. Copy the entire Perl interpreter into > malloc()'ed memory and modify its calls to malloc() and free(), > redirecting them to private ones. Then we have total control > over its allocations, can create an image copy of it into > another area after every few successful calls, and in the case > of a transaction abort reset it to the last valid state by > restoring the copy. > > On third thought, we could also do it the Microsoft way. Hook > into the kernel's virtual memory control and trace every > first write operation into a page. At that time we copy the > old page's state to somewhere else. This will save some > allocated memory, because we only need restorable copies of > the pages modified since the last save cycle. It requires > hacking down ways to get around access restrictions so the > postmaster is able to patch the OS kernel at startup (only > requires root permissions so /dev/kmem can get opened for > writing), but since this is definitely the best way to do it, > it's worth the effort. > > The result of this work will then become the base for more > changes. If the postmaster is already patching the kernel, > it can also take over process scheduling to optimize the > system for PostgreSQL performance, and we could get rid of > these damned SYSV IPC semaphores. Finally the postmaster will > control a new type of block cache, mapping parts of the > relations into virtual memory pages of the backends on demand, > avoiding SYSV shared memory too. > > > Jan > > -- > > #======================================================================# > # It's easier to get forgiveness for being wrong than for being right. # > # Let's break this rule - forgive me. # > #========================================= wieck@debis.com (Jan Wieck) # > > A.J. (james@fsck.co.uk) Ignorance is not knowing. Stupidity is the active pursuit of ignorance.