Thread: Status of 7.2
OK, 7.2 is looking _very_ good. We have very few open items. They are: Source Code Changes-------------------Compile in syslog feature by default? (Peter, Tom)AIX compile (Tatsuo)Libpq++ compileon Solaris (Peter)Documentation Changes--------------------- The always-updated list is at: ftp://candle.pha.pa.us/pub/postgresql/open_items. I also have created a post-7.2 list of items that are either patches that need to be applied or discussed for 7.3. That is at: http://candle.pha.pa.us/cgi-bin/pgpatches2 This list is longer than usual. Seems we have quite a number of things in-progress that can be worked on as soon as 7.2 is complete. If there are things there than can be decided now, please dig in and send an email to the hackers list. Once we start 7.3, I will use that list to request patches to complete these items. Because we are done development on 7.2, people can start working on patches now. If you send them to the lists, I will load them up on the page and apply them as soon as 7.3 starts. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Thought you might like to know that I should be able to upload regression test reports for: IRIX 6.5 FreeBSD 4.4 on Intel FreeBSD 4.4 on Alpha VMS on Alpha For 7.2b2 when it's available. Is Postgres supported on all these platforms? Chris -----Original Message----- From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Bruce Momjian Sent: Wednesday, 7 November 2001 11:45 AM To: PostgreSQL-development Subject: [HACKERS] Status of 7.2 OK, 7.2 is looking _very_ good. We have very few open items. They are: Source Code Changes-------------------Compile in syslog feature by default? (Peter, Tom)AIX compile (Tatsuo)Libpq++ compileon Solaris (Peter) Documentation Changes--------------------- The always-updated list is at: ftp://candle.pha.pa.us/pub/postgresql/open_items. I also have created a post-7.2 list of items that are either patches that need to be applied or discussed for 7.3. That is at: http://candle.pha.pa.us/cgi-bin/pgpatches2 This list is longer than usual. Seems we have quite a number of things in-progress that can be worked on as soon as 7.2 is complete. If there are things there than can be decided now, please dig in and send an email to the hackers list. Once we start 7.3, I will use that list to request patches to complete these items. Because we are done development on 7.2, people can start working on patches now. If you send them to the lists, I will load them up on the page and apply them as soon as 7.3 starts. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026 ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
I'll be announcing v7.2b2 tomorrow afternoon ... its packaged and ready to go, ifyou want to get a head start (ftp.postgresql.org), but am giving a bit of time for mirrors to catch up ... On Wed, 7 Nov 2001, Christopher Kings-Lynne wrote: > Thought you might like to know that I should be able to upload regression > test reports for: > > IRIX 6.5 > FreeBSD 4.4 on Intel > FreeBSD 4.4 on Alpha > VMS on Alpha > > For 7.2b2 when it's available. Is Postgres supported on all these > platforms? > > Chris > > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Bruce Momjian > Sent: Wednesday, 7 November 2001 11:45 AM > To: PostgreSQL-development > Subject: [HACKERS] Status of 7.2 > > > OK, 7.2 is looking _very_ good. We have very few open items. They are: > > Source Code Changes > ------------------- > Compile in syslog feature by default? (Peter, Tom) > AIX compile (Tatsuo) > Libpq++ compile on Solaris (Peter) > > Documentation Changes > --------------------- > > The always-updated list is at: > > ftp://candle.pha.pa.us/pub/postgresql/open_items. > > I also have created a post-7.2 list of items that are either patches > that need to be applied or discussed for 7.3. That is at: > > http://candle.pha.pa.us/cgi-bin/pgpatches2 > > This list is longer than usual. Seems we have quite a number of things > in-progress that can be worked on as soon as 7.2 is complete. If there > are things there than can be decided now, please dig in and send an > email to the hackers list. > > Once we start 7.3, I will use that list to request patches to complete > these items. Because we are done development on 7.2, people can start > working on patches now. If you send them to the lists, I will load them > up on the page and apply them as soon as 7.3 starts. > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
Is there any list of changes between 7.1.3 and 7.2b2 available? -- Turbo __ _ Debian GNU Unix _IS_ user friendly - it's just ^^^^^ / /(_)_ __ _ ___ __ selective aboutwho its friends are / / | | '_ \| | | \ \/ / Debian Certified Linux Developer _ /// / /__| | | | | |_| |> < Turbo Fredriksson turbo@tripnet.se \\\/ \____/_|_| |_|\__,_/_/\_\ Stockholm/Sweden security counter-intelligence [Hello to all my fans in domestic surveillance] Soviet Legion of Doom South Africa SEAL Team 6 subway iodine $400 million in gold bullion Ft. Meade Delta Force killed attack Waco, Texas [See http://www.aclu.org/echelonwatch/index.html for more about this]
On Tue, 6 Nov 2001, Bruce Momjian wrote: > I also have created a post-7.2 list of items that are either patches > that need to be applied or discussed for 7.3. That is at: > > http://candle.pha.pa.us/cgi-bin/pgpatches2 > > This list is longer than usual. Seems we have quite a number of things > in-progress that can be worked on as soon as 7.2 is complete. If there > are things there than can be decided now, please dig in and send an > email to the hackers list. > > Once we start 7.3, I will use that list to request patches to complete > these items. Because we are done development on 7.2, people can start > working on patches now. If you send them to the lists, I will load them > up on the page and apply them as soon as 7.3 starts. Sorry, I´m really unable to send patches but I have a feature request which was addressed in the thread "Serious performance problem" on this list. It mainly concerns the performance increase if there would be an index scan method which doesn´t have to check the validity of data in the table. I´m just waiting for a statement from you guys if you think it will be doable in 7.3 (while now started to optimize my database as you suggested ;-).) I think this would increase acceptance of PostgreSQL for certain people here in Germany which have real influence on decisions about database in medical diagnostics and care in Germany. Kind regards Andreas.
Bruce Momjian wrote: > OK, 7.2 is looking _very_ good. We have very few open items. They are: > > Source Code Changes > ------------------- > Compile in syslog feature by default? (Peter, Tom) > AIX compile (Tatsuo) > Libpq++ compile on Solaris (Peter) > > Documentation Changes > --------------------- > > The always-updated list is at: > > ftp://candle.pha.pa.us/pub/postgresql/open_items. > > I also have created a post-7.2 list of items that are either patches > that need to be applied or discussed for 7.3. That is at: > > http://candle.pha.pa.us/cgi-bin/pgpatches2 > > This list is longer than usual. Seems we have quite a number of things > in-progress that can be worked on as soon as 7.2 is complete. If there > are things there than can be decided now, please dig in and send an > email to the hackers list. I would suggest to schedule my patch (the last on the list) for 7.2 since it finishes the work I began for 7.2. Since some patches (part of the work/redesign) are in but the last two are yet unapplied (IIRC Michael is really busy at the moment), I'd vote for not leaving this work half-done. Christof
> Is there any list of changes between 7.1.3 and 7.2b2 available? > Sure see /HISTORY in the source tarball. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> > This list is longer than usual. Seems we have quite a number of things > > in-progress that can be worked on as soon as 7.2 is complete. If there > > are things there than can be decided now, please dig in and send an > > email to the hackers list. > > I would suggest to schedule my patch (the last on the list) for 7.2 since it > finishes the work I began for 7.2. > Since some patches (part of the work/redesign) are in but the last two are > yet unapplied (IIRC Michael is really busy at the moment), I'd vote for not > leaving this work half-done. OK, this is for ecpg. If you can get an OK from Michael, I will be glad to apply them. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
>> I would suggest to schedule my patch (the last on the list) for 7.2 since it >> finishes the work I began for 7.2. >> Since some patches (part of the work/redesign) are in but the last two are >> yet unapplied (IIRC Michael is really busy at the moment), I'd vote for not >> leaving this work half-done. > OK, this is for ecpg. If you can get an OK from Michael, I will be glad > to apply them. More to the point, I don't think it's core's business to overrule Michael's technical decisions about ecpg. If he thinks the patch is okay, but hasn't time to apply it, then we can do that for him. But we won't apply it without his review and okay. regards, tom lane
On Wed, 7 Nov 2001, Tille, Andreas wrote: > Sorry, I´m really unable to send patches but I have a feature request > which was addressed in the thread "Serious performance problem" on this > list. It mainly concerns the performance increase if there would be > an index scan method which doesn´t have to check the validity of data > in the table. I´m just waiting for a statement from you guys if you > think it will be doable in 7.3 (while now started to optimize my > database as you suggested ;-).) I think this would increase acceptance > of PostgreSQL for certain people here in Germany which have real influence > on decisions about database in medical diagnostics and care in Germany. Is it possible that hackers do any statement according this issue. I want to repeat the problem. It´s hard to argue for PostgreSQL (and I would really like to advocate for PostgreSQL) against MS SQL if we talk about an imaginary possible dataloss if my colleague has not ever faced dataloss and certainly know that other power users of MS SQL are using it. It´s much more hard to argue if there are cases in which MS SQL outperforms PostgreSQL in the order of magnitude. It´s hard to convince somebody if I tell him that the reason is his bad database design. He really isn´t sooo bad and he claims that MS SQL has transparent transaction *and* fast index usage. Don´t ask me how they do this. I repeat that my colleague is in the position to decide about software usage of several medicine related projects in Germany. I just want to know now if this is an issue for PostgreSQL hackers: [ ] yes [ ] no [ ] we are discussing about that In case of "no" I would be happy if you could provide me with some technical reasons which could help me arguing. Kind regards Andreas.
> I just want to know now if this is an issue for PostgreSQL hackers: > [X] yes > [X] no > [X] we are discussing about that > In case of "no" I would be happy if you could provide me with some > technical reasons which could help me arguing. The hacker community has a wide range of interests. From my POV, the overall performance of PostgreSQL is more than competitive with other database products, including M$SQL. There is not much point in arguing a specific query case, though we are happy to talk about specific overall applications and to offer suggestions on how to build databases that are generally well designed and that will perform well on more than one product. If you have a colleague who firmly believes that M$SQL is the best solution, it sounds like he is not listening to all of the facts. That certainly can be frustrating, eh? Maybe after a few more years of crashed machines and increasing costs he will be more open to alternatives ;) - Thomas
> I just want to know now if this is an issue for PostgreSQL hackers: > > [ ] yes > [ ] no > [ ] we are discussing about that > > In case of "no" I would be happy if you could provide me with some > technical reasons which could help me arguing. My guess is that its likely to get discussed again when 7.3 development starts if someone brings it up. I think right now alot of discussion is towards the 7.2betas and bugs and stuff that might possibly get put off that was already talked about earlier in this cycle.
Tille, Andreas writes: > > Sorry, I´m really unable to send patches but I have a feature request > > which was addressed in the thread "Serious performance problem" on this > > list. It mainly concerns the performance increase if there would be > > an index scan method which doesn´t have to check the validity of data > > in the table. > I just want to know now if this is an issue for PostgreSQL hackers: > > [ ] yes > [ ] no > [ ] we are discussing about that We are always willing to discuss changes that improve performance, reliability, standards compliance, etc. However, "MS SQL does it, and MS SQL is fast" is not sufficient proof that a feature would improve average performance in PostgreSQL. This issue has been brought up with similarly unsatisfactory arguments in the past, so you should be able to find out about the discussion in the archives. Some of the arguments against this change were bigger indexes, slower write operations, non-existent proof that it's really faster, putting the index on a different disk will mostly obsolete the issue. Consequently, this is currently not something that has got a chance to be implemented anytime soon. -- Peter Eisentraut peter_e@gmx.net
> We are always willing to discuss changes that improve performance, > reliability, standards compliance, etc. However, "MS SQL does it, and MS > SQL is fast" is not sufficient proof that a feature would improve average > performance in PostgreSQL. This issue has been brought up with similarly > unsatisfactory arguments in the past, so you should be able to find out > about the discussion in the archives. Some of the arguments against this > change were bigger indexes, slower write operations, non-existent proof > that it's really faster, putting the index on a different disk will mostly > obsolete the issue. Consequently, this is currently not something that > has got a chance to be implemented anytime soon. I personally would like to have index scans that look up heap rows record the heap expired status into the index entry via one bit of storage. This will not _prevent_ checking the heap but it will prevent heap lookups for index entries that have been exipred for a long time. However, with the new vacuum, and perhaps autovacuum coming soon, may be little need for this optimization. The underlying problem the user is seeing is how to _know_ an index tuple is valid without checking the heap, and I don't see how to do that unless we start storing the transaction id in the index tuple, and that requires extra storage. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Fri, 16 Nov 2001, Thomas Lockhart wrote: > The hacker community has a wide range of interests. For sure, but there will be a raodmap with general consensus of the hackers. > From my POV, the overall performance of PostgreSQL is more than > competitive with other database products, including M$SQL. I never doubt you point of view, but it hardly counts as an argument for my current problem. There is a technical reason why MS SQL server is faster here and they claim to do it safely. (Well personally I do not give a cent for thigs that MS claims about but this does not help here.) > There is not much point in arguing a specific query case, It is no specific query case. It is the speed of an index scan which goes like N if you do it with PostgreSQL and it goes like log N if you do not have to look back into the table like MS SQL server does. > though we are happy to talk > about specific overall applications and to offer suggestions on how to > build databases that are generally well designed and that will perform > well on more than one product. I doubt that you could care about any database designer who does poor database design and just does a straigtforeward index scan. If you think that PostgreSQL is only targeted to high professional database designers which know how to avoid index scans I doubt that PostgreSQL will get the user base it would deserve. I could imagine several cases like my colleague who might think about porting their application and get into the trap as me that the first simple question they try performs that badly. I really want to say that we should address this issue in the documentation. If there exists such cases we should make it clear *why* PostgreSQL fails this performance test (and perhaps include your text in your mail as a base of this documentation). If we ignore that we will not attrakt users. > If you have a colleague who firmly believes that M$SQL is the best > solution, it sounds like he is not listening to all of the facts. He is a little bit MS centric but in principle knows the advantage of OpenSource. On the other hand he is led by pragmatism and just asks: Which software gives the solution quickly. And he found his answer. On the other hand we should also listen to things he presents as "facts" ... > That certainly can be frustrating, eh? Yes. > Maybe after a few more years of > crashed machines and increasing costs he will be more open to > alternatives ;) This does not help currently. I repeat: We should at least upgrade PostgreSQL documentation to address those issues. Kind regards Andreas. PS: I prefer not to be CCed if I do not explicite ask for this service. It seems to be common habit on PostgreSQL liststo CC users. Does this make any sense? On many other lists such bahaviour is banned.
On Fri, 16 Nov 2001, Bruce Momjian wrote: > I personally would like to have index scans that look up heap rows > record the heap expired status into the index entry via one bit of > storage. This will not _prevent_ checking the heap but it will prevent > heap lookups for index entries that have been exipred for a long time. > However, with the new vacuum, and perhaps autovacuum coming soon, may be > little need for this optimization. > > The underlying problem the user is seeing is how to _know_ an index > tuple is valid without checking the heap, and I don't see how to do that > unless we start storing the transaction id in the index tuple, and that > requires extra storage. For my special case I think doubling main memory is about the same price as a MS SQL server license. I can´t say which further problems might occure. Kind regards Andreas.
On Fri, 16 Nov 2001, Peter Eisentraut wrote: > We are always willing to discuss changes that improve performance, > reliability, standards compliance, etc. However, "MS SQL does it, and MS > SQL is fast" is not sufficient proof that a feature would improve average > performance in PostgreSQL. This issue has been brought up with similarly > unsatisfactory arguments in the past, so you should be able to find out > about the discussion in the archives. Sorry, I do not see any favour for PostgreSQL if we want people who consider switching to PostgreSQL to search the archive for useful information. Just stating the issues and principles clearly could convince people. If not PostgreSQL is faster removed from the list of available alternatives of database servers than a web browser is fired up. Kind regards Andreas.
Tille, Andreas wrote: >On Fri, 16 Nov 2001, Bruce Momjian wrote: > >>I personally would like to have index scans that look up heap rows >>record the heap expired status into the index entry via one bit of >>storage. This will not _prevent_ checking the heap but it will prevent >>heap lookups for index entries that have been exipred for a long time. >>However, with the new vacuum, and perhaps autovacuum coming soon, may be >>little need for this optimization. >> >>The underlying problem the user is seeing is how to _know_ an index >>tuple is valid without checking the heap, >> I'd propose a memory-only (or heavily cached) structure of tuple death transaction ids for all transactions since the oldest live trx. And when that oldest finishes then the tombstone marks for all tuples deleted between that and the new oldest are moved to relevant indexes (or the index keys are deleted) by concurrent vacuum or similar process. We could even try to use the index itself as that structure by favoring changed index pages when making caching decisions. It is much safer to cache indexes than it is to cache data pages as for indexes we only need to detect (by keeping info in WAL for example) that it is broken and not what it contained as it can always be rebuilt after computer crash. The problem with using an ndex for this is _which_ index to use when there are many per table. Perhaps a good choice would be the PRIMARY KEY. OTOH, keeping this info in index and not in a dedicated structure makes the amount of data needing to be cached well bigger and thus the whole operation more expensive. >> and I don't see how to do that >>unless we start storing the transaction id in the index tuple, and that >>requires extra storage. >> >For my special case I think doubling main memory is about the same >price as a MS SQL server license. I can´t say which further problems >might occure. > Then you must have really huge amounts of memory already ;) ------------------ Hannu
Hannu Krosing <hannu@tm.ee> writes: > I'd propose a memory-only (or heavily cached) structure of tuple death > transaction > ids for all transactions since the oldest live trx. Seems like just a special-purpose reimplementation of disk pages sitting in shared buffers. If you've got the memory to keep track of tuples you've killed recently, then you've probably got the memory to hold the pages they're in, so a redundant separate caching structure is not obviously a win. The possible win of marking index entries dead (once their tuple is known dead for all transactions) is that it saves visiting disk pages that have *not* been visited recently, and thus that aren't likely to be hanging around in buffers. OTOH there are a lot of potential problems, most notably that is-the-tuple-dead-for-ALL-transactions is not the normal tuple time qual check, and so it'd represent extra overhead in indexscans. I'm also concerned about how to do it without introducing lots of ugly interactions (maybe even deadlocks) between the index access methods and the heap access code. If concurrent vacuuming turns out to be cheap enough, just running vacuum frequently might be a better answer than trying to push the maintenance work into the main line of execution. regards, tom lane
On Mon, 19 Nov 2001, Tille, Andreas wrote: > On Fri, 16 Nov 2001, Thomas Lockhart wrote: > > There is not much point in arguing a specific query case, > It is no specific query case. It is the speed of an index scan which > goes like N if you do it with PostgreSQL and it goes like log N if > you do not have to look back into the table like MS SQL server does. But it is in some way. It's dependant on the number of rows returned by the query. For a small enough number of rows returned, having the additional information in the index could very well make the query slower even if it avoids the reads from the heap. Keeping the information in some other fashion where it doesn't directly do that may alleviate that, but it's not a straightforward one is better than the other in all cases. It's not like postgres never uses an index on a large table, it's just that after a certain amount of expected returns it switches over.
> > There is not much point in arguing a specific query case, > It is no specific query case. It is the speed of an index scan which > goes like N if you do it with PostgreSQL and it goes like log N if > you do not have to look back into the table like MS SQL server does. Have you tried using CLUSTER to match the heap order with the index order. That should help with index scans looking up heap rows. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Tom Lane wrote: >Hannu Krosing <hannu@tm.ee> writes: > >>I'd propose a memory-only (or heavily cached) structure of tuple death >>transaction >>ids for all transactions since the oldest live trx. >> > >Seems like just a special-purpose reimplementation of disk pages sitting >in shared buffers. If you've got the memory to keep track of tuples >you've killed recently, then you've probably got the memory to hold the >pages they're in, so a redundant separate caching structure is not >obviously a win. > I suspect that for even the border case of a table containing just 1 CHAR(1) field the above structure will be a lot smaller than the page cache for the same tuples. >The possible win of marking index entries dead (once their tuple is >known dead for all transactions) is that it saves visiting disk pages >that have *not* been visited recently, and thus that aren't likely to >be hanging around in buffers > > > >OTOH there are a lot of potential problems, most notably that >is-the-tuple-dead-for-ALL-transactions is not the normal tuple time >qual check, and so it'd represent extra overhead in indexscans. >I'm also concerned about how to do it without introducing lots of >ugly interactions (maybe even deadlocks) between the index access >methods and the heap access code. > >If concurrent vacuuming turns out to be cheap enough, just running >vacuum frequently might be a better answer than trying to push the >maintenance work into the main line of execution. > What I proposed would have been mostly the job of concurrent vacuum (marking/removing dead index tuples) Perhaps it would be an overall win for fast changing (vs. fast-growing) databases if we kept the tuple metainfo (attnum < 0) on separate (cache) pages, as it saves writes of tmax updates on both UPDATE and DELETE. If we kept them in a separate table as well that could make the metainfo "table" essentially a kind of index. That table/index could of course be concealed inside the main table by using typed data pages. --------------- Hannu
On Mon, 19 Nov 2001, Bruce Momjian wrote: > > > There is not much point in arguing a specific query case, > > It is no specific query case. It is the speed of an index scan which > > goes like N if you do it with PostgreSQL and it goes like log N if > > you do not have to look back into the table like MS SQL server does. > > Have you tried using CLUSTER to match the heap order with the index > order. That should help with index scans looking up heap rows. Yes, I´ve tried even that and it increase PostgreSQLs performance a little bit for this special query but it did not get nearly the speed of the same query on the MS SQL server. Moreover there are tables with more than one index and I guess it makes only sense to cluster one index per table. Kind regards Andreas.
> On Mon, 19 Nov 2001, Bruce Momjian wrote: > > > > > There is not much point in arguing a specific query case, > > > It is no specific query case. It is the speed of an index scan which > > > goes like N if you do it with PostgreSQL and it goes like log N if > > > you do not have to look back into the table like MS SQL server does. > > > > Have you tried using CLUSTER to match the heap order with the index > > order. That should help with index scans looking up heap rows. > Yes, I?ve tried even that and it increase PostgreSQLs performance a little > bit for this special query but it did not get nearly the speed of the > same query on the MS SQL server. Moreover there are tables with more than > one index and I guess it makes only sense to cluster one index per table. Yes, CLUSTER only matches one index. Something I just realized, that other probably figured out, is that while we have plans to backfill expired tuple status into the index tuples, it is not easy to backfill enough information to know a tuple is valid. Setting aside the problem of different tuple visibilities for different backends, one problem is that when we go to expire a tuple, we would have to update all the index tuples that point to the heap tuple. That is an expensive operation because you have to use the keys in the heap to find the index. So, while we do have plans to mark some index tuples so we _know_ they are expired, we don't know how to efficiently mark index tuples so we _know_ they are valid. This is what I believe you want, where we can scan the index without checking the heap at all. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Tue, 20 Nov 2001, Bruce Momjian wrote: > So, while we do have plans to mark some index tuples so we _know_ they > are expired, we don't know how to efficiently mark index tuples so we > _know_ they are valid. > > This is what I believe you want, where we can scan the index without > checking the heap at all. An new index type (say READONLY INDEX or some reasonable name) which is valid all the time between two vacuum processes would suffice for my application. It would fit the needs of people who do a daily database update and vacuum after this. Of course it´s your descision if this makes sense and fits PostgreSQL philosophy, but I think it would speed up some kind of applications. Kind regards Andreas.
Tille, Andreas wrote: >On Tue, 20 Nov 2001, Bruce Momjian wrote: > >>So, while we do have plans to mark some index tuples so we _know_ they >>are expired, we don't know how to efficiently mark index tuples so we >>_know_ they are valid. >> >>This is what I believe you want, where we can scan the index without >>checking the heap at all. >> >An new index type (say READONLY INDEX or some reasonable name) which is >valid all the time between two vacuum processes would suffice for my >application. It would fit the needs of people who do a daily database >update and vacuum after this. > Or perhaps MAINTAINED INDEX, meaning that it has always both tmin and tmax up-to-date. Btw 7.2 still has broken behaviour of xmax which by definition should not have a non-0 value for live tuples pg72b2=# create table parent(pid int primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index 'parent_pkey' for table 'parent' CREATE pg72b2=# create table child(cid int, pid int references parent); NOTICE: CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s) CREATE pg72b2=# insert into parent values(1); INSERT 16809 1 pg72b2=# insert into child values(1,1); INSERT 16810 1 pg72b2=# update child set pid=2; ERROR: <unnamed> referential integrity violation - key referenced from child not found in parent pg72b2=# select xmin,xmax,* from child;xmin | xmax | cid | pid ------+------+-----+----- 171 | 172 | 1 | 1 (1 row) pg72b2=# > > >Of course it´s your descision if this makes sense and fits PostgreSQL >philosophy, but I think it would speed up some kind of applications. > >Kind regards > > Andreas. > >---------------------------(end of broadcast)--------------------------- >TIP 3: if posting/reading through Usenet, please send an appropriate >subscribe-nomail command to majordomo@postgresql.org so that your >message can get through to the mailing list cleanly >
Huh, a non-zero XMAX is fine. You mark the XMAX when you _think_ you are updating it. It is only expired when the XMAX on the tuple is committed. > Or perhaps MAINTAINED INDEX, meaning that it has always both tmin and tmax > up-to-date. > Btw 7.2 still has broken behaviour of xmax which by definition should > not have a > non-0 value for live tuples > > pg72b2=# create table parent(pid int primary key); > NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index > 'parent_pkey' for table 'parent' > CREATE > pg72b2=# create table child(cid int, pid int references parent); > NOTICE: CREATE TABLE will create implicit trigger(s) for FOREIGN KEY > check(s) > CREATE > pg72b2=# insert into parent values(1); > INSERT 16809 1 > pg72b2=# insert into child values(1,1); > INSERT 16810 1 > pg72b2=# update child set pid=2; > ERROR: <unnamed> referential integrity violation - key referenced from > child not found in parent > pg72b2=# select xmin,xmax,* from child; > xmin | xmax | cid | pid > ------+------+-----+----- > 171 | 172 | 1 | 1 > (1 row) -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > Huh, a non-zero XMAX is fine. You mark the XMAX when you _think_ you > are updating it. It is only expired when the XMAX on the tuple is > committed. But http://www.postgresql.org/idocs/index.php?sql-syntax-columns.html claims: xmax The identity (transaction ID) of the deleting transaction, or zero for an undeleted tuple. In practice, thisis never nonzero for a visible tuple. cmax The command identifier within the deleting transaction, or zero. Again, this is never nonzero for a visibletuple. Which is IMHO good and useful behaviour, for example for all kinds of mirroring I also think that this kas historically been the behaviour and that this was broken sometime in not too distant past (i.e after postgres95 ;) by foreign keys and/or somesuch. Tom Lane once told me about a way to determine the visibility of a tuple by other means than [x|c][min|max] but I can't find/remember it anymore ;( ----------------- Hannu
Hannu Krosing <hannu@tm.ee> writes: > But > http://www.postgresql.org/idocs/index.php?sql-syntax-columns.html That documentation is in error (my fault). Current docs say xmax The identity (transaction ID) of the deleting transaction, or zero for an undeleted tuple. It is possible for thisfield to be nonzero in a visible tuple: that usually indicates that the deleting transaction hasn't committedyet, or that an attempted deletion was rolled back. > I also think that this kas historically been the behaviour No, it wasn't. regards, tom lane
Tom Lane wrote: >Hannu Krosing <hannu@tm.ee> writes: > >>But >>http://www.postgresql.org/idocs/index.php?sql-syntax-columns.html >> > >That documentation is in error (my fault). Current docs say > >xmax > > The identity (transaction ID) of the deleting transaction, or zero > for an undeleted tuple. It is possible for this field to > be nonzero in a visible tuple: that usually indicates that the > deleting transaction hasn't committed yet, > That seems reasonable > or that an attempted deletion was rolled back. > But could we not make it so that rollback will also reset xmax and cmax to 0. It should be quite cheap to do so as it's on the same page with the commit bits ? The meaning "last transaction that attempted to delete this tuple" seems somewhat weird >>I also think that this kas historically been the behaviour >> >No, it wasn't. > Are you sure that it was a bug not in code but in docs ? --------------- Hannu
Hannu Krosing <hannu@tm.ee> writes: > But could we not make it so that rollback will also reset xmax and cmax > to 0. We never have done that and I don't see why we should start. (And no, I'm not sure that it'd be entirely safe; there are concurrency/atomicity issues involved, because we do not insist on getting exclusive lock to set the it's-dead-Jim flag bit.) We could make the user readout of xmax/cmax be zeroes if the flag bits show they are invalid. But this really just begs the question of what use they are to users in the first place. I can't see any; and if we make them read as zeroes then they for sure won't have any. regards, tom lane
Tom Lane wrote: >Hannu Krosing <hannu@tm.ee> writes: > >>But could we not make it so that rollback will also reset xmax and cmax >>to 0. >> > >We never have done that and I don't see why we should start. >(And no, I'm not sure that it'd be entirely safe; there are >concurrency/atomicity issues involved, because we do not >insist on getting exclusive lock to set the it's-dead-Jim >flag bit.) > >We could make the user readout of xmax/cmax be zeroes if the flag >bits show they are invalid. > If there is a cheap way to get a list of pending transactions, then we could make them read out as 0 if they are about to be deleted (ie xmax in pending_transactions()) and else show the value of the transaction that is about to delete them. >But this really just begs the question >of what use they are to users in the first place. I can't see any; >and if we make them read as zeroes then they for sure won't have any. > I can see some use for xmax user-visible only while being deleted. At least this would be more useful than themeaning last-trx-that-was-about-to-delete. Another way for getting equivalent functionality would be to make the pending_transactions() function available to users. --------------- Hannu