Thread: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
Attached patch cleanups hash index headers to allow compile hasham for 8.3 version. It helps to improve pg_migrator with capability to migrate database with hash index without reindexing. I discussed this patch year ago with Alvaro when we tried to cleanup include bloating problem. It should reduce also number of including. The main point is that hash functions for datatypes are now in related data files in utils/adt directory. hash_any() and hash_uint32 it now in utils/hashfunc.c. It would be nice to have this in 8.4 because it allows to test index migration functionality. Thanks Zdenek
Attachment
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Zdenek Kotala
Date:
I forgot to fix contrib. Updated patch attached. Zdenek Zdenek Kotala píše v pá 22. 05. 2009 v 16:23 -0400: > Attached patch cleanups hash index headers to allow compile hasham for > 8.3 version. It helps to improve pg_migrator with capability to migrate > database with hash index without reindexing. > > I discussed this patch year ago with Alvaro when we tried to cleanup > include bloating problem. It should reduce also number of including. > > The main point is that hash functions for datatypes are now in related > data files in utils/adt directory. hash_any() and hash_uint32 it now in > utils/hashfunc.c. > > It would be nice to have this in 8.4 because it allows to test index > migration functionality. > > Thanks Zdenek >
Attachment
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Kenneth Marshall
Date:
On Sat, May 23, 2009 at 02:52:49PM -0400, Zdenek Kotala wrote: > I forgot to fix contrib. Updated patch attached. > > Zdenek > > Zdenek Kotala p????e v p?? 22. 05. 2009 v 16:23 -0400: > > Attached patch cleanups hash index headers to allow compile hasham for > > 8.3 version. It helps to improve pg_migrator with capability to migrate > > database with hash index without reindexing. > > > > I discussed this patch year ago with Alvaro when we tried to cleanup > > include bloating problem. It should reduce also number of including. > > > > The main point is that hash functions for datatypes are now in related > > data files in utils/adt directory. hash_any() and hash_uint32 it now in > > utils/hashfunc.c. > > > > It would be nice to have this in 8.4 because it allows to test index > > migration functionality. > > > > Thanks Zdenek > > How does that work with the updated hash functions without a reindex? Regards, Ken
Kenneth Marshall <ktm@rice.edu> writes: > On Sat, May 23, 2009 at 02:52:49PM -0400, Zdenek Kotala wrote: >>> Attached patch cleanups hash index headers to allow compile hasham for >>> 8.3 version. It helps to improve pg_migrator with capability to migrate >>> database with hash index without reindexing. > How does that work with the updated hash functions without a reindex? I looked at this patch and I don't see how it helps pg_migrator at all. It's just pushing some code and function declarations around. The rearrangement might be marginally nicer from a code beautification point of view --- right now we're a bit inconsistent about whether datatype-specific hash functions live in hashfunc.c or in the datatype's utils/adt/ file. But I'm not sure that removing hashfunc.c altogether is an appropriate solution to that, not least because of the loss of CVS history for the functions. I'd be inclined to leave the core hash_any() code where it is, if not all of these functions altogether. What does seem useful is to refactor the headers so that datatype hash functions don't need to include all of the AM's implementation details. But this patch seems to do both more and less than that --- I don't think it's gotten rid of all external #includes of access/hash.h, and in any case moving the function code is not necessary to that goal. In any case, the barriers to implementing 8.3-style hash indexes in 8.4 are pretty huge: you'd need to duplicate not only the hash AM code, but also all the hash functions, and therefore all of the hash pg_amop and pg_amproc entries. Given the close-to-zero usefulness of hash indexes in production installations, I don't think it's worth the trouble. It would be much more helpful to look into supporting 8.3-compatible GIN indexes. regards, tom lane
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Dimitri Fontaine
Date:
Hi, Tom Lane <tgl@sss.pgh.pa.us> writes: > The rearrangement might be marginally nicer from a code beautification > point of view --- right now we're a bit inconsistent about whether > datatype-specific hash functions live in hashfunc.c or in the datatype's > utils/adt/ file. But I'm not sure that removing hashfunc.c altogether is > an appropriate solution to that, not least because of the loss of CVS > history for the functions. I'd be inclined to leave the core hash_any() > code where it is, if not all of these functions altogether. I guess someone has to talk about it: git will follow the code even when the file hosting it changes. It's not all magic though: http://kerneltrap.org/node/11765 "And when using git, the whole 'keep code movement separate from changes' has an even more fundamental reason: git can trackcode movement (again, whether moving a whole file or just a function between files), and doing a 'git blame -C' willactually follow code movement between files. It does that by similarity analysis, but it does mean that if you both movethe code *and* change it at the same time, git cannot see that 'oh, that function came originally from that other file',and now you get worse annotations about where code actually originated." Having better tools maybe could help maintain the high quality standards that are established code wise, too. Regards, -- dim
Dimitri Fontaine <dfontaine@hi-media.com> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> The rearrangement might be marginally nicer from a code beautification >> point of view --- right now we're a bit inconsistent about whether >> datatype-specific hash functions live in hashfunc.c or in the datatype's >> utils/adt/ file. But I'm not sure that removing hashfunc.c altogether is >> an appropriate solution to that, not least because of the loss of CVS >> history for the functions. I'd be inclined to leave the core hash_any() >> code where it is, if not all of these functions altogether. > I guess someone has to talk about it: git will follow the code even when > the file hosting it changes. That might possibly be relevant a year from now; it is 100% irrelevant to a change being proposed for 8.4. regards, tom lane
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
David Fetter
Date:
On Mon, May 25, 2009 at 09:59:14AM -0400, Tom Lane wrote: > Dimitri Fontaine <dfontaine@hi-media.com> writes: > > Tom Lane <tgl@sss.pgh.pa.us> writes: > >> The rearrangement might be marginally nicer from a code > >> beautification point of view --- right now we're a bit > >> inconsistent about whether datatype-specific hash functions live > >> in hashfunc.c or in the datatype's utils/adt/ file. But I'm not > >> sure that removing hashfunc.c altogether is an appropriate > >> solution to that, not least because of the loss of CVS history > >> for the functions. I'd be inclined to leave the core hash_any() > >> code where it is, if not all of these functions altogether. > > > I guess someone has to talk about it: git will follow the code > > even when the file hosting it changes. > > That might possibly be relevant a year from now; it is 100% > irrelevant to a change being proposed for 8.4. It's pretty relevant as far as the schedule goes. I'm not alone thinking that the appropriate place to make this change, given buildfarm support, is at the transition to 8.5. CVS is dead. Long live git! :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Zdenek Kotala
Date:
Tom Lane píše v ne 24. 05. 2009 v 18:46 -0400: > Kenneth Marshall <ktm@rice.edu> writes: > > On Sat, May 23, 2009 at 02:52:49PM -0400, Zdenek Kotala wrote: > >>> Attached patch cleanups hash index headers to allow compile hasham for > >>> 8.3 version. It helps to improve pg_migrator with capability to migrate > >>> database with hash index without reindexing. > > > How does that work with the updated hash functions without a reindex? > > I looked at this patch and I don't see how it helps pg_migrator at all. > It's just pushing some code and function declarations around. The main important thing is move hash_any/hash_uint32 function from hash am. It should reduce amount of duplicated code necessary for old hash index implementation. Rest is only rearrangement and cleanup. > The rearrangement might be marginally nicer from a code beautification > point of view --- right now we're a bit inconsistent about whether > datatype-specific hash functions live in hashfunc.c or in the datatype's > utils/adt/ file. I personally prefer to keep it in type definition. AM should be independent on types which are installed and data type code should be self contained. > But I'm not sure that removing hashfunc.c altogether is > an appropriatera solution to that, not least because of the loss of CVS > history for the functions. I'd be inclined to leave the core hash_any() > code where it is, if not all of these functions altogether. Until we will have better version control system, hashfunc.c can stay here, but what I need is hashfunc.h. Minimalistic version of this patch is to create hashfuct.h and modified related #include in C and header files. > What does seem useful is to refactor the headers so that datatype hash > functions don't need to include all of the AM's implementation details. > But this patch seems to do both more and less than that --- I don't > think it's gotten rid of all external #includes of access/hash.h, and > in any case moving the function code is not necessary to that goal. Agree, I will prepare minimalistic version with hashfunc.h only. > In any case, the barriers to implementing 8.3-style hash indexes in 8.4 > are pretty huge: you'd need to duplicate not only the hash AM code, but > also all the hash functions, and therefore all of the hash pg_amop and > pg_amproc entries. I'm not sure if I need duplicate functions. Generally yes but It seems to me that hash index does not changed functions behavior and they could be shared at this moment. > Given the close-to-zero usefulness of hash indexes > in production installations, I don't think it's worth the trouble. It > would be much more helpful to look into supporting 8.3-compatible GIN > indexes. Agree, I wanted to quickly verify function naming collision problem and HASH index seems to me better candidate for this basic test. GIN has priority. Zdenek
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Stephen Frost
Date:
David, * David Fetter (david@fetter.org) wrote: > It's pretty relevant as far as the schedule goes. I'm not alone > thinking that the appropriate place to make this change, given > buildfarm support, is at the transition to 8.5. > > CVS is dead. Long live git! :) I'm all for moving to git, but not until at least the core folks are more familiar with it and have been using it. I don't believe that experience will be there by the time we open for 8.5 and a forced march when we have numerous big things hopefully hitting on the first commitfest seems like a bad idea. I would encourage core, committers and contributors to start becoming familiar with git on the expectation that we'll be making that move when we open for 8.6/9.0. Ideally, there could be an official decision made about when it's going to happen followed by an announcment when 8.4 is released. Thoughts? Stephen
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
David Fetter
Date:
On Mon, May 25, 2009 at 11:45:33AM -0400, Stephen Frost wrote: > David, > > * David Fetter (david@fetter.org) wrote: > > It's pretty relevant as far as the schedule goes. I'm not alone > > thinking that the appropriate place to make this change, given > > buildfarm support, is at the transition to 8.5. > > > > CVS is dead. Long live git! :) > > I'm all for moving to git, but not until at least the core folks are > more familiar with it and have been using it. Which ones aren't familiar and haven't been using it for at least the past year? I count two. > I don't believe that experience will be there by the time we open > for 8.5 and a forced march when we have numerous big things > hopefully hitting on the first commitfest seems like a bad idea. Your portrayal of a rough and complicated transition is not terribly well supported by other projects' switches to git. > I would encourage core, committers and contributors to start > becoming familiar with git on the expectation that we'll be making > that move when we open for 8.6/9.0. > > Ideally, there could be an official decision made about when it's > going to happen followed by an announcment when 8.4 is released. > > Thoughts? Here's mine: Git delayed is git denied. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Andrew Dunstan
Date:
David Fetter wrote: > On Mon, May 25, 2009 at 09:59:14AM -0400, Tom Lane wrote: > >> Dimitri Fontaine <dfontaine@hi-media.com> writes: >> >>> Tom Lane <tgl@sss.pgh.pa.us> writes: >>> >>>> The rearrangement might be marginally nicer from a code >>>> beautification point of view --- right now we're a bit >>>> inconsistent about whether datatype-specific hash functions live >>>> in hashfunc.c or in the datatype's utils/adt/ file. But I'm not >>>> sure that removing hashfunc.c altogether is an appropriate >>>> solution to that, not least because of the loss of CVS history >>>> for the functions. I'd be inclined to leave the core hash_any() >>>> code where it is, if not all of these functions altogether. >>>> >>> I guess someone has to talk about it: git will follow the code >>> even when the file hosting it changes. >>> >> That might possibly be relevant a year from now; it is 100% >> irrelevant to a change being proposed for 8.4. >> > > It's pretty relevant as far as the schedule goes. I'm not alone > thinking that the appropriate place to make this change, given > buildfarm support, is at the transition to 8.5. > > CVS is dead. Long live git! :) > > That still misses Tom's point, since the change is proposed for 8.4 and at the earliest we would not change SCCMs until after 8.4 is released (and, notwithstanding your eagerness, I suspect it will be rather later). cheers andrew
David Fetter <david@fetter.org> writes: > On Mon, May 25, 2009 at 11:45:33AM -0400, Stephen Frost wrote: >> I'm all for moving to git, but not until at least the core folks are >> more familiar with it and have been using it. > Which ones aren't familiar and haven't been using it for at least the > past year? I count two. I'm not familiar with it, and neither is Bruce, and frankly that's entirely sufficient reason not to change now. What was more or less agreed to at the developer's meeting was that we would move towards git in an orderly fashion. I'm thinking something like six months to a year before cutting over the core repository. If you'd like to accomplish something *useful* about this, how about pestering git upstream to support diff -c output format? regards, tom lane
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
David Fetter
Date:
On Mon, May 25, 2009 at 12:24:05PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > On Mon, May 25, 2009 at 11:45:33AM -0400, Stephen Frost wrote: > >> I'm all for moving to git, but not until at least the core folks are > >> more familiar with it and have been using it. > > > Which ones aren't familiar and haven't been using it for at least > > the past year? I count two. > > I'm not familiar with it, and neither is Bruce, and frankly that's > entirely sufficient reason not to change now. > > What was more or less agreed to at the developer's meeting was that > we would move towards git in an orderly fashion. The rest have already been moving to it in "an orderly fashion," some for over than a year. > I'm thinking something like six months to a year before cutting over > the core repository. What would gate that? > If you'd like to accomplish something *useful* about this, how about > pestering git upstream to support diff -c output format? I've been pestering them :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
David Fetter
Date:
On Mon, May 25, 2009 at 12:24:05PM -0400, Tom Lane wrote: > If you'd like to accomplish something *useful* about this, how about > pestering git upstream to support diff -c output format? It looks like this is doable with a suitable git configuration file such as $HOME/.gitconfig or (finer grain) a .git/config for the repository :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes: > Tom Lane píše v ne 24. 05. 2009 v 18:46 -0400: >> In any case, the barriers to implementing 8.3-style hash indexes in 8.4 >> are pretty huge: you'd need to duplicate not only the hash AM code, but >> also all the hash functions, and therefore all of the hash pg_amop and >> pg_amproc entries. > I'm not sure if I need duplicate functions. Generally yes but It seems > to me that hash index does not changed functions behavior and they could > be shared at this moment. No, the behavior of the hash functions themselves changed during 8.4. Twice, even: 2008-04-06 12:54 tgl * contrib/dblink/expected/dblink.out,contrib/dblink/sql/dblink.sql, src/backend/access/hash/hashfunc.c,src/include/catalog/catversion.h,src/test/regress/expected/portals.out,src/test/regress/sql/portals.sql: Improvehash_any() to useword-wide fetches when hashing suitably aligned data. This makesfor a significant speedup at thecost that the results now varybetween little-endian and big-endian machines; which forces us toadd explicit ORDER BYsin a couple of regression tests to preservemachine-independent comparison results. Also, force initdb bybumping catversion,since the contents of hash indexes will change(at least on big-endian machines).Kenneth Marshall and Tom Lane,based on work from Bob Jenkins. This commit does not adopt Bob's new faster mix() algorithm,however, since we stillneed to convince ourselves that thatdoesn't degrade the quality of the hashing. 2009-02-09 16:18 tgl * src/: backend/access/hash/hashfunc.c,include/catalog/catversion.h,test/regress/expected/polymorphism.out,test/regress/expected/union.out, test/regress/sql/polymorphism.sql:AdoptBob Jenkins' improved hash function for hash_any(). Thischanges the contents of hashindexes (again), so bump catversion.Kenneth Marshall So as far as I can see, you need completely separate copies of both hash_any() and the SQL-level functions that call it. I'm not really seeing that the proposed refactoring makes this any easier. You might as well just copy-and-paste all that old code into a separate set of files, and not worry about what is in access/hash.h. regards, tom lane
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Alvaro Herrera
Date:
David Fetter wrote: > On Mon, May 25, 2009 at 12:24:05PM -0400, Tom Lane wrote: > > If you'd like to accomplish something *useful* about this, how about > > pestering git upstream to support diff -c output format? > > It looks like this is doable with a suitable git configuration file > such as $HOME/.gitconfig or (finer grain) a .git/config for the > repository :) Can you be more specific on the necessary contents of such file? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
David Fetter <david@fetter.org> writes: > On Mon, May 25, 2009 at 12:24:05PM -0400, Tom Lane wrote: >> If you'd like to accomplish something *useful* about this, how about >> pestering git upstream to support diff -c output format? > It looks like this is doable with a suitable git configuration file > such as $HOME/.gitconfig or (finer grain) a .git/config for the > repository :) Cool, let's see one. If we were to put it into a repository config file, that would more or less have the effect of enforcing a project style for diffs, no? regards, tom lane
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Andres Freund
Date:
On 05/25/2009 07:20 PM, Alvaro Herrera wrote: > David Fetter wrote: >> On Mon, May 25, 2009 at 12:24:05PM -0400, Tom Lane wrote: >>> If you'd like to accomplish something *useful* about this, how about >>> pestering git upstream to support diff -c output format? >> >> It looks like this is doable with a suitable git configuration file >> such as $HOME/.gitconfig or (finer grain) a .git/config for the >> repository :) > Can you be more specific on the necessary contents of such file? A very sketchy notion of it is at: http://wiki.postgresql.org/wiki/Talk:Working_with_Git I will try to correct the wording + windows information after eating. Andres
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Andres Freund
Date:
On 05/25/2009 07:31 PM, Tom Lane wrote: > David Fetter<david@fetter.org> writes: >> On Mon, May 25, 2009 at 12:24:05PM -0400, Tom Lane wrote: >>> If you'd like to accomplish something *useful* about this, how about >>> pestering git upstream to support diff -c output format? > >> It looks like this is doable with a suitable git configuration file >> such as $HOME/.gitconfig or (finer grain) a .git/config for the >> repository :) > > Cool, let's see one. > > If we were to put it into a repository config file, that would more or > less have the effect of enforcing a project style for diffs, no? Yes and no. You can define that a subset (or all) files use a specific "diff driver" in the repository - unfortunately the definition of that driver has to be done locally. Defining it currently involves installing a wrapper like the one on http://wiki.postgresql.org/wiki/Talk:Working_with_Git and doing Andres
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Andres Freund
Date:
On 05/25/2009 07:53 PM, Andres Freund wrote: > On 05/25/2009 07:31 PM, Tom Lane wrote: >> David Fetter<david@fetter.org> writes: >>> On Mon, May 25, 2009 at 12:24:05PM -0400, Tom Lane wrote: >>>> If you'd like to accomplish something *useful* about this, how about >>>> pestering git upstream to support diff -c output format? >> >>> It looks like this is doable with a suitable git configuration file >>> such as $HOME/.gitconfig or (finer grain) a .git/config for the >>> repository :) >> >> Cool, let's see one. >> >> If we were to put it into a repository config file, that would more or >> less have the effect of enforcing a project style for diffs, no? > Yes and no. > > You can define that a subset (or all) files use a specific "diff driver" > in the repository - unfortunately the definition of that driver has to > be done locally. Defining it currently involves installing a wrapper > like the one on http://wiki.postgresql.org/wiki/Talk:Working_with_Git > and doing Ugh, hit the wrong key: and executing `git config --global diff.context.command "git-external-diff"` Andres
Andres Freund <andres@anarazel.de> writes: >> You can define that a subset (or all) files use a specific "diff driver" >> in the repository - unfortunately the definition of that driver has to >> be done locally. Defining it currently involves installing a wrapper >> like the one on http://wiki.postgresql.org/wiki/Talk:Working_with_Git >> and doing > Ugh, hit the wrong key: > and executing > `git config --global diff.context.command "git-external-diff"` Okay, so it will more or less have to be a local option. That's okay ... all I really insist on is being able to get a readable diff out of it. I grant that not everyone may have the same opinion about what's readable. regards, tom lane
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Andres Freund
Date:
On 05/25/2009 07:58 PM, Andres Freund wrote: > On 05/25/2009 07:53 PM, Andres Freund wrote: >> On 05/25/2009 07:31 PM, Tom Lane wrote: >>> David Fetter<david@fetter.org> writes: >>>> On Mon, May 25, 2009 at 12:24:05PM -0400, Tom Lane wrote: >>>>> If you'd like to accomplish something *useful* about this, how about >>>>> pestering git upstream to support diff -c output format? >>> If we were to put it into a repository config file, that would more or >>> less have the effect of enforcing a project style for diffs, no? >> Yes and no. >> You can define that a subset (or all) files use a specific "diff driver" >> in the repository - unfortunately the definition of that driver has to >> be done locally. Defining it currently involves installing a wrapper >> like the one on http://wiki.postgresql.org/wiki/Talk:Working_with_Git >> and doing > Ugh, hit the wrong key: > and executing > `git config --global diff.context.command "git-external-diff"` The content of the former page is now merged into the main page about git http://wiki.postgresql.org/wiki/Working_with_Git and the notes on the Talk: page are deleted. Andres
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Peter Eisentraut
Date:
On Monday 25 May 2009 20:58:59 Andres Freund wrote: > and executing > `git config --global diff.context.command "git-external-diff"` We already knew that you could do it with a wrapper. But that isn't the answer we were looking for, because it will basically mean that 98% of casual contributors will get it wrong, and it will probably not work very well on Windows. The goal is to get git-diff to do it itself.
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Andres Freund
Date:
Hi, On 05/26/2009 01:39 PM, Peter Eisentraut wrote: > On Monday 25 May 2009 20:58:59 Andres Freund wrote: >> and executing >> `git config --global diff.context.command "git-external-diff"` > We already knew that you could do it with a wrapper. But that isn't the > answer we were looking for, because it will basically mean that 98% of casual > contributors will get it wrong, and it will probably not work very well on > Windows. It works on windows, linux, solaris (thats what I could get my hands on without bothering). I tested it - it works on any non ancient version of git. (Ancient in the sense, that git at that time didnt work properly on win anyway). And providing a 5-line wrapper download-ready surely makes it easier than figuring it out how to write one out of some git manpages. Also it allows at least those who prefer context diffs to use them easily when using git - that are the ones which seem to prefer using them most. > The goal is to get git-diff to do it itself. I do not disagree. Andres
I'll repeat my suggestion that everyone poo-pooed: we can have the mail list filters recognize patches, run filterdiff on them with our prefered options, and attach the result as an additional attachment (or link to some web directory). I think it would be simple to do and would be happy to give it a go if I can get the necessary access. It doesn't solve *all* the problems since the committee still needs a unified diff if he wants to take advantage of git's merge abilities. I think this is actually all a red herring since it's pretty easy for the reviewer to run filterdiff anyways. But having things be automatic is still always easier than not. -- Greg On 26 May 2009, at 13:54, Andres Freund <andres@anarazel.de> wrote: > Hi, > > On 05/26/2009 01:39 PM, Peter Eisentraut wrote: >> On Monday 25 May 2009 20:58:59 Andres Freund wrote: >>> and executing >>> `git config --global diff.context.command "git-external-diff"` >> We already knew that you could do it with a wrapper. But that >> isn't the >> answer we were looking for, because it will basically mean that 98% >> of casual >> contributors will get it wrong, and it will probably not work very >> well on >> Windows. > It works on windows, linux, solaris (thats what I could get my hands > on without bothering). I tested it - it works on any non ancient > version of git. (Ancient in the sense, that git at that time didnt > work properly on win anyway). > And providing a 5-line wrapper download-ready surely makes it easier > than figuring it out how to write one out of some git manpages. > > Also it allows at least those who prefer context diffs to use them > easily when using git - that are the ones which seem to prefer using > them most. > >> The goal is to get git-diff to do it itself. > I do not disagree. > > Andres > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Zdenek Kotala
Date:
Tom Lane píše v po 25. 05. 2009 v 13:07 -0400: > Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes: > > Tom Lane píše v ne 24. 05. 2009 v 18:46 -0400: > >> In any case, the barriers to implementing 8.3-style hash indexes in 8.4 > >> are pretty huge: you'd need to duplicate not only the hash AM code, but > >> also all the hash functions, and therefore all of the hash pg_amop and > >> pg_amproc entries. > > > I'm not sure if I need duplicate functions. Generally yes but It seems > > to me that hash index does not changed functions behavior and they could > > be shared at this moment. > > No, the behavior of the hash functions themselves changed during 8.4. > Twice, even: hmm, I'm missed it. :( > So as far as I can see, you need completely separate copies of both > hash_any() and the SQL-level functions that call it. I'm not really > seeing that the proposed refactoring makes this any easier. You might > as well just copy-and-paste all that old code into a separate set of > files, and not worry about what is in access/hash.h. Yeah, in this case everything have to be duplicated which is not big deal in comparison to do same amount of work for GIN. Then I can start with GIN. The advantage of refactoring is then only nicer code. thanks Zdenek
Greg Stark <greg.stark@enterprisedb.com> writes: > I'll repeat my suggestion that everyone poo-pooed: we can have the > mail list filters recognize patches, run filterdiff on them with our > prefered options, and attach the result as an additional attachment > (or link to some web directory). The argument that was made at the developer meeting is that the preferred way of working will be to apply the submitted patch in one's local git repository, and then do any needed editorialization as a second patch on top of it. So the critical need as I see it is to be able to see a -c version of a patch-in-progress (ie, diff current working state versus some previous committed state). Readability of the patch as-submitted is useful for quick eyeball checks, but I think all serious reviewing is going to be done on local copies. > I think this is actually all a red herring since it's pretty easy for > the reviewer to run filterdiff anyways. I don't trust filterdiff one bit :-( regards, tom lane
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Robert Haas
Date:
On Tue, May 26, 2009 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Greg Stark <greg.stark@enterprisedb.com> writes: >> I'll repeat my suggestion that everyone poo-pooed: we can have the >> mail list filters recognize patches, run filterdiff on them with our >> prefered options, and attach the result as an additional attachment >> (or link to some web directory). > > The argument that was made at the developer meeting is that the > preferred way of working will be to apply the submitted patch in one's > local git repository, and then do any needed editorialization as a > second patch on top of it. So the critical need as I see it is to be > able to see a -c version of a patch-in-progress (ie, diff current > working state versus some previous committed state). Readability of the > patch as-submitted is useful for quick eyeball checks, but I think all > serious reviewing is going to be done on local copies. > >> I think this is actually all a red herring since it's pretty easy for >> the reviewer to run filterdiff anyways. > > I don't trust filterdiff one bit :-( For any particular reason, or just natural skepticism? I believe there have been some wild-eyed claims tossed around in this space previously that unified diffs don't provide all the same information as context diffs, which is flatly false. AIUI, the reason for the name "unified diff" is that it combines, or unifies, the "before" and "after" versions of the code into a single chunk. The nice thing about this is that when you have a bunch of small changes in a file, you don't end up with all of the surrounding lines repeated in both the "before" and "after" sections. If you change four consecutive lines and run a unified diff, you end up with 4 +s, 4 -s, and 6 lines of context (3 before and 3 after), for a total of 14 lines. If you run a context diff, you end up with 4 !s and 6 lines of context in the before section and the same in the after section, for a total of 20 lines, 6 of which are duplicated. This means that in many cases you can see what's changed without having to page up and down in the diff. The not-so-nice thing about unified diffs is that when there is a huge hunk of code that's changed, there are probably by chance a few identical lines buried in there, like " }", so the + and - lines end up mixed together in a way that wouldn't happen in a context diff (which would turn the whole thing into two big "!" sections). It's no problem for a machine to understand this, but it's hard to read for a human being. I haven't personally verified the filterdiff code, but the transformation is pretty mechanical so I'm not sure why we should believe that it hasn't been implemented correctly without some evidence along those lines. I don't think there's any way to make anyone 100% happy here. I personally prefer unified diffs, so when I'm reviewing a complex patch formatted as a context diff I typically apply it and then run a unified diff using git. When I'm submitting a patch I use a unified diff to check my work and then convert it to a context diff for submission. On the other hand, I assume that, if you were presented with a complex unified diff, would just apply it and then run a context-diff to review it. Since, as you say, serious reviewing will be done on local copies anyway, I really don't see the point of worrying too much about how they're submitted to the mailing list. Let's just tell everyone to keep using context diffs as the have been doing, and if anyone doesn't then let's THROW THEIR PATCH ON THE DUST-HEAP OF HISTORY AND HAUL THEM OUT TO BE DRAWN AND QUARTERED... er, um, I mean, ask them not to do it that way the next time. If there's an issue here that's worth getting worked up about, I'm not seeing it. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, May 26, 2009 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I don't trust filterdiff one bit :-( > For any particular reason, or just natural skepticism? IIRC it was demonstrated to be broken the last time it was proposed as a solution to our problems. Maybe it's been fixed since then, but I don't have any confidence in it, since evidently it's not been stress tested very hard. > I believe there have been some wild-eyed claims tossed around in this > space previously that unified diffs don't provide all the same > information as context diffs, which is flatly false. No, the gripe has always been just that they're less readable for nontrivial changes. > The not-so-nice thing about unified diffs is that when there is a huge > hunk of code that's changed, there are probably by chance a few > identical lines buried in there, like " }", so the + and - lines > end up mixed together in a way that wouldn't happen in a context diff > (which would turn the whole thing into two big "!" sections). It's no > problem for a machine to understand this, but it's hard to read for a > human being. Exactly. Even without identical lines, I find that the old and new code gets intermixed in easily-confusing ways. -u is very readable for isolated single-line changes, but for anything larger, not so much. regards, tom lane
Re: [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
From
Alvaro Herrera
Date:
Tom Lane escribió: > Robert Haas <robertmhaas@gmail.com> writes: > > On Tue, May 26, 2009 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> I don't trust filterdiff one bit :-( > > > For any particular reason, or just natural skepticism? > > IIRC it was demonstrated to be broken the last time it was proposed > as a solution to our problems. Maybe it's been fixed since then, but > I don't have any confidence in it, since evidently it's not been stress > tested very hard. I think you're probably confusing it with interdiff. I've had the latter fail several times (and I haven't really used it all that much), but I've never seem filterdiff make a mistake even though I use it frequently. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane escribi�: >> IIRC it was demonstrated to be broken the last time it was proposed >> as a solution to our problems. Maybe it's been fixed since then, but >> I don't have any confidence in it, since evidently it's not been stress >> tested very hard. > I think you're probably confusing it with interdiff. No, because I never heard of interdiff before. Checking the archives, the discussion I was remembering was definitely about filterdiff, but the rap on it was undocumented (so maybe "demonstrated" is too harsh): http://archives.postgresql.org/pgsql-hackers/2007-10/msg01243.php regards, tom lane
Uhm the rap you quoted was ambiguous but I read it as referring to the ability I described if viewing the difference between two patches -- which I didn't name but is in fact interdiff. -- Greg On 26 May 2009, at 19:58, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Tom Lane escribió: >>> IIRC it was demonstrated to be broken the last time it was proposed >>> as a solution to our problems. Maybe it's been fixed since then, >>> but >>> I don't have any confidence in it, since evidently it's not been >>> stress >>> tested very hard. > >> I think you're probably confusing it with interdiff. > > No, because I never heard of interdiff before. Checking the archives, > the discussion I was remembering was definitely about filterdiff, but > the rap on it was undocumented (so maybe "demonstrated" is too harsh): > > http://archives.postgresql.org/pgsql-hackers/2007-10/msg01243.php > > regards, tom lane
Greg Stark <greg.stark@enterprisedb.com> writes: > On 26 May 2009, at 19:58, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> http://archives.postgresql.org/pgsql-hackers/2007-10/msg01243.php > Uhm the rap you quoted was ambiguous but I read it as referring to the > ability I described if viewing the difference between two patches -- > which I didn't name but is in fact interdiff. [ squint... ] Hmm, maybe you're right. I see how it could be read that way, anyway. regards, tom lane