Thread: Windows buildfarm animals are still not happy with abbreviated keys patch
Windows buildfarm animals are still not happy with abbreviated keys patch
From
Peter Geoghegan
Date:
Even following Robert's disabling of abbreviated keys on Windows, buildfarm animals hamerkop, brolga, currawong and bowerbird remain unhappy, with failing regression tests for "collate" and sometimes (but not always) "aggregates". Some of these only use the C locale. I think that "aggregates" does not fail for brolga because it's built without "locale_t" support (not sure how to interpret MSVC "configure" output, but I think that the other animals do have such support). So maybe this code being executed within btsortsupport_worker(), for the C locale on Windows is at issue (note that abbreviation is still disabled): tss->locale = pg_newlocale_from_collation(collid); That doesn't explain the other problems, though. -- Peter Geoghegan
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Robert Haas
Date:
On Wed, Jan 21, 2015 at 2:30 PM, Peter Geoghegan <pg@heroku.com> wrote: > Even following Robert's disabling of abbreviated keys on Windows, > buildfarm animals hamerkop, brolga, currawong and bowerbird remain > unhappy, with failing regression tests for "collate" and sometimes > (but not always) "aggregates". Some of these only use the C locale. > > I think that "aggregates" does not fail for brolga because it's built > without "locale_t" support (not sure how to interpret MSVC "configure" > output, but I think that the other animals do have such support). So > maybe this code being executed within btsortsupport_worker(), for the > C locale on Windows is at issue (note that abbreviation is still > disabled): > > tss->locale = pg_newlocale_from_collation(collid); Yes, you seem to have rather unwisely changed around the order of the checks in btsortsupport_worker(). I've just rewritten it heavily to try to more clearly separate the decision about whether to do fmgr-elision, and if so which comparator to use, from the decision about whether to use abbreviation. Naturally I can't promise that this is completely right, but I hope that, if it's not, it will at least be easier to fix. There's no sanity in removing the lc_collate_is_c() check from the top of the function and then trying to remember to exclude that case individually from the multiple places further down that count on the fact that they'll never be reached in that case. This function may even have a third decision to make someday, and they can't all be intertwined. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Robert Haas
Date:
On Thu, Jan 22, 2015 at 10:57 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jan 21, 2015 at 2:30 PM, Peter Geoghegan <pg@heroku.com> wrote: >> Even following Robert's disabling of abbreviated keys on Windows, >> buildfarm animals hamerkop, brolga, currawong and bowerbird remain >> unhappy, with failing regression tests for "collate" and sometimes >> (but not always) "aggregates". Some of these only use the C locale. >> >> I think that "aggregates" does not fail for brolga because it's built >> without "locale_t" support (not sure how to interpret MSVC "configure" >> output, but I think that the other animals do have such support). So >> maybe this code being executed within btsortsupport_worker(), for the >> C locale on Windows is at issue (note that abbreviation is still >> disabled): >> >> tss->locale = pg_newlocale_from_collation(collid); > > Yes, you seem to have rather unwisely changed around the order of the > checks in btsortsupport_worker(). I've just rewritten it heavily to > try to more clearly separate the decision about whether to do > fmgr-elision, and if so which comparator to use, from the decision > about whether to use abbreviation. Naturally I can't promise that > this is completely right, but I hope that, if it's not, it will at > least be easier to fix. There's no sanity in removing the > lc_collate_is_c() check from the top of the function and then trying > to remember to exclude that case individually from the multiple places > further down that count on the fact that they'll never be reached in > that case. This function may even have a third decision to make > someday, and they can't all be intertwined. This seems to have broken more stuff. My working hypothesis is that the culprit is here: /* * There is no special handling of the C locale here, unlike with * varstr_cmp(). strxfrm() is usedindifferently. */ As far as I can see, that's just hoping that when the locale is C, strxfrm() is memcpy(). If it isn't, then you will potentially get different results when the abbreviated keys stuff is used vs. when it isn't, because when it isn't, we're going to memcmp(), and when it is, we're going to memcmp() the results of strxfrm(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Robert Haas
Date:
On Thu, Jan 22, 2015 at 11:28 AM, Robert Haas <robertmhaas@gmail.com> wrote: > This seems to have broken more stuff. My working hypothesis is that > the culprit is here: > > /* > * There is no special handling of the C locale here, unlike with > * varstr_cmp(). strxfrm() is used indifferently. > */ > > As far as I can see, that's just hoping that when the locale is C, > strxfrm() is memcpy(). If it isn't, then you will potentially get > different results when the abbreviated keys stuff is used vs. when it > isn't, because when it isn't, we're going to memcmp(), and when it is, > we're going to memcmp() the results of strxfrm(). As noted also on the thread Kevin started, I've now pushed a fix for this and am eagerly awaiting new buildfarm returns. I wonder if this might account for the ordering failures we were seeing on Windows before. We thought it was a Windows-specific issue, but I bet the real problem was here: if (collid != DEFAULT_COLLATION_OID) { if (!OidIsValid(collid)) { /* * This typically means that the parser could not resolve a * conflict of implicit collations, so report it that way. */ ereport(ERROR, (errcode(ERRCODE_INDETERMINATE_COLLATION), errmsg("could not determine which collation to use for string comparison"), errhint("Use the COLLATE clause to set the collation explicitly."))); } #ifdef HAVE_LOCALE_T tss->locale = pg_newlocale_from_collation(collid); #endif } Before the abbreviated keys commit, this code only ran if we'd already determined that lc_collate_is_c(collid) was false. But that commit made this also run when that value was true. So if HAVE_LOCALE_T and collid != DEFAULT_COLLATION_ID, then tss->locale was getting set. If it got set to something that made strxfrm() behave like memcmp(), then all was well. If not, then life was bad. Now you'd hope that would work out, but maybe not. Also, suppose HAVE_LOCALE_T was defined but collid == DEFAULT_COLLATION_ID. Then tss->locale didn't get initialized at all, and bttext_abbrev_convert() just passed whatever stupid value it found there to strxfrm_l(). Finally, suppose we *didn't* HAVE_LOCALE_T. Then, the non-abbreviated comparator knew it should use memcmp(), but the abbreviated comparator was happy to use strxfrm() even though that would use the current locale, but the C locale that the user requested. Long story short: this was broken. It may be that when the dust settles we can try re-enabling this on Windows. It might work now that this issue is (hopefully) fixed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Robert Haas
Date:
On Thu, Jan 22, 2015 at 12:17 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Long story short: this was broken. It may be that when the dust > settles we can try re-enabling this on Windows. It might work now > that this issue is (hopefully) fixed. Uggh. That still wasn't right; I've pushed commit d060e07fa919e0eb681e2fa2cfbe63d6c40eb2cf to try to fix it. Stay tuned for more exciting dispatches from the department of abbreviated keys! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Peter Geoghegan
Date:
On Thu, Jan 22, 2015 at 9:55 AM, Robert Haas <robertmhaas@gmail.com> wrote: > Stay tuned for more exciting dispatches from the department of abbreviated keys! I certainly suspected that we'd have problems with Windows, but nothing this bad. -- Peter Geoghegan
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Robert Haas
Date:
On Thu, Jan 22, 2015 at 1:00 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Thu, Jan 22, 2015 at 9:55 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> Stay tuned for more exciting dispatches from the department of abbreviated keys! > > I certainly suspected that we'd have problems with Windows, but > nothing this bad. This isn't really Windows-specific. The root of the problem is that when LC_COLLATE=C you were trying to use strxfrm() for the abbreviated key even though memcmp() is the authoritative comparator in that case. Exactly which platforms happened to blow up as a result of that is kind of beside the point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Peter Geoghegan
Date:
On Thu, Jan 22, 2015 at 12:34 PM, Robert Haas <robertmhaas@gmail.com> wrote: > This isn't really Windows-specific. The root of the problem is that > when LC_COLLATE=C you were trying to use strxfrm() for the abbreviated > key even though memcmp() is the authoritative comparator in that case. > Exactly which platforms happened to blow up as a result of that is > kind of beside the point. I don't see how that could be a problem. Even if the strxfrm() blob wasn't identical to the original string (I think it should always be, accept maybe on MacOSX), it's still reasonable to assume that there is equivalent behavior across the C locale and locales with collations like en_US.UTF-8. It wasn't as if I was using strxfrm() within bttextfastcmp_c() -- just within bttext_abbrev_convert(), to form an abbreviated key for text that uses the C locale. The C locale is just another locale - AFAICT, the only reason we have treated it differently in PostgreSQL is because that tended to avoid needing to copy and NUL-terminate, which strcoll() requires. It might be that that doesn't work out for some reason (but not because strxfrm() will not have behavior equivalent to memcpy() with the C locale -- I was *not* relying on that). That having been said, it's clearer to continue to handle each case (C locale vs other locales) separately within the new bttext_abbrev_convert() function, just to be consistent, but also to avoid NUL-terminating the text strings to pass to strxfrm(), which as you point out is an avoidable cost. So, I'm not asking you to restore the terser uniform use of strxfrm() we had before, but, for what it's worth, that was not the real issue. The real issue was that strxfrm() spuriously used the wrong locale, as you said. Provided strxfrm() had the correct locale (the C locale), then AFAICT it ought to have been fine, regardless of whether or not it then behave exactly equivalently to memcpy(). -- Peter Geoghegan
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Robert Haas
Date:
On Thu, Jan 22, 2015 at 5:51 PM, Peter Geoghegan <pg@heroku.com> wrote: > That having been said, it's clearer to continue to handle each case (C > locale vs other locales) separately within the new > bttext_abbrev_convert() function, just to be consistent, but also to > avoid NUL-terminating the text strings to pass to strxfrm(), which as > you point out is an avoidable cost. So, I'm not asking you to restore > the terser uniform use of strxfrm() we had before, but, for what it's > worth, that was not the real issue. The real issue was that strxfrm() > spuriously used the wrong locale, as you said. Provided strxfrm() had > the correct locale (the C locale), then AFAICT it ought to have been > fine, regardless of whether or not it then behave exactly equivalently > to memcpy(). I don't agree. On a system where HAVE_LOCALE_T is not defined, there is *no way* to get strxfrm() to behave like memcpy(), because we're not passing a locale to it. Clearly, strxfrm() is going to do a locale-aware transformation in that case whether the user wrote collate "C" or not. The comments in regc_pg_locale.c explain: * 1. In C/POSIX collations, we use hard-wired code. We can't depend on* the <ctype.h> functions since those will obey LC_CTYPE. Note that these* collations don't give a fig about multibyte characters.** 2. In the "default" collation (whichis supposed to obey LC_CTYPE):** 2a. When working in UTF8 encoding, we use the <wctype.h> functions if* available. This assumes that every platform uses Unicode codepoints* directly as the wchar_t representation of Unicode. On some platforms* wchar_t is only 16 bits wide, so we have to punt for codepoints > 0xFFFF.** 2b. In all other encodings,or on machines that lack <wctype.h>, we use* the <ctype.h> functions for pg_wchar values up to 255, and punt forvalues* above that. This is only 100% correct in single-byte encodings such as* LATINn. However, non-Unicode multibyteencodings are mostly Far Eastern* character sets for which the properties being tested here aren't very* relevantfor higher code values anyway. The difficulty with using the* <wctype.h> functions with non-Unicode multibyte encodingsis that we can* have no certainty that the platform's wchar_t representation matches* what we do in pg_wchar conversions.**3. Other collations are only supported on platforms that HAVE_LOCALE_T.* Here, we use the locale_t-extendedforms of the <wctype.h> and <ctype.h>* functions, under exactly the same cases as #2. In other words, even on systems that don't HAVE_LOCALE_T, we still have to support the default collation and the C collation, and they have to behave differently. There's no way to make that work using only strxfrm(), because nothing gets passed to that function to tell it which of those two things it is supposed to do. Now even if the above were not an issue, for example because we dropped support for systems where !HAVE_LOCALE_T, I think it would be poor form to depend on strxfrm_l() to behave like memcpy() where we could just as easily call memcpy() and be *sure* that it was going to do what we wanted. Much of writing good code is figuring out what could go wrong and then figuring out how to prevent it, and relying on strxfrm_l() would be an unnecessary risk regardless. Given the existence of !HAVE_LOCALE_T systems, it's just plain broken. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Robert Haas
Date:
On Fri, Jan 23, 2015 at 12:34 PM, Robert Haas <robertmhaas@gmail.com> wrote: > In other words, even on systems that don't HAVE_LOCALE_T, we still > have to support the default collation and the C collation, and they > have to behave differently. There's no way to make that work using > only strxfrm(), because nothing gets passed to that function to tell > it which of those two things it is supposed to do. > > Now even if the above were not an issue, for example because we > dropped support for systems where !HAVE_LOCALE_T, I think it would be > poor form to depend on strxfrm_l() to behave like memcpy() where we > could just as easily call memcpy() and be *sure* that it was going to > do what we wanted. Much of writing good code is figuring out what > could go wrong and then figuring out how to prevent it, and relying on > strxfrm_l() would be an unnecessary risk regardless. Given the > existence of !HAVE_LOCALE_T systems, it's just plain broken. Now that these issues are fixed and the buildfarm is green again, I'm going to try re-enabling this optimization on Windows. My working theory is that disabling that categorically was a mis-diagnosis of the real problem, and that now that the issues mentioned above are cleaned up, it'll just work. That might not be right, but I think it's worth a try. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Peter Geoghegan
Date:
On Mon, Jan 26, 2015 at 11:05 AM, Robert Haas <robertmhaas@gmail.com> wrote: > Now that these issues are fixed and the buildfarm is green again, I'm > going to try re-enabling this optimization on Windows. My working > theory is that disabling that categorically was a mis-diagnosis of the > real problem, and that now that the issues mentioned above are cleaned > up, it'll just work. That might not be right, but I think it's worth > a try. Right. We're only supporting the C locale on Windows. How could Windows possibly be broken if locale-related functions like strxfrm() and strcoll() are not even called? -- Peter Geoghegan
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Robert Haas
Date:
On Mon, Jan 26, 2015 at 2:10 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Jan 26, 2015 at 11:05 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> Now that these issues are fixed and the buildfarm is green again, I'm >> going to try re-enabling this optimization on Windows. My working >> theory is that disabling that categorically was a mis-diagnosis of the >> real problem, and that now that the issues mentioned above are cleaned >> up, it'll just work. That might not be right, but I think it's worth >> a try. > > Right. We're only supporting the C locale on Windows. How could > Windows possibly be broken if locale-related functions like strxfrm() > and strcoll() are not even called? That's what I hope to find out. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Michael Paquier
Date:
On Tue, Jan 27, 2015 at 4:21 AM, Robert Haas <robertmhaas@gmail.com> wrote: > That's what I hope to find out. :-) Buildfarm seems happy now. I just gave a try to that on one of my small Windows VMs and compared the performance with 9.4 for this simple test case when building with MSVC 2010: create table aa as select random()::text as a, 'filler filler filler' as b a from generate_series(1,1000000); create index aai in aa(a): On 9.4, the index creation took 26.5s, while on master it took 18s. That's nice, particularly for things like a restore from a dump. -- Michael
Re: Windows buildfarm animals are still not happy with abbreviated keys patch
From
Robert Haas
Date:
On Mon, Jan 26, 2015 at 9:52 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Tue, Jan 27, 2015 at 4:21 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> That's what I hope to find out. :-) > Buildfarm seems happy now. I just gave a try to that on one of my > small Windows VMs and compared the performance with 9.4 for this > simple test case when building with MSVC 2010: > create table aa as select random()::text as a, 'filler filler filler' > as b a from generate_series(1,1000000); > create index aai in aa(a): > On 9.4, the index creation took 26.5s, while on master it took 18s. > That's nice, particularly for things like a restore from a dump. Cool. That's a little bit smaller win than I would have expected given my Linux results, but it's certainly respectable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company