Thread: [17] CREATE COLLATION default provider
Currently, CREATE COLLATION always defaults the provider to libc. The attached patch causes it to default to libc if LC_COLLATE/LC_CTYPE are specified, otherwise default to the current database default collation's provider. That way, the provider choice at initdb time then becomes the default for "CREATE DATABASE ... TEMPLATE template0", which then becomes the default provider for "CREATE COLLATION (LOCALE='...')". -- Jeff Davis PostgreSQL Contributor Team - AWS
Attachment
On Wed, Jun 14, 2023 at 9:48 PM Jeff Davis <pgsql@j-davis.com> wrote: > > Currently, CREATE COLLATION always defaults the provider to libc. > > The attached patch causes it to default to libc if LC_COLLATE/LC_CTYPE > are specified, otherwise default to the current database default > collation's provider. + if (lccollateEl || lcctypeEl) + collprovider = COLLPROVIDER_LIBC; + else + collprovider = default_locale.provider; The docs for the CREATE COLLATION option 'locale' say: "This is a shortcut for setting LC_COLLATE and LC_CTYPE at once." So it's not intuitive why the check does not include a test for the presence of 'localeEl', as well? If we consider the presence of LC_COLLATE _or_ LC_CTYPE options to be a determining factor for some decision, then the presence of LOCALE option should also lead to the same outcome. Otherwise the patch looks good. > v11-0001-CREATE-COLLATION-default-provider.patch I believe v11 is a typo, and you really meant v1. Best regards, Gurjeet http://Gurje.et
On Sat, 2023-06-17 at 09:09 -0700, Gurjeet Singh wrote: > The docs for the CREATE COLLATION option 'locale' say: "This is a > shortcut for setting LC_COLLATE and LC_CTYPE at once." > > So it's not intuitive why the check does not include a test for the > presence of 'localeEl', as well? If we consider the presence of > LC_COLLATE _or_ LC_CTYPE options to be a determining factor for some > decision, then the presence of LOCALE option should also lead to the > same outcome. > The docs say: "If provider is libc, this is a shortcut...". The point is that LC_COLLATE and LC_CTYPE act as a signal that what the user really wants is a libc collation. LOCALE works for either, so we need a default. That being said, I'm now having second thoughts about where that default should come from. While getting the default from datlocprovider is convenient, I'm not sure that the datlocprovider provides a good signal. A lot of users will have datlocprovider=c and datcollate=C, which really means they want the built-in memcmp behavior, and to me that doesn't signal that they want CREATE COLLATION to use libc for a non-C locale. A GUC might be a better default, and we could have CREATE COLLATION default to ICU if the server is built with ICU and if PROVIDER, LC_COLLATE and LC_CTYPE are unspecified. Regards, Jeff Davis
On Fri, Jul 7, 2023 at 9:33 AM Jeff Davis <pgsql@j-davis.com> wrote: > > On Sat, 2023-06-17 at 09:09 -0700, Gurjeet Singh wrote: > > The docs for the CREATE COLLATION option 'locale' say: "This is a > > shortcut for setting LC_COLLATE and LC_CTYPE at once." > > > > So it's not intuitive why the check does not include a test for the > > presence of 'localeEl', as well? If we consider the presence of > > LC_COLLATE _or_ LC_CTYPE options to be a determining factor for some > > decision, then the presence of LOCALE option should also lead to the > > same outcome. > > > > The docs say: "If provider is libc, this is a shortcut...". The point > is that LC_COLLATE and LC_CTYPE act as a signal that what the user > really wants is a libc collation. LOCALE works for either, so we need a > default. Sorry about the noise, I was consulting current/v15 docs online. Now that v16 docs are online, I can see that the option in fact says this is the case only if libc is the provider. (note to self: for reviewing patches to master, consult devel docs [1] online) [1]: https://www.postgresql.org/docs/devel/ Best regards, Gurjeet http://Gurje.et
On 15.06.23 06:47, Jeff Davis wrote: > Currently, CREATE COLLATION always defaults the provider to libc. > > The attached patch causes it to default to libc if LC_COLLATE/LC_CTYPE > are specified, otherwise default to the current database default > collation's provider. > > That way, the provider choice at initdb time then becomes the default > for "CREATE DATABASE ... TEMPLATE template0", which then becomes the > default provider for "CREATE COLLATION (LOCALE='...')". I like the general idea. If the user has selected ICU overall, it could be sensible that certain commands default to ICU. I wonder, however, how useful this would be in practice. In most interesting cases, you need to know what the provider is to be able to spell out the locale name appropriately. The cases where some overlap exists, like the common "ll_CC", are already preloaded, so won't actually need to be specified explicitly in many cases. Also, I think the default should only flow one way, top-down: The default provider of CREATE COLLATION is datlocprovider. There shouldn't be a second, botton-up way, based on the other specified CREATE COLLATION parameters. That's just too much logical zig-zag, IMO. Otherwise, if you extend this locally, why not also look at if "deterministic" or "rules" was specified, etc.
On Thu, 2024-01-18 at 11:15 +0100, Peter Eisentraut wrote: > I wonder, however, how useful this would be in practice. In most > interesting cases, you need to know what the provider is to be able > to > spell out the locale name appropriately. The cases where some > overlap > exists, like the common "ll_CC", are already preloaded, so won't > actually need to be specified explicitly in many cases. Good point. > Also, I think the default should only flow one way, top-down: The > default provider of CREATE COLLATION is datlocprovider. There > shouldn't > be a second, botton-up way, based on the other specified CREATE > COLLATION parameters. That's just too much logical zig-zag, IMO. Also a good point. I am withdrawing this patch from the CF, and we can reconsider the idea later perhaps in some other form. Regards, Jeff Davis