Thread: effective_cache_size vs units
Is there any special reason why I can't use "Mb" and "Gb" and such for effective_cache_size, the way I can for say shared_buffers? //Magnus
Magnus Hagander wrote: > Is there any special reason why I can't use "Mb" and "Gb" and such > for effective_cache_size, the way I can for say shared_buffers? You can't use "Mb" or "Gb" for shared_buffers either, because those are not accepted units. -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Mon, 2006-12-18 at 22:08 +0100, Peter Eisentraut wrote: > Magnus Hagander wrote: > > Is there any special reason why I can't use "Mb" and "Gb" and such > > for effective_cache_size, the way I can for say shared_buffers? > > You can't use "Mb" or "Gb" for shared_buffers either, because those are > not accepted units. Magnus, Here is a link that may help: http://www.postgresql.org/docs/8.2/static/config-setting.html It looks like it is very pedantic about the input it can receive. Sincerely, Joshua D. Drake > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
Peter Eisentraut wrote: > Magnus Hagander wrote: >> Is there any special reason why I can't use "Mb" and "Gb" and such >> for effective_cache_size, the way I can for say shared_buffers? > > You can't use "Mb" or "Gb" for shared_buffers either, because those are > not accepted units. > Oh, you mean MB vs Mb. Man, it had to be that simple :) Sorry about the noise. //Magnus
Magnus Hagander <magnus@hagander.net> writes: > Oh, you mean MB vs Mb. Man, it had to be that simple :) ISTM we had discussed whether guc.c should accept units strings in a case-insensitive manner, and the forces of pedantry won the first round. Shall we reopen that argument? regards, tom lane
On Mon, 2006-12-18 at 23:46 -0500, Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: > > Oh, you mean MB vs Mb. Man, it had to be that simple :) > > ISTM we had discussed whether guc.c should accept units strings in > a case-insensitive manner, and the forces of pedantry won the first > round. Shall we reopen that argument? I don't think that anyone is going to think, oh I am using 1000 Mega Bit of ram. Mb == MB in this case. That being said, it is documented and I don't know that it makes that much difference as long as the documentation is clear. Hmm, perhaps a quick statement in postgresql.conf to the effect of what is legal? E.g.: # # When setting memory parameters you may use a shortened syntax, e.g., # 1024MB or 1GB is 1 gigabyte of RAM. Note that MB/GB is capitalized. # Sincerely, Joshua D. Drake
On Mon, Dec 18, 2006 at 08:56:22PM -0800, Joshua D. Drake wrote: > On Mon, 2006-12-18 at 23:46 -0500, Tom Lane wrote: > > Magnus Hagander <magnus@hagander.net> writes: > > > Oh, you mean MB vs Mb. Man, it had to be that simple :) > > > > ISTM we had discussed whether guc.c should accept units strings in > > a case-insensitive manner, and the forces of pedantry won the first > > round. Shall we reopen that argument? > > I don't think that anyone is going to think, oh I am using 1000 Mega Bit > of ram. Mb == MB in this case. That being said, it is documented and I > don't know that it makes that much difference as long as the > documentation is clear. Is it possible to add an error hint to the message? Along the line of "HINT: Did you perhaps get your casing wrong" (with better wording, of course). //Magnus
Magnus Hagander wrote: > Is it possible to add an error hint to the message? Along the line of > "HINT: Did you perhaps get your casing wrong" (with better wording, > of course). Or how about we just make everything case-insensitive -- but case-preserving! -- on Windows only? -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Tue, Dec 19, 2006 at 10:01:05AM +0100, Peter Eisentraut wrote: > Magnus Hagander wrote: > > Is it possible to add an error hint to the message? Along the line of > > "HINT: Did you perhaps get your casing wrong" (with better wording, > > of course). > > Or how about we just make everything case-insensitive -- but > case-preserving! -- on Windows only? Wouldn't help me one bit, I had this problem on the install for search.postgresql.org, which runs on Ubuntu Linux. //Magnus
On Tue, 2006-12-19 at 10:01 +0100, Peter Eisentraut wrote: > Magnus Hagander wrote: > > Is it possible to add an error hint to the message? Along the line of > > "HINT: Did you perhaps get your casing wrong" (with better wording, > > of course). > > Or how about we just make everything case-insensitive -- but > case-preserving! -- on Windows only? Or we could simply add a helpful line to the postgresql.conf. Sincerely, Joshua D. Drake
On Tue, 2006-12-19 at 13:32 -0800, Joshua D. Drake wrote: > On Tue, 2006-12-19 at 10:01 +0100, Peter Eisentraut wrote: > > Magnus Hagander wrote: > > > Is it possible to add an error hint to the message? Along the line of > > > "HINT: Did you perhaps get your casing wrong" (with better wording, > > > of course). > > > > Or how about we just make everything case-insensitive -- but > > case-preserving! -- on Windows only? > > Or we could simply add a helpful line to the postgresql.conf.
Index: postgresql.conf.sample
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.199
diff -c -r1.199 postgresql.conf.sample
*** postgresql.conf.sample 21 Nov 2006 01:23:37 -0000 1.199
--- postgresql.conf.sample 19 Dec 2006 21:36:28 -0000
***************
*** 24,29 ****
--- 24,33 ----
  # settings, which are marked below, require a server shutdown and restart
  # to take effect.
+ #
+ # Any memory setting may use a shortened notation such as 1024MB or 1GB.
+ # Please take note of the case next to the unit size.
+ #
  #---------------------------------------------------------------------------
  # FILE LOCATIONS
Joshua D. Drake wrote: > On Tue, 2006-12-19 at 10:01 +0100, Peter Eisentraut wrote: > > Magnus Hagander wrote: > > > Is it possible to add an error hint to the message? Along the line of > > > "HINT: Did you perhaps get your casing wrong" (with better wording, > > > of course). > > > > Or how about we just make everything case-insensitive -- but > > case-preserving! -- on Windows only? > > Or we could simply add a helpful line to the postgresql.conf. Looking at the documentation I see: (possibly different) unit can also be specified explicitly. Valid memory units are <literal>kB</literal> (kilobytes), <literal>MB</literal> (megabytes), and <literal>GB</literal> (gigabytes); valid time units are <literal>ms</literal> (milliseconds), <literal>s</literal> (seconds), <literal>min</literal> (minutes), <literal>h</literal> (hours), and <literal>d</literal> (days). Note that the multiplier for memory units is 1024, not 1000. The only value to being case-sensitive in this area is to allow upper/lower case with different meanings, but I don't see us using that, so why do we bother caring about the case? -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
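Editorial sketch, not part of the original thread: the documentation excerpt above can be read as a small lookup table. The Python below is only an illustration of those documented rules (exact-case units, multiplier 1024); it is not the guc.c implementation, and the names MEMORY_UNITS_KB and parse_memory_kb are hypothetical.

```python
import re

# Memory units as listed in the quoted 8.2 documentation, expressed in kB.
# The multiplier between adjacent units is 1024, not 1000.
MEMORY_UNITS_KB = {"kB": 1, "MB": 1024, "GB": 1024 * 1024}


def parse_memory_kb(text):
    """Return the size in kilobytes for strings such as '32MB'.

    Matching is deliberately case-sensitive, mirroring the behaviour
    that tripped Magnus up ('Mb' is rejected, 'MB' is accepted).
    """
    match = re.fullmatch(r"(\d+)([A-Za-z]+)", text)
    if match is None:
        raise ValueError(f"invalid value: {text!r}")
    number, unit = match.groups()
    if unit not in MEMORY_UNITS_KB:
        raise ValueError(f"invalid value: {text!r}")
    return int(number) * MEMORY_UNITS_KB[unit]
```

Under these assumptions parse_memory_kb("1024MB") and parse_memory_kb("1GB") give the same number of kilobytes, while "32Mb" is refused outright.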
On Tue, 2006-12-19 at 16:47 -0500, Bruce Momjian wrote: > Joshua D. Drake wrote: > > On Tue, 2006-12-19 at 10:01 +0100, Peter Eisentraut wrote: > > > Magnus Hagander wrote: > > > > Is it possible to add an error hint to the message? Along the line of > > > > "HINT: Did you perhaps get your casing wrong" (with better wording, > > > > of course). > > > > > > Or how about we just make everything case-insensitive -- but > > > case-preserving! -- on Windows only? > > > > Or we could simply add a helpful line to the postgresql.conf. > > Looking at the documentation I see: > > (possibly different) unit can also be specified explicitly. Valid > memory units are <literal>kB</literal> (kilobytes), > <literal>MB</literal> (megabytes), and <literal>GB</literal> > (gigabytes); valid time units are <literal>ms</literal> > (milliseconds), <literal>s</literal> (seconds), > <literal>min</literal> (minutes), <literal>h</literal> (hours), > and <literal>d</literal> (days). Note that the multiplier for > memory units is 1024, not 1000. > > The only value to being case-sensitive in this area is to allow > upper/lower case with different meanings, but I don't see us using that, > so why do we bother caring about the case? Because it is technically correct :). Sincerely, Joshua D. Drake
Joshua D. Drake wrote: > + # > + # Any memory setting may use a shortened notation such as 1024MB or > 1GB. > + # Please take note of the case next to the unit size. > + # Well, if you add that, you should also list all the other valid units. But it's quite redundant, because nearly all the parameters that take units are already listed with units in the default file. (Which makes Magnus's mistake all the more curious.) In my mind, this is pretty silly. There is no reputable precedent anywhere for variant capitalization in unit names. Next thing we point out that zeros are significant in the interior of numbers, or what? -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Tue, 2006-12-19 at 22:59 +0100, Peter Eisentraut wrote: > Joshua D. Drake wrote: > > + # > > + # Any memory setting may use a shortened notation such as 1024MB or > > 1GB. > > + # Please take note of the case next to the unit size. > > + # > > Well, if you add that, you should also list all the other valid units. Why? It is clearly just an example. > But it's quite redundant, because nearly all the parameters that take > units are already listed with units in the default file. (Which makes > Magnus's mistake all the more curious.) Not really, most people I know don't even consider the difference between MB and Mb... shoot, most people think that 1000MB equals one gigabyte. > > In my mind, this is pretty silly. There is no reputable precedent > anywhere for variant capitalization in unit names. I am not suggestion variant capitalization. I am suggestion a simple document patch to help eliminate what may not be obvious. Sincerely, Joshua D. Drake
Peter Eisentraut wrote: > Joshua D. Drake wrote: >> + # >> + # Any memory setting may use a shortened notation such as 1024MB or >> 1GB. >> + # Please take note of the case next to the unit size. >> + # > > Well, if you add that, you should also list all the other valid units. > But it's quite redundant, because nearly all the parameters that take > units are already listed with units in the default file. (Which makes > Magnus's mistake all the more curious.) > The explanation is pretty simple. I was in a hurry to set it, just opened the file up in vi, jumped to effective_cache_size, and set it. I remembered that "hey, I can spec it in Mb now, I don't have to think, brilliant", and just typed it in. Restarted pg and noticed it wouldn't start... Had I actually read through all the documentation before I did it, it certainly wouldn't have been a problem. I doubt many users actually do that, though. In most cases, I just assume they would just assume they can't use units on it because the default value in the file doesn't have units. And frankly, this is the only case I can recall having seen when the input is actually case sensitive between Mb and MB. Could be that I'm not exposed to enough systems that take such input, but I can't imagine there aren't others who would make the same mistake. //Magnus
> > In my mind, this is pretty silly. There is no reputable precedent > > anywhere for variant capitalization in unit names. > > I am not suggestion variant capitalization. I am suggestion a simple > document patch to help eliminate what may not be obvious. Good lord... *suggesting* Joshua D. Drake
Bruce Momjian wrote: > The only value to being case-sensitive in this area is to allow > upper/lower case with different meanings, but I don't see us using > that, so why do we bother caring about the case? Because the units are what they are. In broader terms, we may one day want to have other units or a units-aware data type, so introducing incompatibilities now would be unfortunate. -- Peter Eisentraut http://developer.postgresql.org/~petere/
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Magnus Hagander <magnus@hagander.net> writes: >> Oh, you mean MB vs Mb. Man, it had to be that simple :) > > ISTM we had discussed whether guc.c should accept units strings in > a case-insensitive manner, and the forces of pedantry won the first > round. Shall we reopen that argument? Nope, I just checked back in the archive and that's not what happened. There was an extended discussion about whether to force users to use the silly KiB, MiB, etc units. Thankfully the pedants lost that round soundly. There was no particular discussion about case sensitivity though Simon made the case for user-friendly behaviour: > I think we are safe assume to that > > kB = KB = kb = Kb = 1024 bytes > > mB = MB = mb = Mb = 1024 * 1024 bytes > > gB = GB = gb = Gb = 1024 * 1024 * 1024 bytes > > There's no value in forcing the use of specific case and it will be just > confusing for people. http://archives.postgresql.org/pgsql-hackers/2006-07/msg01253.php And Jim Nasby said something similar: > Forcing people to use a specific casing scheme is just going to lead to > confusion and user frustration. If there's not a very solid *functional* > argument for it, we shouldn't do it. Wanting to enforce a convention that > people rarely use isn't a good reason. http://archives.postgresql.org/pgsql-hackers/2006-07/msg01355.php There was a lone comment from Thomas Hallgren in favour of case sensitivity in the name of consistency. But Nasby's comment was directly in response and nobody else piped up after that. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Magnus Hagander wrote: > In most cases, I just assume they would just assume > they can't use units on it because the default value in the file > doesn't have units. But the default value *does* have units. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut wrote: > Magnus Hagander wrote: >> In most cases, I just assume they would just assume >> they can't use units on it because the default value in the file >> doesn't have units. > > But the default value *does* have units. > It does? Didn't in my file. I must've overwritten it with a config file from some earlier beta (or snapshot) that didn't have it or so - my default value certainly didn't have it ;-) //Magnus
On Tue, 2006-12-19 at 23:39 +0100, Magnus Hagander wrote: > Peter Eisentraut wrote: > > Magnus Hagander wrote: > >> In most cases, I just assume they would just assume > >> they can't use units on it because the default value in the file > >> doesn't have units. > > > > But the default value *does* have units. > > > It does? Didn't in my file. I must've overwritten it with a config file > from some earlier beta (or snapshot) that didn't have it or so - my > default value certainly didn't have it ;-) Just to confirm, yes the sample I submitted a patch for does have units. Joshua D. Drake
Joshua D. Drake wrote: > I am not suggestion variant capitalization. I am suggestion a simple > document patch to help eliminate what may not be obvious. Perhaps it would be more effective to clarify the error message? Right now it just says something to the effect of "invalid integer". I'd imagine "invalid memory unit: TB" would be less confusing. -- Peter Eisentraut http://developer.postgresql.org/~petere/
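Editorial sketch, not part of the original thread: Peter's "invalid memory unit: TB" wording and Magnus's earlier HINT idea can be combined. The Python below is a hypothetical illustration only; check_memory_unit and the exact hint wording are assumptions, not text from PostgreSQL.

```python
import re

# Valid memory units as listed in the 8.2 documentation quoted in this thread.
VALID_MEMORY_UNITS = ("kB", "MB", "GB")


def check_memory_unit(value):
    """Return None if the value is acceptable, else an error string."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", value)
    if match is None:
        # Not even a number followed by letters.
        return f"invalid integer: {value}"
    unit = match.group(2)
    if unit == "" or unit in VALID_MEMORY_UNITS:
        return None
    message = f"invalid memory unit: {unit}"
    # Hypothetical hint: offer the correctly cased spelling when only
    # the capitalization is wrong.
    for valid in VALID_MEMORY_UNITS:
        if valid.lower() == unit.lower():
            message += f'\nHINT: did you mean "{valid}"?'
    return message
```

With this sketch "1TB" yields the plain "invalid memory unit: TB", while "32Mb" additionally gets the hint pointing at "MB".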
"Kenneth Marshall" <ktm@it.is.rice.edu> writes: > My one comment is that a little 'b' is used to indicate bits normally > and a capital 'B' is used to indicate bytes. So > kb = '1024 bits' > kB = '1024 bytes' > I do think that whether or not the k/m/g is upper case or lower case > is immaterial. Yes, well, no actually there are standard capitalizations for the k and M and G. A lowercase g is a gram and a lowercase m means "milli-". But I think that only gets you as far as concluding that Postgres ought to consistently use kB MB and GB in its own output. Which afaik it does. To reach a conclusion about whether it should restrict valid user input similarly you would have to make some sort of argument about what problems it could lead to if we allow users to be sloppy. I could see such an argument being made but it requires a lot of speculation about hypothetical future parameters and future problems. When we have known real problems today. And yes, btw, the case sensitivity of these units had already surprised and bothered me earlier and I failed to mention it at the time. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark wrote: > > "Kenneth Marshall" <ktm@it.is.rice.edu> writes: > > > My one comment is that a little 'b' is used to indicate bits normally > > and a capital 'B' is used to indicate bytes. So > > kb = '1024 bits' > > kB = '1024 bytes' > > I do think that whether or not the k/m/g is upper case or lower case > > is immaterial. > > Yes, well, no actually there are standard capitalizations for the k and M and > G. A lowercase g is a gram and a lowercase m means "milli-". I will have 150 grams of shared memory, please. > But I think that only gets you as far as concluding that Postgres ought to > consistently use kB MB and GB in its own output. Which afaik it does. > > To reach a conclusion about whether it should restrict valid user input > similarly you would have to make some sort of argument about what problems it > could lead to if we allow users to be sloppy. > > I could see such an argument being made but it requires a lot of speculation > about hypothetical future parameters and future problems. When we have known > real problems today. > > And yes, btw, the case sensitivity of these units had already surprised and > bothered me earlier and I failed to mention it at the time. Agreed. However, I see 'ms' as milliseconds, so perhaps the M vs. m is already in use. I think we at least need to document the case sensitivity and improve the error message. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
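Editorial sketch, not part of the original thread: Bruce's worry about 'ms' versus 'M' can be checked mechanically. The Python below folds the case of the units listed in the 8.2 documentation quoted earlier in the thread and reports any that would become indistinguishable. The helper name case_collisions is hypothetical.

```python
from collections import defaultdict

# Memory and time units from the documentation excerpt quoted in the thread.
UNITS = ["kB", "MB", "GB", "ms", "s", "min", "h", "d"]


def case_collisions(units):
    """Group units by their lowercased spelling and keep only clashes."""
    groups = defaultdict(list)
    for unit in units:
        groups[unit.lower()].append(unit)
    return {folded: names for folded, names in groups.items() if len(names) > 1}
```

For that list the result is empty: "ms" folds to "ms" and "MB" folds to "mb", so accepting units case-insensitively would not make any two of the documented units ambiguous. A clash would only appear if a unit such as a hypothetical "Mb" were added alongside "MB".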
Peter Eisentraut <peter_e@gmx.net> writes: > Perhaps it would be more effective to clarify the error message? Right > now it just says something to the effect of "invalid integer". I'd > imagine "invalid memory unit: TB" would be less confusing. +1 on that, but I think we should just accept the strings case-insensitively, too. SQL in general is not case sensitive for keywords, and neither is anything else in the postgresql.conf file, so I argue it's inconsistent to be strict about the case for units. Nor do I believe that we'd ever accept a future patch that made the distinction between "kb" and "kB" significant --- if you think people are confused now, just imagine what would happen then. regards, tom lane
On Tue, 2006-12-19 at 19:16 -0500, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > Perhaps it would be more effective to clarify the error message? Right > > now it just says something to the effect of "invalid integer". I'd > > imagine "invalid memory unit: TB" would be less confusing. > > +1 on that, but I think we should just accept the strings > case-insensitively, too. SQL in general is not case sensitive for > keywords, and neither is anything else in the postgresql.conf file, > so I argue it's inconsistent to be strict about the case for units. Hello, Attached is a simple patch that replaces strcmp() with pg_strcasecmp(). Thanks to AndrewS for pointing out that I shouldn't use strcasecmp(). I compiled and installed, ran an initdb with 32mb (versus 32MB) and it seems to work correctly with a show shared_buffers; Sincerely, Joshua D. Drake
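Editorial sketch, not part of the original thread: the attached patch is not reproduced in this archive, so the following is only a Python analogue of the idea it describes (comparing unit strings without regard to case), not the patch itself. The names MEMORY_UNITS_KB and parse_memory_kb_nocase are hypothetical.

```python
import re

# Keys are stored lowercased so that lookups can ignore the case typed
# by the user. Values are in kB; the multiplier between units is 1024.
MEMORY_UNITS_KB = {"kb": 1, "mb": 1024, "gb": 1024 * 1024}


def parse_memory_kb_nocase(text):
    """Return the size in kB, accepting units in any capitalization."""
    match = re.fullmatch(r"(\d+)([A-Za-z]+)", text)
    if match is None:
        raise ValueError(f"invalid value: {text!r}")
    number, unit = match.groups()
    multiplier = MEMORY_UNITS_KB.get(unit.lower())
    if multiplier is None:
        raise ValueError(f"invalid memory unit: {unit}")
    return int(number) * multiplier
```

Under this sketch "32mb", "32Mb" and "32MB" all resolve to the same 32768 kB, which matches the behaviour Joshua reports from his initdb test.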
Tom Lane wrote: > Nor do I believe that we'd ever accept a future patch that made > the distinction between "kb" and "kB" significant --- if you think > people are confused now, just imagine what would happen then. As I said elsewhere, I'd imagine future functionality like a units-aware data type, which has been talked about several times, and then this would be really bad. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Joshua D. Drake wrote: > I compiled and installed, ran an initdb with 32mb (versus 32MB) and > it seems to work correctly with a show shared_buffers; Did it actually allocate 32 millibits of shared buffers? -- Peter Eisentraut http://developer.postgresql.org/~petere/
Tom Lane wrote: > +1 on that, but I think we should just accept the strings > case-insensitively, too. I think if we'd allow this to spread, documentation, example files and other material would use it inconsistently, and even more people would be confused and it would make us look silly. It's not like anyone has pointed out a real use case here. The default file has the units already, so it's not like they're hard to guess. And Magnus's issue was that the error message was confusing. So let's fix that. -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Wed, 2006-12-20 at 02:19 +0100, Peter Eisentraut wrote: > Joshua D. Drake wrote: > > I compiled and installed, ran an initdb with 32mb (versus 32MB) and > > it seems to work correctly with a show shared_buffers; > > Did it actually allocate 32 millibits of shared buffers? Funny :) Joshua D. Drake
> Hello, > > Attached is a simple patch that replaces strcmp() with pg_strcasecmp(). > Thanks to AndrewS for pointing out that I shouldn't use strcasecmp(). > That should be AndrewD :) J > I compiled and installed, ran an initdb with 32mb (versus 32MB) and it > seems to work correctly with a show shared_buffers; > > Sincerely, > > Joshua D. Drake
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane wrote: >> Nor do I believe that we'd ever accept a future patch that made >> the distinction between "kb" and "kB" significant --- if you think >> people are confused now, just imagine what would happen then. > As I said elsewhere, I'd imagine future functionality like a units-aware > data type, which has been talked about several times, and then this > would be really bad. Only if the units-aware datatype insisted on case-sensitive units, which is at variance with the SQL spec's treatment of keywords, the existing practice in postgresql.conf, the existing practice in our datatypes such as timestamp and interval:

regression=# select '20-dec-2006'::timestamp;
      timestamp
---------------------
 2006-12-20 00:00:00
(1 row)

regression=# select '20-DEC-2006'::timestamp;
      timestamp
---------------------
 2006-12-20 00:00:00
(1 row)

regression=# select '20-Dec-2006 America/New_York'::timestamptz;
      timestamptz
------------------------
 2006-12-20 00:00:00-05
(1 row)

regression=# select '20-Dec-2006 AMERICA/NEW_york'::timestamptz;
      timestamptz
------------------------
 2006-12-20 00:00:00-05
(1 row)

regression=# select '20-Dec-2006 PST'::timestamptz;
      timestamptz
------------------------
 2006-12-20 03:00:00-05
(1 row)

regression=# select '20-Dec-2006 pst'::timestamptz;
      timestamptz
------------------------
 2006-12-20 03:00:00-05
(1 row)

regression=# select '1 day'::interval;
 interval
----------
 1 day
(1 row)

regression=# select '1 DAY'::interval;
 interval
----------
 1 day
(1 row)

and in general, there is simply not any other part of Postgres or SQL that you can point to that supports the idea that case sensitivity for keywords is expected behavior. So I think we'd flat-out reject any such datatype. (Hmm, I wonder what Tom Dunstan's enum patch does about case sensitivity...) regards, tom lane
Peter Eisentraut wrote: > Tom Lane wrote: >> Nor do I believe that we'd ever accept a future patch that made >> the distinction between "kb" and "kB" significant --- if you think >> people are confused now, just imagine what would happen then. > > As I said elsewhere, I'd imagine future functionality like a units-aware > data type, which has been talked about several times, and then this > would be really bad. > Most if not all of us here with computer knowledge (particularly at the programming level) know the difference between capital and lowercase memory/data size abbreviations. Case-insensitive size measurements don't matter if you actually know what the abbreviations mean. The case where case matters ;-) is b and B (bits and bytes, for those that don't know the difference). And if you don't know the difference between m and M: one is a fraction of a unit and the other is a multiple of it. So mB would technically mean 0.001 of a byte. I'd like to see you allocate that!! As is the case with many English words, the context of the usage makes a big difference in the interpretation. Given that the purpose of the effective_cache_size setting (and similar) is to specify the amount of memory you want allocated or limited to, that context allows you to assume that all unit abbreviations specify bytes/kilobytes/megabytes/gigabytes and not bits/kilobits/millibits/millibytes etc. As for the future: well, TB is getting more common, petabytes of storage have been talked about, and 64-bit systems can access exabytes of RAM. Next would be zettabytes and yottabytes. Unless we start a Roman numeral system for amounts bigger than that, I seriously doubt that we will hit any confusion with future usage (and I doubt in our lifetimes). Though I suppose with storage expansion rates increasing the way they have the last few years we may be using yottabyte hard drives on our 256-bit systems with 512 zettabytes of RAM in about 15 years ;-) That might make it around the end of life for the 8.0 branch, so maybe we need to consider handling these future storage needs soon? Maybe in 40 years we will all retire with mega-yotta-byte drives in our PDA watches? As for units-aware datatypes: what options are available will need to be decided at implementation time. Will we allow megabit (Mb) size allocations or only megabyte? I would say bits would be clearly specified as such (bit instead of b). Let's skip any flame wars on this and concentrate on the humorous future storage sizes. -- Shane Ambler pgSQL@007Marketing.com Get Sheeky @ http://Sheeky.Biz
On 12/19/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think we should just accept the strings case-insensitively, too. While acknowledging Peter's pedantically-correct points, I say +1 for ease of use. -- Jonah H. Harris, Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 33 Wood Ave S, 3rd Floor | jharris@enterprisedb.com Iselin, New Jersey 08830 | http://www.enterprisedb.com/
On Dec 19, 2006, at 9:50 PM, Jonah H. Harris wrote: > On 12/19/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think we should just accept the strings case-insensitively, too. > > While acknowledging Peter's pedantically-correct points, I say +1 for > ease of use. +1. I spend some time walking people through tuning issues by phone or IM. Anything that complicates supporting users or frustrates users for no actual benefit is a bad thing. (And this is unrelated to any theoretical units-aware data type - we might well be interested in milliwatts and megawatts in a datatype, but in the configuration file we're unlikely to ever need to configure things in units of millibits). Cheers, Steve
Tom Lane wrote: > (Hmm, I wonder what Tom Dunstan's enum patch does about case > sensitivity...) Currently enum labels are case sensitive. I was a bit ambivalent about it... case insensitivity can lead to less surprises in some cases, but many programming languages that have enums are case sensitive, and so this wouldn't be a direct map for them. OTOH, if someone's doing evil things like sticking labels that differ only in case into an enum, perhaps they *should* be dissuaded. :) The question is where does it end, though? Should we treat letters with accents and umlauts as equivalent as well? Do we remove punctuation characters? It gets into a (for me) more murky localization issue, and I'm not familiar with the postgresql apis for handling that. Maybe it's easy. Since we basically accept any old thing into an enum label, I think we probably shouldn't muck with it. If we want to have some sort of normalized version, then we should probably restrict the characters that we accept fairly severely. Also note that enum values are far more likely to be set by application code than by a human typing the value in directly, so in that sense the need for case insensitivity seems somewhat diminished. I suppose we should think about mysql refugees at some point, though. I wonder what they do. The documentation is silent on the matter (and all their examples are in lower case). Mysql is generally case insensitive, right? Cheers Tom
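Editorial sketch, not part of the original thread: Tom Dunstan's "where does it end" question separates two different normalizations. The Python below is a hypothetical illustration of that distinction using the standard unicodedata module; it says nothing about how the enum patch or PostgreSQL itself behaves, and both function names are invented for the example.

```python
import unicodedata


def fold_case(label):
    """Case-insensitive key: ignores capitalization only."""
    return label.casefold()


def fold_accents_and_case(label):
    """More aggressive key: also drops accents and umlauts."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()
```

With these helpers "Red" and "RED" compare equal under either key, whereas "caf\u00e9" and "cafe" only compare equal under the second one. That is the kind of extra, locale-flavoured decision the message is wary of making for enum labels.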
On Tue, Dec 19, 2006 at 10:12:34PM +0000, Gregory Stark wrote: > > "Tom Lane" <tgl@sss.pgh.pa.us> writes: > > > Magnus Hagander <magnus@hagander.net> writes: > >> Oh, you mean MB vs Mb. Man, it had to be that simple :) > > > > ISTM we had discussed whether guc.c should accept units strings in > > a case-insensitive manner, and the forces of pedantry won the first > > round. Shall we reopen that argument? > > Nope, I just checked back in the archive and that's not what happened. There > was an extended discussion about whether to force users to use the silly KiB, > MiB, etc units. Thankfully the pedants lost that round soundly. > > There was no particular discussion about case sensitivity though Simon made > the case for user-friendly behaviour: > > > I think we are safe assume to that > > > > kB = KB = kb = Kb = 1024 bytes > > > > mB = MB = mb = Mb = 1024 * 1024 bytes > > > > gB = GB = gb = Gb = 1024 * 1024 * 1024 bytes > > > > There's no value in forcing the use of specific case and it will be just > > confusing for people. > > http://archives.postgresql.org/pgsql-hackers/2006-07/msg01253.php > > And Jim Nasby said something similar: > > > Forcing people to use a specific casing scheme is just going to lead to > > confusion and user frustration. If there's not a very solid *functional* > > argument for it, we shouldn't do it. Wanting to enforce a convention that > > people rarely use isn't a good reason. > > http://archives.postgresql.org/pgsql-hackers/2006-07/msg01355.php > > There was a lone comment from Thomas Hallgren in favour of case sensitivity in > the name of consistency. But Nasby's comment was directly in response and > nobody else piped up after that. > My one comment is that a little 'b' is used to indicate bits normally and a capital 'B' is used to indicate bytes. So kb = '1024 bits' kB = '1024 bytes' ... I do think that whether or not the k/m/g is upper case or lower case is immaterial. 
Ken
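The equivalence Simon proposed (kB = KB = kb = Kb, and likewise for MB and GB) can be sketched in a few lines. This is only an illustration under stated assumptions: `parse_memory_setting` and `UNIT_BYTES` are hypothetical names, the input is assumed to be a non-negative integer followed by a unit, and none of this is the actual guc.c code.

```python
import re

# Multipliers in bytes, keyed by the lower-cased unit so that
# kB, KB, kb and Kb all resolve to the same entry.
UNIT_BYTES = {
    "kb": 1024,
    "mb": 1024 * 1024,
    "gb": 1024 * 1024 * 1024,
}

# An integer, optional whitespace, then an alphabetic unit.
_SETTING_RE = re.compile(r"^\s*([0-9]+)\s*([A-Za-z]+)\s*$")


def parse_memory_setting(text):
    """Return the size in bytes; raise ValueError on unparseable input."""
    match = _SETTING_RE.match(text)
    if match is None:
        raise ValueError(f"cannot parse memory setting: {text!r}")
    number, unit = match.groups()
    # Fold the unit to lower case before the lookup (the whole point here).
    multiplier = UNIT_BYTES.get(unit.lower())
    if multiplier is None:
        raise ValueError(f"unknown unit: {unit!r}")
    return int(number) * multiplier
```

With this sketch, "1024Mb", "1024MB" and "1gb" all come out as the same number of bytes, while a unit outside the table, such as "KiB", is still rejected.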
On Wednesday, 20 December 2006 13:42, Tom Dunstan wrote:
> I suppose we should think about mysql refugees at some point, though. I
> wonder what they do. The documentation is silent on the matter (and all
> their examples are in lower case). Mysql is generally case insensitive,
> right?

Maybe you can make sense of this, but I can't ...

mysql> create table test (a int, b enum ('x', 'X', 'y'));
Query OK, 0 rows affected, 1 warning (0.02 sec)

mysql> show warnings;
+-------+------+---------------------------------------------+
| Level | Code | Message                                     |
+-------+------+---------------------------------------------+
| Note  | 1291 | Column 'b' has duplicated value 'x' in ENUM |
+-------+------+---------------------------------------------+
1 row in set (0.00 sec)

mysql> insert into test values (1, 'x');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test values (1, 'X');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test values (1, 'y');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test values (1, 'Y');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test values (1, 'z');
Query OK, 1 row affected, 1 warning (0.00 sec)

## You think that was funny -- now watch this:

mysql> select * from test;
+------+------+
| a    | b    |
+------+------+
|    1 | x    |
|    1 | x    |
|    1 | y    |
|    1 | y    |
|    1 |      |
+------+------+
5 rows in set (0.00 sec)

mysql> drop table test;
Query OK, 0 rows affected (0.00 sec)

mysql> create table test (a int, b enum ('ä', 'Ä', ' ', ' '));
## Above is a-diaeresis, A-diaeresis.
Query OK, 0 rows affected, 2 warnings (0.00 sec)

mysql> show warnings;
+-------+------+---------------------------------------------+
| Level | Code | Message                                     |
+-------+------+---------------------------------------------+
| Note  | 1291 | Column 'b' has duplicated value '?' in ENUM |  # literal ?
| Note  | 1291 | Column 'b' has duplicated value '' in ENUM  |
+-------+------+---------------------------------------------+
2 rows in set (0.00 sec)

mysql> insert into test values (1, ' ');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test values (1, ' ');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test values (1, ' ');
Query OK, 1 row affected (0.01 sec)

mysql> insert into test values (1, ' |');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'b' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)

mysql> select distinct * from test;
+------+------+
| a    | b    |
+------+------+
|    1 | ä    |
|    1 |      |
|    1 |      |
+------+------+

Better not imitate that.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut wrote: > On Wednesday, 20 December 2006 13:42, Tom Dunstan wrote: > >> I suppose we should think about mysql refugees at some point, though. I >> wonder what they do. The documentation is silent on the matter (and all >> their examples are in lower case). Mysql is generally case insensitive, >> right? >> > > Maybe you can make sense of this, but I can't ... > > > [lots of amusing non-orthogonal braindead stuff ...] > > Better not imitate that. > > The MySQL treatment of enums is generally quite reprehensible. The proposed patch by contrast fits quite well into our existing type system, I think. MySQL users migrating will have a bit of work to do. I don't think their experience has much to teach us (except how not to do enums). We should decide on case sensitivity without having reference to it. cheers andrew
On Tue, 2006-12-19 at 22:06 -0800, Steve Atkins wrote: > On Dec 19, 2006, at 9:50 PM, Jonah H. Harris wrote: > > > On 12/19/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> I think we should just accept the strings case-insensitively, too. > > > > While acknowledging Peter's pedantically-correct points, I say +1 for > > ease of use. > > +1. I spend some time walking people through tuning issues > by phone or IM. Anything that complicates supporting users or > frustrates users for no actual benefit is a bad thing. > > (And this is unrelated to any theoretical units-aware data type - > we might well be interested in milliwatts and megawatts in a > datatype, but in the configuration file we're unlikely to ever > need to configure things in units of millibits). Where we at on this? Joshua D. Drake
"Joshua D. Drake" <jd@commandprompt.com> writes: > On 12/19/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I think we should just accept the strings case-insensitively, too. > Where we at on this? Anyone against making it case-insensitive, speak now or hold your peace. regards, tom lane
>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: TL> Anyone against making it case-insensitive, speak now or hold your TL> peace. SI-units are inherently case-sensitive. The obvious example is that now you will allow people to specify an amount in millibytes, while interpreting it in megabytes. You are trying to interpret valid input as misspellings based on context, and then you silently guess at what the user really meant. That's MySQL behaviour! /Benny
On Wed, Dec 27, 2006 at 09:39:22AM +0100, Benny Amorsen wrote: > >>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: > > TL> Anyone against making it case-insensitive, speak now or hold your > TL> peace. > > SI-units are inherently case-sensitive [...] As a notorious lurker, set my weight to one mg (choose the case ;-) I'd tend to keep case sensitivity, but will hold my peace regardless of outcome. Thanks -- tomás
On Wed, Dec 27, 2006 at 09:39:22AM +0100, Benny Amorsen wrote: > >>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: > > TL> Anyone against making it case-insensitive, speak now or hold your > TL> peace. > > SI-units are inherently case-sensitive. The obvious example is that > now you will allow people to specify an amount in millibytes, while > interpreting it in megabytes. Yes, and I can't think of a single reason why we'd let people specify anything in millibytes, or kilobits. Truth is, I bet many (if not most) DBAs barely know that case matters in the units. And even those that do are likely to prefer ease of use to pedantics. As for the comments about SI datatypes and MySQL-isms, there's a hell of a big difference between muddying the line in a config file for ease of use and stomping on data that's actually stored in the database. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
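The millibytes-versus-megabytes objection comes down to a collision after case folding. A minimal sketch of that collision, assuming a small illustrative subset of SI prefixes with their decimal values (the thread itself uses 1024-based multipliers for memory settings); `SI_PREFIXES` and `folding_collisions` are names made up for this example:

```python
from fractions import Fraction

# A few SI prefixes with their standard decimal values.
# "m" (milli) and "M" (mega) differ only in case.
SI_PREFIXES = {
    "m": Fraction(1, 1000),
    "k": Fraction(1000),
    "M": Fraction(1000000),
    "G": Fraction(1000000000),
}


def folding_collisions(prefixes):
    """Group the symbols that become indistinguishable after lower-casing."""
    groups = {}
    for symbol in prefixes:
        groups.setdefault(symbol.lower(), []).append(symbol)
    # Keep only the folded keys shared by more than one symbol.
    return {key: sorted(symbols) for key, symbols in groups.items() if len(symbols) > 1}
```

In this subset the only collision is between "m" and "M", which are a factor of 10^9 apart. Whether that matters for a configuration file that never needs milli-anything is exactly what the thread is arguing about.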
>>>>> "JCN" == Jim C Nasby <jim@nasby.net> writes: JCN> Truth is, I bet many (if not most) DBAs barely know that case JCN> matters in the units. Sounds like the school system needs fixing, then. /Benny
Benny Amorsen wrote: > >>>>> "JCN" == Jim C Nasby <jim@nasby.net> writes: > > JCN> Truth is, I bet many (if not most) DBAs barely know that case > JCN> matters in the units. > > Sounds like the school system needs fixing, then. Sure, but it probably shows a lot more prominently in other areas than in unit casing though. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Benny Amorsen wrote: > >>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: > > TL> Anyone against making it case-insensitive, speak now or hold your > TL> peace. > > SI-units are inherently case-sensitive. The obvious example is that > now you will allow people to specify an amount in millibytes, while > interpreting it in megabytes. And as Peter points out, there may be a desire to support SI-type units in other parts of the database at some time in the future. It seems like a questionable idea to break with convention just for ease of use. > You are trying to interpret valid input as misspellings based on > context, and then you silently guess at what the user really meant. > That's MySQL behaviour! I agree. But perhaps the solution instead of failing is to throw a warning to the effect of "Not to be pedantic, but you said mb and millibits as a unit doesn't make sense in this context. Assuming you meant MB (MegaBits)." and then start up. Generally I'm against guessing what the user really wants, but in this case, it seems pretty difficult to guess wrong. But either way I'm always dead set against _silently_ guessing. Andrew
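The "accept it, but say so" compromise could look roughly like the sketch below. It assumes the canonical spellings are kB, MB and GB, and it returns the warning text to the caller instead of deciding where to log it; `resolve_unit` and `CANONICAL_UNITS` are hypothetical names, not anything in the server.

```python
# Canonical spellings and their multipliers in bytes (assumed for this sketch).
CANONICAL_UNITS = {"kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

# Map from lower-cased spelling back to the canonical one.
_BY_FOLDED = {name.lower(): name for name in CANONICAL_UNITS}


def resolve_unit(unit):
    """Return (canonical_name, warning_or_None); raise ValueError if unknown."""
    if unit in CANONICAL_UNITS:
        # Exact match: nothing to report.
        return unit, None
    canonical = _BY_FOLDED.get(unit.lower())
    if canonical is None:
        raise ValueError(f"unknown unit: {unit!r}")
    # Non-canonical case: accept it, but never silently.
    return canonical, f'unit "{unit}" is not a canonical spelling; assuming "{canonical}"'
```

The caller still has to put the warning somewhere the user will actually see it, which this sketch deliberately leaves open.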
"Andrew Hammond" <andrew.george.hammond@gmail.com> writes: > I agree. But perhaps the solution instead of failing is to throw a > warning to the effect of "Not to be pedantic, but you said mb and > millibits as a unit doesn't make sense in this context. Assuming you > meant MB (MegaBits)." and then start up. > Generally I'm against guessing what the user really wants, but in this > case, it seems pretty difficult to guess wrong. But either way I'm > always dead set against _silently_ guessing. Bear in mind that the typical place for this sort of thing is postgresql.conf, and for problems in that file the only place we can emit a warning is into the postmaster log, which far too many users never look at (if indeed it's going anywhere but /dev/null in the first place). So I can't get very excited about "report a warning" as a compromise solution. Personally I don't find the argument about "someday we might want to support measurements in millibits" to be convincing at all, and certainly it seems weaker than the argument that "units should be case insensitive because everything else in this file is". The SQL spec has to be considered a more relevant controlling precedent for us than the SI units spec, and there are no case-sensitive keywords in SQL. regards, tom lane
>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: TL> Personally I don't find the argument about "someday we might want TL> to support measurements in millibits" to be convincing at all, and TL> certainly it seems weaker than the argument that "units should be TL> case insensitive because everything else in this file is". The SQL TL> spec has to be considered a more relevant controlling precedent TL> for us than the SI units spec, and there are no case-sensitive TL> keywords in SQL. Units simply are not case sensitive. They are just a more or less random collection of preexisting symbols, because that was easier than drawing up entirely new ones. Not all are English letters, for one µ is not. If you upper case a text with units in, the units do not change with the rest of the text. /Benny
Benny Amorsen <benny+usenet@amorsen.dk> writes: > "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: > TL> Personally I don't find the argument about "someday we might want > TL> to support measurements in millibits" to be convincing at all, and > TL> certainly it seems weaker than the argument that "units should be > TL> case insensitive because everything else in this file is". The SQL > TL> spec has to be considered a more relevant controlling precedent > TL> for us than the SI units spec, and there are no case-sensitive > TL> keywords in SQL. > Units simply are not case sensitive. They are just a more or less > random collection of preexisting symbols, because that was easier than > drawing up entirely new ones. Not all are English letters, for one µ > is not. You mean "are case sensitive" right? This is not news. The point I'm basically making is that it's not going to hurt us to restrict GUC to supporting a subset of all-possible-units that can be treated case-insensitively. We're already going to restrict the allowed character set: I can guarantee you that µ, or anything else outside 7-bit ASCII, will never be accepted. It's just not worth the trouble of dealing with multiple possible encodings. regards, tom lane
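The restriction described here, a 7-bit ASCII unit set in which no two units differ only by case, is easy to state as a check on the unit table. A minimal sketch; the function name and the sample unit lists in the usage note are illustrative assumptions, not the set the server actually accepts:

```python
def is_acceptable_unit_table(units):
    """True if every unit is 7-bit ASCII and no two units differ only by case."""
    folded = set()
    for unit in units:
        if not unit.isascii():
            # Anything like the micro sign is rejected outright.
            return False
        key = unit.lower()
        if key in folded:
            # Two spellings collapse to the same folded form.
            return False
        folded.add(key)
    return True
```

A table such as kB, MB, GB, ms, s, min, h, d passes this check; adding "µs" fails on the ASCII test, and a table containing both "mB" and "MB" fails on the case test.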
On Thursday, 28 December 2006 13:25, Jim C. Nasby wrote: > Yes, and I can't think of a single reason why we'd let people specify > anything in millibytes, or kilobits. How about a configuration option related to connection throughput, which is typically measured in bits? -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes: > On Thursday, 28 December 2006 13:25, Jim C. Nasby wrote: >> Yes, and I can't think of a single reason why we'd let people specify >> anything in millibytes, or kilobits. > How about a configuration option related to connection throughput, which is > typically measured in bits? But at least as often in bytes. What's more, if the system really were to accept both units, you could reasonably expect that people would get it wrong at least half the time ... regards, tom lane
> > Yes, and I can't think of a single reason why we'd let people specify > > anything in millibytes, or kilobits. > > How about a configuration option related to connection throughput, which is > typically measured in bits? We'd use "kbit". I don't see us using "kb" in that case (or was it kB :-). Andreas
You just proved the case for why the units shouldn't be case sensitive: On Dec 30, 2006, at 6:36 PM, Andrew Hammond wrote: > I agree. But perhaps the solution instead of failing is to throw a > warning to the effect of "Not to be pedantic, but you said mb and > millibits as a unit doesn't make sense in this context. Assuming you > meant MB (MegaBits)." and then start up. Do we really want people specifying effective_cache_size in *bits*, mega or not? I think no. To reply to Peter's comment, yes, bits would be useful if we ever actually have any settings relating to network bandwidth. But that's a really big IF. IF we do eventually decide to add such a setting, I think it would make the most sense to spell out 'bits' in the unit. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)