Thread: Selecting a constant question: A summary
<div class="Section1"><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">First a comment:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">At CONNX Solutions Inc., we believe sincerely that we should do whatever is necessary to make our customersprosper. This means creation of excellent tools and being responsive to customer needs. Secondly, we believe thatwe should treat the customers the way that we want to be treated.</span></font><p class="MsoNormal"><font face="Arial"size="2"><span style="font-size:10.0pt; font-family:Arial">I think that the PostgreSQL group has managed the first objective, but not the second. Of course, thatis only an opinion, but I think that success hinges on both factors. Our objective in this issue has also been to improvePostgreSQL so that it can become more useful to the end users and not to denigrate the work of the engineers thathave toiled on it. I will also admit that frustration has caused our tone to become sharp at times. This is clearlya mistake on our part and for this, I apologize.</span></font><p class="MsoNormal"><font face="Arial" size="2"><spanstyle="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Next, the problem:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">According to SQL/CLI and ODBC 3.5, we should bind the length of a character column.</span></font><p class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Here are some references from the relevant documentation (SQL/CLI and ODBC are clones of one another):</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">==============================================================================</span></font><p class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">ANSI/ISO/IEC 9075-3-1999</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">for Information Technology</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Database Language SQL </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Part 3: Call-Level Interface (SQL/CLI)</span></font><p class="MsoNormal"><font face="Arial" size="2"><spanstyle="font-size:10.0pt; font-family:Arial">Section 6.5 BindCol</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Along with function SQLBindCol from the ODBC specification</span></font><p class="MsoNormal"><font face="Arial"size="2"><span style="font-size:10.0pt; font-family:Arial"><a href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcsqlbindcol.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcsqlbindcol.asp</a></span></font><p class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">This bit should be sufficient to explain what we are after:</span></font><p class="MsoNormal"><font face="Arial"size="2"><span style="font-size:10.0pt; font-family:Arial">“BufferLength [Input]</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Length of the *TargetValuePtr buffer in bytes.</span></font><p class="MsoNormal"><font face="Arial" size="2"><spanstyle="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">The driver uses BufferLength to avoid writing past the end of the *TargetValuePtr buffer when returningvariable-length data, such as character or binary data. Note that the driver counts the null-termination characterwhen returning character data to *TargetValuePtr. *TargetValuePtr must therefore contain space for the null-terminationcharacter or the driver will truncate the data.</span></font><p class="MsoNormal"><font face="Arial" size="2"><spanstyle="font-size:10.0pt; font-family:Arial">When the driver returns fixed-length data, such as an integer or a date structure, the driver ignoresBufferLength and assumes the buffer is large enough to hold the data. It is therefore important for the applicationto allocate a large enough buffer for fixed-length data or the driver will write past the end of the buffer.</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">SQLBindCol returns SQLSTATE HY090 (Invalid string or buffer length) when BufferLength is less than 0 butnot when BufferLength is 0. However, if TargetType specifies a character type, an application should not set BufferLengthto 0, because ISO CLI-compliant drivers return SQLSTATE HY090 (Invalid string or buffer length) in that case.”</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">==============================================================================</span></font><p class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Now, there are times when (according to the spec) we have to defer binding. However, this causes greatproblems for end user tools and should only be done in what is basically a dire emergency.</span></font><p class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">In the case of a SELECT query that selects a fixed constant of any sort, it would be a definite improvementfor PostgreSQL to give some sort of upper maximum.</span></font><p class="MsoNormal"><font face="Arial" size="2"><spanstyle="font-size:10.0pt; font-family:Arial">For example:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">SELECT “Eastern Division”, sum(Inventory_level), sum(Inventory_backorder), Manager_last_name FROM <table_name>WHERE division_id = 9 GROUP BY Manager_last_name</span></font><p class="MsoNormal"><font face="Arial" size="2"><spanstyle="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Will return 3 columns of data. The first column is of unknown length. Imagine if you are a spreadsheetin OpenOffice:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial"><a href="http://www.openoffice.org/">http://www.openoffice.org/</a></span></font><p class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">which happens to support ODBC connections. You would like to fill out a report for the president of yourcompany. Unfortunately, the first column is of “unknown length”</span></font><p class="MsoNormal"><font face="Arial"size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">That makes it a bit difficult to format this spreadsheet.</span></font><p class="MsoNormal"><font face="Arial"size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Now, I will admit that we may not know a-priori if “Eastern Division” is character or Unicode or MBCS. But in the worst case scenario it will be (16 + 1) * element_width bytes in length. For some Unicode character sets,element_width can be as much as 4, so that leaves 68 octets as an upper possible maximum.</span></font><p class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Now, you might protest, 68 bytes might be much too large. That is true, but I know that if I allocate68 bytes we will not have data truncation. It is no worse than a varchar(255) field that has a largest item 15 characterswide in it. The grid will successfully bind and we will be able to produce the report.</span></font><p class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">Generally speaking, grids are smart enough to automatically resize themselves to max_length(grid_column_title,grid_column_data) and so the report will look very nice.</span></font><p class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">It is also true that it is possible for us to work around the problem. We certainly can know the exacttype information about the constants in our queries and reformat the PostgreSQL queries to decorate them with thingslike:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; font-family:Arial">SELECT “Eastern Division”::char(16), sum(Inventory_level)::Numeric(16,4), sum(Inventory_backorder) ::Numeric(16,4),Manager_last_name FROM <table_name> WHERE division_id = 9 GROUP BY Manager_last_name</span></font><pclass="MsoNormal"><font face="Times New Roman" size="3"><span style="font-size: 12.0pt"> </span></font><p class="MsoNormal"><font face="Times New Roman" size="3"><span style="font-size: 12.0pt">But it would be very nice if the database could provide a good estimate for us so that PostgreSQL could work likeall of the other database systems. Code full of kludges is harder to maintain.</span></font><p class="MsoNormal"><fontface="Times New Roman" size="3"><span style="font-size: 12.0pt"> </span></font><p class="MsoNormal"><font face="Times New Roman" size="3"><span style="font-size: 12.0pt">And so I hope that we can get off on a better foot this time. If the answer is “No, the priority for this sort ofthing is low, and we do not consider it important for our customers.” Then we will have to work around it. Hopefully,at least, it will get put into a queue of future enhancements.</span></font></div>
Dann Corbit wrote: > First a comment: > > At CONNX Solutions Inc., we believe sincerely that we should do whatever > is necessary to make our customers prosper. This means creation of > excellent tools and being responsive to customer needs. Secondly, we > believe that we should treat the customers the way that we want to be > treated. > > I think that the PostgreSQL group has managed the first objective, but > not the second. Of course, that is only an opinion, but I think that > success hinges on both factors. Our objective in this issue has also > been to improve PostgreSQL so that it can become more useful to the end > users and not to denigrate the work of the engineers that have toiled on > it. I will also admit that frustration has caused our tone to become > sharp at times. This is clearly a mistake on our part and for this, I > apologize. Woah, now this is interesting. This morning, I read this whole thread, wondering what in the world could possibly be taking so long ;). I will admit that many of us in the community tend to try to provide a solution without actually understanding the problem. I think it kind of comes with the territory, a lot of times it seems like FOSS is all about the work around versus the solution because the solution takes longer. Perception is a powerful thing. Personally, I didn't see any of the community doing anything but trying their best to help you with the problem you were experiencing. What I did see, is a lot of tenseness from your side, to what basically amounts to "free" support. Remember that we are here, at no cost to you. Lastly, the PostgreSQL community doesn't have customer. You have customer, CMD has customers, the PostgreSQL community does not. The best correlation I can give you is this. We (the community) are all a team. You are part of that team. We are not your vendor. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Dan, > Secondly, we > believe that we should treat the customers the way that we want to be > treated. > I think that the PostgreSQL group has managed the first objective, but > not the second. I just read this whole thread, and I feel that the sort of comment above is completely unjustified, and counterproductive to your goal of adding a feature to PostgreSQL which will make your driver work better. You'll be a lot more likely to persuade people in the community to work with you if you're not trying to convince them to change the project culture at the same time. You are on the developer mailing list for an open, community-based open source project and *not* a commercial company. Therefore we do not have "customers" and your paradigm is wrong. The PostgreSQL developers *are* treating you exactly has they expect to be treated; as a developer, meaning that you argue things out and defend your desire for a change. If you read anybody else's discussion on this list you'll see that's how everyone else interacts. If anything you've gotten more than your fair share of attention ... 40+ posts from 1/2 dozen senior developers in less than 48 hours! If you would prefer a more formal customer-vendor relationship, then I suggest that you sign up as a customer of EnterpriseDB, Red Hat, Sun, Command Prompt, SRA etc. or similar. Now, that aside: > According to SQL/CLI and ODBC 3.5, we should bind the length of a > character column. This is a much better approach. Standards are always nice. > But it would be very nice if the database could provide a good estimate > for us so that PostgreSQL could work like all of the other database > systems. Code full of kludges is harder to maintain. Do you have any information about how binding works in other databases? A clear roadmap would make it easier for eventual developer implementation, and obviously this is a solved problem elsewhere. > And so I hope that we can get off on a better foot this time. If the > answer is "No, the priority for this sort of thing is low, and we do not > consider it important for our customers." Again, we don't have "customers". So your desire to implement a change in behavior is dependant on: 1. Getting this list to agree on the specification; 2. Convincing an *individual* PostgreSQL developer or contributing company that this issue is in their high priority interest to fix, OR Fixing it yourself and submitting the patch to PostgreSQL.org. > Then we will have to work > around it. Hopefully, at least, it will get put into a queue of future > enhancements. Getting it on the TODO list is a good first step. However, that doesn't get it implemented until it becomes some other developer's problem as well. -- Josh Berkus PostgreSQL @ Sun San Francisco
"Dann Corbit" <DCorbit@connx.com> writes: > In the case of a SELECT query that selects a fixed constant of any sort, > it would be a definite improvement for PostgreSQL to give some sort of > upper maximum. What's the point? You keep reminding us that your code is middleware that can't assume anything much about the queries you're dealing with. Therefore, I see no real value in fixing up one corner case. Your argument about space allocation falls to the ground unless we can provide a guaranteed, and usefully tight, upper bound on the column width in *every* situation. If we cannot (which we can't), you're still going to need those client-side "kluges". In my opinion, variable-length data is a fact of life and you should endeavor to make your code deal with it gracefully. There are bits of the SQL spec that assume fixed-width data specifications are useful, but to be blunt that's all a holdover from 1960s 80-column-punch-card thinking. It's no way to design a modern application. BTW, the reason I'm resistant to even thinking about this is that Postgres is designed as an extensible system. Trying to do what you want is not a matter of fixing literal constants and concatenation and one or two other places --- it's a matter of imposing a new and potentially hard-to-meet requirement on every datatype under the sun, including a lot of user-written code that we don't control and would break by adding such a requirement. So it's not even likely that we'd think very hard about making this work, let alone actually do it. regards, tom lane
As Dann pointed out we were one of the first companies to port Postgres to windows many many years ago (7.1 days), and part of that porting work is in the current postgresql product. As I pointed out in a prior post, for the ODBC specification at least (probably others), a maximum upper bound on returned data *must* be reported and determined ahead of time when using binding ... A technique where the client application allocates memory for the data and supplies a pointer to that memory location for the driver. Postgres unlike other databases shifts the burden of determining this maximum size to the client and/or driver. Our company specializes in access to wide variety of databases, both relational and non relational, including SQL Server, Oracle, DB2, Sybase, Informix, etc. Postgres sticks out as the only database that we have encountered with this behavior .. Which is why we posted the original message. Also as Dann pointed out even if this issue was addressed, it does not help us because every existing installation of Postgres has the metadata bug, so we *have* to bandaid it at the client/driver level anyway. At least we have a record of the issue, so the next time a developer in the community runs across the same oddity hopefully they will find this and won't be scratching their heads like we were for a bit :) Thanks lm -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Tuesday, June 12, 2007 3:11 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org; Larry McGhaw Subject: Re: [HACKERS] Selecting a constant question: A summary "Dann Corbit" <DCorbit@connx.com> writes: > In the case of a SELECT query that selects a fixed constant of any > sort, it would be a definite improvement for PostgreSQL to give some > sort of upper maximum. What's the point? You keep reminding us that your code is middleware that can't assume anything much about the queries you're dealing with. Therefore, I see no real value in fixing up one corner case. Your argument about space allocation falls to the ground unless we can provide a guaranteed, and usefully tight, upper bound on the column width in *every* situation. If we cannot (which we can't), you're still going to need those client-side "kluges". In my opinion, variable-length data is a fact of life and you should endeavor to make your code deal with it gracefully. There are bits of the SQL spec that assume fixed-width data specifications are useful, but to be blunt that's all a holdover from 1960s 80-column-punch-card thinking. It's no way to design a modern application. BTW, the reason I'm resistant to even thinking about this is that Postgres is designed as an extensible system. Trying to do what you want is not a matter of fixing literal constants and concatenation and one or two other places --- it's a matter of imposing a new and potentially hard-to-meet requirement on every datatype under the sun, including a lot of user-written code that we don't control and would break by adding such a requirement. So it's not even likely that we'd think very hard about making this work, let alone actually do it. regards, tom lane
Tom, > What's the point? You keep reminding us that your code is middleware > that can't assume anything much about the queries you're dealing with. > Therefore, I see no real value in fixing up one corner case. Your > argument about space allocation falls to the ground unless we can > provide a guaranteed, and usefully tight, upper bound on the column > width in *every* situation. If we cannot (which we can't), you're still > going to need those client-side "kluges". Hmmm? I thought that Dann was just talking about constants, and not column results. Am I confused? > BTW, the reason I'm resistant to even thinking about this is that > Postgres is designed as an extensible system. Trying to do what you > want is not a matter of fixing literal constants and concatenation > and one or two other places --- it's a matter of imposing a new and > potentially hard-to-meet requirement on every datatype under the sun, > including a lot of user-written code that we don't control and would > break by adding such a requirement. So it's not even likely that we'd > think very hard about making this work, let alone actually do it. I'd think it would be possible to do this in an abstract way ... having a "DisplayLength()" call for each data type and value. That would require casting the constant, though, or computing all uncast constants as text. -- Josh Berkus PostgreSQL @ Sun San Francisco
On 6/12/07, Josh Berkus <josh@agliodbs.com> wrote: > Tom, > > > What's the point? You keep reminding us that your code is middleware > > that can't assume anything much about the queries you're dealing with. > > Therefore, I see no real value in fixing up one corner case. Your > > argument about space allocation falls to the ground unless we can > > provide a guaranteed, and usefully tight, upper bound on the column > > width in *every* situation. If we cannot (which we can't), you're still > > going to need those client-side "kluges". > > Hmmm? I thought that Dann was just talking about constants, and not column > results. Am I confused? > > > BTW, the reason I'm resistant to even thinking about this is that > > Postgres is designed as an extensible system. Trying to do what you > > want is not a matter of fixing literal constants and concatenation > > and one or two other places --- it's a matter of imposing a new and > > potentially hard-to-meet requirement on every datatype under the sun, > > including a lot of user-written code that we don't control and would > > break by adding such a requirement. So it's not even likely that we'd > > think very hard about making this work, let alone actually do it. > > I'd think it would be possible to do this in an abstract way ... having a > "DisplayLength()" call for each data type and value. That would require > casting the constant, though, or computing all uncast constants as text. The simplest formulation of this problem appears to be that constant strings that are uncast are treated as type unknown. The connx guys seem to think that they should be implicitly cast to char(n) where n is the length of the string. Is that a reasonable description, or are you guys looking for something more general? If you're just talking about the strings, then here are the thoughts I've gleaned from the preceding thread. - This makes possible some performance tweaks for drivers - It achieves spec compliance (albeit for a stupid part of the spec) - Implicit casting of unknown to char(n) or anything else seems rather sketchy to me, but I can't see any specific objection, except that... - I don't know when the right time to do the cast is. And doing it too early seems obviously wrong. - This only helps in corner case of string constants that are 1. not already cast and 2. not manipulated in any way And that seems like a very small corner case with little or no practical use. I guess if you have some code that turns query output into some flavor of pretty-print, it'd make sense to have a constant column as output of a CASE statement or something. - The corner case must already be correctly handled by the general case for arbitrary sized text, or alternatively phrased: there is no way to conform to the standard while supporting arbitrary sized text. Unless you're willing to pay the cost of scanning twice, or maintaining "biggest entry" data for each variable length column. - I don't know how much effort it would require to implement this, nor how much complexity it would add to the code base. Clearly both of these would be non-zero values. Given the above, I agree with Tom: this seems like corner case where the returns are marginal at best, compared to the cost to implement and maintain. Is there something I'm getting wrong in this summary? Andrew
Josh Berkus <josh@agliodbs.com> writes: >> What's the point? You keep reminding us that your code is middleware >> that can't assume anything much about the queries you're dealing with. > Hmmm? I thought that Dann was just talking about constants, and not column > results. Am I confused? Well, the specific example he was on about was a constant, but I don't think it does him any good for us to fix just that one case. He'll still have to deal with columns of indeterminate width in a whole lot of other cases. If there were a reasonable path for us to report a useful width bound in *every* case, then I could see spending time on it ... but there's not. BTW, it would certainly be trivial to extend libpq to report the actual max width of a column within an already-retrieved PGresult. This isn't anything the client code can't compute for itself, of course, but libpq could get it in somewhat fewer cycles. However, I'm under the impression that Dann wants the number at statement prepare time, and we simply haven't got the information then. > I'd think it would be possible to do this in an abstract way ... having a > "DisplayLength()" call for each data type and value. That would require > casting the constant, though, or computing all uncast constants as text. No, the point is about predicting the max width of a column of a query result in advance of actually running the query. After you've got the values in hand, it's not a very interesting problem. Before, well, consider these examples: select repeat(text_col, int_col) from my_table; select repeat(text_col, int_col * random()) from my_table; select repeat(text_col, some_user_defined_function(int_col)) from my_table; The problem's really not soluble unless you want to dumb Postgres down to approximately the capabilities of SQL89 -- no user-defined functions, let alone user-defined types, plus pull out a whole lot of the built-in functions that don't have readily predictable result widths. regards, tom lane
"Andrew Hammond" <andrew.george.hammond@gmail.com> writes: > - Implicit casting of unknown to char(n) or anything else seems rather > sketchy to me, but I can't see any specific objection, except that... > - I don't know when the right time to do the cast is. And doing it too > early seems obviously wrong. Well, I don't see any reason that we'd consider an implicit cast to char(N) without context to drive us in that direction. The system is currently biased to prefer casts to text. You could make a reasonable case for forcing a cast to text if the constant's type is still unresolved at the end of parsing, and indeed people have proposed that off and on just so that clients would have one less type to think about. In itself it doesn't do anything for Dann's problem though, because unspecified width is unspecified width. I've been thinking lately about trying harder to unify the text and varchar types; I'm not sure about details yet, except that text should be *exactly* the same thing as unconstrained-width varchar, rather than almost the same except we claim it's a different type. The reason I'd been thinking about this was mainly to get rid of the complexity and runtime overhead that comes from having RelabelType nodes all over the place when someone uses varchar instead of text. But if we did that, we could also arrange that unknown literals coerce to varchar(N) with N equal to their actual width, rather than coercing to text, and not create any weird corner-case behaviors in the type system. But at the end of the day this all would only solve Dann's problem for the specific case of a SELECT with an undecorated literal constant in its target list. He's still going to have to deal with unknown-width columns in an enormous variety of cases, and so I completely fail to see the point of changing the system's behavior for this one case. regards, tom lane
Just a curiosity question: Why is the type of a literal '1' "unknown" instead of varchar(1)? Wouldn't varchar(1) cast properly to any use of the literal '1'? What is the benefit of assuming it's an unknown?
"Chuck McDevitt" <cmcdevitt@greenplum.com> writes: > Just a curiosity question: Why is the type of a literal '1' "unknown" > instead of varchar(1)? Because, for instance, it might be intended as an integer or float or numeric value. Change the content a little, like '(1,2)' or '12:34', and maybe it's a point or time value. There are plenty of contexts in which the intended type of a literal is obviously not text/varchar. We assign unknown initially as a way of flagging that the type assignment is uncertain. Once we have a value that we think is varchar (a table column for instance), the rules for deciding to cast it to a different type get a lot more stringent. regards, tom lane
I see... PostgreSQL wants to guess the datatype, given no clear syntactic information, and perhaps a varchar(n) wouldn't be a valid cast to some of the possible datatypes. So, where x = '(1,2)' might be legal for comparing to x, but a field of type varchar(5) might not be, as in where x = y, where y is type varchar(5) containing '(1,2)'. (Time values don't have this problem in pure ANSI SQL, since the literal is TIME '12:34', but I can see for user types it might be ambiguous). I find PostgreSQL's handling of this strange, as I come from systems where 'xxx' is either a varchar or char type, in all contexts, and implicit casts handle any needed conversions. But now I understand why it does things this way. Thanks. -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Tuesday, June 12, 2007 9:50 PM To: Chuck McDevitt Cc: Andrew Hammond; Josh Berkus; pgsql-hackers@postgresql.org; Dann Corbit; Larry McGhaw Subject: Re: [HACKERS] Selecting a constant question: A summary "Chuck McDevitt" <cmcdevitt@greenplum.com> writes: > Just a curiosity question: Why is the type of a literal '1' "unknown" > instead of varchar(1)? Because, for instance, it might be intended as an integer or float or numeric value. Change the content a little, like '(1,2)' or '12:34', and maybe it's a point or time value. There are plenty of contexts in which the intended type of a literal is obviously not text/varchar. We assign unknown initially as a way of flagging that the type assignment is uncertain. Once we have a value that we think is varchar (a table column for instance), the rules for deciding to cast it to a different type get a lot more stringent. regards, tom lane
On Wed, Jun 13, 2007 at 02:12:37AM -0400, Chuck McDevitt wrote: > So, where x = '(1,2)' might be legal for comparing to x, but a field of > type varchar(5) might not be, as in where x = y, where y is type > varchar(5) containing '(1,2)'. Normally, just about every type can be converted to or from text. So if postgres converted to varchar first you have problems with the statement x='const' where x is type "foo". It's now ambiguous since you either convert x to varchar or the constant to "foo". Instead, postgres marks the constant as unknown and now it always gets converted because nothing can convert to unknown. Thus far this system has worked excellently, though not perfectly obviously. It matches people's expectations well, which is the most important part. > I find PostgreSQL's handling of this strange, as I come from systems > where 'xxx' is either a varchar or char type, in all contexts, and > implicit casts handle any needed conversions. > But now I understand why it does things this way. User-defined types makes for lots of interesting choices, but they are by far the most powerful feature of postgres and we don't want to mess with that. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
"Chuck McDevitt" <cmcdevitt@greenplum.com> writes: > Just a curiosity question: Why is the type of a literal '1' "unknown" > instead of varchar(1)? Even if it was assigned a text datatype it would be the unconstrainted "text" not varchar(1). If we used varchar(1) then things like: create table foo as select 'foo'; would create a table with varchar(3) which would then complain if you tried to insert 'foobar' into. There doesn't seem to be enough evidence that the user intended to constrain the input to just 'foo' in that case. Of course right now you get a table with a column of type "unknown" which is very unlikely to be what the user expects. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
> For some Unicode character sets, element_width can be as much as 4 In UTF8 one char can be up to 6 bytes, so 4 is not correct in general. Andreas