Thread: Selecting a constant question: A summary

Selecting a constant question: A summary

From

"Dann Corbit"

Date:

12 June 2007, 17:41:09

<div class="Section1"><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">First a comment:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial">At CONNX Solutions Inc., we believe sincerely that we should do whatever is necessary to make our
customersprosper.  This means creation of excellent tools and being responsive to customer needs.  Secondly, we believe
thatwe should treat the customers the way that we want to be treated.</span></font><p class="MsoNormal"><font
face="Arial"size="2"><span style="font-size:10.0pt; 
font-family:Arial">I think that the PostgreSQL group has managed the first objective, but not the second.  Of course,
thatis only an opinion, but I think that success hinges on both factors.  Our objective in this issue has also been to
improvePostgreSQL so that it can become more useful to the end users and not to denigrate the work of the engineers
thathave toiled on it.  I will also admit that frustration has caused our tone to become sharp at times.  This is
clearlya mistake on our part and for this, I apologize.</span></font><p class="MsoNormal"><font face="Arial"
size="2"><spanstyle="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">Next, the problem:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial">According to SQL/CLI and ODBC 3.5, we should bind the length of a character column.</span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">Here are some references from the relevant documentation (SQL/CLI and ODBC are clones of one
another):</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial">==============================================================================</span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial">ANSI/ISO/IEC 9075-3-1999</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial">for Information Technology</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial">Database Language SQL </span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial">Part 3: Call-Level Interface (SQL/CLI)</span></font><p class="MsoNormal"><font face="Arial"
size="2"><spanstyle="font-size:10.0pt; 
font-family:Arial">Section 6.5 BindCol</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">Along with function SQLBindCol from the ODBC specification</span></font><p class="MsoNormal"><font
face="Arial"size="2"><span style="font-size:10.0pt; 
font-family:Arial"><a
href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcsqlbindcol.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcsqlbindcol.asp</a></span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">This bit should be sufficient to explain what we are after:</span></font><p class="MsoNormal"><font
face="Arial"size="2"><span style="font-size:10.0pt; 
font-family:Arial">“BufferLength [Input]</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial">Length of the *TargetValuePtr buffer in bytes.</span></font><p class="MsoNormal"><font face="Arial"
size="2"><spanstyle="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">The driver uses BufferLength to avoid writing past the end of the *TargetValuePtr buffer when
returningvariable-length data, such as character or binary data. Note that the driver counts the null-termination
characterwhen returning character data to *TargetValuePtr. *TargetValuePtr must therefore contain space for the
null-terminationcharacter or the driver will truncate the data.</span></font><p class="MsoNormal"><font face="Arial"
size="2"><spanstyle="font-size:10.0pt; 
font-family:Arial">When the driver returns fixed-length data, such as an integer or a date structure, the driver
ignoresBufferLength and assumes the buffer is large enough to hold the data. It is therefore important for the
applicationto allocate a large enough buffer for fixed-length data or the driver will write past the end of the
buffer.</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">SQLBindCol returns SQLSTATE HY090 (Invalid string or buffer length) when BufferLength is less than 0
butnot when BufferLength is 0. However, if TargetType specifies a character type, an application should not set
BufferLengthto 0, because ISO CLI-compliant drivers return SQLSTATE HY090 (Invalid string or buffer length) in that
case.”</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial">==============================================================================</span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">Now, there are times when (according to the spec) we have to defer binding.  However, this causes
greatproblems for end user tools and should only be done in what is basically a dire emergency.</span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial">In the case of a SELECT query that selects a fixed constant of any sort, it would be a definite
improvementfor PostgreSQL to give some sort of upper maximum.</span></font><p class="MsoNormal"><font face="Arial"
size="2"><spanstyle="font-size:10.0pt; 
font-family:Arial">For example:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">SELECT “Eastern Division”, sum(Inventory_level), sum(Inventory_backorder), Manager_last_name FROM
<table_name>WHERE division_id = 9 GROUP BY Manager_last_name</span></font><p class="MsoNormal"><font face="Arial"
size="2"><spanstyle="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">Will return 3 columns of data.  The first column is of unknown length.  Imagine if you are a
spreadsheetin OpenOffice:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial"><a href="http://www.openoffice.org/">http://www.openoffice.org/</a></span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial">which happens to support ODBC connections.  You would like to fill out a report for the president of
yourcompany.  Unfortunately, the first column is of “unknown length”</span></font><p class="MsoNormal"><font
face="Arial"size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">That makes it a bit difficult to format this spreadsheet.</span></font><p class="MsoNormal"><font
face="Arial"size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">Now, I will admit that we may not know a-priori if “Eastern Division” is character or Unicode or
MBCS. But in the worst case scenario it will be (16 + 1) * element_width bytes in length.  For some Unicode character
sets,element_width can be as much as 4, so that leaves 68 octets as an upper possible maximum.</span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial">Now, you might protest, 68 bytes might be much too large.  That is true, but I know that if I
allocate68 bytes we will not have data truncation.  It is no worse than a varchar(255) field that has a largest item 15
characterswide in it.  The grid will successfully bind and we will be able to produce the report.</span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial">Generally speaking, grids are smart enough to automatically resize themselves to
max_length(grid_column_title,grid_column_data) and so the report will look very nice.</span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">It is also true that it is possible for us to work around the problem.  We certainly can know the
exacttype information about the constants in our queries and reformat the PostgreSQL queries to decorate them with
thingslike:</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial">SELECT “Eastern Division”::char(16), sum(Inventory_level)::Numeric(16,4), sum(Inventory_backorder)
::Numeric(16,4),Manager_last_name FROM <table_name> WHERE division_id = 9 GROUP BY
Manager_last_name</span></font><pclass="MsoNormal"><font face="Times New Roman" size="3"><span style="font-size: 
12.0pt"> </span></font><p class="MsoNormal"><font face="Times New Roman" size="3"><span style="font-size:
12.0pt">But it would be very nice if the database could provide a good estimate for us so that PostgreSQL could work
likeall of the other database systems.  Code full of kludges is harder to maintain.</span></font><p
class="MsoNormal"><fontface="Times New Roman" size="3"><span style="font-size: 
12.0pt"> </span></font><p class="MsoNormal"><font face="Times New Roman" size="3"><span style="font-size:
12.0pt">And so I hope that we can get off on a better foot this time.  If the answer is “No, the priority for this sort
ofthing is low, and we do not consider it important for our customers.” Then we will have to work around it. 
Hopefully,at least, it will get put into a queue of future enhancements.</span></font></div>

Re: Selecting a constant question: A summary

From

"Joshua D. Drake"

Date:

12 June 2007, 18:20:26

Dann Corbit wrote:
> First a comment:
> 
> At CONNX Solutions Inc., we believe sincerely that we should do whatever 
> is necessary to make our customers prosper.  This means creation of 
> excellent tools and being responsive to customer needs.  Secondly, we 
> believe that we should treat the customers the way that we want to be 
> treated.
> 
> I think that the PostgreSQL group has managed the first objective, but 
> not the second.  Of course, that is only an opinion, but I think that 
> success hinges on both factors.  Our objective in this issue has also 
> been to improve PostgreSQL so that it can become more useful to the end 
> users and not to denigrate the work of the engineers that have toiled on 
> it.  I will also admit that frustration has caused our tone to become 
> sharp at times.  This is clearly a mistake on our part and for this, I 
> apologize.

Woah, now this is interesting. This morning, I read this whole thread, 
wondering what in the world could possibly be taking so long ;).

I will admit that many of us in the community tend to try to provide a 
solution without actually understanding the problem. I think it kind of 
comes with the territory, a lot of times it seems like FOSS is all about 
the work around versus the solution because the solution takes longer.

Perception is a powerful thing. Personally, I didn't see any of the 
community doing anything but trying their best to help you with the 
problem you were experiencing.

What I did see, is a lot of tenseness from your side, to what basically 
amounts to "free" support. Remember that we are here, at no cost to you.

Lastly, the PostgreSQL community doesn't have customer. You have 
customer, CMD has customers, the PostgreSQL community does not.

The best correlation I can give you is this. We (the community) are all 
a team. You are part of that team. We are not your vendor.

Sincerely,

Joshua D. Drake

-- 
      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

Re: Selecting a constant question: A summary

From

Josh Berkus

Date:

12 June 2007, 18:31:57

Dan,

> Secondly, we
> believe that we should treat the customers the way that we want to be
> treated.
> I think that the PostgreSQL group has managed the first objective, but
> not the second. 

I just read this whole thread, and I feel that the sort of comment above is 
completely unjustified, and counterproductive to your goal of adding a 
feature to PostgreSQL which will make your driver work better.  You'll be a 
lot more likely to persuade people in the community to work with you if 
you're not trying to convince them to change the project culture at the same 
time.

You are on the developer mailing list for an open, community-based open source 
project and *not* a commercial company.  Therefore we do not have "customers" 
and your paradigm is wrong.  The PostgreSQL developers *are* treating you 
exactly has they expect to be treated; as a developer, meaning that you argue 
things out and defend your desire for a change.  If you read anybody else's 
discussion on this list you'll see that's how everyone else interacts.

If anything you've gotten more than your fair share of attention ... 40+ posts 
from 1/2 dozen senior developers in less than 48 hours!

If you would prefer a more formal customer-vendor relationship, then I suggest 
that you sign up as a customer of EnterpriseDB, Red Hat, Sun, Command Prompt, 
SRA etc. or similar.

Now, that aside:

> According to SQL/CLI and ODBC 3.5, we should bind the length of a
> character column.

This is a much better approach.  Standards are always nice.

> But it would be very nice if the database could provide a good estimate
> for us so that PostgreSQL could work like all of the other database
> systems.  Code full of kludges is harder to maintain.

Do you have any information about how binding works in other databases?  A 
clear roadmap would make it easier for eventual developer implementation, and 
obviously this is a solved problem elsewhere.

> And so I hope that we can get off on a better foot this time.  If the
> answer is "No, the priority for this sort of thing is low, and we do not
> consider it important for our customers."

Again, we don't have "customers".   So your desire to implement a change in 
behavior is dependant on:
1. Getting this list to agree on the specification;
2. Convincing an *individual* PostgreSQL developer or contributing company 
that this issue is in their high priority interest to fix,   OR Fixing it yourself and submitting the patch to
PostgreSQL.org.

> Then we will have to work 
> around it.  Hopefully, at least, it will get put into a queue of future
> enhancements.

Getting it on the TODO list is a good first step.  However, that doesn't get 
it implemented until it becomes some other developer's problem as well.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Selecting a constant question: A summary

From

Tom Lane

Date:

12 June 2007, 19:11:03

"Dann Corbit" <DCorbit@connx.com> writes:
> In the case of a SELECT query that selects a fixed constant of any sort,
> it would be a definite improvement for PostgreSQL to give some sort of
> upper maximum.

What's the point?  You keep reminding us that your code is middleware
that can't assume anything much about the queries you're dealing with.
Therefore, I see no real value in fixing up one corner case.  Your
argument about space allocation falls to the ground unless we can
provide a guaranteed, and usefully tight, upper bound on the column
width in *every* situation.  If we cannot (which we can't), you're still
going to need those client-side "kluges".

In my opinion, variable-length data is a fact of life and you should
endeavor to make your code deal with it gracefully.  There are bits of
the SQL spec that assume fixed-width data specifications are useful,
but to be blunt that's all a holdover from 1960s 80-column-punch-card
thinking.  It's no way to design a modern application.

BTW, the reason I'm resistant to even thinking about this is that
Postgres is designed as an extensible system.  Trying to do what you
want is not a matter of fixing literal constants and concatenation
and one or two other places --- it's a matter of imposing a new and
potentially hard-to-meet requirement on every datatype under the sun,
including a lot of user-written code that we don't control and would
break by adding such a requirement.  So it's not even likely that we'd
think very hard about making this work, let alone actually do it.
        regards, tom lane

Re: Selecting a constant question: A summary

From

"Larry McGhaw"

Date:

12 June 2007, 19:31:51

As Dann pointed out we were one of the first companies to port Postgres
to windows many many years ago (7.1 days), and part of that porting work
is in the current postgresql product.

As I pointed out in a prior post, for the ODBC specification at least
(probably others), a maximum upper bound on returned data *must* be
reported and determined ahead of time when using binding ... A technique
where the client application allocates memory for the data and supplies
a pointer to that memory location for the driver.

Postgres unlike other databases shifts the burden of determining this
maximum size to the client and/or driver.  Our company specializes in
access to wide variety of databases, both relational and non relational,
including SQL Server, Oracle, DB2, Sybase, Informix, etc.  Postgres
sticks out as the only database that we have encountered with this
behavior .. Which is why we posted the original message.

Also as Dann pointed out even if this issue was addressed, it does not
help us because every existing installation of Postgres has the metadata
bug, so we *have* to bandaid it at the client/driver level anyway.

At least we have a record of the issue, so the next time a developer in
the community runs across the same oddity hopefully they will find this
and won't be scratching their heads like we were for a bit :)

Thanks

lm

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, June 12, 2007 3:11 PM
To: Dann Corbit
Cc: pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question: A summary

"Dann Corbit" <DCorbit@connx.com> writes:
> In the case of a SELECT query that selects a fixed constant of any
> sort, it would be a definite improvement for PostgreSQL to give some
> sort of upper maximum.

What's the point?  You keep reminding us that your code is middleware
that can't assume anything much about the queries you're dealing with.
Therefore, I see no real value in fixing up one corner case.  Your
argument about space allocation falls to the ground unless we can
provide a guaranteed, and usefully tight, upper bound on the column
width in *every* situation.  If we cannot (which we can't), you're still
going to need those client-side "kluges".

In my opinion, variable-length data is a fact of life and you should
endeavor to make your code deal with it gracefully.  There are bits of
the SQL spec that assume fixed-width data specifications are useful, but
to be blunt that's all a holdover from 1960s 80-column-punch-card
thinking.  It's no way to design a modern application.

BTW, the reason I'm resistant to even thinking about this is that
Postgres is designed as an extensible system.  Trying to do what you
want is not a matter of fixing literal constants and concatenation and
one or two other places --- it's a matter of imposing a new and
potentially hard-to-meet requirement on every datatype under the sun,
including a lot of user-written code that we don't control and would
break by adding such a requirement.  So it's not even likely that we'd
think very hard about making this work, let alone actually do it.
        regards, tom lane

Re: Selecting a constant question: A summary

From

Josh Berkus

Date:

12 June 2007, 19:45:17

Tom,

> What's the point?  You keep reminding us that your code is middleware
> that can't assume anything much about the queries you're dealing with.
> Therefore, I see no real value in fixing up one corner case.  Your
> argument about space allocation falls to the ground unless we can
> provide a guaranteed, and usefully tight, upper bound on the column
> width in *every* situation.  If we cannot (which we can't), you're still
> going to need those client-side "kluges".

Hmmm?  I thought that Dann was just talking about constants, and not column 
results.  Am I confused?

> BTW, the reason I'm resistant to even thinking about this is that
> Postgres is designed as an extensible system.  Trying to do what you
> want is not a matter of fixing literal constants and concatenation
> and one or two other places --- it's a matter of imposing a new and
> potentially hard-to-meet requirement on every datatype under the sun,
> including a lot of user-written code that we don't control and would
> break by adding such a requirement.  So it's not even likely that we'd
> think very hard about making this work, let alone actually do it.

I'd think it would be possible to do this in an abstract way ... having a 
"DisplayLength()" call for each data type and value.  That would require 
casting the constant, though, or computing all uncast constants as text.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Selecting a constant question: A summary

From

"Andrew Hammond"

Date:

12 June 2007, 20:18:22

On 6/12/07, Josh Berkus <josh@agliodbs.com> wrote:
> Tom,
>
> > What's the point?  You keep reminding us that your code is middleware
> > that can't assume anything much about the queries you're dealing with.
> > Therefore, I see no real value in fixing up one corner case.  Your
> > argument about space allocation falls to the ground unless we can
> > provide a guaranteed, and usefully tight, upper bound on the column
> > width in *every* situation.  If we cannot (which we can't), you're still
> > going to need those client-side "kluges".
>
> Hmmm?  I thought that Dann was just talking about constants, and not column
> results.  Am I confused?
>
> > BTW, the reason I'm resistant to even thinking about this is that
> > Postgres is designed as an extensible system.  Trying to do what you
> > want is not a matter of fixing literal constants and concatenation
> > and one or two other places --- it's a matter of imposing a new and
> > potentially hard-to-meet requirement on every datatype under the sun,
> > including a lot of user-written code that we don't control and would
> > break by adding such a requirement.  So it's not even likely that we'd
> > think very hard about making this work, let alone actually do it.
>
> I'd think it would be possible to do this in an abstract way ... having a
> "DisplayLength()" call for each data type and value.  That would require
> casting the constant, though, or computing all uncast constants as text.

The simplest formulation of this problem appears to be that constant
strings that are uncast are treated as type unknown. The connx guys
seem to think that they should be implicitly cast to char(n) where n
is the length of the string. Is that a reasonable description, or are
you guys looking for something more general?

If you're just talking about the strings, then here are the thoughts
I've gleaned from the preceding thread.

- This makes possible some performance tweaks for drivers
- It achieves spec compliance (albeit for a stupid part of the spec)
- Implicit casting of unknown to char(n) or anything else seems rather
sketchy to me, but I can't see any specific objection, except that...
- I don't know when the right time to do the cast is. And doing it too
early seems obviously wrong.
- This only helps in corner case of string constants that are 1. not already cast and 2. not manipulated in any way
And that seems like a very small corner case with little or no
practical use. I guess if you have some code that turns query output
into some flavor of pretty-print, it'd make sense to have a constant
column as output of a CASE statement or something.
- The corner case must already be correctly handled by the general
case for arbitrary sized text, or alternatively phrased: there is no
way to conform to the standard while supporting arbitrary sized text.
Unless you're willing to pay the cost of scanning twice, or
maintaining "biggest entry" data for each variable length column.
- I don't know how much effort it would require to implement this, nor
how much complexity it would add to the code base. Clearly both of
these would be non-zero values.

Given the above, I agree with Tom: this seems like corner case where
the returns are marginal at best, compared to the cost to implement
and maintain.

Is there something I'm getting wrong in this summary?

Andrew

Re: Selecting a constant question: A summary

From

Tom Lane

Date:

12 June 2007, 21:43:38

Josh Berkus <josh@agliodbs.com> writes:
>> What's the point?  You keep reminding us that your code is middleware
>> that can't assume anything much about the queries you're dealing with.

> Hmmm?  I thought that Dann was just talking about constants, and not column 
> results.  Am I confused?

Well, the specific example he was on about was a constant, but I don't
think it does him any good for us to fix just that one case.  He'll
still have to deal with columns of indeterminate width in a whole lot of
other cases.  If there were a reasonable path for us to report a useful
width bound in *every* case, then I could see spending time on it ...
but there's not.

BTW, it would certainly be trivial to extend libpq to report the actual
max width of a column within an already-retrieved PGresult.  This isn't
anything the client code can't compute for itself, of course, but libpq
could get it in somewhat fewer cycles.  However, I'm under the
impression that Dann wants the number at statement prepare time, and
we simply haven't got the information then.

> I'd think it would be possible to do this in an abstract way ... having a 
> "DisplayLength()" call for each data type and value.  That would require 
> casting the constant, though, or computing all uncast constants as text.

No, the point is about predicting the max width of a column of a query
result in advance of actually running the query.  After you've got the
values in hand, it's not a very interesting problem.  Before, well,
consider these examples:

select repeat(text_col, int_col) from my_table;
select repeat(text_col, int_col * random()) from my_table;
select repeat(text_col, some_user_defined_function(int_col)) from my_table;

The problem's really not soluble unless you want to dumb Postgres down
to approximately the capabilities of SQL89 -- no user-defined functions,
let alone user-defined types, plus pull out a whole lot of the built-in
functions that don't have readily predictable result widths.
        regards, tom lane

Re: Selecting a constant question: A summary

From

Tom Lane

Date:

12 June 2007, 22:06:25

"Andrew Hammond" <andrew.george.hammond@gmail.com> writes:
> - Implicit casting of unknown to char(n) or anything else seems rather
> sketchy to me, but I can't see any specific objection, except that...
> - I don't know when the right time to do the cast is. And doing it too
> early seems obviously wrong.

Well, I don't see any reason that we'd consider an implicit cast to
char(N) without context to drive us in that direction.  The system is
currently biased to prefer casts to text.  You could make a reasonable
case for forcing a cast to text if the constant's type is still
unresolved at the end of parsing, and indeed people have proposed that
off and on just so that clients would have one less type to think about.
In itself it doesn't do anything for Dann's problem though, because
unspecified width is unspecified width.

I've been thinking lately about trying harder to unify the text and
varchar types; I'm not sure about details yet, except that text should
be *exactly* the same thing as unconstrained-width varchar, rather than
almost the same except we claim it's a different type.  The reason I'd
been thinking about this was mainly to get rid of the complexity and
runtime overhead that comes from having RelabelType nodes all over the
place when someone uses varchar instead of text.  But if we did that,
we could also arrange that unknown literals coerce to varchar(N) with
N equal to their actual width, rather than coercing to text, and not
create any weird corner-case behaviors in the type system.

But at the end of the day this all would only solve Dann's problem for
the specific case of a SELECT with an undecorated literal constant in
its target list.  He's still going to have to deal with unknown-width
columns in an enormous variety of cases, and so I completely fail to see
the point of changing the system's behavior for this one case.
        regards, tom lane

Re: Selecting a constant question: A summary

From

"Chuck McDevitt"

Date:

13 June 2007, 01:41:12

Just a curiosity question:  Why is the type of a literal '1' "unknown"
instead of varchar(1)?
Wouldn't varchar(1) cast properly to any use of the literal '1'?

What is the benefit of assuming it's an unknown?

Re: Selecting a constant question: A summary

From

Tom Lane

Date:

13 June 2007, 01:50:20

"Chuck McDevitt" <cmcdevitt@greenplum.com> writes:
> Just a curiosity question:  Why is the type of a literal '1' "unknown"
> instead of varchar(1)?

Because, for instance, it might be intended as an integer or float or
numeric value.  Change the content a little, like '(1,2)' or '12:34',
and maybe it's a point or time value.  There are plenty of contexts in
which the intended type of a literal is obviously not text/varchar.

We assign unknown initially as a way of flagging that the type
assignment is uncertain.  Once we have a value that we think is varchar
(a table column for instance), the rules for deciding to cast it to a
different type get a lot more stringent.
        regards, tom lane

Re: Selecting a constant question: A summary

From

"Chuck McDevitt"

Date:

13 June 2007, 03:13:24

I see... PostgreSQL wants to guess the datatype, given no clear
syntactic information, and perhaps a varchar(n) wouldn't be a valid cast
to some of the possible datatypes.

So,  where x = '(1,2)' might be legal for comparing to x, but a field of
type varchar(5) might not be, as in where x = y, where y is type
varchar(5) containing '(1,2)'.

(Time values don't have this problem in pure ANSI SQL, since the literal
is TIME '12:34', but I can see for user types it might be ambiguous).

I find PostgreSQL's handling of this strange, as I come from systems
where 'xxx' is either a varchar or char type, in all contexts, and
implicit casts handle any needed conversions.
But now I understand why it does things this way.

Thanks.

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, June 12, 2007 9:50 PM
To: Chuck McDevitt
Cc: Andrew Hammond; Josh Berkus; pgsql-hackers@postgresql.org; Dann
Corbit; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question: A summary

"Chuck McDevitt" <cmcdevitt@greenplum.com> writes:
> Just a curiosity question:  Why is the type of a literal '1' "unknown"
> instead of varchar(1)?

Because, for instance, it might be intended as an integer or float or
numeric value.  Change the content a little, like '(1,2)' or '12:34',
and maybe it's a point or time value.  There are plenty of contexts in
which the intended type of a literal is obviously not text/varchar.

We assign unknown initially as a way of flagging that the type
assignment is uncertain.  Once we have a value that we think is varchar
(a table column for instance), the rules for deciding to cast it to a
different type get a lot more stringent.
        regards, tom lane

Re: Selecting a constant question: A summary

From

Martijn van Oosterhout

Date:

13 June 2007, 04:04:25

On Wed, Jun 13, 2007 at 02:12:37AM -0400, Chuck McDevitt wrote:
> So,  where x = '(1,2)' might be legal for comparing to x, but a field of
> type varchar(5) might not be, as in where x = y, where y is type
> varchar(5) containing '(1,2)'.

Normally, just about every type can be converted to or from text. So if
postgres converted to varchar first you have problems with the
statement x='const' where x is type "foo". It's now ambiguous since you
either convert x to varchar or the constant to "foo". Instead, postgres
marks the constant as unknown and now it always gets converted because
nothing can convert to unknown.

Thus far this system has worked excellently, though not perfectly
obviously. It matches people's expectations well, which is the most
important part.

> I find PostgreSQL's handling of this strange, as I come from systems
> where 'xxx' is either a varchar or char type, in all contexts, and
> implicit casts handle any needed conversions.
> But now I understand why it does things this way.

User-defined types makes for lots of interesting choices, but they are
by far the most powerful feature of postgres and we don't want to mess
with that.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Selecting a constant question: A summary

From

Gregory Stark

Date:

13 June 2007, 06:36:58

"Chuck McDevitt" <cmcdevitt@greenplum.com> writes:

> Just a curiosity question:  Why is the type of a literal '1' "unknown"
> instead of varchar(1)?

Even if it was assigned a text datatype it would be the unconstrainted "text"
not varchar(1). If we used varchar(1) then things like:
create table foo as select 'foo';

would create a table with varchar(3) which would then complain if you tried to
insert 'foobar' into. There doesn't seem to be enough evidence that the user
intended to constrain the input to just 'foo' in that case.

Of course right now you get a table with a column of type "unknown" which is
very unlikely to be what the user expects.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com

Re: Selecting a constant question: A summary

From

"Zeugswetter Andreas ADI SD"

Date:

13 June 2007, 06:55:50

> For some Unicode character sets, element_width can be as much as 4

In UTF8 one char can be up to 6 bytes, so 4 is not correct in general.

Andreas