Thread: Selecting a constant question

Selecting a constant question

From
"Dann Corbit"
Date:
<div class="Section1"><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">SELECT 1  FROM test.dbo.a_003 </span></font><p class="MsoNormal"><font face="Times New Roman"
size="3"><spanstyle="font-size: 
12.0pt"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">gets about 60,000 records per second</span></font><p class="MsoNormal"><font face="Times New Roman"
size="3"><spanstyle="font-size: 
12.0pt"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">SELECT '1'  FROM test.dbo.a_003 </span></font><p class="MsoNormal"><font face="Times New Roman"
size="3"><spanstyle="font-size: 
12.0pt"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">gets about 600 records per second.</span></font><p class="MsoNormal"><font face="Times New Roman"
size="3"><spanstyle="font-size: 
12.0pt"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">The cause is that postgres describes the return column as "unknown" length 65534 in the 2nd
case.</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">Since the value is a constant, it seems rather odd to make the length 65534 characters.  Why not
makeit char(1) or some other appropriate and less costly data type?  After all, it isn’t going to grow during the
query.</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font></div>

Re: Selecting a constant question

From
Tom Lane
Date:
"Dann Corbit" <DCorbit@connx.com> writes:
> SELECT 1  FROM test.dbo.a_003
> gets about 60,000 records per second
> SELECT '1'  FROM test.dbo.a_003
> gets about 600 records per second.

> The cause is that postgres describes the return column as "unknown"
> length 65534 in the 2nd case.

Postgres describes it in no such fashion --- unknown will always have a
typmod of -1 which means "unspecified".  Possibly you have some client
code that knows much less than it thinks it does about the meanings of
typmod values?

The actual volume of data transmitted is going to be just about the same
either way, so I'm not sure you've diagnosed the cause of slowdown
correctly.  Trying the example in psql seems to be about the same speed
both ways, with if anything a slight advantage to select '1'.
        regards, tom lane


Re: Selecting a constant question

From
Gregory Stark
Date:
"Dann Corbit" <DCorbit@connx.com> writes:

> SELECT 1  FROM test.dbo.a_003 
>
> gets about 60,000 records per second
>
> SELECT '1'  FROM test.dbo.a_003 
>
> gets about 600 records per second.
>
> The cause is that postgres describes the return column as "unknown"
> length 65534 in the 2nd case.

Wait, back up. How does this cause it to go slower?

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Gregory Stark [mailto:stark@enterprisedb.com]
> Sent: Monday, June 11, 2007 12:48 PM
> To: Dann Corbit
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Selecting a constant question
>
> "Dann Corbit" <DCorbit@connx.com> writes:
>
> > SELECT 1  FROM test.dbo.a_003
> >
> > gets about 60,000 records per second
> >
> > SELECT '1'  FROM test.dbo.a_003
> >
> > gets about 600 records per second.
> >
> > The cause is that postgres describes the return column as "unknown"
> > length 65534 in the 2nd case.
>
> Wait, back up. How does this cause it to go slower?

The issue is this:

Postgres describes the column with a typmod of -1 (unknown) and a length
of 65534.

This means that any client application must reserve 65534 bytes of
spaces for every row of data (like a grid control for example), which
postgres should know (and report) that the maximum length of the column
is 1.

This is not a PSQL issue, it's an issue with other products relying on
the accuracy of the reported postgres metadata for a given SQL
statement.


Re: Selecting a constant question

From
Martijn van Oosterhout
Date:
On Mon, Jun 11, 2007 at 12:55:55PM -0700, Dann Corbit wrote:
> The issue is this:
>
> Postgres describes the column with a typmod of -1 (unknown) and a length
> of 65534.

Postgres does no such thing. How can it possibly know the maximum size
of a column before executing the query?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Selecting a constant question

From
Tom Lane
Date:
"Dann Corbit" <DCorbit@connx.com> writes:
> The issue is this:
> Postgres describes the column with a typmod of -1 (unknown) and a length
> of 65534.

Oh, you're looking at typlen not typmod.  Please observe the comments in
pg_type.h:
/* * For a fixed-size type, typlen is the number of bytes we use to * represent a value of this type, e.g. 4 for an
int4.   But for a * variable-length type, typlen is negative.  We use -1 to indicate a * "varlena" type (one that has a
lengthword), -2 to indicate a * null-terminated C string. */int2        typlen;
 

You should be treating typlen as signed not unsigned, and not assuming a
fixed width for any negative value.

Since the width refers to the server internal representation, and not to
what comes down the wire, I find it pretty strange for an application to
be using typlen for anything at all actually.
        regards, tom lane


Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Monday, June 11, 2007 1:32 PM
> To: Dann Corbit
> Cc: Gregory Stark; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Selecting a constant question
>
> "Dann Corbit" <DCorbit@connx.com> writes:
> > The issue is this:
> > Postgres describes the column with a typmod of -1 (unknown) and a
length
> > of 65534.
>
> Oh, you're looking at typlen not typmod.  Please observe the comments
in
> pg_type.h:
>
>     /*
>      * For a fixed-size type, typlen is the number of bytes we use
to
>      * represent a value of this type, e.g. 4 for an int4.    But for
> a
>      * variable-length type, typlen is negative.  We use -1 to
indicate
> a
>      * "varlena" type (one that has a length word), -2 to indicate a
>      * null-terminated C string.
>      */
>     int2        typlen;
>
> You should be treating typlen as signed not unsigned, and not assuming
a
> fixed width for any negative value.
>
> Since the width refers to the server internal representation, and not
to
> what comes down the wire, I find it pretty strange for an application
to
> be using typlen for anything at all actually.

Thanks for the response.

Since libpq function PQfsize returns -2 for all constant character
strings in SQL statements ... What is the proper procedure to determine
the length of a constant character column after query execution but
before fetching the first row of data?



Re: Selecting a constant question

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Trying the example in psql seems to be about the same speed both ways, with
> if anything a slight advantage to select '1'.

Fwiw I see a slight advantage for '1' as well. I wonder why.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: Monday, June 11, 2007 1:46 PM
> To: Dann Corbit
> Subject: Re: [HACKERS] Selecting a constant question
>
> On Mon, Jun 11, 2007 at 01:29:37PM -0700, Dann Corbit wrote:
> > Our application is using the libPQ interface to access postgres.
> >
> > The query is "select '123' from <tablename>"  .. the table is not
> > important.
> >
> > After executing the query, we interrogate the metadata of the result
set
> > using the PQfsize, PQfmod and PQftype functions.
>
> Did you read the documentation of the PQfsize function?
>
> PQfsize returns the space allocated for this column in a database row,
> in other words the size of the server's internal representation of the
> data type. (Accordingly, it is not really very useful to clients.) A
> negative value indicates the data type is variable-length.
>
>
http://www.postgresql.org/docs/8.2/interactive/libpq-exec.html#LIBPQ-EXE
C-
> SELECT-INFO
>
> > The size of the column is returned as 65534  (or -2 if you consider
this
> > a signed short value)
>
> It's variable length, you can't say anything more.

So what you are saying is that the constant '1' is variable length, and
there is no way to find out the maximum length from the database.



Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Dann Corbit
> Sent: Monday, June 11, 2007 1:52 PM
> To: Martijn van Oosterhout
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Selecting a constant question
>
> > -----Original Message-----
> > From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> > Sent: Monday, June 11, 2007 1:46 PM
> > To: Dann Corbit
> > Subject: Re: [HACKERS] Selecting a constant question
> >
> > On Mon, Jun 11, 2007 at 01:29:37PM -0700, Dann Corbit wrote:
> > > Our application is using the libPQ interface to access postgres.
> > >
> > > The query is "select '123' from <tablename>"  .. the table is not
> > > important.
> > >
> > > After executing the query, we interrogate the metadata of the
result
> set
> > > using the PQfsize, PQfmod and PQftype functions.
> >
> > Did you read the documentation of the PQfsize function?
> >
> > PQfsize returns the space allocated for this column in a database
row,
> > in other words the size of the server's internal representation of
the
> > data type. (Accordingly, it is not really very useful to clients.) A
> > negative value indicates the data type is variable-length.
> >
> >
>
http://www.postgresql.org/docs/8.2/interactive/libpq-exec.html#LIBPQ-EXE
> C-
> > SELECT-INFO
> >
> > > The size of the column is returned as 65534  (or -2 if you
consider
> this
> > > a signed short value)
> >
> > It's variable length, you can't say anything more.
>
> So what you are saying is that the constant '1' is variable length,
and
> there is no way to find out the maximum length from the database.

I have a PostgreSQL feature request:

Report the maximum size of a variable length string from the server.

Surely, we cannot be the only people who will need this information.  If
(for example) someone wants to bind to a grid, then the maximum size has
to be known in advance.



Re: Selecting a constant question

From
Gregory Stark
Date:
"Dann Corbit" <DCorbit@connx.com> writes:

> Surely, we cannot be the only people who will need this information.  If
> (for example) someone wants to bind to a grid, then the maximum size has
> to be known in advance.

In fact psql needs it and implements this. It has to skim through the entire
result set to calculate the column widths. It's quite a lot of work but the
server is in no better position to do it than psql.

On the contrary the server is missing quite a bit of information of how you
intend to display the information. Do you need the number of bytes or
characters? Are all the characters the same width in your display system? What
about currency symbols? Do you intend to reverse any quoting or just display
backslashes?

Even knowing how many characters and assuming fixed character widths that
wouldn't even be enough to set your grid control widths. Usually people like
numeric quantities decimal aligned and so two records "1.00" and "0.01" will
take much more width than two records with "1.00" and "2.00".

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Gregory Stark [mailto:stark@enterprisedb.com]
> Sent: Monday, June 11, 2007 2:41 PM
> To: Dann Corbit
> Cc: Martijn van Oosterhout; pgsql-hackers@postgresql.org; Larry McGhaw
> Subject: Re: [HACKERS] Selecting a constant question
>
> "Dann Corbit" <DCorbit@connx.com> writes:
>
> > Surely, we cannot be the only people who will need this information.
If
> > (for example) someone wants to bind to a grid, then the maximum size
has
> > to be known in advance.
>
> In fact psql needs it and implements this. It has to skim through the
> entire
> result set to calculate the column widths. It's quite a lot of work
but
> the
> server is in no better position to do it than psql.

Reading the data twice sounds a little painful.  What if there are 30
million rows?
> On the contrary the server is missing quite a bit of information of
how
> you
> intend to display the information. Do you need the number of bytes or
> characters? Are all the characters the same width in your display
system?
> What
> about currency symbols? Do you intend to reverse any quoting or just
> display
> backslashes?

Giving me the information about the data type will be enough.  As an
example, in this case we have varchar data.  If the server should be so
kind as to report varchar(1) for '1' or varchar(3) for '123' then I
would not have any difficulty binding the data to a grid.
> Even knowing how many characters and assuming fixed character widths
that
> wouldn't even be enough to set your grid control widths. Usually
people
> like
> numeric quantities decimal aligned and so two records "1.00" and
"0.01"
> will
> take much more width than two records with "1.00" and "2.00".

SQL*Server, Oracle, Ingres, DB/2 and other database systems somehow
manage to do it, so I guess it is not technically intractable.

I suspect that your own ODBC/JDBC and other drivers suffer from this
same effect.

Now, I do recognize that sometimes nobody is going to know how big
something is, including the server.  But with a query using a constant
it seems like it ought to be well defined to me.  Perhaps the
difficulties are escaping me because I am not familiar with the low
level guts of this problem.  But I suspect that lots of people besides
me would benefit if sizes of things were known when it is possible to
know them.

As I said before, I see that it cannot be known right now. So I am
putting it in as a feature request.

If you could be so kind as to point out the right spot to look in the
server code, I imagine we could fix it and check in the patch ourselves.


Re: Selecting a constant question

From
Alvaro Herrera
Date:
Dann Corbit wrote:

> > "Dann Corbit" <DCorbit@connx.com> writes:
> > 
> > In fact psql needs it and implements this. It has to skim through the
> > entire
> > result set to calculate the column widths. It's quite a lot of work
> but
> > the
> > server is in no better position to do it than psql.
> 
> Reading the data twice sounds a little painful.  What if there are 30
> million rows?

You get an "out of memory" error.

> > On the contrary the server is missing quite a bit of information of
> > how you intend to display the information. Do you need the number of
> > bytes or characters? Are all the characters the same width in your
> > display system?  What about currency symbols? Do you intend to
> > reverse any quoting or just display backslashes?
> 
> Giving me the information about the data type will be enough.  As an
> example, in this case we have varchar data.  If the server should be so
> kind as to report varchar(1) for '1' or varchar(3) for '123' then I
> would not have any difficulty binding the data to a grid.

Oh, you have the length information for each datum all right.  It's on
the first four bytes of it.

-- 
Alvaro Herrera                               http://www.PlanetPostgreSQL.org/
"The West won the world not by the superiority of its ideas or values
or religion but rather by its superiority in applying organized violence.
Westerners often forget this fact, non-Westerners never do."
(Samuel P. Huntington)


Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Alvaro Herrera [mailto:alvherre@commandprompt.com]
> Sent: Monday, June 11, 2007 3:16 PM
> To: Dann Corbit
> Cc: Gregory Stark; Martijn van Oosterhout;
pgsql-hackers@postgresql.org;
> Larry McGhaw
> Subject: Re: [HACKERS] Selecting a constant question
>
> Dann Corbit wrote:
>
> > > "Dann Corbit" <DCorbit@connx.com> writes:
> > >
> > > In fact psql needs it and implements this. It has to skim through
the
> > > entire
> > > result set to calculate the column widths. It's quite a lot of
work
> > but
> > > the
> > > server is in no better position to do it than psql.
> >
> > Reading the data twice sounds a little painful.  What if there are
30
> > million rows?
>
> You get an "out of memory" error.
>
> > > On the contrary the server is missing quite a bit of information
of
> > > how you intend to display the information. Do you need the number
of
> > > bytes or characters? Are all the characters the same width in your
> > > display system?  What about currency symbols? Do you intend to
> > > reverse any quoting or just display backslashes?
> >
> > Giving me the information about the data type will be enough.  As an
> > example, in this case we have varchar data.  If the server should be
so
> > kind as to report varchar(1) for '1' or varchar(3) for '123' then I
> > would not have any difficulty binding the data to a grid.
>
> Oh, you have the length information for each datum all right.  It's on
> the first four bytes of it.

Sure, but when I bind to a grid, I need to know a-priori how big the
biggest returned instance can be.  Reading the entire data set twice to
learn the size of a constant seems rather conceptually odd to me.



Re: Selecting a constant question

From
Alvaro Herrera
Date:
Dann Corbit wrote:

> > Oh, you have the length information for each datum all right.  It's on
> > the first four bytes of it.
> 
> Sure, but when I bind to a grid, I need to know a-priori how big the
> biggest returned instance can be.  Reading the entire data set twice to
> learn the size of a constant seems rather conceptually odd to me.

Did you read up on typmod already?  I think that's part of the info sent
down in the query response.

-- 
Alvaro Herrera                         http://www.flickr.com/photos/alvherre/
"No single strategy is always right (Unless the boss says so)"                                                 (Larry
Wall)


Re: Selecting a constant question

From
Martijn van Oosterhout
Date:
On Mon, Jun 11, 2007 at 03:18:33PM -0700, Dann Corbit wrote:
> Sure, but when I bind to a grid, I need to know a-priori how big the
> biggest returned instance can be.  Reading the entire data set twice to
> learn the size of a constant seems rather conceptually odd to me.

To be honest, the concept that a widget requires a constant that can't
be changed later is also a bit odd. There are many times you won't know
beforehand how big the data is, surely the framework should be smart
enough to handle these cases?

Start the width at 100, if it turns out to be too small, make it
bigger...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Selecting a constant question

From
Andrew Dunstan
Date:

Dann Corbit wrote:
> I have a PostgreSQL feature request:
>
> Report the maximum size of a variable length string from the server.
>
> Surely, we cannot be the only people who will need this information.  If
> (for example) someone wants to bind to a grid, then the maximum size has
> to be known in advance.
>
>
>
>   

Does PQfmod not tell you what you need if the field is varchar(n) ?

cheers

andrew




Re: Selecting a constant question

From
Tom Lane
Date:
"Dann Corbit" <DCorbit@connx.com> writes:
> Giving me the information about the data type will be enough.  As an
> example, in this case we have varchar data.  If the server should be so
> kind as to report varchar(1) for '1' or varchar(3) for '123' then I
> would not have any difficulty binding the data to a grid.

This seems merest fantasy.  Reflect on multibyte character sets for a
bit --- even if it's known that the column is varchar(3) there is no
guarantee that the value will fit in 3 bytes.
        regards, tom lane


Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Monday, June 11, 2007 3:35 PM
> To: Dann Corbit
> Cc: Gregory Stark; Martijn van Oosterhout;
pgsql-hackers@postgresql.org;
> Larry McGhaw
> Subject: Re: [HACKERS] Selecting a constant question
>
> "Dann Corbit" <DCorbit@connx.com> writes:
> > Giving me the information about the data type will be enough.  As an
> > example, in this case we have varchar data.  If the server should be
so
> > kind as to report varchar(1) for '1' or varchar(3) for '123' then I
> > would not have any difficulty binding the data to a grid.
>
> This seems merest fantasy.  Reflect on multibyte character sets for a
> bit --- even if it's known that the column is varchar(3) there is no
> guarantee that the value will fit in 3 bytes.

If the server bound the data as UNICODE, then it will tell me
UNICODE(3).  I know how big this will be.

In the worst case scenario it will fit in 3*4 = 12 bytes.

If the server is built without UNICODE enabled, then it will definitely
fit in 3 bytes.



Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: Monday, June 11, 2007 3:29 PM
> To: Dann Corbit
> Cc: Alvaro Herrera; Gregory Stark; pgsql-hackers@postgresql.org; Larry
> McGhaw
> Subject: Re: [HACKERS] Selecting a constant question
>
> On Mon, Jun 11, 2007 at 03:18:33PM -0700, Dann Corbit wrote:
> > Sure, but when I bind to a grid, I need to know a-priori how big the
> > biggest returned instance can be.  Reading the entire data set twice
to
> > learn the size of a constant seems rather conceptually odd to me.
>
> To be honest, the concept that a widget requires a constant that can't
> be changed later is also a bit odd.

Not when the data itself is a constant that cannot be changed.

> There are many times you won't know
> beforehand how big the data is, surely the framework should be smart
> enough to handle these cases?

If it were impossible to know the size of a string constant supplied in
the query, then I think I would agree with you here.  However, it seems
to me that the maximum possible size of such a known, constant-width
string is not hard to determine.

> Start the width at 100, if it turns out to be too small, make it
> bigger...

If that were a good idea, then why report data sizes at all?  Just let
it always be a surprise when it comes streaming down the pipe.

Honestly, I cannot fathom this answer.



Re: Selecting a constant question

From
Alvaro Herrera
Date:
Dann Corbit wrote:

> If the server bound the data as UNICODE, then it will tell me
> UNICODE(3).  I know how big this will be.
> 
> In the worst case scenario it will fit in 3*4 = 12 bytes.
> 
> If the server is built without UNICODE enabled, then it will definitely
> fit in 3 bytes.

Unless it's some other multibyte encoding.  And nowadays, the server is
always "unicode enabled".  The stuff sent down the wire is unicode or
not depending on a configuration parameter.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Selecting a constant question

From
Tom Lane
Date:
"Dann Corbit" <DCorbit@connx.com> writes:
>> To be honest, the concept that a widget requires a constant that can't
>> be changed later is also a bit odd.

> Not when the data itself is a constant that cannot be changed.

Surely this case is not sufficiently important to justify designing
your entire application (not to mention the client/server protocol)
around it.  You're always going to have variable-width columns in there
somewhere.
        regards, tom lane


Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Alvaro Herrera [mailto:alvherre@commandprompt.com]
> Sent: Monday, June 11, 2007 3:44 PM
> To: Dann Corbit
> Cc: Tom Lane; Gregory Stark; Martijn van Oosterhout; pgsql-
> hackers@postgresql.org; Larry McGhaw
> Subject: Re: [HACKERS] Selecting a constant question
>
> Dann Corbit wrote:
>
> > If the server bound the data as UNICODE, then it will tell me
> > UNICODE(3).  I know how big this will be.
> >
> > In the worst case scenario it will fit in 3*4 = 12 bytes.
> >
> > If the server is built without UNICODE enabled, then it will
definitely
> > fit in 3 bytes.
>
> Unless it's some other multibyte encoding.  And nowadays, the server
is
> always "unicode enabled".  The stuff sent down the wire is unicode or
> not depending on a configuration parameter.

Even at that, we still know an absolute maximum of 12 bytes.


Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Monday, June 11, 2007 3:50 PM
> To: Dann Corbit
> Cc: Martijn van Oosterhout; Alvaro Herrera; Gregory Stark; pgsql-
> hackers@postgresql.org; Larry McGhaw
> Subject: Re: [HACKERS] Selecting a constant question
>
> "Dann Corbit" <DCorbit@connx.com> writes:
> >> To be honest, the concept that a widget requires a constant that
can't
> >> be changed later is also a bit odd.
>
> > Not when the data itself is a constant that cannot be changed.
>
> Surely this case is not sufficiently important to justify designing
> your entire application (not to mention the client/server protocol)
> around it.  You're always going to have variable-width columns in
there
> somewhere.

Right.  But normally I get back a length for those variable length
columns, or I can collect it from the metadata of the database.

Surely, PostgreSQL can determine the size of a constant string.
Otherwise it would be impossible to know if it would be safe to insert a
constant string into a database column.

PostgreSQL has decided upon a data type, and gives me data bound in that
type.  It is only the length that it is unwilling to divulge.


Re: Selecting a constant question

From
"Larry McGhaw"
Date:
I think perhaps we have lost sight of the main issue:

1) libpq can properly describe the maximum internal data size of any
numeric or char column in a table via Pqfsize
2) libpq can properly describe the maximum internal data size of any
varchar column via Pqfmod
3) libpq can properly describe the maximum internal data size of any
numeric constant in a SQL statement via Pqfsize
4) libpq **cannot** describe the maximum internal data size of a char or
varchar constant!
Example:  select '123' from <any table>

This is clearly a bug or serious oversight in libpq that should be
addressed.

The database *knows* this size of the char constant (obviously), and
should report the size via a metadata call, as all other relational
databases do.

Thanks

lm


-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@commandprompt.com]
Sent: Monday, June 11, 2007 3:44 PM
To: Dann Corbit
Cc: Tom Lane; Gregory Stark; Martijn van Oosterhout;
pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question

Dann Corbit wrote:

> If the server bound the data as UNICODE, then it will tell me
> UNICODE(3).  I know how big this will be.
>
> In the worst case scenario it will fit in 3*4 = 12 bytes.
>
> If the server is built without UNICODE enabled, then it will
> definitely fit in 3 bytes.

Unless it's some other multibyte encoding.  And nowadays, the server is
always "unicode enabled".  The stuff sent down the wire is unicode or
not depending on a configuration parameter.

--
Alvaro Herrera
http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Selecting a constant question

From
"Larry McGhaw"
Date:
I think perhaps we have lost sight of the main issue:

1) libpq can properly describe the maximum internal data size of any
numeric or char column in a table via Pqfsize
2) libpq can properly describe the maximum internal data size of any
varchar column via Pqfmod
3) libpq can properly describe the maximum internal data size of any
numeric constant in a SQL statement via Pqfsize
4) libpq **cannot** describe the maximum internal data size of a char or
varchar constant!
Example:  select '123' from <any table>

This is clearly a bug or serious oversight in libpq that should be
addressed.

The database *knows* this size of the char constant (obviously), and
should report the size via a metadata call, as all other relational
databases do.

Thanks

lm


-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@commandprompt.com]
Sent: Monday, June 11, 2007 3:44 PM
To: Dann Corbit
Cc: Tom Lane; Gregory Stark; Martijn van Oosterhout;
pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question

Dann Corbit wrote:

> If the server bound the data as UNICODE, then it will tell me
> UNICODE(3).  I know how big this will be.
>
> In the worst case scenario it will fit in 3*4 = 12 bytes.
>
> If the server is built without UNICODE enabled, then it will
> definitely fit in 3 bytes.

Unless it's some other multibyte encoding.  And nowadays, the server is
always "unicode enabled".  The stuff sent down the wire is unicode or
not depending on a configuration parameter.

--
Alvaro Herrera
http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Selecting a constant question

From
Kris Jurka
Date:

On Mon, 11 Jun 2007, Larry McGhaw wrote:

> I think perhaps we have lost sight of the main issue:
>
> 2) libpq can properly describe the maximum internal data size of any
> varchar column via Pqfmod

SELECT cola || colb FROM tab;

> 3) libpq can properly describe the maximum internal data size of any
> numeric constant in a SQL statement via Pqfsize

SELECT 3::numeric;

Kris Jurka



Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Kris Jurka
> Sent: Monday, June 11, 2007 5:04 PM
> To: Larry McGhaw
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Selecting a constant question
>
>
>
> On Mon, 11 Jun 2007, Larry McGhaw wrote:
>
> > I think perhaps we have lost sight of the main issue:
> >
> > 2) libpq can properly describe the maximum internal data size of any
> > varchar column via Pqfmod
>
> SELECT cola || colb FROM tab;

Suggestion:
Return (column size of cola) + (column size of colb) in the maximum
length field.
> > 3) libpq can properly describe the maximum internal data size of any
> > numeric constant in a SQL statement via Pqfsize
>
> SELECT 3::numeric;

Suggestion:
Return sizeof (numeric(1,0)) -- after all, it's a constant here.

In the words of the great poet "Spike Lee":
'Always do the right thing.'



Re: Selecting a constant question

From
Andrew Dunstan
Date:

Larry McGhaw wrote:
> 4) libpq **cannot** describe the maximum internal data size of a char or
> varchar constant!
> Example:  select '123' from <any table>
>
> This is clearly a bug or serious oversight in libpq that should be
> addressed.
>
> The database *knows* this size of the char constant (obviously), and
> should report the size via a metadata call, as all other relational
> databases do.
>
>   
>   

What is not clear to me is why it is so important for you to know the 
length of a piece of data you are supplying. If it is so vitally 
important, you could always cast it, e.g. select '123'::varchar(3)

cheers

andrew


Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Andrew Dunstan
> Sent: Monday, June 11, 2007 5:12 PM
> To: Larry McGhaw
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Selecting a constant question
>
>
>
> Larry McGhaw wrote:
> > 4) libpq **cannot** describe the maximum internal data size of a
char or
> > varchar constant!
> > Example:  select '123' from <any table>
> >
> > This is clearly a bug or serious oversight in libpq that should be
> > addressed.
> >
> > The database *knows* this size of the char constant (obviously), and
> > should report the size via a metadata call, as all other relational
> > databases do.
> >
> >
> >
>
> What is not clear to me is why it is so important for you to know the
> length of a piece of data you are supplying. If it is so vitally
> important, you could always cast it, e.g. select '123'::varchar(3)

We're a middleware company.  We are not in control of the queries that
are sent.  We can intercept and reformat them, and perhaps that is what
we will need to do for PostgreSQL


Re: Selecting a constant question

From
Tom Lane
Date:
"Larry McGhaw" <lmcghaw@connx.com> writes:
> I think perhaps we have lost sight of the main issue:
> 1) libpq can properly describe the maximum internal data size of any
> numeric or char column in a table via Pqfsize
> 2) libpq can properly describe the maximum internal data size of any
> varchar column via Pqfmod
> 3) libpq can properly describe the maximum internal data size of any
> numeric constant in a SQL statement via Pqfsize

None of the above statements are actually true, at least not when you
take off your blinders and note the existence of unconstrained-width
numeric and text columns.

> The database *knows* this size of the char constant (obviously),

No, what it knows (and reports) is type information.  There are a small
number of datatypes where you can infer a maximum width from knowledge
of the datatype.  There are many others where you can't set an upper
bound from this knowledge --- at least not a usefully tight one.

Anyway, if we were to cast those constants to something other than
unknown, it would be text, not varchar, and you'd still have the same
issue.
        regards, tom lane


Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Monday, June 11, 2007 5:32 PM
> To: Larry McGhaw
> Cc: Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van
Oosterhout;
> pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Selecting a constant question
>
> "Larry McGhaw" <lmcghaw@connx.com> writes:
> > I think perhaps we have lost sight of the main issue:
> > 1) libpq can properly describe the maximum internal data size of any
> > numeric or char column in a table via Pqfsize
> > 2) libpq can properly describe the maximum internal data size of any
> > varchar column via Pqfmod
> > 3) libpq can properly describe the maximum internal data size of any
> > numeric constant in a SQL statement via Pqfsize
>
> None of the above statements are actually true, at least not when you
> take off your blinders and note the existence of unconstrained-width
> numeric and text columns.

Unconstrained width columns are not what are being discussed here.  It
is constant expressions of known width.
> > The database *knows* this size of the char constant (obviously),
>
> No, what it knows (and reports) is type information.  There are a
small
> number of datatypes where you can infer a maximum width from knowledge
> of the datatype.  There are many others where you can't set an upper
> bound from this knowledge --- at least not a usefully tight one.

If you do not know how large 1::numeric is, then how can you know
whether it is safe or not to insert it into a column of type
numeric(12,4)?

If you do not know how large 'Joe'::varchar is, then how can you know
whether it is safe to insert it into a column of type varchar(256)?

Clearly, neither of these operations will cause any problems and so the
size of a constant can be determined.
> Anyway, if we were to cast those constants to something other than
> unknown, it would be text, not varchar, and you'd still have the same
> issue.

Other database systems can manage this, and the programmers of those
database systems are not smarter than the programmers of the PostgreSQL
group.  Therefore I can conclude that if the PostgreSQL group decides it
is important, then they can figure out the size of a string or numeric
constant.



Re: Selecting a constant question

From
"Larry McGhaw"
Date:
As far as I am aware these statements are true.  If you have a specific example you could provide to the contrary that would be interesting.
 
Even if there are such conditions it does not change the fact that libpq and/or postgresql is deficient in this area.
 
For any query, the database should be capable of describing the metadata for the columns, which includes
1) the column type
and
2) the column maximum length.
 
This is such a basic database interface principle that I very disappointed that someone has not recognized this and simply said " yes, we see the issue we will work on it".
 
Again, *all* other major relational databases do this ...  even blob fields have a maximum length reported from the database.
 
I hope someone who truly understands database interfaces will read this thread and address the issue.
For now we will have to "special case" postgres in our application until it is addressed.
 
Thanks
lm


From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Mon 6/11/2007 5:32 PM
To: Larry McGhaw
Cc: Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question

"Larry McGhaw" <lmcghaw@connx.com> writes:
> I think perhaps we have lost sight of the main issue:
> 1) libpq can properly describe the maximum internal data size of any
> numeric or char column in a table via Pqfsize
> 2) libpq can properly describe the maximum internal data size of any
> varchar column via Pqfmod
> 3) libpq can properly describe the maximum internal data size of any
> numeric constant in a SQL statement via Pqfsize

None of the above statements are actually true, at least not when you
take off your blinders and note the existence of unconstrained-width
numeric and text columns.

> The database *knows* this size of the char constant (obviously),

No, what it knows (and reports) is type information.  There are a small
number of datatypes where you can infer a maximum width from knowledge
of the datatype.  There are many others where you can't set an upper
bound from this knowledge --- at least not a usefully tight one.

Anyway, if we were to cast those constants to something other than
unknown, it would be text, not varchar, and you'd still have the same
issue.

                        regards, tom lane

Re: Selecting a constant question

From
Hannu Krosing
Date:
Ühel kenal päeval, E, 2007-06-11 kell 13:38, kirjutas Dann Corbit:
> > -----Original Message-----
> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> > Sent: Monday, June 11, 2007 1:32 PM
> > To: Dann Corbit
> > Cc: Gregory Stark; pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] Selecting a constant question
...
> > You should be treating typlen as signed not unsigned, and not assuming
> a
> > fixed width for any negative value.
> > 
> > Since the width refers to the server internal representation, and not
> to
> > what comes down the wire, I find it pretty strange for an application
> to
> > be using typlen for anything at all actually.
> 
> Thanks for the response.
> 
> Since libpq function PQfsize returns -2 for all constant character
> strings in SQL statements ... What is the proper procedure to determine
> the length of a constant character column after query execution but
> before fetching the first row of data?

Why not just get the first row and determine the width from it before
you actually use any of tha data ?

----------
Hannu




Re: Selecting a constant question

From
Hannu Krosing
Date:
Ühel kenal päeval, E, 2007-06-11 kell 22:11, kirjutas Larry McGhaw:
> As far as I am aware these statements are true.  If you have a
> specific example you could provide to the contrary that would be
> interesting.
>  
> Even if there are such conditions it does not change the fact that
> libpq and/or postgresql is deficient in this area.
>  
> For any query, the database should be capable of describing the
> metadata for the columns, which includes
> 1) the column type
> and
> 2) the column maximum length.
>  
> This is such a basic database interface principle that I very
> disappointed that someone has not recognized this and simply said "
> yes, we see the issue we will work on it".
>  
> Again, *all* other major relational databases do this ...  even blob
> fields have a maximum length reported from the database.
>  
> I hope someone who truly understands database interfaces will read
> this thread and address the issue.
> For now we will have to "special case" postgres in our application
> until it is addressed.
>  

or redesign your application so that it allocates memory as needed and
won't waste client memory by allocating maximum possible amount for each
and every grid cell weather needed or not ;)

As I understand from this discussion you are writing some kind of
middleware (i.e. tools), and I'd expect toolmakers to do the right
thing.

allocating as much as possibly ever needed is something that would be
excusable in quick-n-dirty end user application, but not in a tool.

----------------
Hannu






Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Hannu Krosing [mailto:hannu@skype.net]
> Sent: Monday, June 11, 2007 8:42 PM
> To: Dann Corbit
> Cc: Tom Lane; Gregory Stark; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Selecting a constant question
>
> Ühel kenal päeval, E, 2007-06-11 kell 13:38, kirjutas Dann Corbit:
> > > -----Original Message-----
> > > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> > > Sent: Monday, June 11, 2007 1:32 PM
> > > To: Dann Corbit
> > > Cc: Gregory Stark; pgsql-hackers@postgresql.org
> > > Subject: Re: [HACKERS] Selecting a constant question
> ...
> > > You should be treating typlen as signed not unsigned, and not assuming
> > a
> > > fixed width for any negative value.
> > >
> > > Since the width refers to the server internal representation, and not
> > to
> > > what comes down the wire, I find it pretty strange for an application
> > to
> > > be using typlen for anything at all actually.
> >
> > Thanks for the response.
> >
> > Since libpq function PQfsize returns -2 for all constant character
> > strings in SQL statements ... What is the proper procedure to determine
> > the length of a constant character column after query execution but
> > before fetching the first row of data?
>
> Why not just get the first row and determine the width from it before
> you actually use any of tha data ?

What if the second row is 1000x longer?


Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Hannu Krosing [mailto:hannu@skype.net]
> Sent: Monday, June 11, 2007 10:43 PM
> To: Larry McGhaw
> Cc: Tom Lane; Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van
> Oosterhout; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Selecting a constant question
>
> Ühel kenal päeval, E, 2007-06-11 kell 22:11, kirjutas Larry McGhaw:
> > As far as I am aware these statements are true.  If you have a
> > specific example you could provide to the contrary that would be
> > interesting.
> >
> > Even if there are such conditions it does not change the fact that
> > libpq and/or postgresql is deficient in this area.
> >
> > For any query, the database should be capable of describing the
> > metadata for the columns, which includes
> > 1) the column type
> > and
> > 2) the column maximum length.
> >
> > This is such a basic database interface principle that I very
> > disappointed that someone has not recognized this and simply said "
> > yes, we see the issue we will work on it".
> >
> > Again, *all* other major relational databases do this ...  even blob
> > fields have a maximum length reported from the database.
> >
> > I hope someone who truly understands database interfaces will read
> > this thread and address the issue.
> > For now we will have to "special case" postgres in our application
> > until it is addressed.
> >
>
> or redesign your application so that it allocates memory as needed and
> won't waste client memory by allocating maximum possible amount for each
> and every grid cell weather needed or not ;)
>
> As I understand from this discussion you are writing some kind of
> middleware (i.e. tools), and I'd expect toolmakers to do the right
> thing.

In this case the middleware is:
ODBC/JDBC/OLEDB/.NET data drivers for PostgreSQL.

There are other related tools, but the above is the product for which the bug needs corrected.

> allocating as much as possibly ever needed is something that would be
> excusable in quick-n-dirty end user application, but not in a tool.

It's a requirement of the ODBC/JDBC/OLEDB/.NET specifications.  I suppose we could scan the table twice to figure out
howlarge a column might be, but that would make the PostgreSQL driver run at 1/2 speed.  Not a very appetizing
solution.

None of the other database vendors has any trouble reporting this information correctly.



Re: Selecting a constant question

From
Hannu Krosing
Date:
Ühel kenal päeval, E, 2007-06-11 kell 22:55, kirjutas Dann Corbit:
> > -----Original Message-----
...
> > > I hope someone who truly understands database interfaces will read
> > > this thread and address the issue.
> > > For now we will have to "special case" postgres in our application
> > > until it is addressed.
> > >
> > 
> > or redesign your application so that it allocates memory as needed and
> > won't waste client memory by allocating maximum possible amount for each
> > and every grid cell weather needed or not ;)
> > 
> > As I understand from this discussion you are writing some kind of
> > middleware (i.e. tools), and I'd expect toolmakers to do the right
> > thing.
> 
> In this case the middleware is:
> ODBC/JDBC/OLEDB/.NET data drivers for PostgreSQL.
> 
> There are other related tools, but the above is the product for which the bug needs corrected.

You mean you use some kind of "Grid" inside JDBC/.NET drivers , and it
needs to know max size for a column ?

can't you replace it with a dynamically allocated Grid component, which
would also work well for other expressions, not just constants ?

> > allocating as much as possibly ever needed is something that would be
> > excusable in quick-n-dirty end user application, but not in a tool.
> 
> It's a requirement of the ODBC/JDBC/OLEDB/.NET specifications.  

Is that a requirement only for "constants" or for any expression, say
"SELECT substring(reallybigblob, 1, random(1000000)) from somebigtable"
?

> I suppose we could scan the 
> table twice to figure out how large a column might be, but that would make the PostgreSQL 
> driver run at 1/2 speed.  Not a very appetizing solution.

by scanninc twice you find out how big the largest column _is_, not
might be .

> None of the other database vendors has any trouble reporting this information correctly.

By "this information" you mean the max possible size of data returned by
and expression ?

-----------------
Hannu




Re: Selecting a constant question

From
"Larry McGhaw"
Date:
Again, the issue is not our tool, but the deficiency in libpq/postgres ... even mysql gets its right  .. why not Postgres?
 
Its not hard for a database to report metadata properly.
 
if I issue a sql statement:
select '123' from <any table>
the database should report that the maximum length of the 1st column in the resultset is 3 ... it cant be any more plain than that.
 
Thanks
 
lm


From: Hannu Krosing [mailto:hannu@skype.net]
Sent: Mon 6/11/2007 10:43 PM
To: Larry McGhaw
Cc: Tom Lane; Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question

Ühel kenal päeval, E, 2007-06-11 kell 22:11, kirjutas Larry McGhaw:
> As far as I am aware these statements are true.  If you have a
> specific example you could provide to the contrary that would be
> interesting.

> Even if there are such conditions it does not change the fact that
> libpq and/or postgresql is deficient in this area.

> For any query, the database should be capable of describing the
> metadata for the columns, which includes
> 1) the column type
> and
> 2) the column maximum length.

> This is such a basic database interface principle that I very
> disappointed that someone has not recognized this and simply said "
> yes, we see the issue we will work on it".

> Again, *all* other major relational databases do this ...  even blob
> fields have a maximum length reported from the database.

> I hope someone who truly understands database interfaces will read
> this thread and address the issue.
> For now we will have to "special case" postgres in our application
> until it is addressed.


or redesign your application so that it allocates memory as needed and
won't waste client memory by allocating maximum possible amount for each
and every grid cell weather needed or not ;)

As I understand from this discussion you are writing some kind of
middleware (i.e. tools), and I'd expect toolmakers to do the right
thing.

allocating as much as possibly ever needed is something that would be
excusable in quick-n-dirty end user application, but not in a tool.

----------------
Hannu




Re: Selecting a constant question

From
Heikki Linnakangas
Date:
Larry McGhaw wrote:
> Again, *all* other major relational databases do this ...  even blob fields have a maximum length reported from the
database.

So what are you doing with the max length? Not all data types and values 
have a meaningful max length, so you have to be able to deal with 
variable length data anyway.

For blobs, exactly what max length would you like to get; 1GB? 1TB? Why, 
what good is that for?

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Selecting a constant question

From
"Zeugswetter Andreas ADI SD"
Date:
> > Again, *all* other major relational databases do this ...
> even blob fields have a maximum length reported from the database.
>
> So what are you doing with the max length? Not all data types
> and values have a meaningful max length, so you have to be
> able to deal with variable length data anyway.

Imho it has a lot to do with optimizing the interface.
If you know, that the max length is e.g. 16 bytes in UTF-8 for the 3
chars, you will probably not want any on the fly allocation smarts and
preallocate and bind those 16 bytes. When the max length value gets
larger, and it is a variable lenght type, the overhead of varlen
allocation smarts starts to pay off.

A generic interface should keep the sql parsing smarts at a minimum,
thus it cannot know that a returned column is actually a text constant.

Imho the request for a max length is very reasonable, but has no value
once it exceeds a certain size e.g. 64k.

Andreas


Re: Selecting a constant question

From
Gregory Stark
Date:
"Larry McGhaw" <lmcghaw@connx.com> writes:

> The database *knows* this size of the char constant (obviously), and
> should report the size via a metadata call, as all other relational
> databases do.

I'm not even clear whether you and Dan are talking about the same thing. He's
talking about the number of bytes required hold the constant. You seem to be
talking about the character length of strings.


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Selecting a constant question

From
"Florian G. Pflug"
Date:
Dann Corbit wrote:
>> -----Original Message-----
>> From: Hannu Krosing [mailto:hannu@skype.net]
>>> Since libpq function PQfsize returns -2 for all constant character
>>> strings in SQL statements ... What is the proper procedure to determine
>>> the length of a constant character column after query execution but
>>> before fetching the first row of data?
>> Why not just get the first row and determine the width from it before
>> you actually use any of tha data ?
> 
> What if the second row is 1000x longer?

Thats exactly the point. Consider
select mytext from mytable ;

How can PostgreSQL possibly know the maximum length
of the returned values *before* it has scanned the
whole table?

greetings, Florian Pflug



Re: Selecting a constant question

From
"Zeugswetter Andreas ADI SD"
Date:
> Thats exactly the point. Consider
> select mytext from mytable ;
>
> How can PostgreSQL possibly know the maximum length of the
> returned values *before* it has scanned the whole table?

I think this focuses too much on those cases where it is not possible.
When it is not feasible like with a text column, clients deal with it
already (obviously some better than others).
It is for those cases where it would be feasible, like constants (or
concateneted columns), where the max length if properly returned could
be used to improve performance.

Andreas


Re: Selecting a constant question

From
Heikki Linnakangas
Date:
Zeugswetter Andreas ADI SD wrote:
>> Thats exactly the point. Consider
>> select mytext from mytable ;
>>
>> How can PostgreSQL possibly know the maximum length of the 
>> returned values *before* it has scanned the whole table?
> 
> I think this focuses too much on those cases where it is not possible.
> When it is not feasible like with a text column, clients deal with it
> already (obviously some better than others). 
> It is for those cases where it would be feasible, like constants (or
> concateneted columns), where the max length if properly returned could
> be used to improve performance.

I doubt there's any measurable performance benefit here. You might as 
well allocate a buffer of say 128 bytes, and enlarge it from there when 
you see a value larger than that. Even in the worst case, you'll only 
need to enlarge the buffer a few times per query until you reach the 
real max length.

Actually, if you're in such a high throughput, client-side CPU-intensive  situation that this makes any difference, why
areyou copying the 
 
value to another buffer in the first place? Just access it directly in 
the libpq buffer returned by PQgetvalue, and move on.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Selecting a constant question

From
Dave Page
Date:
Heikki Linnakangas wrote:
> Actually, if you're in such a high throughput, client-side CPU-intensive
>  situation that this makes any difference, why are you copying the value
> to another buffer in the first place? Just access it directly in the
> libpq buffer returned by PQgetvalue, and move on.

That's a *very* good point. The original design for the pgAdmin query
tool made it's own copy of the data to display in the grid which is
exactly why we used to get complaints about having a query time and a
display time.

The modern versions use a virtual table which enables the grid to
retrieve the data directly from the libpq buffer when it needs to draw
each cell which has effectively eliminated that display time.

Regards, Dave.


Re: Selecting a constant question

From
Martijn van Oosterhout
Date:
On Tue, Jun 12, 2007 at 12:47:55PM +0200, Zeugswetter Andreas ADI SD wrote:
> I think this focuses too much on those cases where it is not possible.
> When it is not feasible like with a text column, clients deal with it
> already (obviously some better than others).
> It is for those cases where it would be feasible, like constants (or
> concateneted columns), where the max length if properly returned could
> be used to improve performance.

For constants there is a basic problem that Postgres, if at all
possible, doesn't even analyse the string at all. If it's not part of a
join or sort, then in every likelyhood it's passed through the entire
execution untouched and comes out the other end as type unknown. The
length indicator of -2 indicates a null-terminated string, postgres
never even bothered calculating the length of it.

For the situation of concatinating varchar columns, it's a fairly
special case. The typmod, in the *special case* of varchar is the
maximum length, but for other types it means something else.
Additionally, the planner doesn't know that || is concatination, a
consequence of the user-defined operators. So to make this work you
need to change the planner so that:

1. It special cases varchar to know what the typmod means
2. It special cases the || operator to add the typmods together.
3. Has to take special care not to break user-defined operators

All a pile of hacks and special cases to handle something that, to be
honest, the vast majority of people never notice.

So no, no patch is going to be accepted to handle this special case,
because it's far too hacky for a corner case. On the other hand, if you
can piggyback it into something like the "user-defined typmod" stuff,
it may have a better chance, though I really think the first problem is
basically "won't fix" from an optimisation point of view.

Hope this clarifies things a bit,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Selecting a constant question

From
Andrew Dunstan
Date:

Larry McGhaw wrote:
> Again, the issue is not our tool, but the deficiency in libpq/postgres 
> ... even mysql gets its right  .. why not Postgres?
>  
> Its not hard for a database to report metadata properly.
>  
> if I issue a sql statement:
> select '123' from <any table>
> the database should report that the maximum length of the 1st column 
> in the resultset is 3 ... it cant be any more plain than that.
>  
>
>

Making assertions like this does not make your case for you. If you 
think it's that easy then send in a patch. I suspect that doing what you 
want in the cases where it could be supported would require a protocol 
change, with possibly an extra field in the RowDescription object. If 
that's true you'd need to make a very good and compelling case indeed 
for such a change. If this is so vital I'm curious to know why driver 
authors haven't been screaming about it until now. I'm not dismissing 
what you want, but just waving your hand and saying "it's not hard" 
really won't do.

cheers

andrew




Re: Selecting a constant question

From
"Larry McGhaw"
Date:
I'm really frustrated by this process I'm not trying to attack anyone
here.  I'm just surprised that no one will even entertain the idea that
this is an issue that needs to be addressed.

Instead nearly all of the responses have been attacking the applications
that rely on the metadata.

Let me back up and explain the situation.

This issue came to light for us when we were using a query tool to
examine performance of postgres queries.

We were not only measuring the performance of the database itself, but
also the TCP/IP transport,
And the rendering of the data .. Comparing SQL Server, Oracle, and
Postgres head to head with the same queries.

We noticed inexplicably that when we used a constant with a postgres
query, our records per second dropped
From 60,000 records per second to 600 records per second, so we started
digging into the issue.

We discovered that libpq was not describing the metadata properly for
the constant column, and it appears
That the 3rd party grid control was relying on that metadata somehow  ..
The bottom line is that there was
A huge performance drag.

* OK ... I agree that the memory handling in the grid control could be
better, but I would imagine that
this issue is not an isolated to this one particular control, and that
other applications and controls that rely
on resultset metadata may have this issue.

Bottom line,  we only reported this problem because we thought you would
be interested in doing everything possible to make postgres more
mainstream and conform to SQL standards.  In the past such suggestions
have been absorbed with zeal.

I have no vested interest in you improving the interface or not, and I'm
not going to "plead a case" for you
To do something that every other commercial database has done out of the
box.

It is in your hands now :)

Thanks

lm

-----Original Message-----
From: Andrew Dunstan [mailto:andrew@dunslane.net]
Sent: Tuesday, June 12, 2007 4:50 AM
To: Larry McGhaw
Cc: Hannu Krosing; Tom Lane; Alvaro Herrera; Dann Corbit; Gregory Stark;
Martijn van Oosterhout; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question



Larry McGhaw wrote:
> Again, the issue is not our tool, but the deficiency in libpq/postgres

> ... even mysql gets its right  .. why not Postgres?
>
> Its not hard for a database to report metadata properly.
>
> if I issue a sql statement:
> select '123' from <any table>
> the database should report that the maximum length of the 1st column
> in the resultset is 3 ... it cant be any more plain than that.
>
>
>

Making assertions like this does not make your case for you. If you
think it's that easy then send in a patch. I suspect that doing what you
want in the cases where it could be supported would require a protocol
change, with possibly an extra field in the RowDescription object. If
that's true you'd need to make a very good and compelling case indeed
for such a change. If this is so vital I'm curious to know why driver
authors haven't been screaming about it until now. I'm not dismissing
what you want, but just waving your hand and saying "it's not hard"
really won't do.

cheers

andrew




Re: Selecting a constant question

From
Martijn van Oosterhout
Date:
Hi,

Nobody is tring to attack anyone, but we're all surprised this is an
issue since you're the first person to have mentioned it. I have some
a query to test below:

On Tue, Jun 12, 2007 at 10:21:09AM -0700, Larry McGhaw wrote:
> We noticed inexplicably that when we used a constant with a postgres
> query, our records per second dropped
> From 60,000 records per second to 600 records per second, so we started
> digging into the issue.
>
> We discovered that libpq was not describing the metadata properly for
> the constant column, and it appears
> That the 3rd party grid control was relying on that metadata somehow  ..
> The bottom line is that there was
> A huge performance drag.

What I don't understand is *why* it's complaining about the constant
column and not, for example, any other variable length column. There
are a very small number of cases where a useful length is returned, 99%
of the time it doesn't, yet you're obviously not get any performance
problems there.

Just a quick test, does the problem go away if you do:

SELECT '1'::varchar FROM table;

If that fixes it then the bug is (probably) that the middleware thinks
that a length of -2 means it's 65534 bytes long. Note, in the test
query I gave, it will return -1 for the length. I don't want to blame
the middleware, but I want to make sure we're diagnosing the problem
correctly.

If that query has the same problem, then we really need to think of
something else.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Selecting a constant question

From
Brian Hurt
Date:
Larry McGhaw wrote:

>I'm really frustrated by this process I'm not trying to attack anyone
>here.  I'm just surprised that no one will even entertain the idea that
>this is an issue that needs to be addressed.
>
>Instead nearly all of the responses have been attacking the applications
>that rely on the metadata.
>  
>
Having been following this debate, I think what people have really been 
attacking is the idea that the metadata for:

SELECT '1' AS varchar_column;

should be different from the metadata for:

SELECT varchar_column FROM really_big_table;

or for:

SELECT varchar_column FROM really_small_table;

Or at least that's what I've taken away from the dicussion- it's not so 
much that the metadata shouldn't be relied on, it's that the metadata 
may be more generic than theoretically necessary.  And that the metadata 
may not contain the length of a variable length field even when that 
length is known.

Brian



Re: Selecting a constant question

From
"Larry McGhaw"
Date:
>> What I don't understand is *why* it's complaining about the constant
column
>> and not, for example, any other variable length column. There are a
very small
>> number of cases where a useful length is returned, 99% of the time it
doesn't,
>> yet you're obviously not get any performance problems there.

The statement above is contrary to my actual results.  The proper length
is returned in all non-const cases.

Here is a specific example:

test=# create table test1 ( a varchar(20), b char(10), c integer );
CREATE TABLE
test=#

Note .. The table is empty, and contains no data at this point:

Select a, b, c, '123' , '123'::char(3), '123'::varchar(3) from test1

For column a libpq returns the following:
Pqfsize returns -1
Pqfmod (-4) returns 20

For column b libpq returns the following:
Pqfsize returns -1
Pqfmod (-4) returns 10

For column c libpq returns the following:
Pqfsize returns 4

For constant '123' libpq returns the following:
Pqfsize returns -2
Pqfmod returns -1

For constant '123'::char(3) libpq returns the following:
Pqfsize returns -1
Pqfmod (-4) returns 3

For constant '123'::varchar(3) libpq returns the following:
Pqfsize returns -1
Pqfmod returns -1

Thanks

lm

-----Original Message-----
From: Brian Hurt [mailto:bhurt@janestcapital.com]
Sent: Tuesday, June 12, 2007 11:09 AM
To: Larry McGhaw
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question

Larry McGhaw wrote:

>I'm really frustrated by this process I'm not trying to attack anyone
>here.  I'm just surprised that no one will even entertain the idea that

>this is an issue that needs to be addressed.
>
>Instead nearly all of the responses have been attacking the
>applications that rely on the metadata.
>
>
Having been following this debate, I think what people have really been
attacking is the idea that the metadata for:

SELECT '1' AS varchar_column;

should be different from the metadata for:

SELECT varchar_column FROM really_big_table;

or for:

SELECT varchar_column FROM really_small_table;

Or at least that's what I've taken away from the dicussion- it's not so
much that the metadata shouldn't be relied on, it's that the metadata
may be more generic than theoretically necessary.  And that the metadata
may not contain the length of a variable length field even when that
length is known.

Brian



Re: Selecting a constant question

From
Andrew Dunstan
Date:

Larry McGhaw wrote:
> For constant '123'::varchar(3) libpq returns the following:
> Pqfsize returns -1
> Pqfmod returns -1 
>
>   

That one certainly looks odd.

cheers

andrew


Re: Selecting a constant question

From
"Larry McGhaw"
Date:
That one surprised me as well.

Thanks

lm

-----Original Message-----
From: Andrew Dunstan [mailto:andrew@dunslane.net]
Sent: Tuesday, June 12, 2007 12:00 PM
To: Larry McGhaw
Cc: Brian Hurt; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question



Larry McGhaw wrote:
> For constant '123'::varchar(3) libpq returns the following:
> Pqfsize returns -1
> Pqfmod returns -1
>
>

That one certainly looks odd.

cheers

andrew


Re: Selecting a constant question

From
Gregory Stark
Date:
"Larry McGhaw" <lmcghaw@connx.com> writes:

> The statement above is contrary to my actual results.  The proper length
> is returned in all non-const cases.
>
> Here is a specific example:
>
> test=# create table test1 ( a varchar(20), b char(10), c integer );
> CREATE TABLE

It's not returning a length at all though. It's returning the typmod, ie, the
thing in the parentheses above. In that respect it's perfectly correct to
return -1 for the '123' case as well since it's interpreted as an unbounded
string and has no maximum length. It happens to only be three characters but
then the values in the table could happen to be much less than the 10 or 20
characters you declared them as.

The reason you might want to get this has more to do with understanding the
semantics of the data you're receiving than optimizing storage. If you queried
a Numeric column you would get something very different from the length from
which you could extract the maximum precision and scale. This might help you
display or work with the results maintaining the precision and scale a user
expects.

One reason why it might be useful to add an actual measure of the expected
length (Postgres does make guesses about the length for planning purposes)
would be to so a driver could size buffers appropriately. For example, in psql
where we use cursors to process rows, we might want to automatically use a
fetch count calculated to be large enough to receive approximately one
ethernet frame of data.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Selecting a constant question

From
"Larry McGhaw"
Date:
For what its worth .. Your statement about why we are the first people
to mention this problem really got me thinking.  Anyone who would
attempt to write an ODBC driver for Postgres would run into the exact
same issue.   So I installed the official Postgres ODBC driver, and ran
the identical query and here are my results:

I probably should have looked at this first .... There is a whole
Postgres ODBC dialog dedicated to the very subject of this thread:
Handling of "unknown" data sizes.   The pgodbc driver is configured to
treat unknowns as varchar(255) by default,
As shown by my example below.  This can be configured up or down as
desired.

SQLExecDirect:
In: hstmt = 0x003C18E0, szSqlStr = "Select a,b,c, '123' ,
'123'::char(3), '123'::varchar(3) from...", cbSqlStr = -3
Return:    SQL_SUCCESS=0

Describe Column All:
icol, szColName, *pcbColName, *pfSqlType, *pcbColDef, *pibScale,
*pfNullable
1, a, 1, SQL_VARCHAR=12, 20, 0, SQL_NULLABLE=1
2, b, 1, SQL_CHAR=1, 10, 0, SQL_NULLABLE=1
3, c, 1, SQL_INTEGER=4, 10, 0, SQL_NULLABLE=1
4, ?column?, 8, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1
5, bpchar, 6, SQL_CHAR=1, 3, 0, SQL_NULLABLE=1
6, varchar, 7, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1

From psqlodbc.h

#define MAX_VARCHAR_SIZE        255    /* default maximum size
of                     * varchar fields (not
including null term) */

So I guess the bottom line is that we are not the first to encounter
this problem .. Its just been covered up by assigning
An arbitrary maximum size .. So I guess we will do the same and make it
configurable like the official postgres driver.

Thanks

lm

-----Original Message-----
From: Martijn van Oosterhout [mailto:kleptog@svana.org]
Sent: Tuesday, June 12, 2007 10:43 AM
To: Larry McGhaw
Cc: Andrew Dunstan; Hannu Krosing; Tom Lane; Alvaro Herrera; Dann
Corbit; Gregory Stark; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question

Hi,

Nobody is tring to attack anyone, but we're all surprised this is an
issue since you're the first person to have mentioned it. I have some a
query to test below:

On Tue, Jun 12, 2007 at 10:21:09AM -0700, Larry McGhaw wrote:
> We noticed inexplicably that when we used a constant with a postgres
> query, our records per second dropped From 60,000 records per second
> to 600 records per second, so we started digging into the issue.
>
> We discovered that libpq was not describing the metadata properly for
> the constant column, and it appears That the 3rd party grid control
> was relying on that metadata somehow  ..
> The bottom line is that there was
> A huge performance drag.

What I don't understand is *why* it's complaining about the constant
column and not, for example, any other variable length column. There are
a very small number of cases where a useful length is returned, 99% of
the time it doesn't, yet you're obviously not get any performance
problems there.

Just a quick test, does the problem go away if you do:

SELECT '1'::varchar FROM table;

If that fixes it then the bug is (probably) that the middleware thinks
that a length of -2 means it's 65534 bytes long. Note, in the test query
I gave, it will return -1 for the length. I don't want to blame the
middleware, but I want to make sure we're diagnosing the problem
correctly.

If that query has the same problem, then we really need to think of
something else.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability
to litigate.


Re: Selecting a constant question

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Larry McGhaw
> Sent: Tuesday, June 12, 2007 1:40 PM
> To: Martijn van Oosterhout
> Cc: Andrew Dunstan; Hannu Krosing; Tom Lane; Alvaro Herrera; Dann
Corbit;
> Gregory Stark; pgsql-hackers@postgresql.org
> Subject: RE: [HACKERS] Selecting a constant question
>
> For what its worth .. Your statement about why we are the first people
to
> mention this problem really got me thinking.  Anyone who would attempt
to
> write an ODBC driver for Postgres would run into the exact same issue.
> So I installed the official Postgres ODBC driver, and ran the
identical
> query and here are my results:
>
> I probably should have looked at this first .... There is a whole
Postgres
> ODBC dialog dedicated to the very subject of this thread:
> Handling of "unknown" data sizes.   The pgodbc driver is configured to
> treat unknowns as varchar(255) by default,
> As shown by my example below.  This can be configured up or down as
> desired.
>
> SQLExecDirect:
> In: hstmt = 0x003C18E0, szSqlStr = "Select a,b,c, '123' ,
'123'::char(3),
> '123'::varchar(3) from...", cbSqlStr = -3
> Return:    SQL_SUCCESS=0
>
> Describe Column All:
> icol, szColName, *pcbColName, *pfSqlType, *pcbColDef, *pibScale,
> *pfNullable
> 1, a, 1, SQL_VARCHAR=12, 20, 0, SQL_NULLABLE=1
> 2, b, 1, SQL_CHAR=1, 10, 0, SQL_NULLABLE=1
> 3, c, 1, SQL_INTEGER=4, 10, 0, SQL_NULLABLE=1
> 4, ?column?, 8, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1
> 5, bpchar, 6, SQL_CHAR=1, 3, 0, SQL_NULLABLE=1
> 6, varchar, 7, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1
>
> From psqlodbc.h
>
> #define MAX_VARCHAR_SIZE        255    /* default maximum size
of
>                          * varchar fields (not
including null
> term) */
>
> So I guess the bottom line is that we are not the first to encounter
this
> problem .. Its just been covered up by assigning
> An arbitrary maximum size .. So I guess we will do the same and make
it
> configurable like the official postgres driver.

Of course, the downside here is that choosing a default will truncate
the data when the actual data is larger than the default chosen.



Re: Selecting a constant question

From
Hannu Krosing
Date:
Ühel kenal päeval, T, 2007-06-12 kell 13:40, kirjutas Larry McGhaw:
> For what its worth .. Your statement about why we are the first people
> to mention this problem really got me thinking.  Anyone who would
> attempt to write an ODBC driver for Postgres would run into the exact
> same issue.   So I installed the official Postgres ODBC driver, and ran
> the identical query and here are my results:
> 
> I probably should have looked at this first .... There is a whole
> Postgres ODBC dialog dedicated to the very subject of this thread:
> Handling of "unknown" data sizes.   The pgodbc driver is configured to
> treat unknowns as varchar(255) by default,
> As shown by my example below.  This can be configured up or down as
> desired.

BTW, what is the reason you are writing your own ODBC driver ? 

What problems in the official one are you trying to solve ?

--------------
Hannu




Re: Selecting a constant question

From
"Joshua D. Drake"
Date:
Hannu Krosing wrote:
> Ühel kenal päeval, T, 2007-06-12 kell 13:40, kirjutas Larry McGhaw:
>> For what its worth .. Your statement about why we are the first people
>> to mention this problem really got me thinking.  Anyone who would
>> attempt to write an ODBC driver for Postgres would run into the exact
>> same issue.   So I installed the official Postgres ODBC driver, and ran
>> the identical query and here are my results:
>>
>> I probably should have looked at this first .... There is a whole
>> Postgres ODBC dialog dedicated to the very subject of this thread:
>> Handling of "unknown" data sizes.   The pgodbc driver is configured to
>> treat unknowns as varchar(255) by default,
>> As shown by my example below.  This can be configured up or down as
>> desired.
>
> BTW, what is the reason you are writing your own ODBC driver ?

They aren't I don't think. I think they are using the ODBC driver as an
example.

Joshua D. Drake

>
> What problems in the official one are you trying to solve ?
>
> --------------
> Hannu
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>


--
      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/