Thread: information_schema.columns changes needed for OLEDB
1. data_type field contains types names that are not recognized by MS apps.
Code around: rename types on the fly, e.g.
integer -> int
character varying -> varchar
character -> char
timestamp without time zone -> datetime
bytea -> image
2. character_maximum_length field
Code around: change value for text abd bytea types
[text] 1073741823
[bytea] 2147483647
3. character_octet_length should always be double of character_maximum_length (due to Unicode character size on Windows which is 2).
4. datetime_precision field is not always correct
Code around: change value of the fly, e.g. if value is not null then
[numeric] keep the value (ok)
[bigint] set value to 19
all other set to 10
5. numeric_precision_radix field should always be equal to 10
6. datetime_precision field, minor changes
Code around: change value on the fly, e.g.
[date] set value to zero
[datetime] set value to 3
On Friday 22 May 2009 18:27:01 Konstantin Izmailov wrote: > As we discussed at pgcon2009 there are some changes/fixes necessary in > information_schema.columns to allow correct work of applications and > services via OLEDB on Windows. Here are some: > > 1. data_type field contains types names that are not recognized by MS apps. > > Code around: rename types on the fly, e.g. > > integer -> int > > character varying -> varchar > > character -> char The spelling of these types in the information schema is fixed by the SQL standard. We can't change that. > timestamp without time zone -> datetime And that would certainly be wrong for other applications, because PostgreSQL doesn't have a datetime type. > bytea -> image And that we certainly can't do either. > 2. character_maximum_length field > > Code around: change value for text abd bytea types > > [text] 1073741823 (see next item) > [bytea] 2147483647 But bytea is not a character type in the first place, so this value is meaningless. > 3. character_octet_length should always be double of > character_maximum_length (due to Unicode character size on Windows which is > 2). We could do something like that if we exposed the maximum octet length of a character per encoding. But what I wonder is whether this should reflect the server or the client encoding. How do your applications use this value? > 4. datetime_precision field is not always correct > > Code around: change value of the fly, e.g. if value is not null then > > [numeric] keep the value (ok) > > [bigint] set value to 19 > > all other set to 10 Why would numeric and bigint affect *datetime*_precision at all? > 5. numeric_precision_radix field should always be equal to 10 Why? > 6. datetime_precision field, minor changes > > Code around: change value on the fly, e.g. > > [date] set value to zero Makes sense. I think this is not correct at the moment. > [datetime] set value to 3 Well, it really depends on what you set it to when you declared the column, no?
Number 4 is actually numeric_precision (I typed incorrectly). My recollection is that numeric_precision sometimes expressedin radix 2 and it caused issues for Windows apps.<br /><br />I agree on other issues. I was curious if databasecan help OLEDB driver (to make it simpler). Anyway it can emulate values for specific Windows apps on the fly. Thankyou!<br /><br /><div class="gmail_quote">On Sat, May 23, 2009 at 2:57 PM, Peter Eisentraut <span dir="ltr"><<a href="mailto:peter_e@gmx.net">peter_e@gmx.net</a>></span>wrote:<br /><blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="im">On Friday22 May 2009 18:27:01 Konstantin Izmailov wrote:<br /> > As we discussed at pgcon2009 there are some changes/fixesnecessary in<br /> > information_schema.columns to allow correct work of applications and<br /> > servicesvia OLEDB on Windows. Here are some:<br /> ><br /> > 1. data_type field contains types names that are not recognizedby MS apps.<br /> ><br /> > Code around: rename types on the fly, e.g.<br /> ><br /> > integer ->int<br /> ><br /> > character varying -> varchar<br /> ><br /> > character -> char<br /><br /></div>Thespelling of these types in the information schema is fixed by the SQL<br /> standard. We can't change that.<br/><div class="im"><br /> > timestamp without time zone -> datetime<br /><br /></div>And that would certainlybe wrong for other applications, because PostgreSQL<br /> doesn't have a datetime type.<br /><br /> > bytea ->image<br /><br /> And that we certainly can't do either.<br /><div class="im"><br /> > 2. character_maximum_lengthfield<br /> ><br /> > Code around: change value for text abd bytea types<br /> ><br /> >[text] 1073741823<br /><br /></div>(see next item)<br /><br /> > [bytea] 2147483647<br /><br /> But bytea is not acharacter type in the first place, so this value is<br /> meaningless.<br /><div class="im"><br /> > 3. character_octet_lengthshould always be double of<br /> > character_maximum_length (due to Unicode character size on Windowswhich is<br /> > 2).<br /><br /></div>We could do something like that if we exposed the maximum octet length ofa<br /> character per encoding. But what I wonder is whether this should reflect the<br /> server or the client encoding. How do your applications use this value?<br /><div class="im"><br /> > 4. datetime_precision field is not alwayscorrect<br /> ><br /> > Code around: change value of the fly, e.g. if value is not null then<br /> ><br />> [numeric] keep the value (ok)<br /> ><br /> > [bigint] set value to 19<br /> ><br /> > all other set to10<br /><br /></div>Why would numeric and bigint affect *datetime*_precision at all?<br /><div class="im"><br /> > 5.numeric_precision_radix field should always be equal to 10<br /><br /></div>Why?<br /><div class="im"><br /> > 6. datetime_precisionfield, minor changes<br /> ><br /> > Code around: change value on the fly, e.g.<br /> ><br />> [date] set value to zero<br /><br /></div>Makes sense. I think this is not correct at the moment.<br /><div class="im"><br/> > [datetime] set value to 3<br /><br /></div>Well, it really depends on what you set it to when you declaredthe column,<br /> no?<br /></blockquote></div><br />
On 5/23/09 7:37 PM, Konstantin Izmailov wrote: > Number 4 is actually numeric_precision (I typed incorrectly). My > recollection is that numeric_precision sometimes expressed in radix 2 > and it caused issues for Windows apps. > > I agree on other issues. I was curious if database can help OLEDB driver > (to make it simpler). Anyway it can emulate values for specific Windows > apps on the fly. Thank you! You could, of course, create your own ms_information_schema which had ms_friendly views of the IS. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus wrote: > On 5/23/09 7:37 PM, Konstantin Izmailov wrote: >> Number 4 is actually numeric_precision (I typed incorrectly). My >> recollection is that numeric_precision sometimes expressed in radix 2 >> and it caused issues for Windows apps. >> >> I agree on other issues. I was curious if database can help OLEDB driver >> (to make it simpler). Anyway it can emulate values for specific Windows >> apps on the fly. Thank you! > > You could, of course, create your own ms_information_schema which had > ms_friendly views of the IS. > This is what I have done for a past project. I do not think we should part from the standard SQL schema in order to satisfy a certain third party component. If the information_schema does not provide all the information, one could always query the pg_* tables for needed data. -- Regards, Gevik
On Sunday 24 May 2009 03:37:28 Konstantin Izmailov wrote: > Number 4 is actually numeric_precision (I typed incorrectly). My > recollection is that numeric_precision sometimes expressed in radix 2 and > it caused issues for Windows apps. It is measured in radix 2 for floating-point types and in radix 10 for fixed- point types.
Hi,<br />I'm not sure that it is related to information_schema but I wanted to let you know that some Postgres functionsare listed in pg_proc while others are not. For example, all Data Type Formatting function are in pg_proc (to_char,to_hex, ...). While several of the Date/Time Functions are not there (extract, localtime, ...).<br /><br />Why differentPostgres function are not equal???<br /><br />This causes issues to Windows integration as well.<br />Konstantin<br/>
Konstantin Izmailov <pgfizm@gmail.com> writes: > you know that some Postgres functions are listed in pg_proc while others are > not. For example, all Data Type Formatting function are in pg_proc (to_char, > to_hex, ...). While several of the Date/Time Functions are not there > (extract, localtime, ...). The ones that appear not to be there are ones that the SQL standard demands special weird syntax for. The grammar translates such calls to standard function calls to underlying functions, which usually are named a bit differently to avoid confusion. For instance extract(field from some_expr) becomes date_part('field', some_expr). If you want to know what all of these are, see the func_expr production in parser/gram.y. > This causes issues to Windows integration as well. Complain to the SQL standards committee, especially to those members who seem to think COBOL represented the apex of programming language syntax design :-( regards, tom lane
Konstantin Izmailov <pgfizm@gmail.com> writes:The ones that appear not to be there are ones that the SQL standard
> you know that some Postgres functions are listed in pg_proc while others are
> not. For example, all Data Type Formatting function are in pg_proc (to_char,
> to_hex, ...). While several of the Date/Time Functions are not there
> (extract, localtime, ...).
demands special weird syntax for. The grammar translates such calls
to standard function calls to underlying functions, which usually are
named a bit differently to avoid confusion. For instance
extract(field from some_expr) becomes date_part('field', some_expr).
If you want to know what all of these are, see the func_expr production
in parser/gram.y.Complain to the SQL standards committee, especially to those members
> This causes issues to Windows integration as well.
who seem to think COBOL represented the apex of programming language
syntax design :-(
regards, tom lane
Peter Eisentraut <peter_e@gmx.net> writes: > On Friday 22 May 2009 18:27:01 Konstantin Izmailov wrote: >> [ discussion of some details of information_schema results ] Are we going to do anything about any of these issues? In particular, >> 6. datetime_precision field, minor changes >> >> Code around: change value on the fly, e.g. >> >> [date] set value to zero > Makes sense. I think this is not correct at the moment. AFAICS, the SQL standard demands that precision and scale fields be non-null all the time for those data types where they make sense (this is encoded in the CHECK CONSTRAINTs that are declared for the various information-schema tables, see particularly 21.15 DATA_TYPE_DESCRIPTOR base table in SQL99). DATE is clearly wrong per spec, but it's not the only problem. Our interpretation has been to set these values to null if the typmod is defaulted, which is reasonable in the abstract but it's still a violation of spec. I wonder whether we should be inserting some large limit value instead. For the datetime types we actually have a perfectly good default precision value, namely six digits, if the DB is using integer datetimes --- and I don't think there's a strong argument not to use this value for float timestamps too. I'd prefer to avoid a catversion bump at this stage of the cycle, but it looks like any changes here would merely involve the bodies of some functions in information_schema.sql. I think we could just change them without a catversion bump. Any beta testers who actually care could easily insert the new definitions without an initdb. regards, tom lane
On Sun, May 31, 2009 at 11:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'd prefer to avoid a catversion bump at this stage of the cycle, > but it looks like any changes here would merely involve the bodies of > some functions in information_schema.sql. I think we could just change > them without a catversion bump. Any beta testers who actually care > could easily insert the new definitions without an initdb. Is this a regression relative to 8.3? If not, why not leave it for 8.5? It seems a little bit late in the game to be messing with this. I admit I don't have any particular reason to think it will break anything, but there are other minor patches floating around that meet that criteria too, so if this one then why not those? ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > Is this a regression relative to 8.3? If not, why not leave it for > 8.5? No. it's not a regression. What I was concerned about was mainly that it seemed likely to drop off the radar screen without being addressed at all, since most of the other things Konstantin complained about seemed to be we-won't-change-this items. I have no objection to a conscious decision to leave it alone for 8.4, so long as it gets memorialized as either a pending patch or a TODO item. regards, tom lane
On Sunday 31 May 2009 18:41:55 Tom Lane wrote: > AFAICS, the SQL standard demands that precision and scale fields be > non-null all the time for those data types where they make sense > (this is encoded in the CHECK CONSTRAINTs that are declared for the > various information-schema tables, see particularly 21.15 > DATA_TYPE_DESCRIPTOR base table in SQL99). DATE is clearly wrong > per spec, but it's not the only problem. The DATE change is the only thing I'd be prepared to make right now. > Our interpretation has been to set these values to null if the typmod > is defaulted, which is reasonable in the abstract but it's still a > violation of spec. I wonder whether we should be inserting some large > limit value instead. That is something to think about, but it needs more time. We also have some inconsistency there; for example we produce a large limit value for octet length. Needs more thought. And if we go down that route, it should also require less hardcoding of numbers into information_schema.sql.
Peter Eisentraut <peter_e@gmx.net> writes: > On Sunday 31 May 2009 18:41:55 Tom Lane wrote: >> AFAICS, the SQL standard demands that precision and scale fields be >> non-null all the time for those data types where they make sense >> (this is encoded in the CHECK CONSTRAINTs that are declared for the >> various information-schema tables, see particularly 21.15 >> DATA_TYPE_DESCRIPTOR base table in SQL99). DATE is clearly wrong >> per spec, but it's not the only problem. > The DATE change is the only thing I'd be prepared to make right now. At this point I think the clear decision is "we're not changing anything for 8.4". I've put the issue on the TODO list for future development cycles. regards, tom lane
On Monday 08 June 2009 07:12:33 Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > On Sunday 31 May 2009 18:41:55 Tom Lane wrote: > >> AFAICS, the SQL standard demands that precision and scale fields be > >> non-null all the time for those data types where they make sense > >> (this is encoded in the CHECK CONSTRAINTs that are declared for the > >> various information-schema tables, see particularly 21.15 > >> DATA_TYPE_DESCRIPTOR base table in SQL99). DATE is clearly wrong > >> per spec, but it's not the only problem. > > > > The DATE change is the only thing I'd be prepared to make right now. > > At this point I think the clear decision is "we're not changing anything > for 8.4". I've put the issue on the TODO list for future development > cycles. After gathering that there will probably be some other changes before release that will require an initdb (even without catversion bump), and after reexamining the issue, I think it's trivial and uncontroversial to fix the datetime issues: diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql index 9c5672f..cb0296a 100644 --- a/src/backend/catalog/information_schema.sql +++ b/src/backend/catalog/information_schema.sql @@ -160,12 +160,12 @@ CREATE FUNCTION _pg_datetime_precision(typid oid, typmod int4) RETURNS integer RETURNS NULL ON NULLINPUT AS$$SELECT - CASE WHEN $2 = -1 /* default typmod */ - THEN null + CASE WHEN $1 IN (1082) /* date */ + THEN 0 WHEN $1 IN (1083, 1114, 1184, 1266) /* time, timestamp, same + tz */ - THEN $2 + THEN CASE WHEN $2 = -1 THEN 6 ELSE $2 END WHEN $1 IN (1186) /* interval */ - THEN $2 & 65535 + THEN CASE WHEN $2 = -1 THEN 6 ELSE $2 & 65535 END ELSE null END$$; I have also prepared a patch that creates more realistic values for character_octet_length based on encoding information, which I will propose for 8.5. The issue of whether to report null or some large value for "unlimited" length data types needs some more thought.
Peter Eisentraut <peter_e@gmx.net> writes: > diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql > index 9c5672f..cb0296a 100644 > --- a/src/backend/catalog/information_schema.sql > +++ b/src/backend/catalog/information_schema.sql > @@ -160,12 +160,12 @@ CREATE FUNCTION _pg_datetime_precision(typid oid, typmod int4) RETURNS integer > RETURNS NULL ON NULL INPUT > AS > $$SELECT > - CASE WHEN $2 = -1 /* default typmod */ > - THEN null > + CASE WHEN $1 IN (1082) /* date */ > + THEN 0 > WHEN $1 IN (1083, 1114, 1184, 1266) /* time, timestamp, same + tz */ > - THEN $2 > + THEN CASE WHEN $2 = -1 THEN 6 ELSE $2 END > WHEN $1 IN (1186) /* interval */ > - THEN $2 & 65535 > + THEN CASE WHEN $2 = -1 THEN 6 ELSE $2 & 65535 END > ELSE null > END$$; Just for the record, this is a perfect example of why -u format sucks. How many people think they can look at this and know exactly what the change does? I'll be back to weigh in on the merits of the patch after I've converted it to -c format so I can understand it. regards, tom lane
Peter Eisentraut <peter_e@gmx.net> writes: > After gathering that there will probably be some other changes before > release that will require an initdb (even without catversion bump), and > after reexamining the issue, I think it's trivial and uncontroversial to > fix the datetime issues: For the benefit of anyone else finding this hard to read, I've reformatted as *** src/backend/catalog/information_schema.sql.orig Tue Feb 24 11:10:16 2009 --- src/backend/catalog/information_schema.sql Tue Jun 9 14:21:37 2009 *************** *** 160,171 **** RETURNS NULL ON NULL INPUT AS $$SELECT ! CASE WHEN $2 = -1 /* default typmod */ ! THEN null WHEN $1 IN (1083, 1114, 1184, 1266) /* time, timestamp, same + tz */ ! THEN $2 WHEN $1 IN (1186) /* interval */ ! THEN $2 & 65535 ELSE null END$$; --- 160,171 ---- RETURNS NULL ON NULL INPUT AS $$SELECT ! CASE WHEN $1 IN (1082) /* date */ ! THEN 0 WHEN $1 IN (1083, 1114, 1184, 1266) /* time, timestamp, same + tz */ ! THEN CASE WHEN $2 = -1 THEN 6 ELSE $2 END WHEN $1 IN (1186) /* interval */ ! THEN CASE WHEN $2 = -1 THEN 6 ELSE $2 & 65535 END ELSE null END$$; I would suggest coding the inner cases asCASE WHEN $2 < 0 THEN ...default... since the general practice in the C code is to treat any negative value as meaning "unspecified". Otherwise, +1. regards, tom lane
On Friday 22 May 2009 18:27:01 Konstantin Izmailov wrote: > 3. character_octet_length should always be double of > character_maximum_length (due to Unicode character size on Windows which is > 2). I have the attached patch that would make character_octet_length the product of character_octet_length and the maximum octet length of a single character in the selected server encoding. So for UTF-8, this would be factor 4. This doesn't exactly correspond to the behavior that you expect, but I think it's more correct overall anyway.
Peter Eisentraut <peter_e@gmx.net> writes: > I have the attached patch that would make character_octet_length the product > of character_octet_length and the maximum octet length of a single character > in the selected server encoding. So for UTF-8, this would be factor 4. This > doesn't exactly correspond to the behavior that you expect, but I think it's > more correct overall anyway. +1, but that new query isn't very schema-safe ... I think it needs a few "pg_catalog." qualifications. regards, tom lane
On Monday 06 July 2009 22:16:12 Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > I have the attached patch that would make character_octet_length the > > product of character_octet_length and the maximum octet length of a > > single character in the selected server encoding. So for UTF-8, this > > would be factor 4. This doesn't exactly correspond to the behavior that > > you expect, but I think it's more correct overall anyway. > > +1, but that new query isn't very schema-safe ... I think it needs a few > "pg_catalog." qualifications. Applied with fixes.