Thread: "box" type description
I believe the "box" type description is slightly incorrect: # \dT box Liste der Datentypen Schema │ Name │ Beschreibung ────────────┼──────┼────────────────────────────────────────── pg_catalog │ box │ geometric box '(lower left,upper right)' While the syntax '((3,4),(1,2))'::box works, the canonical spelling is '(3,4),(1,2)' and hence the description should be: geometric box '(lower left),(upper right)' Christoph
At Mon, 29 Mar 2021 22:44:29 +0200, Christoph Berg <myon@debian.org> wrote in > I believe the "box" type description is slightly incorrect: > > # \dT box > Liste der Datentypen > Schema │ Name │ Beschreibung > ────────────┼──────┼────────────────────────────────────────── > pg_catalog │ box │ geometric box '(lower left,upper right)' > > While the syntax '((3,4),(1,2))'::box works, the canonical spelling is > '(3,4),(1,2)' and hence the description should be: > geometric box '(lower left),(upper right)' Maybe the reason you think so is that a box is printed in that format. postgres=# select '((1,1),(2,2))'::box; box ------------- (2,2),(1,1) (1 row) It doesn't use the word "canonical", but the documentation is saying that it is the output format. So I think you're right in that point. https://www.postgresql.org/docs/13/datatype-geometric.html Table 8.20. Geometric Types point 16 bytes Point on a plane (x,y) line 32 bytes Infinite line {A,B,C} lseg 32 bytes Finite line segment ((x1,y1),(x2,y2)) box 32 bytes Rectangular box ((x1,y1),(x2,y2)) path 16+16n bytes Closed path (similar to polygon) ((x1,y1),...) path 16+16n bytes Open path [(x1,y1),...] polygon 40+16n bytes Polygon (similar to closed path) ((x1,y1),...) circle 24 bytes Circle <(x,y),r> (center point and radius) Similary, lseg seems inconsistent... (It is correctly described in later sections.) select '(1,1),(2,2)'::lseg; => [(1,1),(2,2)] Surely it would be better that the documentation is consistent with the output function. Perhaps we prefer to fix documentation rather than to fix implementation to give impacts on applications that may exist. (I don't like the notation since the representation of box doesn't look like one object, though..) Returing to the description of pg_types, it should be changed like this following the discussion here. - pg_catalog | box | geometric box '(lower left,upper right)' + pg_catalog | box | geometric box 'lower left,upper right' But I find it hard to read. I fixed it instead as the following in the attached. However, it might rather be better not changing it.. + pg_catalog | box | geometric box 'pt-lower-left,pt-upper-right' I added a space after commas, since point has it and (I think) it is easier to read having the ones. Is there any opinions? regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml index 7c341c8e3f..e118a689c8 100644 --- a/doc/src/sgml/datatype.sgml +++ b/doc/src/sgml/datatype.sgml @@ -3264,13 +3264,13 @@ SELECT person.name, holidays.num_weeks FROM person, holidays <entry><type>lseg</type></entry> <entry>32 bytes</entry> <entry>Finite line segment</entry> - <entry>((x1,y1),(x2,y2))</entry> + <entry>[(x1,y1),(x2,y2)]</entry> </row> <row> <entry><type>box</type></entry> <entry>32 bytes</entry> <entry>Rectangular box</entry> - <entry>((x1,y1),(x2,y2))</entry> + <entry>(x1,y1),(x2,y2)</entry> </row> <row> <entry><type>path</type></entry> diff --git a/src/include/catalog/pg_type.dat b/src/include/catalog/pg_type.dat index 8c145c00be..42adc184d7 100644 --- a/src/include/catalog/pg_type.dat +++ b/src/include/catalog/pg_type.dat @@ -185,24 +185,24 @@ typinput => 'point_in', typoutput => 'point_out', typreceive => 'point_recv', typsend => 'point_send', typalign => 'd' }, { oid => '601', array_type_oid => '1018', - descr => 'geometric line segment \'(pt1,pt2)\'', + descr => 'geometric line segment \'[pt1, pt2]\'', typname => 'lseg', typlen => '32', typbyval => 'f', typcategory => 'G', typsubscript => 'raw_array_subscript_handler', typelem => 'point', typinput => 'lseg_in', typoutput => 'lseg_out', typreceive => 'lseg_recv', typsend => 'lseg_send', typalign => 'd' }, { oid => '602', array_type_oid => '1019', - descr => 'geometric path \'(pt1,...)\'', + descr => 'geometric path \'(pt1, ...)\'', typname => 'path', typlen => '-1', typbyval => 'f', typcategory => 'G', typinput => 'path_in', typoutput => 'path_out', typreceive => 'path_recv', typsend => 'path_send', typalign => 'd', typstorage => 'x' }, { oid => '603', array_type_oid => '1020', - descr => 'geometric box \'(lower left,upper right)\'', + descr => 'geometric box \'pt-lower-left, pt-upper-right\'', typname => 'box', typlen => '32', typbyval => 'f', typcategory => 'G', typdelim => ';', typsubscript => 'raw_array_subscript_handler', typelem => 'point', typinput => 'box_in', typoutput => 'box_out', typreceive => 'box_recv', typsend => 'box_send', typalign => 'd' }, { oid => '604', array_type_oid => '1027', - descr => 'geometric polygon \'(pt1,...)\'', + descr => 'geometric polygon \'(pt1, ...)\'', typname => 'polygon', typlen => '-1', typbyval => 'f', typcategory => 'G', typinput => 'poly_in', typoutput => 'poly_out', typreceive => 'poly_recv', typsend => 'poly_send', typalign => 'd', typstorage => 'x' },
Re: Kyotaro Horiguchi > Returing to the description of pg_types, it should be changed like > this following the discussion here. > > - pg_catalog | box | geometric box '(lower left,upper right)' > + pg_catalog | box | geometric box 'lower left,upper right' > > But I find it hard to read. I fixed it instead as the following in the > attached. However, it might rather be better not changing it.. > > + pg_catalog | box | geometric box 'pt-lower-left,pt-upper-right' I like that because it points to the "point" syntax so users can figure out how to spell a box. Christoph
On Wed, Mar 31, 2021 at 01:43:47PM +0200, Christoph Berg wrote: > Re: Kyotaro Horiguchi > > Returing to the description of pg_types, it should be changed like > > this following the discussion here. > > > > - pg_catalog | box | geometric box '(lower left,upper right)' > > + pg_catalog | box | geometric box 'lower left,upper right' > > > > But I find it hard to read. I fixed it instead as the following in the > > attached. However, it might rather be better not changing it.. > > > > + pg_catalog | box | geometric box 'pt-lower-left,pt-upper-right' > > I like that because it points to the "point" syntax so users can > figure out how to spell a box. I liked Horiguchi-san's patch from 2021, but once I started looking further, I found a number of improvements that can be made in the \dTS output beyond Horiguchi-san's changes: * boolean outputs 't'/'f', not 'true'/'false' * Added "format" ... for output * tid output format was at the start, not the end * I didn't add space between point x,y because the output has no space * I spelled out "point" instead of "pt" * "line" has two very different input formats so I listed both (more different than others like boolean) * I didn't see the need to say "datatype" for LSN and UUID * I improved the txid_snapshot description * There was no description for table_am_handler Patch attached. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Attachment
At Wed, 1 Nov 2023 11:36:01 -0400, Bruce Momjian <bruce@momjian.us> wrote in > On Wed, Mar 31, 2021 at 01:43:47PM +0200, Christoph Berg wrote: > > Re: Kyotaro Horiguchi > > I like that because it points to the "point" syntax so users can > > figure out how to spell a box. > > I liked Horiguchi-san's patch from 2021, but once I started looking > further, I found a number of improvements that can be made in the \dTS > output beyond Horiguchi-san's changes: > > * boolean outputs 't'/'f', not 'true'/'false' > * Added "format" ... for output > * tid output format was at the start, not the end > * I didn't add space between point x,y because the output has no space > * I spelled out "point" instead of "pt" > * "line" has two very different input formats so I listed both > (more different than others like boolean) > * I didn't see the need to say "datatype" for LSN and UUID > * I improved the txid_snapshot description > * There was no description for table_am_handler > > Patch attached. Thank you for continuing this. The additional changes looks fine. Upon reviewing the table again in this line, further potential improvements and issues have been found. For example: character, varchar: don't follow the rule. - 'char(length)' blank-padded string, fixed storage length + blank-padded string, fixed storage length, format 'char(length)' interval: doesn't follow the rule. - @ <number> <units>, time interval + time interval, format '[@] <number> <units>' (I think '@' is not necessary here..) pg_snapshot: The description given is just "snapshot", which seems overly simplistic. txid_snapshot: The description reads "transaction snapshot". Is this really accurate, especially in contrast with pg_snapshot? pg_brin_bloom_summary, pg_brin_minmax_multi_summary, pg_mcv_list and many: I'm uncertain whether these types, which lack an input syntax (but have an output format), qualify as pseudo-types. Nevertheless, I believe it would be beneficial to describe that those types differ from ordinary types. Should we consider refining these descriptions in the table? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, Nov 2, 2023 at 05:28:20PM +0900, Kyotaro Horiguchi wrote: > Thank you for continuing this. The additional changes looks > fine. > > Upon reviewing the table again in this line, further potential > improvements and issues have been found. For example: > > character, varchar: don't follow the rule. > - 'char(length)' blank-padded string, fixed storage length > + blank-padded string, fixed storage length, format 'char(length)' So, char() and varchar() are _definition_ synonyms for characater and character varying, so I put the way you define them at the _front_ of the text. The "format" is the _output_ format and I put that at the end for other types. I put numeric() at the front too since its definition is complex. (I now see numeric should be "precision, scale" so I fixed that.) > interval: doesn't follow the rule. > - @ <number> <units>, time interval > + time interval, format '[@] <number> <units>' > (I think '@' is not necessary here..) Agreed, '@' is optional so removed, and I added "...". > pg_snapshot: > > The description given is just "snapshot", which seems overly simplistic. > > txid_snapshot: > > The description reads "transaction snapshot". Is this really > accurate, especially in contrast with pg_snapshot? Uh, the docs have for txid_snapshot: user-level transaction ID snapshot (deprecated; see <type>pg_snapshot</type>)< Do we want to add "deprecated" in the output. > pg_brin_bloom_summary, pg_brin_minmax_multi_summary, pg_mcv_list and many: > > I'm uncertain whether these types, which lack an input syntax (but > have an output format), qualify as pseudo-types. Nevertheless, I > believe it would be beneficial to describe that those types differ > from ordinary types. Good point, now labeled as pseudo-types. Updated patch attached. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Attachment
On Thu, Nov 2, 2023 at 04:12:57PM -0400, Bruce Momjian wrote: > > pg_brin_bloom_summary, pg_brin_minmax_multi_summary, pg_mcv_list and many: > > > > I'm uncertain whether these types, which lack an input syntax (but > > have an output format), qualify as pseudo-types. Nevertheless, I > > believe it would be beneficial to describe that those types differ > > from ordinary types. > > Good point, now labeled as pseudo-types. > > Updated patch attached. Patch applied to master. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.