Thread: [PATCH] bugfix for int2vectorin
I believe the int2vectorin function should handle invalid input more cleanly.
Current behavior:
Behavior after attached patch:
Caleb
Current behavior:
-- Correct input format:
SELECT '1 2 3'::int2vector;
int2vector
------------
1 2 3
(1 row)
-- Three variations on incorrect input format
SELECT '1not 2really_an 3int2vector'::int2vector;
int2vector
------------
1 2 3
(1 row)
SELECT '1,2,3'::int2vector;
int2vector
------------
1
(1 row)
SELECT '1, 2,3'::int2vector;
int2vector
------------
1 2
(1 row)
-- What other types do:
SELECT '1,2'::oid;
ERROR: invalid input syntax for type oid: "1,2"
Behavior after attached patch:
-- Correct input format:Regards,
SELECT '1 2 3'::int2vector;
int2vector
------------
1 2 3
(1 row)
-- Three variations on incorrect input format
SELECT '1not 2really_an 3int2vector'::int2vector;
ERROR: invalid input for int2vector: "1not 2really_an 3int2vector"
SELECT '1,2,3'::int2vector;
ERROR: invalid input for int2vector: "1,2,3"
SELECT '1, 2,3'::int2vector;
ERROR: invalid input for int2vector: "1, 2,3"
-- What other types do:
SELECT '1,2'::oid;
ERROR: invalid input syntax for type oid: "1,2"
Caleb
Attachment
Caleb Welton <cwelton@greenplum.com> writes: > I believe the int2vectorin function should handle invalid input more cleanly. Why bother? Under what circumstances would users (or anyone at all) be putting data into an int2vector? It's an internal type and is certainly not meant for use in user tables. regards, tom lane
<font face="Calibri, Verdana, Helvetica, Arial"><span style="font-size:11pt">Tom,<br /> Thanks for the comments.<br /><br/> Caleb Welton <<a href="cwelton@greenplum.com">cwelton@greenplum.com</a>> writes:<br /> > I believe the int2vectorinfunction should handle invalid input more cleanly.<br /><br /> On 12/1/09 7:38 PM, "Tom Lane" <<a href="tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>>wrote:<br /> >> Why bother? <br /></span></font><blockquote><fontface="Calibri, Verdana, Helvetica, Arial"><span style="font-size:11pt"><br /></span></font></blockquote><fontface="Calibri, Verdana, Helvetica, Arial"><span style="font-size:11pt">Because correctnessis good. Results that produce unexpected results from incorrect input formatting lead to incorrect data beingloaded, which is bad.<br /><br /> >> Under what circumstances would users (or anyone at all) be putting data intoan int2vector? <br /></span></font><blockquote><font face="Calibri, Verdana, Helvetica, Arial"><span style="font-size:11pt"><br/></span></font></blockquote><font face="Calibri, Verdana, Helvetica, Arial"><span style="font-size:11pt">Iagree in principle, but users do the darnedest things. Perhaps they simply want a compact int arraythat won't toast. Perhaps they liked that word "vector" in the name. I stopped trying to second guess them a longtime ago. <br /><br /> >> It's an internal type and is certainly not meant for use in user tables.<br /></span></font><blockquote><fontface="Calibri, Verdana, Helvetica, Arial"><span style="font-size:11pt"><br /></span></font></blockquote><fontface="Calibri, Verdana, Helvetica, Arial"><span style="font-size:11pt">But there is nothingpreventing it from being used in user tables so calling it "internal" doesn't really mean a whole lot.<br /><br />What exactly is your objection to having the int2arrayin parser handle its input conversion reasonably?<br /><br /> -Caleb</span></font>
On Tue, Dec 1, 2009 at 10:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Caleb Welton <cwelton@greenplum.com> writes: >> I believe the int2vectorin function should handle invalid input more cleanly. > > Why bother? Under what circumstances would users (or anyone at all) > be putting data into an int2vector? It's an internal type and is > certainly not meant for use in user tables. Maybe it's just me, but this seems like an unnecessarily dismissive response to a report of some rather odd-looking behavior. ...Robert
Caleb Welton <cwelton@greenplum.com> writes: > On 12/1/09 7:38 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: >> Under what circumstances would users (or anyone at all) be putting data into an int2vector? > What exactly is your objection to having the int2arrayin parser handle its input conversion reasonably? I'm trying to gauge what the actual use-case is for having a slightly nicer error behavior. The proposed patch adds another translatable error string, which is no skin off my own nose but does create ongoing work for our translation team. And presumably, if we're going to fix this, we ought to fix the about-equally-stupid parsing logic in oidvectorin. While we're at it, should we trouble to detect overflow in int2vectorin? You could spend quite a bit of time and code making these functions more bulletproof, but I'm not convinced it's worth any work. regards, tom lane
New patch attached:
1. Does not add a new error message (though the pg_atoi's error message is a little goofy looking).
2. Handles int2 overflow cases.
3. oidvectorin does NOT suffer from the same problems as int2vectorin, someone already fixed it.
As for the use-case I'm not completely sure... I'm not an end-user, I'm just responding to a bug report.
My stance here is that returning an error (even a bad error) on trying to convert data in is better
doing something "wrong" with bogus input. In the first case a user scratches their head, maybe
files a bug report, you tell them the correct syntax and they go on. In the second case they input
a bunch of data and then start complaining about "data corruption", "loss of data", etc. and the
support case is 100x worse.
The amount of code we are talking about here is less than 5 lines of code...
Regards,
Caleb
On 12/1/09 9:24 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
1. Does not add a new error message (though the pg_atoi's error message is a little goofy looking).
2. Handles int2 overflow cases.
3. oidvectorin does NOT suffer from the same problems as int2vectorin, someone already fixed it.
As for the use-case I'm not completely sure... I'm not an end-user, I'm just responding to a bug report.
My stance here is that returning an error (even a bad error) on trying to convert data in is better
doing something "wrong" with bogus input. In the first case a user scratches their head, maybe
files a bug report, you tell them the correct syntax and they go on. In the second case they input
a bunch of data and then start complaining about "data corruption", "loss of data", etc. and the
support case is 100x worse.
The amount of code we are talking about here is less than 5 lines of code...
Regards,
Caleb
On 12/1/09 9:24 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
Caleb Welton <cwelton@greenplum.com> writes:
> On 12/1/09 7:38 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> Under what circumstances would users (or anyone at all) be putting data into an int2vector?
> What exactly is your objection to having the int2arrayin parser handle its input conversion reasonably?
I'm trying to gauge what the actual use-case is for having a slightly
nicer error behavior. The proposed patch adds another translatable
error string, which is no skin off my own nose but does create ongoing
work for our translation team. And presumably, if we're going to fix
this, we ought to fix the about-equally-stupid parsing logic in oidvectorin.
While we're at it, should we trouble to detect overflow in int2vectorin?
You could spend quite a bit of time and code making these functions more
bulletproof, but I'm not convinced it's worth any work.
regards, tom lane
Attachment
On Wed, Dec 2, 2009 at 2:59 PM, Caleb Welton <cwelton@greenplum.com> wrote: > New patch attached: > > 1. Does not add a new error message (though the pg_atoi's error message is a > little goofy looking). > 2. Handles int2 overflow cases. > 3. oidvectorin does NOT suffer from the same problems as int2vectorin, > someone already fixed it. > > As for the use-case I'm not completely sure... I'm not an end-user, I'm just > responding to a bug report. > > My stance here is that returning an error (even a bad error) on trying to > convert data in is better > doing something "wrong" with bogus input. In the first case a user > scratches their head, maybe > files a bug report, you tell them the correct syntax and they go on. In the > second case they input > a bunch of data and then start complaining about "data corruption", "loss of > data", etc. and the > support case is 100x worse. > > The amount of code we are talking about here is less than 5 lines of code... I have scrutinized the latest version of this patch and I feel that it is a modest improvement on the status quo and that there is really no downside. Absent strong objections, I will commit it later this week. ...Robert
On Mon, Dec 28, 2009 at 3:22 AM, Robert Haas <robertmhaas@gmail.com> wrote: > I have scrutinized the latest version of this patch and I feel that it > is a modest improvement on the status quo and that there is really no > downside. Absent strong objections, I will commit it later this week. Committed. ...Robert