Thread: Add error-checking to timestamp_recv
Greetings, The attached patch adds some error-checking to the timestamp_recv function so that it's not possible to put an invalid timestamp into a timestamp column (hopefully). The check is done in basically the exact same way the out-of-bounds check in timestamp2tm is done. There's probably an easier/cleaner way but this should work or at least generate discussion and a better patch. :) Thanks, Stephen
Attachment
Would you show an example of the invalid value this is trying to avoid? --------------------------------------------------------------------------- Stephen Frost wrote: > Greetings, > > The attached patch adds some error-checking to the timestamp_recv > function so that it's not possible to put an invalid timestamp into a > timestamp column (hopefully). The check is done in basically the > exact same way the out-of-bounds check in timestamp2tm is done. > There's probably an easier/cleaner way but this should work or at > least generate discussion and a better patch. :) > > Thanks, > > Stephen [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
* Bruce Momjian (pgman@candle.pha.pa.us) wrote: > Would you show an example of the invalid value this is trying to avoid? Well, the way I discovered the problem was by sending a timestamp in double format when the server was expecting one in int64 format. This is when using the binary data method for timestamps. I'll generate a small example program/schema later today and post it to the list. Stephen > --------------------------------------------------------------------------- > > Stephen Frost wrote: > > Greetings, > > > > The attached patch adds some error-checking to the timestamp_recv > > function so that it's not possible to put an invalid timestamp into a > > timestamp column (hopefully). The check is done in basically the > > exact same way the out-of-bounds check in timestamp2tm is done. > > There's probably an easier/cleaner way but this should work or at > > least generate discussion and a better patch. :) > > > > Thanks, > > > > Stephen > > [ Attachment, skipping... ] > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 6: Have you searched our list archives? > > > > http://archives.postgresql.org > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Attachment
Stephen Frost <sfrost@snowman.net> writes: > * Bruce Momjian (pgman@candle.pha.pa.us) wrote: >> Would you show an example of the invalid value this is trying to avoid? > Well, the way I discovered the problem was by sending a timestamp in > double format when the server was expecting one in int64 format. Most of the time, though, this sort of error would still yield a valid value that just failed to represent the timestamp value you wanted. I'm unsure that a range check is going to help much. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > * Bruce Momjian (pgman@candle.pha.pa.us) wrote: > >> Would you show an example of the invalid value this is trying to avoid? > > > Well, the way I discovered the problem was by sending a timestamp in > > double format when the server was expecting one in int64 format. > > Most of the time, though, this sort of error would still yield a valid > value that just failed to represent the timestamp value you wanted. > I'm unsure that a range check is going to help much. I'm not sure I agree about the 'most of the time' comment- I havn't done extensive tests yet but I wouldn't be at all suprised if most of the time range around the current date, when stored as a double and passed to the database which is expecting an int64, would cause the problem. The issue isn't about the wrong date being shown or anything, it's that the database accepts the value but then throws errors whenever you try to view the table. The intent of the patch was to use the exact same logic to test if the date is valid on the incoming side as is used to test the date before displaying it. This would hopefully make it impossible for someone to run into this problem in the future. It would also make it much clearer to the programmer that the date he's passing is bad and not that there's some corruption happening in the database after the date is accepted. Stephen
Attachment
Stephen Frost wrote: -- Start of PGP signed section. > * Bruce Momjian (pgman@candle.pha.pa.us) wrote: > > Would you show an example of the invalid value this is trying to avoid? > > Well, the way I discovered the problem was by sending a timestamp in > double format when the server was expecting one in int64 format. This > is when using the binary data method for timestamps. I'll generate a > small example program/schema later today and post it to the list. So you are passing the data via binary COPY or a C function? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
* Bruce Momjian (pgman@candle.pha.pa.us) wrote: > Stephen Frost wrote: > -- Start of PGP signed section. > > * Bruce Momjian (pgman@candle.pha.pa.us) wrote: > > > Would you show an example of the invalid value this is trying to avoid? > > > > Well, the way I discovered the problem was by sending a timestamp in > > double format when the server was expecting one in int64 format. This > > is when using the binary data method for timestamps. I'll generate a > > small example program/schema later today and post it to the list. > > So you are passing the data via binary COPY or a C function? Sorry I wasn't clear, it's using libpq and binary data using an 'insert' statement. The code looks something like this: PQexec(connection,"prepare addrow (timestamp) as insert into mytable values ($1)"); lengths[0] = sizeof(double); formats[0] = 1; *(double*)(values[0]) = tv_sec - ((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_DATE) * 86400) + (tv_sec / 1000000.00); PQexecPrepared(connection,"addrow",1,(void*)values,lengths,formats,0); While the new code is something like: int64_t pg_timestamp; PQexec(connection,"prepare addrow (timestamp) as insert into mytable values ($1)"); lengths[0] = sizeof(int64_t); formats[0] = 1; pg_timestamp = ((tv_sec - ((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * 86400)) * (int64_t)1000000) + tv_usec; *(int64_t*)(values[0]) = bswap_64(pg_timestamp); PQexecPrepared(connection,"addrow",1,(void*)values,lengths,formats,0); I'll see about writing up a proper test case/schema. Looks like I'm probably most of the way there at this point, really. ;) Stephen
Attachment
Stephen Frost wrote: > > So you are passing the data via binary COPY or a C function? > > Sorry I wasn't clear, it's using libpq and binary data using an 'insert' > statement. The code looks something like this: > > PQexec(connection,"prepare addrow (timestamp) as insert into mytable > values ($1)"); > lengths[0] = sizeof(double); > formats[0] = 1; > *(double*)(values[0]) = tv_sec - ((POSTGRES_EPOCH_JDATE - > UNIX_EPOCH_DATE) * 86400) + (tv_sec / 1000000.00); > PQexecPrepared(connection,"addrow",1,(void*)values,lengths,formats,0); > > While the new code is something like: > > int64_t pg_timestamp; > PQexec(connection,"prepare addrow (timestamp) as insert into mytable > values ($1)"); > lengths[0] = sizeof(int64_t); > formats[0] = 1; > pg_timestamp = ((tv_sec - ((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * > 86400)) * (int64_t)1000000) + tv_usec; > *(int64_t*)(values[0]) = bswap_64(pg_timestamp); > PQexecPrepared(connection,"addrow",1,(void*)values,lengths,formats,0); > > I'll see about writing up a proper test case/schema. Looks like I'm > probably most of the way there at this point, really. ;) I wasn't aware you could throw binary values into the timestamp fields like that. I thought you needed to use a C string for the value. Does PREPARE bypass that for some reason? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
* Bruce Momjian (pgman@candle.pha.pa.us) wrote: > Stephen Frost wrote: > > I'll see about writing up a proper test case/schema. Looks like I'm > > probably most of the way there at this point, really. ;) > > I wasn't aware you could throw binary values into the timestamp fields > like that. I thought you needed to use a C string for the value. > > Does PREPARE bypass that for some reason? I don't think so.. As I recall, I think I might have had it set up w/o using a prepare before and it was working but I'm not sure. It's certainly very useful and lets me bypass *alot* of overhead (converting to a string and then making the database convert back...). The one complaint I do have is that I don't see a way to pass a timestamp w/ an explicit timezone in binary format into a table which has a 'timestamp with timezone' field. I can pass a binary timestamp into a 'timestamp with timezone' field, but it's interpreted as UTC or the local timezone (can't remember which atm). Stephen
Attachment
Stephen Frost wrote: -- Start of PGP signed section. > * Bruce Momjian (pgman@candle.pha.pa.us) wrote: > > Stephen Frost wrote: > > > I'll see about writing up a proper test case/schema. Looks like I'm > > > probably most of the way there at this point, really. ;) > > > > I wasn't aware you could throw binary values into the timestamp fields > > like that. I thought you needed to use a C string for the value. > > > > Does PREPARE bypass that for some reason? > > I don't think so.. As I recall, I think I might have had it set up w/o > using a prepare before and it was working but I'm not sure. It's > certainly very useful and lets me bypass *alot* of overhead > (converting to a string and then making the database convert back...). Considering all the other things the database is doing, I can't imagine that would be a measurable improvement. > The one complaint I do have is that I don't see a way to pass a > timestamp w/ an explicit timezone in binary format into a table which > has a 'timestamp with timezone' field. I can pass a binary timestamp > into a 'timestamp with timezone' field, but it's interpreted as UTC or > the local timezone (can't remember which atm). I still do not understand how this is working. It must be using our fast path as part of prepare. What language is you client code? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
* Bruce Momjian (pgman@candle.pha.pa.us) wrote: > Considering all the other things the database is doing, I can't imagine > that would be a measurable improvement. It makes it easier on my client program too which is listening to an ethernet interface and trying to process all of the packets coming in off of it and putting timestamps and header information into the database. The table in the database doesn't have any constraints or primary keys on it or anything, pretty much as simple as I could make it. :) > > The one complaint I do have is that I don't see a way to pass a > > timestamp w/ an explicit timezone in binary format into a table which > > has a 'timestamp with timezone' field. I can pass a binary timestamp > > into a 'timestamp with timezone' field, but it's interpreted as UTC or > > the local timezone (can't remember which atm). > > I still do not understand how this is working. It must be using our > fast path as part of prepare. What language is you client code? It's just plain ol' C. It's a pretty short/simple program, really. It uses libpcap to listen to the interface, checks the type of packet (ethernet, IP, UDP/TCP, etc), copies the binary header values into the structure which it then passes to PQexecPrepared. It's kind of amazing under 2.6, you can actually calculate the delay and bandwidth pretty accurately through a network (7 'backbone' nodes, each with a backbone router, an edge router, and an access router, all in a lab) by listening on two interfaces, one on each side to calculate one-way propagation time. Stephen
Attachment
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I wasn't aware you could throw binary values into the timestamp fields > like that. I thought you needed to use a C string for the value. This facility was added in 7.4 as part of the wire-protocol overhaul. It's nothing directly to do with PREPARE; you could get the same result with no prepared statement using PQexecParams. regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > I wasn't aware you could throw binary values into the timestamp fields > > like that. I thought you needed to use a C string for the value. > > This facility was added in 7.4 as part of the wire-protocol overhaul. > It's nothing directly to do with PREPARE; you could get the same result > with no prepared statement using PQexecParams. Ah, no wonder I had not seen that before. So, I guess the issue is how much error checking do we want to have for these binary values. I was a little disturbed to hear he could insert data he couldn't later view. How many datatype have this issue? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
* Bruce Momjian (pgman@candle.pha.pa.us) wrote: > Tom Lane wrote: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > I wasn't aware you could throw binary values into the timestamp fields > > > like that. I thought you needed to use a C string for the value. > > > > This facility was added in 7.4 as part of the wire-protocol overhaul. > > It's nothing directly to do with PREPARE; you could get the same result > > with no prepared statement using PQexecParams. > > Ah, no wonder I had not seen that before. So, I guess the issue is how > much error checking do we want to have for these binary values. I was a > little disturbed to hear he could insert data he couldn't later view. > How many datatype have this issue? I don't think that many do.. A number of them already check incoming values where it's possible for them to not be valid. For example, 'macaddr' accepts all possible binary values, 'inet' does error checking on input. Binary timestamps were the only place I found in the work I was doing where this could happen and I managed to mess up most of the fields in one way or another before I figured it all out. :) Stephen
Attachment
Stephen Frost <sfrost@snowman.net> writes: >> How many datatype have this issue? > I don't think that many do.. A number of them already check incoming > values where it's possible for them to not be valid. In general we do check incoming binary values to ensure minimal validity. I think when I did timestamp_recv I was thinking it was just like int8 or float8 (respectively), in that any bit pattern is potentially legal; I had forgotten about the range restrictions. I haven't looked at the details of Stephen's patch (and can't till the archives site comes back up) but the idea is probably sound. regards, tom lane
I wrote: > In general we do check incoming binary values to ensure minimal > validity. I think when I did timestamp_recv I was thinking it was > just like int8 or float8 (respectively), in that any bit pattern is > potentially legal; I had forgotten about the range restrictions. > I haven't looked at the details of Stephen's patch (and can't till the > archives site comes back up) but the idea is probably sound. Having looked at it, I don't like the patch as-is; it misses timestamptz_recv and doesn't handle the boundary condition correctly for the HasCTZSet case. However the details of the latter are likely to change completely once we finish adopting src/timezone. I'll make a note to do something with this issue after the TZ patch is in. regards, tom lane
I said: > I'll make a note to do something with this issue after the TZ patch > is in. I've applied a patch to take care of this problem. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > I said: > > I'll make a note to do something with this issue after the TZ patch > > is in. > > I've applied a patch to take care of this problem. Great, thanks, much appriciated. I'll test once 7.5 goes beta. Stephen