Thread: How to setup default value "0000-00-00" for "date" type under PostgreSQL?
Hello all,

I have a question about "date" & "timestamp" types in PostgreSQL. I want to set up the default values '0000-00-00' and '0000-00-00 00:00:00' for them. However, it seems that PostgreSQL does not support it. Could someone help me please?

The example table:

T1 (col1 varchar(7) not null,
    col2 varchar(4) not null,
    col3 date not null,
    col4 varchar(3),
    primary key(col1, col2, col3)
)

In my design model, "col3" has to be part of the primary key. At the beginning of the data population we do not know the value of "col3"; values for "col3" are input through the GUI. Therefore, when I used MySQL, the default value I gave was "0000-00-00". However, after migrating to PostgreSQL, I can no longer set the default value to "0000-00-00".

Could somebody help me with this please? I really want to know how I can save '0000-00-00' as the default value for "date" and "timestamp" types.

Thanks a lot!
Emi Lu
Re: How to setup default value "0000-00-00" for "date" type under PostgreSQL?
From: Karsten Hilbert
> I really
> want to know how I can save '0000-00-00' as the default value for "date"
> and "timestamp" types.

You can not. They are invalid dates.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Wed, 18 Aug 2004, Emi Lu wrote:

> Hello all,
>
> I have a question about "date" & "timestamp" types in PostgreSQL. I want to
> setup the default value '0000-00-00' and "0000-00-00 00:00:00" for them.
> However, it seems that PostgreSQL does not support it. Could someone help me
> please?

How about using NULL? You say the correct value is still undefined, so NULL should be right on the spot.

Greetings
Christian

--
Christian Kratzer                       ck@cksoft.de
CK Software GmbH                        http://www.cksoft.de/
Phone: +49 7452 889 135                 Fax: +49 7452 889 136
Emi Lu wrote:
> Hello all,
>
> I have a question about "date" & "timestamp" types in PostgreSQL. I want
> to setup the default value '0000-00-00' and "0000-00-00 00:00:00" for
> them. However, it seems that PostgreSQL does not support it. Could
> someone help me please?

PostgreSQL doesn't and almost certainly never will support "0000-00-00" as a date. That's because it isn't a valid date. You also can't store 13.723, "Hello world" or (12,13) in a date column.

Where you don't have a valid date to store you should use NULL. This business of storing zeroes is a horrible MySQL design mistake.

> The example table:
>
> T1 (col1 varchar(7) not null,
>     col2 varchar(4) not null,
>     col3 date not null,
>     col4 varchar(3),
>     primary key(col1, col2, col3)
> )
>
>
> In my design model, "col3" has to be one of the primary key part. Since
> at the beginning of the data population, we do not know the value of
> "col3"; values for "col3" are input through GUI.

If you don't know the value of col3, it can't be part of your primary key. That's part of the definition of "primary key", and trying to work around it is what's causing you problems here.

If you have an initial population of data that then needs to have "col3" set, then it sounds like you need two tables: T0 with primary key (col1,col2) and T1 with (col1,col2,col3), copying from T0=>T1 as users supply values. Difficult to say without knowing more about your situation.

HTH
--
Richard Huxton
Archonet Ltd
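Richard's two-table idea could look roughly like the sketch below. The column definitions come from Emi's example table; the lowercase table names, the sample key values and the date are made up for illustration, and whether the staging row should be deleted after the copy depends on the application.

```sql
-- Staging table: rows whose col3 is not known yet (hypothetical name t0).
CREATE TABLE t0 (
    col1 varchar(7) NOT NULL,
    col2 varchar(4) NOT NULL,
    col4 varchar(3),
    PRIMARY KEY (col1, col2)
);

-- Final table: col3 is always known here, so it can be part of the key.
CREATE TABLE t1 (
    col1 varchar(7) NOT NULL,
    col2 varchar(4) NOT NULL,
    col3 date       NOT NULL,
    col4 varchar(3),
    PRIMARY KEY (col1, col2, col3)
);

-- When the user supplies col3 through the GUI, copy the row across.
-- The key values and the date are sample data, not from the thread.
INSERT INTO t1 (col1, col2, col3, col4)
SELECT col1, col2, DATE '2004-08-20', col4
  FROM t0
 WHERE col1 = 'A000001'
   AND col2 = 'B001';
```

With this layout no placeholder date is needed: a row simply does not exist in t1 until its col3 is known.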
Hi,

On Fri, 20 Aug 2004, Richard Huxton wrote:

> Emi Lu wrote:
>> Hello all,
>>
>> I have a question about "date" & "timestamp" types in PostgreSQL. I want
>> to setup the default value '0000-00-00' and "0000-00-00 00:00:00" for
>> them. However, it seems that PostgreSQL does not support it. Could someone
>> help me please?
>
> PostgreSQL doesn't and almost certainly never will support "0000-00-00" as a
> date. That's because it isn't a valid date. You also can't store 13.723,
> "Hello world" or (12,13) in a date column either.
>
> Where you don't have a valid date to store you should use NULL. This business
> of storing zeroes is a horrible MySQL design mistake.

Which is because, the last time I used MySQL, it did not support NULLs in indexed columns (at least not in MyISAM tables). The workaround was to use something else, like 0, to represent undefined values... Horrible...

Greetings
Christian

--
Christian Kratzer                       ck@cksoft.de
CK Software GmbH                        http://www.cksoft.de/
Phone: +49 7452 889 135                 Fax: +49 7452 889 136
Richard Huxton wrote:
> Where you don't have a valid date to store you should use NULL. This
> business of storing zeroes is a horrible MySQL design mistake.

Well, yes and no. It certainly is a design mistake and introduces inconsistency into the database, but after being bitten several times by NULL values I'd go for a solution like this any day. Let me explain.

We had a table of messages, which was inserted into at random, and every few minutes we'd walk through the unprocessed messages and perform some work on them. Trying to keep the database as clean as possible, I used this table definition (simplified):

messages (
    id serial,
    data text,
    arrived timestamp default now(),
    processed timestamp)

So after the message arrived, it had the processed field set to null, which was perfectly consistent and represented what it really was--an unknown value.

We'd then simply SELECT * FROM messages WHERE processed IS NULL and were happy ever after, only to realise, after the table had grown to a few thousand rows, that the SELECT took ages because the system had to perform a seqscan. Aha! So we added an index on processed, because common sense told me that as long as there are 100k rows and only 10 of them are NULL, the index would be very selective and therefore useful.

I guess you know where it ends--the index is not used for IS [NOT] NULL expressions. The obvious workaround was to add a DEFAULT value to "processed" as a kind of anchor (we used '-infinity') and then do SELECT * FROM messages WHERE processed='-infinity'.

Bingo! The thing went 100x faster. So we could choose to have a standards-compliant, very clean database design OR the thing that does what it's supposed to do in reasonable time. And believe me, it's kind of difficult to explain to our logistics department that we could have made the thing return results in milliseconds instead of 10 secs, but chose not to for the sake of clean design.

It'd be really nice if we didn't have to use such hacks, but hey, life's imperfect.
And so that this is not just a literary exercise, and to answer Emi's question--you can't do that, but you can use some valid date which you are never going to use for your ordinary data (like '-infinity', or 1.1.1970 00:00). Just make sure you make a note of it somewhere; my suggestion is that you write a COMMENT ON that column for future generations.

--
Michal Taborsky
http://www.taborsky.cz
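A minimal sketch of the sentinel approach Michal describes, applied to his messages table. The primary key, the NOT NULL constraint, the index name and the comment wording are assumptions added for illustration; '-infinity' itself is a valid timestamp input in PostgreSQL.

```sql
-- Sentinel variant: '-infinity' stands for "not processed yet".
CREATE TABLE messages (
    id        serial PRIMARY KEY,
    data      text,
    arrived   timestamp DEFAULT now(),
    processed timestamp NOT NULL DEFAULT '-infinity'
);

-- Document the convention on the column itself, as suggested above.
COMMENT ON COLUMN messages.processed IS
    '-infinity means "not processed yet"; any other value is the processing time';

-- An ordinary index can serve the equality test on the sentinel.
CREATE INDEX messages_processed_idx ON messages (processed);

-- Fetch the unprocessed messages.
SELECT * FROM messages WHERE processed = '-infinity';
```

The comment can be read back with \d+ messages in psql, which is what makes it useful to whoever inherits the schema.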
> I guess you know where it ends--the index is not used for IS [NOT] NULL
> expressions. The obvious workaround was to add DEFAULT value to
> "processed" in form of kind of anchor (we used '-infinity')

Wouldn't it have worked to add an index

 ... WHERE processed IS NULL

and go from there?

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
Michal Taborsky said:

> Richard Huxton wrote:
> > Where you don't have a valid date to store you should use NULL. This
> > business of storing zeroes is a horrible MySQL design mistake.
>
> Well, yes and no. It certainly is a design mistake and introduces
> inconsistency into the database, but after I was bitten several times by
> NULL values I'd go for solution like this any day. Let me explain.
>
> We had a table of messages, which was inserted to randomly and every few
> minutes we'd walk through the unprocessed messages and perform some work
> on them. I, trying to have the database as clean as possible, used this
> table definition (simplified):
>
> messages (
> id serial,
> data text,
> arrived timestamp default now,
> processed timestamp)
>
> So after the message arrived, it had the processed field set to null,
> which was perfectly consistent and represented what it really was--an
> unknown value.
>
> We'd then simply SELECT * FROM messages WHERE processed IS NULL and were
> happy ever after, only to realise after the table had grown to few
> thousands rows, that the SELECT took ages, because the system had to
> perform seqscan. Aha! So we added an index on processed, because common
> sense told me, that as long as there are 100k rows and only 10 of them
> are NULL, the index would be very selective and therefore useful.
>
> I guess you know where it ends--the index is not used for IS [NOT] NULL
> expressions. The obvious workaround was to add DEFAULT value to
> "processed" in form of kind of anchor (we used '-infinity') and then do
> SELECT * FROM messages WHERE processed='-infinity'.
>
> Bingo! The thing went 100x faster. So we could choose to have
> standards-compliant, very clean database design OR the thing that does
> what it's supposed to do in reasonable time. And believe me, it's kind
> of difficult to explain to our logistics department that we could have
> done the thing to return results in milliseconds instead of 10 secs, but
> chose not to for sake of clean design.
>
> It'd be really nice if we didn't have to use such hacks, but hey, life's
> imperfect.

It would probably be better design not to use the date as a flag. This issue actually came up for me yesterday with an application that is now being ported to Postgres. Previously a null "ship date" indicated that an item to be shipped had not gone yet. I'm adding a flag, not just because of the issue you describe, but also because it is more intuitive for anyone looking at the data who is unfamiliar with the business logic.

Best,

Jim Wilson
Karsten Hilbert wrote:
> > I guess you know where it ends--the index is not used for IS [NOT] NULL
> > expressions. The obvious workaround was to add DEFAULT value to
> > "processed" in form of kind of anchor (we used '-infinity')
>
> Wouldn't it have worked to add an index
>
> ... WHERE processed IS NULL
>
> and go from there ?

Yes, you use a partial index. The 8.0beta1 docs mention this:

    Indexes are not used for <literal>IS NULL</> clauses by default.
    The best way to use indexes in such cases is to create a partial
    index using an <literal>IS NULL</> comparison.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
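The partial index Karsten and Bruce refer to could be written roughly as follows, using Michal's messages table with a nullable processed column. The primary key, the index name and the choice of id as the indexed column are assumptions; the WHERE clause is what restricts the index to unprocessed rows.

```sql
-- Michal's table, keeping NULL as "not processed yet".
CREATE TABLE messages (
    id        serial PRIMARY KEY,
    data      text,
    arrived   timestamp DEFAULT now(),
    processed timestamp
);

-- Partial index: only rows with processed IS NULL are stored in it,
-- so it stays small however large the table grows.
CREATE INDEX messages_unprocessed_idx
    ON messages (id)
 WHERE processed IS NULL;

-- The query predicate matches the index predicate, so the planner
-- can consider the partial index for this query.
SELECT * FROM messages WHERE processed IS NULL;
```

Running EXPLAIN on the SELECT shows whether the index is actually chosen; on a tiny table the planner may still prefer a sequential scan. Later PostgreSQL releases (8.3 onward) can also use an ordinary B-tree index for IS NULL, but a partial index remains the more compact option when only a handful of rows qualify.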
On Fri, 20 Aug 2004 11:12:40 -0400 (EDT), Bruce Momjian wrote:
> Yes, you use a partial index. The 8.0beta1 docs mention this:
>
> Indexes are not used for <literal>IS NULL</> clauses by default.
> The best way to use indexes in such cases is to create a partial
> index using an <literal>IS NULL</> comparison.

Is this because nulls aren't indexed (I'm sure I remember someone saying that they are, though), or because (null=null) is null rather than true? If it's the latter, why couldn't an explicit IS NULL test be allowed to use the index?

--
Peter Haworth   pmh@edison.ioppublishing.com
Znqr lbh ybbx
"Peter Haworth" <pmh@edison.ioppublishing.com> writes: > ... why couldn't an explicit IS NULL test > be allowed to use the index? It could, if someone cared to puzzle out a way to integrate IS NULL into the index opclass and access method API infrastructure. Right now all that stuff assumes that indexable operators are, well, operators (and I think there are places that assume they must be binary operators, to boot). I've looked at this once or twice but always decided that the bang-for-the-buck ratio was too low compared to other open problems ... regards, tom lane
In article <twig.1093012692.59157@kelcomaine.com>,
"Jim Wilson" <jimw@kelcomaine.com> writes:

> It'd probably be better design to not use the date as a flag. This issue
> actually came up for me yesterday with an application that is now being ported
> to Postgres. Previously a null "ship date" indicated that an item to be
> shipped had not gone yet. I'm adding a flag, not just because of this issue
> you describe, but it is also more intuitive for anyone looking at the data
> who is unfamiliar with the business logic.

Me thinks that's somewhat unclean. Is your shipDate nullable? If yes, what's the meaning of "shipDate IS NULL"? If no, what do you put in that field if notYetShipped is true?
Harald Fuchs said:

> In article <twig.1093012692.59157@kelcomaine.com>,
> "Jim Wilson" <jimw@kelcomaine.com> writes:
>
> > It'd probably be better design to not use the date as a flag. This issue
> > actually came up for me yesterday with an application that is now being ported
> > to Postgres. Previously a null "ship date" indicated that an item to be
> > shipped had not gone yet. I'm adding a flag, not just because of this issue
> > you describe, but it is also more intuitive for anyone looking at the data
> > who is unfamiliar with the business logic.
>
> Me thinks that's somewhat unclean. Is your shipDate nullable? If
> yes, what's the meaning of "shipDate IS NULL"? If no, what do you put
> in that field if notYetShipped is true?

Actually it would be a boolean "shipped" flag. ShipDate can be whatever you want. Perhaps even null. All in all it just makes sense to use the boolean when you need a flag (or integer or char for more than two flag states).

Best,

Jim
On Fri, 20 Aug 2004, Michal Taborsky wrote:

> Richard Huxton wrote:
> > Where you don't have a valid date to store you should use NULL. This
> > business of storing zeroes is a horrible MySQL design mistake.
>
> Well, yes and no. It certainly is a design mistake and introduces
> inconsistency into the database, but after I was bitten several times by
> NULL values I'd go for solution like this any day. Let me explain.

He refers to MySQL design, not _your_ design. But by letting you insert zeros, MySQL misguided you in your design...

> We had a table of messages, which was inserted to randomly and every few
> minutes we'd walk through the unprocessed messages and perform some work
> on them. I, trying to have the database as clean as possible, used this
> table definition (simplified):
>
> messages (
> id serial,
> data text,
> arrived timestamp default now,
> processed timestamp)
>
> So after the message arrived, it had the processed field set to null,
> which was perfectly consistent and represented what it really was--an
> unknown value.

No. You're using one field to emulate two. First, you're using processed as a flag, to indicate the unprocessed/processed status. Then, you use it to store the timestamp. To clarify, you use that field to put two different questions to the system:

1) has this message been processed?
2) _when_ was this message processed?

This design mistake introduces the need for an 'invalid' value you have to put in the field, because that's the 'flag' part I've mentioned. And NULL does not mean 'invalid', it means 'not available'.

A timestamp field can be used to answer only one question: 'when'. Be careful what NULL means here: it does not mean 'never', it means 'I don't know when it was processed'. You cannot consider a message with a NULL timestamp as 'not processed yet', because that's only one case. It could have been processed, but for some reason someone forgot to store the timestamp.
So your claim: "...which was perfectly consistent and represented what it really was--an unknown value." does not hold.

It's a subtle design mistake. Because you seem to be sure that if a message was processed, then the timestamp must be available, you're implying that if the field is NULL, the message is unprocessed. This kind of 'knowledge' is external to the DB. You'd better let the DB know the full story if you want it to provide nice answers to your queries. It's not just an index problem.

[...]
> SELECT * FROM messages WHERE processed='-infinity'.

There! Here you have to introduce your 'flag' value again. The dual nature of the field has to be represented somehow... see? Since NULL failed in its role of 'flag', you're using another value.

Now ask yourself what the meaning of processed is here:

1) if it's NULL, you can't tell _when_ (not _if_) the message was processed;
2) if it's -infinity (call it SPECIAL_FLAG_VALUE), the message was NOT processed;
3) if it's different from -infinity, then the message was processed, and you can tell when...

Compare it to the natural meaning of a timestamp field:

1) if it's NULL, you can't tell _when_ the message was processed;
2) otherwise you can tell _when_ the message was processed.

That simple.

[...]
> 1.1.1970 00:00). Just make sure you make a note of it somewhere and my
> suggestion is you write a COMMENT ON that column for future generations.

:) If you want to take care of future generations, my suggestion is to design the database the right way from the start.

If you need to check for unprocessed messages, and thus you want the db to answer the question 'has this message been processed?', you should introduce a way to represent that kind of knowledge explicitly, with a boolean field. Once you have two fields:

    is_processed boolean,
    processed_ts timestamp,

you can express queries more naturally (and they'll run fast). 'Saving' one field may lead to horrors at query time.
And of course, you may want to place constraints on those fields:

- if is_processed is false, then processed_ts must be NULL;
- if is_processed is true, then processed_ts must be NOT NULL;
- is_processed may be NOT NULL.

The first is natural, the second tricky: the system will force anyone to specify a timestamp when changing the state of the message to 'processed'. It means no partial records, no 'I will fill it in later'.

The third one is trickier: it places a constraint on the knowledge you need to have before you can insert a message into the system... if someone walks in and tells you: 'let's put message XXX into the system, but I don't know if it was processed or not', you'll have to say 'I'm sorry, no, I can't', and hope that answer gets accepted. That message will have to stay outside the system until someone discovers whether it was processed or not. If that message _has_ to be in (maybe just because your boss expects it to be in), you'll have to provide a value... that means the DB will provide potentially wrong answers (think of count() on processed messages).

.TM.
--
      ____/  ____/   /
     /      /       /                   Marco Colombo
    ___/  ___  /   /                  Technical Manager
   /          /   /                      ESI s.r.l.
 _____/ _____/  _/                     Colombo@ESI.it
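Marco's two-column design, with his first two constraints folded into a single CHECK, might be sketched like this. The column names is_processed and processed_ts are his; the primary key, the default of false, the constraint and index names, and the sample id are assumptions made for illustration.

```sql
CREATE TABLE messages (
    id           serial PRIMARY KEY,
    data         text,
    arrived      timestamp DEFAULT now(),
    -- Third constraint: the processed state must always be known.
    is_processed boolean NOT NULL DEFAULT false,
    processed_ts timestamp,
    -- First and second constraints in one expression:
    -- false requires a NULL timestamp, true requires a non-NULL one.
    CONSTRAINT processed_ts_matches_flag
        CHECK (is_processed = (processed_ts IS NOT NULL))
);

-- Small partial index covering only the pending messages.
CREATE INDEX messages_pending_idx
    ON messages (id)
 WHERE NOT is_processed;

-- "Has this message been processed?" is now an explicit question.
SELECT * FROM messages WHERE NOT is_processed;

-- Marking a message as processed must set both columns together,
-- otherwise the CHECK constraint rejects the update.
UPDATE messages
   SET is_processed = true,
       processed_ts = now()
 WHERE id = 1;
```

Dropping the NOT NULL on is_processed would give the looser variant Marco mentions, where a message of unknown status can still be stored; the CHECK then passes for a NULL flag, because a CHECK only rejects rows for which the expression is false.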