Thread: How to setup default value "0000-00-00" for "date" type under PostgreSQL?

How to setup default value "0000-00-00" for "date" type under PostgreSQL?

From

Emi Lu

Date:

19 August 2004, 21:34:31

Hello all,

I have a question about "date" & "timestamp" types in PostgreSQL. I want
to setup the default value '0000-00-00' and "0000-00-00 00:00:00" for
them. However, it seems that PostgreSQL does not support it. Could
someone help me, please?

The example table:

CREATE TABLE T1 (
       col1      varchar(7) not null,
       col2      varchar(4) not null,
       col3      date not null,
       col4      varchar(3),
       primary key(col1, col2, col3)
);


In my design model, "col3" has to be part of the primary key. At the
beginning of the data population we do not know the value of "col3";
values for "col3" are entered through the GUI. Therefore, when I used
MySQL, the default value I gave was "0000-00-00". However, after
migrating to PostgreSQL, I can no longer set the default value to
"0000-00-00". Could somebody help me with this, please? I really
want to know how I can save '0000-00-00' as the default value for "date"
and "timestamp" types.

Thanks a lot!
Emi Lu

Re: How to setup default value "0000-00-00" for "date" type under PostgreSQL?

From

Karsten Hilbert

Date:

20 August 2004, 03:40:18

> I really
> want to know how I can save '0000-00-00' as the default value for "date"
> and "timestamp" types.
You can not. They are invalid dates.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: How to setup default value "0000-00-00" for "date"

From

Christian Kratzer

Date:

20 August 2004, 03:58:37

On Wed, 18 Aug 2004, Emi Lu wrote:

> Hello all,
>
> I have a question about "date" & "timestamp" types in PostgreSQL. I want to
> setup the default value '0000-00-00' and "0000-00-00 00:00:00" for them.
> However, it seems that PostgreSQL does not support it. Could someone help me
> please?

How about using NULL? You say the correct value is still undefined, so
NULL should be right on the spot.

Greetings
Christian

--
Christian Kratzer                       ck@cksoft.de
CK Software GmbH                        http://www.cksoft.de/
Phone: +49 7452 889 135                 Fax: +49 7452 889 136

Re: How to setup default value "0000-00-00" for "date"

From

Richard Huxton

Date:

20 August 2004, 05:06:47

Emi Lu wrote:
> Hello all,
>
> I have a question about "date" & "timestamp" types in PostgreSQL. I want
> to setup the default value '0000-00-00' and "0000-00-00 00:00:00" for
> them. However, it seems that PostgreSQL does not support it. Could
> someone help me please?

PostgreSQL doesn't and almost certainly never will support "0000-00-00"
as a date. That's because it isn't a valid date. You also can't store
13.723, "Hello world" or (12,13) in a date column either.
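
A quick way to see this for yourself is to try the literal directly. This is
only a sketch for a psql session; the first date is an arbitrary example
value.

```sql
-- A real calendar date is accepted.
SELECT DATE '2004-08-19';

-- Month 00 and day 00 do not exist, so PostgreSQL raises an error here
-- instead of storing the value.
SELECT DATE '0000-00-00';
```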

Where you don't have a valid date to store you should use NULL. This
business of storing zeroes is a horrible MySQL design mistake.

> The example table:
>
> T1 (col1      varchar(7) not null,
>       col2      varchar(4) not null,
>       col3      date not null,
>       col4      varchar(3),
>       primary key(col1, col2, col3)
> )
>
>
> In my design model, "col3" has to be one of the primary key part. Since
> at the beginning of the data population, we do not know the value of
> "col3"; values for "col3" are input through GUI.

If you don't know the value of col3, it can't be part of your primary
key. That's part of the definition of "primary key" and trying to work
around it is what's causing you problems here.

If you have an initial population of data that then needs to have "col3"
set, then it sounds like you need two tables: T0 with primary key
(col1,col2) and T1 with primary key (col1,col2,col3), copying rows from
T0 to T1 as users supply values. It's difficult to say without knowing
more about your situation.
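
A minimal sketch of that two-table layout, assuming the column definitions
from the original post. The key values, the date, and the DELETE from the
staging table are illustrative assumptions, not something the original
poster described.

```sql
-- Staging table: rows whose col3 is not known yet.
CREATE TABLE t0 (
    col1 varchar(7) NOT NULL,
    col2 varchar(4) NOT NULL,
    col4 varchar(3),
    PRIMARY KEY (col1, col2)
);

-- Final table: col3 is known, so it can take part in the key.
CREATE TABLE t1 (
    col1 varchar(7) NOT NULL,
    col2 varchar(4) NOT NULL,
    col3 date       NOT NULL,
    col4 varchar(3),
    PRIMARY KEY (col1, col2, col3)
);

-- When the GUI supplies a date for one staged row, copy it across and
-- (an assumption) remove it from staging in the same transaction.
BEGIN;

INSERT INTO t1 (col1, col2, col3, col4)
SELECT col1, col2, DATE '2004-08-20', col4
FROM t0
WHERE col1 = 'A000001' AND col2 = 'X001';

DELETE FROM t0
WHERE col1 = 'A000001' AND col2 = 'X001';

COMMIT;
```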

HTH
--
   Richard Huxton
   Archonet Ltd

Re: How to setup default value "0000-00-00" for "date"

From

Christian Kratzer

Date:

20 August 2004, 05:20:57

Hi,

On Fri, 20 Aug 2004, Richard Huxton wrote:

> Emi Lu wrote:
>> Hello all,
>>
>> I have a question about "date" & "timestamp" types in PostgreSQL. I want
>> to setup the default value '0000-00-00' and "0000-00-00 00:00:00" for
>> them. However, it seems that PostgreSQL does not support it. Could someone
>> help me please?
>
> PostgreSQL doesn't and almost certainly never will support "0000-00-00" as a
> date. That's because it isn't a valid date. You also can't store 13.723,
> "Hello world" or (12,13) in a date column either.
>
> Where you don't have a valid date to store you should use NULL. This business
> of storing zeroes is a horrible MySQL design mistake.

Which is because the last time I used MySQL it did not support
NULLs in indexed columns (at least not in MyISAM tables).

The workaround was to use something else like 0 to represent undefined
values... Horrible...

Greetings
Christian

--
Christian Kratzer                       ck@cksoft.de
CK Software GmbH                        http://www.cksoft.de/
Phone: +49 7452 889 135                 Fax: +49 7452 889 136

Re: How to setup default value "0000-00-00" for "date"

From

Michal Taborsky

Date:

20 August 2004, 05:39:29

Richard Huxton wrote:
> Where you don't have a valid date to store you should use NULL. This
> business of storing zeroes is a horrible MySQL design mistake.

Well, yes and no. It certainly is a design mistake and introduces
inconsistency into the database, but after I was bitten several times by
NULL values I'd go for a solution like this any day. Let me explain.

We had a table of messages, which was inserted to randomly and every few
minutes we'd walk through the unprocessed messages and perform some work
on them. I, trying to have the database as clean as possible, used this
table definition (simplified):

messages (
id serial,
data text,
arrived timestamp default now(),
processed timestamp)

So after the message arrived, it had the processed field set to NULL,
which was perfectly consistent and represented what it really was--an
unknown value.

We'd then simply SELECT * FROM messages WHERE processed IS NULL and were
happy ever after, only to realise after the table had grown to a few
thousand rows that the SELECT took ages, because the system had to
perform a seqscan. Aha! So we added an index on processed, because common
sense told me that as long as there are 100k rows and only 10 of them
are NULL, the index would be very selective and therefore useful.

I guess you know where it ends--the index is not used for IS [NOT] NULL
expressions. The obvious workaround was to add a DEFAULT value to
"processed" as a kind of anchor (we used '-infinity') and then do
SELECT * FROM messages WHERE processed='-infinity'.
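
Roughly, the workaround looks like the sketch below. It follows the
simplified table definition above; the index name is made up for the
example.

```sql
CREATE TABLE messages (
    id        serial,
    data      text,
    arrived   timestamp DEFAULT now(),
    -- '-infinity' is a valid timestamp value, used here as the
    -- "not processed yet" anchor instead of NULL.
    processed timestamp DEFAULT '-infinity'
);

-- Ordinary index on the column (hypothetical name).
CREATE INDEX messages_processed_idx ON messages (processed);

-- An equality comparison is an ordinary indexable operator,
-- so the planner can consider the index for this query.
SELECT * FROM messages WHERE processed = '-infinity';
```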

Bingo! The thing went 100x faster. So we could choose to have a
standards-compliant, very clean database design OR the thing that does
what it's supposed to do in reasonable time. And believe me, it's kind
of difficult to explain to our logistics department that we could have
made the thing return results in milliseconds instead of 10 secs, but
chose not to for the sake of clean design.

It'd be really nice if we didn't have to use such hacks, but hey, life's
imperfect.

And so that this is not just a literary exercise, to answer Emi's
question--you can't do that, but you can use some valid date which you are
never going to use for your ordinary data (like the '-infinity', or
1.1.1970 00:00). Just make sure you make a note of it somewhere and my
suggestion is you write a COMMENT ON that column for future generations.
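
Applied to the table from the original question, that could look like the
sketch below. The choice of 1970-01-01 as the placeholder, the wording of
the comment, and the key values in the INSERT are assumptions for
illustration only.

```sql
CREATE TABLE t1 (
    col1 varchar(7) NOT NULL,
    col2 varchar(4) NOT NULL,
    -- 1970-01-01 is an assumed placeholder meaning "not entered yet";
    -- pick any valid date that real data will never contain.
    col3 date       NOT NULL DEFAULT '1970-01-01',
    col4 varchar(3),
    PRIMARY KEY (col1, col2, col3)
);

-- Record the convention in the catalog so later readers can find it.
COMMENT ON COLUMN t1.col3 IS
    'Value 1970-01-01 is a placeholder meaning the date has not been entered yet';

-- A row inserted without col3 picks up the placeholder.
INSERT INTO t1 (col1, col2) VALUES ('A000001', 'X001');
```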

--
Michal Taborsky
http://www.taborsky.cz

Re: How to setup default value "0000-00-00" for "date"

From

Karsten Hilbert

Date:

20 August 2004, 06:20:20

> I guess you know where it ends--the index is not used for IS [NOT] NULL
> expressions. The obvious workaround was to add DEFAULT value to
> "processed" in form of kind of anchor (we used '-infinity')

Wouldn't it have worked to add an index

    ... WHERE processed IS NULL

and go from there ?

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: How to setup default value "0000-00-00" for "date"

From

"Jim Wilson"

Date:

20 August 2004, 11:39:43

Michal Taborsky said:

> Richard Huxton wrote:
> > Where you don't have a valid date to store you should use NULL. This
> > business of storing zeroes is a horrible MySQL design mistake.
>
> Well, yes and no. It certainly is a design mistake and introduces
> inconsistency into the database, but after I was bitten several times by
> NULL values I'd go for solution like this any day. Let me explain.
>
> We had a table of messages, which was inserted to randomly and every few
> minutes we'd walk through the unprocessed messages and perform some work
> on them. I, trying to have the database as clean as possible, used this
> table definition (simplified):
>
> messages (
> id serial,
> data text,
> arrived timestamp default now(),
> processed timestamp)
>
> So after the message arrived, it had the processed field set to null,
> which was perfectly consistent and represented what it really was--an
> unknown value.
>
> We'd then simply SELECT * FROM messages WHERE processed IS NULL and were
> happy ever after, only to realise after the table had grown to few
> thousands rows, that the SELECT took ages, because the system had to
> perform seqscan. Aha! So we added an index on processed, because common
> sense told me, that as long as there are 100k rows and only 10 of them
> are NULL, the index would be very selective and therefore useful.
>
> I guess you know where it ends--the index is not used for IS [NOT] NULL
> expressions. The obvious workaround was to add DEFAULT value to
> "processed" in form of kind of anchor (we used '-infinity') and then do
> SELECT * FROM messages WHERE processed='-infinity'.
>
> Bingo! The thing went 100x faster. So we could choose to have
> standards-compliant, very clean database design OR the thing that does
> what it's supposed to do in reasonable time. And believe me, it's kind
> of difficult to explain to our logistics department that we could have
> done the thing to return results in milliseconds instead of 10 secs, but
> chose not to for sake of clean design.
>
> It'd be really nice if we didn't have to use such hacks, but hey, life's
> imperfect.

It'd probably be better design to not use the date as a flag.  This issue
actually came up for me yesterday with an application that is now being ported
to Postgres.  Previously a null "ship date" indicated that an item to be
shipped had not gone yet.  I'm adding a flag, not just because of this issue
you describe,  but it is also more intuitive for anyone looking at the data
who is unfamiliar with the business logic.

Best,

Jim Wilson

Re: How to setup default value "0000-00-00" for "date"

From

Bruce Momjian

Date:

20 August 2004, 12:12:59

Karsten Hilbert wrote:
> > I guess you know where it ends--the index is not used for IS [NOT] NULL
> > expressions. The obvious workaround was to add DEFAULT value to
> > "processed" in form of kind of anchor (we used '-infinity')
>
> Wouldn't it have worked to add an index
>
>     ... WHERE processed IS NULL
>
> and go from there ?

Yes, you use a partial index.  The 8.0beta1 docs mention this:

   Indexes are not used for <literal>IS NULL</> clauses by default.
   The best way to use indexes in such cases is to create a partial index
   using an <literal>IS NULL</> comparison.
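
For the messages table discussed earlier in the thread, a partial index
along those lines might look like this sketch. The index name and the choice
of indexed column are assumptions.

```sql
CREATE TABLE messages (
    id        serial,
    data      text,
    arrived   timestamp DEFAULT now(),
    processed timestamp           -- stays NULL until the message is handled
);

-- Partial index: it contains only the rows where processed IS NULL,
-- so it stays small even when the table has many processed rows.
CREATE INDEX messages_unprocessed_idx
    ON messages (id)
    WHERE processed IS NULL;

-- The WHERE clause matches the index predicate, so the planner can
-- consider the partial index for this query.
SELECT * FROM messages WHERE processed IS NULL;
```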

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: How to setup default value "0000-00-00" for "date"

From

"Peter Haworth"

Date:

20 August 2004, 14:23:22

On Fri, 20 Aug 2004 11:12:40 -0400 (EDT), Bruce Momjian wrote:
> Yes, you use a partial index. The 8.0beta1 docs mention this:
>
>    Indexes are not used for <literal>IS NULL</> clauses by default.
>    The best way to use indexes in such cases is to create a partial
>    index using an <literal>IS NULL</> comparison.

Is this because nulls aren't indexed (I'm sure I remember someone
saying that they are, though), or because (null=null) is null rather
than true? If it's the latter, why couldn't an explicit IS NULL test
be allowed to use the index?
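
The difference between the two tests can be seen with a one-line query; this
is just an illustration, with the column aliases made up for the example.

```sql
-- Comparing NULL with "=" yields NULL (unknown), not true,
-- while the IS NULL test itself yields true.
SELECT (NULL::timestamp = NULL::timestamp) IS NULL AS equality_yields_null,
       (NULL::timestamp IS NULL)                   AS is_null_test_is_true;
```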

--
    Peter Haworth    pmh@edison.ioppublishing.com
Znqr lbh ybbx

Re: How to setup default value "0000-00-00" for "date"

From

Tom Lane

Date:

20 August 2004, 14:52:03

"Peter Haworth" <pmh@edison.ioppublishing.com> writes:
> ... why couldn't an explicit IS NULL test
> be allowed to use the index?

It could, if someone cared to puzzle out a way to integrate IS NULL into
the index opclass and access method API infrastructure.  Right now all
that stuff assumes that indexable operators are, well, operators (and I
think there are places that assume they must be binary operators, to
boot).

I've looked at this once or twice but always decided that the
bang-for-the-buck ratio was too low compared to other open problems ...

            regards, tom lane

Re: How to setup default value "0000-00-00" for "date"

From

Harald Fuchs

Date:

22 August 2004, 09:15:26

In article <twig.1093012692.59157@kelcomaine.com>,
"Jim Wilson" <jimw@kelcomaine.com> writes:

> It'd probably be better design to not use the date as a flag.  This issue
> actually came up for me yesterday with an application that is now being ported
> to Postgres.  Previously a null "ship date" indicated that an item to be
> shipped had not gone yet.  I'm adding a flag, not just because of this issue
> you describe,  but it is also more intuitive for anyone looking at the data
> who is unfamiliar with the business logic.

Me thinks that's somewhat unclean.  Is your shipDate nullable?  If
yes, what's the meaning of "shipDate IS NULL"?  If no, what do you put
in that field if notYetShipped is true?

Re: How to setup default value "0000-00-00" for "date"

From

"Jim Wilson"

Date:

22 August 2004, 19:02:54

Harald Fuchs said:

> In article <twig.1093012692.59157@kelcomaine.com>,
> "Jim Wilson" <jimw@kelcomaine.com> writes:
>
> > It'd probably be better design to not use the date as a flag.  This issue
> > actually came up for me yesterday with an application that is now being ported
> > to Postgres.  Previously a null "ship date" indicated that an item to be
> > shipped had not gone yet.  I'm adding a flag, not just because of this issue
> > you describe,  but it is also more intuitive for anyone looking at the data
> > who is unfamiliar with the business logic.
>
> Me thinks that's somewhat unclean.  Is your shipDate nullable?  If
> yes, what's the meaning of "shipDate IS NULL"?  If no, what do you put
> in that field if notYetShipped is true?

Actually it would be a boolean "shipped" flag.  ShipDate can be whatever you
want.  Perhaps even null.  All in all it just makes sense to use the boolean
when you need a flag (or integer or char for more than two flag states).

Best,

Jim

Re: How to setup default value "0000-00-00" for "date"

From

Marco Colombo

Date:

24 August 2004, 16:09:20

On Fri, 20 Aug 2004, Michal Taborsky wrote:

> Richard Huxton wrote:
> > Where you don't have a valid date to store you should use NULL. This
> > business of storing zeroes is a horrible MySQL design mistake.
>
> Well, yes and no. It certainly is a design mistake and introduces
> inconsistency into the database, but after I was bitten several times by
> NULL values I'd go for solution like this any day. Let me explain.

He refers to MySQL design, not _your_ design. But by letting you
insert zeros, MySQL misguided you in your design...

> We had a table of messages, which was inserted to randomly and every few
> minutes we'd walk through the unprocessed messages and perform some work
> on them. I, trying to have the database as clean as possible, used this
> table definition (simplified):
>
> messages (
> id serial,
> data text,
> arrived timestamp default now(),
> processed timestamp)
>
> So after the message arrived, it had the processed field set to null,
> which was perfectly consistent and represented what it really was--an
> unknown value.

No. You're using one field to emulate two. First, you're using
processed as a flag, to indicate the unprocessed/processed status.
Then, you use it to store the timestamp. To clarify, you use that
field to put two different questions to the system:

1) has this message been processed?
2) _when_ was this message processed?

This design mistake introduces the need for an 'invalid' value you
have to put in the field, because that's the 'flag' part I've mentioned.
And NULL does not mean 'invalid', it means 'not available'.

A timestamp field can be used to answer only one question: 'when'.
Be careful what NULL means here: it does not mean 'never', it means
'I don't know when it was processed'. You cannot consider a message
with a NULL timestamp as 'not processed yet', because that's only one
case. It could have been processed, but for some reason someone
forgot to store the timestamp.

So your claim:
 "...which was perfectly consistent and represented what it really was--an
  unknown value."
does not hold.

It's a subtle design mistake. Because you seem to be sure that
if a message was processed, then the timestamp must be available,
you're implying that if the field is NULL, the message is unprocessed.
This kind of 'knowledge' is external to the DB. You'd better let
the DB know the full story if you want it to provide nice answers
to your queries. It's not just an index problem.

[...]
> SELECT * FROM messages WHERE processed='-infinity'.

There! Here you have to introduce your 'flag' value again. The dual
nature of the field has to be represented somehow... see? Since
NULL failed in its role of 'flag', you're using another value.
Now ask yourself what the meaning of processed is here:

1) if it's NULL, you can't tell _when_ (not _if_) the message was processed;
2) if it's -infinity (call it SPECIAL_FLAG_VALUE), the message was NOT
   processed;
3) if it's different from -infinity, then the message was processed, and
   you can tell when...

Compare it to the natural meaning of a timestamp field:

1) if it's NULL, you can't tell _when_ the message was processed;
2) otherwise you can tell _when_ the message was processed.

that simple.

[...]
> 1.1.1970 00:00). Just make sure you make a note of it somewhere and my
> suggestion is you write a COMMENT ON that column for future generations.

:) If you want to take care of future generations, my suggestion is
to design the database the right way from the start.
If you need to check for unprocessed messages, and thus you want the db
to answer the question 'has this message been processed?',
you should introduce a way to represent that kind of knowledge
explicitly, with a boolean field. Once you have two fields:

is_processed boolean,
processed_ts timestamp,

you can express queries more naturally (and they'll run fast).
'Saving' one field may lead to horrors at query time.

And of course, you may want to place constraints on those fields:
- if is_processed is false, then processed_ts must be NULL;
- if is_processed is true, then processed_ts must be NOT NULL;
- is_processed may be NOT NULL.

The first is natural, the second tricky: the system will force
anyone to specify a timestamp when changing the state of the
message to 'processed'. It means no partial records, no 'I will
fill it in later'.
The third one is trickier: it places a constraint on the knowledge you
need to have before you can insert a message in the system...
if someone walks in and tells you: 'let's put message XXX into the
system, but I don't know if it was processed or not' you'll have to
say 'I'm sorry no, I can't', and hope that answer gets accepted.
That message will have to stay outside the system, until someone
discovers if it was processed or not. If that message _has_ to be in
(maybe just because your boss expects it to be in) you'll have to
provide a value... that means the DB will provide potentially
wrong answers (think of count() on processed messages).
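
Put together, the two fields and the three constraints could be sketched as
follows. The other columns come from the simplified messages table quoted
above; the DEFAULT false, the index name, and the id used in the UPDATE are
assumptions for the example.

```sql
CREATE TABLE messages (
    id           serial,
    data         text,
    arrived      timestamp DEFAULT now(),
    -- Third constraint: the processed/unprocessed state must be known.
    is_processed boolean   NOT NULL DEFAULT false,
    processed_ts timestamp,
    -- First and second constraints in one expression:
    -- false requires processed_ts IS NULL, true requires it to be set.
    CHECK (is_processed = (processed_ts IS NOT NULL))
);

-- Optional partial index covering only the unprocessed rows.
CREATE INDEX messages_pending_idx
    ON messages (id)
    WHERE NOT is_processed;

-- "Has this message been processed?" is now asked of the flag directly.
SELECT * FROM messages WHERE NOT is_processed;

-- Marking a message as processed has to supply the timestamp in the
-- same statement, otherwise the CHECK constraint rejects the change.
UPDATE messages
SET is_processed = true,
    processed_ts = now()
WHERE id = 1;
```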

.TM.
--
      ____/  ____/   /
     /      /       /            Marco Colombo
    ___/  ___  /   /              Technical Manager
   /          /   /             ESI s.r.l.
 _____/ _____/  _/               Colombo@ESI.it