Thread: select query on Dates stored as varchar
I have a table (transactions) with an order_date field that is varchar(32). The data looks like this:
order_date
-------------------
20100623 02:16:40
20100623 04:32:41
20100625 04:18:53
20100625 07:53:24
In my current database (8.1.4) if I run
select (*) from transactions where order_date > ‘2010-06-24’;
I get:
count
--------
2
I’m moving to a new server that has 8.2.11. There, if I run the same query, I get
count
--------
4
In both, select (*) from transactions where order_date > ‘20100624’; returns 2.
The newer version does not seem to know that ‘2010-06-24’ is a date the same way that ‘20100624’ is.
Any ideas? Is there somewhere that I can specify that with, or without the hyphens, they are both dates?
Thank you,
Yosef Haas
yosef@karatedepot.com
Hello 2010/8/30 Yosef Haas <yosef@karatedepot.com>: > I have a table (transactions) with an order_date field that is varchar(32). > The data looks like this: > > > > order_date > > ------------------- > > 20100623 02:16:40 > > 20100623 04:32:41 > > 20100625 04:18:53 > > 20100625 07:53:24 > > > > In my current database (8.1.4) if I run > > select (*) from transactions where order_date > ‘2010-06-24’; > > I get: > > count > > -------- > > 2 > > > > I’m moving to a new server that has 8.2.11. There, if I run the same query, > I get > > count > > -------- > > 4 > > > > > > In both, select (*) from transactions where order_date > ‘20100624’; returns > 2. > > > > The newer version does not seem to know that ‘2010-06-24’ is a date the same > way that ‘20100624’ is. > > > > Any ideas? Is there somewhere that I can specify that with, or without the > hyphens, they are both dates? > a) check your locales - different locales can make different order b) use a native data type "timestamp" - your technique isn't best - it is slower, needs more memory and depends on locale. Regards Pavel Stehule > > > > > Thank you, > > Yosef Haas > > yosef@karatedepot.com > >
I know it would be best to change the data type of the column (I didn't initially create the table), but doing so would be require changes to several application that use it and I'd like to avoid that if possible. Thanks, Yosef Haas yosef@karatedepot.com -----Original Message----- From: Pavel Stehule [mailto:pavel.stehule@gmail.com] Sent: Monday, August 30, 2010 4:45 PM To: Yosef Haas Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] select query on Dates stored as varchar Hello 2010/8/30 Yosef Haas <yosef@karatedepot.com>: > I have a table (transactions) with an order_date field that is varchar(32). > The data looks like this: > > > > order_date > > ------------------- > > 20100623 02:16:40 > > 20100623 04:32:41 > > 20100625 04:18:53 > > 20100625 07:53:24 > > > > In my current database (8.1.4) if I run > > select (*) from transactions where order_date > 2010-06-24; > > I get: > > count > > -------- > > 2 > > > > Im moving to a new server that has 8.2.11. There, if I run the same query, > I get > > count > > -------- > > 4 > > > > > > In both, select (*) from transactions where order_date > 20100624; returns > 2. > > > > The newer version does not seem to know that 2010-06-24 is a date the same > way that 20100624 is. > > > > Any ideas? Is there somewhere that I can specify that with, or without the > hyphens, they are both dates? > a) check your locales - different locales can make different order b) use a native data type "timestamp" - your technique isn't best - it is slower, needs more memory and depends on locale. Regards Pavel Stehule > > > > > Thank you, > > Yosef Haas > > yosef@karatedepot.com > >
On Mon, Aug 30, 2010 at 2:53 PM, Yosef Haas <yosef@karatedepot.com> wrote: > I know it would be best to change the data type of the column (I didn't > initially create the table), but doing so would be require changes to > several application that use it and I'd like to avoid that if possible. 1. Most of those changes would involve chain saw type work on those apps, i.e. cutting out huge chunks of logic in the app to deal with the insanity of a text type storing a timestamp or date in 18 different ways. 2. Any new app written now gets inherit the insanity of having to deal with the 18 crazy ways a date / timestamp can get entered into the system. 3. In the future, after converting, math is simple: where timestampfield between now()-interval '2 days' and now() and so on. Trust me, in the long run, you'll be thankful to have the right data type in there, and your applications will tend to do the right thing. -- To understand recursion, one must first understand recursion.
2010/8/30 Scott Marlowe <scott.marlowe@gmail.com>: > On Mon, Aug 30, 2010 at 2:53 PM, Yosef Haas <yosef@karatedepot.com> wrote: >> I know it would be best to change the data type of the column (I didn't >> initially create the table), but doing so would be require changes to >> several application that use it and I'd like to avoid that if possible. > > 1. Most of those changes would involve chain saw type work on those > apps, i.e. cutting out huge chunks of logic in the app to deal with > the insanity of a text type storing a timestamp or date in 18 > different ways. > 2. Any new app written now gets inherit the insanity of having to deal > with the 18 crazy ways a date / timestamp can get entered into the > system. > 3. In the future, after converting, math is simple: > where timestampfield between now()-interval '2 days' and now() > and so on. > Trust me, in the long run, you'll be thankful to have the right data > type in there, and your applications will tend to do the right thing. > sure +1 Pavel > -- > To understand recursion, one must first understand recursion. >
2010/8/30 Yosef Haas <yosef@karatedepot.com>: > I know it would be best to change the data type of the column (I didn't > initially create the table), but doing so would be require changes to > several application that use it and I'd like to avoid that if possible. > then you can dynamically transform data to timestamp (it is slower, but not depends on locale) SELECT to_timestamp(order_date, ''YYYYMMDD HH:MI:SS') > to_timestamp(‘2010-06-24’, ''YYYYMMDD HH:MI:SS') Regards Pavel Stehule > Thanks, > Yosef Haas > yosef@karatedepot.com > > -----Original Message----- > From: Pavel Stehule [mailto:pavel.stehule@gmail.com] > Sent: Monday, August 30, 2010 4:45 PM > To: Yosef Haas > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] select query on Dates stored as varchar > > Hello > > 2010/8/30 Yosef Haas <yosef@karatedepot.com>: >> I have a table (transactions) with an order_date field that is > varchar(32). >> The data looks like this: >> >> >> >> order_date >> >> ------------------- >> >> 20100623 02:16:40 >> >> 20100623 04:32:41 >> >> 20100625 04:18:53 >> >> 20100625 07:53:24 >> >> >> >> In my current database (8.1.4) if I run >> >> select (*) from transactions where order_date > ‘2010-06-24’; >> >> I get: >> >> count >> >> -------- >> >> 2 >> >> >> >> I’m moving to a new server that has 8.2.11. There, if I run the same > query, >> I get >> >> count >> >> -------- >> >> 4 >> >> >> >> >> >> In both, select (*) from transactions where order_date > ‘20100624’; > returns >> 2. >> >> >> >> The newer version does not seem to know that ‘2010-06-24’ is a date the > same >> way that ‘20100624’ is. >> >> >> >> Any ideas? Is there somewhere that I can specify that with, or without the >> hyphens, they are both dates? >> > > a) check your locales - different locales can make different order > b) use a native data type "timestamp" - your technique isn't best - it > is slower, needs more memory and depends on locale. > > Regards > > Pavel Stehule > >> >> >> >> >> Thank you, >> >> Yosef Haas >> >> yosef@karatedepot.com >> >> > >
On Mon, 2010-08-30 at 16:32 -0400, Yosef Haas wrote: > I have a table (transactions) with an order_date field that is > varchar(32). The data looks like this: > > > > order_date > > ------------------- > > 20100623 02:16:40 > > 20100623 04:32:41 > > 20100625 04:18:53 > > 20100625 07:53:24 > > ... > > Any ideas? Is there somewhere that I can specify that with, or without > the hyphens, they are both dates? > For now, write a function called something like my_string_to_timestamp() that takes a string in the format above and converts it to a real timestamp (or timestamptz). Then, use a query like: select (*) from transactions where my_string_to_timestamp(order_date) > ‘20100624’::timestamp; That way, you at least have the ugly interpretation logic in one place (the function), and you will see errors when you run into a malformed string. Later, you really should change these to be real timestamps (or timestamptz). Trying to re-interpret your data at query time is the recipe for wrong answers. Regards, Jeff Davis
I agree wholeheartedly with all of your points. The data should be in the right format. The problem I'm faced with is that I'm in the middle of a hosting migration and can't stop to change and deploy all the applications that expect the date in that format. Pavel did however point me in the right direction to making this work the way it did. The answer was the locale. Once I did initdb with --locale=en_US.UTF-8 it worked. I know this is not great for the long run, but right now I need to get this up and running. Thanks for the help. Yosef Haas yosef@karatedepot.com -----Original Message----- From: Jeff Davis [mailto:pgsql@j-davis.com] Sent: Monday, August 30, 2010 5:23 PM To: Yosef Haas Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] select query on Dates stored as varchar On Mon, 2010-08-30 at 16:32 -0400, Yosef Haas wrote: > I have a table (transactions) with an order_date field that is > varchar(32). The data looks like this: > > > > order_date > > ------------------- > > 20100623 02:16:40 > > 20100623 04:32:41 > > 20100625 04:18:53 > > 20100625 07:53:24 > > ... > > Any ideas? Is there somewhere that I can specify that with, or without > the hyphens, they are both dates? > For now, write a function called something like my_string_to_timestamp() that takes a string in the format above and converts it to a real timestamp (or timestamptz). Then, use a query like: select (*) from transactions where my_string_to_timestamp(order_date) > '20100624'::timestamp; That way, you at least have the ugly interpretation logic in one place (the function), and you will see errors when you run into a malformed string. Later, you really should change these to be real timestamps (or timestamptz). Trying to re-interpret your data at query time is the recipe for wrong answers. Regards, Jeff Davis