Thread: Proposed changing the definition of decade for date_trunc and extract

Proposed changing the definition of decade for date_trunc and extract

From
Mike Swanson
Date:
For a long time (since version 8.0), PostgreSQL has adopted the logical
barriers for centuries and millenniums in these functions.  The calendar
starts millennium and century 1 on year 1, directly after 1 BC.
Unfortunately decades are still reported rather simplistically by
dividing the year by 10.  Years 1-10 are logically the first decade and
working up from there, year 2014 should be counted as 202nd decade.

I've pushed code and documentation changes to reflect this, based on the
master branch (9.5devel), it's on the branch new_decade_def at
https://github.com/chungy/postgres.git -- In both the commit message and
docs, I made note of the backwards compatibility change.  I don't know
how much of an impact this would have but I suspect not many
applications are really going to be affected by how decades are counted
(should be simple to fix on their part, if any are...).

-- Mike Swanson




Re: Proposed changing the definition of decade for date_trunc and extract

From
David G Johnston
Date:
Mike Swanson wrote
> For a long time (since version 8.0), PostgreSQL has adopted the logical
> barriers for centuries and millenniums in these functions.  The calendar
> starts millennium and century 1 on year 1, directly after 1 BC.
> Unfortunately decades are still reported rather simplistically by
> dividing the year by 10.  Years 1-10 are logically the first decade and
> working up from there, year 2014 should be counted as 202nd decade.
> 
> I've pushed code and documentation changes to reflect this, based on the
> master branch (9.5devel), it's on the branch new_decade_def at
> https://github.com/chungy/postgres.git -- In both the commit message and
> docs, I made note of the backwards compatibility change.  I don't know
> how much of an impact this would have but I suspect not many
> applications are really going to be affected by how decades are counted
> (should be simple to fix on their part, if any are...).

Floor ( Year / 10 ) = decade number feels right.  Sure, the zero decade only
has 9 years but after that everything is easy to read.  Typical usage refers
to decades such as the 80s and the 90s but if you start counting at 1 the 0
year would have a mis-matched prefix.  And date truncation would be
weird...though I haven't tested the behavior I assume it works by basically
just dropping the year digit and replacing it with zero...that at least
would be the desired behavior for me.

Any supporting arguments for 1-10 = 1st decade other than technical
perfection?  I guess if you use data around and before 1AD you care about
this more, and rightly so, but given sound arguments for both methods the
one more useful to more users who I suspect dominantly care about years >
1900.

So -1 to change for breaking backward compatibility and -1 because the
current behavior seems to be more useful in everyday usage.

David J.








--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Proposed-changing-the-definition-of-decade-for-date-trunc-and-extract-tp5813578p5813580.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



On 08/01/2014 05:32 PM, David G Johnston wrote:
> Any supporting arguments for 1-10 = 1st decade other than technical
> perfection?  I guess if you use data around and before 1AD you care about
> this more, and rightly so, but given sound arguments for both methods the
> one more useful to more users who I suspect dominantly care about years >
> 1900.

Well, I think most people in casual speech would consider "The 80's" to
be 1980 to 1989.  But if you ask a historian, the decade is 1981 to 1990
(or, if they're an American social historian, 1981 to 1988, but that's a
different topic).  So both ways of counting have valid, solid arguments
behind them.

> So -1 to change for breaking backward compatibility and -1 because the
> current behavior seems to be more useful in everyday usage.

If we were adding a new "decade" feature, then I'd probably side with
Mike.  However, it's hard for me to believe that this change is worth
breaking backwards compatibility.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Re: Proposed changing the definition of decade for date_trunc and extract

From
Gavin Flower
Date:
On 02/08/14 12:32, David G Johnston wrote:
> Mike Swanson wrote
>> For a long time (since version 8.0), PostgreSQL has adopted the logical
>> barriers for centuries and millenniums in these functions.  The calendar
>> starts millennium and century 1 on year 1, directly after 1 BC.
>> Unfortunately decades are still reported rather simplistically by
>> dividing the year by 10.  Years 1-10 are logically the first decade and
>> working up from there, year 2014 should be counted as 202nd decade.
>>
>> I've pushed code and documentation changes to reflect this, based on the
>> master branch (9.5devel), it's on the branch new_decade_def at
>> https://github.com/chungy/postgres.git -- In both the commit message and
>> docs, I made note of the backwards compatibility change.  I don't know
>> how much of an impact this would have but I suspect not many
>> applications are really going to be affected by how decades are counted
>> (should be simple to fix on their part, if any are...).
> Floor ( Year / 10 ) = decade number feels right.  Sure, the zero decade only
> has 9 years but after that everything is easy to read.  Typical usage refers
> to decades such as the 80s and the 90s but if you start counting at 1 the 0
> year would have a mis-matched prefix.  And date truncation would be
> weird...though I haven't tested the behavior I assume it works by basically
> just dropping the year digit and replacing it with zero...that at least
> would be the desired behavior for me.
>
> Any supporting arguments for 1-10 = 1st decade other than technical
> perfection?  I guess if you use data around and before 1AD you care about
> this more, and rightly so, but given sound arguments for both methods the
> one more useful to more users who I suspect dominantly care about years >
> 1900.
>
> So -1 to change for breaking backward compatibility and -1 because the
> current behavior seems to be more useful in everyday usage.
>
> David J.
>
>
>
>
>
>
>
>
> --
> View this message in context:
http://postgresql.1045698.n5.nabble.com/Proposed-changing-the-definition-of-decade-for-date-trunc-and-extract-tp5813578p5813580.html
> Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
>
>
Since there was no year zero: then it follows that the first decade 
comprises years 1 to 10, and the current Millennium started in 2001 - or 
am I being too logical???   :-)


Cheers,
Gavin



Re: Re: Proposed changing the definition of decade for date_trunc and extract

From
David Johnston
Date:

On Fri, Aug 1, 2014 at 8:15 PM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
On 02/08/14 12:32, David G Johnston wrote:

Any supporting arguments for 1-10 = 1st decade other than technical
perfection?  I guess if you use data around and before 1AD you care about
this more, and rightly so, but given sound arguments for both methods the
one more useful to more users who I suspect dominantly care about years >
1900.

So -1 to change for breaking backward compatibility and -1 because the
current behavior seems to be more useful in everyday usage.

Since there was no year zero: then it follows that the first decade comprises years 1 to 10, and the current Millennium started in 2001 - or am I being too logical???   :-)


​This is SQL, only relational logic matters.  All other logic can be superseded by committee consensus.

IOW - and while I have no way of checking - this seems like something that may be governed by the SQL standard...in which case adherence to that would trump mathematical logic.

David J.


Re: Re: Proposed changing the definition of decade for date_trunc and extract

From
Mike Swanson
Date:
On Sat, 2014-08-02 at 15:15 +1200, Gavin Flower wrote:
> Since there was no year zero: then it follows that the first decade
> comprises years 1 to 10, and the current Millennium started in 2001 - or
> am I being too logical???   :-)


This is pretty much the reason I'm sending this patch, because it makes
mathematical sense, plus my OCD-sense tingles when Postgres handles
centuries and millenniums correctly, whereas decades are not.

I will concede if the compatibility breaks are too great, but I don't
know how many people depend on the output of this.  I didn't do any
market research :)  Besides, it seemed to me that if the other two were
able to be fixed (albeit ~10 years ago), there's little reason to avoid
fixing decade too.

There's a few definitions of a decade: * Spans of ten years that start from year 1. * Spans of ten years defined by the
second-to-the-rightdigit (years   1-9 would be in decade 0?) -- this is one of the colloquial   versions when people
referto "the (19)90s." * The other version tends to be less well-defined. "The 1960s"   usually conjures up images of
countercultureand the British   Invasion and such; debatably occurring around 1964-1972 (this   version used by culture
cannever be derived mathematically by a   database, but it might be worth putting out here). * Any span of
approximately10 years (the interval type is fine   enough for this).
 

I lack significant research but it's rare to hear people refer to
1990-1999 as the "199th century" in the same way they might refer to
1900-1999 (or 1901-2000) as the "20th century" -- and it's worth noting
that common usage for determining 20th/21st centuries generally follow
the mathematical logic of them, even if some people are off-by-one when
determining when they start and end.

I'd also argue that the current function basing the logic from
definition #2 has limited use even when you want to use it for such.
If you want to generate text for '(decades)s' you'd have to do: SELECT extract('year' from date_trunc('decade', now()))
||'s';
 
Or with my patch: SELECT floor(extract('year' from now()) / 10) || '0s';
It's different, for sure, but I would actually think the second one is
a bit less awkward.  Plus it's shorter :)





Re: Proposed changing the definition of decade for date_trunc and extract

From
Mike Swanson
Date:
On Fri, 2014-08-01 at 22:28 -0700, Mike Swanson wrote:
> I'd also argue that the current function basing the logic from
> definition #2 has limited use even when you want to use it for such.
> If you want to generate text for '(decades)s' you'd have to do:
>   SELECT extract('year' from date_trunc('decade', now())) || 's';
> Or with my patch:
>   SELECT floor(extract('year' from now()) / 10) || '0s';
> It's different, for sure, but I would actually think the second one is
> a bit less awkward.  Plus it's shorter :)

I'm responding to myself because I realized that what I wrote was a bit
silly.  The first and current example (which is invalidated by my patch)
should really be: SELECT extract('decade' from now()) || '0s';
which makes it the shorter and simpler version of the two.  I'll still
stand by my opinion that it's not an extremely useful function as it is.




Re: Proposed changing the definition of decade for date_trunc and extract

From
Bruce Momjian
Date:
On Sat, Aug  2, 2014 at 11:53:06PM -0700, Mike Swanson wrote:
> On Fri, 2014-08-01 at 22:28 -0700, Mike Swanson wrote:
> > I'd also argue that the current function basing the logic from
> > definition #2 has limited use even when you want to use it for such.
> > If you want to generate text for '(decades)s' you'd have to do:
> >   SELECT extract('year' from date_trunc('decade', now())) || 's';
> > Or with my patch:
> >   SELECT floor(extract('year' from now()) / 10) || '0s';
> > It's different, for sure, but I would actually think the second one is
> > a bit less awkward.  Plus it's shorter :)
> 
> I'm responding to myself because I realized that what I wrote was a bit
> silly.  The first and current example (which is invalidated by my patch)
> should really be:
>   SELECT extract('decade' from now()) || '0s';
> which makes it the shorter and simpler version of the two.  I'll still
> stand by my opinion that it's not an extremely useful function as it is.

Well, you can make the argument that since the 20th century is from 1901
to 2000, that decades should be from *01 to *00.  However, those
centuries are ordinal, e.g. using "th", while decades are not.  The
decade is more a short-hand, rather than something that counts from year
1 like centuries.  Looking at decades as an abbreviation supports our
current behavior.

I don't think 99% of people know there was no year 0, so I don't think
you can base a lot on that.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: Proposed changing the definition of decade for date_trunc and extract

From
Gavin Flower
Date:
On 04/08/14 01:27, Bruce Momjian wrote:
> On Sat, Aug  2, 2014 at 11:53:06PM -0700, Mike Swanson wrote:
>> On Fri, 2014-08-01 at 22:28 -0700, Mike Swanson wrote:
>>> I'd also argue that the current function basing the logic from
>>> definition #2 has limited use even when you want to use it for such.
>>> If you want to generate text for '(decades)s' you'd have to do:
>>>    SELECT extract('year' from date_trunc('decade', now())) || 's';
>>> Or with my patch:
>>>    SELECT floor(extract('year' from now()) / 10) || '0s';
>>> It's different, for sure, but I would actually think the second one is
>>> a bit less awkward.  Plus it's shorter :)
>> I'm responding to myself because I realized that what I wrote was a bit
>> silly.  The first and current example (which is invalidated by my patch)
>> should really be:
>>    SELECT extract('decade' from now()) || '0s';
>> which makes it the shorter and simpler version of the two.  I'll still
>> stand by my opinion that it's not an extremely useful function as it is.
> Well, you can make the argument that since the 20th century is from 1901
> to 2000, that decades should be from *01 to *00.  However, those
> centuries are ordinal, e.g. using "th", while decades are not.  The
> decade is more a short-hand, rather than something that counts from year
> 1 like centuries.  Looking at decades as an abbreviation supports our
> current behavior.
>
> I don't think 99% of people know there was no year 0, so I don't think
> you can base a lot on that.
>
I am not in favour of 'mushroom farming' - keeping people ignorant and 
feeding them bullshit.  So I don't think we should pander to ignorance.

So it would probably be a good idea to mention in the relevant 
documentation, that there was no Year Zero, and that 1 AD follows 
directly after 1 BC.


Cheers,
Gavin




Re: Proposed changing the definition of decade for date_trunc and extract

From
Robert Haas
Date:
On Sun, Aug 3, 2014 at 3:29 PM, Gavin Flower
<GavinFlower@archidevsys.co.nz> wrote:
> So it would probably be a good idea to mention in the relevant
> documentation, that there was no Year Zero, and that 1 AD follows directly
> after 1 BC.

Well, maybe.  By and large, the PostgreSQL documentation should
confine itself to documenting facts about PostgreSQL rather than, say,
facts about date reckoning, except to the extent that PostgreSQL does
something different from the universal norms.  I mean, we could
document the rules for leap years, too, or how many days there are in
each month, or that many people who used the Julian calendar
considered the first day of the year to be March 25th, but really,
those are things that mostly deserve to be mentioned in a treatise on
historical calendaring rather than our documentation, unless there is
some PostgreSQL-specific treatment that requires expounding on them.

On the original patch, like a few others who have spoken, I think
changing the current behavior is a bad idea.  I certainly respect
Mike's desire to get the behavior right, and to set things up in the
way that makes sense to him, but if we start changing things like this
on the basis of a small number of complaints, we will piss off a lot
of users.  There are many people out there who have thousands upon
thousands of lines of PostgreSQL code that they can't easily audit and
change, and the standard for doing things that might break that code
in subtle ways is, and should be, high.  If what we were doing today
was outright wrong, of course I'd be in favor of changing it.  But the
current behavior isn't at all unreasonable; there are two possible
definitions and we picked one.  In that kind of situation, backward
compatibility concerns carry a great deal of weight.  Even if, in a
vacuum, more users would have chosen to do it as Mike proposes vs.
what we actually picked, we'll make enough people unhappy by changing
it now to make it a poor decision for the project.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company