Thread: Copyright information in source files

Copyright information in source files

From
vignesh C
Date:
Hi,

I noticed that some of the source files do not include the copyright
information. Most of the files include it, but a few do not. I think
it should be included. The attached patch adds the copyright
information to those source files. Let me know your thoughts.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: Copyright information in source files

From
Thomas Munro
Date:
On Sun, Nov 17, 2019 at 6:36 AM vignesh C <vignesh21@gmail.com> wrote:
> I noticed that some of the source files do not include the copyright
> information. Most of the files include it, but a few do not. I think
> it should be included. The attached patch adds the copyright
> information to those source files. Let me know your thoughts.

I'd like to get rid of those IDENTIFICATION lines completely (they are
left over from the time when the project used CVS, and that section
had a $Header$ "ident" tag, but in the git era, those ident tags are
no longer in fashion).

There are other inconsistencies in the copyright messages, like
whether we say "Portions" or not for PGDG, and whether we use 1996- or
the year the file was created, and whether the Berkeley copyright is
there or not (different people seem to have different ideas about
whether that's needed for a post-Berkeley file).



Re: Copyright information in source files

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> I'd like to get rid of those IDENTIFICATION lines completely (they are
> left over from the time when the project used CVS, and that section
> had a $Header$ "ident" tag, but in the git era, those ident tags are
> no longer in fashion).

I'm not for that.  Arguments about CVS vs git are irrelevant: the
usefulness of those lines comes up when you've got a file that's
not in your source tree but somewhere else.  It's particularly
useful for the Makefiles, which are otherwise often same-y and
hard to identify.

> There are other inconsistencies in the copyright messages, like
> whether we say "Portions" or not for PGDG, and whether we use 1996- or
> the year the file was created, and whether the Berkeley copyright is
> there or not (different people seem to have different ideas about
> whether that's needed for a post-Berkeley file).

Yeah, it'd be nice to have some greater consistency there.  My own
thought about it is that it's rare to have a file that's *completely*
de novo code, and can be guaranteed to stay that way --- more usually
there is some amount of copying&pasting, and then you have to wonder
how much of that material could be traced back to Berkeley.  So I
prefer to err on the side of including their copyright.  That line of
argument basically leads to the conclusion that all the copyright tags
should be identical, which doesn't seem like an unreasonable rule.

            regards, tom lane



Re: Copyright information in source files

From
vignesh C
Date:
On Fri, Nov 22, 2019 at 2:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Thomas Munro <thomas.munro@gmail.com> writes:
> > I'd like to get rid of those IDENTIFICATION lines completely (they are
> > left over from the time when the project used CVS, and that section
> > had a $Header$ "ident" tag, but in the git era, those ident tags are
> > no longer in fashion).
>
> I'm not for that.  Arguments about CVS vs git are irrelevant: the
> usefulness of those lines comes up when you've got a file that's
> not in your source tree but somewhere else.  It's particularly
> useful for the Makefiles, which are otherwise often same-y and
> hard to identify.
>
> > There are other inconsistencies in the copyright messages, like
> > whether we say "Portions" or not for PGDG, and whether we use 1996- or
> > the year the file was created, and whether the Berkeley copyright is
> > there or not (different people seem to have different ideas about
> > whether that's needed for a post-Berkeley file).
>
> Yeah, it'd be nice to have some greater consistency there.  My own
> thought about it is that it's rare to have a file that's *completely*
> de novo code, and can be guaranteed to stay that way --- more usually
> there is some amount of copying&pasting, and then you have to wonder
> how much of that material could be traced back to Berkeley.  So I
> prefer to err on the side of including their copyright.  That line of
> argument basically leads to the conclusion that all the copyright tags
> should be identical, which doesn't seem like an unreasonable rule.
>

I had seen that most files use the below format:
/*-------------------------------------------------------------------------
 * relation.c
 *       PostgreSQL logical replication
 *
 * Copyright (c) 2016-2019, PostgreSQL Global Development Group
 *
 * IDENTIFICATION
 *      src/backend/replication/logical/relation.c
 *
 * NOTES
 *      This file contains helper functions for logical replication relation
 *      mapping cache.
 *
 *-------------------------------------------------------------------------
 */

Can we use the above format as a standard format?
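
A minimal sketch of how files could be checked against this format,
assuming the notice sits within the first 30 lines of a file and follows
the wording shown above. The function name, the regular expression, and
the 30-line limit are illustrative assumptions, not an existing project
tool:

```python
import re

# Assumed pattern: matches "Copyright (c) ..." and "Portions Copyright (c) ...",
# with either a single year or a first-last year range.
PGDG_NOTICE = re.compile(
    r"Copyright \(c\) (\d{4})(?:-(\d{4}))?, PostgreSQL Global Development Group"
)


def find_pgdg_notice(source_text, max_lines=30):
    """Return (first_year, last_year) from the notice near the top, or None."""
    # Only inspect the head of the file, where the header comment lives.
    head = "\n".join(source_text.splitlines()[:max_lines])
    match = PGDG_NOTICE.search(head)
    if match is None:
        return None
    first = int(match.group(1))
    # A single-year notice is treated as a range of one year.
    last = int(match.group(2)) if match.group(2) else first
    return first, last


# Sample header, abbreviated from the format quoted above.
SAMPLE = """/*-----
 * relation.c
 *       PostgreSQL logical replication
 *
 * Copyright (c) 2016-2019, PostgreSQL Global Development Group
 *
 * IDENTIFICATION
 *      src/backend/replication/logical/relation.c
 */
"""

if __name__ == "__main__":
    print(find_pgdg_notice(SAMPLE))
    print(find_pgdg_notice("/* no notice here */\n"))
```

Files for which the function returns None would be the candidates for
adding the header.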

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



Re: Copyright information in source files

From
John Naylor
Date:
On Sat, Nov 23, 2019 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:

>  * Copyright (c) 2016-2019, PostgreSQL Global Development Group

While we're talking about copyrights, I noticed while researching
something else that the PHP project recently got rid of all the
copyright years from their files, which is one less thing to update
and one less cause of noise in the change log for rarely-changed
files. Is there actually a good reason to update the year?

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Copyright information in source files

From
vignesh C
Date:
On Sun, Nov 24, 2019 at 7:24 AM John Naylor <john.naylor@2ndquadrant.com> wrote:
>
> On Sat, Nov 23, 2019 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> >  * Copyright (c) 2016-2019, PostgreSQL Global Development Group
>
> While we're talking about copyrights, I noticed while researching
> something else that the PHP project recently got rid of all the
> copyright years from their files, which is one less thing to update
> and one less cause of noise in the change log for rarely-changed
> files. Is there actually a good reason to update the year?
>

That idea sounds good to me; that way there is no need to update the
year every year. Alternatively, could we use "current" to indicate the
latest year, something like:
* Copyright (c) 2016-current, PostgreSQL Global Development Group

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



Re: Copyright information in source files

From
Michael Paquier
Date:
On Thu, Nov 21, 2019 at 03:42:26PM -0500, Tom Lane wrote:
> Yeah, it'd be nice to have some greater consistency there.  My own
> thought about it is that it's rare to have a file that's *completely*
> de novo code, and can be guaranteed to stay that way --- more usually
> there is some amount of copying&pasting, and then you have to wonder
> how much of that material could be traced back to Berkeley.  So I
> prefer to err on the side of including their copyright.  That line of
> argument basically leads to the conclusion that all the copyright tags
> should be identical, which doesn't seem like an unreasonable rule.

Agreed.  Doing that is also a no-brainer when adding new files into
the tree or for your own, separate, modules and that's FWIW the way of
doing things I tend to follow.
--
Michael

Re: Copyright information in source files

From
Tom Lane
Date:
John Naylor <john.naylor@2ndquadrant.com> writes:
> On Sat, Nov 23, 2019 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:
>> * Copyright (c) 2016-2019, PostgreSQL Global Development Group

> While we're talking about copyrights, I noticed while researching
> something else that the PHP project recently got rid of all the
> copyright years from their files, which is one less thing to update
> and one less cause of noise in the change log for rarely-changed
> files. Is there actually a good reason to update the year?

Good question.

I was wondering about something even simpler: is there a reason to
have per-file copyright notices at all?  Why isn't it good enough
to have one copyright notice at the top of the tree?

Actual legal advice might be a good thing to have here ...

            regards, tom lane



Re: Copyright information in source files

From
Fabien COELHO
Date:
Hello Tom,

>> While we're talking about copyrights, I noticed while researching 
>> something else that the PHP project recently got rid of all the 
>> copyright years from their files, which is one less thing to update and 
>> one less cause of noise in the change log for rarely-changed files. Is 
>> there actually a good reason to update the year?
>
> Good question.
>
> I was wondering about something even simpler: is there a reason to have 
> per-file copyright notices at all?  Why isn't it good enough to have one 
> copyright notice at the top of the tree?
>
> Actual legal advice might be a good thing to have here ...

I have no legal skills, but I (well Google really:-) found this:

https://softwarefreedom.org/resources/2012/ManagingCopyrightInformation.html

"Contrary to popular belief, copyright notices aren’t required to secure 
copyright."

There is a section about "Comparing two systems: file-scope and 
centralized notices" which is probably what you are looking for.

The "file-scope" approach suggests that each dev should add their own notice
on each significant change. This is not what pg does and does not look too
practical. It looks as if the copyright notice is being used as a VCS.

Then there is some stuff about distributed VCS, but pg really uses git as 
a centralized VCS: when a patch is submitted, it is really applied by 
someone but not merged into the code from an external source. The good 
news is that git comments include the contributor identification, to some 
extent.

Then there is the centralized approach, which seems to require just a
per-file "pointer" to the license. Maybe pg should do that, which would
strip a large part of the repeated copyright headers.

-- 
Fabien.

Re: Copyright information in source files

From
vignesh C
Date:
On Sun, Nov 24, 2019 at 8:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> John Naylor <john.naylor@2ndquadrant.com> writes:
> > On Sat, Nov 23, 2019 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:
> >> * Copyright (c) 2016-2019, PostgreSQL Global Development Group
>
> > While we're talking about copyrights, I noticed while researching
> > something else that the PHP project recently got rid of all the
> > copyright years from their files, which is one less thing to update
> > and one less cause of noise in the change log for rarely-changed
> > files. Is there actually a good reason to update the year?
>
> Good question.
>
> I was wondering about something even simpler: is there a reason to
> have per-file copyright notices at all?  Why isn't it good enough
> to have one copyright notice at the top of the tree?
>
> Actual legal advice might be a good thing to have here ...

+1 for having a single copyright notice at the top of the tree.
What about the file header: should we have anything at all?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com