Copyright information in source files

Started by vignesh C, over 6 years ago, 10 messages, hackers
#1 vignesh C
vignesh21@gmail.com

Hi,

I noticed that some of the source files do not include the copyright
information. Most files include it, but a few do not. I think it
should be included. The attached patch adds the copyright information
to those source files. Let me know your thoughts.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

Attachments:

0001-Make-the-copyright-information-consistent.patch (text/x-patch; charset=US-ASCII) +460 -90
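
A rough way to find the files in question can be sketched as follows. This is only an illustration, not how the attached patch was produced: the function names, the 30-line window, the `.c`/`.h` suffix filter, and the rule that a line containing "Copyright (c)" counts as a notice are all assumptions.

```python
import re
from pathlib import Path

# Assumption: a line containing "Copyright (c)" counts as a copyright notice.
COPYRIGHT_RE = re.compile(r"Copyright \(c\)", re.IGNORECASE)


def has_copyright_notice(text, head_lines=30):
    """Return True if one of the first head_lines lines carries a notice."""
    head = text.splitlines()[:head_lines]
    return any(COPYRIGHT_RE.search(line) for line in head)


def files_missing_notice(root, suffixes=(".c", ".h")):
    """Return sorted paths under root that have no notice near the top."""
    missing = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            if not has_copyright_notice(text):
                missing.append(path)
    return missing
```

Calling `files_missing_notice("src")` from the top of a checkout would list candidate files. Makefiles and scripts would need their own suffix or name rules.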
#2 Thomas Munro
thomas.munro@gmail.com
In reply to: vignesh C (#1)
Re: Copyright information in source files

On Sun, Nov 17, 2019 at 6:36 AM vignesh C <vignesh21@gmail.com> wrote:

I noticed that some of the source files do not include the copyright
information. Most files include it, but a few do not. I think it
should be included. The attached patch adds the copyright information
to those source files. Let me know your thoughts.

I'd like to get rid of those IDENTIFICATION lines completely (they are
left over from the time when the project used CVS, and that section
had a $Header$ "ident" tag, but in the git era, those ident tags are
no longer in fashion).

There are other inconsistencies in the copyright messages, like
whether we say "Portions" or not for PGDU, and whether we use 1996- or
the year the file was created, and whether the Berkeley copyright is
there or not (different people seem to have different ideas about
whether that's needed for a post-Berkeley file).

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#2)
Re: Copyright information in source files

Thomas Munro <thomas.munro@gmail.com> writes:

I'd like to get rid of those IDENTIFICATION lines completely (they are
left over from the time when the project used CVS, and that section
had a $Header$ "ident" tag, but in the git era, those ident tags are
no longer in fashion).

I'm not for that. Arguments about CVS vs git are irrelevant: the
usefulness of those lines comes up when you've got a file that's
not in your source tree but somewhere else. It's particularly
useful for the Makefiles, which are otherwise often same-y and
hard to identify.

There are other inconsistencies in the copyright messages, like
whether we say "Portions" or not for PGDU, and whether we use 1996- or
the year the file was created, and whether the Berkeley copyright is
there or not (different people seem to have different ideas about
whether that's needed for a post-Berkeley file).

Yeah, it'd be nice to have some greater consistency there. My own
thought about it is that it's rare to have a file that's *completely*
de novo code, and can be guaranteed to stay that way --- more usually
there is some amount of copying&pasting, and then you have to wonder
how much of that material could be traced back to Berkeley. So I
prefer to err on the side of including their copyright. That line of
argument basically leads to the conclusion that all the copyright tags
should be identical, which doesn't seem like an unreasonable rule.

regards, tom lane

#4 vignesh C
vignesh21@gmail.com
In reply to: Tom Lane (#3)
Re: Copyright information in source files

On Fri, Nov 22, 2019 at 2:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

I'd like to get rid of those IDENTIFICATION lines completely (they are
left over from the time when the project used CVS, and that section
had a $Header$ "ident" tag, but in the git era, those ident tags are
no longer in fashion).

I'm not for that. Arguments about CVS vs git are irrelevant: the
usefulness of those lines comes up when you've got a file that's
not in your source tree but somewhere else. It's particularly
useful for the Makefiles, which are otherwise often same-y and
hard to identify.

There are other inconsistencies in the copyright messages, like
whether we say "Portions" or not for PGDU, and whether we use 1996- or
the year the file was created, and whether the Berkeley copyright is
there or not (different people seem to have different ideas about
whether that's needed for a post-Berkeley file).

Yeah, it'd be nice to have some greater consistency there. My own
thought about it is that it's rare to have a file that's *completely*
de novo code, and can be guaranteed to stay that way --- more usually
there is some amount of copying&pasting, and then you have to wonder
how much of that material could be traced back to Berkeley. So I
prefer to err on the side of including their copyright. That line of
argument basically leads to the conclusion that all the copyright tags
should be identical, which doesn't seem like an unreasonable rule.

I have seen that most files use the format below:

/*-------------------------------------------------------------------------
 * relation.c
 *    PostgreSQL logical replication
 *
 * Copyright (c) 2016-2019, PostgreSQL Global Development Group
 *
 * IDENTIFICATION
 *    src/backend/replication/logical/relation.c
 *
 * NOTES
 *    This file contains helper functions for logical replication relation
 *    mapping cache.
 *
 *-------------------------------------------------------------------------
 */

Can we use the above format as the standard format?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
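
To make the proposed layout concrete, here is a minimal sketch that renders a header in the format quoted above from a file's path and description. The function name `make_header`, its parameters, the four-space indentation, and the rule length are assumptions for illustration. It covers only the PGDG-only variant shown in this message, not the variant with the Berkeley copyright line discussed earlier.

```python
RULE = "-" * 73    # assumed rule length
INDENT = "    "    # assumed indentation for continuation lines


def make_header(rel_path, summary, first_year, last_year, notes=None):
    """Render a file header comment in the layout quoted above."""
    name = rel_path.rsplit("/", 1)[-1]
    years = str(first_year) if first_year == last_year else f"{first_year}-{last_year}"
    lines = [
        "/*" + RULE,
        f" * {name}",
        f" *{INDENT}{summary}",
        " *",
        f" * Copyright (c) {years}, PostgreSQL Global Development Group",
        " *",
        " * IDENTIFICATION",
        f" *{INDENT}{rel_path}",
    ]
    if notes:
        # Optional NOTES section, one comment line per note line.
        lines += [" *", " * NOTES"] + [f" *{INDENT}{n}" for n in notes]
    lines += [" *", " *" + RULE, " */"]
    return "\n".join(lines) + "\n"
```

For example, `make_header("src/backend/replication/logical/relation.c", "PostgreSQL logical replication", 2016, 2019)` yields the header above without the NOTES section.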

#5 John Naylor
john.naylor@enterprisedb.com
In reply to: vignesh C (#4)
Re: Copyright information in source files

On Sat, Nov 23, 2019 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:

* Copyright (c) 2016-2019, PostgreSQL Global Development Group

While we're talking about copyrights, I noticed while researching
something else that the PHP project recently got rid of all the
copyright years from their files, which is one less thing to update
and one less cause of noise in the change log for rarely-changed
files. Is there actually a good reason to update the year?

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6 vignesh C
vignesh21@gmail.com
In reply to: John Naylor (#5)
Re: Copyright information in source files

On Sun, Nov 24, 2019 at 7:24 AM John Naylor <john.naylor@2ndquadrant.com> wrote:

On Sat, Nov 23, 2019 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:

* Copyright (c) 2016-2019, PostgreSQL Global Development Group

While we're talking about copyrights, I noticed while researching
something else that the PHP project recently got rid of all the
copyright years from their files, which is one less thing to update
and one less cause of noise in the change log for rarely-changed
files. Is there actually a good reason to update the year?

That idea sounds good to me; it would also mean there is no need to
update the year every year. Alternatively, could we use "current" to
indicate the latest year, something like:
* Copyright (c) 2016-current, PostgreSQL Global Development Group

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
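
For context, the yearly update being discussed amounts to a rewrite like the one sketched below. This is an illustrative sketch, not the project's actual tooling: the function name `bump_copyright_year` and the exact pattern are assumptions, and it deliberately touches only "PostgreSQL Global Development Group" lines.

```python
import re

# Matches "Copyright (c) YYYY" or "Copyright (c) YYYY-YYYY" on PGDG lines only.
PGDG_RE = re.compile(
    r"(Copyright \(c\) )(\d{4})(?:-\d{4})?(, PostgreSQL Global Development Group)"
)


def bump_copyright_year(line, year):
    """Return line with the PGDG copyright range extended to end at year."""

    def repl(match):
        start = match.group(2)
        # Keep a single year when the file was created in the target year.
        span = start if int(start) >= year else f"{start}-{year}"
        return f"{match.group(1)}{span}{match.group(3)}"

    return PGDG_RE.sub(repl, line)
```

Dropping the years entirely, or writing "2016-current", would make a step like this unnecessary. That is the reduction in change-log noise being discussed.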

#7 Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#3)
Re: Copyright information in source files

On Thu, Nov 21, 2019 at 03:42:26PM -0500, Tom Lane wrote:

Yeah, it'd be nice to have some greater consistency there. My own
thought about it is that it's rare to have a file that's *completely*
de novo code, and can be guaranteed to stay that way --- more usually
there is some amount of copying&pasting, and then you have to wonder
how much of that material could be traced back to Berkeley. So I
prefer to err on the side of including their copyright. That line of
argument basically leads to the conclusion that all the copyright tags
should be identical, which doesn't seem like an unreasonable rule.

Agreed. Doing that is also a no-brainer when adding new files into
the tree or for your own, separate, modules and that's FWIW the way of
doing things I tend to follow.
--
Michael

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Naylor (#5)
Re: Copyright information in source files

John Naylor <john.naylor@2ndquadrant.com> writes:

On Sat, Nov 23, 2019 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:

* Copyright (c) 2016-2019, PostgreSQL Global Development Group

While we're talking about copyrights, I noticed while researching
something else that the PHP project recently got rid of all the
copyright years from their files, which is one less thing to update
and one less cause of noise in the change log for rarely-changed
files. Is there actually a good reason to update the year?

Good question.

I was wondering about something even simpler: is there a reason to
have per-file copyright notices at all? Why isn't it good enough
to have one copyright notice at the top of the tree?

Actual legal advice might be a good thing to have here ...

regards, tom lane

#9 Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tom Lane (#8)
Re: Copyright information in source files

Hello Tom,

While we're talking about copyrights, I noticed while researching
something else that the PHP project recently got rid of all the
copyright years from their files, which is one less thing to update and
one less cause of noise in the change log for rarely-changed files. Is
there actually a good reason to update the year?

Good question.

I was wondering about something even simpler: is there a reason to have
per-file copyright notices at all? Why isn't it good enough to have one
copyright notice at the top of the tree?

Actual legal advice might be a good thing to have here ...

I have no legal skills, but I (well Google really:-) found this:

https://softwarefreedom.org/resources/2012/ManagingCopyrightInformation.html

"Contrary to popular belief, copyright notices aren't required to secure
copyright."

There is a section about "Comparing two systems: file-scope and
centralized notices" which is probably what you are looking for.

The "file-scope" approach suggests that each dev should add their own
notice on each significant change. This is not what pg does, and it does
not look very practical. It looks as if the copyright notice is being
used as a VCS.

Then there is some stuff about distributed VCS, but pg really uses git as
a centralized VCS: when a patch is submitted, it is really applied by
someone but not merged into the code from an external source. The good
news is that git comments include the contributor identification, to some
extent.

Then there is the centralized approach, which seems to require only a
per-file "pointer" to the license. Maybe pg should do that, which would
strip a large part of the repeated copyright headers.

--
Fabien.

#10 vignesh C
vignesh21@gmail.com
In reply to: Tom Lane (#8)
Re: Copyright information in source files

On Sun, Nov 24, 2019 at 8:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

John Naylor <john.naylor@2ndquadrant.com> writes:

On Sat, Nov 23, 2019 at 11:39 PM vignesh C <vignesh21@gmail.com> wrote:

* Copyright (c) 2016-2019, PostgreSQL Global Development Group

While we're talking about copyrights, I noticed while researching
something else that the PHP project recently got rid of all the
copyright years from their files, which is one less thing to update
and one less cause of noise in the change log for rarely-changed
files. Is there actually a good reason to update the year?

Good question.

I was wondering about something even simpler: is there a reason to
have per-file copyright notices at all? Why isn't it good enough
to have one copyright notice at the top of the tree?

Actual legal advice might be a good thing to have here ...

+1 for having a single copyright notice at the top of the tree.
What about the file header: should we have anything at all?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com