remove recommendation re: standard license headers
J Lovejoy
Hi all,
We have some text at the bottom of this page https://spdx.dev/ids/ regarding the use of SPDX ids related to a recommendation about using and retaining standard headers when using/adding an SPDX id in source code. If memory serves, we wrote this at the time when use of SPDX ids in source code was a very new thing. We didn't know if some license stewards might have discomfort with the use of SPDX ids *instead* of their suggested standard license header, and thus felt the need to take a sort of conservative approach.
Now that SPDX ids are used more widely and we know a bit more about how scanning tools identify license headers in total - I think we can remove this section altogether. I don't think SPDX needs to make a statement either way and projects can make their own call, as we've seen with the Linux kernel and other projects.
Thoughts?
Jilayne
|
Warner Losh
On Mon, Oct 25, 2021 at 9:43 AM J Lovejoy <opensource@...> wrote:
I've been grappling with this in the FreeBSD project. I'll share my perspective.
There are two parts to that advice. The first is to include the standard boilerplate text to invoke the license ("the standard header," though that phrase means something different in my world, so it should be eliminated for that reason alone). I think we can toss that. This project found dozens (hundreds) of variations in the prescribed text from the FSF GPL, suggesting that the suggested text is more of a suggestion than a requirement.
The suggestion of not removing the boilerplate text for a license is tricky. There's a lot of inertia and received wisdom that one must never do this (since often the text includes statements that it must be retained). With SPDX, though, the text is substantially reproduced, in durable form, by a 3rd party, and the reference to that third party's copy could be construed to be reproducing the text (in fact, this notion seems like a bedrock SPDX principal axiom: giving a pointer to the license is just as good as reproducing the whole license). There's much consternation in the FreeBSD project, nonetheless, with wholesale removal of these standard license texts, because the variations or slight word changes mean we're not reproducing the conditions exactly, and that delta may put us out of license compliance. It's an open question for the chat I hope to have with a competent attorney before the project finalizes its policies towards SPDX.
So removing the advice not to remove the license text is fine, imho, since that's legal advice for what constitutes compliance (imho). Replacing it with text that says it's OK or always OK, though, would not be cool, imho. Though having that there might encourage others to adopt the SPDX-only policies that have become widespread but not universal.
Does that help?
Warner
|
|
J Lovejoy
Thanks Warner!
On 10/25/21 3:35 PM, Warner Losh wrote:
"standard header" is narrowly defined in the context of the SPDX License List - https://github.com/spdx/license-list-XML/blob/master/DOCS/license-fields.md I am not sure where the FSF stands re: suggestion v. requirement, but the reality of lots of variations in the wild even when there is a specific standard header text provided by the license steward leans heavily towards the advantage of simply using SPDX identifiers, IMHO :) Worth noting that the steward of one of the other well known licenses with standard header agrees - http://www.apache.org/foundation/license-faq.html#Apply-My-Software Totally agree. We felt the need to say something in the beginning, but given time and how things have played out in reality - I really think this is up to the project to make a determination (like the kernel did and Uboot did, etc.) and with their own attorney's advice as they see fit. Thus, better for us to remove any such specific advice.
|
|
Dear Jilayne,
Now that SPDX ids are used more widely and we know a bit more about ...
I'd completely agree with your appraisal here. Personally, I prefer to use just the SPDX license headers. I imagine that, in some cases, having both could be confusing - for example, if someone copied the standard license header for the GPLv3 "or (at your option) any later version", but also wrote SPDX-License-Identifier: GPL-3.0-only at the top of the file.
Of course, if anyone does want to use both, the standard license header text will still be on the SPDX License List website.
Best wishes,
Sebastian
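The mismatch described in this message (a "-only" identifier next to an "any later version" grant) can be detected mechanically. The following is a minimal sketch in Python, not something proposed in the thread: the function name `find_conflict` and both regular expressions are assumptions made for illustration, and the check only covers this one kind of conflict.

```python
import re

# Captures everything after the tag up to the end of the line (hypothetical pattern).
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")
# Phrase used in the GPL-family header text to grant later versions.
LATER_VERSION = re.compile(r"any\s+later\s+version", re.IGNORECASE)


def find_conflict(source_text):
    """Return a description of a tag/header mismatch, or None if none is detected."""
    match = SPDX_TAG.search(source_text)
    if match is None:
        return None  # no SPDX tag, so nothing to compare against
    # Drop a trailing C-style comment closer such as " */" if present.
    expression = re.sub(r"\s*\*/\s*$", "", match.group(1).strip())
    grants_later = LATER_VERSION.search(source_text) is not None
    if expression.endswith("-only") and grants_later:
        return f"{expression} conflicts with an 'any later version' grant"
    return None


if __name__ == "__main__":
    sample = (
        "// SPDX-License-Identifier: GPL-3.0-only\n"
        "// ... either version 3 of the License, or (at your option)"
        " any later version.\n"
    )
    print(find_conflict(sample))
```

A real scanner would need to handle header text wrapped across comment lines and full SPDX license expressions; this sketch deliberately ignores both.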
|
Warner Losh
On Sun, Nov 14, 2021 at 1:41 PM Sebastian Crane <seabass-labrax@...> wrote:
Dear Jilayne, ...
Perhaps we should recommend that any policy about the license marking of files should address this. FreeBSD's policy will likely state that the actual boiler plate license text in the file is controlling when both are present and the SPDX-License-Identifier doesn't match the prose grant.
Warner
|
On Sun, Nov 14, 2021 at 9:35 PM Warner Losh <imp@...> wrote:
I'd personally rather we didn't even make the *appearance* of a recommendation that SPDX-License-Identifiers are suitable replacements for standard license headers. Especially with licenses that declare *how* you're supposed to leverage a license for your software, this can be highly problematic.
My personal feeling is that everyone who uses SPDX-License-Identifier as a replacement for proper license headers is doing a disservice to themselves, the community at large, and everyone who uses and consumes that code. When code travels (e.g. Linux drm/ -> FreeBSD), it's super-easy for compliance and understanding to be missed because you've gutted the important information from the code itself. This also makes it difficult for the spirit and intent of licenses to be conveyed because you're reducing them to something that they're not: some checkbox somewhere. Moreover, you've effectively eliminated how people learn about the licenses the code uses.
Older licenses like BSD and MIT flavors are designed to be short enough to be embedded in the source. Newer licenses like MPL, GPL, and ASL are all too large for that, so these licenses have a preferred method of indicating that code follows those terms. Not following those adds too much ambiguity and weakens the importance of conveying the *intent* and *spirit* of these licenses.
If we were to have any recommendation, I would say the SPDX-License-Identifier is a machine-parseable supplement to the standard header, not a replacement. This is also how my workplace uses them.
--
真実はいつも一つ!/ Always, there's only one truth!
|
Warner Losh
On Mon, Nov 22, 2021 at 8:42 PM Neal Gompa <ngompa13@...> wrote:
When code travels (e.g. Linux drm/ -> FreeBSD),
There's no Linux drm code in FreeBSD proper. Certainly none in the new style with only SPDX-License-Identifier: tags (there are a few stragglers from some ancient drm implementation used only on arm, in code that wasn't really from Linux).
I also checked the side project that provides packages for FreeBSD that has the drm code in it. It's a mix of full copies and several files from Intel tagged with an MIT license, which makes the intent clear enough.
So could you elaborate a bit more on the problems and compliance issues? I'm not following what these issues might be for this side project, but would like to understand, because I've been advising them for almost 4 years and I've not heard even a whisper of there being even the slightest issue until the call last week and now this email.
Warner
|
Richard Purdie
On Mon, 2021-11-22 at 22:41 -0500, Neal Gompa wrote:
I'd personally rather we didn't even make the *appearance* of a ...
If this was attempted some number of years ago, I'm not sure it would have been appropriate, but things evolve. Through the efforts of SPDX and others, I think it is now very clear what these identifiers mean and how they can be used.
It makes the situation so much clearer to have a definitive short statement rather than multiple copies of license text which are often subtly different from each other, or where people have avoided any license text at all as it was too verbose/painful. I say this as someone who helped add the original license fields to openembedded, trawling through tons of source code where it was often unclear and ambiguous what license things were under.
I'd strongly disagree it is a disservice and stand by the decision to tidy up code headers in various projects, some of which I've helped with. Yes, you do need to be careful in changing things, but the resulting readability and usability improvements are very much worthwhile.
Cheers,
Richard
|
Warner Losh
On Tue, Nov 23, 2021 at 3:47 AM Richard Purdie <richard.purdie@...> wrote:
On Mon, 2021-11-22 at 22:41 -0500, Neal Gompa wrote:
I'll point out that the variations are an enormous pain in the ass for FreeBSD and create more uncertainty and compliance issues, not less. If I don't reproduce every single license in the tree, verbatim, is that a material breach of the license? Is the 'voices in Bill Paul's head' evidence of insanity of Bill Paul, thus making his grant of license improper because insane people can't enter into legal agreements? All of this is with the standard 'boiler plate' language.
I've also studied Unix history and noticed something interesting. In all CSRG's code inside of SCCS, they had something like %License% for all the files, to be replaced on release automatically. Even CSRG didn't want to slavishly copy the license text around, but used that hack to impose uniformity without burdening the CSRG staff. :)
Warner
|
On Tue, Nov 23, 2021 at 11:12 AM Warner Losh <imp@...> wrote:
Well, insanity question aside (because at some level, all of our sanity will need to be questioned because we deal with this ;) ), if you don't reproduce them (variations and all), you risk breaching the licenses. Because those notices in the headers are an expression of intent in themselves.
I've also studied Unix history and noticed something interesting. In all CSRG's code ...
And those notices were carried *everywhere* that code was copied. :)
Because the truth is, those notices *need* to be reproduced when informing people of the code *at the minimum*. No notice means the licensing doesn't exist for most people (including lawyers I've talked to over beverages before...).
--
真実はいつも一つ!/ Always, there's only one truth!
|
J Lovejoy
top-posting as I'm not sure I can keep up
with the various comments, but a bit of background:
- the first idea of using the identifiers in source files from outside the main SPDX community came by way of a developer in Germany who wrote a blog post about it and sent me the link (I still don't really know who this person was) - that was 2011
- I believe U-boot was the first project that began actually using them - I'd guess that was shortly thereafter, 2012, say. U-boot made the decision to remove the GPL "standard header" and just use the SPDX identifiers, which seemed rather daring at the time, but this was their choice. Since then, many more projects have adopted this manner of communicating the license. The SPDX project does not need to provide recommendations either way - projects will make their own determination, as they should.
- For anyone who has done scans and audits of source code to determine the license - this is absolutely helpful. (and if you haven't done scans and audits of source code to determine licensing - count yourself as lucky!)
There are really very few licenses that provide a "standard header" - meaning a delineated instruction of text to include in the source file. (Apache-2.0 and the L/GPL family being the most used/common). Even in the case of the short licenses (e.g., BSD, MIT) that are assumed to have the full text in the file - I have heard complaints from developers that this is wasted text/space, and I have seen *plenty* of "short hand" license notices that were not even clear, for example, "this file is licensed under a BSD-style license" <groan>
From a broader legal perspective - consider that the license (or agreement, more generally) often does not directly accompany the things (software) you are consuming. Quite often there is a reference (express or even implied) and the actual text is elsewhere. That's fine and it's no different here.
J.
|
On Tue, Nov 23, 2021 at 11:12 AM Warner Losh <imp@...> wrote:
I'll just add that with the Linux kernel I was stunned when Kate Stewart and a few others analyzed how many unintentional minor variations there were in the "standard" GPLv2 header in just the kernel project. And similar to Warner's comment, it creates more compliance issues and uncertainty - definitely not less. The number of "Can you confirm...?" requests from lawyers or others in the industry about Linux kernel license information has dropped from dozens per year to zero. In most cases, each request we fielded probably had multiple people, and hours of internal debate among knowledgeable people within an organization, before they came to us.
I know another person who analyzed the number of variations of the "standard" GPLv2 header on the FSF's own website and materials. They found in excess of 500 unique variations. None of these variations are indications of an author's intent. They are copy/paste errors or oversights that are perpetually propagated without realizing it.
I'd also just remind everyone that the source tree of a git project retains the historical information. If you remove the text, the git history is still available if anyone wants to go back and look at the original variations.
One other point I remind many of is to not remove copyright notices in the process. For reference, see 17 U.S. Code § 1202 "Integrity of copyright management information".
Back to Jilayne's original question, I don't see anything on the website that says to retain original headers (maybe it's already been removed?), but if there was I'd support removing it. If a project decides they want to retain them, that's fine, but I don't see why the SPDX community would need to provide any particular guidance one way or another. The page Jilayne cited does include the reminder not to remove Copyright notices, which I think makes sense to keep there.
Mike
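For readers wondering how such a count of header variations might be produced, here is a minimal Python sketch. It is not the method used for the kernel or FSF analyses mentioned above, which the thread does not describe; the helper names `normalize` and `count_variations`, and the choice to ignore comment markers, whitespace and case, are all assumptions for illustration.

```python
import hashlib
import re
from collections import Counter

# Leading comment markers to ignore on each line (hypothetical, not exhaustive).
COMMENT_PREFIX = re.compile(r"^\s*(?:/\*+|\*+/|\*|//|#)\s?")


def normalize(header):
    """Strip comment markers, collapse whitespace and lowercase the text,
    so that only differences in wording remain."""
    lines = [COMMENT_PREFIX.sub("", line) for line in header.splitlines()]
    return " ".join(" ".join(lines).split()).lower()


def count_variations(headers):
    """Count distinct header wordings among the given header strings."""
    digests = Counter(
        hashlib.sha256(normalize(h).encode("utf-8")).hexdigest() for h in headers
    )
    return len(digests)


if __name__ == "__main__":
    headers = [
        " * This program is free software; you can redistribute it\n * and/or modify it",
        "# This program is free software;  you can redistribute it and/or modify it",
        "# This program is free software; you may redistribute it and/or modify it",
    ]
    print(count_variations(headers))
```

The first two sample headers differ only in comment style and spacing, so they collapse to one wording; the third changes "can" to "may" and counts as a separate variation.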
|
Phil Odence <phil.odence@...>
Mike, this is really interesting input and provides great perspective. Thank you.
When we first started advocating SPDX headers in files, we were concerned that there would be a backlash against using them instead of standard headers and felt, therefore, that we could not be silent. These many years later, with the use well established, I'm in agreement with advocating the use of SPDX headers and leaving it up to each project what else to include in the file.
|
Karen Sandler
I'm a bit confused by the discussion here.
We know that the licenses require that license information and notices be retained, as Mike Dolan pointed out. (In GPL-2.0 this text is "keep intact all the notices that refer to this License"). We also know that non-copyright-holders trim down that required information and encapsulate information in SPDX identifiers that may not be complete, no matter how well intentioned they are.
While the Git repos may have the full historic information in them, we have to be realistic: companies usually don't distribute a full Git repository of every project they use as part of standard license compliance practices (as much as I see that there are benefits for software freedom were they to choose to do so).
It seems to me that keeping the recommendation is the safer course of action here and respectful of the licenses that ask for this information to be retained. I recall FSF had some strong opinions about this, and they literally make the same recommendation in the "How to Apply" section of GPL-2.0... maybe someone from FSF can participate in this discussion and share their opinion now?
Karen M. Sandler
Executive Director, Software Freedom Conservancy
she/hers
__________
Become a Sustainer today! http://sfconservancy.org/sustainer/
On 2021-12-01 07:31, Phil Odence via lists.spdx.org wrote:
Mike, this is really interesting input and provides great perspective.
|
Karen said:
We also know that non-copyright-holders trim down that required information and encapsulate information in SPDX identifiers that may not be complete, no matter how well intentioned they are.
Perhaps I misunderstood the thrust of the conversation, but I was assuming that the recommendations under discussion were intended for use *by* copyright holders. In other words, if it's my code, I get to decide whether to include the full license text or just the SPDX Identifier, but if it's your code, you get to decide that, not me. I was expecting the recommendation to be intended for copyright holders looking for guidance, in this ever-changing world.
That said, I recognise projects with multiple contributors are a different animal: removal of a license text from a contributed file by a project maintainer who is not the original contributor is potentially more problematic (IANAL), except possibly where the contribution has been made after accepting a CLA that indicates project-specific rules for license texts and SPDX Identifiers.
steve
|
Max Mehl
~ Steve Kilbane [2021-12-02 10:16 +0100]:
That said, I recognise projects with multiple contributors are a ...
I vaguely recall a discussion on how the Linux kernel project should deal with this if they want to replace the copyright notices (in all their variety, as Mike pointed out) with SPDX license identifiers.
One idea was to move these notices to a separate file to archive them instead of deleting them, or leaving them in the Git tree (which is, as Karen pointed out, not easily transferable). Would that be a compromise?
Best,
Max
--
Max Mehl - Programme Manager - Free Software Foundation Europe
Contact and information: https://fsfe.org/about/mehl | @mxmehl
Become a supporter of software freedom: https://fsfe.org/join
|
J Lovejoy
Jilayne
|
|
Steve Winslow
On Thu, Dec 2, 2021 at 12:18 PM J Lovejoy <opensource@...> wrote:
Jilayne, I agree with this. Projects, organizations, license stewards, etc. may all have recommendations about whether or how standard license headers should or shouldn't be used for particular licenses, and that's great. From the SPDX project's perspective, I'm +1 on simply not making a recommendation either way. Steve |
|
Dear Karen,
We know that the licenses require that license information and notices be ...
Indeed, the full text of the license should always be available when using SPDX License Identifiers. The suggestion (which we've now removed) to include the licenses' typical header information as well as the SPDX License Identifiers was guidance for copyright holders to declare the licensing that they released the code under.
There's no suggestion from SPDX that non-copyright-holders should remove any copyright or license notices from code that they receive or redistribute, be they typical license headers or SPDX License Identifiers. Thus, the point that the discussion here started around was merely about the preference of copyright holders of how to include license notices. We used to try to influence this, but have now decided not to any longer and to let copyright holders make their own decisions.
The REUSE approach, as well as what Warner is doing with FreeBSD, is to have a canonical license text in a particular directory in the source code, and to use only the SPDX License Identifiers in the code files themselves. This makes it less likely that unintentional license variations will be added to source code files, and makes it easier for downstream users to catalogue all the applicable licenses. All users need to do is look in a single directory - very convenient! :)
I hope I've managed to clear up any confusion you have, but feel free to respond if something doesn't quite make sense.
Best wishes,
Sebastian
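The single-directory layout described above lends itself to a simple consistency check: every identifier used in a code file should have a matching license text in that directory. The following is a minimal Python sketch under stated assumptions, not an official tool: it assumes the directory is called `LICENSES` and that each text is stored as `<identifier>.txt`, and the function names `referenced_ids` and `missing_license_texts` are made up for illustration. As a simplification, every token in an expression other than AND, OR and WITH is treated as an identifier that needs its own text file.

```python
import re
from pathlib import Path

SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")
OPERATORS = {"AND", "OR", "WITH"}


def referenced_ids(text):
    """Collect identifiers from every SPDX tag line in the given text."""
    ids = set()
    for expression in SPDX_TAG.findall(text):
        for token in re.split(r"[\s()]+", expression):
            token = token.rstrip("*/")  # drop a trailing comment closer
            if token and token.upper() not in OPERATORS:
                ids.add(token)
    return ids


def missing_license_texts(root):
    """Return sorted identifiers used under root that lack LICENSES/<id>.txt."""
    root = Path(root)
    license_dir = root / "LICENSES"  # assumed directory name
    missing = set()
    for path in root.rglob("*"):
        # Skip directories and the license texts themselves.
        if not path.is_file() or license_dir in path.parents:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for spdx_id in referenced_ids(text):
            if not (license_dir / f"{spdx_id}.txt").is_file():
                missing.add(spdx_id)
    return sorted(missing)


if __name__ == "__main__":
    print(missing_license_texts("."))
```

Run from the top of a source tree, this prints the identifiers for which no license text was found, which is the "look in a single directory" step done by a script.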
|