Re: SPDX files as templates
Adding a couple of facts to the discussion:
“Locate the canonical text for the license. There should be a link to this in the issue, but if there isn't please ask for it from the license steward. Don't proceed until you have confirmed that you have the canonical text.”
A couple of opinions:
From: Spdx-legal@... <Spdx-legal@...> On Behalf Of VM (Vicky) Brasseur via lists.spdx.org
Sent: Tuesday, November 16, 2021 1:38 PM
To: SPDX-legal <Spdx-legal@...>
Subject: Re: SPDX files as templates
Data point: REUSE specifically directs people to copy the text from https://github.com/spdx/license-list-data/tree/master/text.
VM (Vicky) Brasseur
Director, Senior Strategy Advisor
Open Source Program Office
Time Zone: Pacific/West Coast US
From: <Spdx-legal@...> on behalf of "Alan Tse via lists.spdx.org" <alan.tse=wdc.com@...>
As a programmatic user of the list, I think we should expect the use per Vicky’s points. One extra data point, I’m not accessing any of the GitHub repos listed so far but relying on whatever the licenses.json leads me to. I do that because at one point that was pointed out as the endpoint for machine reading. If we wanted to encourage a specific type of use, we’d have to build some tooling to encourage it. That way there’s a benefit to doing it the “official way”. So for example, our template matching could be used to indicate which fields should be replaced (copyright holder). If there was a library to pull the right file and swap in the missing variable, that would encourage more official use.
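To make the "swap in the missing variable" idea concrete, here is a minimal sketch of what such a helper might look like. It is not an existing SPDX library: the function name `fill_placeholders`, the `<name>` placeholder convention, and the sample notice line are all assumptions made for illustration, and a real tool would presumably take its field list from the SPDX template markup rather than from a regular expression.

```python
import re

# Hypothetical sample line in the style of an MIT copyright notice.
# It is an illustration only and is not fetched from any SPDX repository.
SAMPLE_TEXT = "Copyright (c) <year> <copyright holders>"

# Assumed convention: a placeholder is any <...> token on a single line.
PLACEHOLDER = re.compile(r"<([^<>\n]+)>")


def fill_placeholders(text, values):
    """Replace each <name> placeholder that has a value in `values`.

    Returns the new text and a list of placeholder names left unfilled,
    so the caller can warn that the notice still needs editing.
    """
    missing = []

    def substitute(match):
        key = match.group(1).strip().lower()
        if key in values:
            return values[key]
        missing.append(match.group(1))
        return match.group(0)  # leave unknown placeholders untouched

    return PLACEHOLDER.sub(substitute, text), missing


if __name__ == "__main__":
    filled, missing = fill_placeholders(SAMPLE_TEXT, {"year": "2021"})
    print(filled)
    print("still to fill in:", missing)
```

Returning the unfilled names, rather than silently leaving them in the text, is the part that addresses the concern raised in this thread: someone who copies a license text gets told that a copyright notice still needs adjusting.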
On canonical licenses, I’d be supportive of swapping out to the canonical if it exists. The old one could be kept as another example. Seems like a simple version increment. If we wanted to normalize licenses to replace specific names with “COPYRIGHT HOLDERS”, I think that would be helpful and could be treated the same as a canonical switch.
From: <Spdx-legal@...> on behalf of Steve Winslow <swinslow@...>
Previously I've been generally against the idea of encouraging folks to use the test/simpleTestForGenerator/*.txt files for anything other than the automated tests for the XML files. Mostly for the reason you noted at the start of this thread: that in many cases (especially where a copyright notice is baked into the license text, such as MIT) people may grab it without realizing they should probably adjust the text.
I've been pretty well convinced that I was wrong there; if people are finding value in using the "test" text as license templates, then great.
A couple of random thoughts, getting into the weeds:
1) People should not assume that the text in test/simpleTestForGenerator/*.txt is necessarily the _official, canonical, byte-for-byte text_ from the license steward, if there is one. Here are a couple of examples:
* https://github.com/spdx/license-list-XML/blob/master/test/simpleTestForGenerator/Apache-2.0.txt is different from https://www.apache.org/licenses/LICENSE-2.0.txt (at least w/r/t whitespace; I haven't checked more closely)
* https://github.com/spdx/license-list-XML/blob/master/test/simpleTestForGenerator/GPL-2.0-or-later.txt is different from https://www.gnu.org/licenses/old-licenses/gpl-2.0.txt (at least w/r/t whitespace, and different parts of "optional" text after the end of the license, etc).
There's a related question of whether the test text in the license-list-XML repo _should_ be the same as the canonical license steward text, where there is one. I'm just noting that at present, it isn't always the same. Also, since license stewards sometimes make changes to their own official license texts (GPL-2.0 is an example), the SPDX text is not necessarily going to be in sync if upstream makes a change.
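For anyone who wants to check whether two copies of a license differ only in whitespace, as in the Apache-2.0 and GPL-2.0 examples above, a rough sketch follows. The function names are hypothetical, and collapsing whitespace is a deliberately crude comparison; it is not the SPDX matching guidelines, which account for much more than whitespace.

```python
import re


def normalize_whitespace(text):
    """Collapse every run of whitespace to a single space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def compare_texts(a, b):
    """Classify two license texts as identical, whitespace-only, or different."""
    if a == b:
        return "identical"
    if normalize_whitespace(a) == normalize_whitespace(b):
        return "whitespace-only"
    return "different"


if __name__ == "__main__":
    # Tiny made-up strings standing in for the SPDX copy and the steward's copy.
    spdx_copy = "Licensed under the\nExample License."
    steward_copy = "Licensed under the   Example License.\n"
    print(compare_texts(spdx_copy, steward_copy))
```

A "whitespace-only" result would match what was observed for Apache-2.0 at a first glance; a "different" result, as with the optional text after the end of GPL-2.0, would need a closer look by a person.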
2) I'd tend to agree that it's generally going to be preferable to point folks at the text/ directory in the license-list-data repo. That helps to keep the concerns separated as "go to license-list-data if you're a user of the License List; go to license-list-XML in order to contribute."
From a very quick skim, it looks like the text/ directory in license-list-data is _mostly_ the same as the test text files in license-list-XML. I see a handful with whitespace differences, and it looks like the naming for deprecated licenses might be handled differently. But both of those are presumably things that could be addressed.
On Tue, Nov 16, 2021 at 3:46 PM J Lovejoy <opensource@...> wrote: