update on only/or later etc.


J Lovejoy
 

Hi All,

Kate and I just had a call with Richard Stallman of the FSF to try and come to a resolution everyone can be happy with, taking into consideration the ask from the FSF and the many thorough discussions we’ve had on the mailing list and calls. This is similar to an approach we discussed on the last call, with one variation. As such, I’d like to propose the following path forward (again, using GPL-2.0 but for all GNU licenses):

Deprecate the "GPL-2.0" identifier and add the word “only” for GPL version 2 only, e.g., "GPL-2.0-only"
- this should not be problematic as it does not change the meaning of the identifier. GPL-2.0 has meant ‘version 2 only’ since the SPDX License List was born. We are simply adding explicit language for the identifier. No backwards compatibility issues in terms of the meaning.
- we can do a “warning” for people using the deprecated identifier for a period before “GPL-2.0” becomes invalid to give people a chance to update. This will also encourage people who have been sloppy to fix their sloppiness.
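
As a rough illustration of the warning period described above, a tool could map the deprecated identifier to its explicit replacement and emit a warning rather than an error. This is only a sketch: the table `DEPRECATED_IDS` and the function `normalize_license_id` are hypothetical names, not part of any SPDX tooling, and only the "GPL-2.0" to "GPL-2.0-only" pair is taken from the proposal.

```python
import warnings

# Hypothetical table: deprecated identifier -> explicit replacement.
# Only the GPL-2.0 pair comes from the proposal; the other GNU
# licenses would presumably follow the same pattern.
DEPRECATED_IDS = {
    "GPL-2.0": "GPL-2.0-only",
}


def normalize_license_id(license_id: str) -> str:
    """Return the current identifier, warning if a deprecated one was used."""
    replacement = DEPRECATED_IDS.get(license_id)
    if replacement is None:
        # Not deprecated: pass the identifier through unchanged.
        return license_id
    warnings.warn(
        f"{license_id!r} is deprecated; use {replacement!r} instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return replacement
```

Once the warning period ends, the same table could drive a hard error instead, so the meaning of the identifier never changes, only how strictly the old spelling is treated.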

Add GPL version 2 or later back to the SPDX License List as its own entry with the short identifier of “GPL-2.0+” or “GPL-2.0-or-later”
- This would essentially put us in the same position we are in now, with two options, “only” and “or later”; it just alters how one gets there and where one finds it
- this would also put both options back on the license list, thus highlighting more obviously that the GNU licenses provide these options, and hopefully providing a more overt encouragement to use one or the other
- the identifier here could be “GPL-2.0+” (same as always) or “GPL-2.0-or-later” (differentiation from the + modifier might be better for tooling?) - we can discuss which is better, FSF is fine with either.
- if we go with “GPL-2.0-or-later”, can we take the same approach: a warning period for “GPL-2.0+”, after which it becomes invalid?

Keep the + modifier in the license expression language
- this allows use of + with other licenses as always: no change, no backwards compatibility issues

Do NOT add an identifier or operator, etc. for the found-license-text-only scenario where you don’t know if the intent of the copyright holder was “only” or “or later” and are thus left to interpret clause 9
- on the last call, we came up with two proposals that both incorporate 3 options for each GNU license, see: https://wiki.spdx.org/view/Legal_Team/Minutes/2017-11-09 - the above proposal is the same as “Paul’s alternative” / hard-coded proposal but omits adding the “text alone” option
- we don’t need to solve this right now and we can always add this option later
- without adding a third option, we are in the same position we have been in since the birth of the SPDX License List. Incremental changes have always been our go-to strategy; let’s take a first step to clarify the current identifiers in a way that the FSF can get behind. If, for a later release, we think we need this third option, then we can discuss that further once we have some time under our belts with this change.


I am really hoping we can all get behind this approach and spend the time on Tuesday’s call discussing the specifics of implementation, whatever else needs to be done for the next release (for this change and generally), and then get the next release out in time for a nice Christmas present to us all :)


Thanks,
Jilayne

SPDX Legal Team co-lead
opensource@...



W. Trevor King
 

On Thu, Nov 16, 2017 at 05:37:50PM -0700, J Lovejoy wrote:
> Deprecate the "GPL-2.0" identifier and add the word “only” for GPL
> version 2 only, e.g., "GPL-2.0-only"
> - this should not be problematic as it does not change the meaning
> of the identifier. GPL-2.0 has meant ‘version 2 only’ since the
> SPDX License List was born. We are simply adding explicit language
> for the identifier. No backwards compatibility issues in terms of
> the meaning.
> - we can do a “warning” for people using the deprecated identifier
> for a period before “GPL-2.0” becomes invalid to give people a
> chance to update. This will also encourage people who have been
> sloppy to fix their sloppiness.

I think this “deprecation with an eventual removal” approach is part
of all of the proposals, and is not unique to the “coin new
per-version license identifiers” approach.

> Keep the + modifier in the license expression language
> - this allows use of + with other licenses as always, no change, no
> backwards compatibility

I am strongly against having both a ‘GPL-2.0+’ license ID and a ‘+’
operator. I think committing to a ‘GPL-2.0+’ license ID is an
unfortunate but tenable position. And if we go that way, I'd rather
remove the ‘+’ operator entirely.

I'd be ok with ‘GPL-2.0-or-later’ while preserving the ‘+’ operator
for other licenses. But if a ‘+’ operator is deemed not good enough
for the GPL, which licenses would it be good enough for? This feels
like “we don't know when we'd recommend ‘+’, but didn't have the heart
to kill it”.

Personally, I think the ‘+’ operator *is* good enough for the GPL, but
if that view was universal we wouldn't be adding an or-later license
ID. If we cannot build a consensus around using ‘+’ for the GPL, I'd
rather drop it entirely. My concern with coining license identifiers
for ‘GPL-2.0-or-later’ and similar is the combinatoric increase in
license identifiers, and that's more of an aesthetic concern than a
technical concern (although there are some technical impacts, e.g. the
size of license-list-XML and license-list-data will grow).

Cheers,
Trevor

--
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy


Paul Madick
 

This is really great news.  This was a difficult issue that prompted a lot of folks to join our SPDX Legal/Technical calls, to the point that we needed more conference-call capacity.  We are fortunate to have such a vibrant community concerned with the particulars of the SPDX License List.

 

The solution addresses the primary concerns raised by Richard Stallman and the FSF while preserving the effectiveness of the SPDX License List for multiple use cases.  There are still additional opportunities to further refine the SPDX License List to address situations that were not previously handled.  I look forward to revisiting those issues in the future.

 

Paul

 

 



Karen C.
 

There are so many things I admire about the people involved and the process that has been followed to get to this proposal for consensus. Many thanks for all Jilayne and Kate and so many others have done to bring SPDX to a point that exceeds all of our expectations.



Choate Hall & Stewart LLP Confidentiality Notice:

This message is transmitted to you by or on behalf of the law firm of Choate, Hall & Stewart LLP. It is intended exclusively for the individual or entity to which it is addressed. The substance of this message, along with any attachments, may contain information that is proprietary, confidential and/or legally privileged or otherwise legally exempt from disclosure. If you are not the designated recipient of this message, you are not authorized to read, print, retain, copy or disseminate this message or any part of it. If you have received this message in error, please destroy and/or delete all copies of it and notify the sender of the error by return e-mail or by calling 1-800-520-2427.

For more information about Choate, Hall & Stewart LLP, please visit us at choate.com


Brad Edmondson
 

Wow! Hopefully this resolves this issue for the foreseeable future (as I think it should). I echo Karen's sentiments -- great work!


As far as the next release, to my mind, the biggest open issue is adding XML for the recently added licenses, which I think should be 2.6+. I haven't done a careful check, but based on a quick scan of the Google Sheets document, that looks like it could be:
  • EPL-2.0
  • EUPL-1.2
  • BSD-2-Clause-Patent (done)
  • W3C-Software-2015
  • Unicode-DFS-2015 (done)
  • Unicode-DFS-2016 (done)
  • TCP-wrappers (done)
  • Net-SNMP (done)

And perhaps also some/all of the licenses still under review:
  • CDLA-Permissive-1.0
  • CDLA-Sharing-1.0
  • OSCAT

Then we should add the accepted exceptions:
  • Linux-syscall-note (done)
  • Bootloader-exception
And perhaps the same for exceptions under review, although I'm not as familiar with these and they may be stale at this point. But as marked, these are "under review":
  • aptana-exception-3.0
  • Cygwin-exception-2.0
  • FOSS-License-exception
  • MySQL-Connector-ODBC-exception-2.0
  • OCaml-exception
  • rrdtool-floss-exception-2.0
  • sencha-exception-3.0
  • trolltech-gpl-exception-1.2
  • wolfcms-exception-2.0
  • Zarafa-trademark-exception-3.0

Best,
Brad

--
Brad Edmondson, Esq.
512-673-8782 | brad.edmondson@...

On Thu, Nov 16, 2017 at 8:35 PM, Copenhaver, Karen <kcopenhaver@...> wrote:
There are so many things I admire about the people involved and the process that has been followed to get to this proposal for consensus.  Many thanks for all Jilayne and Kate and so many others have done to bring SPDX to a point that exceeds all of our expectations.


_______________________________________________
Spdx-legal mailing list
Spdx-legal@...
https://lists.spdx.org/mailman/listinfo/spdx-legal


Gary O'Neall
 

I think this is a good overall solution.

It solves the issue raised by the FSF and is reasonably compatible. On the last legal call, I raised a concern that it didn't handle the case where the version may be ambiguous. After the call, I realized that we have this issue today and we don't really need to solve this in this release of the license list. Probably better to solve one issue at a time, and I have no problem starting with the issue raised by Richard and the FSF.

Thanks Jilayne for moving this forward.

Additional thoughts on the '+' operator below:

-----Original Message-----
From: spdx-legal-bounces@... [mailto:spdx-legal-bounces@...] On Behalf Of W. Trevor King
Sent: Thursday, November 16, 2017 4:53 PM
To: J Lovejoy
Cc: SPDX-legal
Subject: Re: update on only/or later etc.

>> Keep the + modifier in the license expression language
>> - this allows use of + with other licenses as always, no change, no
>> backwards compatibility
> I am strongly against having both a ‘GPL-2.0+’ license ID and a ‘+’
> operator. I think committing to a ‘GPL-2.0+’ license ID is an unfortunate but
> tenable position. And if we go that way, I'd rather remove the ‘+’ operator
> entirely.
>
> I'd be ok with ‘GPL-2.0-or-later’ while preserving the ‘+’ operator for other
> licenses. But if a ‘+’ operator is deemed not good enough for the GPL, which
> licenses would it be good enough for? This feels like “we don't know when
> we'd recommend ‘+’, but didn't have the heart to kill it”.

I agree with Trevor that we should not have both the + modifier and GPL-2.0+ as a license ID, as that makes the parsing ambiguous.

My preference would be GPL-2.0-or-later and preserving the '+' operator. The '+' operator could be useful for licenses where they do not explicitly handle the 'or later' versions in the license text and it maintains better compatibility.
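
The parsing ambiguity can be illustrated with a small sketch. It assumes two toy license lists and a hypothetical helper, `readings`, that lists every way a single token could be interpreted; it is not how any existing SPDX parser works.

```python
# Toy license lists (assumptions for illustration only).
# The first contains "GPL-2.0+" as an identifier alongside "GPL-2.0";
# the second uses the "-only" / "-or-later" spellings instead.
WITH_PLUS_ID = {"GPL-2.0", "GPL-2.0+", "MIT"}
WITH_OR_LATER_ID = {"GPL-2.0-only", "GPL-2.0-or-later", "MIT"}


def readings(token: str, license_ids: set[str]) -> list[tuple[str, bool]]:
    """Return every (license_id, plus_operator_applied) reading of a token."""
    found = []
    if token in license_ids:
        # Reading 1: the whole token is a license identifier.
        found.append((token, False))
    if token.endswith("+") and token[:-1] in license_ids:
        # Reading 2: a license identifier followed by the '+' operator.
        found.append((token[:-1], True))
    return found


# Two readings when "GPL-2.0+" is both an ID and ID-plus-operator.
print(readings("GPL-2.0+", WITH_PLUS_ID))
# One reading each when the or-later ID does not end in '+'.
print(readings("GPL-2.0-or-later", WITH_OR_LATER_ID))
print(readings("MIT+", WITH_OR_LATER_ID))
```

With ‘GPL-2.0-or-later’ as the identifier, every token has a single reading and the '+' operator stays available for other licenses.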

Cheers,
Gary


Philip Odence
 

Great. We will start calling you two Kings Solomon.

 


 


David A. Wheeler
 

Jilayne Lovejoy <opensource@...>:
> Do NOT add an identifier or operator, etc. for the found-license-text-only scenario where you don’t know if the intent of the copyright holder was “only” or “or later” and are thus left to interpret clause 9

This "resolution" doesn't solve the problem.

Since tools are not yet sentient, tools often *cannot* determine if "or later" was intended. Yet "don't know" makes a tool useless, and it *did* see a copy of a license, so the tool *will* report something. Tools will probably report "GPL-2.0-only" when they only see the GPL-2.0. As a result, soon "GPL-2.0-only" will not IN PRACTICE mean "only GPL-2.0".

I'm fine with "GPL-2.0-only" and special-casing "GPL-2.0+", but we *STILL* need a way to indicate "GPL-2.0 at least and I don't know if later versions are okay".

People depend on automated tools, and automated tools often CAN'T figure out the "or later" question. There are a million ways to indicate "I don't know if a later version is okay", e.g., "AT LEAST" or "?" suffix, MAYBE operation, etc. But if SPDX can't represent this common case, then people will overload *other* expressions with this alternative meaning, meaning that the "only" soon won't have that meaning.
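
A small sketch of the overloading I mean. Everything here is hypothetical: the `VersionScope` enum, the phrase matching in `detect_scope`, and the choice of "GPL-2.0-or-later" as the or-later spelling are assumptions for illustration, and real scanners are far more involved.

```python
from enum import Enum


class VersionScope(Enum):
    ONLY = "only"
    OR_LATER = "or-later"
    UNKNOWN = "unknown"  # the third state the proposal leaves out


def detect_scope(notice: str) -> VersionScope:
    """Crude heuristic for a short file-header notice (not the full license text)."""
    text = " ".join(notice.lower().split())
    if "any later version" in text:
        return VersionScope.OR_LATER
    if "version 2 only" in text:
        return VersionScope.ONLY
    # Nothing in the notice settles the question either way.
    return VersionScope.UNKNOWN


def report_two_state(scope: VersionScope) -> str:
    """What a tool is likely to emit when only two identifiers exist."""
    if scope is VersionScope.OR_LATER:
        return "GPL-2.0-or-later"
    # UNKNOWN collapses into "only": the identifier is now overloaded.
    return "GPL-2.0-only"
```

In this sketch a notice that says nothing about later versions ends up reported as "GPL-2.0-only", which is indistinguishable from a notice that really says "version 2 only".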

--- David A. Wheeler


David A. Wheeler
 

J Lovejoy:

> Do NOT add an identifier or operator, etc. for the found-license-text-only scenario where you don’t know if the intent of the copyright holder was “only” or “or later” and are thus left to interpret clause

I disagree, sorry.

> - we don’t need to solve this right now and we can always add this option later
> - without adding a third option, we are in the same position we have been in since the birth of the SPDX License List. incremental changes have always been our go-to strategy; let’s take a first step to clarify the current identifiers in a way that the FSF can get behind. If, for a later release, we think we need this third option, then we can discuss that further once we have some time under our belts with this change.

No, this is the *reason* that there's a problem. The *reason* that "GPL-2.0" isn't working is, in part, because it overloads two notions. "GPL-2.0" is supposed to mean "Only 2.0" (per the spec). But tools only know "I saw a GPL-2.0 license", so how can they represent that information? The obvious way is "GPL-2.0", so that same identifier can mean "2.0 at least, and I don't know if there are other versions allowed". That's not good.

If we wait to "add this option later", "GPL-2.0-only" will probably have morphed in *practice* into "GPL-2.0 at least, and I don't know if it's the only version". So while everyone can congratulate themselves about the clarity of the spec, very soon it will predictably be *unclear* in practice. If we want to be able to express "exactly this version", we also need to be able to represent "at least this version".

--- David A. Wheeler


Brad Edmondson
 

Hi David, 

I think your points are good ones, but it seems to me they go to the separate issues of "file:detected license" and "package:concluded license." 

The clarity of the spec argument is aimed at making the "file:detected license" case more explicit, and if it leaves tools with NOASSERTION for "package:concluded license," then I think that's OK, no?

Best,
Brad

--
Brad Edmondson, Esq.
512-673-8782 | brad.edmondson@...

On Fri, Nov 17, 2017 at 10:35 AM, Wheeler, David A <dwheeler@...> wrote:
J Lovejoy:

> Do NOT add a identifier or operator, etc. for the found-license-text-only scenario where you don’t know if the intent of the copyright holder was “only or “or later” and are thus left to interpret clause

I disagree, sorry.

> - we don’t need to solve this right now and we can always add this option later
> - without adding a third option, we are in the same position we have been in since the birth of the SPDX License List. incremental changes have always been our go-to strategy; let’s take a first step to clarify the current identifiers in a way that the FSF can get behind. If, for a later release, we think we need this third option, then we can discuss that further once we have some time under our belts with this change. 

No, this is the *reason* that there's a problem.  The *reason* that "GPL-2.0" isn't working is, in part, because it overloads two notions.  "GPL-2.0" is supposed to mean "Only 2.0" (per the spec).  But tools only know "I saw a GPL-2.0 license", so how can they represent that information?  The obvious way is "GPL-2.0", so that same identifier can mean "2.0 at least, and I don't know if there are other versions allowed".  That's not good.

If we wait to "add this option later", "GPL-2.0-only" will probably have morphed in *practice* into "GPL-2.0 at least, and I don't know if it's the only version".  So while everyone can congratulate themselves about the clarity of the spec, very soon it will predictably be *unclear* in practice.  If we want to be able to express "exactly this version", we also need to be able to represent "at least this version".

--- David A. Wheeler



John Sullivan
 

J Lovejoy <opensource@...> writes:

> Hi All,
>
> Kate and I just had a call with Richard Stallman of the FSF to try and
> come to a resolution everyone can be happy with, taking into
> consideration the ask from the FSF and the many thorough discussions
> we’ve had on the mailing list and calls. This is similar to an
> approach we discussed on the last call, with one variation. As such,
> I’d like to propose the following path forward (again, using GPL-2.0
> but for all GNU licenses):

Thanks to everyone for working with us on this!

-john

--
John Sullivan | Executive Director, Free Software Foundation
GPG Key: A462 6CBA FF37 6039 D2D7 5544 97BA 9CE7 61A0 963B
https://status.fsf.org/johns | https://fsf.org/blogs/RSS

Do you use free software? Donate to join the FSF and support freedom at
<https://my.fsf.org/join>.


David A. Wheeler
 

Brad Edmondson [mailto:brad.edmondson@...]
I think your points are good ones, but it seems to me they go to the separate issues of "file:detected license" and "package:concluded license." 
The clarity of the spec argument is aimed at making the "file:detected license" case more explicit, and if it leaves tools with NOASSERTION for "package:concluded license," then I think that's OK, no?
No, it fails to work for multiple reasons:
1. "NOASSERTION" is basically useless, because it provides no information. In many cases, all I need to know is "there's a version of the GPL here", and I can make a decision. Being able to provide *some* information is often all that's needed, while providing *no* information creates a lot of unnecessary work for tool users.
2. Tools, lacking sentience, often cannot determine whether or not "or later versions" applies. So they're unable to be "more explicit" in package:concluded. The current structure requires that tools conclude either "only 2.0" or "2.0 or later"... even though tools typically CANNOT make that determination. SPDX should make it possible to report the information *actually* available.
3. The majority of SPDX users do not use SPDX files. Instead, they *only* use SPDX license expressions (as available in package managers, file content declarations, etc.). So there's no "file:detected" vs. "package:concluded" entries to compare anyway.

--- David A. Wheeler


Gary O'Neall
 

I understand and agree with David's concerns - also coming from a tooling perspective.

However, I believe this is a different problem than the FSF issue and a problem we have today with the current license expression syntax and the current license list.

It seems we are talking about 2 different usage scenarios for SPDX license expressions:
1) Someone is using a license expression to document what they "know" or assert is the license for a file or package. For example, the copyright owner is adding an SPDX license ID in their file headers.
2) Someone or something is documenting findings on license information for files or packages. For example, a license scanning tool.

For #1, we don't want to allow someone to be ambiguous about whether a GPL license is "only" or "or later" when describing a license using SPDX license expressions. I believe this is the issue the FSF is concerned about.

For #2, we will find situations where it is not clear if a GPL license is to be used "only" with that version or with that version or later (BTW - it's not just tools that have this problem). We would like to be able to express this situation using SPDX since it is very useful information.

On the last legal call, it seemed clear to me that our attempts to solve #2 created a great deal of concern for those trying to solve #1.

In order to make progress, I still feel we should divide and conquer solving the FSF issue first then addressing the ambiguous license version issue in a future release of the spec. Perhaps we can come up with a more generalized solution for ambiguous license findings for #2 if we had more time to design and discuss the solution.

One additional thought: We could use a LicenseRef to document the exact text of the ambiguous license version and add a license comment to indicate it is GPL, just not clear which version. The LicenseRef approach would only work for SPDX documents and would provide more information than a NOASSERTION.

Gary



J Lovejoy
 




On Nov 17, 2017, at 8:35 AM, Wheeler, David A <dwheeler@...> wrote:

J Lovejoy:

Do NOT add an identifier or operator, etc. for the found-license-text-only scenario where you don’t know if the intent of the copyright holder was “only” or “or later” and are thus left to interpret clause

I disagree, sorry.

- we don’t need to solve this right now and we can always add this option later
- without adding a third option, we are in the same position we have been in since the birth of the SPDX License List. incremental changes have always been our go-to strategy; let’s take a first step to clarify the current identifiers in a way that the FSF can get behind. If, for a later release, we think we need this third option, then we can discuss that further once we have some time under our belts with this change. 

No, this is the *reason* that there's a problem.  The *reason* that "GPL-2.0" isn't working is, in part, because it overloads two notions.  "GPL-2.0" is supposed to mean "Only 2.0" (per the spec).  But tools only know "I saw a GPL-2.0 license", so how can they represent that information?  The obvious way is "GPL-2.0", so that same identifier can mean "2.0 at least, and I don't know if there are other versions allowed".  That's not good.

Hi David,

If this is a potential problem once GPL-2.0 is changed to GPL-2.0-only, then it is currently a problem. And perhaps by altering the current identifier (GPL-2.0) to be more explicit (GPL-2.0-only) we will expose just how often GPL-2.0 has been used incorrectly. That may provide better examples to work off of to decide what ‘third option’ we need.  

Just a reminder to all: when someone places a copy of the GPL, version 2 alongside source code files this does not make the licensing ambiguous; clearly there is a valid license. The question comes down to how you interpret clause 9:
- does the language, "If the Program specifies a version number of this License which applies to it and 'any later version,' you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation," mean that placing a copy of the license is “specifying a version,” and thus a user can redistribute the code under GPL version 2 only (GPL-2.0-only) or, as some people possibly read it, under GPL version 2 or any later version (GPL-2.0+)?
- or does placing a copy of a version of the license NOT constitute specifying a version, so that the sentence, "If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation," applies, in which case one can redistribute the code under GPL-1.0+?


Any scenario you could interpret, we have a way to express that currently and would still under the proposal. 
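
A minimal sketch of that claim, for illustration only: each reading of clause 9 listed above maps to an identifier that is already expressible. The reading labels and the function name are hypothetical, and the identifiers are the ones used in this message (the final spelling of the "or later" identifier was still under discussion).

```python
# Hypothetical labels for the three readings of GPL-2.0 clause 9 described above.
# The identifiers on the right are the ones named in this thread.
CLAUSE_9_READINGS = {
    # A copy of the license "specifies a version": version 2 only.
    "copy_specifies_version_only": "GPL-2.0-only",
    # A copy of the license is read as version 2 or any later version.
    "copy_specifies_version_or_later": "GPL-2.0+",
    # A copy of the license does NOT specify a version: any version ever published.
    "copy_specifies_no_version": "GPL-1.0+",
}


def expression_for(reading: str) -> str:
    """Return the identifier that expresses a given reading of clause 9."""
    return CLAUSE_9_READINGS[reading]


if __name__ == "__main__":
    for reading, identifier in CLAUSE_9_READINGS.items():
        print(f"{reading}: {identifier}")
```

The point of the table is only that every reading has an expression; it does not say which reading is correct.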

While on this subject, an article that appeared on opensource.com came up on the last call. I just want to point out that that article, which explains the above interpretation issues (which we have been talking about for several months), does not reach a conclusion but simply encourages people to provide clarity of their intentions.  We can certainly all agree on encouraging that!  https://opensource.com/article/17/11/avoiding-gpl-confusion (Although, I think we should consistently encourage people to use the standard license notices provided by the license and/or SPDX short identifiers) :)

Thanks,
Jilayne



David A. Wheeler
 

J Lovejoy [mailto:opensource@...]:
If this is a potential problem once GPL-2.0 is changed to GPL-2.0-only, then it is currently a problem.
Yes indeed, that's my point :-).

And perhaps by altering the current identifier (GPL-2.0) to be more explicit (GPL-2.0-only) we will expose just how often GPL-2.0 has been used incorrectly.
The tools are currently *required* to be incorrect, because they cannot report the information they have ("I have GPL-2.0, and I don't know if 'or later' applies"). Neither the proposed "GPL-2.0-only" nor "GPL-2.0+" correctly represents the information they have. Tools will have to output *something*, and whatever they produce will dilute in *practice* the strict meanings of those license identifiers.

--- David A. Wheeler


Philippe Ombredanne
 

On Tue, Nov 21, 2017 at 5:28 PM, Wheeler, David A <dwheeler@...> wrote:
J Lovejoy [mailto:opensource@...]:
If this is a potential problem once GPL-2.0 is changed to GPL-2.0-only, then
it is currently a problem.
Yes indeed, that's my point :-).

And perhaps by altering the current identifier (GPL-2.0) to be more explicit
(GPL-2.0-only) we will expose just how often GPL-2.0 has been used
incorrectly.
The tools are currently *required* to be incorrect, because they cannot report
the information they have ("I have GPL-2.0, and I don't know if 'or later'
applies"). Neither the proposed "GPL-2.0-only" nor "GPL-2.0+" correctly
represents the information they have. Tools will have to output *something*,
and whatever they produce will dilute in *practice* the strict meanings of
those license identifiers.
David,

Speaking as the author of a fine license detection engine, I think
this is a red herring.

A license detection result can be: "I am 95% sure this is GPL-2.0-only
but it could be GPL-2.0+: please review me to fill in your
conclusion."

So detection does not have to be binary as in either 100% right or
100% wrong. If a tool can only report red or blue binary results,
that's a possibly fine but weak tool.

For instance scancode-toolkit can cope with ambiguity alright and
surface this for review when it cannot come up with a definitive
detection answer. Therefore I have no issue whatsoever to implement
Jilayne's comprehensive proposal and I can always output something on
my side.

So since this can be done by one tool alright this is NOT an issue for
the SPDX spec to worry about and tools should adjust: that's for tools
implementors to cope with ambiguity, not something to specify here.

Please let's keep this spec simple!

--
Cordially
Philippe Ombredanne


David A. Wheeler
 

Philippe Ombredanne:
I think there is no contention there at all.
Respectfully: There *IS* contention. I'm contending.

A summary (e.g. a license expression) cannot ever capture all the nuances
of the details.... This is a lossy "compression" by construction...
Sure, but all summaries, and all models, omit something. Indeed,
a SPDX license file *also* cannot capture all the nuances.

The correct question is, "is this model adequate for its uses?"
In most cases people want to know, "is this package legal to use?".
To answer that question, "it's at least GPL-2.0, and might be more"
is important information, and I think it's information that the SPDX
license expression should include.

Speaking as the author of a fine license detection engine, I think this is a
red herring.
A license detection result can be: "I am 95% sure this is GPL-2.0-only but it
could be GPL-2.0+: please review me to fill in your conclusion."
This inability to indicate the "in-between" state within a license expression
greatly increases the number of cases where an unnecessary review must occur.
Every unnecessary review is a significant increase in time and money.
In many cases, it's *NOT* necessary to make a decision, but in some cases it is.
If organizations can do the analysis *ONLY* when they need to,
they'd save a lot of time and money... and that is greatly aided by
having SPDX license expressions able to indicate this information.

So detection does not have to be binary as in either 100% right or 100%
wrong. If a tool can only report red or blue binary results, that's a possibly
fine but weak tool.
But that's what I'm saying. Most tools CAN provide more than 2 answers.
The problem is that the SPDX license expressions don't allow tools to report
more than the 2 answers within a license expression. So the tool doesn't have
to give a binary answer, but SPDX forces the tools to do so when they output
SPDX license expressions.

For instance scancode-toolkit can cope with ambiguity alright and surface
this for review when it cannot come up with a definitive detection answer.
But it CANNOT surface this information via SPDX license expressions.
For most people, that's the ONLY thing that matters. I suspect at most 0.1% of
SPDX users use SPDX files, everyone else ONLY uses SPDX license expressions.
The percentage of SPDX users who use SPDX files may not be that high :-).

Therefore I have no issue whatsoever to implement Jilayne's comprehensive
proposal and I can always output something on my side.
You can always output something nonstandard that cannot be shared, sure,
and for many detailed analyses that's a good thing.
But that's less helpful for sharing compared to a standard format.

So since this can be done by one tool alright this is NOT an issue for the
SPDX spec to worry about and tools should adjust: that's for tools
implementors to cope with ambiguity, not something to specify here.

Please let's keep this spec simple!
Well, empty specs are the simplest possible :-).
Specs need to be as simple as possible... but no simpler.

There's also the long-term damage this decision will cause.
In practice, I expect failing to add this capability is going to make
"GPL-2.0-only" mean the same thing as "I saw a GPL-2.0 and I don't
know if 'or later' applies" - and as a result "GPL-2.0-only" will
NOT mean "GPL-2.0-only" as intended. The case of "I see a license
and no other information" is relatively common, and is *important*
for determining what is legal to do.

--- David A. Wheeler


Philippe Ombredanne
 

David:
You are bringing good points. Here are my counter points:

On Fri, Nov 24, 2017 at 5:15 PM, Wheeler, David A <dwheeler@...> wrote:
Philippe Ombredanne:
I think there is no contention there at all.
Respectfully: There *IS* contention. I'm contending.

A summary (e.g. a license expression) cannot ever capture all the nuances
of the details.... This is a lossy "compression" by construction...
Sure, but all summaries, and all models, omit something. Indeed,
a SPDX license file *also* cannot capture all the nuances.

The correct question is, "is this model adequate for its uses?"
In most cases people want to know, "is this package legal to use?".
You are making assumption about what the common use case might be. To
me the common use case is more simply: what's the license?

Whether this is "legal" or not is something you or your legal adviser
can decide based on this.
And practically, "legal" is more often than not a policy choice
instead, whether you are a FLOSS project author or a consumer of FLOSS
code.

To answer that question, "it's at least GPL-2.0, and might be more"
is important information, and I think it's information that the SPDX
license expression should include.
Is this really important to know this fact in the general case? In my
own experience the cases where I need hyper precision on GPL-2.0 vs
GPL-2.0+ are rather limited:
1. I am combining GPL 2 and GPL 3 code
2. OR I want to use a GPL 3 for GPL 2-licensed code

These cases are extremely rare for consumers of FLOSS code based on my
reasonably wide and many years of experience in this space... So rare in
fact that they account for a handful across a thousand+ products and
billions of LOC. So rare that I cannot recall any off the top of my head.

In each case, they require careful legal review before making a
decision. Making this careful decision solely on the few characters of
a license expression would be insanely foolish IMHO. I am not sure
SPDX needs to worry or cater about this.

In every other case, the GPL2 vs GPL2+ debate does not matter much as
this is still the same GPL terms that apply: same permissions and same
obligations.

Speaking as the author of a fine license detection engine, I think this is a
red herring.
A license detection result can be: "I am 95% sure this is GPL-2.0-only but it
could be GPL-2.0+: please review me to fill in your conclusion."
This inability to indicate the "in-between" state within a license expression
greatly increases the number of cases where an unnecessary review must occur.
Every unnecessary review is a significant increase in time and money.
In many cases, it's *NOT* necessary to make a decision, but in some cases it is.
If organizations can do the analysis *ONLY* when they need to,
they'd save a lot of time and money... and that is greatly aided by
having SPDX license expressions able to indicate this information.
Again, the cases where you need precision vs. good enough accuracy in
the GPL2/GPL2+ debate are rare. 99% of the time, you do not need this
precision at all.

Now, I could not agree more with you: inaccurate and unclear licensing
information means that a user will need to review this to ensure this
is clear. But this is NOT a problem for SPDX to solve in the license
expression spec.

This is something that needs to be fixed by working with every project
author such that there is clarity, such as the work Kate and I have done and
are doing with Linux maintainers to make the kernel licensing hyper
clear. Or the tickets I routinely file with projects that lack a clear
license. That's solving the problem IMHO: i.e. let's not react to the
symptoms, but attack the root cause instead. And there SPDX and
license expressions are a great way to make things clear upstream once
reviewed. They are not a substitute for a review.
FWIW, having an initiative to systematically help projects authors
clarify licensing is something that I have had in mind for quite a
while. I may do something about it eventually.

So detection does not have to be binary as in either 100% right or 100%
wrong. If a tool can only report red or blue binary results, that's a possibly
fine but weak tool.
But that's what I'm saying. Most tools CAN provide more than 2 answers.
The problem is that the SPDX license expressions don't allow tools to report
more than the 2 answers within a license expression. So the tool doesn't have
to give a binary answer, but SPDX forces the tools to do so when they output
SPDX license expressions.
I can output more than one expression then, can I?

For instance scancode-toolkit can cope with ambiguity alright and surface
this for review when it cannot come up with a definitive detection answer.
But it CANNOT surface this information via SPDX license expressions.
For most people, that's the ONLY thing that matters.
It surely could (NB: it does not yet). That's a minor change,
e.g. something like a list of license expressions with a confidence:

- confidence: 100% , expression: GPL-2.0-only
- confidence: 60% , expression: ((GPL-2.0-only or GPL-2.0+) and MIT)

Each expression is valid, right?

I suspect at most 0.1% of
SPDX users use SPDX files, everyone else ONLY uses SPDX license expressions.
The percentage of SPDX users who use SPDX files may not be that high :-).
Would you have data or pointers to support these assertions about SPDX
usage? That would be mighty useful!

Therefore I have no issue whatsoever to implement Jilayne's comprehensive
proposal and I can always output something on my side.
You can always output something nonstandard that cannot be shared, sure,
and for many detailed analyses that's a good thing.
But that's less helpful for sharing compared to a standard format.
I think we had a similar discussion a while back about adding
something like a scope or purpose in the license expression syntax.
This is the same here: I can convey one or more license expressions
with a confidence attached if needed. The confidence or score is not
part of the expression but some external attribute that qualifies it.

I am not talking about outputting anything "non-standard", whatever this
may be: instead, external data about an expression is best handled
externally.
When in an SPDX doc, there are ways to deal with it; outside of it,
you need to track other data attributes that would otherwise be
supported by an SPDX doc.

To take a (likely bad) analogy: What you are suggesting is somewhat
similar to storing the SHA1 of a file inside the file itself. This
will change the file content... and then you need to recompute the
SHA1 value because of this. And store it inside the file, and
recompute, and so on .... forever.

External observations about something (here the confidence you may
attach to a certain license expression) are best managed outside the
observed thing, otherwise they modify the thing under observation.

Therefore, I track a file SHA1 outside of a file itself and not
inside. And I see it best to track the confidence or score I can
attach to a license expression outside of this expression.
And if we want to have this in SPDX, this would mean adding an
attribute to qualify a license expression "confidence", not adding this
to the expression syntax IMHO.
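
The idea above can be sketched roughly as follows, assuming a hypothetical `Detection` record and `needs_review` helper (neither is part of SPDX or of any existing tool). The confidence lives next to the license expression as a separate attribute, so the expression string itself is never altered by the score.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """One scan finding: an expression plus an external confidence score."""
    expression: str    # license expression, left untouched by the score
    confidence: float  # 0.0 to 1.0, kept outside the expression


def needs_review(detections: List[Detection], threshold: float = 1.0) -> List[Detection]:
    """Return the findings whose confidence is below the threshold."""
    return [d for d in detections if d.confidence < threshold]


# The two example findings from the list earlier in this message,
# written here with uppercase operators.
detections = [
    Detection("GPL-2.0-only", 1.0),
    Detection("(GPL-2.0-only OR GPL-2.0+) AND MIT", 0.6),
]

if __name__ == "__main__":
    for finding in needs_review(detections):
        print(f"review: {finding.expression} (confidence {finding.confidence:.0%})")
```

With this layout a consumer that only wants the expression reads one field, and a reviewer filters on the other.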

So since this can be done by one tool alright this is NOT an issue for the
SPDX spec to worry about and tools should adjust: that's for tools
implementors to cope with ambiguity, not something to specify here.

Please let's keep this spec simple!
Well, empty specs are the simplest possible :-).
Specs need to be as simple as possible... but no simpler.
Are you suggesting that the SPDX expression spec is empty? (*cough*)
Or that the SPDX spec is empty? (*cough, cough*) I tend to think of it as
a tad too fat and in need of a good diet instead ;)

There's also the long-term damage this decision will cause.
In practice, I expect failing to add this capability is going to make
"GPL-2.0-only" mean the same thing as "I saw a GPL-2.0 and I don't
know if 'or later' applies" - and as a result "GPL-2.0-only" will
NOT mean "GPL-2.0-only" as intended.
I do not grok what you mean there. Can you clarify?

Which part of "only" is not clear to you?

Why would "GPL-2.0-only" suddenly be meaning anything else than its
definition in SPDX as carefully crafted by experienced and FLOSS-savvy
lawyers (hat tip) and as agreed and reviewed with the GPL authority
that the FSF is without any possible argument (other hat tip) ?

The case of "I see a license
and no other information" is relatively common, and is *important*
for determining what is legal to do.
Do you have data to support this? My personal experience is that this
is a case that is not so common.
And again even if it were pervasive and the norm, the number of cases
where I need hyper precision to determine "what is legal to do" are
rare as I explained at first and that I am repeating here for clarity:

1. I am combining GPL 2 and GPL 3 code
2. OR I want to use a GPL 3 for GPL 2-licensed code

Outside of these two rare cases, a user of GPL-2.0-licensed code will
not care much about this: "what is legal to do", i.e. which GPL 2.0
permissions and obligations apply, is clear and unambiguous: this is all
that needs to be known. The eventual lack of precision here is not a
problem for me or for the many users of GPL-licensed code I have
helped comply.

And yet, Jilayne's proposal makes these rare cases **crystal clear**
going forward: so this is all gravy to me!

--
Cordially
Philippe Ombredanne

+1 650 799 0949 | pombredanne@...
DejaCode - What's in your code?! - http://www.dejacode.com
AboutCode - Open source for open source - https://www.aboutcode.org
nexB Inc. - http://www.nexb.com


David A. Wheeler
 

David A. Wheeler:
To answer that question, "it's at least GPL-2.0, and might be more"
is important information, and I think it's information that the SPDX
license expression should include.
Philippe Ombredanne [mailto:pombredanne@...]
Is this really important to know this fact in the general case?
Yes, there are a number of cases where it's important.
The usual reason is because I'm trying to link Apache-2.0 licensed code with
other code, a non-problem for GPL-2.0+ but widely considered a problem for
GPL-2.0 only. The Apache-2.0 license is extremely common.

On the other hand, there are many other cases where it's not important.

Which is why it's important to know in some cases, and important to *not* track it
down when it's unimportant.

Making this careful decision solely on the few characters of a license
expression would be insanely foolish IMHO.
Not at all. What matters in many circumstances is just being able to show
some sort of due diligence.

In many cases, the "usual" situation is to copy & paste code, regardless of license or legality.
Any improvement over *that* is a big win.

Now, I could not agree more with you: inaccurate and unclear licensing
information means that a user will need to review this to ensure this is
clear....
This is something that needs to be fixed by working with every project author...
[e.g.]... tickets I routinely file with projects that lack a clear license.
I *heartily* endorse that work, thank you! But for every license you add,
someone creates another project with unclear licensing.

The *real* root causes are going to be difficult to fix:
* A large proportion of software developers are self-taught (& so don't know about
the laws), and of the rest, schools typically fail to teach CS students about software-related laws.
You can teach one, but the next developer will do the same thing.
* We have a VC/business culture that often values speed of development over legality.
* Many software developers are young & only know other young developers,
so they don't have anyone more experienced to learn from (or discount
the knowledge of those who *have* suffered the problems before).
* Many software developers, especially young/inexperienced developers,
incorrectly think that laws don't apply to software; I blame in part
the RIAA, who have successfully convinced the latest software developers
that copyright is not a real law.
* Copyright law as-written is very complex, and
is so obviously bought off by special interests, that it's difficult to defend,
and that makes it difficult to get many developers to take it seriously.

You can fix a few egregious cases with tickets, and please do.
But you're *not* going to fix these root causes with a few tickets.

Education is *great*, but for the foreseeable future we're going to continue to have problems.


It surely could (NB: it does not yet). that's a minor change.
e.g. something like a list of license expressions with a confidence:

- confidence: 100% , expression: GPL-2.0-only
- confidence: 60% , expression: ((GPL-2.0-only or GPL-2.0+) and MIT)
That's not a standard SPDX license expression.

SPDX license expression syntax could add a "confidence" value - but that's
more complex, and I don't think you're seriously proposing it.
Why not just a simple expression that indicates uncertainty of new versions?



Each expression is valid, right?

I suspect at most 0.1% of
SPDX users use SPDX files, everyone else ONLY uses SPDX license
expressions.
The percentage of SPDX users who use SPDX files may not be that high :-).
Would you have data or pointers to support these assertions about SPDX
usage? That would be mighty useful!
I agree that'd be useful - I don't have anything great.
Here's one try.

A Google search of "filetype:spdx" returns 164 results.
Clearly ".spdx" files are not lighting the world on fire.

Contrasting this to SPDX license expressions, we have to look at their
uses, which include package managers, in-file statements, and simple
tools that just report SPDX license expressions (e.g., Ruby's LicenseFinder).

Many package managers use SPDX license expressions
to indicate the package license. E.g., NPM does:
https://docs.npmjs.com/files/package.json
by using the "license:" field - which is *NOT* a SPDX license file.
According to <http://modulecounts.com/>, *just* the NPM ecosystem
has 550,951 modules as of Nov 24, with 535 new packages a day on average.
I don't know what percentage of modules have a "license:" entry
(is someone willing to find out?) - but for discussion, I'll guess it's at *least* 10%.
That would mean that there are at least 55,095 NPM packages that use
SPDX license expressions.
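
The back-of-the-envelope estimate above can be reproduced with a few lines. This is only a sketch: the variable names are hypothetical, the 10% share is the guess stated above (not measured data), and comparing package counts with search hits is a rough contrast at best.

```python
# Figures quoted in this message.
npm_modules = 550_951          # NPM modules as of Nov 24 (per modulecounts.com)
assumed_share_percent = 10     # guessed share of modules with a "license" entry
spdx_file_search_hits = 164    # Google results for "filetype:spdx"

# Integer arithmetic avoids floating-point rounding in the estimate.
estimated_packages = npm_modules * assumed_share_percent // 100

# Rough contrast between expression use and .spdx files found by search.
ratio = estimated_packages / spdx_file_search_hits

if __name__ == "__main__":
    print(f"estimated NPM packages with a license expression: {estimated_packages}")
    print(f"rough ratio to .spdx search hits: {ratio:.0f}x")
```

Changing `assumed_share_percent` shows how sensitive the conclusion is to the guess.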

This is a quick try; it'd be possible to get a more accurate estimate. But if you
add all the other package managers where
SPDX license expressions get used, and the per-file entries, I think
it's clear that SPDX use is *primarily* the use of SPDX license expressions.

External observations about something (here the confidence you may
attach to a certain license expression) are best managed outside the
observed thing, otherwise they modify the thing under observation.
No. *All* observations are external, there are no exceptions.
Even if a file is specifically labelled as a license, it might have been added by
someone not authorized to do so. More philosophically, I cannot observe
the world "directly"; I can only perceive the world through my senses
which in turn are mediated by my brain.

It is very valuable to be able to say, "the final result of my analysis"
in a single computer-processable expression. Especially since that "final" analysis
can in turn be used as an input for a larger analysis.


Are you suggesting that the SPDX expression spec is empty? (*cough*) Or
that the SPDX spec is empty?
No, I'm suggesting that simplicity as the *only* criterion is not enough;
it needs to be balanced with other needs.

(*cough, cough*) I tend to think of it as a tad too
fat and in need of a good diet instead ;)
There's also the long-term damage this decision will cause.
In practice, I expect failing to add this capability is going to make
"GPL-2.0-only" mean the same thing as "I saw a GPL-2.0 and I don't
know if 'or later' applies" - and as a result "GPL-2.0-only" will
NOT mean "GPL-2.0-only" as intended.
I do not grok what you mean there. Can you clarify?

Which part of "only" is not clear to you?
Oh, I *understand* the proposal very well. The problem is that
I think it's ignoring some key facts on the ground.

I've said it several different ways, but I'll try again.

Many tools CANNOT determine whether "or any later version" applies in all cases.
They *CAN* determine if a copy of the GPL-2.0 exists.
These tools WILL NOT report "UNKNOWN", because that's useless.
People are using these tools, and will continue to do so.
So, the tools will report "GPL-2.0-only" when they see "GPL-2.0" and
don't know if "or later" applies.

Why would "GPL-2.0-only" suddenly mean anything other than its
definition in SPDX....
The result: "GPL-2.0-only" WILL NOT mean "2.0 only" no matter how much
text is written in the spec. It will mean "GPL-2.0, and we don't know if
or later applies". It will mean that, because the spec fails to give
tool writers any alternative to report.
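
A minimal sketch of the behavior described above, assuming a hypothetical scanner that can detect a copy of the GPL-2.0 text but only sometimes finds a notice saying "only" or "or later". The function name, its parameters, and the substring check are illustrative assumptions, not the logic of any real tool. "GPL-2.0-or-later" is one of the two candidate identifiers from the proposal; "NOASSERTION" is the existing SPDX value for "no determination made".

```python
from typing import Optional


def report_identifier(found_gpl2_text: bool, notice: Optional[str]) -> str:
    """Return the identifier a simple scanner might emit (hypothetical logic)."""
    if not found_gpl2_text:
        return "NOASSERTION"
    if notice is None:
        # Only a copy of the license text was seen; nothing says "only" or
        # "or later". With no identifier for that state, the tool falls back
        # to the bare version-2 identifier.
        return "GPL-2.0-only"
    # Collapse whitespace so line breaks inside the notice do not matter.
    text = " ".join(notice.lower().split())
    if "any later version" in text:
        return "GPL-2.0-or-later"
    return "GPL-2.0-only"


explicit_only = report_identifier(True, "licensed under the GNU GPL, version 2 only")
undetermined = report_identifier(True, None)
or_later = report_identifier(
    True, "either version 2 of the License, or (at your option) any later version"
)
```

In this sketch `explicit_only` and `undetermined` come out as the same string, which is the conflation being argued about: a reader of the output cannot tell the two situations apart.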

Thanks!!

Regards,


--- David A. Wheeler


Philippe Ombredanne
 

David,

On Fri, Nov 24, 2017 at 10:33 PM, Wheeler, David A <dwheeler@...> wrote:
David A. Wheeler:
To answer that question, "it's at least GPL-2.0, and might be more"
is important information, and I think it's information that the SPDX
license expression should include.
Philippe Ombredanne [mailto:pombredanne@...]
Is it really important to know this fact in the general case?
Yes, there are a number of cases where it's important.
The usual reason is because I'm trying to link Apache-2.0 licensed code with
other code, a non-problem for GPL-2.0+ but widely considered a problem for
GPL-2.0 only. The Apache-2.0 license is extremely common.
I understand your point, but __how many times__ did you ever encounter
this case in the real world?
On my side, I have analyzed 1000+ significant software products,
10,000+ packages and billions of lines of code over the last 10 years.
An issue of Apache-2.0 compatibility with the GPL-2.0 has never shown
up: zero cases, not one single time.
I am not saying it does not exist in theory, but in practice this is a
rare case that is exceptional enough and therefore best left aside.

On the other hand, there are many other cases where it's not important.

Which is why it's important to know in cases, and important to *not* track it
down when it's unimportant.
My point is that it is so rare that it is NOT important to
track in the license expression spec at all.
This can be dealt with in comments, or anything else, but not within a
license expression syntax. There are likely tens of other crooked use
cases that cannot be expressed precisely with a license expression,
yet they are too rare to consider.

Making this careful decision solely on the few characters of a license
expression would be insanely foolish IMHO.
Not at all. What matters in many circumstances is just being able to show
some sort of due diligence.
Are you serious there? Where in the actual real world is anyone
looking for "being able to show some sort of due diligence" and
considering this enough? That does not sound reasonable. Who does this? I
would have a field day looking at such a codebase.

In many cases, the "usual" situation is to copy & paste code, regardless of license or legality.
Any improvement over *that* is a big win.
Where do you get that the "usual" situation is to copy & paste code?
Based on my long experience, copy/paste of snippets is a rare event
and usually accounts for only a handful of items even in very large
product codebases.

And it is even rarer that license or origin was not tracked then. This
is not the norm I have experience with: I have only ever met a couple
of confused software development teams doing serious copying of untracked
snippets.

Now, I could not agree more with you: inaccurate and unclear licensing
information means that a user will need to review this to ensure this is
clear....
This is something that needs to be fixed by working with every project author...
[e.g.]... tickets I routinely file with projects that lack a clear license.
I *heartily* endorse that work, thank you!
But for every license you add,
someone creates another project with unclear licensing.
Really, do you have data to back this? Note also we should not care if
"someone creates another project with unclear licensing".
We should care if someone creates another project with unclear
licensing that someone actually uses in the real world.
The hypothetical cases of goofy licensing of unused software are not
relevant IMHO.

The *real* root causes are going to be difficult to fix:
* A large proportion of software developers are self-taught (& so don't know about
the laws), and of the rest, schools typically fail to teach CS students about software-related laws.
You can teach one, but the next developer will do the same thing.
* We have a VC/business culture that often values speed of development over legality.
* Many software developers are young & only know other young developers,
so they don't have anyone more experienced to learn from (or discount
the knowledge of those who *have* suffered the problems before).
* Many software developers, especially young/inexperienced developers,
incorrectly think that laws don't apply to software; I blame in part
the RIAA, who have successfully convinced the latest software developers
that copyright is not a real law.
* Copyright law as-written is very complex, and
is so obviously bought off by special interests, that it's difficult to defend,
and that makes it difficult to get many developers to take it seriously.
I cannot comment on these or I would come out as rude: I have no idea
where these arguments come from and what data could support any of
these.
I guess they are at best opinions, but cannot be used as supporting points
for a serious argument.

You can fix a few egregious cases with tickets, and please do.
But you're *not* going to fix these root causes with a few tickets.
Education is *great*, but for the foreseeable future we're going to continue to have problems.
What if this is not a few tickets but a million? This can be
crowd-sourced and distributed with appropriate leverage.

Case in point: the Linux kernel is a large and mature codebase at the
bottom of a vast ecosystem of code that runs on top of Linux.

With the work Kate and I did to help maintainers adopt SPDX ids, we now have:
1. about 15K files with a proper SPDX id
2. doc and guidance for incoming patches that has been created by some
key maintainers

This is something that is being adopted by thousands of contributors
and will spill over onto the whole ecosystem. And this will require only
marginal effort going forward and these efforts are distributed on all
committers and contributors. That's leverage to me.


It surely could (NB: it does not yet). That's a minor change.
e.g. something like a list of license expressions with a confidence:

- confidence: 100% , expression: GPL-2.0-only
- confidence: 60% , expression: ((GPL-2.0-only or GPL-2.0+) and MIT)
That's not a standard SPDX license expression.
Since when are "GPL-2.0-only" and "((GPL-2.0-only or GPL-2.0+) and MIT)"
not valid expressions?
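
As a rough sketch of the distinction being made here, the confidence can live in the tool's own report format while each license expression stays an unmodified string. The `Finding` record and `best` helper below are hypothetical names chosen for illustration; they are not part of SPDX or of any scanner's actual output format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    confidence: float  # 0.0 to 1.0, tool-specific; lives outside the expression
    expression: str    # the license expression itself, left untouched


findings: List[Finding] = [
    Finding(confidence=1.0, expression="GPL-2.0-only"),
    Finding(confidence=0.6, expression="((GPL-2.0-only or GPL-2.0+) and MIT)"),
]


def best(candidates: List[Finding]) -> Finding:
    """Pick the finding the tool is most confident about."""
    return max(candidates, key=lambda f: f.confidence)


top = best(findings)
```

Here `top.expression` is just the expression string; the confidence travels alongside it and never becomes part of the expression syntax.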


SPDX license expression syntax could add a "confidence" value - but that's
more complex, and I don't think you're seriously proposing it.
I am not indeed.

Why not just a simple expression that indicates uncertainty of new versions?
This is not common enough to warrant such addition until someone can
prove otherwise.


Oh, I *understand* the proposal very well. The problem is that
I think it's ignoring some key facts on the ground.

I've said it several different ways, but I'll try again.

Many tools CANNOT determine if "or any later version" applies in all cases.
If there is such a tool, then it should either be updated or not used at all.

They *CAN* determine if a copy of the GPL-2.0 exists.
These tools WILL NOT report "UNKNOWN", because that's useless.
People are using these tools, and will continue to do so.
So, the tools will report "GPL-2.0-only" when they see "GPL-2.0" and
don't know if "or later" applies.
If I reformulate this: There are tools that do a poor job at providing
proper results. Therefore, the spec should provide a way to support
their lack of features? This does not make sense to me. They should
instead either adapt or die if they are not fit for the job.


Why would "GPL-2.0-only" suddenly mean anything other than its
definition in SPDX....
The result: "GPL-2.0-only" WILL NOT mean "2.0 only" no matter how much
text is written in the spec. It will mean "GPL-2.0, and we don't know if
or later applies". It will mean that, because the spec fails to give
tool writers any alternative to report.
I cannot understand your reasoning here.

--
Cordially
Philippe Ombredanne