Names of licenses we currently support / where should license text live?

Jeff Luszcz
 

Hi Kate et al,
As we discussed on the call a few times, I think having this amazing list of licenses in one place is a great asset to the community, and I believe it will help reduce license proliferation (esp. if spdx.org, the Linux Foundation, OSI, etc. continue to work on anointing certain licenses as preferred).

One of my concerns with having the SPDX document contain only links to these reference licenses, instead of the actual full text, is that we risk drift and incompleteness a few years down the road, especially if the list of licenses we anoint as "reference licenses" becomes as large as it looks like it is becoming.

We see analogies to this in our day to day license analysis in these current cases:

Files that say "see License.txt file for more info" when License.txt is missing
"See http://www.gnu.org/licenses/lgpl.html" in a file, where this meant LGPL 2.1 in 2006 but now means LGPL 3.0 because the FSF changed the text at the link target
"Download from my university page http://www.ccsf.edu/~someStudent", where the page is now gone
"This is under a BSD license" when in fact the authors have added copyleft-style terms or other strange things to the actual license text

My thoughts:
An SPDX doc should be completely self-contained for long-term validity, but can reference out to the spdx.org web site as a hook for optional data that may appear down the road
Some organizations have serious confidentiality concerns about outside web links/dependencies in intellectual property reports such as SPDX
By this I mean: if an organization has to hit the spdx.org website to render or validate the text of a license for an SPDX report, this may leak confidential info
Having a large list of reference licenses is great, especially if common names can be created for them
Template licenses / references are great for scanning tool verification, spec compliance, etc., but the SPDX doc should contain the actual text of the license in effect
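
To make the drift concern concrete, here is a minimal sketch (not part of any SPDX proposal) of how a tool could notice that the text behind a link no longer matches the copy recorded in a document. The function names and the two sample strings are assumptions made up for illustration; only the LGPL version headers are real.

```python
import hashlib


def fingerprint(license_text: str) -> str:
    """Hash a license text after collapsing whitespace, so that
    re-wrapping alone does not change the fingerprint."""
    normalized = " ".join(license_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_drifted(embedded_text: str, fetched_text: str) -> bool:
    """True when the text now behind a link differs from the embedded copy."""
    return fingerprint(embedded_text) != fingerprint(fetched_text)


# Hypothetical stand-ins for the text recorded in 2006 and the text the
# same URL serves today.
recorded = "GNU LESSER GENERAL PUBLIC LICENSE\nVersion 2.1, February 1999"
served_now = "GNU LESSER GENERAL PUBLIC LICENSE\nVersion 3, 29 June 2007"

print(has_drifted(recorded, recorded.replace("\n", "  ")))  # whitespace change only
print(has_drifted(recorded, served_now))                    # different license version
```

A check like this only works if the document carries the full text (or at least a fingerprint of it), which is the point of keeping the SPDX doc self-contained.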


Regards,
Jeff

-----Original Message-----
From: spdx-bounces@... [mailto:spdx-bounces@...] On
Behalf Of kate.stewart@...
Sent: Thursday, August 26, 2010 4:30 PM
To: spdx@...; dmg@...
Subject: Re: Names of licenses we currently support

Thanks Daniel. As the reference page for each license and its header
emerge - cross checking against this is going to be useful.

Hmm, am a little concerned about putting some of them into the standard
set of reference licenses. ... in particular some I think should be
flagged as exceptions so they get looked at, i.e. BeerWareV42 ;)
SameTermsAs and SeeFile are likely not licenses but references to
licenses.

Can I assume that the mapping of names to the actual license search
strings can be found in Ninka sources?

In particular the BSD and MIT variants look worrisome. Any volunteers to
review the details there and make some recommendations?

Kate

--- On Thu, 8/26/10, D M German <dmg@...> wrote:

From: D M German <dmg@...>
Subject: Names of licenses we currently support
To: spdx@...
Date: Thursday, August 26, 2010, 5:39 PM

These are the licenses we currently identify. Look particularly at the
BSDs and MITs. Some are not licenses but exception statements. I have
the feeling that these cover around 75-86% of files in Debian/Fedora
(the source code files that have a license).

--dmg



AGPLv3+
Apachev1.1
Apachev2
artifex
ArtisticLicensev1
autoConfException
BeerWareVer42
BindMITX11Var
BisonException
boost
boostV1
BSD1
BSD2
BSD2AdvInsteadOfBinary
BSD2aic700
BSD2EndorseInsteadOfBinary
BSD2SoftAndDoc
BSD2var1
BSD2var2
BSD3
BSD3NoWarranty
BSD4
BSD4NoEndor
BSDCairoStyleWarr
BSDdovecotStyle
BSDOnlyAdv
CDDLic
CDDLicV1
CDDLv1orGPLv2
Cecill
ClassPathException
CPLv0.5
CPLv1
dovecotSeeCopying
DoWhatTheFuckYouWantv2
emacsLic
EPLv1
FreeType
GhostscriptGPL
GPLnoVersion
GPLv1
GPLv1+
GPLv1orArtistic
GPLv2
GPLv2+
GPLv2orLGPLv2.1
GPLv2orv3
GPLv2orv3qtException
GPLv3
GPLv3+
IBMv1
intelBSDLicense
InterACPILic
kerberos
LesserGPLnoVersion
LesserGPLv2
LesserGPLv2+
LesserGPLv2.1
LesserGPLv2.1+
LesserGPLv3
LesserGPLv3+
LGPLv2
LGPLv2+
LGPLv2_1
LGPLv2.1
LGPLv2.1+
LGPLv2_1orv3
LGPLv2MISTAKE
LGPLv2+MISTAKE
LGPLv2orv3
LGPLv3
LGPLv3+
LibGCJLic
LibraryGPLv2
LibraryGPLv2+
LinkException
LinkExceptionBison
LinkExceptionGPL
LinkExceptionLeGPL
LinkExceptionOpenSSL
MITandGPL
MITCMU
MITCMUvar2
MITCMUvar3
MITmodern
MITold
MIToldMichiganVersion
MIToldwithoutSell
MIToldwithoutSellandNoDocumentationRequi
MIToldwithoutSellCMUVariant
MITVariant
MITX11BSDvar
MITX11noNotice
MITX11NoSellNoDocDocBSDvar
MITX11simple
MPL1_1andLGPLv2_1
MPLGPL2orLGPLv2_1
MPL_LGPLsee
MPL-MIT-dual
MPLv1_0
MPLv1_1
MX4J
MX4JLicensev1
NCSA
NPLv1_0
NPLv1_1
openSSL
openSSLvar1
openSSLvar2
openSSLvar3
phpLicV3.01
Postfix
postgresql
publicDomain
QtGPLv2or3
QTv1
SameAsPerl
SameTermsAs
SeeFile
sequenceLic
SimpleLic
simpleLic
simpleLic2
simpleLicense1
SimpleOnlyKeepCopyright
SleepyCat
SSLeay
subversion
subversion+
subversionError
sunRPC
SunSimpleLic
svnkit
svnkit+
tmate+
W3CLic
WxException
X11
X11CMU
X11Festival
X11mit
zendv2
ZLIB
ZLIBref


--
--
Daniel M. German

http://turingmachine.org/
http://silvernegative.com/
dmg (at) uvic (dot) ca
replace (at) with @ and (dot) with .
_______________________________________________
Spdx mailing list
Spdx@...
https://fossbazaar.org/mailman/listinfo/spdx

dmg
 

Jeff Luszcz twisted the bytes to say:


Jeff> Hi Kate et al,
Jeff> As we discussed on the call a few times, I think having this amazing
Jeff> list of licenses in one place is a great asset to the community and I
Jeff> believe will help reduce license proliferation (esp. if spdx.org,
Jeff> Linux Foundation, OSI, etc. continue to work on anointing certain
Jeff> licenses as preferred.)

Just a small comment. Our license identification tool has very different
goals than SPDX. We need to identify as many as we can (even those that
are references, and mark them as such).

In my view, SPDX should be good enough to document the majority of
MIT/BSD variants, plus the common licenses (GPL, LGPL, MPL, EPL). This,
in my opinion, should cover around 80-85 percent of files with a
license.

The problem lies in the MIT and BSD variants, and I think the SPDX
committee is being blind to that problem. I have made this point
repeatedly, so I will not push it any further.

Ultimately, somebody will have to write a filter that goes from
Fossology/Ninka to SPDX accepted licenses (and names) and their XML
format. Almost certainly, if a license is in the SPDX list, Ninka will
be able to identify it.
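
A rough sketch of what such a filter could look like, assuming a plain lookup table is enough. The left-hand names come from the list posted earlier in this thread; the right-hand names are illustrative targets only, not an agreed SPDX list, and the function name is made up.

```python
from typing import Optional

# Illustrative mapping only: the keys are names from the posted list,
# the values are placeholder target names, not an agreed SPDX list.
NINKA_TO_SPDX = {
    "GPLv2": "GPL-2.0-only",
    "GPLv2+": "GPL-2.0-or-later",
    "Apachev2": "Apache-2.0",
    "EPLv1": "EPL-1.0",
    "MPLv1_1": "MPL-1.1",
}

# Names that point at a license rather than being one (as noted earlier
# in the thread); these never map to a license name.
REFERENCES_NOT_LICENSES = {"SameTermsAs", "SeeFile"}


def to_spdx_name(ninka_name: str) -> Optional[str]:
    """Return the target name, or None if the text must be kept verbatim."""
    if ninka_name in REFERENCES_NOT_LICENSES:
        return None
    return NINKA_TO_SPDX.get(ninka_name)


for name in ["GPLv2+", "SeeFile", "BeerWareVer42"]:
    print(name, "->", to_spdx_name(name))
```

Anything that comes back as None would need its license text carried in the SPDX file itself.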

--dmg


--
Daniel M. German
http://turingmachine.org/
http://silvernegative.com/
dmg (at) uvic (dot) ca
replace (at) with @ and (dot) with .

Peter Williams <peter.williams@...>
 

On 8/27/10 2:47 PM, Jeff Luszcz wrote:
Hi Kate et al,
As we discussed on the call a few times, I think having this amazing list of licenses in one place is a great asset to the community, and I believe it will help reduce license proliferation (esp. if spdx.org, the Linux Foundation, OSI, etc. continue to work on anointing certain licenses as preferred).

One of my concerns with having the SPDX document contain only links to these reference licenses, instead of the actual full text, is that we risk drift and incompleteness a few years down the road, especially if the list of licenses we anoint as "reference licenses" becomes as large as it looks like it is becoming.

We see analogies to this in our day to day license analysis in these current cases:

Files that say "see License.txt file for more info" when License.txt is missing
"See http://www.gnu.org/licenses/lgpl.html" in a file, where this meant LGPL 2.1 in 2006 but now means LGPL 3.0 because the FSF changed the text at the link target
"Download from my university page http://www.ccsf.edu/~someStudent", where the page is now gone
"This is under a BSD license" when in fact the authors have added copyleft-style terms or other strange things to the actual license text

My thoughts:
An SPDX doc should be completely self-contained for long-term validity, but can reference out to the spdx.org web site as a hook for optional data that may appear down the road
Some organizations have serious confidentiality concerns about outside web links/dependencies in intellectual property reports such as SPDX
By this I mean: if an organization has to hit the spdx.org website to render or validate the text of a license for an SPDX report, this may leak confidential info
Having a large list of reference licenses is great, especially if common names can be created for them
Template licenses / references are great for scanning tool verification, spec compliance, etc., but the SPDX doc should contain the actual text of the license in effect

Once a license is "approved" and placed in the repo it should be immutable. That way there is no chance of the text changing once the license name is in use.

To prevent links going defunct we could use PURL[1][2]. PURL is a permanent URL service provided under the auspices of the OCLC[3] (the library cooperative). PURL is widely used in the RDF, IETF and W3C communities for URIs that need to remain valid permanently.

Providing an optional way to embed the license text could still be useful. If we do allow an "approved" license to be specified along with its license text, the spec should clarify what the semantics are if the license text in the SPDX file doesn't match the license text of the named license in the license repo. Should it be treated as a custom license, so that tools would ignore the specified license? Or should that constitute an error? Or should the license text be ignored in favor of the license name?
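
The three options can be written down as an explicit policy, which may help when deciding what the spec should say. This is only a sketch under assumptions: the names (MismatchPolicy, resolve, the one-entry repo) are invented for illustration, and treating an unknown name as a custom license is my own choice, not something settled here.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class MismatchPolicy(Enum):
    TREAT_AS_CUSTOM = "custom"  # ignore the name, keep the embedded text
    ERROR = "error"             # reject the document
    PREFER_NAME = "name"        # ignore the embedded text, trust the name


def normalize(text: str) -> str:
    """Collapse whitespace so re-wrapping does not count as a mismatch."""
    return " ".join(text.split())


def resolve(name: str, embedded: str, repo: Dict[str, str],
            policy: MismatchPolicy) -> Tuple[Optional[str], str]:
    """Return (license name or None for a custom license, effective text)."""
    reference = repo.get(name)
    if reference is None:
        return None, embedded  # assumption: unknown name means custom license
    if normalize(reference) == normalize(embedded):
        return name, reference
    if policy is MismatchPolicy.TREAT_AS_CUSTOM:
        return None, embedded
    if policy is MismatchPolicy.PREFER_NAME:
        return name, reference
    raise ValueError(f"embedded text does not match repo text for {name!r}")


# Hypothetical one-entry repo and a document whose text has been altered.
repo = {"ExampleLic": "Permission is granted to use this software."}
altered = "Permission is granted to use this software for research only."

print(resolve("ExampleLic", altered, repo, MismatchPolicy.TREAT_AS_CUSTOM))
print(resolve("ExampleLic", altered, repo, MismatchPolicy.PREFER_NAME))
```

Note that PREFER_NAME silently drops the added "for research only" term, which is exactly the "BSD with extra terms" case Jeff describes.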

Peter

[1]: http://purl.org
[2]: http://purl.oclc.org/docs/faq.html
[3]: http://oclc.org

dmg
 

Peter> Once a license is "approved" and placed in the repo it should be
Peter> immutable. That way there is no chance of the text changing once the
Peter> license name is in use.

Perhaps this is a good reason to go minimalistic in the very first
version (perhaps even not include ANY license at all). As people use the
draft it will become clearer what the challenges and potential pitfalls
of including licenses in the standard are.

After all, if the license is not spdx-named, then it will have to be
included verbatim in the XML doc, which is not a bad thing. It can be
pragmatically upgraded once SPDX decides what licenses to include.

--dmg

--
Daniel M. German
http://turingmachine.org/
http://silvernegative.com/
dmg (at) uvic (dot) ca
replace (at) with @ and (dot) with .

RUFFIN MICHEL
 

The problem is that, except for licenses like GPL or Apache2, a lot of licenses (MIT, BSD, Apache1.1) contain a part that differs from one instance to another (such as the copyright and, for old BSD, the acknowledgement). And most licenses contain an obligation to propagate the copyright/license. So if you do not keep a copy of each such license, then the day you want to properly package your product with hundreds of open source components, you have to redo the work of looking up most of the licenses.
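
As a small illustration of the variable part versus the fixed part, here is a sketch that separates copyright lines from the rest of a license header, so the fixed body could be compared against a reference while the copyright lines are kept for redistribution. The regular expression and function name are assumptions for illustration, and real headers will need more care than this.

```python
import re
from typing import List, Tuple

# Assumption: a copyright line starts with optional comment punctuation
# followed by the word "Copyright".
COPYRIGHT_LINE = re.compile(r"^\W*Copyright\b.*$", re.IGNORECASE)


def split_notice(header: str) -> Tuple[List[str], str]:
    """Return (copyright lines, remaining license body)."""
    notices, body = [], []
    for line in header.splitlines():
        (notices if COPYRIGHT_LINE.match(line) else body).append(line)
    return notices, "\n".join(body)


sample = (
    "* Copyright (c) 2001 Marko Kreen\n"
    "* All rights reserved.\n"
    "*\n"
    "* 1. Redistributions of source code must retain the above copyright\n"
    "* notice, this list of conditions and the following disclaimer.\n"
)

notices, body = split_notice(sample)
print(notices)
print(body)
```

Only the first line is picked up as a copyright line; the mention of "copyright" inside condition 1 stays in the body.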

Michel
Michel.Ruffin@..., PhD
Software Coordination Manager, Bell Labs, Corporate CTO Dpt
Distinguished Member of Technical Staff
Tel +33 (0) 1 3077 7045
Alcatel-Lucent HQ, Centre de Villarceaux
Route De Villejust, 91620 Nozay, France

-----Original Message-----
From: spdx-bounces@... [mailto:spdx-bounces@...] On Behalf Of D M German
Sent: Tuesday, August 31, 2010 07:45
To: spdx@...
Subject: Re: Names of licenses we currently support / where should license text live?



Peter> Once a license is "approved" and placed in the repo it should be
Peter> immutable. That way there is no chance of the text changing once the
Peter> license name is in use.

Perhaps this is a good reason to go minimalistic in the very first
version (perhaps even not include ANY license at all). As people use the
draft it will become clearer what the challenges and potential pitfalls
of including licenses in the standard are.

After all, if the license is not spdx-named, then it will have to be
included verbatim in the XML doc, which is not a bad thing. It can be
pragmatically upgraded once SPDX decides what licenses to include.

--dmg


dmg
 

RUFFIN> The problem is that, except for licenses like GPL or Apache2,
RUFFIN> a lot of licenses (MIT, BSD, Apache1.1) contain a part that
RUFFIN> differs from one instance to another (such as the copyright
RUFFIN> and, for old BSD, the acknowledgement). And most licenses
RUFFIN> contain an obligation to propagate the copyright/license. So
RUFFIN> if you do not keep a copy of each such license, then the day
RUFFIN> you want to properly package your product with hundreds of
RUFFIN> open source components, you have to redo the work of looking
RUFFIN> up most of the licenses.

The license would be embedded in the SPDX file. In fact, you will have
all different licenses in a single place (the SPDX file) for every
project. No need to go back to the source, if it hasn't changed.

Future versions of SPDX will allow you to extract the licenses from
the SPDX file and name them.


By the way, Ninka is not bad at extracting this data. Here are two
examples. This is a nice one:

* Copyright (c) 2001 Marko Kreen
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.

which results in:

AllRights,0,Copyright (c) 2001 Marko Kreen ,,
BSDpre,70,,<colon>
BSDcondSource,70,,,above ,,
BSDcondBinary,70,,,,
BSDasIs,10,,,THE AUTHOR AND CONTRIBUTORS ,,A,
BSDWarr,70,,,THE AUTHOR OR CONTRIBUTORS,

----------------------------------------------------------------------
And this one is more complicated:


* Copyright (c) 1983, 1990, 1993
* The Regents of the University of California. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. Neither the name of the University nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE. */

which results in (yes, the "All rights reserved" sentence is missing the
copyright owner, because the owner is in a different sentence):

AllRights,0,,,
BSDpre,70,,<colon>
BSDcondSource,70,,,above ,,
BSDcondBinary,70,,,,
BSDcondEndorse,70,,,,the University nor the names of its contributors,specific
BSDasIs,10,,,THE REGENTS AND CONTRIBUTORS ,,A,
BSDWarr,70,,,THE REGENTS OR CONTRIBUTORS,
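
For anybody who wants to post-process output like the above, here is a minimal sketch that reads the lines as comma-separated records. It assumes only what is visible in the two examples: a rule name, a number, then free-form fields. The meaning of the number is not explained here, so it is kept as an opaque string. Counting the BSDcond* rows is a hypothetical way to tell the two examples apart, not a description of how Ninka itself decides.

```python
import csv
import io
from typing import List, Tuple

Row = Tuple[str, str, List[str]]  # (rule name, number as given, other fields)

FIRST = """AllRights,0,Copyright (c) 2001 Marko Kreen ,,
BSDpre,70,,<colon>
BSDcondSource,70,,,above ,,
BSDcondBinary,70,,,,
BSDasIs,10,,,THE AUTHOR AND CONTRIBUTORS ,,A,
BSDWarr,70,,,THE AUTHOR OR CONTRIBUTORS,
"""

SECOND = """AllRights,0,,,
BSDpre,70,,<colon>
BSDcondSource,70,,,above ,,
BSDcondBinary,70,,,,
BSDcondEndorse,70,,,,the University nor the names of its contributors,specific
BSDasIs,10,,,THE REGENTS AND CONTRIBUTORS ,,A,
BSDWarr,70,,,THE REGENTS OR CONTRIBUTORS,
"""


def parse_rules(output: str) -> List[Row]:
    """Split each non-empty line into rule name, number, and the rest."""
    rows = []
    for fields in csv.reader(io.StringIO(output)):
        if fields:  # csv.reader yields [] for blank lines
            rows.append((fields[0], fields[1], fields[2:]))
    return rows


def condition_count(rows: List[Row]) -> int:
    """Count the numbered conditions (rows whose name starts with BSDcond)."""
    return sum(1 for name, _, _ in rows if name.startswith("BSDcond"))


print(condition_count(parse_rules(FIRST)))
print(condition_count(parse_rules(SECOND)))
```

The extra BSDcondEndorse row in the second example corresponds to the endorsement clause (condition 3) in that license text.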


--
--
Daniel M. German
http://turingmachine.org/
http://silvernegative.com/
dmg (at) uvic (dot) ca
replace (at) with @ and (dot) with .