Re: [spdx-tech] An example of a super simple SPDX licenses registry, for discussion


Gary O'Neall
 

-----Original Message-----
From: Spdx-legal@... <Spdx-legal@...> On Behalf Of via
Lists.Spdx.Org
Sent: Saturday, March 9, 2019 2:55 PM
To: pombredanne@...; SPDX-legal <spdx-legal@...>
Cc: Spdx-legal@...
Subject: Re: [spdx-tech] An example of a super simple SPDX licenses registry, for
discussion
...
...
IMO the "ideal" here is that there is some automated way of "fingerprinting"
license texts such that two parties, given more or less the same text, can
independently come up with the same id. At that point you would not need a
registry, just a shared algorithm. When/if eventually SPDX does recognize a
given license and gives it a formal id, there could be a relatively simple aliasing
step where SPDX id "SomeCoolLicense-1.0" is AKA "LicenseRef-43bdf298"
[G.O.] A few pieces of this are already in place. The SPDX legal team has developed matching guidelines (https://spdx.org/spdx-license-list/matching-guidelines) and implemented a template language, consistent with those guidelines, to express variability in license text (https://spdx.org/spdx-specification-21-web-version#h.2mjng0vqrghe). There is an implementation of these guidelines in an algorithm (https://github.com/spdx/tools/blob/master/src/org/spdx/compare/LicenseCompareHelper.java). Currently, when you submit a license, this algorithm is invoked and the license you submitted is compared to all approved licenses. The same algorithm is also used when we generate the HTML and data formats of the license.
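
To make the fingerprinting idea concrete, here is a minimal Python sketch. It is not the algorithm in LicenseCompareHelper.java: the function names `normalize` and `fingerprint`, the choice of SHA-256, and the 8-character id length are all assumptions made for illustration, and the normalization covers only a small subset of the matching guidelines (case, whitespace, quote and hyphen variants).

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Loosely normalize license text (illustrative subset of the guidelines)."""
    text = text.lower()
    # Treat curly quotes, backticks and apostrophes as one quote character.
    text = re.sub(r"[\u2018\u2019\u201c\u201d`']", '"', text)
    # Treat any run of hyphen or dash variants as a single hyphen.
    text = re.sub(r"[\u2010-\u2015-]+", "-", text)
    # Collapse all whitespace, including line breaks, to single spaces.
    return " ".join(text.split())


def fingerprint(text: str, length: int = 8) -> str:
    """Derive a LicenseRef-style id from normalized text (hypothetical scheme)."""
    digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
    return f"LicenseRef-{digest[:length]}"


if __name__ == "__main__":
    a = "Permission is  hereby\nGRANTED to use the \u201cSoftware\u201d"
    b = 'permission is hereby granted to use the "Software"'
    # Both parties arrive at the same id without consulting a registry.
    print(fingerprint(a) == fingerprint(b))
```

A plain hash like this breaks as soon as the text contains variable parts such as a copyright holder's name, which is where the templates come in.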

There are a few inhibitors to realizing Jeff's "ideal" solution. The solution relies on templates expressed in the license XML, and not all licenses have their templates fully implemented. If you compare two licenses, expect a match, and don't get one, the likely cause is missing templatization. Another inhibitor is that the algorithm is rather slow, which prevents its use in broader license scanning solutions. There are likely some bugs to fix and improvements to make as well.
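
To illustrate why missing templatization produces a mismatch, here is a heavily simplified Python sketch of template matching. It handles only `<<var;...;match="...">>` fields, assumes `match` is the last attribute in the field, and ignores optional text and most of the matching guidelines. The helper names `template_to_regex` and `_literal` are hypothetical and do not come from the SPDX tools.

```python
import re

# Simplified: assumes match="..." is the final attribute of a <<var;...>> field.
VAR_RULE = re.compile(r'<<var;.*?match="(.*?)">>')


def _literal(text: str) -> str:
    """Escape fixed text, allowing any whitespace run between words."""
    return r"\s+".join(re.escape(word) for word in text.split())


def template_to_regex(template: str) -> re.Pattern:
    """Turn a template with <<var>> fields into one compiled regex."""
    parts = []
    pos = 0
    for m in VAR_RULE.finditer(template):
        parts.append(_literal(template[pos:m.start()]))
        parts.append(f"(?:{m.group(1)})")  # the field's own match pattern
        pos = m.end()
    parts.append(_literal(template[pos:]))
    # Allow optional whitespace where fixed and variable text meet.
    return re.compile(r"\s*".join(parts), re.IGNORECASE | re.DOTALL)


if __name__ == "__main__":
    templated = ('Copyright (c) <<var;name="copyright";'
                 'original="<year> <owner>";match=".+">> All rights reserved.')
    untemplated = "Copyright (c) <year> <owner> All rights reserved."
    text = "Copyright (c) 2019 Jane Doe\nAll rights reserved."
    print(bool(template_to_regex(templated).fullmatch(text)))
    print(bool(template_to_regex(untemplated).fullmatch(text)))
```

With the variable field in place the text matches; with the same license left untemplated, the literal "<year> <owner>" placeholder has to appear verbatim and the comparison fails.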

Contributions to overcome these limitations are more than welcome. To improve the license templates, contribute XML improvements to the license-list-XML repo (https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md). To improve the algorithm, contribute to the SPDX tools repo (https://github.com/spdx/tools/blob/master/CONTRIBUTING.md).

There are some good discussions under way on improving the license submittal process, but I'll leave comments on the process to the legal team members.

Gary
