Re: [spdx-tech] An example of a super simple SPDX licenses registry, for discussion
IMO the "ideal" here is that there is some automated way of "fingerprinting"
[G.O.] There are a few pieces of this in place. The SPDX legal team has developed matching guidelines (https://spdx.org/spdx-license-list/matching-guidelines) and implemented a template language, consistent with those guidelines, to express variability in license text (https://spdx.org/spdx-specification-21-web-version#h.2mjng0vqrghe). There is an implementation of these guidelines in an algorithm (https://github.com/spdx/tools/blob/master/src/org/spdx/compare/LicenseCompareHelper.java). Currently, when you submit a license, this algorithm is invoked and the license you submitted is compared to all approved licenses. This algorithm is also used when we generate the HTML and data formats of the license.
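To make the template idea concrete, here is a minimal sketch in Python of matching a candidate text against a template that contains variable sections. It is not the LicenseCompareHelper implementation: the function names are hypothetical, the normalization covers only case and whitespace (the matching guidelines cover more), the variable markup follows my reading of the template language, and optional sections are not handled at all.

```python
import re

# Variable-text rule as I understand the template markup (assumption):
#   <<var;name="...";original="...";match="REGEX">>
VAR_RULE = re.compile(r'<<var;name="[^"]*";original="[^"]*";match="([^"]*)">>')


def normalize(text: str) -> str:
    """Lower-case the text and collapse whitespace runs to single spaces."""
    return " ".join(text.lower().split())


def literal_to_regex(text: str) -> str:
    """Turn fixed template text into a regex that tolerates any whitespace."""
    return r"\s+".join(re.escape(word) for word in text.lower().split())


def template_to_regex(template: str) -> re.Pattern:
    """Build one regex from the fixed text and the variable rules."""
    pieces = []
    pos = 0
    for rule in VAR_RULE.finditer(template):
        pieces.append(literal_to_regex(template[pos:rule.start()]))
        pieces.append("(?:" + rule.group(1) + ")")  # the rule's own match regex
        pos = rule.end()
    pieces.append(literal_to_regex(template[pos:]))
    pattern = r"\s*".join(piece for piece in pieces if piece)
    return re.compile(pattern, re.DOTALL | re.IGNORECASE)


def matches(template: str, candidate: str) -> bool:
    """True if the whole normalized candidate fits the template."""
    return template_to_regex(template).fullmatch(normalize(candidate)) is not None


# Hypothetical template and candidate, for illustration only.
TEMPLATE = (
    'Copyright (c) <<var;name="owner";original="ACME";match=".+">>\n'
    "Permission is hereby granted to use this software."
)
CANDIDATE = "copyright (C)   Jane Doe\n\nPermission is hereby   granted to use this software."

if __name__ == "__main__":
    print(matches(TEMPLATE, CANDIDATE))
```

The point of the sketch is that the copyright holder's name can vary freely while the fixed wording must still line up, which is why a license whose XML lacks the variable markup fails to match text that a human would consider the same license.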
There are a few inhibitors to realizing Jeff's "Ideal" solution. The solution relies on templates expressed in the license XML, and not all licenses have the templates fully implemented. If you compare two licenses, expect a match, and don't get one, the likely cause is missing templatization. Another inhibitor is that the algorithm is rather slow, which prevents its use in broader license scanning solutions. There are likely also bugs to fix and improvements to make.
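One common way to work around a slow exact comparison is to shortlist candidates with a cheap similarity score first and run the expensive match only on the survivors. The sketch below is an assumption on my part, not something the SPDX tools do today; the function names, the 0.8 threshold, and the license IDs and texts are all hypothetical.

```python
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    """Split text into lower-case alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def similarity(a: list[str], b: list[str]) -> float:
    """Multiset Jaccard similarity between two token lists (0.0 to 1.0)."""
    count_a, count_b = Counter(a), Counter(b)
    shared = sum((count_a & count_b).values())
    total = sum((count_a | count_b).values())
    return shared / total if total else 0.0


def shortlist(candidate: str, licenses: dict[str, str], threshold: float = 0.8) -> list[str]:
    """Return IDs of licenses similar enough to deserve a full comparison."""
    cand = tokens(candidate)
    scored = [(similarity(cand, tokens(text)), lid) for lid, text in licenses.items()]
    return [lid for score, lid in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical license texts, for illustration only.
LICENSES = {
    "Example-A": "Permission is hereby granted to use this software",
    "Example-B": "Redistribution and use in source and binary forms are permitted",
}

if __name__ == "__main__":
    print(shortlist("Permission is hereby granted to use this software.", LICENSES))
```

A pre-filter like this can only rule licenses out; anything that survives would still need the full guideline-based comparison, and the threshold would have to be loose enough that long variable sections do not cause false rejections.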
Contributions are more than welcome to overcome these limitations.
To improve the license templates, contribute XML improvements to the LicenseList-XML repo: https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md
To improve the algorithm, contribute to the SPDX tools repo: https://github.com/spdx/tools/blob/master/CONTRIBUTING.md
There are some good active discussions ongoing on improving the license submittal process, but I'll leave comments on the process to the legal team members.