Re: [spdx-tech] An example of a super simple SPDX licenses registry, for discussion
On Mon, Mar 11, 2019 at 10:32 PM Richard Fontana <rfontana@...> wrote:
> Use of "LicenseRef" (not to mention something like
Agreed. What I am trying to achieve here is to have these become "standard" and
known at SPDX. I think this is possible.
On Sun, Mar 10, 2019 at 12:44 PM Jeff McAffer wrote:
> IMO the "ideal" here is that there is some automated way of
This ideal works in theory, but for several reasons I outline below it would be
too brittle in practice: you would get different fingerprints too often for
this to work. Instead, running a full license detection is a better way
to dedupe things. This requires some form of centralization, but it could be
fully automated. The other thing is that IMO giving a name/id does
matter a lot: a license named 43bdf298 is not really human friendly.
Now even if license-text-fingerprint-as-id were to work out, the difficult part
is not so much the algorithm for computing these, but the content you feed in for
fingerprinting. And that part is not easy to automate:
- For instance, is a copyright part of the license or not (I think not, but
- Or what about statements around a license? For instance these two SPDX
licenses may not really deserve different ids, yet they have them:
The LICENSE file in the original code archives does not have a patent
disclaimer statement footer seen in bzip2-1.0.5's SPDX license text.
That footer is present on the archive.org website only. I would not treat
this as part of the license, but this was treated as part of it here. This
is a judgment call.
- Or for instance, there are 6+ versions of the text of the GPL-2.0 which are
really the same but would fingerprint differently.
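To make the points above concrete, here is a minimal Python sketch of a
hash-based fingerprint. Everything in it is an assumption made for
illustration: the fingerprint() helper, the strip_copyright flag, the
8-character truncation and the sample texts are all made up, and none of this
is an actual SPDX or scancode algorithm. It only shows that the id depends on
what you decide to feed to the hash.

```python
import hashlib


def fingerprint(text: str, strip_copyright: bool = False) -> str:
    """Return a short hex fingerprint of a license text (hypothetical helper)."""
    lines = text.splitlines()
    if strip_copyright:
        # Judgment call: drop lines that start with "copyright".
        lines = [ln for ln in lines if not ln.strip().lower().startswith("copyright")]
    # Lowercase and collapse all whitespace so that re-wrapping does not matter.
    normalized = " ".join(" ".join(lines).lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:8]


# Made-up sample texts, not real license texts.
BASE = (
    "Permission is hereby granted to use this software.\n"
    "Provided as is, without warranty."
)
WITH_COPYRIGHT = "Copyright (c) 2019 Example Author\n" + BASE
WITH_FOOTER = BASE + "\nAn extra statement that someone added after the license."

if __name__ == "__main__":
    print(fingerprint(BASE))
    print(fingerprint(WITH_COPYRIGHT))                        # differs from BASE
    print(fingerprint(WITH_COPYRIGHT, strip_copyright=True))  # same as BASE
    print(fingerprint(WITH_FOOTER))                           # differs again
```

Stripping the copyright line recovers the same id, but the footer still
changes it, and deciding what counts as a footer is exactly the part that is
hard to automate.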
Therefore a fingerprint algorithm would be hard to generalize: either it would
need many exceptions, or a simple one would be too brittle in too many cases.
Deduping is best achieved by license detection with a full diff (which
is what scancode does FWIW).
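As a rough illustration of diff-based deduping, here is a small Python sketch
that uses the standard library difflib as a stand-in. This is not how scancode
is implemented; the similarity() and best_match() helpers, the 0.9 threshold
and the sample texts are assumptions made up for the example.

```python
import difflib


def tokens(text: str) -> list:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def similarity(a: str, b: str) -> float:
    """Word-level diff ratio between two texts, from 0.0 to 1.0."""
    matcher = difflib.SequenceMatcher(None, tokens(a), tokens(b), autojunk=False)
    return matcher.ratio()


def best_match(candidate: str, known: dict, threshold: float = 0.9):
    """Return (license_id, score) for the closest known text.

    license_id is None when no known text reaches the threshold.
    """
    best_id, best_score = None, 0.0
    for license_id, text in known.items():
        score = similarity(candidate, text)
        if score > best_score:
            best_id, best_score = license_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Made-up texts and ids, not real licenses.
KNOWN = {
    "example-a": "Permission is hereby granted to any person to use copy modify "
                 "and distribute this software for any purpose without fee",
    "example-b": "Redistribution in source and binary forms is permitted "
                 "provided that the above notice is retained",
}

if __name__ == "__main__":
    variant = KNOWN["example-a"].replace("software", "program")
    print(best_match(variant, KNOWN))
```

A one-word variant still resolves to the same known text, where a hash of the
full text would have produced a brand new id.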
Let me follow up with my suggestion.