SPDX: license equivalence rules
Peterson, Scott K (HP Legal) <scott.k.peterson@...>
This comment is NOT about what the normalization should be or what equivalences should be permitted. Rather, I suggest a different approach to how we represent the result of the agreed upon normalization/equivalence rules.
I suggest reconceiving "templatization" as defining "match rules".
Templatization seems to me to be a process of partially applying the match rules to take a step toward comparison. It is not apparent to me what the value is of giving that particular intermediate document special status.
I see a danger in two versions. Which contains the authoritative information? Whenever there are two, there is danger of them becoming misaligned.
There should be a single, canonical text. Applying the match rules against that canonical text and a candidate text would yield the authoritative answer to the question of whether the candidate text corresponds to the license represented by the canonical text.
Given the rules, anyone would be free to pre-process the canonical texts into whatever sorts of intermediate versions they thought would facilitate performance of their comparison tool. Choosing a particular intermediate version seems to add unnecessary complexity.
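To illustrate the single-canonical-text approach described above, here is a minimal Python sketch. The rules inside `apply_match_rules` (ignore case and whitespace differences) are only an assumption for illustration, since this message takes no position on what the agreed rules should be; the function names and the sample text are likewise hypothetical.

```python
import re


def apply_match_rules(text: str) -> str:
    """Apply the (assumed) match rules: collapse whitespace, ignore case."""
    return re.sub(r"\s+", " ", text).strip().lower()


def matches(canonical: str, candidate: str) -> bool:
    """Decide whether the candidate corresponds to the canonical text.

    The same rules are applied to both sides, so the canonical text
    remains the only stored reference.
    """
    return apply_match_rules(canonical) == apply_match_rules(candidate)


# Illustrative fragment only, not a complete license text.
canonical = "Permission is hereby granted,\nfree of charge, to any person"
candidate = "permission is hereby granted, free  of charge, to any person"

print(matches(canonical, candidate))                               # True
print(matches(canonical, "Permission is granted to any person"))  # False

# A tool is free to cache its own intermediate form for speed;
# that form is private to the tool and has no special status.
cached_form = apply_match_rules(canonical)
```

Under this design only the canonical text and the rules are published; any intermediate form such as `cached_form` is an implementation detail of the individual tool.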
-- Scott
Kate Stewart <kate.stewart@...>
Hi Scott,
On Wed, 2011-03-23 at 16:33 +0000, Peterson, Scott K (HP Legal) wrote:
> This comment is NOT about what the normalization should be or what equivalences should be permitted. Rather, I suggest a different approach to how we represent the result of the agreed upon normalization/equivalence rules.
>
> I suggest reconceiving "templatization" as defining "match rules".

The problem is that today the various tools have nothing to check against to make sure that they are applying the rules correctly.

The templatized version is intended as a tool checker, so that the right substitutions can be recognized, rather than as an alternate human-readable reference. The golden reference should always be what's on the official public web site of a license, which is human readable. The official authoritative version will be copied onto the SPDX web site verbatim. The proposal is to have the processed version there as well, and marked as such, so that when there are disagreements between various tools doing license recognition and asserting the short form, they have a common comparison point.

It's intended more like an answer sheet for a teacher administering a test to students, to know which answers are OK and which aren't.

For instance, I believe that Daniel German (Ninka tool) and Bob Gobeille (FOSSology tool) get together from time to time (or intended to, when I last talked to them about it last year) to talk about why their tools don't recognize the same licenses. Having a templatized license text would help future tool creators (open source as well as commercial vendors) check that they are able to recognize a license accurately before asserting the short form.

It is meant to illustrate what should happen when the match rules are applied.

> Templatization seems to me to be a process of partially applying the match rules to take a step toward comparison. It is not apparent to me what the value is of giving that particular intermediate document special status.

As a check for the tools, and to build confidence that the match rules the spdx-legal team is comfortable with are applied consistently.

> I see a danger in two versions. Which contains the authoritative information? Whenever there are two, there is danger of them becoming misaligned.

The authoritative version is the version on the project's public web site. In some cases, though, the OSI site has a copy and that is used as the authoritative version.

We copy that version onto the SPDX web page for convenience, along with the link to the authoritative public site we get it from.

On the SPDX web page, we'll also be adding the "templatized" version as a convenience, produced by applying the rules to the original authoritative version, so folks can see what applying the match rules yields ("the answer sheet" for the test, to continue with my earlier analogy).

> There should be a single, canonical text. Applying the match rules against that canonical text and a candidate text would yield the authoritative answer to the question of whether the candidate text corresponds to the license represented by the canonical text.

The single canonical text will be copied verbatim (spaces, capitalization, etc. intact) from the authoritative web site for that license.

The templatized version is just the result of applying the match rules to the authoritative version.

We should definitely take care to make this VERY clear on the web site.

> Given the rules, anyone would be free to pre-process the canonical texts into whatever sorts of intermediate versions they thought would facilitate performance of their comparison tool. Choosing a particular intermediate version seems to add unnecessary complexity.

See comments above.

Kate
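The "answer sheet" use described in this message can be sketched as a small self-check that a tool author might run. Everything here is hypothetical: the function names, the sample text, and the assumed rules (collapse whitespace, ignore case) stand in for whatever match rules and templatized texts are actually agreed and published.

```python
import re


def tool_normalize(text: str) -> str:
    """Stand-in for one tool's implementation of the (assumed) match rules."""
    return re.sub(r"\s+", " ", text).strip().lower()


def buggy_normalize(text: str) -> str:
    """A deliberately incomplete implementation that only trims the ends."""
    return text.strip()


def check_against_answer_sheet(normalize, authoritative: str, templatized: str) -> bool:
    """Return True if the tool's output equals the published templatized text."""
    return normalize(authoritative) == templatized


# Illustrative fragment only, not a complete license text.
authoritative = (
    "Redistribution and use in source and binary forms,\n"
    "  with or without modification, are permitted"
)
# What the published "answer sheet" would contain under the assumed rules.
published_templatized = (
    "redistribution and use in source and binary forms, "
    "with or without modification, are permitted"
)

print(check_against_answer_sheet(tool_normalize, authoritative, published_templatized))   # True
print(check_against_answer_sheet(buggy_normalize, authoritative, published_templatized))  # False
```

A failing check points to a disagreement between the tool and the rules, which is the kind of discrepancy the common comparison point is meant to surface.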
Peter Williams <peter.williams@...>
I think having some examples of text with the normalization rules applied is a good idea. However, those examples should be in the spec. Having to go to the registry to see examples will make it harder to implement the normalization algorithm.

If the only use of the normalized text for standard licenses is for example purposes, I don't think we really need to do all the licenses. Not having the normalized text in the registry would make its design easier. (The versioning issues are particularly non-trivial.)

Peter

On Mar 23, 2011 10:21 AM, "Kate Stewart" <kate.stewart@...> wrote:
> [...]
> _______________________________________________
> Spdx-legal mailing list
> Spdx-legal@...
> https://fossbazaar.org/mailman/listinfo/spdx-legal