Re: Correct handling of snippets


Max Mehl
 

~ Gary O'Neall [2020-07-28 02:41 +0200]:
1. We would both be fine with REUSE-Snippet-Begin, REUSE-SnippetBegin,
SPDX-Snippet-Begin or SPDX-SnippetBegin, and Snippet(-)End
respectively. Would SPDX want to introduce such a tag as an addition
to the existing Snippet information in the near future, or should
REUSE take the initiative here?
[G.O.] I personally would like to include this in the SPDX spec - we just need a volunteer to create an issue or (better yet) a pull request to update Annex E, "Using SPDX license list short identifiers in source files" (https://github.com/spdx/spdx-spec/blob/development/v2.2.1/chapters/using-SPDX-short-identifiers-in-source-files.md#annex-e-using-spdx-license-list-short-identifiers-in-source-files-informative). I would offer help on this, but I'm pretty busy with this year's Google Summer of Code and won't be able to help much for the next couple of months.

My preference would be SPDX-Snippet-Begin or SPDX-SnippetBegin.
Thank you! I've opened a Pull Request, but it only touches Annex E. I
wondered whether we also have to clarify other snippet specifics in
snippet-information.md subsequently, but see more here:

https://github.com/spdx/spdx-spec/pull/464

3. For the license, we would prefer SPDX-License-Identifier. This is the tag
people use for declaring the licensing of their files, but it could
apply to snippets as well, scoped to their enclosed context.

SPDX-LicenseInfoInSnippet might be the "official" way to do it,
but to be brutally honest, I find this counter-intuitive and very
hard to memorise. I know that License-Identifier has become the
unloved child for a few people because of the lack of CamelCase and
clear context e.g. to files, but it's already out there, well-known,
and accepted. So I would suggest using it for snippets as well.
[G.O.] Is there a possible ambiguity as to whether an SPDX-License-Identifier is associated with a file or a snippet?
For unaware tools, perhaps. They would detect that there are multiple
License-Identifiers (is this legal in SPDX?), but this way at least they
would know about the potentially differently licensed code in the file.

For tools, it should not be hard to detect whether License-Identifier is
inside a snippet or not. In my PR's description I explain why
"Snippet-License-Identifier" might be even more confusing to users.

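To illustrate the point about tools, here is a minimal sketch of how a scanner could tell file-level from snippet-level License-Identifiers. The tag spellings SPDX-SnippetBegin and SPDX-SnippetEnd are assumptions (the final names are still open in this thread), and classify_license_tags is a hypothetical helper, not part of any existing tool.

```python
import re

# Assumed tag spellings; the thread has not settled on a final form.
BEGIN = "SPDX-SnippetBegin"
END = "SPDX-SnippetEnd"
LICENSE_RE = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*$")


def classify_license_tags(text):
    """Split License-Identifier tags into file-level and snippet-level ones.

    Returns (file_licenses, snippet_licenses), where snippet_licenses is a
    list of (snippet_begin_line, license_expression) pairs.
    """
    file_licenses = []
    snippet_licenses = []
    snippet_start = None  # line number of the currently open snippet, if any
    for lineno, line in enumerate(text.splitlines(), start=1):
        if BEGIN in line:
            snippet_start = lineno
        elif END in line:
            snippet_start = None
        else:
            match = LICENSE_RE.search(line)
            if match is None:
                continue
            if snippet_start is None:
                # Outside any snippet: the tag applies to the whole file.
                file_licenses.append(match.group(1))
            else:
                # Inside a snippet: the tag applies to that snippet only.
                snippet_licenses.append((snippet_start, match.group(1)))
    return file_licenses, snippet_licenses
```

The sketch assumes "#"-style line comments; block comments with a closing delimiter on the same line would need extra trimming.
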
Another question was raised regarding nesting of snippets, i.e. the strange case
where third-party code that I would like to use as a snippet already contains a
third-party snippet. In this case, to also be compatible with the current
SPDX info on snippets, we would suggest not allowing nested snippets but
instead mandating that a snippet has to end before the next snippet can
begin. Would you agree?
[G.O.] To be honest, I haven't considered the nesting of snippets. Un-nested snippets are complex enough ;) In an SPDX document, nesting is allowed, since snippets are expressed as byte ranges and there is no rule to prevent nesting or even overlapping snippets. When marking snippets inline, it is a bit more challenging. I would definitely disallow overlapping snippets (e.g. Snippet A is lines 1 through 20 and Snippet B is lines 10 through 30). Nesting may be useful; however, it would significantly complicate the tooling. I don't feel strongly, but I tend to agree with the proposal that nesting not be allowed.
Great to know we're on the same page here then ;)
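
For reference, the "a snippet has to end before the next can begin" rule could be checked roughly as follows. This is only a sketch under assumptions: the tag spellings SPDX-SnippetBegin and SPDX-SnippetEnd are placeholders until the names are decided, and find_snippets is a hypothetical function.

```python
# Assumed tag spellings; the final names are not decided in this thread.
BEGIN = "SPDX-SnippetBegin"
END = "SPDX-SnippetEnd"


def find_snippets(text):
    """Return (begin_line, end_line) pairs for all snippets in text.

    Raises ValueError if a snippet begins before the previous one has ended,
    if an end marker has no matching begin, or if a snippet is never closed.
    """
    ranges = []
    open_start = None  # line number of the currently open snippet, if any
    for lineno, line in enumerate(text.splitlines(), start=1):
        if BEGIN in line:
            if open_start is not None:
                raise ValueError(
                    f"line {lineno}: snippet begins inside the snippet "
                    f"opened on line {open_start}"
                )
            open_start = lineno
        elif END in line:
            if open_start is None:
                raise ValueError(f"line {lineno}: snippet end without begin")
            ranges.append((open_start, lineno))
            open_start = None
    if open_start is not None:
        raise ValueError(f"snippet opened on line {open_start} is never closed")
    return ranges
```

Since plain begin/end markers carry no identifier, a nested and an overlapping snippet look the same to such a check, so this one rule rejects both.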

Best,
Max

--
Max Mehl - Programme Manager - Free Software Foundation Europe
Contact and information: https://fsfe.org/about/mehl | @mxmehl
Become a supporter of software freedom: https://fsfe.org/join
