Correct handling of snippets


Max Mehl
 

Hi all,

At REUSE, we are currently discussing how to correctly handle snippets from a
third party, potentially under a different license [^1]. Since we strive
to make as much use of SPDX as possible, I wonder how you would
solve this.

I saw that SPDX uses the following tags instead of FileCopyrightText and
License-Identifier:

* SPDX-SnippetCopyrightText: Foo Bar
* SPDX-SnippetLicenseConcluded: CC-BY-SA-4.0

This raised a bunch of questions:

* How would one mark the beginning and end of a snippet?
* "LicenseConcluded" is quite different from the well-known
License-Identifier [^2], so it is not very intuitive for developers. Is
there some kind of alias that people can use?
* It was asked how one could refer to the source of the snippet.
"SnippetLicenseComments"?


Best,
Max


[^1]: https://lists.fsfe.org/pipermail/reuse/2020q1/000051.html

[^2]: I am aware that SPDX favours camel-case tag names nowadays.

--
Max Mehl - Programme Manager - Free Software Foundation Europe
Contact and information: https://fsfe.org/about/mehl | @mxmehl
Become a supporter of software freedom: https://fsfe.org/join


James Bottomley
 

On Fri, 2020-06-05 at 16:15 +0200, Max Mehl wrote:
> Hi all,
>
> At REUSE, we are currently discussing how to correctly handle snippets
> from a third party, potentially under a different license [^1]. Since
> we strive to make as much use of SPDX as possible, I wonder how
> you would solve this.
>
> I saw that SPDX uses the following tags instead of FileCopyrightText
> and License-Identifier:
>
> * SPDX-SnippetCopyrightText: Foo Bar
> * SPDX-SnippetLicenseConcluded: CC-BY-SA-4.0
>
> This raised a bunch of questions:
>
> * How would one mark the beginning and end of a snippet?
> * "LicenseConcluded" is quite different from the well-known
> License-Identifier [^2], so it is not very intuitive for developers. Is
> there some kind of alias that people can use?
> * It was asked how one could refer to the source of the snippet.
> "SnippetLicenseComments"?

I really think this is a recipe for disaster. What's wrong with simply
keeping the licence of the file, since to be contributed, the snippet
must be compatible with it? To put it another way, why treat a snippet
(a cut and paste) differently from a usual contribution? If someone
really wants to cut a function out of Linux and put it in BSD based on
the theory that it's originally a snippet under a permissive licence,
they'll have to do a lot of legal analysis anyway.

The problem you'll run into if you track snippets differently is that
once a snippet inside a file is modified, under the DCO the
modifications are under the licence "of the file", not of the snippet,
so the newly derived snippet is now unconditionally under the licence
of the file. That would make any separate tracking of the snippet
licence wrong unless someone manually keeps it in sync, which is not a
burden any maintainer wants.

The way we handle explicitly allowing code to move from Linux to BSD
(usually in the area of drivers) is to make the *file* licence dual
GPLv2/BSD so it's unequivocally agreed that every contribution is under
both licences and thus a cut and paste from anywhere in the file is OK
to go under a sole BSD licensed project.

James


Gary O'Neall
 

We spent quite a bit of time discussing snippets in the SPDX technical working group. There are definitely a number of issues and considerations.

At the conclusion of the discussions, there was a consensus that denoting snippets in an SPDX document was required for several use cases and was a common scenario in JavaScript / Node environments.

You could use the SPDX term "LicenseInfoInSnippet" since you are including the license information directly in the copied snippet. I've always treated this similarly to the declared license for packages.

In terms of marking the start and end of the snippet, I don't know of any existing SPDX tags that would help. Within the SPDX document, we use a byte range. This would be rather impractical within the file containing the snippets. The proposal in the referenced thread of using a tag at the start and end of the snippet range looks like it would work.

Gary



James Bottomley
 

On Fri, 2020-06-05 at 10:58 -0700, Gary O'Neall wrote:
> We spent quite a bit of time discussing snippets in the SPDX
> technical working group. There are definitely a number of issues and
> considerations.
>
> At the conclusion of the discussions, there was a consensus that
> denoting snippets in an SPDX document was required for several use
> cases and was a common scenario in JavaScript / Node environments.

To be clear: I'm agnostic about identifying snippets in a fixed work.
REUSE looks to be trying to apply snippets to living open source code
and the case that specifically concerns me is where the snippet is
under a different, but compatible, licence from the main file. How is
the case where a contribution modifies the snippet and thus changes the
licence supposed to be handled?

James


Max Mehl
 

Hi all,

At REUSE, we found some time to wrap our brains around this issue again.

~ Gary O'Neall [2020-06-05 19:58 +0200]:
> You could use the SPDX term "LicenseInfoInSnippet" since you are
> including the license information directly in the copied snippet.
> I've always treated this similarly to the declared license for packages.
>
> In terms of marking the start and end of the snippet, I don't know of
> any existing SPDX tags that would help. Within the SPDX document, we
> use a byte range. This would be rather impractical within the file
> containing the snippets. The proposal in the referenced thread of
> using a tag at the start and end of the snippet range looks like it
> would work.

Thanks for the suggestions. I think what we need is a consensus on 1.
how to mark the begin/end of a snippet, 2. how to mark copyright, and
3. how to mark the license of this snippet [^1].

1. We would both be fine with REUSE-Snippet-Begin, REUSE-SnippetBegin,
SPDX-Snippet-Begin or SPDX-SnippetBegin, and Snippet(-)End
respectively. Would SPDX want to introduce such a tag as an addition
to the existing Snippet information in the near future, or should
REUSE take the initiative here?

2. For copyright, we would prefer SPDX-SnippetCopyrightText as an
equivalent to SPDX-FileCopyrightText

3. For license, we would prefer SPDX-License-Identifier. This is the tag
people use for declaring the licensing of their files, but it could be
applied to snippets as well, scoped to the snippet that encloses it.

SPDX-LicenseInfoInSnippet might be the "official" way to do it,
but to be brutally honest, I find this counter-intuitive and very
hard to memorise. I know that License-Identifier has become the
unloved child for a few people because of the lack of CamelCase and
of a clear context, e.g. to files, but it's already out there, well-known,
and accepted. So I would suggest using it for snippets as well.


Another question was raised regarding nesting of snippets, i.e. the
strange case where third-party code that I would like to use as a
snippet already contains a third-party snippet. In this case, to
also be compatible with the current SPDX info on snippets, we would
suggest not allowing nested snippets and instead mandating that a snippet
has to end before the next snippet can begin. Would you
agree?
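
To make the no-nesting rule concrete, here is a minimal Python sketch of how a tool might collect snippet ranges from inline tags. It assumes the spellings SPDX-SnippetBegin and SPDX-SnippetEnd, which are only one of the candidates listed above and not an agreed standard. The function name find_snippets and the sample file contents (including the name "Jane Doe") are hypothetical.

```python
# Assumed tag spellings; the thread had not settled on one.
BEGIN = "SPDX-SnippetBegin"
END = "SPDX-SnippetEnd"

# Hypothetical file content used only for illustration.
SAMPLE = """\
# SPDX-FileCopyrightText: 2020 Jane Doe
# SPDX-License-Identifier: GPL-3.0-or-later

def main():
    pass

# SPDX-SnippetBegin
# SPDX-SnippetCopyrightText: Foo Bar
# SPDX-License-Identifier: CC-BY-SA-4.0
def helper():
    pass
# SPDX-SnippetEnd
"""


def find_snippets(text):
    """Return 1-indexed (first_line, last_line) pairs, one per snippet.

    Raises ValueError if a snippet begins before the previous one has
    ended (nesting), or if the begin/end tags are unbalanced.
    """
    spans = []
    start = None  # line number of the currently open snippet, if any
    for number, line in enumerate(text.splitlines(), start=1):
        if BEGIN in line:
            if start is not None:
                raise ValueError(
                    f"line {number}: nested snippet (opened on line {start})"
                )
            start = number
        elif END in line:
            if start is None:
                raise ValueError(f"line {number}: end tag without begin tag")
            spans.append((start, number))
            start = None
    if start is not None:
        raise ValueError(f"snippet opened on line {start} is never closed")
    return spans


print(find_snippets(SAMPLE))  # [(7, 12)]
```

Because only one snippet can be open at a time, a single variable is enough to track state; allowing nesting would require a stack and a rule for which tags belong to which level.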


Looking forward to your replies and a constructive discussion!

Best,
Max



[^1]: To James' concerns: I would like to ignore the potential license
compliance problems with snippets carrying incompatible licenses for
this thread. Let's discuss first how devs can actually communicate that
they've used a third-party snippet and under which conditions, before we
think about how compliance might have to be handled in various (edge)
cases.



Gary O'Neall
 

Hi Max,


> 1. We would both be fine with REUSE-Snippet-Begin, REUSE-SnippetBegin,
> SPDX-Snippet-Begin or SPDX-SnippetBegin, and Snippet(-)End
> respectively. Would SPDX want to introduce such a tag as an addition
> to the existing Snippet information in the near future, or should
> REUSE take the initiative here?

[G.O.] I personally would like to include this in the SPDX spec - we just need a volunteer to create an issue or (better yet) a pull request to update the Annex E Using SPDX license list short identifiers in source files (https://github.com/spdx/spdx-spec/blob/development/v2.2.1/chapters/using-SPDX-short-identifiers-in-source-files.md#annex-e-using-spdx-license-list-short-identifiers-in-source-files-informative). I would offer help on this, but I'm pretty busy with this year's Google Summer of Code and won't be able to help much for the next couple of months.

My preference would be SPDX-Snippet-Begin or SPDX-SnippetBegin.

> 2. For copyright, we would prefer SPDX-SnippetCopyrightText as an
> equivalent to SPDX-FileCopyrightText

[G.O.] Agree

> 3. For license, we would prefer SPDX-License-Identifier. This is the tag
> people use for declaring the licensing of their files, but it could be
> applied to snippets as well, scoped to the snippet that encloses it.
>
> SPDX-LicenseInfoInSnippet might be the "official" way to do it,
> but to be brutally honest, I find this counter-intuitive and very
> hard to memorise. I know that License-Identifier has become the
> unloved child for a few people because of the lack of CamelCase and
> of a clear context, e.g. to files, but it's already out there, well-known,
> and accepted. So I would suggest using it for snippets as well.

[G.O.] Is there a possible ambiguity as to whether an SPDX-License-Identifier is associated with a file or a snippet?

> Another question was raised regarding nesting of snippets, i.e. the strange case
> where third-party code that I would like to use as a snippet already contains a
> third-party snippet. In this case, to also be compatible with the current
> SPDX info on snippets, we would suggest not allowing nested snippets and
> instead mandating that a snippet has to end before the next snippet can
> begin. Would you agree?

[G.O.] To be honest, I haven't considered the nesting of snippets. Un-nested snippets are complex enough ;) In an SPDX document nesting is allowed since snippets are expressed with byte ranges and there is no rule to prevent nesting or even overlapping snippets. When marking snippets inline, it is a bit more challenging. I would definitely disallow overlapping snippets (e.g. Snippet A is lines 1 through 20 and Snippet B is lines 10 through 30). Nesting may be useful; however, it would significantly complicate the tooling. I don't feel strongly, but I tend to agree with the proposal that nesting not be allowed.
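
The distinction drawn here between separate, nested, and overlapping ranges can be sketched in a few lines of Python. This is only an illustration under assumptions: ranges are inclusive (start, end) pairs of line numbers, and the function names relation and conflicts are hypothetical, not part of any SPDX tooling.

```python
from itertools import combinations


def relation(a, b):
    """Classify two inclusive (start, end) ranges.

    Returns "disjoint", "nested" (one range contains the other),
    or "overlapping" (they share lines but neither contains the other).
    """
    first, second = sorted([a, b])  # first starts no later than second
    if first[1] < second[0]:
        return "disjoint"
    if second[1] <= first[1] or first[0] == second[0]:
        return "nested"
    return "overlapping"


def conflicts(ranges):
    """List every pair of ranges that is not disjoint, with its relation."""
    found = []
    for a, b in combinations(ranges, 2):
        kind = relation(a, b)
        if kind != "disjoint":
            found.append((a, b, kind))
    return found


# Snippet A is lines 1 through 20 and Snippet B is lines 10 through 30.
print(relation((1, 20), (10, 30)))  # overlapping
print(relation((1, 30), (10, 20)))  # nested
print(relation((1, 5), (6, 10)))    # disjoint
```

An inline-tag scheme that forbids nesting would treat both "nested" and "overlapping" as errors, whereas a byte-range SPDX document, as described above, has no rule against either.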




Max Mehl
 

~ Gary O'Neall [2020-07-28 02:41 +0200]:
> > 1. We would both be fine with REUSE-Snippet-Begin, REUSE-SnippetBegin,
> > SPDX-Snippet-Begin or SPDX-SnippetBegin, and Snippet(-)End
> > respectively. Would SPDX want to introduce such a tag as an addition
> > to the existing Snippet information in the near future, or should
> > REUSE take the initiative here?
>
> [G.O.] I personally would like to include this in the SPDX spec - we just need a volunteer to create an issue or (better yet) a pull request to update the Annex E Using SPDX license list short identifiers in source files (https://github.com/spdx/spdx-spec/blob/development/v2.2.1/chapters/using-SPDX-short-identifiers-in-source-files.md#annex-e-using-spdx-license-list-short-identifiers-in-source-files-informative). I would offer help on this, but I'm pretty busy with this year's Google Summer of Code and won't be able to help much for the next couple of months.
>
> My preference would be SPDX-Snippet-Begin or SPDX-SnippetBegin.

Thank you! I've opened a Pull Request, but it only touches Annex E. I
wondered whether we also have to clarify other snippet specifics in
snippet-information.md subsequently, but see more here:

https://github.com/spdx/spdx-spec/pull/464

> > 3. For license, we would prefer SPDX-License-Identifier. This is the tag
> > people use for declaring the licensing of their files, but it could be
> > applied to snippets as well, scoped to the snippet that encloses it.
> >
> > SPDX-LicenseInfoInSnippet might be the "official" way to do it,
> > but to be brutally honest, I find this counter-intuitive and very
> > hard to memorise. I know that License-Identifier has become the
> > unloved child for a few people because of the lack of CamelCase and
> > of a clear context, e.g. to files, but it's already out there, well-known,
> > and accepted. So I would suggest using it for snippets as well.
>
> [G.O.] Is there a possible ambiguity as to whether an SPDX-License-Identifier is associated with a file or a snippet?

For unaware tools, perhaps. They would detect that there are multiple
License-Identifiers (is this legal in SPDX?), but this way at least they
would know about the potentially differently licensed code in the file.

For snippet-aware tools, it should not be hard to detect whether a
License-Identifier is inside a snippet or not. In my PR's description I explain why
"Snippet-License-Identifier" might be even more confusing to users.

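As a rough illustration of that detection step, the following Python sketch attributes each SPDX-License-Identifier line either to the file or to the snippet enclosing it. It assumes the tag spellings SPDX-SnippetBegin and SPDX-SnippetEnd (candidates from this thread, not a settled standard) and un-nested snippets. The function name licenses_by_scope and the sample content are hypothetical.

```python
import re

# Assumed tag spellings; the thread had not settled on one.
BEGIN = "SPDX-SnippetBegin"
END = "SPDX-SnippetEnd"
# Captures the expression after the tag, without trailing whitespace.
# Comment terminators such as "*/" are not stripped in this sketch.
LICENSE = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*$")

# Hypothetical file content used only for illustration.
SAMPLE = """\
# SPDX-FileCopyrightText: 2020 Jane Doe
# SPDX-License-Identifier: GPL-3.0-or-later

def main():
    pass

# SPDX-SnippetBegin
# SPDX-SnippetCopyrightText: Foo Bar
# SPDX-License-Identifier: CC-BY-SA-4.0
def helper():
    pass
# SPDX-SnippetEnd
"""


def licenses_by_scope(text):
    """Return (line_number, scope, expression) for each identifier line.

    scope is "snippet" when the line sits between a begin and an end
    tag, and "file" otherwise.
    """
    results = []
    in_snippet = False
    for number, line in enumerate(text.splitlines(), start=1):
        if BEGIN in line:
            in_snippet = True
        elif END in line:
            in_snippet = False
        else:
            match = LICENSE.search(line)
            if match:
                scope = "snippet" if in_snippet else "file"
                results.append((number, scope, match.group(1)))
    return results


for entry in licenses_by_scope(SAMPLE):
    print(entry)
# (2, 'file', 'GPL-3.0-or-later')
# (9, 'snippet', 'CC-BY-SA-4.0')
```

A tool that does not know the begin/end tags would simply see two identifiers in one file, which is the "unaware tools" case described above.
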
> > Another question was raised regarding nesting of snippets, i.e. the strange case
> > where third-party code that I would like to use as a snippet already contains a
> > third-party snippet. In this case, to also be compatible with the current
> > SPDX info on snippets, we would suggest not allowing nested snippets and
> > instead mandating that a snippet has to end before the next snippet can
> > begin. Would you agree?
>
> [G.O.] To be honest, I haven't considered the nesting of snippets. Un-nested snippets are complex enough ;) In an SPDX document nesting is allowed since snippets are expressed with byte ranges and there is no rule to prevent nesting or even overlapping snippets. When marking snippets inline, it is a bit more challenging. I would definitely disallow overlapping snippets (e.g. Snippet A is lines 1 through 20 and Snippet B is lines 10 through 30). Nesting may be useful; however, it would significantly complicate the tooling. I don't feel strongly, but I tend to agree with the proposal that nesting not be allowed.

Great to know we're on the same page here then ;)

Best,
Max
