Idea: SPDX-DCO-File-License-Identifier

Richard Fontana

I've thought some more about certain unintended problems some of us
were previously discussing regarding the use of
SPDX-License-Identifier: in source files. In particular it's occurred
to me that the practice is in tension with the use of the Developer
Certificate of Origin. This is significant given that Linux is the
most important project to have adopted both the DCO and the use of
SPDX-License-Identifier:.

Somewhat problematically, the DCO makes reference to a license of a
*file* in its certification: "The contribution was created in whole or
in part by me and I have the right to submit it under the open source
license indicated in the file." Most open source projects (possibly
including most DCO-using projects, themselves a tiny minority of open
source projects) do not explicitly license individual source files, an
omission I understand SPDX at least implicitly disapproves of. In
typical situations, the use of SPDX-License-Identifier: has the nice
feature of clarifying what the "open source license indicated in the
file" is in a standard way.

But for any source file that properly uses SPDX-License-Identifier and
is based on multiply-licensed code (for example, a file properly
reflecting contributions under both GPLv2 and the 3-clause BSD
license), the DCO-related benefit goes away. I don't know if there is
an example like this in the kernel, but suppose the kernel had such a
source file, saying "SPDX-License-Identifier: GPL-2.0-only AND
BSD-3-Clause". What is the "license indicated in the file" for
purposes of the DCO? Prior to the use of SPDX-License-Identifier, the
variously-worded GPL headers might have made it sufficiently clear
that the license for purposes of the DCO is GPLv2 (GPL-2.0-only). Or
the situation was ambiguous but you could then rely on the note from
Linus Torvalds in the kernel COPYING file (or wherever that lives
now). Now, it appears that a DCO-certifying contributor to a file that
already has (based on an analysis of past-incorporated license
notices) "SPDX-License-Identifier: GPL-2.0-only AND BSD-3-Clause" is
certifying that they have the right to submit it under the stated
composite license. It's not even clear that this makes sense -- how
can a single isolable contribution by one person be licensed under two
different open source licenses in a conjunctive, not disjunctive,
sense? (As I said previously, the snippet construct is probably not
going to be practical to use in most cases and I'm not sure it solves
the problem anyway.)

So I am wondering if it would be a good solution to introduce an
additional construct like "SPDX-DCO-File-License-Identifier" for those
cases involving source files where SPDX-License-Identifier: would
imply a licensing policy at odds with the actual policy of the
project. For example, you could have a source file with the following:

SPDX-License-Identifier: GPL-2.0-only AND BSD-3-Clause
SPDX-DCO-File-License-Identifier: GPL-2.0-only
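To make the idea concrete, here is a minimal Python sketch of how a
tool might read the two tags. Note that
SPDX-DCO-File-License-Identifier is only the construct proposed in this
message and is not part of the SPDX spec; the function names and the
fallback rule (use SPDX-License-Identifier when the proposed tag is
absent) are assumptions for illustration only.

```python
import re

# Matches the existing tag and the tag proposed in this message.
# "SPDX-DCO-File-License-Identifier" is hypothetical, not in the SPDX spec.
TAG_RE = re.compile(
    r"(SPDX-License-Identifier|SPDX-DCO-File-License-Identifier):"
    r"\s*(.+?)\s*(?:\*/)?\s*$"
)


def read_license_tags(text):
    """Return a dict mapping each tag name to its first expression."""
    tags = {}
    for line in text.splitlines():
        match = TAG_RE.search(line)
        if match and match.group(1) not in tags:
            tags[match.group(1)] = match.group(2)
    return tags


def inbound_license(text):
    """License a contributor would certify under (assumed rule).

    Prefer the proposed DCO-specific tag; otherwise fall back to the
    ordinary SPDX-License-Identifier expression.
    """
    tags = read_license_tags(text)
    return tags.get(
        "SPDX-DCO-File-License-Identifier",
        tags.get("SPDX-License-Identifier"),
    )


source = (
    "// SPDX-License-Identifier: GPL-2.0-only AND BSD-3-Clause\n"
    "// SPDX-DCO-File-License-Identifier: GPL-2.0-only\n"
    "int x;\n"
)
print(inbound_license(source))
```

With both tags present, the sketch reports GPL-2.0-only as the inbound
license while the composite expression remains available as the
description of the file as a whole.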

Note this is different from the issue that McCoy was asking about in
an earlier thread, but I think it is somewhat related.

The issue of course is not really limited to projects using the DCO,
but rather any project that informally or otherwise adopts an
"inbound=outbound" approach to licensing of contributions, which is
the vast majority of open source projects (so maybe a solution
shouldn't seem to be DCO-specific). A concern I have is that the use
of SPDX-License-Identifier: may be unintentionally optimized for the
minority of projects that use asymmetric contributor license
agreements to handle inbound licensing of contributions.


On Mon, Jul 18, 2022 at 8:14 PM Richard Fontana <rfontana@...> wrote:

I feel like what some projects might find useful is something like:


since these might point to different licenses. The snippet construct
can possibly express this adequately in some cases but I think
reliable identification of a snippet will normally be impractical.


On Sun, Jul 17, 2022 at 3:18 PM McCoy Smith <mccoy@...> wrote:

At the risk of sounding like I’m hijacking this to re-raise my prior issue:
If AND is the operator to be used when having different inbound vs outbound, then AND may not be commutative, since the order of listing the licenses may convey information about which license is inbound vs outbound, and (maybe) which license applies to different parts of the code.
Which, to me, militates toward a new expression, but I’ve made that point already.

On Jul 17, 2022, at 11:22 AM, Richard Fontana <rfontana@...> wrote:

I'm working on some draft documentation for Fedora around use of SPDX
expressions in RPM spec file License: fields. I was surprised not to
see anything in the SPDX spec that says that the AND
and OR operators are commutative. I want to assert that the expression
"MIT AND Apache-2.0" is equivalent to "Apache-2.0 AND MIT". Does the
SPDX spec actually take no position on this?
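
For illustration, here is a minimal Python sketch of what treating AND
and OR as commutative would mean for flat expressions. It assumes
commutativity rather than deriving it from the spec, handles only
expressions with a single operator and no parentheses or WITH clauses,
and uses hypothetical function names.

```python
def flat_operands(expr):
    """Split a flat expression into (operator, set of license IDs).

    Only handles "A", "A AND B AND ..." or "A OR B OR ..."; anything
    else is rejected as out of scope for this sketch.
    """
    if "(" in expr or ")" in expr:
        raise ValueError("parenthesized expressions are not handled")
    tokens = expr.split()
    if len(tokens) % 2 == 0:
        raise ValueError("malformed expression: " + repr(expr))
    operators = set(tokens[1::2])
    if len(operators) > 1 or not operators <= {"AND", "OR"}:
        raise ValueError("mixed or unsupported operators: " + repr(expr))
    operator = next(iter(operators), None)
    # A frozenset deliberately discards the order of the operands.
    return operator, frozenset(tokens[0::2])


def equivalent_if_commutative(a, b):
    """True if a and b differ only in operand order (assumed commutative)."""
    return flat_operands(a) == flat_operands(b)


print(equivalent_if_commutative("MIT AND Apache-2.0", "Apache-2.0 AND MIT"))
print(equivalent_if_commutative("MIT AND Apache-2.0", "MIT OR Apache-2.0"))
```

Comparing operand sets throws away ordering, so any information that
the order of licenses might be meant to convey is lost under this
reading.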