Re: License of an open source license text

J Lovejoy

Hi all,

Thanks Till for weighing in here!

I think there are two general issues that come up here:

(a) A technical question: When generating SPDX data at the file level, how does one identify the LICENSE.txt file?
Various ideas have been raised here. Some of you might be interested to know (if you weren’t here or don’t remember) that when we were discussing the change to GPL-x.y-or-later and GPL-x.y-only identifiers, one proposal that circulated was to keep the “plain” GPL-2.0 identifier for the purpose of identifying when one finds the text of the license alone which does not indicate whether it’s “only” or “or later” because this is indicated in the license header or with the SPDX identifier. (Unfortunately, this proposal did not win out for unrelated reasons). 

(b) A legal question: what is the license for the LICENSE.txt file itself?

I think there is a different question that is being missed:
(c) Does it matter? (or Do I need to know the license of the LICENSE.txt file itself? or Is there a license for the LICENSE.txt file itself?)

Because if the answer to (c) is - it doesn’t matter, I don’t need to know, and no, then that answers (b) and “solving” a) or how you answer a) doesn’t really matter much either.

Question (b) has come up several times over the years, always answered with various levels of detailed copyright analysis, as well as pragmatism (see above) by the excellent lawyers on this list. The people asking are not lawyers (but in some cases, have not been satisfied with answers by several lawyers…) 

I’d like to emphasize a few things Till said below:
I have no interest to know how the license text is licensed itself. 


I have an interest to know whether or not the license text is identical to
the original one 

This is what really matters. If I find a LICENSE.txt file and it’s an exact match to MIT - why wouldn’t I simply identify it as MIT?  I guess I don’t understand why having a new license identifier is needed or how that helps.  I’d be really curious to hear what other lawyers think on this bit - as we are the ones who are going to consume/review the license fields part of the SPDX data.

But in any case, let’s please start with the understanding that the license of the LICENSE.txt file doesn’t matter. Mostly because it’s generally understood that text in legal agreements is not copyrightable or (for the pragmatic approach) shouldn’t be and/or no one cares for it to be. Legal agreements, in their best form, convey a “meeting of the minds” between the parties in a way that’s clear and remains clear over time. As lawyers, we always copy well-written legal agreements. It would be silly (and wildly inefficient) not to. 

The (very few) open source licenses that do have a copyright notice or some other such communication as to the license text itself, I would interpret more as an artifact of trying to prevent license proliferation or at least encourage people to name the license something else, so avoid confusion (now we have scanners that can and SPDX identifiers to help too).


On Jun 18, 2020, at 3:52 PM, Till Jaeger via <> wrote:

Hi all,

I have some remarks from a lawyer's perspective who is scanning source code
and/or has to deal with the results from scanning.

It is helpful if the license text file is differently identified from
licensed source files. There are some reasons for that:
- This license text is not licensed under itself.
- The information can be misleading. The LGPL-2.1 would be LGPL-2.1-only
although all source files might be LGPL-2.1-or-later
- It is good to know whether or not the license text is included in a source
package (and not just referenced). Accordingly, you know if adding the
license text is needed.

Identifiers like "LicenseRef-GPL-3.0-license-text" would be great since you
can see on first view what is in the license file.

I have no interest to know how the license text is licensed itself. All
known FOSS licenses allow copying and distribution. More is not needed.

I have an interest to know whether or not the license text is identical to
the original one (or modified/shortened).

Not sure if this is helpful for you but I hope so.

Best regards,


Am 18.06.20 um 16:32 schrieb Philippe Ombredanne:
Hi Richard:

On Thu, Jun 18, 2020 at 2:57 PM Richard Purdie wrote:

Just to be really clear, the license ID of a given specific
package *is* correct and definitive. What is unclear is the license of
the license information.

The challenge is that one software project can be split into multiple
binary packages and those binary packages can have finer grained

For example, gcc which contains libgcc. gcc is GPL-3.0 and libgcc is
the under the runtime license exception. We specifically mark the
binary packages with the correct license.

This isn't enough for some legal departments and some licenses, we have
to have the full license text somewhere. We have options:

a) Include the full license text in every binary package
b) Have a licence package per test and require each binary package to
depend on that license package
c) As per b) but have the package management or tools figure out the
dependencies if requested
d) Have a license package per piece of software containing all the
licensing texts for that piece of software.

There are pros and cons for all of these, some of the issues are very
significant, particularly in a constrained embedded system. Rightly or
wrongly, we have d) implemented today and this is consistent with what
other distros like Debian do (although they merge docs and license
info, we split them).

Also, this assumes the licenses can be split into specific individual
chunks. I suspect in some cases this is not possible.

The question is what license is that package in d) under.

Then in this case you can take the same approach as Debian's
packaging: your package in d) can be under its own license unrelated
to the license of the things it contains.

You could state that the license of the packaging of these license
data is under a CC0-1.0. You are not making any assertion about the
license of the licenses which are under whatever license they may be;
and whatever these may be are self-contained in their own license

This is the approach I take in scancode.
I bundle thousand license texts and I am not reporting any specific
license for these license texts..
Instead I am only declaring that the license data set is under CC0-1.0

As an aside, this might make scancode's [1] processing a little more
complicated ... but this could be fixed if we know we are looking at
the license of Yocto packages somehow.

Join to automatically receive all group messages.