SPDX should take a stronger stance against vanity/promotional licenses


Richard Fontana
 

As I've been following the issue queue for
github.com/spdx/license-list-XML/issues over the past several months,
it seems to me that you get a significant number of license
submissions like this latest one:
https://github.com/spdx/license-list-XML/issues/1790

The pattern is this: someone has drafted their own license, and it
either isn't being used at all in the real world or is being used only
for a few insignificant projects of the license author. In some cases the
license seems to be connected to some contemplated commercial activity
of the license submitter. Presumably SPDX license list inclusion is
seen as a way of legitimizing or popularizing the novel license. I am
quite familiar with this sort of phenomenon from my past involvement
with the OSI, where the nature of the OSI process as it was
historically defined seemed to unintentionally result in many license
submissions of this sort.

When I look at the SPDX license inclusion guidelines, I am concerned
that this sort of behavior is not sufficiently discouraged. The
guidelines say "The license has actual, substantial use such that it
is likely to be encountered. Substantial use may be demonstrated via
use in many projects, or in one or a few significant projects. For new
licenses, there are definitive plans for the license to be used in one
or a few significant projects."
But this is not one of the "definitive" factors and it is the third of
a list of non-definitive factors that are given "roughly in order of
importance". Someone might understandably conclude that "substantial
use" isn't too important to SPDX.

My main criticism of the SPDX license list from years ago was that it
was not representative of the makeup of the FOSS project world that I
was seeing in Linux distribution packages and other software I
encountered in my work. I have been engaged in trying to get the SPDX
license list to more accurately reflect the state of widely-used FOSS
today and it is frustrating to see repeated examples of vanity license
submissions. I suggest that the license inclusion principles should be
revised to elevate and perhaps strengthen the "substantial use"
requirement and the maintainers of license-list-XML should more
actively make clear that such licenses are generally inappropriate for
the SPDX license list.

Richard


Kyle Mitchell
 

If distros are seeing packaged-but-not-identified licenses
in numbers to the point of pain, I'd suggest addressing that
pain directly. Perhaps by laying a wider pipe from distros'
workflows to SPDX's.

From personal experience, the biggest blocks might actually
be the XML schema and just reading through all the process
doc. If SPDX had a special track for identification based
on calls from popular distros, and the distros could submit
plain text terms and have them formatted for inclusion by
someone else, would that flush the backlog?

I've pushed for new licenses. There's some contention for
attention. But I've never been _against_ identification of
any older terms, and they certainly have been identified, even
as some of mine weren't. The SPDX list is five hundred-odd
licenses. If someone cares enough about one more to present
in format for inclusion, without name contention, let it be
five hundred and one. Especially if that means less human
drudgery somewhere else.

As for motivations, I've never seen SPDX identification as
approval. I don't expect it's ever made a license popular.
And I've yet to meet any dev who does. The proportion of
coders who even know the list exists is tiny. Those who do,
and who've spoken to me, just see a HashMap, akin to a
protocol registry or MIME type list. Not a license club.
It's clear where those are.

The motivation I've seen and felt comes from where and how
the list has been used. Not necessarily as originally
intended. I've brought licenses here because package
manager metadata warnings are annoying. I take it the
distro people might be irked in similar ways. Both probably
seem insubstantial, looking over from the other side. But a
few kB of XML file in a GitHub repo is a pretty cheap cure.

--
Kyle Mitchell, attorney // Oakland // (510) 712 - 0933


Ria Schalnat (HPE)
 

+1 to Richard!


J Lovejoy
 

Thanks for this write-up, Richard.

Having spent an exorbitant amount of my time over the years of my involvement in SPDX trying to politely say "no" to licenses for the reasons you describe below, I cannot begin to express how much I would welcome a way to make that easier and quicker.

(That is not to say that we should not be polite! I take a lot of joy in the congeniality of the SPDX-legal community - it's a big part of what keeps me around :)

This reminds me that I think I had submitted a PR when we were working on our "documentation release" to swap factors #2 and #3, as it seemed like the substantial use factor should be higher up the list. I think we may have even discussed this on a call. But changing the inclusion guidelines (even ordering) is a big deal and Steve reminded me that is more apt for a formal Change Proposal or its own discussion.

https://github.com/spdx/license-list-XML/blob/main/DOCS/license-inclusion-principles.md
Looking again now at how the factors are organized - we could probably do a bit better on the "ordering" and grouping than simply swapping 2 and 3. Some of the "definitive" factors aren't really factors. For example, A and D are more of threshold questions; and B is more of a policy that we always have had, but never wrote down anywhere. E is important, but not sure it's definitive (it's also a bit of a warning). Anyway, if someone wants to put some more "definitive" suggestions on paper (the Change Proposal format would be useful here, I think) that would be great. (I would, but I'm up to my ears in other things, so I won't get to it for a bit.)

Thanks,
Jilayne


J Lovejoy
 

Hi Kyle,

You raise some specific points that highlight some things we have worked on recently, so responding here inline.

Jilayne

On 1/24/23 4:13 PM, Kyle Mitchell wrote:
> If distros are seeing packaged-but-not-identified licenses
> in numbers to the point of pain, I'd suggest addressing that
> pain directly. Perhaps by laying a wider pipe from distros'
> workflows to SPDX's.

Richard and I have been working on that given Fedora's recent adoption of SPDX IDs for Fedora's license metadata. The "pipe" is not exactly smooth or efficient at this point, but sometimes you need to open the flow and then sort out the plumbing :)

> From personal experience, the biggest blocks might actually
> be the XML schema and just reading through all the process
> doc. If SPDX had a special track for identification based
> on calls from popular distros, and the distros could submit
> plain text terms and have them formatted for inclusion by
> someone else, would that flush the backlog?

We just adopted something along these lines: to make the review step easier, approval now takes 2 (instead of 3) SPDX-legal folks. See the update to the Review Process, (1)(ii): https://github.com/spdx/license-list-XML/blob/main/DOCS/request-new-license.md
We still simply need more people comfortable with reviewing and commenting, though.

To help with identifying licenses as per above, we have a new label to make it easier to spot these submissions, but have yet to go through all submissions and apply it.
https://github.com/spdx/license-list-XML/issues?q=is%3Aopen+is%3Aissue+label%3A%22used+in+major+distro%22

We are also working on better documentation all around. It's coming along and has actually improved a lot recently, but there is always more to do and not enough hands and time!


> As for motivations, I've never seen SPDX identification as
> approval. I don't expect it's ever made a license popular.
> And I've yet to meet any dev who does.

I'm glad to hear this, as I don't see, and never have seen, those kinds of values as the role of the SPDX License List. It should be rather dry.

That being said, there are people, even if it's a small percentage, who seem to attach some (mis)placed value to inclusion. I think this at times drives their submissions. And, going on my subjective memory and experience, it certainly feels like those submissions can end up soaking up more time: someone from SPDX-legal has to explain that the license is not accepted and why, which often includes explaining some basic facts about SPDX and the SPDX License List, and in doing so has to manage the submitter's reaction. This takes valuable time and energy. We have tried, in various places that are a "point of entry", to remind people to familiarize themselves with the SPDX landscape before submitting a license, but you know what they say about leading a horse to water...

I'm all ears for any better way to deal with these kinds of submissions (back to Richard's email...)


> The motivation I've seen and felt comes from where and how
> the list has been used. Not necessarily as originally
> intended. I've brought licenses here because package
> manager metadata warnings are annoying. I take it the
> distro people might be irked in similar ways. Both probably
> seem insubstantial, looking over from the other side. But a
> few kB of XML file in a GitHub repo is a pretty cheap cure.

If we can just get people to help create the XML files... then yes :)


Warner Losh
 



On Tue, Jan 24, 2023 at 10:56 PM J Lovejoy <opensource@...> wrote:
Thanks for this write-up, Richard.

Having spent an exorbitant amount of my time over the years of my involvement in SPDX trying to politely say "no" to licenses for the reasons you describe below, I cannot begin to express how much I would welcome a way to make that easier and quicker.

(That is not to say that we should not be polite! I take a lot of joy in the congeniality of the SPDX-legal community - it's a big part of what keeps me around :)

This reminds me that I think I had submitted a PR when we were working on our "documentation release" to swap factors #2 and #3, as it seemed like the substantial use factor should be higher up the list. I think we may have even discussed this on a call. But changing the inclusion guidelines (even ordering) is a big deal and Steve reminded me that is more apt for a formal Change Proposal or its own discussion.

https://github.com/spdx/license-list-XML/blob/main/DOCS/license-inclusion-principles.md
Looking again now at how the factors are organized - we could probably do a bit better on the "ordering" and grouping than simply swapping 2 and 3. Some of the "definitive" factors aren't really factors. For example, A and D are more of threshold questions; and B is more of a policy that we always have had, but never wrote down anywhere. E is important, but not sure it's definitive (it's also a bit of a warning). Anyway, if someone wants to put some more "definitive" suggestions on paper (the Change Proposal format would be useful here, I think) that would be great. (I would, but I'm up to my ears in other things, so I won't get to it for a bit.)

D had always bothered me a little, but mostly in the context of historically preserved licenses. The BSD, CMU and MIT license families have undergone a fair amount of copying with errors and mutation. Some of the errors and mutations are harmless, while others deserve their own license (and some are downright weird and/or require understanding the context of the changes rather than just the plain language of the change).

I've also struggled with the right way to codify these things. Richard has been fighting the good fight with the whack-a-mole-esque task of finding all the variants that survived in packages long enough to worm their way into Debian. I believe that FreeBSD has dozens of such variations that I've not even begun to sort through. I'd like to hope that if I ever did, and a good way to bucket the ones with significant differences were found, they could be included, even though they are in some ways similar to vanity licenses, though without the full-blown fanfare some of the others have. They are historically persisting licenses from a bygone era when license standardization hadn't happened... I see that the 'other factors' do include widespread use, as factor 3. Given the scope of the problem, I'm not at all sure how best to solve it.

I hope that any tightening of the stable texts and other guidelines designed to sweep away many vanity licenses won't sweep these historical artifacts up as well. I broadly support limiting vanity licenses because they cause nothing but grief, on the average, and rarely wind up with something good and useful that pushes the state of the art. They just add to the churn and chores of license compliance w/o offering the authors using them any better protection than other licenses, nor eased compliance burdens since they are off the 'paved path' of old standards like BSD, GPL, MIT, Apache, etc.

Warner
 

Brian Fox
 

Maybe start assigning IDs for these with a format like vanity-xxx. That might make people think twice and actually put some work into explaining why they need yet-another-license that does something different from the standards, so that they can avoid the vanity label, which in turn undermines the desire to create such licenses.


James Bottomley
 

On Tue, 2023-01-24 at 21:56 -0800, J Lovejoy wrote:
> Thanks for this write-up, Richard.
>
> Having spent an exorbitant amount of my time over the years of my
> involvement in SPDX trying to politely say "no" to licenses for the
> reasons you describe below, I cannot begin to express how much I
> would welcome a way to make that easier and quicker.
>
> (That is not to say that we should not be polite! I take a lot of joy
> in the congeniality of the SPDX-legal community - it's a big part of
> what keeps me around :)

Could I make a suggestion rooted in some engineering history here? In
the early days we tried to make global lists of relevant features (IANA
port numbers, reference constants, etc) and allowed anyone to write
standards. What we found is that everyone wanted to include their
vanity projects and the various bodies we set up ended up having to try
to pick winners (which is always a losing proposition).

We eventually got ourselves out of this by saying that to be
standardised, something needed existing implementations. We mostly
arranged the constants to be UUIDs so anyone can simply generate a
unique one and it only gets recorded if it proves generally useful.
You could do the same for SPDX: give a way for a project to pick a
unique tag and use it, supplying all the information the SPDX analysers
require in the LICENCES/ directory (UUIDs are probably overkill, but
you could require that they be a certain length). Then the SPDX
directory could simply become the list of abbreviations and information
for commonly found licences, so if a non-listed licence keeps turning
up, it would be up to an SPDX consumer, not the licence author, to say
"I've come across this licence 50 times in the last year in a variety
of projects, should we add it to the common list?"
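
As a rough illustration of the consumer-driven approach described above, the sketch below tallies how many distinct projects use each unlisted tag and reports the ones that reach a threshold. The function name, the shape of the input data, and the choice to count distinct projects (rather than raw occurrences) are all assumptions made for the example; the threshold of 50 simply echoes the figure in the text.

```python
def candidates_for_listing(sightings, listed_ids, threshold=50):
    """Return unlisted tags seen in at least `threshold` distinct projects.

    `sightings` is an iterable of (project_name, license_tag) pairs collected
    by a hypothetical consumer; `listed_ids` is the set of tags already on
    the common list.
    """
    projects_per_tag = {}
    for project, tag in sightings:
        if tag in listed_ids:
            continue  # already on the common list, nothing to propose
        projects_per_tag.setdefault(tag, set()).add(project)
    return sorted(
        tag for tag, projects in projects_per_tag.items()
        if len(projects) >= threshold
    )
```

A tag used 100 times inside a single project would not qualify under this sketch, which is one way to read "in a variety of projects".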

James


McCoy Smith
 

-----Original Message-----
From: Spdx-legal@... <Spdx-legal@...> On Behalf Of
James Bottomley
Sent: Wednesday, January 25, 2023 9:50 AM

> You could do the same for SPDX: give a way for a project to pick a unique tag
> and use it, supplying all the information the SPDX analysers require in the
> LICENCES/ directory (UUIDs are probably overkill, but you could require that
> they be a certain length). Then the SPDX directory could simply become the
> list of abbreviations and information for commonly found licences, so if a
> non-listed licence keeps turning up, it would be up to an SPDX consumer, not
> the licence author, to say "I've come across this licence 50 times in the last
> year in a variety of projects, should we add it to the common list?"

I think this is the way to go. Ultimately, SPDX is for the consumer, not the producer. If a producer wants onto the list, they should find some consumers to support it.


Kyle Mitchell
 

If the idea is really to hunt down every license lurking in
every potentially popular public package, I can see how
distro adoption's a real big deal. Congrats! I worry about
more work for distro people, but suppose those chasing
completeness goals like this likely have financial support.

On the process front, three ideas:

First, separate processes for "I've got a license and
champion its identification" from "I spotted a license and
think SPDX may not have it already". Create a separate
intake track for the latter, I imagine often distro people.
This would unburden those submitting just to replace
exceptions with IDs someday. They may otherwise have nothing
to say about terms, beyond what the words are and where they
found them. Put their "sightings" in a separate queue and
let people who care take them up for full submission. Those
can be people more invested in process and criteria.

Second, seriously consider requiring only text for
submissions up front, with XML coding if and when the
license moves forward. Grokking the schema and overcoming
validation errors takes time, even for the XML-astute. I see
the benefits for the tech team in the end. I also see
temptation to use the burden as a general brake on
submissions, or as a backhand "do you really care?" test.
But I don't see XML mattering to the identification
question. It becomes worthwhile only once a license gets
voted in. At that point, well versed SPDX people may be more
inclined to do in five what can take new people an hour.

Third, create a new "provisional" license status and
identify licenses awaiting significance there. Essentially
let folks call dibs on IDs. Supplement with a guideline
to prefer prefixes like `Apache` to collision-prone
initialisms like `APSL`. Publish the list JSON with a
provisional flag, so implementers can then decide whether to
validate provisionals or not, like they choose for
deprecated. Give provisionals a holding period, say a couple
years, then either promote or deprecate. Think Lanham Act
supplemental register for lawyers, merge-behind-feature-flag
for coders.
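
A minimal sketch of how an implementer might honor such a flag, assuming the published JSON gained a boolean `isProvisional` field per license. That field is hypothetical, as is the `Example-1.0` entry. `licenseId` and `isDeprecatedLicenseId` mirror fields in the list JSON SPDX publishes today.

```python
import json

# Hypothetical excerpt of a published list. "isProvisional" is an assumed
# field that does not exist in the real SPDX license list JSON.
LIST_JSON = """
{"licenses": [
  {"licenseId": "MIT", "isDeprecatedLicenseId": false, "isProvisional": false},
  {"licenseId": "GPL-2.0", "isDeprecatedLicenseId": true, "isProvisional": false},
  {"licenseId": "Example-1.0", "isDeprecatedLicenseId": false, "isProvisional": true}
]}
"""


def accepted_ids(list_json, allow_deprecated=True, allow_provisional=False):
    """Return the set of IDs a validator would accept under the given policy."""
    ids = set()
    for entry in json.loads(list_json)["licenses"]:
        if entry.get("isDeprecatedLicenseId", False) and not allow_deprecated:
            continue
        if entry.get("isProvisional", False) and not allow_provisional:
            continue
        ids.add(entry["licenseId"])
    return ids
```

With the defaults, a validator would accept deprecated IDs but reject provisional ones. Passing `allow_provisional=True` opts in, much as implementers already choose how to treat deprecated IDs.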

On a personal note, I hope I can be honest about my
motivation without coming over blunt. I'm not in
license-list-XML helping clear backlog, even though I
maintain several projects using IDs, because I'm not
interested in a process that I _do_ see as passing
judgments, "approving" more than merely identifying. The
very thrust of this e-mail chain is more effectively shooing
away drafters deemed vain and projects deemed insubstantial.
Those are value judgments.

Value judgments make assessments eat more time. They open
them to controversy. They ask more of reviewers, which
contributes to backlog. I wouldn't expect reordering factors
in the factor test to change that.

If SPDX doesn't want to identify new licenses it doesn't
like, or wants to use its adoption as leverage to discourage
new forms, it should come out and say that. Those of us
building with broader needs can fork or superset.

--
Kyle Mitchell, attorney // Oakland // (510) 712 - 0933


Steve Winslow
 

Thanks all for your comments in this thread. I'm not going to try to reply here to every comment, but wanted to note a few pieces that might be informative to folks who are less deep in the SPDX license ID weeds.

Custom license IDs:

Anyone who wants to use an SPDX-format-compatible license ID for a license that isn't on the license list is able to do so, via the LicenseRef- syntax. [1] The characters for the ID are the same as those permitted for IDs on the SPDX License List: letters, digits, hyphen ("-") and period ("."). [2]
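
As a rough illustration of that character rule, here is a minimal Python sketch of a check for well-formed custom IDs. The function name is made up for this example, and the sketch ignores the optional `DocumentRef-` prefix that the SPDX spec allows in front of a LicenseRef.

```python
import re

# "LicenseRef-" followed by one or more letters, digits, hyphens, or periods.
_LICENSE_REF = re.compile(r"LicenseRef-[A-Za-z0-9.-]+")


def is_license_ref(identifier: str) -> bool:
    """Return True if identifier looks like a well-formed LicenseRef- ID."""
    return _LICENSE_REF.fullmatch(identifier) is not None
```

Under this check, `LicenseRef-My-License-1.0` passes, while `LicenseRef-my_license` fails because of the underscore.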

Making reusable custom license IDs:

If someone wanted to create a standalone, reusable LicenseRef- ID that implemented a UUID, or a hash of a license text, I believe they could do so just by prepending "LicenseRef-" to the UUID or hash. I suspect there are some automated tools out there that work in this manner. (Of course, it's not going to be a particularly meaningful ID on its own, but just noting it since UUIDs were mentioned in the thread.)
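
A minimal sketch of both variants, using only the Python standard library. The function names, the choice of SHA-256, and the whitespace normalization are assumptions of this example, not anything SPDX prescribes.

```python
import hashlib
import uuid


def license_ref_from_text(license_text: str) -> str:
    """Derive a LicenseRef- ID from a hash of the license text."""
    # Collapsing whitespace first is an assumption of this sketch, so that
    # re-wrapped copies of the same text map to the same ID.
    normalized = " ".join(license_text.split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"LicenseRef-{digest}"


def license_ref_from_uuid() -> str:
    """Derive a LicenseRef- ID from a random UUID."""
    # The string form of a UUID uses only hex digits and hyphens, all of
    # which are permitted characters.
    return f"LicenseRef-{uuid.uuid4()}"
```

The hash variant is reproducible by anyone holding the same text, while the UUID variant only means something to people who are told which text it refers to.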

The challenge with using LicenseRef-, of course, is in letting people know which license text corresponds to your custom license ID. There are various ways to do this without ever talking to anyone at SPDX, including by creating your own SPDX document that defines it in an "Other License Information" section, or by following practices such as REUSE. [3]
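
For the first option, a rough sketch of what such an entry could look like in SPDX 2.x tag-value form, rendered by a small Python helper. The helper name and the `LicenseRef-Example-1.0` ID are hypothetical. The `LicenseID`, `ExtractedText`, and `LicenseName` tags are the ones I understand the 2.x spec to use for this section, so check against the spec before relying on them.

```python
def other_license_info(license_id: str, name: str, text: str) -> str:
    """Render one custom license entry in SPDX 2.x tag-value form."""
    if not license_id.startswith("LicenseRef-"):
        raise ValueError("custom IDs must start with 'LicenseRef-'")
    return "\n".join([
        f"LicenseID: {license_id}",
        f"ExtractedText: <text>{text}</text>",
        f"LicenseName: {name}",
    ])


# Hypothetical ID, name, and text, for illustration only.
print(other_license_info(
    "LicenseRef-Example-1.0",
    "Example License 1.0",
    "Permission is granted ...",
))
```

Under REUSE, as I understand it, the equivalent step is placing the license text in a file named after the ID, for example `LICENSES/LicenseRef-Example-1.0.txt`.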

For an approach that could enable anyone to create more meaningful custom IDs and share the corresponding license text, we've had discussions several times over the past 4+ years about creating a formalized "license namespace" format, built on top of the existing LicenseRef- syntax. This has repeatedly failed to reach consensus, in my view primarily due to disagreements about the nuances of what the syntax should look like, and I don't think there's any appetite to reopen that discussion yet again.

As a result, community members are welcome to establish informal practices for how they format LicenseRef- IDs within the permitted syntax and how they share the corresponding license text, such as via REUSE.

Standards for what goes on the SPDX License List:

I agree with Richard that the documentation should be clearer about "vanity" licenses generally being inappropriate for inclusion on the SPDX License List.

I think there is value in the license list not being just a hash of license IDs to arbitrary text. The work that the SPDX Legal community does to review and curate licenses, insert markup to group them together where appropriate, and omit licenses that are not likely to be encountered in FOSS(-ish) development, seems to be of value to downstream users of the list. If it isn't, and if downstream users do in fact want a list that is just a hash of unique IDs to arbitrary text, then anyone is of course free to implement such a list and to persuade the broader ecosystem to adopt it.

For newly-drafted licenses that are used in only one or a couple of projects (or sometimes zero projects), I agree with Richard that we often burn lots of cycles going back and forth with the license author without real benefit. I'd be in favor of bumping the "substantial use" factor higher on the License Inclusion Principles list [4], and perhaps being more explicit in related documentation that vanity licenses with little usage, particularly non-FOSS licenses that fall in that category, are highly unlikely to be added to the list. For a change to the inclusion principles, as Jilayne mentioned earlier, I do think that a Change Proposal [5] is probably the right place to discuss the specifics of what that would look like.

Submitters of newly-drafted licenses with little-to-no usage do sometimes mention that they need their license to be added to the SPDX License List so that their software with their new license can be included in a package manager. For package managers that use license list IDs as a requirement, I'd encourage them to consider implementing and permitting LicenseRef- IDs as well. (Or, if they don't want to permit LicenseRef- IDs, then that suggests to me that they are in fact finding some value in the curation that we perform for the License List.)

And of course, to James's point: if a brand new license does see significant usage in the wild such that it is likely to be encountered in a broad set of community-developed software projects, then at that point it may be appropriate to add to the list. But I don't see value in having the SPDX License List be the first stop for a newly-drafted, non-FOSS license that is used in someone's personal project, or in having us burn cycles repeatedly explaining that.

Steve


On Wed, Jan 25, 2023 at 1:14 PM Kyle Mitchell <kyle@...> wrote:


Max Mehl
 

+1 to everything Steve just wrote, with one comment.

> For an approach that could enable anyone to create more meaningful custom IDs and share the corresponding license text, we've had discussions several times over the past 4+ years about creating a formalized "license namespace" format, built on top of the existing LicenseRef- syntax. This has repeatedly failed to reach consensus, in my view primarily due to disagreements about the nuances of what the syntax should look like, and I don't think there's any appetite to reopen that discussion yet again.
>
> As a result, community members are welcome to establish informal practices for how they format LicenseRef- IDs within the permitted syntax and how they share the corresponding license text, such as via REUSE.

License namespaces were the first thing that came to my mind when reading the thread. Thanks for confirming that the proposal was never really buried for good, but just faded out - I wasn't sure.

What would be worse? People inventing incompatible practices with LicenseRef IDs (and eventually identical IDs that mean different licenses), or finally settling on a syntax for license namespaces, even if it's only 80% perfect?

I can see one scenario in which the former is better: making the LicenseRef hacks appear so chaotic that people strive to use proper official SPDX IDs and therefore do not add to license proliferation. In any other case, I wonder whether it shouldn't be a priority of the whole SPDX project to reach a consensus via a well-managed process.

Best,
Max

--
Max Mehl
Open Source Strategy & Governance
Enterprise-Team Chief Technology Office (CTO), T.IP E-T-378

DB Systel GmbH
Jürgen-Ponto-Platz 1, 60329 Frankfurt/M


Kyle Mitchell
 

I was involved in implementing SPDX license IDs as package
license metadata for a few package managers. How to handle
licenses that don't have IDs came up every time.

`LicenseRef-*` would get mentioned, usually because I
brought it up. Maintainers preferred to implement something
less arbitrary looking that better fit their system and
style. npm defined a magic string to point to files in
tarballs. Rust defined a separate metadata key for license
file paths rather than license expressions. I see someone
has gone back and revised the GemSpec reference with
`LicenseRef-`s, but I've never seen one in the wild.

They need an escape hatch to "whatever's in the license
file" for existing, custom-licensed packages. Some of these
are one-off commercial terms whose authors want them noticed
and read rather than abbreviated. Meanwhile, if anyone can
put whatever `LicenseRef` in their package meta, there's
potential for collisions, which means auditors have to treat
`LicenseRef-*` as "look in the tarball" no matter what comes
after `LicenseRef-`.
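
A minimal sketch of that auditor-side rule. `KNOWN_IDS` is a small stand-in for the real license list, the function name and return labels are made up for this example, and `SEE LICENSE IN` is npm's documented magic string as I recall it. Compound license expressions are deliberately out of scope here.

```python
# Stand-in for the SPDX License List; a real tool would load the full list.
KNOWN_IDS = {"MIT", "Apache-2.0", "ISC"}


def audit_action(license_field: str) -> str:
    """Decide what an auditor has to do with a package's license field."""
    value = license_field.strip()
    if value in KNOWN_IDS:
        return "recognized"
    # Without a shared namespace, whatever follows "LicenseRef-" carries no
    # reliable meaning, so it is handled the same as a pointer to a file.
    if value.startswith("LicenseRef-") or value.startswith("SEE LICENSE IN "):
        return "read-license-file"
    return "unknown"
```

Here `LicenseRef-Acme-Commercial` and `SEE LICENSE IN LICENSE.txt` land in the same bucket, which is the point: the text after the prefix does not change what the auditor has to do.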

Back on the maintainer side, the last thing they want is to
spend time refereeing yet another global namespace, for
`LicenseRef`s on top of the one for package names. They're not
interested in the SPDX spec per se, just the license list,
which is grokked quickly by reading the first paragraph of
spdx.org/licenses. Some take the expression syntax,
especially if there's a preexisting parser in their
language. Others just use lists or arrays, like RubyGems.

None of the maintainers I worked with or have spoken to seem
to ascribe curation value to the license list. Some cheer
more for particular kinds of licenses SPDX has identified,
like the common permissive open licenses. But they don't use
their software to force or coerce toward them. In the end,
they're running systems that accept and distribute packages
under whatever kind of terms. That's part of being
competitive these days, especially early on in a language
hype cycle.

Whether there are five hundred licenses on the list or a
thousand doesn't matter, so long as they can automate
pulling down new list versions for their builds---which I
helped them do. They'd prefer if new versions of the list
don't trigger stampedes of bug reports about new warning
messages or validation errors. That's about it.

--
Kyle Mitchell, attorney // Oakland // (510) 712 - 0933