SPDX should take a stronger stance against vanity/promotional licenses
Richard Fontana
As I've been following the issue queue for
github.com/spdx/license-list-XML/issues over the past several months, it seems to me that you get a significant number of license submissions like this latest one: https://github.com/spdx/license-list-XML/issues/1790 The pattern is, someone has drafted their own license, it either isn't being used at all in the real world or it is being used for a few insignificant projects of the license author. In some cases the license seems to be connected to some contemplated commercial activity of the license submitter. Presumably SPDX license list inclusion is seen as a way of legitimizing or popularizing the novel license. I am quite familiar with this sort of phenomenon from my past involvement with the OSI, where the nature of the OSI process as it was historically defined seemed to unintentionally result in many license submissions of this sort. When I look at the SPDX license inclusion guidelines, I am concerned that this sort of behavior is not sufficiently discouraged. The guidelines say "The license has actual, substantial use such that it is likely to be encountered. Substantial use may be demonstrated via use in many projects, or in one or a few significant projects. For new licenses, there are definitive plans for the license to be used in one or a few significant projects." But this is not one of the "definitive" factors and it is the third of a list of non-definitive factors that are given "roughly in order of importance". Someone might understandably conclude that "substantial use" isn't too important to SPDX. My main criticism of the SPDX license list from years ago was that it was not representative of the makeup of the FOSS project world that I was seeing in Linux distribution packages and other software I encountered in my work. I have been engaged in trying to get the SPDX license list to more accurately reflect the state of widely-used FOSS today and it is frustrating to see repeated examples of vanity license submissions. 
I suggest that the license inclusion principles should be revised to elevate and perhaps strengthen the "substantial use" requirement and the maintainers of license-list-XML should more actively make clear that such licenses are generally inappropriate for the SPDX license list. Richard |
|
Kyle Mitchell
If distros are seeing packaged-but-not-identified licenses
in numbers to the point of pain, I'd suggest addressing that pain directly. Perhaps by laying a wider pipe from distros' workflows to SPDX's.

From personal experience, the biggest blocks might actually be the XML schema and just reading through all the process doc. If SPDX had a special track for identification based on calls from popular distros, and the distros could submit plain text terms and have them formatted for inclusion by someone else, would that flush the backlog?

I've pushed for new licenses. There's some contention for attention. But I've never been _against_ identification of any older terms, and they certainly have been, even as some of mine weren't. The SPDX list is five hundred-odd licenses. If someone cares enough about one more to present in format for inclusion, without name contention, let it be five hundred and one. Especially if that means less human drudgery somewhere else.

As for motivations, I've never seen SPDX identification as approval. I don't expect it's ever made a license popular. And I've yet to meet any dev who does. The proportion of coders who even know the list exists is tiny. Those who do, and who've spoken to me, just see a HashMap, akin to a protocol registry or MIME type list. Not a license club. It's clear where those are.

The motivation I've seen and felt comes from where and how the list has been used. Not necessarily as originally intended. I've brought licenses here because package manager metadata warnings are annoying. I take it the distro people might be irked in similar ways. Both probably seem insubstantial, looking over from the other side. But a few kB of XML file in a GitHub repo is pretty cheap cure.

-- Kyle Mitchell, attorney // Oakland // (510) 712 - 0933
|
+1 to Richard!
-----Original Message-----
From: Spdx-legal@... <Spdx-legal@...> On Behalf Of Richard Fontana
Sent: Tuesday, January 24, 2023 3:30 PM
To: SPDX-legal <spdx-legal@...>
Subject: SPDX should take a stronger stance against vanity/promotional licenses
|
J Lovejoy
Thanks for this write-up, Richard.
Having spent an exorbitant amount of my time over the years of my involvement in SPDX trying to politely say "no" to licenses for the reasons you describe below, I cannot begin to express how much I would welcome a way to make that easier and quicker. (That is not to say that we should not be polite! I take a lot of joy in the congeniality of the SPDX-legal community - it's a big part of what keeps me around :) This reminds me that I think I had submitted a PR when we were working on our "documentation release" to swap factors #2 and #3, as it seemed like the substantial use factor should be higher up the list. I think we may have even discussed this on a call. But changing the inclusion guidelines (even ordering) is a big deal and Steve reminded me that is more apt for a formal Change Proposal or its own discussion. https://github.com/spdx/license-list-XML/blob/main/DOCS/license-inclusion-principles.md Looking again now at how the factors are organized - we could probably do a bit better on the "ordering" and grouping than simply swapping 2 and 3. Some of the "definitive" factors aren't really factors. For example, A and D are more of threshold questions; and B is more of a policy that we always have had, but never wrote down anywhere. E is important, but not sure it's definitive (it's also a bit of a warning). Anyway, if someone wants to put some more "definitive" suggestions on paper (the Change Proposal format would be useful here, I think) that would be great. (I would, but I'm up to my ears in other things, so I won't get to it for a bit.) Thanks, Jilayne On 1/24/23 5:07 PM, Ria Schalnat (HPE)
wrote:
+1 to Richard!
|
J Lovejoy
Hi Kyle,
You raise some specific points that highlight some things we have worked on recently, so responding here inline.

Jilayne

On 1/24/23 4:13 PM, Kyle Mitchell wrote:

> If distros are seeing packaged-but-not-identified licenses in numbers to the point of pain, I'd suggest addressing that pain directly. Perhaps by laying a wider pipe from distros' workflows to SPDX's.

Richard and I have been working on that given Fedora's recent adoption of SPDX id's for Fedora's license metadata. The "pipe" is not exactly smooth or efficient at this point, but sometimes you need to open the flow and then sort out the plumbing :)

> From personal experience, the biggest blocks might actually be the XML schema and just reading through all the process doc. If SPDX had a special track for identification based on calls from popular distros, and the distros could submit plain text terms and have them formatted for inclusion by someone else, would that flush the backlog?

We just adopted something along these lines in terms of trying to make it easier for the review process step by needing 2 (instead of 3) SPDX-legal folks to approve. See the update to the Review Process, (1)(ii) https://github.com/spdx/license-list-XML/blob/main/DOCS/request-new-license.md

We still simply need more people comfortable with reviewing and commenting, though. To help with identifying licenses as per above, we have a new label to make it easier to spot these submissions, but have yet to go through all submissions and apply it. https://github.com/spdx/license-list-XML/issues?q=is%3Aopen+is%3Aissue+label%3A%22used+in+major+distro%22

We are also trying to work on better documentation all around. It's coming along and has actually improved a lot recently, but there is always more to do and not enough hands and time!

> As for motivations, I've never seen SPDX identification as approval. I don't expect it's ever made a license popular. And I've yet to meet any dev who does.

I'm glad to hear this as I don't and never have seen those kinds of values as the role of SPDX License List. It should be rather dry.

That being said, there are people, even if it's a small percentage, who seem to attach some (mis)placed value. I think at times, this drives their submissions. And, going on my subjective memory and experience, it certainly feels like those submissions can end up soaking up more time, as someone from SPDX-legal has to explain that their license is not accepted and why, which can often include explaining some basic facts about SPDX and the SPDX License List, and in doing so, manage the submitter's reaction. This takes valuable time and energy. We have tried in various places that are a "point of entry" to remind people to familiarize themselves with the SPDX landscape before submitting a license, but you know what they say about leading a horse to water... I'm all ears for any better way to deal with these kinds of submissions (back to Richard's email...)

> The motivation I've seen and felt comes from where and how the list has been used. Not necessarily as originally intended. I've brought licenses here because package manager metadata warnings are annoying. I take it the distro people might be irked in similar ways. Both probably seem insubstantial, looking over from the other side. But a few kB of XML file in a GitHub repo is pretty cheap cure.

if we can just get people to help creating the XML files... then yes :)
|
Warner Losh
On Tue, Jan 24, 2023 at 10:56 PM J Lovejoy <opensource@...> wrote:
D had always bothered me a little, but mostly in the context of historically preserved licenses. The BSD, CMU and MIT license families have undergone a fair amount of copying with errors and mutation. Some of the errors and mutations are harmless, while others deserve their own license (and some are downright weird and/or require understanding the context of the changes rather than just the plain language of the change). I've also struggled with the right way to codify these things. Richard has been fighting the good fight with the whack-a-mole-esque task of finding all the variants that survived in packages long enough to worm their way into Debian. I believe that FreeBSD has dozens of such variations that I've not even begun to sort through. I'd like to hope that if I ever did, and a good way to bucket the ones with significant differences were found, that they could be included, even though they are in some ways similar to vanity licenses, though without the full-blown fanfare some of the others have. They are historically persisting licenses from a by-gone era when license standardization hadn't happened... I see that in 'other factors' the wide-spread use of factor 3. Given the scope of the problem, I'm not at all sure how best to solve it. I hope that any tightening of the stable texts and other guidelines designed to sweep away many vanity licenses won't sweep these historical artifacts up as well. I broadly support limiting vanity licenses because they cause nothing but grief, on the average, and rarely wind up with something good and useful that pushes the state of the art. They just add to the churn and chores of license compliance w/o offering the authors using them any better protection than other licenses, nor eased compliance burdens since they are off the 'paved path' of old standards like BSD, GPL, MIT, Apache, etc. Warner
|
|
Brian Fox
Maybe start assigning ids for these with a format like vanity-xxx and that might make people think twice about it and actually put some work into really explaining why they need yet-another-license that does something different from the standards so they can avoid the vanity label, which undermines the desire to create such licenses. On Wed, Jan 25, 2023 at 11:21 AM Warner Losh <imp@...> wrote:
|
|
James Bottomley
On Tue, 2023-01-24 at 21:56 -0800, J Lovejoy wrote:
> Thanks for this write-up, Richard.

Could I make a suggestion rooted in some engineering history here. In the early days we tried to make global lists of relevant features (IANA port numbers, reference constants, etc) and allowed anyone to write standards. What we found is that everyone wanted to include their vanity projects and the various bodies we set up ended up having to try to pick winners (which is always a losing proposition). We eventually got ourselves out of this by saying that to be standardised, something needed existing implementations. We mostly arranged the constants to be UUIDs so anyone can simply generate a unique one and it only gets recorded if it proves generally useful.

You could do the same for SPDX: give a way for a project to pick a unique tag and use it, supplying all the information the SPDX analysers require in the LICENCES/ directory (UUIDs are probably overkill, but you could require that they be a certain length). Then the SPDX directory could simply become the list of abbreviations and information for commonly found licences, so if a non-listed licence keeps turning up, it would be up to an SPDX consumer, not the licence author, to say "I've come across this licence 50 times in the last year in a variety of projects, should we add it to the common list?"

James
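As a rough illustration of the consumer-driven route James describes, the sketch below tallies sightings of unlisted licence tags and surfaces the ones seen often enough, across enough distinct projects, to be worth proposing. The function name, the input shape and the `min_projects` threshold are assumptions made for this example; only the figure of 50 sightings comes from the message above, and nothing here is an existing SPDX tool.

```python
from collections import defaultdict


def promotion_candidates(sightings, listed_ids, min_sightings=50, min_projects=2):
    """Return unlisted licence tags that consumers keep running into.

    sightings: iterable of (project_name, licence_tag) pairs reported by consumers.
    listed_ids: set of tags already on the common list.
    Both thresholds are illustrative, not SPDX policy.
    """
    counts = defaultdict(int)     # tag -> number of sightings
    projects = defaultdict(set)   # tag -> distinct projects it was seen in
    for project, tag in sightings:
        if tag in listed_ids:
            continue  # already on the list, nothing to propose
        counts[tag] += 1
        projects[tag].add(project)
    return sorted(
        tag
        for tag in counts
        if counts[tag] >= min_sightings and len(projects[tag]) >= min_projects
    )
```

A tag used three times in one author's own project would not surface here, while one seen fifty times across many projects would, which is the distinction the thread is trying to draw.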
|
McCoy Smith
I think this is the way to go. Ultimately, SPDX is for the consumer, not the producer. If a producer wants onto the list, find some consumers to support it.
|
Kyle Mitchell
If the idea is really to hunt down every license lurking in
every potentially popular public package, I can see how distro adoption's a real big deal. Congrats! I worry about more work for distro people, but suppose those chasing completeness goals like this likely have financial support.

On the process front, three ideas:

First, separate processes for "I've got a license and champion its identification" from "I spotted a license and think SPDX may not have it already". Create a separate intake track for the latter, I imagine often distro people. This would unburden those submitting just to replace exceptions with IDs someday. They may otherwise have nothing to say about terms, beyond what the words are and where they found them. Put their "sightings" in a separate queue and let people who care take them up for full submission. Those can be people more invested in process and criteria.

Second, seriously consider requiring only text for submissions up front, with XML coding if and when the license moves forward. Grokking the schema and overcoming validation errors takes time, even for the XML-astute. I see the benefits for the tech team in the end. I also see temptation to use the burden as a general brake on submissions, or as a backhand "do you really care?" test. But I don't see XML mattering to the identification question. It becomes worthwhile only once a license gets voted in. At that point, well versed SPDX people may be more inclined to do in five what can take new people an hour.

Third, create a new "provisional" license status and identify licenses awaiting significance there. Essentially let folks call dibs on IDs. Supplement with a guideline to prefer prefixes like `Apache` to collision-prone initialisms like `APSL`. Publish the list JSON with a provisional flag, so implementers can then decide whether to validate provisionals or not, like they choose for deprecated. Give provisionals a holding period, say a couple years, then either promote or deprecate. Think Lanham Act supplemental register for lawyers, merge-behind-feature-flag for coders.

On a personal note, I hope I can be honest about my motivation without coming over blunt. I'm not in license-list-XML helping clear backlog, even though I maintain several projects using IDs, because I'm not interested in a process that I _do_ see as passing judgments, "approving" more than merely identifying. The very thrust of this e-mail chain is more effectively shooing away drafters deemed vain and projects deemed insubstantial. Those are value judgments. Value judgments make assessments eat more time. They open them to controversy. They ask more of reviewers, which contributes to backlog. I wouldn't expect reordering factors in the factor test to change that.

If SPDX doesn't want to identify new licenses it doesn't like, or wants to use its adoption as leverage to discourage new forms, it should come out and say that. Those of us building with broader needs can fork or superset.

-- Kyle Mitchell, attorney // Oakland // (510) 712 - 0933
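To make the third idea concrete, here is a minimal sketch of how an implementer might choose whether to accept provisional IDs, in the same way they already choose for deprecated ones. The `licenses`, `licenseId` and `isDeprecatedLicenseId` keys mirror the published list JSON; the `isProvisional` flag and the `Example-Provisional-1.0` entry are hypothetical, invented here only to illustrate the proposal.

```python
import json

# Hypothetical excerpt of a published list. "isProvisional" does not exist in the
# real list JSON; it stands in for the flag proposed in the message above.
LIST_JSON = """
{
  "licenses": [
    {"licenseId": "MIT", "isDeprecatedLicenseId": false},
    {"licenseId": "GPL-2.0", "isDeprecatedLicenseId": true},
    {"licenseId": "Example-Provisional-1.0", "isDeprecatedLicenseId": false,
     "isProvisional": true}
  ]
}
"""


def accepted_ids(list_json, allow_deprecated=True, allow_provisional=False):
    """Return the set of IDs a validator would accept under the given policy."""
    accepted = set()
    for entry in json.loads(list_json)["licenses"]:
        if entry.get("isDeprecatedLicenseId", False) and not allow_deprecated:
            continue
        # Entries without the hypothetical flag are treated as fully listed.
        if entry.get("isProvisional", False) and not allow_provisional:
            continue
        accepted.add(entry["licenseId"])
    return accepted
```

With the defaults, a validator built this way would keep accepting deprecated IDs and ignore provisional ones until it opts in, so publishing the flag would not change behaviour for existing consumers.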
|
Steve Winslow
Thanks all for your comments in this thread. I'm not going to try to reply here to every comment, but wanted to note a few pieces that might be informative to folks who are less deep in the SPDX license ID weeds. Custom license IDs: Anyone who wants to use an SPDX-format-compatible license ID for a license that isn't on the license list is able to do so, via the LicenseRef- syntax. [1] The characters for the ID are the same as those permitted for IDs on the SPDX License List: letters, digits, hyphen ("-") and period ("."). [2] Making reusable custom license IDs: If someone wanted to create a standalone, reusable LicenseRef- ID that implemented a UUID, or a hash of a license text, I believe they could do so just by prepending "LicenseRef-" to the UUID or hash. I suspect there are some automated tools out there that work in this manner. (Of course, it's not going to be a particularly meaningful ID on its own, but just noting it since UUIDs were mentioned in the thread.) The challenge with using LicenseRef-, of course, is in letting people know which license text corresponds to your custom license ID. There are various ways to do this without ever talking to anyone at SPDX, including by creating your own SPDX document that defines it in an "Other License Information" section, or by following practices such as REUSE. [3] For an approach that could enable anyone to create more meaningful custom IDs and share the corresponding license text, we've had discussions several times over the past 4+ years about creating a formalized "license namespace" format, built on top of the existing LicenseRef- syntax. This has repeatedly failed to reach consensus, in my view primarily due to disagreements about the nuances of what the syntax should look like, and I don't think there's any appetite to reopen that discussion yet again. 
As a result, community members are welcome to establish informal practices for how they format LicenseRef- IDs within the permitted syntax and how they share the corresponding license text, such as via REUSE. Standards for what goes on the SPDX License List: I agree with Richard that the documentation should be clearer about "vanity" licenses generally being inappropriate for inclusion on the SPDX License List. I think there is value in the license list not being just a hash of license IDs to arbitrary text. The work that the SPDX Legal community does to review and curate licenses, insert markup to group them together where appropriate, and omit licenses that are not likely to be encountered in FOSS(-ish) development, seems to be of value to downstream users of the list. If it isn't, and if downstream users do in fact want a list that is just a hash of unique IDs to arbitrary text, then anyone is of course free to implement such a list and to persuade the broader ecosystem to adopt it. For newly-drafted licenses that are used in only one or a couple of projects (or sometimes zero projects), I agree with Richard that we often burn lots of cycles going back and forth with the license author without real benefit. I'd be in favor of bumping the "substantial use" factor higher on the License Inclusion Principles list [4]. And perhaps being more explicit in related documentation about the likelihood that vanity licenses with little usage, particularly non-FOSS licenses that fall in that category are highly unlikely to be added to the list. For a change to the inclusion principles, as Jilayne mentioned earlier I do think that a Change Proposal [5] is probably the right place to discuss the specifics of what that would look like. Submitters of newly-drafted licenses with little-to-no usage do sometimes mention that they need their license to be added to the SPDX License List so that their software with their new license can be included in a package manager. 
For package managers that use license list IDs as a requirement, I'd encourage them to consider implementing and permitting LicenseRef- IDs as well. (Or, if they don't want to permit LicenseRef- IDs, then that suggests to me that they are in fact finding some value in the curation that we perform for the License List.)

And of course, to James's point: if a brand new license does see significant usage in the wild such that it is likely to be encountered in a broad set of community-developed software projects, then at that point it may be appropriate to add to the list. But I don't see value in having the SPDX License List be the first stop for a newly-drafted, non-FOSS license that is used in someone's personal project, or in having us burn cycles repeatedly explaining that.

Steve

[1] LicenseRef- syntax: see https://spdx.github.io/spdx-spec/v2.3/using-SPDX-short-identifiers-in-source-files/ and https://spdx.github.io/spdx-spec/v2.3/other-licensing-information-detected/
[2] idstring format: https://spdx.github.io/spdx-spec/v2.3/SPDX-license-expressions/
[3] REUSE: https://reuse.software/spec/#license-files
[4] License Inclusion Principles: https://github.com/spdx/license-list-XML/blob/main/DOCS/license-inclusion-principles.md
[5] Change Proposals: https://github.com/spdx/change-proposal

On Wed, Jan 25, 2023 at 1:14 PM Kyle Mitchell <kyle@...> wrote:

> If the idea is really to hunt down every license lurking in
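Steve's note about building a reusable LicenseRef- ID from a hash of the license text can be sketched as follows. This assumes SHA-256 and whitespace normalization, neither of which is prescribed anywhere in the thread, and the function names are made up for the example. The character check covers only the plain `LicenseRef-` form with the characters listed in the message (letters, digits, hyphen, period).

```python
import hashlib
import re

# Plain "LicenseRef-" followed by letters, digits, hyphen or period.
LICENSE_REF = re.compile(r"LicenseRef-[A-Za-z0-9.-]+")


def is_license_ref(identifier):
    """True if the identifier is a syntactically plausible LicenseRef- ID."""
    return LICENSE_REF.fullmatch(identifier) is not None


def license_ref_from_text(license_text):
    """Derive a custom ID by prepending "LicenseRef-" to a hash of the text.

    Collapsing whitespace before hashing is an assumption of this sketch, so
    that re-wrapped copies of the same text map to the same ID.
    """
    normalized = " ".join(license_text.split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return "LicenseRef-" + digest
```

As Steve says, an ID like this is not meaningful on its own; the license text still has to be shipped or published alongside it, for example in the way REUSE describes.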
|
Max Mehl
+1 to everything Steve just wrote, with one comment.
> For an approach that could enable anyone to create more meaningful custom IDs and share the corresponding license text, we've had discussions several times over the past 4+ years about creating a formalized "license namespace" format, built on top of the existing LicenseRef- syntax. This has repeatedly failed to reach consensus, in my view primarily due to disagreements about the nuances of what the syntax should look like, and I don't think there's any appetite to reopen that discussion yet again.

License namespaces were the first thing that came to my mind when reading the thread. Thanks for confirming that the proposal was never really buried for good, but just faded out - I wasn't sure.

What would be worse? People inventing incompatible practices with LicenseRef IDs (and eventually identical IDs that mean different licenses), or finally settling on a syntax for license namespaces, even if it's only 80% perfect?

I can see one scenario in which the former is better: making the LicenseRef hacks appear so chaotic that people strive to use proper official SPDX IDs and therefore do not add to license proliferation. In any other case, I wonder whether it shouldn't be a priority of the whole SPDX project to reach a consensus via a well-managed process.

Best, Max

-- Max Mehl
Open Source Strategy & Governance Enterprise-Team
Chief Technology Office (CTO), T.IP E-T-378
DB Systel GmbH
Jürgen-Ponto-Platz 1, 60329 Frankfurt/M
|
Kyle Mitchell
I was involved implementing SPDX license IDs as package
license metadata for a few package managers. How to handle licenses that don't have IDs came up every time. `LicenseRef-*` would get mentioned, usually because I brought it up.

Maintainers preferred to implement something less arbitrary looking that better fit their system and style. npm defined a magic string to point to files in tarballs. Rust defined a separate metadata key for license file paths rather than license expressions. I see someone has gone back and revised the GemSpec reference with `LicenseRef-`s, but I've never seen one in the wild.

They need an escape hatch to "whatever's in the license file" for existing, custom-licensed packages. Some of these are one-off commercial terms whose authors want them noticed and read rather than abbreviated. Meanwhile, if anyone can put whatever `LicenseRef` in their package meta, there's potential for collisions, which means auditors have to treat `LicenseRef-*` as "look in the tarball" no matter what comes after `LicenseRef-`.

Back on the maintainer side, the last thing they want is to spend time refereeing yet another global namespace, for `LicenseRef`s in addition to for package names. They're not interested in the SPDX spec per se, just the license list, which is grokked quickly by reading the first paragraph of spdx.org/licenses. Some take the expression syntax, especially if there's a preexisting parser in their language. Others just use lists or arrays, like RubyGems.

None of the maintainers I worked with or have spoken to seem to ascribe curation value to the license list. Some cheer more for particular kinds of licenses SPDX has identified, like the common permissive open licenses. But they don't use their software to force or coerce toward them. In the end, they're running systems that accept and distribute packages under whatever kind of terms. That's part of being competitive these days, especially early on in a language hype cycle.
Whether there are five hundred licenses on the list or a thousand doesn't matter, so long as they can automate pulling down new list versions for their builds---which I helped them do. They'd prefer if new versions of the list don't trigger stampedes of bug reports about new warning messages or validation errors. That's about it. -- Kyle Mitchell, attorney // Oakland // (510) 712 - 0933 |
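The handling described above can be sketched as a small classifier over a package's license metadata value. This is an illustration under stated assumptions, not any package manager's actual logic: `KNOWN_IDS` is a tiny stand-in for the real list, `classify` and its return shape are invented for this example, and `SEE LICENSE IN <filename>` is used as the npm-style file pointer.

```python
import re

# Stand-in for the full license list a tool would load at build time.
KNOWN_IDS = {"MIT", "Apache-2.0", "BSD-3-Clause"}

# npm-style pointer to a license file shipped inside the package.
FILE_POINTER = re.compile(r"SEE LICENSE IN (.+)")


def classify(value):
    """Classify a single license metadata value.

    Returns a (kind, detail) pair:
      ("listed", id)      the value is an ID on the list
      ("file", filename)  the terms live in a named file in the package
      ("file", None)      a LicenseRef-*, i.e. "look in the tarball"
      ("unknown", value)  anything else, typically a warning for the author
    """
    if value in KNOWN_IDS:
        return ("listed", value)
    match = FILE_POINTER.fullmatch(value)
    if match:
        return ("file", match.group(1))
    if value.startswith("LicenseRef-"):
        # Whatever follows the prefix is not globally unique, so it carries
        # no information beyond "read the packaged license text".
        return ("file", None)
    return ("unknown", value)
```

The `("unknown", ...)` branch is where the metadata warnings mentioned earlier in the thread would come from; every ID added to the list moves some packages out of that branch.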
|