Re: A proposal for Jilayne's foreign language challenge

Brad Edmondson

Thanks Karsten for sharing your idea. 

It's a very interesting one, and compresses a lot of information into a small representation, sort of like a bitfield. I wonder, though, if that's really necessary given the verbosity we're otherwise already accepting with XML/RDF/JSON/etc. representations of the licenses on the license list. Could we represent the same information in a format both human- and machine-readable?

First, let me say that I emphatically believe SPDX should cover non-English licenses. The world of FOSS contribution is multilingual, and I think SPDX should be as well. This may require some extra work when adding a new license (finding a native-speaking attorney, English review of an auto-translation, or something in between), but I think it will prove worthwhile in the end as we expand coverage to all widely used FOSS contribution languages. In addition, the license list is under source control, so we can make changes and fix issues over time. I wouldn't be too worried about our ability to make corrections if we felt an addition was ultimately a mistake.

Second, my current opinion is that each license/language text should be tracked, treated, and marked up individually by SPDX, i.e. one license for GPL-en, another for GPL-de, another for GPL-fr, etc. (presumably 24 for the EUPL?). To my mind, these are collections of related license texts, not multiple ways to get to the "same" license. Even where the author intended them to be the "same" license (e.g. in the case of an "official" translation), it would still be up to a court to decide whether the legal terms as expressed in one language are identical to those in a purportedly "identical" text in another language, even within the same jurisdiction. So I would say, let's track them all, and get at the problem of relating one to another with more metadata.

Third, assuming all license translations are individually tracked, I think the best way to go about relating them to each other is to use something as close to native XML as possible. We already have the unique identifiers, XML tags, and attributes for each license, so why not add an XML tag that can reference another license by unique identifier? For the EU Public License in German, that might look something like this:
   <relatedLicense relationshipType="official-translation" targetLicenseIdentifier="EUPL-1.1">EUPL-1.1</relatedLicense>
Other relationshipType values might be "unofficial-translation", "official-translation-ported", "official-translation-unported", and "derived-from" (there may be others, or maybe we don't need all of those). That just represents the facts as we've perceived them, without getting into too much judgment as to how close the relationship might be. This would allow us to say, essentially, "this is what we think the relationships are; have your open-source counsel review what that means for you."
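
To make that concrete, here is a minimal sketch of how a tool might read such cross-references out of a license entry. The <license> wrapper element, its licenseId attribute, and the "EUPL-1.1-de" identifier are assumptions made up for illustration; only the relatedLicense tag and its two attributes come from the proposal above, and none of this reflects the actual SPDX schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical license entry: the <license> wrapper, its licenseId attribute,
# and the "EUPL-1.1-de" identifier are illustrative assumptions, not SPDX schema.
SAMPLE = """
<license licenseId="EUPL-1.1-de">
  <relatedLicense relationshipType="official-translation" targetLicenseIdentifier="EUPL-1.1">EUPL-1.1</relatedLicense>
</license>
"""


def related_licenses(xml_text):
    """Return (source id, relationship type, target id) triples for one entry."""
    root = ET.fromstring(xml_text)
    source_id = root.get("licenseId")
    return [
        (source_id, el.get("relationshipType"), el.get("targetLicenseIdentifier"))
        for el in root.iter("relatedLicense")
    ]


triples = related_licenses(SAMPLE)
print(triples)
```

The relationships "emerge" here in the sense that each license carries its own references, and a tool collects them by walking the entries.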

Another way of throwing data at the problem might be to individually track all of the licenses, without built-in cross-references to other license IDs, but at the same time also publish a separate document specifying which of those licenses are related to each other and what kind of bundles those are. This is the same data as proposed in the previous paragraph, but laid out explicitly (again with reference to the unique license ID) rather than emergent from the XML. I think I prefer the emergent solution, but that's just me, and what I think today. I'm no XML expert, just a young attorney with a bit of programming experience doing my best to help.
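
As a rough sketch of that second option, the separate document could be little more than a table of (license, relationship, target) rows from which the bundles are computed. The per-language identifiers below ("EUPL-1.1-de", "GPL-2.0-de", and so on) and the row layout are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical standalone relationship table; the per-language identifiers
# and the three-column layout are assumptions for illustration only.
RELATIONSHIPS = [
    ("EUPL-1.1-de", "official-translation", "EUPL-1.1"),
    ("EUPL-1.1-fr", "official-translation", "EUPL-1.1"),
    ("GPL-2.0-de", "unofficial-translation", "GPL-2.0"),
]


def bundles(relationships):
    """Group related licenses under the license they point to."""
    grouped = defaultdict(list)
    for source, rel_type, target in relationships:
        grouped[target].append((source, rel_type))
    return dict(grouped)


for target, members in bundles(RELATIONSHIPS).items():
    print(target, members)
```

Either way the underlying data is the same; the difference is only whether it lives inside each license entry or in one place alongside the list.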

What do others think of this? Should we have Kate add handling of multi-language licenses to the tech team's spec discussion?


PS - Preemptive apologies to Jilayne -- I'm guessing your preferred solution would not be "just make the license list longer!" -- but I do actually think that's the best way to handle these clusters of related licenses (plus a little more metadata about relationships).     :-)

Brad Edmondson, Esq.
512-673-8782 | brad.edmondson@...

On Wed, May 3, 2017 at 10:33 AM, <Karsten.Reincke@...> wrote:
Dear Alan

> Karsten,
> Thanks for the thoughtful suggestion. I like it and think it could
> work.

I am happy to have been able to help. I need a running SPDX system for my further work. So, it is not totally unselfish ;-)

> One issue I see is the issue we run into about trying to avoid
> making a legal judgment when classifying the licenses.  That would
> imply we wouldn't use dimension 4 about "preserving legal power."

It is important that you define the list of necessary dimensions: you are the SPDX experts. I personally agree with your attitude: inserting such a value could make SPDX a bit judgmental (and would surely provoke unnecessary discussions). However, I inserted that dimension only because it was mentioned/requested at the LLW.

> Also for dimension 3 regarding "official" licenses, perhaps we need
> some more gradation for something where it's not "official" but it's at
> least acknowledged or referenced.  For example, the GPL translations
> aren't official:  I
> think if we're factually relying on statements made by the license
> steward, it's less a concern about making a legal judgment.

Such a differentiation would be helpful. Combined with the simplification of dropping the 'legal power' dimension, you could use a better and simpler representation:

- original
  - English 00
  - foreign 01
- translation
  - approved 10
  - audited 20
  - ...
  - unclear f0

Feel free to expand and redesign this little domain.
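
A minimal sketch of how such two-character codes could be looked up is given below. It covers only the five values listed above; the entries elided by "..." are deliberately left out, and the function name and output format are assumptions for illustration.

```python
# Lookup table for the codes listed above. Values hidden behind "..." in the
# list are intentionally not filled in.
CODES = {
    "00": ("original", "English"),
    "01": ("original", "foreign"),
    "10": ("translation", "approved"),
    "20": ("translation", "audited"),
    "f0": ("translation", "unclear"),
}


def describe(code):
    """Return a readable label for a code; raises KeyError if it is unknown."""
    kind, detail = CODES[code.lower()]
    return f"{kind}: {detail}"


print(describe("10"))
```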

With best regards

Deutsche Telekom Technik GmbH  / Infrastructure Cloud
Karsten Reincke, Senior Expert Key Projects - Telekom Open Source Committee