A proposal for Jilayne's foreign language challenge

Karsten Reincke

Dear SPDX community,

I am once again sitting in an exciting session of the FSFE Legal and Licensing Workshop in Barcelona.

Yesterday we had the pleasure of listening to an exciting SPDX lecture, titled 'FOSS licenses and different languages' and given by Jilayne Lovejoy. She asked for feedback and ideas on some challenges. During a coffee break I was able to offer her an idea for a solution, and she asked me to publish the rough sketch on this mailing list. So, here it is:

The problem is that we have to deal with translations of original FOSS licenses. An example of such a set of related licenses is the EUPL. In this specific case the translations have an 'official' status. Other licenses are sometimes translated by 'not so official translators'. It is said of the EUPL that the official translations preserve the legal power with respect to the different European countries. Unfortunately - as one member of that LLW session mentioned - it has turned out that this statement is not true.

So, the problem is how to reliably group licenses which are linked to other licenses in some sense. During our LLW session there was a clear wish to create a specific SPDX file for each translation of each FOSS license. That means NOT grouping the licenses - because they are not 'identical'.

As a consequence we will have a large set of SPDX files. And that might reduce the usability of SPDX.

So, my proposal is to classify each element of a license cluster like the EUPL by a number which indicates its distance to the original. The idea is to encode the reliability of a license in a number. Using that technique would allow us to specify a license cluster by one SPDX file (and a distance number [which could be incorporated into the SPDX file]). The advantage of this proposal is that we end up with approximately no more licenses than the OSI license list contains.

To be able to use an SPDX License Cluster Distance Value, we would have to define some dimensions whose values determine the distance to an original. Then we would have to prioritize these dimensions and values so that we get a sequence of distance factors, ordered by priority. Creating a distance number on that basis is simple. The main idea would be:

The smaller the number, the smaller the distance to the original.

What could that look like concretely?

Let us assign zero to an English original. Here are some dimensions (which have been mentioned in the LLW session):

(0) Is the license an original written in English? (YES=0 | NO=1)
(1) Is the license a translation / derivation? (YES=2 | NO=0)
(2) Is the license an official translation? (YES=0 | NO=4)
(3) Does the translated license preserve the legal power? (YES=0 | UNKNOWN=8 | NO=16)

Finally, build the sum.

With respect to the EUPL, this algorithm delivers the following distance values:

a) English version = 0 + 0 + 0 + 0 = 0

b) Greek version =
1 (because it's not the English original) +
2 (because it is a translation) +
0 (because it is an official translation) +
0 (because it preserves the legal power)
= 3

c) Spanish version =
1 (because it's not the English original) +
2 (because it is a translation) +
0 (because it is an official translation) +
8 (because it is unknown whether it preserves the legal power)
= 11

d) Freman [of the hypothetical prospective country French+German] version =
1 (because it's not the English original) +
2 (because it is a translation) +
4 (because it is an unofficial translation) +
16 (because it does not preserve the legal power)
= 23
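
To make the arithmetic above easier to check, here is a rough Python sketch of the summation. It is only an illustration under stated assumptions: the function name distance_value, its parameter names and the use of None for UNKNOWN are hypothetical and not part of SPDX. The weights are the ones listed in the dimensions above. The sketch also assumes that the two translation-related dimensions contribute 0 for a license that is not a translation, which matches the English example.

```python
from typing import Optional


def distance_value(english_original: bool,
                   is_translation: bool,
                   official_translation: bool = True,
                   preserves_legal_power: Optional[bool] = True) -> int:
    """Sum the per-dimension weights sketched in this mail.

    All names here are hypothetical; None stands for UNKNOWN.
    """
    total = 0
    # Is the license an original written in English? YES=0 | NO=1
    total += 0 if english_original else 1
    # Is the license a translation / derivation? YES=2 | NO=0
    total += 2 if is_translation else 0
    # Assumption: the remaining dimensions only apply to translations.
    if is_translation:
        # Official translation? YES=0 | NO=4
        total += 0 if official_translation else 4
        # Preserves the legal power? YES=0 | UNKNOWN=8 | NO=16
        if preserves_legal_power is None:
            total += 8
        elif not preserves_legal_power:
            total += 16
    return total


# The EUPL examples from the text
english = distance_value(english_original=True, is_translation=False)
greek = distance_value(False, True, official_translation=True,
                       preserves_legal_power=True)
spanish = distance_value(False, True, official_translation=True,
                         preserves_legal_power=None)
freman = distance_value(False, True, official_translation=False,
                        preserves_legal_power=False)
print(english, greek, spanish, freman)  # 0 3 11 23
```

Because each weight is larger than the sum of all weights of the dimensions before it, a worse answer on a later dimension always outweighs the earlier ones.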

What does such a technique mean for one of the problems Jilayne mentioned?

A.1) In the case that we do not have an original written in English, the SPDX License Cluster Distance Value would be 1 instead of zero, but this number nevertheless indicates a very small distance from the ideal. And it indicates that the English-speaking community might have (minor) problems using software licensed that way.

A.2) On the other hand, if we have an English translation of a non-English original, that license gets the value (0 + 2 + x + y), which clearly indicates that its distance to the original / ideal is greater than the distance between a foreign original and the ideal.
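
The comparison between these two cases can be checked by enumerating every possible x and y. The following Python lines are only a sketch. The constant names are hypothetical, and the weights are taken from the dimensions listed earlier (x is the 'official translation' weight, y the 'legal power' weight).

```python
from itertools import product

FOREIGN_ORIGINAL = 1              # non-English original: 1 + 0 + 0 + 0
OFFICIAL_WEIGHTS = (0, 4)         # x: official translation YES | NO
LEGAL_POWER_WEIGHTS = (0, 8, 16)  # y: legal power YES | UNKNOWN | NO

# Every value an English translation of a non-English original can take
english_translation_values = sorted(
    0 + 2 + x + y
    for x, y in product(OFFICIAL_WEIGHTS, LEGAL_POWER_WEIGHTS)
)
print(english_translation_values)  # [2, 6, 10, 14, 18, 22]

# Even the best case (2) is farther from the ideal than a foreign original (1)
assert min(english_translation_values) > FOREIGN_ORIGINAL
```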

A predictable question:

This proposal might suggest also clustering variants like 'BSD-4-Clause', 'BSD-3-Clause', 'BSD-2-Clause' and the newest version 'BSD-3-Clause with patent'. This would mean also encoding such differences in content into the SPDX License Cluster Distance Value.

I don't like that idea. I think that licenses in the same language whose texts literally differ should always have different SPDX files - because they are intentionally different licenses.

A last remark:

In the LLW session someone voted for having only English originals. He argued that in case of foreign-language licenses, SPDX does not reliably know whether it really is a FOSS license. I can't follow that position:

Even as a native English speaker you do not know, in the case of a license written in English, whether it really is a Free or Open Source license. This can only be evaluated by an established official process - such as the one the OSI offers. Hence:

1) If SPDX stuck strictly to the OSI list of open source licenses, that problem would not exist. All OSI licenses are in English.

2) If SPDX wants to cover other licenses which are not blessed by any process, the problem of the reliable FOSS status is the same for English and for foreign-language licenses. Foreign-language licenses have the advantage that they indicate the existence of the problem more clearly.

So, please feel free to use this idea, to throw it away, to find other dimensions, or to refine the algorithm. The work you do is very valuable for the FOSS community - as we could see, and not only in the lecture Jilayne gave.

With best regards

Deutsche Telekom Technik GmbH  / Infrastructure Cloud
Karsten Reincke, Senior Expert Key Projects - Telekom Open Source Committee
[display complete signature: http://opensource.telekom.net/kreincke/kr-dtag-sign-en.txt ]
