
Welcome SPDX Google Summer of Code Students

Gary O'Neall
 

Please join me in welcoming our Google Summer of Code students Anna Buhman, Nuvadga Christian Tete, Rohit Lodha, and Aleksandr Lisianoi to the SPDX community.

 

Anna, Nuvadga, Rohit and Aleksandr will be collaborating with the SPDX tech team on improving our tooling, adding GitHub integration, creating a license coverage grading tool and implementing the long-awaited online validation tools.

 

This month, the students will be focused on community bonding, with the actual projects beginning on May 30.

 

More information on the Google Summer of Code can be found at https://summerofcode.withgoogle.com/

 

Gary

 

 

-------------------------------------------------

Gary O'Neall

Principal Consultant

Source Auditor Inc.

Mobile: 408.805.0586

Email: gary@...

 


Minutes from SPDX May General Meeting

Philip Odence
 

https://wiki.spdx.org/view/General_Meeting/Minutes/2017-05-04

 

 

 

General Meeting/Minutes/2017-05-04


  • Attendance: 10
  • Led by Phil Odence
  • Minutes of April meeting approved 

 


Guest Presentation - Philippe

  • ScanCode
    • Open source project
    • Tool to enable developers to find the license and origin of components they are using
  • Features
    • Accurate
      • Scanned Linux kernel and results were superior to two other tools that were tested
    • Handles source and binaries
    • Well tested and community maintained
    • Easy to improve license detection
  • How it works
    • Input: Simple test files
    • Performs a diff against a large number of licenses and mentions
    • Handles packages via package manager.
    • Uses natural language parser for copyrights
    • Output in SPDX (or JSON)
  • Two pieces: ScanCode Toolkit and Code Manager
  • Other projects in code.org

 

Tech Team Report - Kate/Gary

  • Restarted discussions about feature needs for next release
    • Looking at:
      • Philippe’s results
      • Debian
      • Other testing results
    • Wiki page has ideas for next release or two
      • Feel free to add there or via email
  • Also looking at putting together a test suite
    • Set of packages 
    • Results to be compared
  • Google SoC
    • Selected three proposals
    • Students being notified about now
    • Next steps
      • Community bonding
      • Working with Students
      • Will provide status

 

Outreach Team Report - Jack

  • Working on Umbrella project
    • A wrapper around all the repositories for tools
  • Discussion of a tool certification project
    • Aiming to have it done in the Q1 2018 timeframe
    • Initial testing at LinuxCon Europe in Prague in October
  • Call for Papers this week for NA LinuxCon, LA in August

 

Legal Team Report - Jilayne

  • Down to 24 licenses to review
  • Proposal for how to handle non-English licenses
    • Have handled some ad hoc
    • Need a broader policy
    • Will have implications for license matching guidelines

 

Attendees

  • Phil Odence, Black Duck
  • Kate Stewart, Linux Foundation
  • Gary O’Neall, Source Auditor
  • Philippe Ombrédanne, nexB
  • Brad Edmondson, Harvard
  • Jilayne Lovejoy, ARM
  • Jack Manbeck, TI
  • Robin Gandhi, UNO
  • Kevin Nelson, Optum
  • Dennis Clark, Palamida

 

 


Re: Thursday SPDX General Meeting

Philippe Ombredanne
 

On Tue, May 2, 2017 at 7:20 PM, Phil Odence
<podence@...> wrote:
Our special guest this month is Philippe Ombredanne who has been involved
with SPDX for a number of years. He will be talking about an open source
tool he developed to scan code and detect licenses, copyrights, package
metadata & dependencies and more…including generating SPDX docs.
All:
I have attached a PDF if you are only calling in


--
Cordially
Philippe Ombredanne


Re: A proposal for Jilayne's foreign language challenge

J Lovejoy
 

Thanks so much Karsten for posting this, as my memory would have never sparked such discussion and now we have the benefit of your input, as well as the others in the SPDX team!

I am copying this over to the SPDX-legal mailing list as that is where the discussion will continue, although as Brad already noted, I think this is a cross-team issue.  As such, it is helpful to have started the discussion on the general list for exposure, and I will also mention it today at the general call.  If anyone is on the general list, but not on the legal mailing list, please do join so you can be part of the continued discussions: https://lists.spdx.org/mailman/listinfo/spdx-legal

And to Brad’s PS - if we need to make the license list longer to accommodate the best, international solution to this, then that is what we need to do!  I’m trusting that with the switch to XML format and using Github, maintaining the license list will fall on more shoulders going forward, as we have already seen in the transition!!

Thanks,
Jilayne

SPDX Legal Team co-lead
opensource@...




Re: A proposal for Jilayne's foreign language challenge

Brad Edmondson
 

Thanks Karsten for sharing your idea. 

It's a very interesting one, and compresses a lot of information into a small representation, sort of like a bitfield. I wonder, though, if that's really necessary given the verbosity we're otherwise already accepting with XML/RDF/JSON/etc. representations of the licenses on the license list. Could we represent the same information in a format both human- and machine-readable?


First, let me say that I emphatically believe SPDX should cover non-English licenses. The world of FOSS software contribution is multilingual, and I think SPDX should be as well. This may require some extra work when adding a new license (finding a native-speaking attorney, English review of an auto-translation, or something in between), but I think it will prove worthwhile in the end as we expand coverage to all widely-used FOSS contribution languages. In addition, we have the license list source-controlled so that we can make changes and fix issues over time, so I wouldn't be too worried about our ability to make corrections if we felt an addition was ultimately a mistake.

Second, my current opinion is that each license/language text should be tracked, treated, and marked up individually by SPDX, i.e. one license for GPL-en, another for GPL-de, another for GPL-fr, etc. (presumably 24 for the EUPL?). To my mind, these are collections of related license texts, not multiple ways to get to the "same" license, since even if the "same" license is in fact what the author intended (e.g. in the case of an "official" translation), it would still be up to a court to decide whether the legal terms as represented in one language are identical to a similar, purportedly "identical" representation in another language (even in the same jurisdiction). So I would say, let's track them all, and get at the problem of relating one to another with more metadata.

Third, assuming all license translations are individually tracked, I think the best way to go about relating them to each other is to use something as close to native XML as possible. We already have the unique identifiers, XML tags, and attributes for each license, so why not add an XML tag that can reference another license by unique identifier? For EU Public License in German, that might look something like this:
...
<relatedLicenses>
   <relatedLicense relationshipType="official-translation" targetLicenseIdentifier="EUPL-1.1">EUPL-1.1</relatedLicense>
</relatedLicenses>
...
Other relationshipTypes might be "unofficial-translation," "official-translation-ported," "official-translation-unported," and "derived-from" (there may be others, or maybe we don't need all of those). That just represents the facts as we've perceived them, without getting into too much judgment as to how close the relationship might be. This would allow us to say, essentially, "this is what we think the relationships are; have your open-source counsel review what that means for you."
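
To make the idea concrete, here is a minimal Python sketch of how a tool might read such a tag. It is an illustration only: the `<relatedLicenses>` element is the proposal from this message, not part of any published SPDX schema, and the wrapping `<license>` element, the `licenseId="EUPL-1.1-de"` identifier and the function name `related_licenses` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical license file fragment; only the <relatedLicenses> block
# comes from the proposal, the wrapper and its id are assumptions.
SNIPPET = """
<license licenseId="EUPL-1.1-de">
  <relatedLicenses>
    <relatedLicense relationshipType="official-translation"
                    targetLicenseIdentifier="EUPL-1.1">EUPL-1.1</relatedLicense>
  </relatedLicenses>
</license>
"""

def related_licenses(xml_text):
    """Return (relationshipType, targetLicenseIdentifier) pairs found in the XML."""
    root = ET.fromstring(xml_text.strip())
    return [
        (el.get("relationshipType"), el.get("targetLicenseIdentifier"))
        for el in root.iter("relatedLicense")
    ]

print(related_licenses(SNIPPET))
```

The relationship is stored as plain attributes, so a reader of the file sees the stated facts without the tool drawing any legal conclusion from them.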


Another way of throwing data at the problem might be to individually track all of the licenses, without built-in cross-references to other license IDs, but at the same time also publish a separate document specifying which of those licenses are related to each other and what kind of bundles those are. This is the same data as proposed in the previous paragraph, but laid out explicitly (again with reference to the unique license ID) rather than emergent from the XML. I think I prefer the emergent solution, but that's just me, and what I think today. I'm no XML expert, just a young attorney with a bit of programming experience doing my best to help.
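
A rough Python sketch of this "separate document" variant might look as follows. Everything here is assumed for illustration: the license identifiers with language suffixes, the triple layout and the function name `bundles` do not come from SPDX.

```python
from collections import defaultdict

# Hypothetical contents of the separate relations document:
# (license id, relationship type, id of the license it relates to)
RELATIONS = [
    ("EUPL-1.1-de", "official-translation", "EUPL-1.1"),
    ("EUPL-1.1-fr", "official-translation", "EUPL-1.1"),
]

def bundles(relations):
    """Group related licenses under the license they point to."""
    grouped = defaultdict(list)
    for source, kind, target in relations:
        grouped[target].append((source, kind))
    return dict(grouped)

print(bundles(RELATIONS))
```

The individual license files stay free of cross-references in this layout; the bundles exist only in the one relations table.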


What do others think of this? Should we have Kate add handling multi-language licenses to the tech team's spec discussion?

Best,
Brad

PS - Preemptive apologies to Jilayne -- I'm guessing your preferred solution would not be "just make the license list longer!" -- but I do actually think that's the best way to handle these clusters of related licenses (plus a little more metadata about relationships).     :-)

--
Brad Edmondson, Esq.
512-673-8782 | brad.edmondson@...

On Wed, May 3, 2017 at 10:33 AM, <Karsten.Reincke@...> wrote:
Dear Alan

> Karsten,
> Thanks for the thoughtful suggestion. I like it and think it could
> work.

I am happy to have been able to help. I need a running SPDX system for my further work. So, it is not totally unselfish ;-)

> One issue I see is the issue we run into about trying to avoid
> making a legal judgment when classifying the licenses.  That would
> imply we wouldn't use dimension 4 about "preserving legal power."

It is important that you define the list of necessary dimensions: you are the SPDX experts. I personally agree with your attitude: Inserting such a value could make SPDX a bit pejorative (and will surely evoke unnecessary discussions). However, I inserted that dimension only because it had been mentioned/requested at the LLW.

> Also for dimension 3 regarding "official" licenses, perhaps we need
> some more gradation for something where it's not "official" but it's at
> least acknowledged or referenced.  For example, the GPL translations
> aren't official: https://www.gnu.org/licenses/translations.en.html  I
> think if we're factually relying on statements made by the license
> steward, it's less a concern about making a legal judgment.

Such a differentiation would be helpful. Together with the simplification of not using the 'legal power' dimension, you could use a better and simpler representation:

licenses
- original
  - English 00
  - foreign 01
- translation
  - approved 10
  - audited 20
  - ...
  - unclear f0

Feel free to expand and redesign this little domain

With best regards
Karsten

---
Deutsche Telekom Technik GmbH  / Infrastructure Cloud
Karsten Reincke, Senior Expert Key Projects - Telekom Open Source Committee
[display complete signatur: http://opensource.telekom.net/kreincke/kr-dtag-sign-en.txt ]

_______________________________________________
Spdx mailing list
Spdx@...
https://lists.spdx.org/mailman/listinfo/spdx


Re: A proposal for Jilayne's foreign language challenge

Karsten Reincke
 

Dear Alan

> Karsten,
> Thanks for the thoughtful suggestion. I like it and think it could
> work.

I am happy to have been able to help. I need a running SPDX system for my further work. So, it is not totally unselfish ;-)

> One issue I see is the issue we run into about trying to avoid
> making a legal judgment when classifying the licenses. That would
> imply we wouldn't use dimension 4 about "preserving legal power."

It is important that you define the list of necessary dimensions: you are the SPDX experts. I personally agree with your attitude: Inserting such a value could make SPDX a bit pejorative (and will surely evoke unnecessary discussions). However, I inserted that dimension only because it had been mentioned/requested at the LLW.

> Also for dimension 3 regarding "official" licenses, perhaps we need
> some more gradation for something where it's not "official" but it's at
> least acknowledged or referenced. For example, the GPL translations
> aren't official: https://www.gnu.org/licenses/translations.en.html I
> think if we're factually relying on statements made by the license
> steward, it's less a concern about making a legal judgment.

Such a differentiation would be helpful. Together with the simplification of not using the 'legal power' dimension, you could use a better and simpler representation:

licenses
- original
  - English 00
  - foreign 01
- translation
  - approved 10
  - audited 20
  - ...
  - unclear f0

Feel free to expand and redesign this little domain
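
One possible reading of the codes above, sketched in Python. This is an assumption layered on the list, not something stated in it: the codes are treated as two hexadecimal digits where a non-zero first digit marks a translation, and the names `LicenseOrigin`, `is_translation` and `code` are hypothetical. The elided "..." entries are left out.

```python
from enum import Enum

class LicenseOrigin(Enum):
    # Values read as two-digit hex codes (assumption)
    ORIGINAL_ENGLISH = 0x00
    ORIGINAL_FOREIGN = 0x01
    TRANSLATION_APPROVED = 0x10
    TRANSLATION_AUDITED = 0x20
    TRANSLATION_UNCLEAR = 0xF0

def is_translation(origin):
    """A non-zero high digit marks a translation under this reading."""
    return (origin.value & 0xF0) != 0

def code(origin):
    """Render the value in the two-character form used in the list."""
    return f"{origin.value:02x}"

for origin in LicenseOrigin:
    print(code(origin), origin.name, is_translation(origin))
```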

With best regards
Karsten

---
Deutsche Telekom Technik GmbH / Infrastructure Cloud
Karsten Reincke, Senior Expert Key Projects - Telekom Open Source Committee
[display complete signatur: http://opensource.telekom.net/kreincke/kr-dtag-sign-en.txt ]


Thursday SPDX General Meeting

Philip Odence
 

Our special guest this month is Philippe Ombredanne who has been involved with SPDX for a number of years. He will be talking about an open source tool he developed to scan code and detect licenses, copyrights, package metadata & dependencies and more…including generating SPDX docs.

 

 

GENERAL MEETING

 

Meeting Time: Thurs, May 4, 8am PDT / 10 am CDT / 11am EDT / 15:00 UTC. http://www.timeanddate.com/worldclock/converter.html


Conf call dial-in:

Join the call: https://www.uberconference.com/katestewart

Optional dial in number: 877-297-7470

Alternate number: 512-910-4433

No PIN needed

 

Administrative Agenda

Attendance

Minutes Approval  https://wiki.spdx.org/view/General_Meeting/Minutes/2017-04-06

 

Guest Presentation – Philippe

 

Technical Team Report – Kate/Gary

 

Legal Team Report – Jilayne

 

Business Team Report – Jack

 

Cross Functional Issues – All

 

 

 

 





Re: A proposal for Jilayne's foreign language challenge

Alan Tse
 

Karsten,
Thanks for the thoughtful suggestion. I like it and think it could work. One issue I see is the issue we run into about trying to avoid making a legal judgment when classifying the licenses. That would imply we wouldn't use dimension 4 about "preserving legal power."

Also for dimension 3 regarding "official" licenses, perhaps we need some more gradation for something where it's not "official" but it's at least acknowledged or referenced. For example, the GPL translations aren't official: https://www.gnu.org/licenses/translations.en.html I think if we're factually relying on statements made by the license steward, it's less a concern about making a legal judgment.

Alan D. Tse

-----Original Message-----
From: spdx-bounces@... [mailto:spdx-bounces@...] On Behalf Of Karsten.Reincke@...
Sent: Friday, April 28, 2017 2:29 AM
To: spdx@...
Subject: A proposal for Jilayne's foreign language challenge

Dear SPDX community;

I am once again sitting in an exciting session of the FSFE Legal and Licensing Workshop in Barcelona.

Yesterday we had the pleasure of listening to an exciting SPDX lecture, titled 'FOSS licenses and different languages' and given by Jilayne Lovejoy. She asked for feedback and ideas on some challenges. During a coffee break I was able to offer her an idea for a solution, and she asked me to publish the rough sketch on this mailing list. So, here it is:

The problem was that we have to deal with translations of original FOSS licenses. An example of such a set of related licenses is the EUPL. In this specific case the translations have an 'official' status. Other licenses are sometimes translated by 'not so official translators'. It is said of the EUPL that the official translations preserve the legal power with respect to the different European countries. Unfortunately, as one member of that LLW session mentioned, it turned out that this statement is not true.

So, the problem is how to reliably group licenses which are linked to other licenses in any sense. During our LLW session there was a clear wish to create a specific SPDX file for each translation of each FOSS license. That means NOT grouping the licenses, due to the fact that they are not 'identical'.

As a consequence we will have a large set of SPDX files. And that might reduce the usability of SPDX.

So, my proposal is to classify each element of a license cluster such as the EUPL by a number which indicates its distance to the original. The idea is to encode the reliability of a license in a number. Using that technique would allow us to specify a license cluster by one SPDX file (and a distance number [which could be incorporated into the SPDX file]). The advantage of this proposal is that we would end up with approximately no more licenses than the OSI license list.

To be able to use an SPDX License Cluster Distance Value, we would have to define some dimensions whose values determine the distance to an original. Then we would have to prioritize these dimensions and values so that we get an ordered row of distance factors, ordered by priority. Creating a distance number on that basis is simple. The main idea would be:

The smaller that number, the smaller the distance to the original.

How could that look concretely?

Let us link an English original to a zero. Here are some dimensions (which have been mentioned in the LLW session):

(0) Is the license an original written in English? (YES=0 | NO=1)
(1) Is the license a translation / derivation? (YES=2 | NO=0)
(3) Is the license an official translation? (YES=0 | NO=4)
(4) Does the translated license preserve the legal power? (YES=0 | UNKNOWN=8 | NO=16)

Finally, build the sum.

With respect to the EUPL, this algorithm delivers the following distance values:

a) English version = 0 + 0 + 0 + 0 = 0

b) Greek version =
1 (because it's not the English original) +
2 (because it is a translation) +
0 (because it is an official translation) +
0 (because it preserves the legal power) = 3

c) Spanish version =
1 (because it's not the English original) +
2 (because it is a translation) +
0 (because it is an official translation) +
8 (because it is unknown whether it preserves the legal power) = 11

d) Freman [of the hypothetical prospective country French+German] version =
1 (because it's not the English original) +
2 (because it is a translation) +
4 (because it is an unofficial translation) +
16 (because it does not preserve the legal power) = 23

What does such a technique mean for one of the problems Jilayne mentioned?

A.1) In the case that we do not have an English-language original, the SPDX License Cluster Distance Value would be 1 instead of zero, but this number nevertheless indicates a very small distance from the ideal. And it indicates that the English-speaking community might have (minor) problems using software licensed this way.

A.2) On the other hand, if we have an English translation of a non-English original, that license gets the value (0 + 2 + x + y), which clearly indicates that the distance to the original / ideal is greater than the distance between a foreign original and the ideal.

A predictable question:

This idea might prompt the further idea of also clustering variants like 'BSD-4-Clause, BSD-3-Clause, BSD-2-Clause' and the newest version 'BSD-3-Clause with patent'. This would mean also encoding such differences in content into the SPDX License Cluster Distance Value.

I don't like that idea. I think that licenses in the same language whose texts literally differ should always have different SPDX files, because they are intentionally different licenses.

A last remark:

In the LLW session someone voted for having only English originals. He argued that in the case of foreign-language licenses, SPDX does not reliably know whether it really is a FOSS license. I can't follow that position:

Even as a native English speaker, you do not know whether a license written in English really is a Free or Open Source license. This can only be evaluated by an established official process, such as the one the OSI offers. Hence:

1) If SPDX strictly stuck to the OSI list of open source licenses, that problem would not exist. All OSI licenses are English.

2) If SPDX wants to cover other licenses which are not blessed by any process, the problem of reliable FOSS status is the same for English and foreign-language licenses. Foreign-language licenses have the advantage that they indicate the existence of the problem more clearly.

So, please feel free to use this idea, to throw it away, to find other dimensions, to refine the algorithm. The work you do is very valuable for the FOSS community, as we could see, and not only at the lecture Jilayne gave.

With best regards
Karsten


---
Deutsche Telekom Technik GmbH  / Infrastructure Cloud
Karsten Reincke, Senior Expert Key Projects - Telekom Open Source Committee
[display complete signatur: http://opensource.telekom.net/kreincke/kr-dtag-sign-en.txt ]



A proposal for Jilayne's foreign language challenge

Karsten Reincke
 

Dear SPDX community;

I am once again sitting in an exciting session of the FSFE Legal and Licensing Workshop in Barcelona.

Yesterday we had the pleasure of listening to an exciting SPDX lecture, titled 'FOSS licenses and different languages' and given by Jilayne Lovejoy. She asked for feedback and ideas on some challenges. During a coffee break I was able to offer her an idea for a solution, and she asked me to publish the rough sketch on this mailing list. So, here it is:

The problem was that we have to deal with translations of original FOSS licenses. An example of such a set of related licenses is the EUPL. In this specific case the translations have an 'official' status. Other licenses are sometimes translated by 'not so official translators'. It is said of the EUPL that the official translations preserve the legal power with respect to the different European countries. Unfortunately, as one member of that LLW session mentioned, it turned out that this statement is not true.

So, the problem is how to reliably group licenses which are linked to other licenses in any sense. During our LLW session there was a clear wish to create a specific SPDX file for each translation of each FOSS license. That means NOT grouping the licenses, due to the fact that they are not 'identical'.

As a consequence we will have a large set of SPDX files. And that might reduce the usability of SPDX.

So, my proposal is to classify each element of a license cluster such as the EUPL by a number which indicates its distance to the original. The idea is to encode the reliability of a license in a number. Using that technique would allow us to specify a license cluster by one SPDX file (and a distance number [which could be incorporated into the SPDX file]). The advantage of this proposal is that we would end up with approximately no more licenses than the OSI license list.

To be able to use an SPDX License Cluster Distance Value, we would have to define some dimensions whose values determine the distance to an original. Then we would have to prioritize these dimensions and values so that we get an ordered row of distance factors, ordered by priority. Creating a distance number on that basis is simple. The main idea would be:

The smaller that number, the smaller the distance to the original.

How could that look concretely?

Let us link an English original to a zero. Here are some dimensions (which have been mentioned in the LLW session):

(0) Is the license an original written in English? (YES=0 | NO=1)
(1) Is the license a translation / derivation? (YES=2 | NO=0)
(3) Is the license an official translation? (YES=0 | NO=4)
(4) Does the translated license preserve the legal power? (YES=0 | UNKNOWN=8 | NO=16)

Finally, build the sum.

With respect to the EUPL, this algorithm delivers the following distance values:

a) English version = 0 + 0 + 0 + 0 = 0

b) Greek version =
1 (because it's not the English original) +
2 (because it is a translation) +
0 (because it is an official translation) +
0 (because it preserves the legal power)
= 3

c) Spanish version =
1 (because it's not the English original) +
2 (because it is a translation) +
0 (because it is an official translation) +
8 (because it is unknown whether it preserves the legal power)
= 11

d) Freman [of the hypothetical prospective country French+German] version =
1 (because it's not the English original) +
2 (because it is a translation) +
4 (because it is an unofficial translation) +
16 (because it does not preserve the legal power)
= 23
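
The summation above can be sketched in a few lines of Python. This is only an illustration of the proposal, not an existing SPDX tool: the class and field names are hypothetical, and the sketch assumes that the last two dimensions (official translation, legal power) contribute nothing when the license is not a translation, which is how the English version arrives at zero.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LicenseVariant:
    """One member of a license cluster (all field names are hypothetical)."""
    name: str
    english_original: bool
    translation: bool
    official_translation: bool = False            # only meaningful for translations
    preserves_legal_power: Optional[bool] = None  # None means UNKNOWN


def cluster_distance(v: LicenseVariant) -> int:
    """Sum the weighted dimensions; a smaller value means closer to the original."""
    distance = 0 if v.english_original else 1
    if v.translation:
        distance += 2
        # Assumption: the last two dimensions only apply to translations.
        distance += 0 if v.official_translation else 4
        if v.preserves_legal_power is None:
            distance += 8
        elif not v.preserves_legal_power:
            distance += 16
    return distance


eupl_cluster = [
    LicenseVariant("EUPL (English)", english_original=True, translation=False),
    LicenseVariant("EUPL (Greek)", False, True, True, True),
    LicenseVariant("EUPL (Spanish)", False, True, True, None),
    LicenseVariant("EUPL (Freman)", False, True, False, False),
]

# List the cluster members from closest to farthest from the original.
for variant in sorted(eupl_cluster, key=cluster_distance):
    print(f"{cluster_distance(variant):2d}  {variant.name}")
```

Because each weight is a distinct power of two, the sum also works as a bit field: every value maps back to exactly one combination of answers.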

What does such a technique mean for one of the problems Jilayne mentioned?

A.1) If we do not have an English-language original, the SPDX License Cluster Distance Value would be 1 instead of zero, but this number still indicates a very small distance from the ideal. It also indicates that the English-speaking community might have (minor) problems using software licensed that way.

A.2) On the other hand, if we have an English translation of a non-English original, that license gets the value (0 + 2 + x + y), which clearly indicates that its distance to the original / ideal is greater than the distance between a foreign-language original and the ideal.

A predictable question:

This proposal might also suggest clustering variants like 'BSD-4-Clause', 'BSD-3-Clause', 'BSD-2-Clause' and the newest version, 'BSD-3-Clause with patent'. That would mean also encoding such differences in content into the SPDX License Cluster Distance Value.

I don't like that idea. I think that licenses whose texts literally differ in the same language should always have different SPDX files, because they are intentionally different licenses.

A last remark:

In the LLW session someone voted for having only English originals. He argued that in case of foreign-language licenses, SPDX does not reliably know whether it really is a FOSS license. I can't follow that position:

Even as a native English speaker, you do not know whether a license written in English really is a Free or Open Source license. This can only be evaluated by an established official process, such as the one the OSI offers. Hence:

1) If SPDX strictly stuck to the OSI list of open source licenses, that problem would not exist. All OSI licenses are in English.

2) If SPDX wants to cover other licenses which are not blessed by any process, the problem of reliable FOSS status is the same for English and foreign-language licenses. Foreign-language licenses have the advantage that they indicate the existence of the problem more clearly.

So, please feel free to use this idea, to throw it away, to find other dimensions, or to refine the algorithm. The work you do is very valuable for the FOSS community, as we could see, and not only in the lecture Jilayne gave.

With best regards
Karsten


---
Deutsche Telekom Technik GmbH  / Infrastructure Cloud
Karsten Reincke, Senior Expert Key Projects - Telekom Open Source Committee
[display complete signatur: http://opensource.telekom.net/kreincke/kr-dtag-sign-en.txt ]


Re: Today's SPDX General Meeting

Philip Odence
 

Philippe, we like to shoot for user presentations, but yours could be of interest sometime too.

On 4/6/17, 1:26 PM, "Philippe Ombredanne" <pombredanne@...> wrote:

On Thu, Apr 6, 2017 at 12:32 PM, Phil Odence
<podence@...> wrote:
> With no guest speaker this month, we will try to keep the meeting to 30
> minutes.

If that's of interest to the group, I would be happy to present our
efforts to build a better SPDX license mousetrap with the FOSS
ScanCode toolkit that I maintain.
This could be for the May meeting?
--
Cordially
Philippe Ombredanne




Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast Ltd, an innovator in Software as a Service (SaaS) for business. Providing a safer and more useful place for your human generated data. Specializing in; Security, archiving and compliance. To find out more Click Here.


Re: Today's SPDX General Meeting

Philippe Ombredanne
 

On Thu, Apr 6, 2017 at 12:32 PM, Phil Odence
<podence@...> wrote:
> With no guest speaker this month, we will try to keep the meeting to 30
> minutes.
If that's of interest to the group, I would be happy to present our
efforts to build a better SPDX license mousetrap with the FOSS
ScanCode toolkit that I maintain.
This could be for the May meeting?
--
Cordially
Philippe Ombredanne


Minutes from April SPDX General Meeting

Philip Odence
 

 

General Meeting/Minutes/2017-04-06

  • Attendance: 12
  • Led by Phil Odence
  • Minutes of March meeting approved 

 

Tech Team Report - Kate/Gary

  • Prepping for Google Summer of Code
    • 10 proposals
    • At least 3 or 4 are very promising, only 1 or 2 that aren’t that great
      • Examples: 
        • On-line validation tool (already engaged with community)
        • Automating SPDX gen in GitHub (already thought out architecture)
        • License grade
    • Mentors lined up for 2-4 slots we’ve requested
    • Will pick appropriate number based on slots

 

Outreach Team Report - Jack

  • Creating a developer-friendly area in Git for SPDX
    • Versions of the spec
    • Information exchange
      • following example of nexB creating an umbrella repo for the nexB family of repos
  • Talking about what we expect people to do
    • So we can make more definitive statements
    • for different market segments sort of
    • Will guide website

 

Legal Team Report - Jilayne

  • XML markup
    • Made some recent decisions
    • Down to fewer than 60 licenses
    • Still work to do and volunteers needed
  • Reception
    • Seeing interest on GitHub
    • Getting useful feedback on particulars of licenses
  • Challenge
    • Juggling two lists until next release
    • Unclear where to log updates
    • Enduring until next release
    • May make sense to post “Bear with us” message
      • in GitHub
      • and current repo
  • Timing
    • uncertain as it depends on getting the work done
    • plus a week or two of Gary once XML is done
  • Brad presented to ABA
    • Put a lot of time into slides
    • Should be posted on website

 

 

Attendees

  • Phil Odence, Black Duck
  • Kate Stewart, Linux Foundation
  • Paul Madick, Dimension Data
  • Jilayne Lovejoy, ARM
  • Jack Manbeck, TI
  • Michael Herzog, nexB
  • Mark Gisi, Wind River 
  • Alexios Zavras, Intel
  • Thomas Steenbergen, HERE
  • Matije Suklje, LF
  • Gary O’Neall, Source Auditor
  • Dave Marr, Qualcomm

 




Today's SPDX General Meeting

Philip Odence
 

Apologies for the late reminder.

With no guest speaker this month, we will try to keep the meeting to 30 minutes.

Phil

 

 

GENERAL MEETING

 

Meeting Time: Thurs, April 6, 8am PDT / 10 am CDT / 11am EDT / 15:00 UTC. http://www.timeanddate.com/worldclock/converter.html


Conf call dial-in:

Join the call: https://www.uberconference.com/katestewart

Optional dial in number: 877-297-7470

Alternate number: 512-910-4433

No PIN needed

 

Administrative Agenda

Attendance

Minutes Approval   http://wiki.spdx.org/view/General_Meeting/Minutes/2017-03-02

 

  

Technical Team Report – Kate/Gary

 

Legal Team Report – Jilayne/Paul

 

Business Team Report – Jack

 

Cross Functional Issues –All

 

 




March SPDX General Meeting Notes

Philip Odence
 

http://wiki.spdx.org/view/General_Meeting/Minutes/2017-03-02

 

 

 

General Meeting/Minutes/2017-03-02

  • Attendance: 11
  • Led by Phil Odence
  • Minutes of Feb meeting approved 

 

Special Presentation - Mark Charlebois / Rashmi Chitrakar, Qualcomm

  • Mark from corp R&D, Rashmi from the open source group
  • Mark works on Dronecode
    • Goal is to build with Yocto
    • Want to provide good license info
    • At the outset Yocto build only supported SPDX 1.0 and uses FOSSology for scanning
      • Yocto is a distribution that comes with recipes for custom builds 
    • Motivation
      • reducing scan times was key
      • FOSSology was taking as much as 6 days
      • Introducing LiD to address
  • (Deck is available)
  • Yocto
    • has a number of build stages
    • current integration was inserted after patch stage to only scan what’s patched
    • but that doesn’t allow for reusability
    • So, the approach was to scan upstream sources and focus scan on only patches
    • Uses Yocto archiver
  • FOSSology integration
    • Mark was not able to even get it going
    • Old, did not seem well maintained
  • New integration
    • Implements approach to 
    • Leverage newer SPDX capabilities 
      • Relationships between files
      • Usage info (e.g. dynamic library)
    • Allows for parallelizing across machines
    • Can flag discrepancies (e.g. two different licenses declared)
    • Goal
      • create a federated commons of pre-scanned code
      • so, everyone’s work is cut by, say, 90% (as they only need to scan their custom 10%)
  • LiD
    • Main Features of Scanners
      • They have access to FOSSology tools (Nomos, Monk)
      • Evaluated using Qualcomm code for testing
      • Nomos was pretty good at detecting license language (94%)
      • Monk, only about 25%
      • Used SPDX license list as source for license matching
    • Goal
      • Aiding in license compliance
      • Hope was to generate SPDX
  • Main functions
    • Scans source code to ID license language
    • Natural Language Processing “bag of words” approach
    • Jaccard index shows how well it matches
    • Levenshtein measures to determine where to start/end
    • Output: color-coded matches (and deviations)
    • Matched about as well as Nomos
    • Accuracy
      • Right license
      • Right region
    • Better than Nomos at extracting full text; Monk really fell short
    • Can be tuned
      • Based on LiD scores (1 = perfect)
        • Scores above 0.6 were pretty good, but user can adjust
      • Nomos, being regex-based, is very computationally expensive.
  • Will be available on GitHub
    • But available already
  • Q&A
    • What’s going on with Debian?
    • It’s being tested on Debian, not a lot of feedback yet
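
The minutes do not describe LiD's implementation, so the following is only a minimal sketch of the general technique named under "Main functions": comparing two texts as bags of words and scoring the overlap with what is presumably the Jaccard index. The function names, the tokenization rule and the sample texts are assumptions made for illustration.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lower-case the text and count each alphanumeric token (assumed tokenization)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Counter, b: Counter) -> float:
    """Multiset Jaccard index: shared token count divided by combined token count."""
    union = sum((a | b).values())  # per-token maximum of the two counts
    if union == 0:
        return 0.0                 # convention chosen here for two empty texts
    return sum((a & b).values()) / union  # per-token minimum of the two counts


reference = bag_of_words("Permission is hereby granted, free of charge, to any person")
candidate = bag_of_words("permission is hereby granted free of charge to any individual")

score = jaccard(reference, candidate)
print(f"score = {score:.2f}")
```

A score of 1 means the two bags are identical. A bag-of-words score ignores word order, which is presumably why a separate edit-distance step is listed for locating where a match starts and ends.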

 

Tech Team Report - Kate

  • Spec
    • Have been working on reference examples
      • Filling in how to do examples 
    • Spec being converted to DocBook for style
      • Mobile-friendly
    • Getting the spec up on GitHub so changes can be tracked, pull requests, etc
      • Eventually we’ll move there from Bugzilla for issue tracking
    • Face-to-face in Tahoe
      • Jilayne did a great presentation that is available as video, Kate’s as well
      • JSON format discussion
  • Tools
    • Talked through plans at face-to-face

 

Outreach Team Report - Jack

  • Accepted for Google Summer of Code
    • Starting to get interest
  • Short meeting last week
    • Talked about feedback from Matt’s project surveying companies
    • Need to decide if we will do a survey
    • Jack says we really need to look at the Ecosystem
      • Define user types and what to tell them they should do
      • Need to paint a picture of what success is with SPDX
      • Some feedback from site “I’m a developer, what do I do?”
  • Considering whether we need someone on the outreach team who is more OSS community-focused
    • Perhaps looking at “SPDX lite” (wrong word) sort of approach, an easy way to get started

 

Legal Team Report - Jilayne/Paul

  • Good meetings at Tahoe
    • 2 hour working session 
      • Action plan for XML conversion
      • How to completely connect the dots and organize upcoming tasks
  • Today’s call will follow up
  • Brad Edmondson developing deck and presenting to ABA group

 

Attendees

  • Mark Charlebois, Qualcomm
  • Rashmi Chitrakar, Qualcomm
  • Phil Odence, Black Duck
  • Kate Stewart, Linux Foundation
  • Philippe Ombrédanne, nexB
  • Paul Madick, Dimension Data
  • Jilayne Lovejoy, ARM
  • Jack Manbeck, TI
  • Michael Herzog, nexB
  • Mark Gisi, Wind River 
  • Thomas Steenbergen, HERE

 


[ANNOUNCE] SPDX has been accepted as a mentoring organization for the Google Summer of Code 2017

Philippe Ombredanne
 

Good news came in yesterday: SPDX has been accepted as a mentoring
organization for the Google Summer of Code 2017 thanks to Gary's hard
work. I look forward to contributing as an admin and mentor!

See https://summerofcode.withgoogle.com/organizations/6438746388955136/

Practically we should expect to have a couple or a few students
allocated to work and contribute to SPDX tools and technologies and
this is a great validation and recognition of the project and the
community efforts.

The next important date for the GSOC is March 20, 2017 when students
can start applying and submit their project proposals.

You can see all the accepted orgs here:
https://summerofcode.withgoogle.com/organizations/
The Linux Foundation's OpenPrinting project is also an accepted organization.

As a side note, http://aboutcode.org which is nexB's FOSS master
project has also been accepted as a mentoring organization and we have
several SPDX-related project ideas there too:
https://github.com/nexB/aboutcode/wiki/GSOC-2017

--
Cordially
Philippe Ombredanne


Thursday SPDX General Meeting Reminder; Guest Speakers Announcement

Philip Odence
 

Joining us for this month’s meeting will be Mark Charlebois and Rashmi Chitrakar from Qualcomm. They will talk about the current state of license scanning in Yocto, integrating LiD into Yocto, and improving the scanning integration with Yocto. They will also talk about the LiD scanner and how it compares to FOSSology.

 

 

GENERAL MEETING

 

Meeting Time: Thurs, March 2, 8am PDT / 10 am CDT / 11am EDT / 15:00 UTC. http://www.timeanddate.com/worldclock/converter.html


Conf call dial-in:

Join the call: https://www.uberconference.com/katestewart

Optional dial in number: 877-297-7470

Alternate number: 512-910-4433

No PIN needed

 

Administrative Agenda

Attendance

Minutes Approval  http://wiki.spdx.org/view/General_Meeting/Minutes/2017-02-02

 

Guest Presentation – Mark/Rashmi

 

Technical Team Report – Kate/Gary

 

Legal Team Report – Jilayne/Paul

 

Business Team Report – Jack

 

Cross Functional Issues –All

 

 

 


Thursday SPDX General Meeting Reminder

Philip Odence
 

All, I have a conflict, so Gary will be chairing this month’s meeting. Thanks, Gary!

 

 

GENERAL MEETING

 

Meeting Time: Thurs, Feb 2, 8am PDT / 10 am CDT / 11am EDT / 15:00 UTC. http://www.timeanddate.com/worldclock/converter.html


Conf call dial-in:

Join the call: https://www.uberconference.com/katestewart

Optional dial in number: 877-297-7470

Alternate number: 512-910-4433

No PIN needed

 

Administrative Agenda

Attendance

Minutes Approval  http://wiki.spdx.org/view/General_Meeting/Minutes/2017-01-05

 

Technical Team Report – Kate/Gary

 

Legal Team Report – Jilayne/Paul

 

Business Team Report – Jack

 

Cross Functional Issues –All

 

 

 


Re: Yocto/OE SPDX Presentation at OSLS

Manbeck, Jack
 

Craig,

 

Thanks. We're excited to see what you have. We would also love to have you join our tooling discussions. Let me follow up with you, as we are juggling the schedule at the moment.

 

Best regards,

 

Jack Manbeck

 

 

From: spdx-bounces@... [mailto:spdx-bounces@...] On Behalf Of Northway, Craig
Sent: Tuesday, January 24, 2017 12:49 PM
To: spdx@...
Cc: Charlebois, Mark
Subject: Yocto/OE SPDX Presentation at OSLS

 

Hi SPDX Team,

 

Mark Charlebois and I will be presenting at OSLS on our recent efforts to produce SPDX to support the Dronecode project. We have started work to integrate one of our internal license scanning tools, LiD, into Yocto/OE based on the existing Fossology bitbake integration. We plan to make our license scanning tool and our Yocto/OE integration available. We'll be presenting both on our scanning tool, and what we've learnt about how to best manage and author recipes to support license scanning and SPDX generation. You'll find details on us and our presentation here:

 

 

I am also keen on joining any relevant SPDX tooling discussions on Thursday of the summit to discuss how we can collaborate further in this space.

 

Thanks,

Craig

 

 


Yocto/OE SPDX Presentation at OSLS

Craig Northway
 

Hi SPDX Team,

Mark Charlebois and I will be presenting at OSLS on our recent efforts to produce SPDX to support the Dronecode project. We have started work to integrate one of our internal license scanning tools, LiD, into Yocto/OE based on the existing Fossology bitbake integration. We plan to make our license scanning tool and our Yocto/OE integration available. We'll be presenting both on our scanning tool, and what we've learnt about how to best manage and author recipes to support license scanning and SPDX generation. You'll find details on us and our presentation here:


I am also keen on joining any relevant SPDX tooling discussions on Thursday of the summit to discuss how we can collaborate further in this space.

Thanks,
Craig



Re: Open Source Leadership Summit (formerly known as Collab Summit)

Kate Stewart
 

Yes. We've got Thurs (16th) 11:15am-5pm reserved for SPDX.

More next week....  :-)

Kate

On Thu, Jan 5, 2017 at 5:54 PM, J Lovejoy <opensource@...> wrote:
Hi Jack,

Kate is still out, but I believe we have a room on Thursday reserved :)


Jilayne


On Jan 5, 2017, at 2:39 PM, Manbeck, Jack <j-manbeck2@...> wrote:

Jilayne,
 
We spoke with Kate about it on the outreach call before the end of the year. She was checking with the Linux Foundation to see what the plans were. I agree a meeting room for one day would be good.
 
-        Jack
 
 
 
From: spdx-bounces@....org [mailto:spdx-bounces@lists.spdx.org] On Behalf Of J Lovejoy
Sent: Thursday, January 05, 2017 2:46 PM
To: SPDX-general
Subject: Open Source Leadership Summit (formerly known as Collab Summit)
 
Hi All,
 
I should have thought to raise this on the General call today, but do we have a room or plan to have some F2F working session at this year’s Open Source Leadership Summit (formerly Collab Summit) - http://events.linuxfoundation.org/events/open-source-leadership-summit on Feb 14-16 in Lake Tahoe, CA?  We usually do, but it’s a bit earlier in the year, so not quite on the radar yet!
 
We discussed it briefly on the legal call and agreed it would be good to have a F2F, but not sure what the plan is for having something official set up.  As people need to make travel plans soon, thought I’d reach out via email.  I am planning on being there, FWIW.
 
Cheers,
Jilayne
 

SPDX Legal Team co-lead
opensource@...



_______________________________________________
Spdx mailing list
Spdx@...
https://lists.spdx.org/mailman/listinfo/spdx




--
Kate Stewart
Sr. Director of Strategic Programs,  The Linux Foundation
Mobile: +1.512.657.3669
Email / Google Talk: kstewart@...