Re: [License-discuss] [Infrastructure] Machine readable source of OSI approved licenses?
Sorry for the delayed response. I get these emails in digest form, so Phil asked me about a response before I had managed to see your question.
Here is the full answer and explanation of process and progress:
At (or a bit after) the LF Collab Summit last April, we finally came to some agreement on how much to rely upon the guidelines, how much markup to implement (as little as possible), and how to go about implementing it.
In terms of work to be done, we (the SPDX Legal Team) would need to review all of the licenses (well over 200) and determine which needed markup and where. Then someone would need to create the actual markup. We tried a couple of ways to go about this somewhat tedious process of reviewing the licenses - e.g. divvying up the review among team members - and I finally volunteered to do a first pass myself and then bring any issues or questions to the Legal Team (thinking this would be more efficient...). This also allows for some other, more aesthetic cleanup, the details of which I won't go into here, but from which OSI or others might also benefit in terms of how text displays on web pages.
Daniel German (of Ninka) volunteered to do the actual markup (which he started here: https://github.com/dmgerman/spdxTemplates ), but since he needs the SPDX Legal Team to advise him on what should be marked up in each license, his progress is blocked by us (well, really, by me, to be honest). Once all of that is done, a new version of the SPDX License List will be released, which will include the files that have markup.
In my defense :), my having fallen down on the job may not be all bad: it may enable one big integrated update for the next version of the SPDX License List, including some other inter-related initiatives we are working on, instead of several incremental ones. In any case, I am very intent on having this wrapped up by or shortly after Collab Summit.
Another thing that might be helpful to understand for your purposes is how the SPDX License List web pages are generated. The "master" list is kept in a spreadsheet with a corresponding .txt file for each license; the current version is located here: http://git.spdx.org/?p=license-list.git;a=summary . From this "raw data," a conversion tool by Gary O'Neall creates the much prettier list and individual license web pages that you actually see here: http://spdx.org/licenses/ . I am greatly understating Gary's work here, but that is the general gist of the process and order of things. (Admittedly, this needs to be better explained on the SPDX website - another item on the to-do list.)
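To make that order of things concrete, here is a rough Python sketch of the general shape of such a pipeline: a master list plus per-license text goes in, an index page plus one page per license comes out. This is not Gary's tool and does not reflect how it is actually written. The CSV layout, the column names (identifier, name), the Demo-* identifiers and the build_pages function are all made up for illustration, with a CSV string standing in for the spreadsheet and a dict standing in for the .txt files.

```python
import csv
import html
import io


def build_pages(master_csv, texts):
    """Return {filename: html} for an index page plus one page per license.

    master_csv: CSV text with 'identifier' and 'name' columns
                (hypothetical layout, standing in for the spreadsheet).
    texts:      dict mapping identifier -> license text
                (standing in for the per-license .txt files).
    """
    rows = list(csv.DictReader(io.StringIO(master_csv)))
    pages = {}
    links = []
    for row in rows:
        ident, name = row["identifier"], row["name"]
        # Escape the raw text so characters like & and < display literally.
        body = html.escape(texts[ident])
        pages[ident + ".html"] = (
            "<h1>" + html.escape(name) + "</h1>\n<pre>" + body + "</pre>\n"
        )
        links.append(
            '<li><a href="%s.html">%s</a></li>'
            % (html.escape(ident), html.escape(name))
        )
    pages["index.html"] = "<ul>\n" + "\n".join(links) + "\n</ul>\n"
    return pages


# Made-up sample data; the identifiers and texts are placeholders only.
MASTER = "identifier,name\nDemo-1.0,Demo License 1.0\nDemo-2.0,Demo License 2.0\n"
TEXTS = {
    "Demo-1.0": "Placeholder text for the first demo license.",
    "Demo-2.0": "Placeholder text with terms & conditions.",
}
PAGES = build_pages(MASTER, TEXTS)
```

Here PAGES holds three entries: index.html and one page per row of the master list. Someone wanting a machine-readable source would most likely work from the raw data directly rather than from the generated pages.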
Hope that information helps. Happy to discuss/explain more as needed.
SPDX Legal Team co-lead
cc SPDX-Legal list & dmg