Re: Update on project: Validate license cross references

Mark D Baushke <mdb@...>

My comments are in-line. Look for MDB:

On Aug 9, 2020, at 12:33 PM, Smith Tanjong Agbor <stanjongagbor@...> wrote:

After discussing with our mentors, Steve and Gary, we thought it wise to seek everyone's opinion on two topics:

1. Change the isMatch field back to Boolean (true/false)
In the previous email thread on this project, Michael Kaelbling suggested that the "isMatch" field value be changed from boolean to text, and that the value should contain the result of the comparison (between the license text in the XML and that at each of the crossref URLs). He suggested that the values could be:
  • verbatim
  • noassertion – if no test result is available (for invalid links perhaps)
  • todo – no match attempted
  • “” – no match asserted
  • verbatim2 – matches with \r == \r\n == \n
  • verbatim3 – matches “ignoring whitespace differences” reflowed text
  • verbatim4 – matches ignoring decoration (comments, flower-boxes)
  • template – matches template verbatim (see ppalaga’s comment)
  • et cetera as they become available
We identified two issues with this approach:
a. The above results are not mutually exclusive, so we might be compelled to store those text values in a list.
ex: isMatch: [verbatim2, verbatim4, etc]
That said, we wondered: do we need all that information? Aren't we over-engineering?

b. Is such detailed information necessary? Parsing it will entail knowing all possible values, and any update to these values will require updating the projects that parse this information.
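
To illustrate point (a), here is a minimal sketch of how two of the proposed levels could be computed. The function names are hypothetical and not part of any SPDX tool; only the level names (verbatim, verbatim2, verbatim3) come from the list above. It shows that one pair of texts can satisfy several levels at once, which is why a single text value would not be enough.

```python
import re
from typing import List


def normalize_newlines(text: str) -> str:
    # Treat \r\n and bare \r as \n (the "verbatim2" idea).
    return text.replace("\r\n", "\n").replace("\r", "\n")


def collapse_whitespace(text: str) -> str:
    # Collapse every run of whitespace to one space (the "verbatim3" idea).
    return re.sub(r"\s+", " ", text).strip()


def match_levels(reference: str, candidate: str) -> List[str]:
    # Return every level that holds; the levels are not mutually exclusive.
    levels = []
    if reference == candidate:
        levels.append("verbatim")
    if normalize_newlines(reference) == normalize_newlines(candidate):
        levels.append("verbatim2")
    if collapse_whitespace(reference) == collapse_whitespace(candidate):
        levels.append("verbatim3")
    return levels
```

For example, two texts that differ only in line endings satisfy both verbatim2 and verbatim3, while texts that differ only in spacing satisfy verbatim3 alone.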

So we would like to know your thoughts on this, and whether storing this information is of utmost importance.


My opinion is that the isMatch operator should be true/false only.

I would also favor the addition of another operator with the name isValid for ensuring that the links exist.

If there is a need for the other functionality, then providing other operators may be desirable.

Perhaps isFuzzyMatch or listOfFuzzyMatches would deal with non-verbatim matches...
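
As a rough sketch only, a crossref record along these lines might look like the following. The class, the helper functions, and the example URL are assumptions made for illustration; only the names isMatch, isValid, and listOfFuzzyMatches come from this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CrossRef:
    url: str
    is_valid: bool = False  # the link exists and can be followed
    is_match: bool = False  # strict true/false match only
    fuzzy_matches: List[str] = field(default_factory=list)  # non-verbatim matches


def has_any_match(ref: CrossRef) -> bool:
    # A consumer that only needs a yes/no answer never has to know the
    # full set of fuzzy-match names.
    return ref.is_valid and (ref.is_match or bool(ref.fuzzy_matches))


def to_record(ref: CrossRef) -> Dict[str, object]:
    # Serialize using the field names discussed in this thread.
    return {
        "url": ref.url,
        "isValid": ref.is_valid,
        "isMatch": ref.is_match,
        "listOfFuzzyMatches": list(ref.fuzzy_matches),
    }


example = CrossRef(
    url="https://example.org/license",  # placeholder URL
    is_valid=True,
    fuzzy_matches=["verbatim2"],
)
```

Keeping isMatch boolean means existing parsers keep working, and new fuzzy-match names can be added to the list without changing the type of any field.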

In the end, it is desirable that a producer of a package built from F/OSS elements under multiple licenses be able to determine whether the package, if it were distributed, would be unable to comply with its source licenses (such as something built from a GPLv2.0-only + Apache1.1 set of sources).

To get to that level of usefulness, one needs to know which licenses are equivalent via isMatch or other such idioms.


2. HTML formatting of the crossref details
The progress I made on the project also concerned the HTML template (used to generate the SPDX website), which now displays the license crossref details.
Here is the 0BSD license on the website(
and Here is the updated license I have locally, with the crossref details:

So the questions that popped up were the following:
  • Do we need all this information displayed on the website?

I do not believe the extra material is needed. However, if it is 'free' and accurate, I do not mind it being present.
  • Do we need the isWayBackLink parameter (wayback links can be identified visually already)?

I am not a fan of information being conveyed only visually. I know many people who are visually impaired, whether color blind or blind, for whom funky icons hold no meaning whatever.

  • If the URL is not valid, we should not make it clickable (remove the anchor tag around the link)
It is better not to provide a link that cannot be followed.
  • Can we use an accordion to display url details?
No thank you.
  • Could we use icons to indicate truth values of fields?

ICONS are mostly evil if they are the only providers of information.
They CAN provide additional information for some kinds of users
who are looking for patterns in the visual presentation, but generally
have problems when the display device is not the one used by the
original graphic designer. Hint: Look at the wide variation in display
on the various kinds of mobile devices as compared with a high resolution 
graphics monitor.

So, design experts' ideas are welcome on this topic.

I am NOT a design expert.
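
For what it is worth, the rendering points above (no anchor for an invalid URL, and text rather than icon-only indicators) could be sketched as follows. This is an illustration only: the function name, its parameters, and the output markup are assumptions, not the actual SPDX website template.

```python
from html import escape


def render_crossref(url: str, is_valid: bool, is_wayback: bool) -> str:
    safe_url = escape(url, quote=True)
    if is_valid:
        # Only a URL that can be followed becomes a clickable anchor.
        link = f'<a href="{safe_url}">{safe_url}</a>'
    else:
        # An invalid URL is shown as plain text.
        link = safe_url
    # State the status in words so it does not depend on icons or color.
    status = "valid link" if is_valid else "invalid link"
    if is_wayback:
        status += ", Wayback Machine copy"
    return f"<li>{link} ({status})</li>"
```

Icons could still be added on top of this for those who find them helpful, as long as the text carries the same information.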

These were the two main topics that require your intervention and contributions.

Thank you for asking.

        Be safe, stay healthy,
        -- Mark
