Re: Spec recommendation for paren encapsulation? (was: signifigance of nested parenthesis with only ORs?)

David A. Wheeler

Gary O'Neall [mailto:gary@...]:
If we have more than one line for a compound set of licenses, it would be
ambiguous if the text following the first line of a compound license is part
of the license expression or just some other text. To solve this ambiguity,
we introduced the parenthesis requirement for compound licenses only.
When this was discussed, we also considered requiring compound licenses
to be restricted to a single line, but we decided that would be too limiting.

Good to know! But you still don't need parentheses when the entire expression fits on a line (e.g., "MIT OR BSD-3-Clause"), and such expressions are used in the wild anyway, so tools should correctly interpret data in this form. In addition, if the purpose is to unambiguously find the end, you only need parentheses at the outermost layer. That is, "(A OR B OR C)" is more than enough for this case; you don't need "(A OR (B OR C))".
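
To illustrate why only the outermost parentheses matter for finding the end, here is a minimal Python sketch. The function name read_expression and its input shape (the text after the tag on the first line, followed by the subsequent lines of the file) are my own assumptions for illustration, not anything defined by the spec. It treats an unparenthesized value as ending with its line, and otherwise keeps reading until the parentheses balance.

```python
def read_expression(lines):
    """Return (expression, lines_consumed) for a Tag:value license field.

    `lines` is a hypothetical input: the text after the tag on the first
    line, followed by the lines that come after it in the file.
    """
    first = lines[0].strip()
    if not first.startswith("("):
        # No opening parenthesis: assume the expression ends with this line.
        return first, 1

    depth = 0
    parts = []
    for count, line in enumerate(lines, start=1):
        parts.append(line.strip())
        # Track nesting depth; the expression ends when it returns to zero.
        depth += line.count("(") - line.count(")")
        if depth <= 0:
            return " ".join(parts), count
    raise ValueError("unbalanced parentheses in license expression")
```

Inner parentheses change nothing here: the scan stops at the same place for "(A OR B OR C)" and "(A OR (B OR C))", because only the outermost pair brings the depth back to zero.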

I think the spec already says that "(A OR B OR C)" is the recommended form, not "(A OR (B OR C))", because the spec only says you should encapsulate *license expressions*. The spec *never* says you should parenthesize *compound expressions*. The spec *permits* this, but it *never* says you should do it.

For clarity, I think that the spec should be tweaked to make this a little more obvious. Basically, "license expression" should be changed to "license-expression" (note the additional hyphen) in this sentence:
For the Tag:value format, any license expression that consists of more than one license identifier and/or LicenseRef, should be encapsulated by parentheses: "( )".
I don't think this is a change in meaning, since it's clear that a "license expression" is a "license-expression", but it might help the reader go back and look at the actual syntax rules.

Tools still need to be *able* to parse SPDX license expressions that aren't completely surrounded by parentheses. People are likely to provide data without parentheses, and tools need to be able to correctly parse those cases. Many SPDX license expressions occur outside SPDX files (e.g., in package manager data), and in some of those formats it's not possible to have multi-line data anyway. Even where multi-line data is possible, we're often handling data provided by humans, and those darn humans don't always surround compound expressions with parens.
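
As a rough sketch of what "able to parse" could look like, here is a small recursive-descent parser in Python. It is my own illustration, not code from any SPDX tool: the function names and the tuple representation are hypothetical, operators are assumed to be uppercase only, WITH is assumed to bind tighter than AND and AND tighter than OR, and identifiers are not checked against the license list. Outer parentheses are optional, and nested uses of the same operator are flattened.

```python
import re

# A token is a parenthesis or a run of identifier characters.
TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.+:-]+)")


def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Parse an expression into a nested tuple such as ("OR", "A", "B")."""
    tree, rest = _parse_chain(tokenize(text), "OR")
    if rest:
        raise ValueError(f"unexpected token {rest[0]!r}")
    return tree


def _flatten(node, operator):
    # Merge "(A OR (B OR C))" style nesting into one operand list.
    if isinstance(node, tuple) and node[0] == operator:
        return list(node[1:])
    return [node]


def _parse_chain(tokens, operator):
    # OR chains are built from AND chains, AND chains from WITH terms.
    operand = _parse_with if operator == "AND" else (
        lambda toks: _parse_chain(toks, "AND"))
    node, tokens = operand(tokens)
    operands = _flatten(node, operator)
    while tokens and tokens[0] == operator:
        node, tokens = operand(tokens[1:])
        operands += _flatten(node, operator)
    if len(operands) == 1:
        return operands[0], tokens
    return (operator, *operands), tokens


def _parse_with(tokens):
    node, tokens = _parse_atom(tokens)
    if tokens and tokens[0] == "WITH":
        if len(tokens) < 2:
            raise ValueError("missing exception after WITH")
        return ("WITH", node, tokens[1]), tokens[2:]
    return node, tokens


def _parse_atom(tokens):
    if not tokens:
        raise ValueError("unexpected end of expression")
    if tokens[0] == "(":
        node, tokens = _parse_chain(tokens[1:], "OR")
        if not tokens or tokens[0] != ")":
            raise ValueError("missing closing parenthesis")
        return node, tokens[1:]
    if tokens[0] in (")", "AND", "OR", "WITH"):
        raise ValueError(f"unexpected token {tokens[0]!r}")
    return tokens[0], tokens[1:]
```

With this sketch, parse("MIT OR BSD-3-Clause") and parse("(MIT OR BSD-3-Clause)") give the same tree, and so do parse("(A OR B OR C)") and parse("(A OR (B OR C))"), which is the behavior argued for above.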

--- David A. Wheeler
