Re: Element IDs


Kate Stewart
 

Thanks for summarizing this, David, and for commenting on it further, William.

Thanks, Bob, for the reminder on the call about pointing to the NTIA documents. Here's the existing white paper, where multiple months were spent debating some of the concepts we were talking about today:

See: 2 Global Software Component Identification
"A namespace is a way to partition names or identifiers. Component names within a namespace must be unique."

"A comprehensive component identification system will need to account for multiple names for the same component, likely through aliases or equivalency relationships."

There is also a good discussion of the use of PURLs, SWHIDs, CPEs, etc., which are viewed as partial solutions for component identification.
  
The current SPDX 2.2 approach supports the above whitepaper guidance, and I think we're pretty much in agreement today that we want to keep supporting this for 3.0.

Thanks,
Kate



On Tue, Aug 3, 2021 at 1:34 PM David Kemp <dk190a@...> wrote:
Kate may be right that we'll need a whitepaper.  But not just yet.  Here is what I heard today:

1) Namespaces must be globally unique, and UUIDs in practice collide, in part because their built-in structure leaves them with less than a full 128 bits of uniqueness, and in part because even 128 bits isn't enough.  So just for discussion/whitepaper purposes, assume that namespaces are 256-bit cryptographically random values.

2) Elements are always identified by namespace and local ID, where local means under the control of the namespace owner.  Don't get hung up on what owner means: anybody can become an owner by generating a 256-bit random number for their namespace.

3) Element identifiers are built into the model.  The diagram shows Element having an "idString" property, but what it should have is namespace and local_id properties that together form the primary key for the element.  Sebastian is correct, a separator isn't needed in the model because that only comes into play when serializing Element identifiers.

4) Each serialization of Element identifiers MUST allow them to be unambiguously deserialized back to namespace and local_id properties, which is where the separator comes in.  If the deserialized value doesn't have a namespace then the Element inherits it from its containing document, or if the element doesn't have a document or other source for namespace, the local_id by itself is invalid.

5) (My pet issue) Although a namespace owner can choose anything as local_ids, Element also has information such as Class, name, and comment that can be used as a hint/label when serializing Element IDs.

Building namespace and local_id into the model explicitly provides a foundation for discussing and accommodating all use cases.  IdString is an obstacle to that discussion.
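
To make points 1 through 4 concrete, here is a rough Python sketch, offered only for discussion. The names (ElementId, new_namespace, serialize, deserialize), the "#" separator, and the hex encoding of the namespace are all assumptions made for illustration; none of them come from the SPDX model or any agreed serialization.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

# Assumption: "#" is a hypothetical separator for this one serialization.
# The model itself has no separator (point 3).
SEPARATOR = "#"


def new_namespace() -> str:
    """Return a 256-bit cryptographically random namespace as 64 hex chars."""
    return secrets.token_hex(32)


@dataclass(frozen=True)
class ElementId:
    """Primary key for an element: (namespace, local_id)."""
    namespace: str
    local_id: str

    def __post_init__(self) -> None:
        # Keep this serialization unambiguous: the separator may not
        # appear in either part, and local_id may not be empty.
        if SEPARATOR in self.namespace or SEPARATOR in self.local_id:
            raise ValueError("separator not allowed in namespace or local_id")
        if not self.local_id:
            raise ValueError("local_id must not be empty")

    def serialize(self, document_namespace: Optional[str] = None) -> str:
        """Serialize; omit the namespace when it matches the document's."""
        if self.namespace == document_namespace:
            return self.local_id
        return f"{self.namespace}{SEPARATOR}{self.local_id}"


def deserialize(text: str, document_namespace: Optional[str] = None) -> ElementId:
    """Recover (namespace, local_id) from a serialized identifier (point 4)."""
    namespace, sep, local_id = text.partition(SEPARATOR)
    if sep:
        return ElementId(namespace, local_id)
    # No namespace in the serialized value: inherit from the document.
    if document_namespace is None:
        raise ValueError("local_id without a namespace source is invalid")
    return ElementId(document_namespace, text)
```

A reader holding the document namespace can round-trip both the full and the short form; a bare local_id with no document to inherit from is rejected, as described in point 4.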

Dave
