2022-07-26 Tech Meeting Model Proposal


William Bartholomew (CELA)
 

This PR has the model I proposed at the end of today’s meeting:

https://github.com/spdx/spdx-3-model/pull/25

 

From the PR description:

This is the model I proposed at the end of the meeting. Notable changes:

  • Added the missing inheritance arrow from Collection to Element.
  • Renamed Document to SpdxDocument.
  • Added Bundle and moved BOM to inherit from Bundle to remove the implication that BOMs need to be a serialization root.
  • Moved SpdxDocument to inherit from Bundle since it is a special type of Bundle.
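
As a rough illustration of the inheritance changes listed above, the proposed hierarchy could be sketched in Python as follows. The use of Python classes and the docstrings are assumptions made for illustration only; the PR defines the model as a diagram, not as code.

```python
# Sketch of the class hierarchy described in the PR description.
# Only the inheritance relationships come from the PR; the class
# bodies are placeholders.

class Element:
    """Base type for all elements."""

class Collection(Element):
    """A collection of elements; itself an Element."""

class Bundle(Collection):
    """A collection that need not be a serialization root."""

class BOM(Bundle):
    """A bill of materials."""

class SpdxDocument(Bundle):
    """In this proposal, a special type of Bundle."""

# Walk the method resolution order to print the inheritance chain.
hierarchy = [cls.__name__ for cls in SpdxDocument.__mro__ if cls is not object]
print(" -> ".join(hierarchy))
```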

 

There is an open question on whether the distinction between SpdxDocument and Bundle is necessary, but this proposal doesn't force you to use SpdxDocument while providing a clear migration path from SPDX 2.x.

 

If, during the formal definition of the classes, it becomes clear that we don't need both, they can be collapsed together, but for now this would unblock us.

 

 

Regards,

 

William Bartholomew (he/him)

Principal Security Strategist

Global Cybersecurity Policy – Microsoft

 

My working day may not be your working day. Please don’t feel obliged to reply to this e-mail outside of your normal working hours.

 


David Kemp
 

SpdxDocument has the downloadUrl property that is not included in Bundle, which is why they are not the same.

They are also semantically different.  The collection of elements that make up a BOM/SBOM is serialization-independent and always the same for a given BOM.  The collection of elements that make up an SpdxDocument is arbitrary.  Bundle and SpdxDocument are both Collections, but:
  • Bundle is a logical collection: either the elements in a BOM or a set of elements that share some purpose/context (e.g., all Identities starting with a-c).
  • SpdxDocument describes a physical collection and lists the elements in a transfer unit file.
A Bundle can be split across many SpdxDocuments, or many Bundles can be merged into one SpdxDocument.  Or an SpdxDocument can have elements unrelated to Bundles, such as every element created in the past 5 minutes, generated by a cron job and used to keep Element Stores in sync.

If you make SpdxDocument inherit from Bundle, you destroy the semantic difference between logical and physical collections.  In order to unblock the discussion, you should:
  • show SpdxDocument inheriting from Collection
  • show the physical data structure described by SpdxDocument in the Data Structure side of the diagram: a single root object containing properties, including transfer unit creation date and actor, other property defaults, an array of [1..*] elementValues, and an array of [0..*] spdxDocumentReferences.
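
To make the logical/physical distinction concrete, here is a minimal Python sketch of the alternative described above, with SpdxDocument inheriting from Collection rather than from Bundle. It assumes elements are identified by plain string IDs; the `elements` attribute, the sample IDs, and the example.org URLs are hypothetical, and only `downloadUrl` is taken from the discussion.

```python
from dataclasses import dataclass, field

@dataclass
class Collection:
    # IDs of member elements (hypothetical representation).
    elements: set = field(default_factory=set)

@dataclass
class Bundle(Collection):
    """Logical collection: same members however it is serialized."""

@dataclass
class SpdxDocument(Collection):
    """Physical collection: the elements in one transfer unit file."""
    downloadUrl: str = ""

# One logical Bundle split across two transfer units.
bundle = Bundle({"pkg-a", "pkg-b", "file-c"})
doc1 = SpdxDocument({"pkg-a", "pkg-b"}, downloadUrl="https://example.org/part1.json")
doc2 = SpdxDocument({"file-c", "unrelated-x"}, downloadUrl="https://example.org/part2.json")

# The Bundle's members are covered by the two documents together,
# while doc2 also carries an element that belongs to no Bundle.
serialized = doc1.elements | doc2.elements
covered = bundle.elements <= serialized
extra = serialized - bundle.elements
print(covered, extra)
```

In this sketch neither collection type constrains the other: the Bundle's membership is fixed by its purpose, while each SpdxDocument simply records what happens to be in its file.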
Regards,
David




Dick Brooks
 

In my opinion, every artifact that follows the SPDX V n.n spec uses the SPDXDocument base type, from which all other SPDX artifacts inherit. Looking at the current model, one could infer that elements are not part of an SPDXDocument.

 

Thanks,

 

Dick Brooks

 

Active Member of the CISA Critical Manufacturing Sector,

Sector Coordinating Council – A Public-Private Partnership

 

Never trust software, always verify and report!

http://www.reliableenergyanalytics.com

Email: dick@...

Tel: +1 978-696-1788

 



David Kemp
 

Element is the type that all other element types inherit from.  SpdxDocument inherits from Element, as do Bundle, BOM, and SBOM.

SBOM is the root element of a logical collection of elements in a software bill of materials.  The physical format of a software bill of materials could be anything, including a web page or PDF file.  For machine readability and ease of human use, other physical formats are defined, including tag-value, spreadsheet, RDF, JSON, etc.

The canonicalization group is working to define unambiguous algorithmic rules for converting between physical formats, so that a signature of an element or element collection can be computed once, and tools using any physical format can validate that signature.  That unambiguous algorithmic conversion is enabled by a schema that applies to all data syntaxes.  The root of that schema is a "transfer unit" or lower-case "spdx document".  Upper-case SpdxDocument is the logical element type that is metadata about the spdx document file.
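
The idea of computing a signature input once, independent of physical format, can be illustrated with a small Python sketch. The canonical form used here (JSON with sorted keys and no insignificant whitespace, hashed with SHA-256) is only a stand-in chosen for illustration, since the actual rules are still being defined by the canonicalization group. The sample element and its property names are hypothetical.

```python
import hashlib
import json

def canonical_digest(element: dict) -> str:
    """Hash a stand-in canonical form of an element (not the SPDX rules)."""
    canonical = json.dumps(element, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# The same logical element as it might be parsed from two physical formats:
# property order differs, content does not.
from_json = {"id": "urn:example:pkg-a", "type": "Package", "name": "a"}
from_tag_value = {"name": "a", "type": "Package", "id": "urn:example:pkg-a"}

same = canonical_digest(from_json) == canonical_digest(from_tag_value)
print(same)
```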

Regards,
David






Dick Brooks
 

David,

 

Is it possible to have an Element that is not part of an SPDXDocument and still be a valid SPDX document?

 

Thanks,

 

Dick Brooks

 



David Kemp
 

Dick,

"Is it possible to have an Element that is not part of an SPDXDocument and still be a valid SPDX document?"

An Element exists in the logical Element graph, and the Element graph can be updated without ever serializing any elements.  So it is possible to have an Element that has never been serialized into any SPDX document.

An SpdxDocument element is metadata about an SPDX document (transfer unit).  A document can exist without ever creating an SpdxDocument element to describe it.  So if I interpreted the question correctly, yes.

If an SpdxDocument element is created to describe a document / transfer unit, it lists every Element that is serialized in that document.
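
A small Python sketch of this relationship follows. The in-memory `graph` dictionary, the `serialize` helper, and the `elements` key are hypothetical names introduced for illustration, not part of the SPDX model.

```python
# Logical element graph: element ID -> properties (hypothetical store).
graph = {
    "pkg-a": {"type": "Package"},
    "file-b": {"type": "File"},
    "pkg-c": {"type": "Package"},
}

def serialize(ids):
    """Build a transfer unit plus an SpdxDocument element describing it."""
    transfer_unit = [dict(graph[i], id=i) for i in ids]
    # The describing element lists every element serialized in the unit.
    spdx_document = {"type": "SpdxDocument", "elements": sorted(ids)}
    return transfer_unit, spdx_document

unit, doc = serialize(["pkg-a", "file-b"])

# "pkg-c" exists in the graph but has never been serialized.
never_serialized = set(graph) - set(doc["elements"])
print(never_serialized)
```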

Regards,
David




Gary O'Neall
 

Possible, yes, but with limitations and challenges.  One challenge with having an Element outside of an SPDXDocument is validation.  Once the canonicalization work is done, we may have a verification method that does not require a containing SPDXDocument.  Another challenge is locating the external SPDX element: without any locator information, finding the element would need to be done out of band.

 

Gary

 

 



Dick Brooks
 

Gary,

 

One SPDX Element that is vitally important to our processing of SPDX SBOMs is the SPDXVersion element.

Under the V 3.0 model, is it possible to have a valid SPDX Document without an SPDXVersion element?

 

Thanks,

 

Dick Brooks

 



Gary O'Neall
 

By SPDXVersion, do you mean the version of the SPDX specification?  If so, this is required in SPDX 2.X and I assume will be required in SPDX 3.0.  If you are referring to the package version, I didn’t see this in the current SPDX 3 model – perhaps I’m missing something?  I recall it being discussed in some depth.  Package version is an optional field in SPDX 2.X.

 

Gary

 



Dick Brooks
 

Gary,

 

I was specifically referring to SPDXVersion in section 6.1 in the V 2.2.2 spec.

 

Thanks,

 

Dick Brooks

 



David Kemp
 

Keep in mind that SPDX v2 is a schema - it defines the syntax of SPDX data, and it calls the top-level data type SPDXDocument.  There is no such thing as graph elements and edges in SPDX v2.

In contrast, SPDX v3 is a logical graph.  A collection element in that graph is semantically a "collection of elements", but data schemas are needed to define the various syntaxes that have "collection-ish" semantics.  Many schema types could represent collections: object and array are the two basic ones in JSON, and a schema specifies which to use and the details of how to use it.

That's a lot of detail to answer the question, but the point is that in SPDX v2, "SPDXDocument" is a data type.  In SPDX v3, "SpdxDocument" is an element that is metadata about a file containing data.  SPDX v3 has elements that describe packages and files and documents (called Package and File and SpdxDocument respectively).  It isn't too hard to keep in mind the difference between a package and the Package Element that describes it.  But for some reason it is nearly impossible to keep in mind the difference between a document and the SpdxDocument Element that describes it.

So William came up with the brilliant idea of calling a file that contains data a "transfer unit".   Normally it would be called "document", as in Word document or PDF document or XML document or JSON document, but due to that mental block, it is less ambiguous to call the file a "transfer unit".  The Element type that describes a transfer unit file is called "SpdxDocument".  And the schema type that defines the syntax of a transfer unit file is called TransferUnit.

SPDX v2 document syntax: SPDXDocument
SPDX v3 document syntax: TransferUnit
SPDX v3 element metadata about a transfer unit: SpdxDocument.

Clear as mud?

In any case, the transfer unit file contains a specVersion property.  And every logical Element contains a specVersion property.  To repeat a previous email, a proposed syntax for the v3 transfer unit is:

TransferUnit = Record
   1 namespace        IRI
   2 namespaceMap     NamespaceMap optional
   3 createdBy        ElementIRI [1..*]
   4 created          DateTime
   5 specVersion      SemVer                // Default value for all elements serialized in this file
   6 profiles         ProfileIdentifier [1..*]
   7 dataLicense      LicenseId
   8 elementValues    Element [1..*]        // All of the elements serialized in this file
   9 spdxFileId       ElementIRI optional
  10 spdxFileRefs     ElementIRI [0..*]

The transfer unit data has one copy of specVersion that is the default value for all element values that don't override it.

Logical elements always have full IRIs and explicit values for every property.  Serializing elements into data files factors out the common data to save space and make elements easier to read without the extra boilerplate. Deserializing expands the common data back into element values.
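
As a rough illustration of that factoring (a sketch only, not the proposed serialization): the Python below assumes a transfer unit is a plain dict carrying only the namespace, specVersion and elementValues fields from the record above, and that each serialized element has a hypothetical "id" field holding either a full IRI or a name relative to the namespace. The helper names expand and factor are made up for this example, and the other TransferUnit fields are left out for brevity.

```python
def expand(transfer_unit):
    """Deserialize: copy transfer-unit defaults into every element."""
    namespace = transfer_unit["namespace"]
    default_version = transfer_unit["specVersion"]
    elements = []
    for value in transfer_unit["elementValues"]:
        element = dict(value)
        # Assumed rule: an id without a scheme is relative to the namespace.
        if "://" not in element["id"]:
            element["id"] = namespace + element["id"]
        # An element keeps its own specVersion if it overrides the default.
        element.setdefault("specVersion", default_version)
        elements.append(element)
    return elements


def factor(elements, namespace, default_version):
    """Serialize: strip values that only repeat the transfer-unit defaults."""
    values = []
    for element in elements:
        value = dict(element)
        if value["id"].startswith(namespace):
            value["id"] = value["id"][len(namespace):]
        if value.get("specVersion") == default_version:
            del value["specVersion"]
        values.append(value)
    return {
        "namespace": namespace,
        "specVersion": default_version,
        "elementValues": values,
    }


unit = {
    "namespace": "https://example.org/spdx/",
    "specVersion": "3.0.0",
    "elementValues": [
        {"id": "pkg-1", "type": "Package"},
        {"id": "file-1", "type": "File", "specVersion": "3.1.0"},
    ],
}

logical = expand(unit)  # full IRIs, explicit specVersion on every element
round_trip = factor(logical, unit["namespace"], unit["specVersion"])
```

Here the Package element picks up the file-level specVersion, the File element keeps its own override, and factoring the expanded elements again gives back the original transfer unit.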

Regards,
David

On Wed, Jul 27, 2022 at 1:24 PM Dick Brooks <dick@...> wrote:

Gary,

 

I was specifically referring to SPDXVersion in section 6.1 in the V 2.2.2 spec.

 

Thanks,

 

Dick Brooks

 

Active Member of the CISA Critical Manufacturing Sector,

Sector Coordinating Council – A Public-Private Partnership

 

Never trust software, always verify and report!

http://www.reliableenergyanalytics.com

Email: dick@...

Tel: +1 978-696-1788

 

From: Gary O'Neall <gary@...>
Sent: Wednesday, July 27, 2022 1:21 PM
To: dick@...; 'David Kemp' <dk190a@...>
Cc: 'William Bartholomew (CELA)' <willbar@...>; 'SPDX-list' <spdx-tech@...>
Subject: RE: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

By SPDXVersion, do you mean the version of the SPDX specification?  If so, this is required in SPDX 2.X and I assume will be required in SPDX 3.0.  If you are referring to the package version, I didn’t see this in the current SPDX 3 model – perhaps I’m missing something?  I recall it being discussed in some depth.  It is an optional field in SPDX 2.X.

 

Gary

 

From: Spdx-tech@... <Spdx-tech@...> On Behalf Of Dick Brooks
Sent: Wednesday, July 27, 2022 9:55 AM
To: 'Gary O'Neall' <gary@...>; 'David Kemp' <dk190a@...>
Cc: 'William Bartholomew (CELA)' <willbar@...>; 'SPDX-list' <spdx-tech@...>
Subject: Re: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Gary,

 

One SPDX Element that is vitally important to our processing of SPDX SBOM’s is the presence of an SPDXVersion element.

 

Under the V 3.0 model, is it possible to have a valid SPDX Document, without an SPDXVersion element?

 

Thanks,

 

Dick Brooks

 


 

From: Gary O'Neall <gary@...>
Sent: Wednesday, July 27, 2022 12:48 PM
To: 'David Kemp' <dk190a@...>; dick@...
Cc: 'William Bartholomew (CELA)' <willbar@...>; 'SPDX-list' <spdx-tech@...>
Subject: RE: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Possible yes – but with limitations and challenges.  One challenge with having an Element outside of an SPDXDocument is validation.  Once the canonicalization work is done, we may have a verification method that does not require a containing SPDXDocument.  Another challenge is the location of the external SPDX element – without any locator information, finding the element would need to be done out of band.

 

Gary

 

 

From: Spdx-tech@... <Spdx-tech@...> On Behalf Of David Kemp
Sent: Wednesday, July 27, 2022 9:43 AM
To: dick@...
Cc: William Bartholomew (CELA) <willbar@...>; SPDX-list <spdx-tech@...>
Subject: Re: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Dick,

Is it possible to have an Element that is not part of an SPDXDocument and still be a valid SPDX document?

 

An Element exists in the logical Element graph, and the Element graph can be updated without ever serializing any elements.  So it is possible to have an Element that has never been serialized into any SPDX document.

An SpdxDocument element is metadata about an SPDX document (transfer unit).  A document can exist without ever creating an SpdxDocument element to describe it.  So if I interpreted the question correctly, yes.

If an SpdxDocument element is created to describe a document / transfer unit, it lists every Element that is serialized in that document.

Regards,
David

 

 

On Wed, Jul 27, 2022 at 10:51 AM Dick Brooks <dick@...> wrote:

David,

 

Is it possible to have an Element that is not part of an SPDXDocument and still be a valid SPDX document?

 

Thanks,

 

Dick Brooks

 


 

From: David Kemp <dk190a@...>
Sent: Wednesday, July 27, 2022 10:40 AM
To: dick@...
Cc: William Bartholomew (CELA) <willbar@...>; SPDX-list <spdx-tech@...>
Subject: Re: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Element is the type that all other element types inherit from.  SpdxDocument inherits from Element, as do Bundle, BOM, and SBOM.

SBOM is the root element of a logical collection of elements in a software bill of materials.  The physical format of a software bill of materials could be anything, including a web page or PDF file.  For machine readability and ease of human use, other physical formats are defined, including tag-value, spreadsheet, RDF, JSON, etc.

The canonicalization group is working to define unambiguous algorithmic rules for converting between physical formats, so that a signature of an element or element collection can be computed once, and tools using any physical format can validate that signature.  That unambiguous algorithmic conversion is enabled by a schema that applies to all data syntaxes.  The root of that schema is a "transfer unit" or lower-case "spdx document".  Upper-case SpdxDocument is the logical element type that is metadata about the spdx document file.

Regards,
David

 

On Wed, Jul 27, 2022 at 8:20 AM Dick Brooks <dick@...> wrote:

In my opinion every artifact that follows the SPDX V n.n spec uses the SPDXDocument base type that all other SPDX artifacts inherit from. Looking at the current model one could infer that elements are not part of an SPDXDocument.

 

Thanks,

 

Dick Brooks

 


Gary O'Neall
 

Actually – SPDX V2 is an RDF Ontology which is represented as a graph.  We also represent the spec in JSON Schema form – this may be the source of the confusion.  Note that we start with the OWL representation and generate the JSON schema, since the RDF OWL document has additional semantic information which cannot be represented in the JSON schema.

 

For the graph representation of SPDX version 2, I would suggest looking at the RDF OWL document.

 

In the SPDX V2 RDF OWL document, the SPDX Document is a logical element, not a data type.  You will find in the OWL document that the SPDX document MUST have an IRI and fully specified fields, just as described below.  It is more than just a collection; it also has properties related to its creation.

 

The semantics of the SPDX document are a bit different in the proposed 3.0 model – primarily due to a lot of the creation information being moved into the element.  But both the 2.0 and proposed 3.0 SPDX documents contain information about the transfer unit.

 

BTW - I still disagree with renaming any SPDX 2.0 class or property unless there is evidence it is causing confusion or issues in actual implementation, so I would prefer to keep SpdxDocument.

 

Regards,

Gary

 



William Bartholomew (CELA)
 

@'David Kemp' I want to clarify that SpdxDocument is independent of File. I can have an SpdxDocument that I serialize in three different formats (i.e. three files), or I can have an SpdxDocument that I never serialize (i.e. zero files). Conversely, a file may or may not contain an SpdxDocument. Like any other logical element, an SpdxDocument serialized into different formats is still the same SpdxDocument, even though the files are different.
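
To make that concrete, here is a minimal sketch, assuming a logical element can be modelled as a flat Python dict. The field names and the "key: value" text format are invented for the example (the latter is deliberately not the real SPDX tag-value format); only the json module calls are real.

```python
import json

# Hypothetical logical SpdxDocument element. It exists before, and
# independently of, any file it may later be written to.
document = {
    "id": "https://example.org/spdx/doc-1",
    "type": "SpdxDocument",
    "name": "example",
}


def to_json(element):
    return json.dumps(element, sort_keys=True)


def from_json(text):
    return json.loads(text)


def to_key_value(element):
    # Simplified "key: value" lines, one per property.
    return "\n".join(f"{key}: {value}" for key, value in element.items())


def from_key_value(text):
    # Split on the first ": " only, so IRIs containing ":" survive.
    return dict(line.split(": ", 1) for line in text.splitlines())


json_file = to_json(document)
text_file = to_key_value(document)

# Two different files, one logical element.
same_bytes = json_file == text_file
same_element = from_json(json_file) == from_key_value(text_file) == document
```

The two serialized strings differ, yet both parse back to the same logical element, and the element is just as valid if neither string is ever produced.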

 

 

Regards,

 

William Bartholomew (he/him) – Let’s chat


 



David Kemp
 

Gary,

Thanks!

From the RDF reference: "The Resource Description Framework (RDF) is a World Wide Web Consortium (W3C) standard originally designed as a data model for metadata.  It has come to be used as a general method for description and exchange of graph data."  and "It is based on the idea of making statements about resources."

So both v2 and v3 logical models are graphs where nodes make statements about resources.
A file is a resource, not a statement about a resource.  The structure and allowed values of a file are defined by a schema.

In the SPDX V2 RDF Owl document, the SPDX Document is a logical element, not a data type.  You will find in the OWL document that the SPDX document MUST have an IRI and fully specified fields, just as described below.

Yes, because OWL defines statements about resources.  But schemas, including JSON schema, define data types. The only nodes in the 2.3 graph are the root type plus package, file and snippet.  The data type of v2.3 serialized data forces package, file, and snippet nodes to appear inside a root node.  SPDX v3 allows a more expressive graph because many data types in 2.3 (relationships, annotations, identities, presumably licenses when the license profile is written) have been converted to element types.  Allowing each graph node to be serialized without having to be contained in a collection is another aspect of "more expressive".   SpdxDocument exists as an element type in v3, but the data type of a transfer unit doesn't force a file element to always be accompanied by an SpdxDocument element.  We've identified several use cases where that is unnecessary; both the OWL semantics and the schema data types should allow it when useful but not force it when it isn't needed.
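
A small sketch of the "allow but don't force" point, assuming the transfer unit is a Python dict with the elementValues field from the proposed TransferUnit record. The "id", "type" and "elements" field names and both helper functions are hypothetical, and whether an SpdxDocument element should list itself is left open here (this sketch lists only the other elements).

```python
# A transfer unit that serializes a File and a Relationship element
# without any SpdxDocument element describing the unit itself.
transfer_unit = {
    "namespace": "https://example.org/spdx/",
    "specVersion": "3.0.0",
    "elementValues": [
        {"id": "file-1", "type": "File"},
        {"id": "rel-1", "type": "Relationship"},
    ],
}


def has_document_element(unit):
    """True if the unit carries an SpdxDocument element."""
    return any(e["type"] == "SpdxDocument" for e in unit["elementValues"])


def with_document_element(unit, doc_id):
    """Return a copy of the unit that also carries an SpdxDocument element.

    The added element lists the ids of the elements already in the unit.
    """
    listed = [e["id"] for e in unit["elementValues"]]
    document = {"id": doc_id, "type": "SpdxDocument", "elements": listed}
    return {**unit, "elementValues": unit["elementValues"] + [document]}


described = with_document_element(transfer_unit, "doc-1")
```

Both transfer_unit and described would be acceptable under this reading: the first omits the SpdxDocument element because it isn't needed, and the second adds one where it is useful.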

Regards,
David



On Wed, Jul 27, 2022 at 4:01 PM Gary O'Neall <gary@...> wrote:

Actually – SPDX V2 is an RDF Ontology which is represented as a graph.  We also represent the spec in JSON Schema form – this may be the source of the confusion.  Note that we start with the OWL representation and generate the JSON schema since the RDF OWL document has additional semantic information which can not be represented in the JSON schema.

 

For the graph representation of SPDX version 2, I would suggest looking at the RDF OWL document.

 

In the SPDX V2 RDF Owl document, the SPDX Document is a logical element, not a data type.  You will find in the OWL document that the SPDX document MUST have an IRI and fully specified fields, just as described below.  It is more than just a collection, it also has properties related to its creation.

 

The semantics of SPDX document is a bit different in the proposed 3.0 model – primarily due to a lot of the creation information being moved into the element.  But both the 2.0 and proposed 3.0 SPDX documents contain information about the transfer unit.

 

BTW - I still disagree with renaming any SPDX 2.0 class or property unless there is evidence it is causing confusion or issues in actual implementation, so I would prefer to keep SpdxDocument.

 

Regards,

Gary

 

From: Spdx-tech@... <Spdx-tech@...> On Behalf Of David Kemp
Sent: Wednesday, July 27, 2022 12:43 PM
To: dick@...
Cc: Gary O'Neall <gary@...>; William Bartholomew (CELA) <willbar@...>; SPDX-list <spdx-tech@...>
Subject: Re: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Keep in mind that SPDX v2 is a schema - it defines the syntax of SPDX data, and it calls the top-level data type SPDXDocument.  There is no such thing as graph elements and edges in SPDX v2.

In contrast, SPDX v3 is a logical graph.  A collection element in that graph is semantically a "collection of elements", but data schemas are needed to define the various syntaxes that have "collection-ish" semantics.  Many schema types could represent collections (object and array are the two basic ones in JSON), and a schema specifies which to use and the details of how to use it.

That's a lot of detail to answer the question, but the point is that in SPDX v2, "SPDXDocument" is a data type.  In SPDX v3 "SpdxDocument" is an element which is metadata about a file containing data. SPDX v3 has elements that describe packages and files and documents (called Package and File and SpdxDocument respectively).  It isn't too hard to keep in mind the difference between a package and the Package Element that describes it.  But for some reason it is nearly impossible to keep in mind the difference between a document and a Document Element that describes it.

So William came up with the brilliant idea of calling a file that contains data a "transfer unit".   Normally it would be called "document", as in Word document or PDF document or XML document or JSON document, but due to that mental block, it is less ambiguous to call the file a "transfer unit".  The Element type that describes a transfer unit file is called "SpdxDocument".  And the schema type that defines the syntax of a transfer unit file is called TransferUnit.

SPDX v2 document syntax: SPDXDocument
SPDX v3 document syntax: TransferUnit
SPDX v3 element metadata about a transfer unit: SpdxDocument.

Clear as mud?

In any case, the transfer unit file contains a specVersion property.  And every logical Element contains a specVersion property.  To repeat a previous email, a proposed syntax for the v3 transfer unit is:

TransferUnit = Record
   1 namespace        IRI
   2 namespaceMap     NamespaceMap optional
   3 createdBy        ElementIRI [1..*]
   4 created          DateTime
   5 specVersion      SemVer                // Default value for all elements serialized in this file
   6 profiles         ProfileIdentifier [1..*]
   7 dataLicense      LicenseId
   8 elementValues    Element [1..*]        // All of the elements serialized in this file
   9 spdxFileId       ElementIRI optional
  10 spdxFileRefs     ElementIRI [0..*]

The transfer unit data has one copy of specVersion that is the default value for all element values that don't override it.

Logical elements always have full IRIs and explicit values for every property.  Serializing elements into data files factors out the common data to save space and make elements easier to read without the extra boilerplate. Deserializing expands the common data back into element values.
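
The expansion step described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the dict layout mirrors the proposed TransferUnit record, but the function name expand_defaults, the "id" and "type" keys on each element value, and the rule that a local id is resolved by prefixing the namespace are all hypothetical and not taken from any SPDX tool or from the proposal itself.

```python
import copy


def expand_defaults(transfer_unit):
    """Expand file-level defaults back into fully specified elements.

    Hypothetical helper: the key names below are assumptions for this sketch.
    """
    namespace = transfer_unit["namespace"]
    default_version = transfer_unit["specVersion"]
    expanded = []
    for value in transfer_unit["elementValues"]:
        element = copy.deepcopy(value)  # leave the serialized data untouched
        # Assumption: an id without a scheme is local to the file's namespace.
        if "://" not in element["id"]:
            element["id"] = namespace + element["id"]
        # The file-level specVersion applies unless the element overrides it.
        element.setdefault("specVersion", default_version)
        expanded.append(element)
    return expanded


unit = {
    "namespace": "https://example.org/spdx/",
    "createdBy": ["https://example.org/spdx/alice"],
    "created": "2022-07-27T16:00:00Z",
    "specVersion": "3.0.0",
    "profiles": ["core"],
    "dataLicense": "CC0-1.0",
    "elementValues": [
        {"id": "pkg-a", "type": "Package"},
        {"id": "file-b", "type": "File", "specVersion": "3.1.0"},
    ],
}
elements = expand_defaults(unit)
```

In this sketch the first element picks up the file-level specVersion and the second keeps its own override, and both end up with full IRIs, which is the property the logical model asks for.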

 

Regards,
David

 

On Wed, Jul 27, 2022 at 1:24 PM Dick Brooks <dick@...> wrote:

Gary,

 

I was specifically referring to SPDXVersion in section 6.1 in the V 2.2.2 spec.

 

Thanks,

 

Dick Brooks

 

Active Member of the CISA Critical Manufacturing Sector,

Sector Coordinating Council – A Public-Private Partnership

 

Never trust software, always verify and report!

http://www.reliableenergyanalytics.com

Email: dick@...

Tel: +1 978-696-1788

 

From: Gary O'Neall <gary@...>
Sent: Wednesday, July 27, 2022 1:21 PM
To: dick@...; 'David Kemp' <dk190a@...>
Cc: 'William Bartholomew (CELA)' <willbar@...>; 'SPDX-list' <spdx-tech@...>
Subject: RE: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

By SPDXVersion, do you mean the version of the SPDX specification?  If so, this is required in SPDX 2.X and I assume will be required in SPDX 3.0.  If you are referring to the package version, I didn’t see this in the current SPDX 3 model – perhaps I’m missing something?  I recall it being discussed in some depth.  It is an optional field in SPDX 2.X.

 

Gary

 

From: Spdx-tech@... <Spdx-tech@...> On Behalf Of Dick Brooks
Sent: Wednesday, July 27, 2022 9:55 AM
To: 'Gary O'Neall' <gary@...>; 'David Kemp' <dk190a@...>
Cc: 'William Bartholomew (CELA)' <willbar@...>; 'SPDX-list' <spdx-tech@...>
Subject: Re: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Gary,

 

One SPDX Element that is vitally important to our processing of SPDX SBOMs is the SPDXVersion element.

 

Under the V 3.0 model, is it possible to have a valid SPDX Document without an SPDXVersion element?

 

Thanks,

 

Dick Brooks

 


 

From: Gary O'Neall <gary@...>
Sent: Wednesday, July 27, 2022 12:48 PM
To: 'David Kemp' <dk190a@...>; dick@...
Cc: 'William Bartholomew (CELA)' <willbar@...>; 'SPDX-list' <spdx-tech@...>
Subject: RE: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Possible, yes – but with limitations and challenges.  One challenge with having an Element outside of an SPDXDocument is validation.  Once the canonicalization work is done, we may have a verification method that does not require a containing SPDXDocument.  Another challenge is the location of the external SPDX element – without any locator information, finding the element would need to be done out of band.

 

Gary

 

 

From: Spdx-tech@... <Spdx-tech@...> On Behalf Of David Kemp
Sent: Wednesday, July 27, 2022 9:43 AM
To: dick@...
Cc: William Bartholomew (CELA) <willbar@...>; SPDX-list <spdx-tech@...>
Subject: Re: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Dick,

Is it possible to have an Element that is not part of an SPDXDocument and still be a valid SPDX document?

 

An Element exists in the logical Element graph, and the Element graph can be updated without ever serializing any elements.  So it is possible to have an Element that has never been serialized into any SPDX document.

An SpdxDocument element is metadata about an SPDX document (transfer unit).  A document can exist without ever creating an SpdxDocument element to describe it.  So if I interpreted the question correctly, yes.

If an SpdxDocument element is created to describe a document / transfer unit, it lists every Element that is serialized in that document.

Regards,
David

 

 

On Wed, Jul 27, 2022 at 10:51 AM Dick Brooks <dick@...> wrote:

David,

 

Is it possible to have an Element that is not part of an SPDXDocument and still be a valid SPDX document?

 

Thanks,

 

Dick Brooks

 


 

From: David Kemp <dk190a@...>
Sent: Wednesday, July 27, 2022 10:40 AM
To: dick@...
Cc: William Bartholomew (CELA) <willbar@...>; SPDX-list <spdx-tech@...>
Subject: Re: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Element is the type that all other element types inherit from.  SpdxDocument inherits from Element, as do Bundle, BOM, and SBOM.

SBOM is the root element of a logical collection of elements in a software bill of materials.  The physical format of a software bill of materials could be anything, including a web page or PDF file.  For machine readability and ease of human use, other physical formats are defined, including tag-value, spreadsheet, RDF, JSON, etc.

The canonicalization group is working to define unambiguous algorithmic rules for converting between physical formats, so that a signature of an element or element collection can be computed once, and tools using any physical format can validate that signature.  That unambiguous algorithmic conversion is enabled by a schema that applies to all data syntaxes.  The root of that schema is a "transfer unit" or lower-case "spdx document".  Upper-case SpdxDocument is the logical element type that is metadata about the spdx document file.
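
To make the "compute the signature once, validate from any format" idea concrete, here is a rough Python sketch.  It is not the canonicalization group's algorithm: sorted-key compact JSON is used purely as a stand-in canonical form, and the toy "key: value" parser from_tag_value and the element fields are assumptions made for this illustration.

```python
import hashlib
import json


def canonical_digest(element):
    """Hash a stand-in canonical form (sorted keys, compact JSON, UTF-8)."""
    canonical = json.dumps(
        element, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def from_tag_value(text):
    """Parse a toy 'key: value' format into a dict (hypothetical format)."""
    element = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")  # split at the first colon only
        element[key.strip()] = value.strip()
    return element


# The same logical element written in two physical formats.
json_form = '{"name": "example", "id": "https://example.org/spdx/pkg-a"}'
tag_value_form = "id: https://example.org/spdx/pkg-a\nname: example\n"

digest_from_json = canonical_digest(json.loads(json_form))
digest_from_tag_value = canonical_digest(from_tag_value(tag_value_form))
```

Both inputs parse to the same abstract value, so the two digests match even though the bytes in the files differ.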

Regards,
David

 

On Wed, Jul 27, 2022 at 8:20 AM Dick Brooks <dick@...> wrote:

In my opinion every artifact that follows the SPDX V n.n spec uses the SPDXDocument base type that all other SPDX artifacts inherit from. Looking at the current model one could infer that elements are not part of an SPDXDocument.

 

Thanks,

 

Dick Brooks

 


Gary O'Neall
 

Hi David,

 

Agree with v3 having a more expressive graph – a definite improvement.  I do recall, in the early days of SPDX development, having an issue where RDF serializers would drop nodes that were not referenced, which helped lead us to having an SPDX Document as a root-level node in the graph.  We included this node in the logical as well as the serialization model for RDF.  I have a feeling we may (re)discover the same issue in v3 RDF serialization, where we will need a root collection to reference all the elements we intend to serialize.  We could decide to make this root element something not included in the logical model, or we could require custom RDF serialization libraries as alternatives.  My current thinking is that we have an SPDXDocument in the model for the serializations to “contain” all the elements we wish to serialize (in RDF, it just needs to be referenced).  To make this more flexible, we could expand the types of elements which the SPDXDocument could “contain” (e.g. Relationships).
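
The dropped-node concern can be illustrated with a small Python sketch.  It models a serializer that only writes what it can reach from a chosen root; that traversal rule, the function name reachable, and the node names are assumptions for illustration, not a description of how any particular RDF library behaves.

```python
def reachable(graph, roots):
    """Return every node reachable from the given roots by following edges."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return seen


# Edges point from a node to the nodes it references.
graph = {
    "pkg": ["file"],
    "file": [],
    "annotation": ["pkg"],  # refers to pkg, but nothing refers to it
}

# Walking from pkg never visits the annotation, so it would be dropped.
without_root = reachable(graph, ["pkg"])

# A root collection that references every element keeps all of them.
graph_with_doc = dict(graph, doc=["pkg", "file", "annotation"])
with_root = reachable(graph_with_doc, ["doc"])
```

Under these assumptions the root node does not need to nest the other elements; referencing them is enough for the walk to find them.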

 

Regards,

Gary

 

From: Spdx-tech@... <Spdx-tech@...> On Behalf Of David Kemp
Sent: Wednesday, July 27, 2022 3:48 PM
To: Gary O'Neall <gary@...>
Cc: dick@...; William Bartholomew (CELA) <willbar@...>; SPDX-list <spdx-tech@...>
Subject: Re: [spdx-tech] 2022-07-26 Tech Meeting Model Proposal

 

Gary,

Thanks!

From the RDF reference: "The Resource Description Framework (RDF) is a World Wide Web Consortium (W3C) standard originally designed as a data model for metadata.  It has come to be used as a general method for description and exchange of graph data."  and "It is based on the idea of making statements about resources."

So both v2 and v3 logical models are graphs where nodes make statements about resources.
A file is a resource, not a statement about a resource.  The structure and allowed values of a file is defined by a schema.

 

In the SPDX V2 RDF OWL document, the SPDX Document is a logical element, not a data type.  You will find in the OWL document that the SPDX document MUST have an IRI and fully specified fields, just as described below.

 

Yes, because OWL defines statements about resources.  But schemas, including JSON schema, define data types. The only nodes in the 2.3 graph are the root type plus package, file and snippet.  The data type of v2.3 serialized data forces package, file, and snippet nodes to appear inside a root node.  SPDX v3 allows a more expressive graph because many data types in 2.3 (relationships, annotations, identities, presumably licenses when the license profile is written) have been converted to element types.  Allowing each graph node to be serialized without having to be contained in a collection is another aspect of "more expressive".   SpdxDocument exists as an element type in v3, but the data type of a transfer unit doesn't force a file element to always be accompanied by an SpdxDocument element.  We've identified several use cases where that is unnecessary; both the OWL semantics and the schema data types should allow it when useful but not force it when it isn't needed.

 

Regards,
David

 

 


 


David Kemp
 

Gary,

That sounds strange, but I have no experience with RDF serializers.  Consider a physical junk drawer resource: it contains a paper clip, a button, and a pad of sticky notes.  Is there no way to make an RDF statement that describes a list of [paper clip, button, and sticky notes] (3 nodes) without also being forced to create a fourth node for the drawer?

RDF makes a distinction between containers and collections.  Based on one example (#20), the latter might be an anonymous list of items, i.e., items a and b have IRIs, but the collection of a and b doesn't have an IRI.  Then again, I might be totally confused.  If a collection is not required to be a node, then that is what I'm proposing for SpdxDocument.  And if a collection is not a node, the Collection class should not have an open arrow to (be a subclass of) Element, whereas if it were called Container it would.

Regards,
David




Dick Brooks
 

David,

I have a different perspective. Using your drawer analogy, IMO we are creating a document that lists the items in the drawer using element constructs.

Dick Brooks




David Kemp
 

Dick,

That is correct.  To follow the RDF example, call it a fruit drawer in a refrigerator, with elements a: Apple, b: Banana, c: Cherry and d: Drawer.

Element d is the SpdxDocument Element of type Container in both v2 and v3 graphs.

There is also a file that is the serialization of a graph using various syntaxes.  In v2, the JSON serialized data is defined by a JSON schema, and the schema follows the v2 graph where you cannot serialize elements without including a container element.  The serialized file conforms to a schema data type we now call TransferUnit.  A schema is not a graph and a data type is not a graph node (meaning data types are not constrained to be Elements).

V3 should be more flexible than v2.  Elements with the same purpose still exist in v3, but the schema should define TransferUnit as a list, not a container.  I believe the RDF type "collection" is a statement about a list. The list can include a container element or not as appropriate for the use case.  Valid transfer units could carry the following elements:

[a]
[a, d]
[b]
[b, d]
[a, b]
[a, b, d]
[a, b, c]
[a, b, c, d]

In SPDX v2, the only valid transfer units are those that include SpdxDocument element d.  In SPDX v3, any combination of elements should be valid.  It should be possible to serialize Element a, or Elements a and b, with or without Element d.
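
The two rules can be written down as a short Python sketch using the fruit drawer elements.  The function names valid_v2 and valid_v3 are hypothetical, and the non-empty requirement is an assumption drawn from the [1..*] multiplicity on elementValues in the proposed TransferUnit record.

```python
from itertools import combinations

# Elements from the fruit drawer example; d is the SpdxDocument element.
ELEMENT_TYPES = {"a": "Apple", "b": "Banana", "c": "Cherry", "d": "SpdxDocument"}


def valid_v2(unit):
    """v2 rule as described: an SpdxDocument element must be present."""
    return any(ELEMENT_TYPES[e] == "SpdxDocument" for e in unit)


def valid_v3(unit):
    """Proposed v3 rule: any non-empty list of known elements is acceptable."""
    return len(unit) > 0 and all(e in ELEMENT_TYPES for e in unit)


# Every non-empty combination of the four elements.
all_units = [list(c) for n in range(1, 5) for c in combinations("abcd", n)]
v2_ok = [u for u in all_units if valid_v2(u)]
v3_ok = [u for u in all_units if valid_v3(u)]
```

Of the 15 non-empty combinations, only the 8 that include d pass the v2 rule, while all 15 pass the proposed v3 rule.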

Regards,
David



David Kemp
 

William,
We're almost on the same page.

On Wed, Jul 27, 2022 at 4:35 PM William Bartholomew (CELA) via lists.spdx.org <willbar=microsoft.com@...> wrote:

@'David Kemp' I want to clarify that SpdxDocument is independent of File. [Yes]  

I can have an SpdxDocument that I can serialize in three different formats (e.g. three files), or I can have an SpdxDocument that I never serialize (e.g. zero files). [Yes]


[dpk] I can serialize an SpdxDocument in ten different files (in one format or multiple formats).  SpdxDocument is a logical element like all other elements.

Inversely, a file may or may not contain an SpdxDocument. [Yes]

Like any other logical element, an SpdxDocument serialized into different formats is still the same SpdxDocument, even though the files are different. [Yes]


[dpk]
* A logical element is a statement about a resource.
* A file is a resource (an instance of an SpdxFile / TransferUnit datatype), not a logical element.
* A datatype instance is identified by its value, not by an IRI.
* An abstract datatype instance is identified by its abstract value, not by an IRI.
* Canonicalization defines how to convert files in many formats to one instance of the SpdxFile / TransferUnit abstract datatype.

* RDF semantics defines collection as a datatype identified by its value, not an element identified by an IRI: "Collections differ from containers in ... allowing applications to determine the exact set of items in the collection."  SpdxDocument is an RDF container with an IRI.  SpdxFile / TransferUnit is an RDF collection, a datatype identified by the value of the elements in the collection, not by the IRI of a container.
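
The identity distinction in the points above can be sketched in Python.  This only illustrates the argument as stated here, not RDF semantics as such: the Container class is hypothetical, and modeling a collection as a frozenset assumes that member order is not significant.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Container:
    """A node identified by its IRI; membership does not affect identity."""
    iri: str
    members: frozenset = field(compare=False)


# Same IRI, different members: still the same container.
doc1 = Container("https://example.org/doc1", frozenset({"a", "b"}))
doc1_updated = Container("https://example.org/doc1", frozenset({"a", "b", "c"}))
# Different IRI, same members: a different container.
doc2 = Container("https://example.org/doc2", frozenset({"a", "b"}))

# A collection has no IRI; it is identified purely by the elements it holds.
unit_x = frozenset({"a", "b"})
unit_y = frozenset({"b", "a"})

same_container = doc1 == doc1_updated
different_container = doc1 != doc2
same_collection = unit_x == unit_y
```

In this sketch two transfer units carrying the same elements are the same value, while two SpdxDocument elements are the same only if they share an IRI.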

Regards,

 

William Bartholomew (he/him) – Let’s chat

Principal Security Strategist

Global Cybersecurity Policy – Microsoft


Regards,
David