Official spdx xml schema?


Nicolaus Weidner
 

Hi all,

I was wondering whether there is any official (or semi-official) schema for the xml file format. My current assumption is that it follows the json schema wherever possible.
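I have to say "wherever possible" because some differences cannot be avoided. I am aware of the following:

- There needs to be a root element, so there is a top-level <Document> element containing all properties.
- List representations look weird to me: For example, one can currently find multiple successive <packages>...</packages> elements in one document (e.g. https://github.com/spdx/tools-java/blob/92e1c5d29eacb0081139b0c05e5e6270b231788c/testResources/SPDXXMLExample-v2.3.spdx.xml#L254-L276). I would prefer single <package> elements wrapped in an enclosing <packages> element at this point. Alternatively, at least each single element should be named using singular even if there is no enclosing element.
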
So, back to the original question: Is there any official definition for the xml schema? Or does one just look at the json schema and if in doubt, look at some examples (which seems dangerous to me, because examples can contain errors...)?

Best,
Nico


Gary O'Neall
 

Hi Nico,

I’m glad you asked – I put out a very preliminary XML schema, but I don’t feel qualified to create even a draft of an authoritative schema.

We could use some help here.

Here’s an issue tracking this in the SPDX Spec: https://github.com/spdx/spdx-spec/issues/615

Please feel free to pick this up.

I would like to generate the XML schema in a similar fashion to the JSON schema – I can definitely help with the Java code to generate once the schema itself is determined – but feel free to contribute to that as well.

Here’s a start at an XSD generator utility: https://github.com/spdx/tools-java/blob/master/src/main/java/org/spdx/tools/schema/OwlToXsd.java. Of course it will need to be updated once we know what the XSD is to look like.

Gary



David Kemp
 

Hi Nico,

I've been working on abstract schemas for SPDX v2 and v3.  The rationale for using an abstract schema is that it can mechanically generate concrete schemas for multiple data formats, including JSON, concise (machine-optimized) JSON, CBOR, and XML.  The tool does not yet have XML encoding rules built in, but they should be fairly easy to create, given examples of the desired output style.

The benefit of using an abstract schema is that encoding rules only need to be developed once and then can be applied to all information models, allowing updates from v2.2 to v2.3 and v3.0 without any serialization-specific work.

Have a look at https://github.com/davaya/spdxv3-template-tool/blob/main/Schemas/spdx-v2_2.jidl to get an idea of the abstract structure - it has a top-level Document type, and a packages property containing multiple PackageInfo elements.  Note that JSON does not have visible types, so type names in the schema are fairly arbitrary.  In XML, element types are visible, so adjusting the type names is possible without affecting JSON data.
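
To illustrate the point about visible type names, here is a minimal sketch that serializes one in-memory document to both JSON and XML. It is not the JIDL tool or its encoding rules: the SCHEMA table and the to_json/to_xml helpers are hypothetical, and only the names Document, packages and PackageInfo are taken from the linked schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical toy "abstract schema": property name -> (type name, is_list).
# Only Document, packages and PackageInfo come from the thread; the rest is
# invented for illustration. Non-list properties are treated as plain strings.
SCHEMA = {
    "Document": {"name": ("String", False), "packages": ("PackageInfo", True)},
    "PackageInfo": {"name": ("String", False)},
}


def to_json(value):
    """JSON encoding: property names appear, type names never do."""
    return json.dumps(value, sort_keys=True)


def to_xml(type_name, value):
    """XML encoding: every list item becomes an element named after its type."""
    elem = ET.Element(type_name)
    for prop, (prop_type, is_list) in SCHEMA[type_name].items():
        if prop not in value:
            continue
        if is_list:
            wrapper = ET.SubElement(elem, prop)
            for item in value[prop]:
                wrapper.append(to_xml(prop_type, item))
        else:
            ET.SubElement(elem, prop).text = str(value[prop])
    return elem


doc = {"name": "example", "packages": [{"name": "a"}, {"name": "b"}]}
json_text = to_json(doc)
xml_text = ET.tostring(to_xml("Document", doc), encoding="unicode")
print(json_text)
print(xml_text)
```

In this sketch, renaming the PackageInfo type in SCHEMA would change the XML element names but leave the JSON output untouched, which is the property described above.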

I'm interested in applying the information model to both JSON and XML data, since that is its purpose.  If you have some napkin sketch examples of what your preferred XML might look like, that would guide development of encoding rules to generate them while preserving JSON conformance.

Regards,
David




Nicolaus Weidner
 

Hi Gary, hi David,

thanks for the pointers! At least it's reassuring to hear I haven't
missed something obvious.

@Gary: I think I need to be a bit careful about not getting involved in
too many topics at once, so I'll pass for now. I'll be happy to chime
in though, and maybe I will have more capacity at some point.
I am also certainly not an expert in xml - what I know is derived
mostly from having to deal with Maven pom.xml files...

@David: Thanks for the link! I really like the idea of having one
abstract, format-agnostic schema that can then be converted to the
concrete formats. I assume tag/value and rdf are not in scope, since
those are (to my knowledge) quite different from json/yaml/xml?
I don't have much input for the xml schema so far; the only thing I
noticed is the list representation issue mentioned in the previous mail.

Regards,
Nico

On Mon, 2022-10-31 at 16:25 -0400, David Kemp wrote:

On Mon, Oct 31, 2022 at 2:05 PM Gary O'Neall <gary@...>
wrote:

From: Spdx-tech@... <Spdx-tech@...> On Behalf
Of Nicolaus Weidner via lists.spdx.org
Sent: Monday, October 31, 2022 7:16 AM
To: Spdx-tech@...
Subject: [spdx-tech] Official spdx xml schema?



Hi all,

I was wondering whether there is any official (or semi-official)
schema for the xml file format. My current assumption is that it
follows the json schema wherever possible.
I have to say "wherever possible" because some differences cannot
be avoided. I am aware of the following:

- There needs to be a root element, so there is a top-level
  <Document> element containing all properties.
- List representations look weird to me: For example, one can
  currently find multiple successive <packages>...</packages>
  elements in one document (e.g.
  https://github.com/spdx/tools-java/blob/92e1c5d29eacb0081139b0c05e5e6270b231788c/testResources/SPDXXMLExample-v2.3.spdx.xml#L254-L276).
  I would prefer single <package> elements wrapped in an enclosing
  <packages> element at this point. Alternatively, at least each
  single element should be named using singular even if there is no
  enclosing element.
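
The two list styles from the second point can be compared with a small sketch. The XML snippets are simplified and invented for illustration (real SPDX files carry many more properties), and package_names is a hypothetical helper, not part of any SPDX tool.

```python
import xml.etree.ElementTree as ET

# Style found in the linked example: repeated plural <packages> elements.
REPEATED = """<Document>
  <packages><name>a</name></packages>
  <packages><name>b</name></packages>
</Document>"""

# Style preferred in the mail: singular <package> items inside one
# enclosing <packages> element.
WRAPPED = """<Document>
  <packages>
    <package><name>a</name></package>
    <package><name>b</name></package>
  </packages>
</Document>"""


def package_names(xml_text):
    """Return the package names from either list style."""
    root = ET.fromstring(xml_text)
    # Try the wrapped style first; fall back to repeated <packages> elements.
    wrapped = root.findall("./packages/package")
    items = wrapped if wrapped else root.findall("./packages")
    return [item.findtext("name") for item in items]


repeated_names = package_names(REPEATED)
wrapped_names = package_names(WRAPPED)
print(repeated_names, wrapped_names)
```

A reader that has to accept both styles needs this kind of fallback, which is one reason an authoritative schema fixing a single style would help.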

So, back to the original question: Is there any official definition
for the xml schema? Or does one just look at the json schema and if
in doubt, look at some examples (which seems dangerous to me,
because examples can contain errors...)?

Best,
Nico

--
Dr. Nicolaus Weidner * nicolaus.weidner@...
TNG Technology Consulting GmbH, Beta-Str. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Thomas Endres
Aufsichtsratsvorsitzender: Christoph Stock
Sitz: Unterföhring * Amtsgericht München * HRB 135082