Serialization: Ontologies vs Datatypes


David Kemp
 

At the SPDX Serialization Meeting on 2023-03-16, Sean presented a slide deck that he and Alexios had created to explain JSON-LD and RDF concepts as they relate to SPDX. The presentation covered JSON, JSON-LD, context maps, and ontologies.

The discussion included Alexios' toy example of Coordinate, and whether it is a linked (referenceable) item in an ontology or a plain value. The two existing options in SPDX are:
1) SPDX Element instances are referenceable by SpdxId, an IRI.
2) Instances of Datatypes are plain values; they do not have reference IDs.

A third option was also discussed: Coordinate could be an ontological element (a subject or object, connected by a predicate in an ontology graph), in which case each instance of a Coordinate would need, in addition to its value (latitude and longitude), an IRI that is not an SpdxId.

The case of Central Park was discussed. It would be modeled as a "Place" with a name, an image, and an ordered list of Coordinates (not a set) that establishes its perimeter. The southeast corner of the park, at 5th Avenue and W. 58th St, has coordinate 40.76376383066618, -73.973564545299. This instance is not an ontological item, because the same point is also on the perimeter of 5th Avenue, which exists independently of Central Park. If it were considered a referenceable RDF item, the identical coordinate would have at least three different IRIs (as a coordinate of one park and two streets, plus innumerable people standing on that corner). Therefore it is not a referenceable item; it is just a value.
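The Central Park argument can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names, the field names, and every coordinate other than the southeast corner are made up, and the image is left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Coordinate:
    """A plain value: no identifier, compared field by field."""
    latitude: float
    longitude: float


@dataclass(frozen=True)
class Place:
    """Hypothetical 'Place' type; the perimeter is an ordered sequence."""
    name: str
    perimeter: Tuple[Coordinate, ...]


corner = Coordinate(40.76376383066618, -73.973564545299)
# Two further points with made-up values, for illustration only.
p2 = Coordinate(40.8, -73.95)
p3 = Coordinate(40.79, -73.98)

park = Place("Central Park", (corner, p2, p3))
# The avenue builds its own Coordinate from the same numbers.
avenue = Place("5th Avenue", (Coordinate(40.76376383066618, -73.973564545299), p2))

# The shared corner is the same value in both places; no IRI is involved.
same_corner = park.perimeter[0] == avenue.perimeter[0]

# Order matters for a perimeter: same points, different sequence.
reordered = Place("Central Park", (p2, corner, p3))
same_points = set(park.perimeter) == set(reordered.perimeter)
same_perimeter = park.perimeter == reordered.perimeter
```

Here `same_corner` and `same_points` are true while `same_perimeter` is false, which is why the perimeter has to be a list rather than a set.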

The SPDX model accurately expresses the semantics of Datatypes: CreationInformation, ExternalIdentifier, PositiveIntegerRange, etc. are data types.

These types have value-type / struct semantics: equality is determined by comparing values, and they MUST NOT be referenced by name across documents.
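To make the contrast between Elements and Datatypes concrete, here is a minimal Python sketch. The field names `begin`, `end`, `spdx_id`, and `name`, the `registry` dictionary, and the example.org IRI are assumptions for illustration and are not taken from the SPDX model.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PositiveIntegerRange:
    """Datatype: a struct with no identifier (field names are illustrative)."""
    begin: int
    end: int


@dataclass
class Element:
    """Element: carries an IRI that other data can point to."""
    spdx_id: str
    name: str


# Elements are found by IRI; the registry stands in for a document or graph.
registry: Dict[str, Element] = {}


def add(element: Element) -> None:
    """Register an Element under its IRI so it can be referenced later."""
    registry[element.spdx_id] = element


add(Element("https://example.org/spdx/pkg-1", "example package"))

# A reference to an Element is just its IRI.
ref = "https://example.org/spdx/pkg-1"
resolved = registry[ref]

# Datatype instances are compared by value; there is nothing to look up.
ranges_equal = PositiveIntegerRange(1, 10) == PositiveIntegerRange(1, 10)
```

An Element can be pointed at from anywhere that knows its IRI, whereas a range can only be embedded where it is used and compared by its contents.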

We should add an ordered list that cannot be a set (like the boundary of Central Park) to Alexios' toy serialization example, even if SPDX does not have any use cases for an ordered list, to ensure that the modeling methodology is complete.



Gary O'Neall
 

Thanks David for the additional info.

 

I was planning on allowing the fields of “data types” to be objects in RDF triples in SPDX 3.0.  The difference between Elements and “data types” would be whether URI types are required or whether the object can be an anonymous/blank node.
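One possible reading of this distinction, sketched in plain Python rather than with an RDF library: an Element is a subject named by an IRI, while a data type instance appears as the object of a triple and is itself an anonymous node. The `BlankNode` class, the example.org namespace, and the property names are all hypothetical.

```python
from itertools import count


class BlankNode:
    """An anonymous node: it has a local label but no IRI."""
    _ids = count()

    def __init__(self) -> None:
        self.label = f"_:b{next(BlankNode._ids)}"

    def __repr__(self) -> str:
        return self.label


EX = "https://example.org/"  # hypothetical namespace
pkg = EX + "pkg-1"           # Element: identified by an IRI
info = BlankNode()           # data type instance: no IRI

# Triples are (subject, predicate, object) tuples.
triples = [
    (pkg, EX + "creationInfo", info),
    (info, EX + "specVersion", "3.0"),
    (info, EX + "created", "2023-03-16T00:00:00Z"),
]

# Subjects that other documents could reference by IRI.
subjects_with_iri = {s for s, _, _ in triples if not isinstance(s, BlankNode)}
# Subjects that exist only inside this graph.
blank_subjects = {s.label for s, _, _ in triples if isinstance(s, BlankNode)}
```

Only `pkg` ends up in `subjects_with_iri`; the creation information is reachable solely through the triple that points at it.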

 

Is this consistent with the discussion in the Serialization meeting?

 

Gary

 

From: Spdx-tech@... <Spdx-tech@...> On Behalf Of David Kemp
Sent: Sunday, March 19, 2023 12:07 PM
To: SPDX-list <Spdx-tech@...>
Subject: [spdx-tech] Serialization: Ontologies vs Datatypes

 



David Kemp
 

Gary,

Based on the W3C wiki description, that is consistent.

1) Blank nodes must not have an IRI, like Coordinate and CreationInformation instances
2) Blank nodes may apply to more than one IRI (RDF graph node), like Coordinate and CreationInformation instances

I assume you can assign blank nodes to both an entire datatype and to fields within that datatype at any nesting level.
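As a rough illustration of that assumption: in JSON-LD, a node object without an `@id` is a blank node, and such objects can be nested. The sketch below walks a document held as a Python dict and lists the paths of the node objects that have no `@id`. The document contents are invented, and nesting a range inside the creation information is done only to show two levels.

```python
from typing import Any, Dict, Iterator, Tuple

doc = {
    "@id": "https://example.org/spdx/pkg-1",
    "@type": "Package",
    "creationInfo": {
        "@type": "CreationInformation",
        "created": "2023-03-16T00:00:00Z",
        # Nested only to show a second level; not an SPDX structure.
        "range": {"@type": "PositiveIntegerRange", "begin": 1, "end": 10},
    },
}


def node_objects(value: Any, path: str = "$") -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (path, dict) for every nested object, skipping '@' keywords."""
    if isinstance(value, dict):
        yield path, value
        for key, child in value.items():
            if not key.startswith("@"):
                yield from node_objects(child, f"{path}.{key}")
    elif isinstance(value, list):
        for i, child in enumerate(value):
            yield from node_objects(child, f"{path}[{i}]")


# Node objects without "@id" would become blank nodes in the RDF graph.
blank_paths = [p for p, node in node_objects(doc) if "@id" not in node]
```

For this document `blank_paths` is `["$.creationInfo", "$.creationInfo.range"]`: the top-level Element keeps its IRI, and both nested objects are anonymous.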

Regards,
David

On Mon, Mar 20, 2023 at 3:25 PM Gary O'Neall <gary@...> wrote:
