A Novel Ontology for Enhanced SBOM Data Modeling with TOSCA
- Track: Software Bill of Materials (SBOM)
- Room: H.2213
- Day: Sunday
- Start: 11:20
- End: 11:40
- Video only: h2213
As everyone involved with SBOMs knows, accurate and efficient recording of data about software components is crucial for security, compliance, and operational efficiency. This presentation introduces a novel data modeling approach based on an ontology expressed in RDF, similar to the approach used by SPDX. The new model distinguishes between abstract software components and specific software packages, providing a more granular and efficient way to manage software metadata.

Traditionally, SBOMs have focused on recording information about software packages, such as "OpenSSL v3.0.1 distributed by Ubuntu" or "OpenSSL v3.1.1 distributed by Debian." However, this approach can lead to redundancy and inefficiency, particularly for licensing information and other metadata that is common across multiple versions and distributions of the same software.

Our ontology introduces the concept of a "component" as an abstract reference to a piece of software, distinct from a "package," which represents a specific version of the software distributed by a particular supplier. This distinction makes it possible to express relationships between different parts of the software ecosystem. For example, both "OpenSSL v3.0.1 distributed by Ubuntu" and "OpenSSL v3.1.1 distributed by Debian" can be linked to the abstract component "OpenSSL." This relationship-based approach not only improves the clarity and organization of SBOM data but also saves storage: common information, such as the licensing terms of a component, is stored once and referenced by multiple packages, eliminating redundancy.

We have applied the ontology in a new tool called TOSCA (The Open Source Component Aggregator), which aggregates and manages millions of data points about open-source components. The presentation will focus primarily on the ontology and its benefits; TOSCA will be mentioned as a proof of concept that shows the real-world applicability and advantages of this approach.
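The component/package distinction described above can be sketched as a handful of RDF-style triples. This is only an illustration under stated assumptions, not the actual ontology or TOSCA's implementation: the `ex:` prefix, the class names `ex:Component` and `ex:Package`, the properties `ex:packageOf`, `ex:version`, `ex:supplier`, and `ex:declaredLicense`, and the license value are all hypothetical placeholders. Triples are kept in a plain Python set rather than a real RDF store so the example stays self-contained.

```python
# Minimal sketch: triples as (subject, predicate, object) tuples.
# All "ex:" names are hypothetical, invented for this illustration.
triples = set()


def add(subject, predicate, obj):
    """Record one triple in the in-memory graph."""
    triples.add((subject, predicate, obj))


def objects(subject, predicate):
    """Return all objects for a given subject and predicate, sorted."""
    return sorted(o for (s, p, o) in triples if s == subject and p == predicate)


# The abstract component carries the shared metadata exactly once.
add("ex:OpenSSL", "rdf:type", "ex:Component")
add("ex:OpenSSL", "ex:declaredLicense", "Apache-2.0")  # illustrative value

# Each package is a specific version from a specific supplier,
# linked back to the abstract component.
for pkg, version, supplier in [
    ("ex:openssl-3.0.1-ubuntu", "3.0.1", "Ubuntu"),
    ("ex:openssl-3.1.1-debian", "3.1.1", "Debian"),
]:
    add(pkg, "rdf:type", "ex:Package")
    add(pkg, "ex:packageOf", "ex:OpenSSL")
    add(pkg, "ex:version", version)
    add(pkg, "ex:supplier", supplier)


def license_of(package):
    """Resolve a package's license by following its link to the component."""
    licenses = []
    for component in objects(package, "ex:packageOf"):
        licenses.extend(objects(component, "ex:declaredLicense"))
    return licenses


# Count how many license statements the graph stores in total.
license_statements = [t for t in triples if t[1] == "ex:declaredLicense"]
```

In this sketch both packages resolve their license through the shared component, so the graph holds a single license statement no matter how many versions or distributions are added.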
Speakers
Alexios Zavras (zvr)