API Reference

This section documents the public API of OntoCheck, organized by assessment category. Each metric function takes a Turtle file path as input and either returns a result (a score or a dictionary) or prints a diagnostic report to the terminal.

Assessment Runners

Functions for running each of the four assessment modes.

Ontology Assessment Runner

Provides runner functions for the four OntoCheck assessment modes:

  • Mode 1 – Task-agnostic: structural, labeling, accessibility, and naming-convention metrics applied to a single ontology.

  • Mode 2 – Task-specific Web ontology: task-based Relevance/Accuracy validated against a knowledge graph (e.g., DBpedia via LC-QuAD).

  • Mode 3 – Task-based Scientific: domain ontology assessed against competency questions encoded as SPARQL queries.

  • Mode 4 – Cross-Domain: multiple ontologies merged and assessed against cross-domain competency questions.

run_ontology_assessment(ttl_file, metrics, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')

Run task-agnostic metrics on a single ontology (Mode 1).

Parameters:
  • ttl_file (str) – Path to the input Turtle (.ttl) ontology file.

  • metrics (list of str or str) – Metric names to execute, or "all" to run every metric in METRIC_DISPATCHER.

  • output_log_file (str, optional) – Output log file path.

  • output_csv_file (str, optional) – Output CSV file path.

run_web_ontology_assessment(ttl_file, questions, domain_prefixes, knowledge_graph, domain_ns_fragments=None, metrics=None, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')

Assess a Web ontology against KGQA benchmark queries (Mode 2).

Runs the task-based Relevance/Accuracy assessment using competency queries drawn from a knowledge-graph question-answering benchmark (e.g., LC-QuAD over DBpedia). Optionally runs task-agnostic metrics as well.

Parameters:
  • ttl_file (str) – Path to the ontology Turtle file.

  • questions (str or list of str) – Path to a JSON/Markdown file of SPARQL queries, or a list of raw SPARQL query strings.

  • domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries (e.g., ["dbo"]).

  • knowledge_graph (str) – Path to the knowledge-graph file (Turtle/RDF) used for validation context.

  • domain_ns_fragments (list of str or None, optional) – Namespace URI fragments to restrict domain-term filtering.

  • metrics (list of str or None, optional) – Task-agnostic metric names to run alongside the task-based assessment. Pass "all" for every available metric.

  • output_log_file (str, optional) – Output log file path.

  • output_csv_file (str, optional) – Output CSV file path.

run_task_based_assessment(ttl_files, questions, domain_prefixes, domain_ns_fragments=None, metrics=None, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')

Assess one or more ontologies against competency questions (Modes 3/4).

When a single ontology is provided this corresponds to Mode 3 (task-based scientific assessment). When multiple ontologies are provided they are merged and evaluated jointly, corresponding to Mode 4 (cross-domain assessment).

Parameters:
  • ttl_files (str or list of str) – Path(s) to Turtle (.ttl) ontology file(s). A single path is accepted and will be wrapped in a list internally.

  • questions (str or list of str) – Path to a JSON/Markdown file of SPARQL queries, or a list of raw SPARQL query strings.

  • domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries (e.g., ["mds"]).

  • domain_ns_fragments (list of str or None, optional) – Namespace URI fragments to restrict domain-term filtering.

  • metrics (list of str or None, optional) – Task-agnostic metric names to run alongside the task-based assessment. Pass "all" for every available metric.

  • output_log_file (str, optional) – Output log file path.

  • output_csv_file (str, optional) – Output CSV file path.

Returns:

The result dictionary from task_based_metric_v_0_0_1.

Return type:

dict
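The single-path convenience and the Mode 3 / Mode 4 split can be pictured with a small sketch. The helper names normalize_ttl_files and assessment_mode are hypothetical and not part of OntoCheck. The sketch only mirrors the documented behavior, assuming a lone string or Path is wrapped in a list and that the number of files alone decides the mode.

```python
from pathlib import Path
from typing import List, Sequence, Union

PathLike = Union[str, Path]


def normalize_ttl_files(ttl_files: Union[PathLike, Sequence[PathLike]]) -> List[str]:
    """Hypothetical helper: wrap a single path in a list, pass sequences through."""
    if isinstance(ttl_files, (str, Path)):
        return [str(ttl_files)]
    return [str(p) for p in ttl_files]


def assessment_mode(ttl_files: Union[PathLike, Sequence[PathLike]]) -> int:
    """Return 3 for one ontology and 4 for several (illustrative labels only)."""
    files = normalize_ttl_files(ttl_files)
    if not files:
        # Rejecting an empty list is an assumption of this sketch.
        raise ValueError("at least one Turtle file is required")
    return 3 if len(files) == 1 else 4
```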

Task-Based Metric

The underlying Relevance/Accuracy computation used by Modes 2, 3, and 4.

Task-Based Ontology Assessment Metric

Evaluates an ontology against a set of competency questions (encoded as SPARQL queries) by computing term-overlap metrics. For each question set, two scores are produced:

Relevance (Recall) = |T_a ∩ T_o| / |T_a|

Accuracy (Precision) = |T_a ∩ T_o| / |T_o|

where T_a is the set of domain terms referenced in the SPARQL queries (the “task vocabulary”) and T_o is the set of domain terms defined in the ontology.
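As a worked illustration of the two formulas, the sketch below computes both scores from plain Python sets. The function name, the toy term names, and the choice to return 0.0 for an empty set are assumptions made for this example. It is not the OntoCheck implementation.

```python
from typing import Dict, Set


def relevance_and_accuracy(task_terms: Set[str], onto_terms: Set[str]) -> Dict[str, object]:
    """Compute the overlap scores from T_a (task_terms) and T_o (onto_terms)."""
    shared = task_terms & onto_terms
    # Returning 0.0 for an empty set is an assumption of this sketch.
    relevance = len(shared) / len(task_terms) if task_terms else 0.0
    accuracy = len(shared) / len(onto_terms) if onto_terms else 0.0
    return {
        "relevance": relevance,
        "accuracy": accuracy,
        "missing_from_onto": task_terms - onto_terms,
        "unused_in_onto": onto_terms - task_terms,
    }


# Toy vocabularies with made-up term names.
T_a = {"Sample", "Instrument", "hasUnit"}
T_o = {"Sample", "Instrument", "Material", "Process"}
scores = relevance_and_accuracy(T_a, T_o)
```

Here two of the three task terms are defined in the ontology (Relevance = 2/3), and two of the four ontology terms are used by the tasks (Accuracy = 1/2).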

Questions can be supplied as:
  • A path to a JSON file where each item contains a sparql_query key.

  • A path to a Markdown file with SPARQL queries inside fenced sparql code blocks.

  • A plain list of SPARQL query strings.
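The three accepted input forms could be dispatched roughly as follows. This is a minimal sketch using only the standard library. The function names, the regular expression, and the exact error message are assumptions, and OntoCheck's own loader may behave differently in edge cases.

```python
import json
import re
from pathlib import Path
from typing import List, Sequence, Union

FENCE = "`" * 3  # three backticks, written indirectly so this listing stays intact
SPARQL_BLOCK = re.compile(FENCE + r"sparql[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def queries_from_json_text(text: str) -> List[str]:
    """Each array element is expected to carry a 'sparql_query' key."""
    return [item["sparql_query"] for item in json.loads(text)]


def queries_from_markdown_text(text: str) -> List[str]:
    """Collect the bodies of fenced sparql code blocks."""
    return [body.strip() for body in SPARQL_BLOCK.findall(text)]


def load_queries(questions: Union[str, Path, Sequence[str]]) -> List[str]:
    """Dispatch on the three accepted forms; anything else raises ValueError."""
    if isinstance(questions, (list, tuple)):
        return list(questions)
    if isinstance(questions, (str, Path)):
        path = Path(questions)
        if path.suffix.lower() == ".json":
            return queries_from_json_text(path.read_text(encoding="utf-8"))
        if path.suffix.lower() == ".md":
            return queries_from_markdown_text(path.read_text(encoding="utf-8"))
    raise ValueError("questions must be a list, a JSON path, or a Markdown path")
```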

task_based_metric_v_0_0_1(ttl_file, questions, domain_prefixes, domain_ns_fragments=None)

Compute task-based Relevance and Accuracy for an ontology.

Given an ontology (one or more Turtle files) and a set of competency questions expressed as SPARQL queries, this function computes two term-overlap metrics:

Relevance (Recall) = |T_a ∩ T_o| / |T_a|

Accuracy (Precision) = |T_a ∩ T_o| / |T_o|

where T_a is the union of domain terms extracted from all SPARQL queries and T_o is the set of domain terms defined in the ontology.
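One plausible way to build T_a is to scan each query for prefixed names whose prefix is listed in domain_prefixes. The regex below is an assumption made for illustration, and how OntoCheck actually tokenizes queries is not documented here. The sketch shows only the "union over all queries" idea.

```python
import re
from typing import Iterable, List, Set


def extract_task_terms(queries: Iterable[str], domain_prefixes: List[str]) -> Set[str]:
    """Union of local names that appear as prefix:LocalName in any query."""
    if not domain_prefixes:
        return set()
    alternatives = "|".join(re.escape(p) for p in domain_prefixes)
    # The lookbehind keeps 'xmds:Foo' from matching when the prefix is 'mds'.
    pattern = re.compile(r"(?<![\w-])(?:" + alternatives + r"):([A-Za-z_][\w-]*)")
    terms: Set[str] = set()
    for query in queries:
        terms.update(pattern.findall(query))
    return terms


EXAMPLE_QUERIES = [
    "PREFIX mds: <https://example.org/mds/>\n"
    "SELECT ?s WHERE { ?s a mds:Sample ; mds:hasUnit ?u . }",
    "SELECT ?t WHERE { ?t a mds:Sample ; rdfs:label ?l . }",
]
```

PREFIX declarations are skipped naturally, because the prefix there is followed by a space and an IRI rather than a local name.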

Parameters:
  • ttl_file (str, pathlib.Path, or list thereof) – Path(s) to Turtle (.ttl) ontology file(s). A single string or Path is automatically wrapped in a list.

  • questions (str, pathlib.Path, or list of str) –

    The competency questions to evaluate against. Accepted forms:

    • str / Path ending in .json – path to a JSON file where each array element has a sparql_query key.

    • str / Path ending in .md – path to a Markdown file with SPARQL queries inside fenced sparql code blocks.

    • list of str – raw SPARQL query strings.

  • domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries to identify domain terms (e.g., ["mds"]).

  • domain_ns_fragments (list of str or None, optional) – Sub-strings of namespace URIs used to restrict which ontology terms count as domain-specific. When None, every non-foundational term is included.

Returns:

A dictionary with the following keys:

  • relevance (float): Recall – fraction of task terms present in the ontology.

  • accuracy (float): Precision – fraction of ontology terms referenced by the tasks.

  • T_o_count (int): Number of ontology domain terms.

  • T_a_count (int): Number of unique task terms.

  • intersection (int): Number of terms in both sets.

  • missing_from_onto (set of str): Task terms absent from the ontology.

  • unused_in_onto (set of str): Ontology terms not referenced by any task query.

Return type:

dict

Raises:

ValueError – If questions is not a recognized type (list, JSON path, or Markdown path).

Examples

>>> result = task_based_metric_v_0_0_1(
...     ttl_file="my_ontology.ttl",
...     questions="competency_questions.json",
...     domain_prefixes=["mds"],
...     domain_ns_fragments=["cwrusdle.bitbucket.io/mds"],
... )
>>> print(f"Relevance: {result['relevance']:.2%}")
>>> print(f"Accuracy:  {result['accuracy']:.2%}")

Labeling Metrics

Metrics that quantify the proportion of named classes carrying human-readable identifiers, synonyms, and formal definitions.

ontocheck.check_label

mainLabelCheck_v_0_0_1(ttl_file, show='all', export_template=None)

RDFS Label Coverage Analysis

Analyze an OWL ontology in Turtle (ttl) format to assess the coverage and quality of RDFS labels (rdfs:label) across all named classes

This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of label coverage with various display options and export capabilities

Definitions

  • Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations

  • Valid labels: rdfs:label values that are non-empty strings after whitespace trimming. Empty strings and whitespace-only labels are not counted as valid labels

  • Coverage percentage: The proportion of named classes that have at least one valid rdfs:label
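The definitions above amount to a simple count. The sketch below works on a plain dictionary from class name to label strings rather than on a parsed graph. The function name and the sample data are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def label_coverage(labels_by_class: Dict[str, List[str]]) -> Tuple[int, int, float]:
    """Return (classes_with_label, total_classes, coverage_percent)."""
    total = len(labels_by_class)
    with_label = sum(
        1
        for labels in labels_by_class.values()
        if any(label.strip() for label in labels)  # whitespace-only does not count
    )
    percent = 100.0 * with_label / total if total else 0.0
    return with_label, total, percent


EXAMPLE_CLASSES = {
    "ex:Sample": ["Sample"],
    "ex:Tool": ["   "],               # whitespace-only: treated as missing
    "ex:Process": [],                 # no label at all
    "ex:Material": ["", "Material"],  # one valid label is enough
}
```

With this data, two of the four classes carry a valid label, so the coverage is 50%.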

Author: Rishabh Kundu Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.

  • show (str, optional) – Display option controlling what information to show:

    • "all" (default): summary statistics, classes with labels, and classes without labels.

    • "with": only classes that have rdfs:label.

    • "without": only classes that lack rdfs:label.

    • "summary": only summary statistics.

  • export_template (str, optional) – Export a CSV template file for classes missing rdfs:label. Provide the desired output filename (e.g., "missing_labels_in_classes.csv"). Default is None (no export).

Returns:

None. The function does not return a value. It prints analysis results to the terminal and optionally exports a CSV template file. It may exit early on errors (file not found, parsing errors, or no classes found).

Output Information

When executed successfully, the analysis provides:

  • Total number of named classes analyzed

  • Number of classes with valid rdfs:label properties

  • Number of classes lacking valid rdfs:label properties

  • Coverage percentage of classes with rdfs:label

  • Prefixed class name and full URI/IRI for each class

Error Handling

  • FileNotFoundError: the specified TTL file cannot be found

  • Parsing errors: the TTL file cannot be parsed as valid Turtle

  • Empty ontology: no named classes are found in the ontology

Notes

  • Only named classes (URIRef instances) are considered in the analysis

  • Empty strings and whitespace-only labels are treated as missing labels

  • Classes are displayed with both their prefixed name and full URI/IRI

  • The show and export_template parameters default to "all" and None, so a CSV export happens only when export_template is set explicitly

Note

Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.

Examples

Basic usage (show all):

>>> mainLabelCheck_v_0_0_1("ontology.ttl")

Show only summary:

>>> mainLabelCheck_v_0_0_1("ontology.ttl", show="summary")

Show only classes missing labels:

>>> mainLabelCheck_v_0_0_1("ontology.ttl", show="without")

Export CSV template for missing labels:

>>> mainLabelCheck_v_0_0_1("ontology.ttl", export_template="missing_labels.csv")

Export template while showing summary only:

>>> mainLabelCheck_v_0_0_1("ontology.ttl", show="summary", export_template="missing_labels.csv")

The export_template argument may also be a path to the desired export location.

ontocheck.altLabelCheck

mainAltLabelCheck_v_0_0_1 metric implementation.

mainAltLabelCheck_v_0_0_1(ttl_file, show='all', export_template=None)

SKOS Alternative Label Coverage Analysis

Analyze an OWL ontology in Turtle (ttl) format to assess the coverage and quality of SKOS alternative labels (skos:altLabel) across all named classes

This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of alternative label coverage with various display options and export capabilities

Definitions

  • Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations

  • Valid altLabels: skos:altLabel values that are non-empty strings after whitespace trimming. Empty strings and whitespace-only altLabels are not counted as valid altLabels

  • Coverage percentage: The proportion of named classes that have at least one valid skos:altLabel

Author: Rishabh Kundu Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.

  • show (str, optional) – Display option controlling what information to show:

    • "all" (default): summary statistics, classes with altLabels, and classes without altLabels.

    • "with": only classes that have altLabels.

    • "without": only classes that lack altLabels.

    • "summary": only summary statistics.

  • export_template (str, optional) – Export a Turtle-format template file for classes missing altLabels. Provide the desired output filename. Default is None (no export).

Returns:

None. The function does not return a value. It prints analysis results to the terminal and optionally exports a template file. It may exit early on errors (file not found, parsing errors, or no classes found).

Output Information

When executed successfully, the analysis provides:

  • Total number of named classes analyzed

  • Number of classes with valid altLabel properties

  • Number of classes lacking valid altLabel properties

  • Coverage percentage of classes with altLabels

  • Total count of altLabel instances across all classes

  • Average number of altLabels per class (for classes with altLabels)

  • Qualitative assessment based on coverage thresholds

Error Handling

The function handles several error conditions:

  • FileNotFoundError: the specified TTL file cannot be found

  • Parsing errors: the TTL file cannot be parsed as valid Turtle

  • Empty ontology: no named classes are found in the ontology

  • Template export errors: file I/O issues when exporting templates

Notes

  • Only named classes (URIRef instances) are considered in the analysis

  • Empty strings and whitespace-only altLabels are filtered out

  • Coverage assessment follows established ontology quality thresholds: ≥80%: Excellent, ≥60%: Good, ≥40%: Moderate, ≥20%: Low, <20%: Very low

  • Classes are displayed with their preferred labels (skos:prefLabel or rdfs:label) when available

  • Template export generates valid Turtle syntax for adding missing altLabels

  • The show and export_template parameters default to "all" and None, respectively
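The qualitative rating listed in the notes can be read as a top-down threshold lookup. The sketch below encodes the altLabel scale from this section. The function name and the table layout are assumptions, not the OntoCheck source.

```python
from typing import Sequence, Tuple

# (minimum coverage in percent, rating), checked from highest to lowest.
ALT_LABEL_THRESHOLDS = (
    (80.0, "Excellent"),
    (60.0, "Good"),
    (40.0, "Moderate"),
    (20.0, "Low"),
)


def rate_coverage(
    percent: float,
    thresholds: Sequence[Tuple[float, str]] = ALT_LABEL_THRESHOLDS,
    floor: str = "Very low",
) -> str:
    """Return the first rating whose minimum the coverage reaches."""
    for minimum, rating in thresholds:
        if percent >= minimum:
            return rating
    return floor
```

Because the scale is passed in as data, the same lookup would serve a metric with a different set of cut-offs.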

Note

Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.

Examples

Basic usage:

python script.py ontology.ttl

Show only summary:

python script.py ontology.ttl --show summary

Export template for missing labels:

python script.py ontology.ttl --export-template missing_labels.ttl

ontocheck.defCheck

mainDefCheck_v_0_0_1 metric implementation.

mainDefCheck_v_0_0_1(ttl_file, show='all', full_definitions=False)

SKOS Definition Coverage Analysis

Analyze an OWL ontology in Turtle (ttl) format and assess the coverage and quality of SKOS definitions (skos:definition) across all named classes

This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of definition coverage with various display options

Definitions

  • Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations

  • Valid definitions: skos:definition values that exist as properties on classes. Empty or missing definitions are identified as gaps in coverage

  • Coverage percentage: The proportion of named classes that have at least one skos:definition property

Author: Rishabh Kundu Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.

  • show (str, optional) – Display option controlling what information to show:

    • "all" (default): summary statistics, classes with definitions, and classes without definitions.

    • "with": only classes that have definitions.

    • "without": only classes that lack definitions.

    • "summary": only summary statistics.

  • full_definitions (bool, optional) – Show full definitions instead of truncated versions. Default is False (definitions are truncated to 150 characters).

Returns:

None. The function does not return a value. It prints analysis results to the terminal. It may exit early on errors (file not found, parsing errors, or no classes found).

Output Information

When executed successfully, the analysis provides:

  • Total number of named classes analyzed

  • Number of classes with definition properties

  • Number of classes lacking definition properties

  • Coverage percentage of classes with definitions

  • Qualitative assessment based on coverage thresholds

Error Handling

The function handles several error conditions:

  • FileNotFoundError: the specified TTL file cannot be found

  • Parsing errors: the TTL file cannot be parsed as valid Turtle

  • Empty ontology: no named classes are found in the ontology

Notes

  • Only named classes (URIRef instances) are considered in the analysis

  • Uses skos:definition property specifically for definition identification

  • Coverage assessment follows ontology quality thresholds with higher standards than altLabels: ≥90%: Excellent, ≥75%: Good, ≥50%: Moderate, ≥25%: Low, <25%: Very low

  • Classes are displayed with their preferred labels (skos:prefLabel or rdfs:label) when available

  • The show and full_definitions parameters default to "all" and False, respectively

Note

Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.

Examples

Basic usage:

python script.py ontology.ttl

Show only summary:

python script.py ontology.ttl --show summary

Show full definitions:

python script.py ontology.ttl --full-definitions

Show only classes without definitions:

python script.py ontology.ttl --show without

Structural Metrics

Metrics that expose orphaned classes, disconnected subgraphs, undeclared domain and range restrictions, and hierarchy chains lacking grounding in upper-level ontologies.

ontocheck.check_for_isolated_elements

check_for_isolated_elements(ttl_file: str)

C1 - Number of isolated elements

Analyze an OWL ontology in Turtle format to identify isolated atomic classes and isolated properties.

Definitions

  • Atomic classes are named classes (with URI) that are NOT constructed classes (i.e., they do not have owl:unionOf, owl:intersectionOf, or owl:complementOf).

  • A class (atomic or constructed with URI) is considered connected if it:
    • participates in rdfs:subClassOf, owl:equivalentClass, or owl:disjointWith relations involving atomic classes, OR

    • is used as domain or range of properties and contains at least one atomic class inside its construction.

  • A property is considered connected if it is related by any of: rdfs:subPropertyOf, owl:inverseOf, owl:propertyDisjointWith, or owl:equivalentProperty.
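A reduced version of these connectivity rules can be sketched over triples written as plain string tuples. The sketch deliberately ignores constructed classes (unions, intersections, complements) and works on prefixed names instead of a parsed graph. All names in it are assumptions for illustration.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

CLASS_AXIOMS = {"rdfs:subClassOf", "owl:equivalentClass", "owl:disjointWith"}
PROPERTY_AXIOMS = {
    "rdfs:subPropertyOf",
    "owl:inverseOf",
    "owl:propertyDisjointWith",
    "owl:equivalentProperty",
}
PROPERTY_TYPES = {"owl:ObjectProperty", "owl:DatatypeProperty"}


def find_isolated(triples: Iterable[Triple]) -> Tuple[Set[str], Set[str]]:
    """Return (isolated_classes, isolated_properties)."""
    triples = list(triples)
    classes = {s for s, p, o in triples if p == "rdf:type" and o == "owl:Class"}
    properties = {s for s, p, o in triples if p == "rdf:type" and o in PROPERTY_TYPES}

    linked_classes: Set[str] = set()
    linked_properties: Set[str] = set()
    for s, p, o in triples:
        if p in CLASS_AXIOMS and s in classes and o in classes:
            linked_classes.update((s, o))      # both ends of a class axiom
        elif p in ("rdfs:domain", "rdfs:range") and o in classes:
            linked_classes.add(o)              # class used as domain or range
        elif p in PROPERTY_AXIOMS:
            linked_properties.update((s, o))   # both ends of a property axiom
    return classes - linked_classes, properties - linked_properties


EXAMPLE_TRIPLES = [
    ("ex:A", "rdf:type", "owl:Class"),
    ("ex:B", "rdf:type", "owl:Class"),
    ("ex:C", "rdf:type", "owl:Class"),
    ("ex:D", "rdf:type", "owl:Class"),
    ("ex:B", "rdfs:subClassOf", "ex:A"),
    ("ex:p", "rdf:type", "owl:ObjectProperty"),
    ("ex:q", "rdf:type", "owl:ObjectProperty"),
    ("ex:r", "rdf:type", "owl:DatatypeProperty"),
    ("ex:p", "rdfs:range", "ex:C"),
    ("ex:p", "owl:inverseOf", "ex:q"),
]
```

In the example data, ex:D takes part in no axiom and is not a domain or range, and ex:r has no property-level relation, so both are reported as isolated.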

Author: Van Tran Version: 0.0.1

Parameters:

ttl_file (str) – File path to the ontology Turtle (.ttl) file.

Prints:

Lists of isolated atomic classes and isolated properties.

Notes

  • Only named classes explicitly declared as owl:Class are considered.

  • Only properties explicitly declared as owl:ObjectProperty or owl:DatatypeProperty are considered.

  • Relations checked for classes include rdfs:subClassOf, owl:equivalentClass, owl:disjointWith, and usage as domain or range of properties.

  • Relations checked for properties include rdfs:subPropertyOf, owl:inverseOf, owl:propertyDisjointWith, and owl:equivalentProperty.

References

Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.

ontocheck.count_class_connected_components

count_class_connected_components(ttl_file_path: str) -> int

C3 - Count number of connected subgraphs

Count the number of connected components in the “class graph” (TBox) of an OWL ontology. The “class graph” is constructed by creating undirected edges between classes that are connected by any of the following OWL predicates:

  • rdfs:subClassOf

  • owl:equivalentClass

  • owl:disjointWith

Nodes in the graph represent named classes (URIRefs) from the ontology. Edges represent these relationships between named classes. Classes involved only in subclass/equivalent/disjointWith axioms pointing to blank nodes (i.e. constructed classes) are excluded.

Author: Van Tran Version: 0.0.1

Parameters:

ttl_file_path (str) – Path to the ontology Turtle (.ttl) file.

Returns:

The number of connected components in the class graph.

Return type:

int

Notes

  • The graph is undirected; directionality of subclass relations is ignored.

  • Only named OWL classes are considered.

  • Classes that participate in subclass/equivalent/disjoint axioms involving blank nodes are excluded.
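Counting components of an undirected class graph is a standard union-find exercise. The sketch below takes the named classes and the class-to-class edges as plain Python values. Whether OntoCheck counts a class with no edges as its own component is not stated in this section, so the sketch's choice to do so is an assumption.

```python
from typing import Dict, Iterable, Tuple


def count_components(classes: Iterable[str], edges: Iterable[Tuple[str, str]]) -> int:
    """Count connected components of an undirected graph with union-find."""
    parent: Dict[str, str] = {c: c for c in classes}

    def find(node: str) -> str:
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in edges:
        # Edges touching unknown nodes (for example blank nodes) are skipped.
        if a in parent and b in parent:
            parent[find(a)] = find(b)
    return len({find(c) for c in parent})
```

For six classes A to F with edges B–A, C–A, and D–E, the function reports three components: {A, B, C}, {D, E}, and {F}.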

ontocheck.get_properties_missing_domain_and_range

get_properties_missing_domain_and_range(ttl_file_path: str)

C2 - Missing Domain and Ranges in Properties

Parse an OWL ontology Turtle file and identify object and datatype properties that are missing domain or range declarations.

Author: Van Tran Version: 0.0.1

Parameters:

ttl_file_path (str) – Path to the Turtle (.ttl) file containing the ontology.

Returns:

A dictionary containing:

  • count_missing_domain (int): Number of properties missing an rdfs:domain declaration.

  • properties_missing_domain (list of rdflib.term.URIRef): Properties (URIs) missing an rdfs:domain.

  • count_missing_range (int): Number of properties missing an rdfs:range declaration.

  • properties_missing_range (list of rdflib.term.URIRef): Properties (URIs) missing an rdfs:range.

Return type:

dict

Notes

  • Only properties explicitly typed as owl:ObjectProperty or owl:DatatypeProperty are considered.
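The shape of the returned dictionary can be illustrated with a sketch over string triples. The real function returns rdflib URIRef objects, whereas this example uses prefixed-name strings and a hypothetical function name, so it should be read as an outline of the logic only.

```python
from typing import Dict, Iterable, List, Tuple, Union

Triple = Tuple[str, str, str]
PROPERTY_TYPES = {"owl:ObjectProperty", "owl:DatatypeProperty"}


def missing_domain_and_range(triples: Iterable[Triple]) -> Dict[str, Union[int, List[str]]]:
    """Report typed properties that lack rdfs:domain or rdfs:range."""
    triples = list(triples)
    properties = sorted({s for s, p, o in triples if p == "rdf:type" and o in PROPERTY_TYPES})
    has_domain = {s for s, p, _ in triples if p == "rdfs:domain"}
    has_range = {s for s, p, _ in triples if p == "rdfs:range"}
    no_domain = [prop for prop in properties if prop not in has_domain]
    no_range = [prop for prop in properties if prop not in has_range]
    return {
        "count_missing_domain": len(no_domain),
        "properties_missing_domain": no_domain,
        "count_missing_range": len(no_range),
        "properties_missing_range": no_range,
    }


EXAMPLE_TRIPLES = [
    ("ex:p", "rdf:type", "owl:ObjectProperty"),
    ("ex:q", "rdf:type", "owl:DatatypeProperty"),
    ("ex:r", "rdf:type", "owl:AnnotationProperty"),  # not considered
    ("ex:p", "rdfs:domain", "ex:A"),
    ("ex:p", "rdfs:range", "ex:B"),
    ("ex:q", "rdfs:domain", "ex:A"),
]
```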

References

Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.

ontocheck.leafNodeCheck

mainLeafNodeCheck_v_0_0_1 metric implementation.

mainLeafNodeCheck_v_0_0_1(ttl_file)

Ontology Leaf Node Analysis

Analyze an OWL ontology in Turtle (ttl) format and identify all leaf nodes in the class hierarchy. Leaf nodes are classes that have no subclasses, representing the most specific classes in the ontology

This main function loads an ontology file, identifies all declared classes, and determines which classes are leaf nodes by finding classes that are never used as superclasses

Definitions

  • Leaf nodes: Classes that have no subclasses, meaning they do not appear as objects in rdfs:subClassOf relationships (or skos:broader)

  • Declared classes: Classes explicitly declared with rdf:type owl:Class or rdfs:Class

  • Hierarchy detection: Uses rdfs:subClassOf relationships to determine class hierarchy (also skos:broader)
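Under these definitions, a leaf is any declared class that never occurs on the parent side of a hierarchy edge. The sketch below assumes the edges have already been collected as (child, parent) pairs. The function name is hypothetical.

```python
from typing import Iterable, List, Tuple


def leaf_nodes(declared_classes: Iterable[str], hierarchy_edges: Iterable[Tuple[str, str]]) -> List[str]:
    """Return declared classes that are never a parent, sorted alphabetically.

    hierarchy_edges holds (child, parent) pairs taken from rdfs:subClassOf
    or skos:broader statements.
    """
    parents = {parent for _, parent in hierarchy_edges}
    return sorted(c for c in declared_classes if c not in parents)
```

For classes ex:A to ex:D with ex:B under ex:A and ex:C under ex:B, the leaves are ex:C and ex:D. Note that ex:D counts as a leaf even though it has no parent.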

Author: Rishabh Kundu Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.

Returns:

None. The function does not return a value. It prints analysis results to the terminal. It may exit early on errors (file not found, parsing errors, or no leaf nodes found).

Output Information

When executed successfully, the analysis provides:

  • Total number of leaf nodes found

  • Complete list of leaf nodes with their prefixed names (sorted alphabetically)

Error Handling

The function handles several error conditions:

  • FileNotFoundError: the specified TTL file cannot be found

  • Parsing errors: the TTL file cannot be parsed as valid Turtle

  • Empty results: no leaf nodes are found in the ontology

Notes

  • Only considers explicitly declared classes (rdf:type owl:Class or rdfs:Class)

  • Uses namespace manager for clean URI representation in output

  • Leaf nodes are sorted alphabetically for consistent display

  • Coverage statistics may be implemented in future versions

Note

Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.

References

  • Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.

Examples

Basic usage:

python script.py ontology.ttl

ontocheck.semanticConnection

mainSemanticConnection_v_0_0_1 metric implementation.

mainSemanticConnection_v_0_0_1(ttl_file)

Ontology Semantic Connection Analysis

Analyze an OWL ontology in Turtle (ttl) format and assess the semantic connection of class hierarchies to established upper-level ontologies (specifically, Common Core Ontology and Basic Formal Ontology)

This main function loads an ontology file, builds the complete class hierarchy, identifies root classes, and determines which hierarchy chains are semantically grounded in higher-level ontologies through naming convention analysis

Definitions

  • Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations

  • Class hierarchy: The tree structure of classes connected via rdfs:subClassOf relationships

  • Root classes: Classes that have no parent classes, representing the top level of independent hierarchy trees

  • Semantic connection: Connection to higher-level ontologies (CCO/BFO) determined by URI prefix analysis (cco:, obo:bfo, bfo:)

  • Hierarchy chains: Complete trees of classes rooted at root classes, inheriting the connection status of their root
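The root-and-prefix logic can be sketched as follows. The example works on prefixed names, lowercases them before comparing (an assumption, since the section does not say whether matching is case-sensitive), and uses hypothetical function names.

```python
from typing import Dict, Iterable, Tuple

UPPER_LEVEL_PREFIXES = ("cco:", "obo:bfo", "bfo:")


def is_upper_level(prefixed_name: str) -> bool:
    """True when the name starts with one of the CCO/BFO prefix patterns."""
    return prefixed_name.lower().startswith(UPPER_LEVEL_PREFIXES)


def root_connection_status(
    classes: Iterable[str], subclass_edges: Iterable[Tuple[str, str]]
) -> Dict[str, bool]:
    """Map each root class to whether it is grounded in CCO or BFO.

    subclass_edges holds (child, parent) pairs; a root is a class that
    never appears as a child.
    """
    children = {child for child, _ in subclass_edges}
    roots = sorted(c for c in classes if c not in children)
    return {root: is_upper_level(root) for root in roots}


EXAMPLE_CLASSES = ["obo:BFO_0000040", "ex:Sample", "ex:Specimen", "ex:Widget", "ex:Gadget"]
EXAMPLE_EDGES = [
    ("ex:Sample", "obo:BFO_0000040"),
    ("ex:Specimen", "ex:Sample"),
    ("ex:Gadget", "ex:Widget"),
]
```

Every class below a root inherits that root's status, so ex:Sample and ex:Specimen count as connected while ex:Gadget does not.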

Author: Rishabh Kundu Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.

Returns:

None. The function does not return a value. It prints a comprehensive hierarchy analysis to the terminal. It may exit early on errors (file not found, parsing errors, no classes found, or no hierarchy relationships found).

Output Information

When executed successfully, the analysis provides:

  • Total number of named classes

  • Number of classes with children (parent classes)

  • Total parent-child relationships

  • Number of root classes

  • Number of root classes connected to higher ontologies

  • Summary of connected vs disconnected hierarchy chains

  • Complete hierarchical tree view with connection status indicators

Error Handling

The function handles several error conditions:

  • FileNotFoundError: the specified TTL file cannot be found

  • Parsing errors: the TTL file cannot be parsed as valid Turtle

  • Empty ontology: no named classes are found

  • Missing hierarchy: no rdfs:subClassOf relationships are found

Notes

  • Only considers explicitly declared classes and rdfs:subClassOf relationships

  • Connection analysis based on URI prefix patterns (cco:, obo:bfo, bfo:)

  • Provides both statistical summary and detailed tree visualization

  • Includes namespace bindings for common ontology prefixes

Note

Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.

Examples

Basic usage:

python script.py ontology.ttl

Accessibility Metrics

Metrics that verify endpoint reachability, data dump availability, licensing fitness, and external link validity.

ontocheck.check_sparql_accessibility_ttl

check_sparql_accessibility_ttl(ttl_file)

A1 - Accessibility of the SPARQL endpoint and the server

Evaluates the accessibility of SPARQL endpoints referenced in a TTL file by attempting to execute a simple query against each discovered endpoint.

This metric is based on Flemming (2011) quality criteria for Linked Data sources, specifically addressing the availability and accessibility of query interfaces.

Author: Redad Mehdi Version: 0.0.1

Parameters:

ttl_file (str) – Path to the Turtle (.ttl) file to analyze.

Returns:

Ratio of accessible SPARQL endpoints (0.0 to 1.0):

  • 0.0: No accessible endpoints found

  • 1.0: All discovered endpoints are accessible

Return type:

float

Example:

>>> score = check_sparql_accessibility_ttl('dataset.ttl')
>>> print(f"SPARQL accessibility score: {score}")

References:

Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.

Flemming, A. (2011). Qualitätsmerkmale von Linked Data-veröffentlichenden Datenquellen. Diplomarbeit (Quality Criteria for Linked Data Sources).

ontocheck.check_rdf_dump_accessibility_ttl

check_rdf_dump_accessibility_ttl(ttl_file)

A2 - RDF dump accessibility

Evaluates the accessibility of RDF data dumps referenced in a TTL file by attempting to access each discovered dump URL via HTTP HEAD requests.

This metric assesses whether the raw RDF data is available for download, which is important for data consumers who need offline access or bulk processing capabilities.

Author: Redad Mehdi Version: 0.0.1

Parameters:

ttl_file (str) – Path to the Turtle (.ttl) file to analyze.

Returns:

Ratio of accessible RDF dumps (0.0 to 1.0):

  • 0.0: No accessible dumps found

  • 1.0: All discovered dumps are accessible

Return type:

float

Notes:

The function identifies potential dump URLs by looking for common RDF file extensions (.rdf, .ttl, .nt, .n3, .owl, .jsonld) in referenced URLs.
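The extension-based discovery step and the final ratio can be sketched without any network access. The function names are hypothetical, and restricting candidates to HTTP(S) URLs is an assumption of this example. The actual HEAD requests are left out, so the ratio function simply takes a list of probe outcomes.

```python
from typing import Iterable, List
from urllib.parse import urlparse

DUMP_EXTENSIONS = (".rdf", ".ttl", ".nt", ".n3", ".owl", ".jsonld")


def candidate_dump_urls(urls: Iterable[str]) -> List[str]:
    """Keep HTTP(S) URLs whose path ends in a known RDF file extension."""
    found = []
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme in ("http", "https") and parsed.path.lower().endswith(DUMP_EXTENSIONS):
            found.append(url)
    return found


def accessibility_ratio(results: Iterable[bool]) -> float:
    """Share of probes that succeeded; 0.0 when nothing was discovered."""
    outcomes = list(results)
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


EXAMPLE_URLS = [
    "https://example.org/data/dump.ttl",
    "https://example.org/page.html",
    "http://example.org/onto.OWL?version=2",
    "ftp://example.org/x.nt",
    "https://example.org/ontology#Class",
]
```

Checking the parsed path rather than the raw string keeps query strings and fragments from hiding or faking an extension.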

Example:

>>> score = check_rdf_dump_accessibility_ttl('dataset.ttl')
>>> print(f"RDF dump accessibility score: {score}")

References:

Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.

Flemming, A. (2011). Qualitätsmerkmale von Linked Data-veröffentlichenden Datenquellen. Diplomarbeit (Quality Criteria for Linked Data Sources).

ontocheck.check_human_readable_license_ttl

check_human_readable_license_ttl(ttl_file)

L2 - Human-readable license detection

Detects the presence of human-readable licensing information within a TTL file. This metric evaluates whether the dataset provides clear licensing terms that users can understand without legal expertise.

The function searches for common license-related keywords in both RDF literals and TTL file comments, including references to popular licenses like Creative Commons, GPL, MIT, Apache, and BSD.

Author: Redad Mehdi Version: 0.0.1

Parameters:

ttl_file (str) – Path to the Turtle (.ttl) file to analyze.

Returns:

Binary score (0 or 1):

  • 0: No human-readable license information found

  • 1: License-related keywords detected

Return type:

int

Notes:

Keywords searched include: ‘license’, ‘licence’, ‘copyright’, ‘terms of use’, ‘creative commons’, ‘GPL’, ‘MIT’, ‘Apache’, ‘BSD’
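A keyword scan of this kind might look like the sketch below. How OntoCheck matches the keywords (case sensitivity, word boundaries) is not documented here, so the split into case-insensitive phrases and case-sensitive acronyms is an assumption, chosen so that short tokens such as "MIT" do not fire inside ordinary words.

```python
import re

# Longer keywords are matched case-insensitively as substrings.
PHRASES = ("license", "licence", "copyright", "terms of use", "creative commons", "apache")

# Acronyms are matched case-sensitively at a word start. A following
# uppercase letter (as in "MITIGATION") rules the match out, while
# "GPLv3" still matches.
ACRONYM_PATTERN = re.compile(r"\b(?:GPL|MIT|BSD)(?![A-Z])")


def has_license_hint(text: str) -> int:
    """Return 1 when a license-related keyword is found, else 0."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in PHRASES):
        return 1
    return 1 if ACRONYM_PATTERN.search(text) else 0
```

A plain case-insensitive substring search would also flag words such as "permit" or "submit" because they contain "mit", which is the false positive this design avoids.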

Example:

>>> score = check_human_readable_license_ttl('dataset.ttl')
>>> if score:
...     print("Human-readable license information found")
... else:
...     print("No license information detected")

References:

Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.

Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., & Decker, S. (2012). An empirical survey of Linked Data conformance. Journal of Web Semantics, 14, 14-44.

Naming Convention Metrics

Metrics that detect and flag naming of ontological entities that depart from standard authoring practices.

ontocheck.check_class_name_capital

mainClassNameCapitalCheck_v_0_0_1(ttl_file, show='all', export_template=None)

Class Name Capital Letter Check Analysis

Analyze an OWL ontology in Turtle (ttl) format to assess whether all named classes follow the convention of starting their local name with a capital letter

This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of capital letter compliance with various display options and export capabilities

Definitions

  • Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations

  • Local name: The fragment of a class URI after the last ‘#’ or ‘/’ character. Only this portion is checked, not the full namespace URI

  • Compliant class: A class whose local name begins with an uppercase letter as determined by Python’s str.isupper() check on the first character

  • Coverage percentage: The proportion of named classes whose local name starts with a capital letter
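The definitions above can be illustrated with a small stand-alone sketch. It works on plain IRI strings, so no ontology parsing is involved; the helper names local_name and capital_coverage are hypothetical and are not part of OntoCheck.

```python
def local_name(iri: str) -> str:
    """Return the part of an IRI after the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[cut + 1:]


def capital_coverage(class_iris):
    """Split class IRIs into compliant / non-compliant and compute coverage."""
    compliant, non_compliant = [], []
    for iri in class_iris:
        # Slicing keeps an empty local name from raising IndexError;
        # "".isupper() is False, so such classes count as non-compliant.
        first = local_name(iri)[:1]
        (compliant if first.isupper() else non_compliant).append(iri)
    total = len(compliant) + len(non_compliant)
    coverage = 100.0 * len(compliant) / total if total else 0.0
    return compliant, non_compliant, coverage
```

Because str.isupper() is False for digits and underscores, a local name such as "3DModel" would be reported as non-compliant under this rule.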

Author: Rishabh Kundu
Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the input ontology Turtle (.ttl) file to analyze.

  • show (str, optional) – Display option controlling what information to show:
    • “all” (default): Shows summary statistics, compliant classes, and non-compliant classes
    • “with”: Shows only classes whose local name starts with a capital letter
    • “without”: Shows only classes whose local name does not start with a capital letter
    • “summary”: Shows only summary statistics

  • export_template (str, optional) – Export a CSV report file for classes that do not start with a capital letter. Provide the desired output filename (e.g. “non_capital_classes.csv”). Default is None (no export).

Returns:

None. This function does not directly return values. It prints analysis results to your terminal/CLI and optionally exports a CSV report file. The function may exit early on errors (file not found, parsing errors, or no classes found).

Output Information

When executed successfully, the analysis provides:
  • Total number of named classes analyzed
  • Number of classes whose local name starts with a capital letter
  • Number of classes whose local name does not start with a capital letter
  • Coverage percentage of compliant classes
  • Prefixed class name, full URI/IRI, and local name for each class

Error Handling

  • FileNotFoundError: the specified TTL file cannot be found
  • Parsing errors: the TTL file cannot be parsed as valid Turtle
  • Empty ontology: no named classes are found in the ontology

Notes

  • Only named classes (URIRef instances) are considered in the analysis

  • Only the local name fragment is checked, not the full URI

  • Classes with an empty local name are treated as non-compliant

  • show and export_template default to “all” and None, respectively, so CSV export must be requested explicitly

Note

Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.

Examples

Basic usage (show all):

mainClassNameCapitalCheck_v_0_0_1("ontology.ttl")

Show only summary:

mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", show="summary")

Show only non-compliant classes:

mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", show="without")

Export CSV report of non-compliant classes (the filename may include a directory path):

mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", export_template="non_capital_classes.csv")

Export report while showing summary only (the filename may include a directory path):

mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", show="summary", export_template="non_capital_classes.csv")

ontocheck.check_class_name_space

mainClassNameSpaceCheck_v_0_0_1(ttl_file, export_template=None)[source]

Class Name Space Check Analysis

Scan an OWL ontology in Turtle (ttl) format for class names containing spaces. Spaces in prefixed class names make a TTL file unparseable, so this function works directly on the raw file text rather than parsing it through rdflib.

Handles both single-line and multi-line class declarations by grouping lines into declaration blocks before checking for spaces.

Author: Rishabh Kundu
Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the input ontology Turtle (.ttl) file to analyze.

  • export_template (str, optional) – Export a CSV report of all class names containing spaces. Provide the desired output filename (e.g. “classes_with_spaces_in_names.csv”). Default is None (no export).

Returns:

None. Prints results to terminal/CLI and optionally exports a CSV report.

Output Information

When executed successfully, the analysis provides:
  • Total number of class names with spaces detected
  • The class name, line number, and full line text for each class name containing a space

Error Handling

  • FileNotFoundError: the specified TTL file cannot be found

Notes

  • Does not use rdflib parsing - works on raw file text so it catches errors that would prevent parsing entirely

  • Detects owl:Class and rdfs:Class declarations

  • Handles multi-line declarations by grouping on ‘.’ block terminators

  • Comments and string literals are handled to avoid false matches
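The raw-text approach in the notes above can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the function's actual code: the regular expressions, the helper names clean and find_spaced_class_names, and the rule that a declaration subject is everything before "a" or "rdf:type" are all hypothetical.

```python
import re

# Assumption: a class declaration looks like "<subject> a owl:Class" or
# "<subject> rdf:type rdfs:Class", with the subject at the start of the block.
DECL = re.compile(
    r"^\s*(?P<subject>[^;,]+?)\s+(?:a|rdf:type)\s+(?:owl|rdfs):Class\b"
)


def clean(line):
    """Blank out string literals and full IRIs, then drop any comment."""
    line = re.sub(r'"(?:[^"\\]|\\.)*"', '""', line)
    line = re.sub(r"<[^>]*>", "<>", line)
    return line.split("#", 1)[0].strip()


def find_spaced_class_names(ttl_text):
    """Return (start_line, subject) for class subjects containing whitespace."""
    hits, block, start = [], [], None
    for number, raw in enumerate(ttl_text.splitlines(), start=1):
        line = clean(raw)
        if not line:
            continue
        if not block:
            start = number
        block.append(line)
        if line.endswith("."):  # '.' terminates a declaration block
            match = DECL.match(" ".join(block))
            if match and re.search(r"\s", match.group("subject")):
                hits.append((start, match.group("subject")))
            block = []
    return hits
```

Blanking out literals and IRIs before splitting on '#' keeps a fragment IRI such as <http://example.org/onto#> from being mistaken for a comment.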

Note

Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.

Examples

Basic usage:

mainClassNameSpaceCheck_v_0_0_1("ontology.ttl")

Export CSV report:

mainClassNameSpaceCheck_v_0_0_1("ontology.ttl", export_template="classes_with_spaces_in_names.csv")

ontocheck.spell_check

ontocheck.find_duplicate_labels_from_graph

find_duplicate_labels_from_graph(ttl_file)[source]

IO8 - Semantically Identical Classes

This metric identifies semantically identical classes by checking whether two IRIs in an ontology share the same rdfs:label value.

Parameters:
  • ttl_file (str) – Path to the Turtle (.ttl) file.

Returns:

duplicates: dictionary of URIs of duplicated terms.

Return type:

dict

Author: Van Tran
Version: 0.0.1
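The duplicate-label check can be pictured with the sketch below, which groups IRIs by label. It is a simplified illustration: it takes (IRI, label) pairs that would in practice come from the parsed graph, matches labels exactly, and returns a label-to-IRIs mapping. The helper name and that return shape are assumptions, since the exact structure of the returned dictionary is not specified in this entry.

```python
from collections import defaultdict


def find_duplicate_labels(label_pairs):
    """Group IRIs by rdfs:label value and keep labels used by several IRIs."""
    by_label = defaultdict(set)
    for iri, label in label_pairs:
        by_label[label].add(iri)
    # Only labels attached to more than one distinct IRI are duplicates.
    return {
        label: sorted(iris)
        for label, iris in by_label.items()
        if len(iris) > 1
    }
```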

References

Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.

Other Modules

ontocheck.cli

OntoCheck Command-Line Interface

Provides a unified CLI for all four OntoCheck assessment modes:

Mode 1 – Task-agnostic (default)

Mode 2 – Task-specific Web ontology

Mode 3 – Task-based Scientific ontology

Mode 4 – Cross-Domain ontology

main()[source]

Entry point for the ontocheck command.

ontocheck.mds_design_check

Module Contents

mainAltLabelCheck_v_0_0_1(ttl_file, show='all', export_template=None)[source]

SKOS Alternative Label Coverage Analysis

Analyze an OWL ontology in Turtle (ttl) format to assess the coverage and quality of SKOS alternative labels (skos:altLabel) across all named classes.

This main function loads an ontology file, identifies all named classes, and reports alternative label coverage, with several display options and an optional template export.

Definitions

  • Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations

  • Valid altLabels: SKOS altLabel values that are non-empty strings after whitespace trimming. Empty strings and whitespace-only altLabels are not counted as valid altLabels

  • Coverage percentage: The proportion of named classes that have at least one valid altLabel

Author: Rishabh Kundu
Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the input ontology Turtle (.ttl) file to analyze.

  • show (str, optional) – Display option controlling what information to show:
    • “all” (default): Shows summary statistics, classes with altLabels, and classes without altLabels
    • “with”: Shows only classes that have altLabels
    • “without”: Shows only classes that lack altLabels
    • “summary”: Shows only summary statistics

  • export_template (str, optional) – Export a Turtle format template file for classes missing altLabels. Provide the desired output filename. Default is None (no export).

Returns:

None. This function does not directly return values. It prints analysis results to your terminal/CLI and optionally exports a template file. The function may exit early on errors (file not found, parsing errors, or no classes found).

Output Information

When executed successfully, the analysis provides:
  • Total number of named classes analyzed
  • Number of classes with valid altLabel properties
  • Number of classes lacking valid altLabel properties
  • Coverage percentage of classes with altLabels
  • Total count of altLabel instances across all classes
  • Average number of altLabels per class (for classes with altLabels)
  • Qualitative assessment based on coverage thresholds

Error Handling

The function handles several error conditions:
  • FileNotFoundError: the specified TTL file cannot be found
  • Parsing errors: the TTL file cannot be parsed as valid Turtle
  • Empty ontology: no named classes are found in the ontology
  • Template export errors: file I/O issues when exporting templates

Notes

  • Only named classes (URIRef instances) are considered in the analysis

  • Empty strings and whitespace-only altLabels are filtered out

  • Coverage assessment follows established ontology quality thresholds: ≥80%: Excellent, ≥60%: Good, ≥40%: Moderate, ≥20%: Low, <20%: Very low

  • Classes are displayed with their preferred labels (skos:prefLabel or rdfs:label) when available

  • Template export generates valid Turtle syntax for adding missing altLabels

  • show and export_template default to “all” and None, respectively
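As a rough illustration of how the statistics and thresholds in these notes fit together, the sketch below computes coverage from a mapping of class names to altLabel strings. The function names and the input shape are hypothetical; only the filtering rule and the threshold values are taken from this entry.

```python
def altlabel_stats(alt_labels_by_class):
    """Return (coverage %, total valid altLabels, average per labelled class)."""
    # Empty and whitespace-only altLabels are filtered out first.
    valid = {
        cls: [label for label in labels if label.strip()]
        for cls, labels in alt_labels_by_class.items()
    }
    labelled = [cls for cls, labels in valid.items() if labels]
    total_labels = sum(len(labels) for labels in valid.values())
    coverage = 100.0 * len(labelled) / len(valid) if valid else 0.0
    average = total_labels / len(labelled) if labelled else 0.0
    return coverage, total_labels, average


def rate_altlabel_coverage(coverage):
    """Map a coverage percentage to the qualitative rating listed above."""
    for threshold, rating in [(80, "Excellent"), (60, "Good"),
                              (40, "Moderate"), (20, "Low")]:
        if coverage >= threshold:
            return rating
    return "Very low"
```

The average is taken over classes that have at least one valid altLabel, matching the "for classes with altLabels" wording in the output description.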

Note

Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.

Examples

Basic usage:

python script.py ontology.ttl

Show only summary:

python script.py ontology.ttl --show summary

Export template for missing labels:

python script.py ontology.ttl --export-template missing_labels.ttl

check_external_data_provider_links_ttl(ttl_file, base_namespace=None)[source]

I2 - Detection of existence and usage of external URIs

Evaluates the degree to which a dataset links to external data providers through properties like owl:sameAs, rdfs:seeAlso, and SKOS mapping properties. This metric assesses the dataset’s integration with the broader Linked Data cloud.

External links enhance data discoverability and enable cross-dataset queries, representing a key principle of Linked Data publishing.

Author: Redad Mehdi
Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the Turtle (.ttl) file to analyze.

  • base_namespace (str, optional) – Base namespace of the ontology. If not provided, the function attempts to infer it from the graph’s namespace declarations.

Returns:

Ratio of entities with external links (0.0 to 1.0):
  • 0.0: No entities have external links
  • 1.0: All entities have at least one external link

Return type:

float

Notes:

The function examines the following linking predicates:
  • owl:sameAs, rdfs:seeAlso
  • SKOS mapping properties (exactMatch, closeMatch, etc.)
  • owl:equivalentClass, owl:equivalentProperty
  • dc:source, foaf:isPrimaryTopicOf

If no base namespace is provided, the function recognizes links to known external data providers like DBpedia, Wikidata, GeoNames, etc.

Example:

>>> score = check_external_data_provider_links_ttl('dataset.ttl',
...                                                 'http://example.org/')
>>> print(f"External linking score: {score:.2f}")
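The ratio itself can be sketched on plain triples as shown below. This is a simplified model and not the library's code: it assumes that an "entity" is any subject inside the base namespace, that a link is external when its target is an HTTP(S) IRI outside that namespace, and that predicates are given as prefixed names. The function name is hypothetical.

```python
# Predicates taken from the Notes above (SKOS list abbreviated to two).
LINK_PREDICATES = {
    "owl:sameAs", "rdfs:seeAlso", "skos:exactMatch", "skos:closeMatch",
    "owl:equivalentClass", "owl:equivalentProperty",
    "dc:source", "foaf:isPrimaryTopicOf",
}


def external_link_ratio(triples, base_namespace):
    """Share of local entities with at least one external link."""
    entities = {s for s, _, _ in triples if s.startswith(base_namespace)}
    linked = {
        s for s, p, o in triples
        if s in entities
        and p in LINK_PREDICATES
        and o.startswith(("http://", "https://"))
        and not o.startswith(base_namespace)
    }
    return len(linked) / len(entities) if entities else 0.0
```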

References:

Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.

Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., & Decker, S. (2012). An empirical survey of Linked Data conformance. Journal of Web Semantics, 14, 14-44.

check_for_isolated_elements(ttl_file: str)[source]

C1 - Number of isolated elements

Analyze an OWL ontology in Turtle format to identify isolated atomic classes and isolated properties.

Definitions

  • Atomic classes are named classes (with URI) that are NOT constructed classes (i.e., they do not have owl:unionOf, owl:intersectionOf, or owl:complementOf).

  • A class (atomic or constructed with URI) is considered connected if it:
    • participates in rdfs:subClassOf, owl:equivalentClass, or owl:disjointWith relations involving atomic classes, OR

    • is used as domain or range of properties and contains at least one atomic class inside its construction.

  • A property is considered connected if it is related by any of: rdfs:subPropertyOf, owl:inverseOf, owl:propertyDisjointWith, or owl:equivalentProperty.

Author: Van Tran
Version: 0.0.1

Parameters:
  • ttl_file (str) – File path to the ontology Turtle (.ttl) file.

Prints:

Lists of isolated atomic classes and isolated properties.

Notes

  • Only named classes explicitly declared as owl:Class are considered.

  • Only properties explicitly declared as owl:ObjectProperty or owl:DatatypeProperty are considered.

  • Relations checked for classes include rdfs:subClassOf, owl:equivalentClass, owl:disjointWith, and usage as domain or range of properties.

  • Relations checked for properties include rdfs:subPropertyOf, owl:inverseOf, owl:propertyDisjointWith, and owl:equivalentProperty.
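A reduced version of the isolation test can be written with set arithmetic, as in the sketch below. It deliberately ignores constructed classes (owl:unionOf and similar) and works on triples of prefixed names, so it only approximates the definitions above; the function name and input shape are assumptions.

```python
CLASS_RELATIONS = {"rdfs:subClassOf", "owl:equivalentClass", "owl:disjointWith"}
CLASS_USES = {"rdfs:domain", "rdfs:range"}
PROPERTY_RELATIONS = {
    "rdfs:subPropertyOf", "owl:inverseOf",
    "owl:propertyDisjointWith", "owl:equivalentProperty",
}


def isolated_elements(classes, properties, triples):
    """Return (isolated classes, isolated properties), both sorted."""
    connected_classes, connected_properties = set(), set()
    for s, p, o in triples:
        if p in CLASS_RELATIONS:
            connected_classes.update((s, o))
        elif p in CLASS_USES:
            connected_classes.add(o)  # the class used as domain or range
        elif p in PROPERTY_RELATIONS:
            connected_properties.update((s, o))
    return (
        sorted(set(classes) - connected_classes),
        sorted(set(properties) - connected_properties),
    )
```

Note that, under the definition given above, a property with a domain or range but no property-to-property relation still counts as isolated.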

References

Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.

check_rdf_dump_accessibility_ttl(ttl_file)[source]

A2 - RDF dump accessibility

Evaluates the accessibility of RDF data dumps referenced in a TTL file by attempting to access each discovered dump URL via HTTP HEAD requests.

This metric assesses whether the raw RDF data is available for download, which is important for data consumers who need offline access or bulk processing capabilities.

Author: Redad Mehdi
Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the Turtle (.ttl) file to analyze.

Returns:

Ratio of accessible RDF dumps (0.0 to 1.0):
  • 0.0: No accessible dumps found
  • 1.0: All discovered dumps are accessible

Return type:

float

Notes:

The function identifies potential dump URLs by looking for common RDF file extensions (.rdf, .ttl, .nt, .n3, .owl, .jsonld) in referenced URLs.

Example:

>>> score = check_rdf_dump_accessibility_ttl('dataset.ttl')
>>> print(f"RDF dump accessibility score: {score}")
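The two steps described in this entry, picking out candidate dump URLs by extension and probing them with HEAD requests, might look like the sketch below. It uses only the standard library; the function names, the 10-second timeout, and the rule that any successful response counts as accessible are assumptions and not documented behaviour.

```python
import urllib.request
from urllib.parse import urlparse

# Extensions listed in the Notes above.
RDF_EXTENSIONS = (".rdf", ".ttl", ".nt", ".n3", ".owl", ".jsonld")


def is_dump_url(url):
    """True if the URL path ends with a common RDF file extension."""
    return urlparse(url).path.lower().endswith(RDF_EXTENSIONS)


def is_accessible(url, timeout=10):
    """Send an HTTP HEAD request; treat a 2xx/3xx final status as accessible."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):  # network errors, HTTP errors, bad URLs
        return False


def dump_accessibility(urls, probe=is_accessible):
    """Ratio of accessible dump URLs among those that look like RDF dumps."""
    dumps = [u for u in urls if is_dump_url(u)]
    if not dumps:
        return 0.0
    return sum(1 for u in dumps if probe(u)) / len(dumps)
```

Passing the probe as a parameter makes it possible to try the ratio logic offline, for example with a stub that never touches the network.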

References:

Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.

Flemming, A. (2011). Qualitätsmerkmale von Linked Data-veröffentlichenden Datenquellen. Diplomarbeit (Quality Criteria for Linked Data Sources).

check_sparql_accessibility_ttl(ttl_file)[source]

A1 - Accessibility of the SPARQL endpoint and the server

Evaluates the accessibility of SPARQL endpoints referenced in a TTL file by attempting to execute a simple query against each discovered endpoint.

This metric is based on Flemming (2011) quality criteria for Linked Data sources, specifically addressing the availability and accessibility of query interfaces.

Author: Redad Mehdi
Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the Turtle (.ttl) file to analyze.

Returns:

Ratio of accessible SPARQL endpoints (0.0 to 1.0):
  • 0.0: No accessible endpoints found
  • 1.0: All discovered endpoints are accessible

Return type:

float

Example:

>>> score = check_sparql_accessibility_ttl('dataset.ttl')
>>> print(f"SPARQL accessibility score: {score}")

References:

Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.

Flemming, A. (2011). Qualitätsmerkmale von Linked Data-veröffentlichenden Datenquellen. Diplomarbeit (Quality Criteria for Linked Data Sources).

count_class_connected_components(ttl_file_path: str) → int[source]

C3 - Count number of connected subgraphs

Count the number of connected components in the “class graph” (TBox) of an OWL ontology. The “class graph” is constructed by creating undirected edges between classes that are connected by any of the following OWL predicates:

  • rdfs:subClassOf

  • owl:equivalentClass

  • owl:disjointWith

Nodes in the graph represent named classes (URIRefs) from the ontology. Edges represent these relationships between named classes. Classes involved only in subclass/equivalent/disjointWith axioms pointing to blank nodes (i.e. constructed classes) are excluded.

Author: Van Tran
Version: 0.0.1

Parameters:

ttl_file_path (str) – Path to the ontology Turtle (.ttl) file.

Returns:

The number of connected components in the class graph.

Return type:

int

Notes

  • The graph is undirected; directionality of subclass relations is ignored.

  • Only named OWL classes are considered.

  • Classes that participate in subclass/equivalent/disjoint axioms involving blank nodes are excluded.
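Counting connected components of an undirected graph is a standard task. The sketch below uses a union-find structure and assumes the node set has already been filtered as described above, with edges given as pairs of class names. It illustrates the technique only and says nothing about which graph library the function itself relies on.

```python
def count_components(nodes, edges):
    """Count connected components of an undirected graph via union-find."""
    parent = {node: node for node in nodes}

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        # Edges touching unknown nodes (e.g. blank nodes) are skipped.
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    return len({find(node) for node in nodes})
```

Because the edges are treated as undirected, it makes no difference whether a pair is written as (subclass, superclass) or the reverse.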

mainDefCheck_v_0_0_1(ttl_file, show='all', full_definitions=False)[source]

SKOS Definition Coverage Analysis

Analyze an OWL ontology in Turtle (ttl) format and assess the coverage and quality of SKOS definitions (skos:definition) across all named classes.

This main function loads an ontology file, identifies all named classes, and reports definition coverage, with several display options.

Definitions

  • Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations

  • Valid definitions: SKOS definition values that exist as properties on classes. Empty or missing definitions are identified as gaps in coverage

  • Coverage percentage: The proportion of named classes that have at least one skos:definition property

Author: Rishabh Kundu
Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.

  • show (str, optional) – Display option controlling what information to show:
    • “all” (default): Shows summary statistics, classes with definitions, and classes without definitions
    • “with”: Shows only classes that have definitions
    • “without”: Shows only classes that lack definitions
    • “summary”: Shows only summary statistics

  • full_definitions (bool, optional) – Show full definitions instead of truncated versions. Default is False (definitions are truncated to 150 characters).

Returns:

None. This function does not directly return values. It prints analysis results to terminal/CLI. The function may exit early on errors (file not found, parsing errors, or no classes found).

Output Information

When executed successfully, the analysis provides:
  • Total number of named classes analyzed
  • Number of classes with definition properties
  • Number of classes lacking definition properties
  • Coverage percentage of classes with definitions
  • Qualitative assessment based on coverage thresholds

Error Handling

The function handles several error conditions:
  • FileNotFoundError: the specified TTL file cannot be found
  • Parsing errors: the TTL file cannot be parsed as valid Turtle
  • Empty ontology: no named classes are found in the ontology

Notes

  • Only named classes (URIRef instances) are considered in the analysis

  • Uses skos:definition property specifically for definition identification

  • Coverage assessment follows ontology quality thresholds with higher standards than altLabels: ≥90%: Excellent, ≥75%: Good, ≥50%: Moderate, ≥25%: Low, <25%: Very low

  • Classes are displayed with their preferred labels (skos:prefLabel or rdfs:label) when available

  • show and full_definitions default to “all” and False, respectively
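The rating thresholds and the 150-character truncation mentioned for this function can be pictured with the short sketch below. The helper names are hypothetical, and the "..." suffix on truncated text is an assumption, since the entry does not say how truncation is marked.

```python
# Thresholds taken from the Notes above, highest first.
DEFINITION_THRESHOLDS = [
    (90, "Excellent"), (75, "Good"), (50, "Moderate"), (25, "Low"),
]


def rate_definition_coverage(coverage):
    """Map a coverage percentage to its qualitative rating."""
    for threshold, rating in DEFINITION_THRESHOLDS:
        if coverage >= threshold:
            return rating
    return "Very low"


def preview(definition, full_definitions=False, limit=150):
    """Return the definition, shortened to `limit` characters unless full."""
    if full_definitions or len(definition) <= limit:
        return definition
    return definition[:limit] + "..."  # suffix is an assumption
```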

Note

Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.

Examples

Basic usage:

python script.py ontology.ttl

Show only summary:

python script.py ontology.ttl --show summary

Show full definitions:

python script.py ontology.ttl --full-definitions

Show only classes without definitions:

python script.py ontology.ttl --show without

get_properties_missing_domain_and_range(ttl_file_path: str)[source]

C2 - Missing Domain and Ranges in Properties

Parse an OWL ontology Turtle file and identify object and datatype properties that are missing domain or range declarations.

Author: Van Tran
Version: 0.0.1

Parameters:

ttl_file_path (str) – Path to the Turtle (.ttl) file containing the ontology.

Returns:

A dictionary containing:

  • 'count_missing_domain': int

    Number of properties missing an rdfs:domain declaration.

  • 'properties_missing_domain': list of rdflib.term.URIRef

    List of properties (URIs) missing an rdfs:domain.

  • 'count_missing_range': int

    Number of properties missing an rdfs:range declaration.

  • 'properties_missing_range': list of rdflib.term.URIRef

    List of properties (URIs) missing an rdfs:range.

Return type:

dict

Notes

  • Only properties explicitly typed as owl:ObjectProperty or owl:DatatypeProperty are considered.
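The shape of the returned dictionary can be illustrated with the sketch below, which runs on triples of prefixed names instead of a parsed graph. The function name and the input representation are assumptions; the four dictionary keys are the ones documented above, and plain strings stand in for rdflib.term.URIRef values.

```python
def properties_missing_domain_and_range(properties, triples):
    """Report declared properties that lack rdfs:domain or rdfs:range."""
    with_domain = {s for s, p, _ in triples if p == "rdfs:domain"}
    with_range = {s for s, p, _ in triples if p == "rdfs:range"}
    missing_domain = sorted(set(properties) - with_domain)
    missing_range = sorted(set(properties) - with_range)
    return {
        "count_missing_domain": len(missing_domain),
        "properties_missing_domain": missing_domain,
        "count_missing_range": len(missing_range),
        "properties_missing_range": missing_range,
    }
```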

References

Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.

mainLeafNodeCheck_v_0_0_1(ttl_file)[source]

Ontology Leaf Node Analysis

Analyze an OWL ontology in Turtle (ttl) format and identify all leaf nodes in the class hierarchy. Leaf nodes are classes that have no subclasses, representing the most specific classes in the ontology.

This main function loads an ontology file, identifies all declared classes, and determines which classes are leaf nodes by finding classes that are never used as superclasses.

Definitions

  • Leaf nodes: Classes that have no subclasses, meaning they do not appear as objects in rdfs:subClassOf relationships (or skos:broader)

  • Declared classes: Classes explicitly declared with rdf:type owl:Class or rdfs:Class

  • Hierarchy detection: Uses rdfs:subClassOf relationships to determine class hierarchy (also skos:broader)
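In set terms, the definitions above say that a leaf is a declared class that never appears as the object of a hierarchy triple. A minimal sketch on triples of prefixed names follows; the function name and input shape are hypothetical.

```python
HIERARCHY_PREDICATES = ("rdfs:subClassOf", "skos:broader")


def find_leaf_nodes(declared_classes, triples):
    """Return declared classes that are never used as a superclass."""
    # Objects of hierarchy triples are the classes that have children.
    parents = {o for _, p, o in triples if p in HIERARCHY_PREDICATES}
    return sorted(set(declared_classes) - parents)
```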

Author: Rishabh Kundu
Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.

Returns:

None. This function does not directly return values. It prints analysis results to terminal/CLI. The function may exit early on errors (file not found, parsing errors, or no leaf nodes found).

Output Information

When executed successfully, the analysis provides:
  • Total number of leaf nodes found
  • Complete list of leaf nodes with their prefixed names (sorted alphabetically)

Error Handling

The function handles several error conditions:
  • FileNotFoundError: the specified TTL file cannot be found
  • Parsing errors: the TTL file cannot be parsed as valid Turtle
  • Empty results: no leaf nodes are found in the ontology

Notes

  • Only considers explicitly declared classes (rdf:type owl:Class or rdfs:Class)

  • Uses namespace manager for clean URI representation in output

  • Leaf nodes are sorted alphabetically for consistent display

  • Coverage statistics may be implemented in future versions

Note

Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.

References

  • Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.

Examples

Basic usage:

python script.py ontology.ttl

mainSemanticConnection_v_0_0_1(ttl_file)[source]

Ontology Semantic Connection Analysis

Analyze an OWL ontology in Turtle (ttl) format and assess the semantic connection of class hierarchies to established upper-level ontologies (specifically, Common Core Ontology and Basic Formal Ontology).

This main function loads an ontology file, builds the complete class hierarchy, identifies root classes, and determines which hierarchy chains are semantically grounded in higher-level ontologies through naming convention analysis.

Definitions

  • Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations

  • Class hierarchy: The tree structure of classes connected via rdfs:subClassOf relationships

  • Root classes: Classes that have no parent classes, representing the top level of independent hierarchy trees

  • Semantic connection: Connection to higher-level ontologies (CCO/BFO) determined by URI prefix analysis (cco:, obo:bfo, bfo:)

  • Hierarchy chains: Complete trees of classes rooted at root classes, inheriting the connection status of their root
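The root and connection logic in these definitions can be approximated as below. The sketch works on prefixed class names and (child, parent) pairs; the function name is hypothetical, and comparing prefixes case-insensitively (so that a name like obo:BFO_0000001 matches the obo:bfo pattern) is an assumption of this sketch.

```python
# Prefix patterns named in the definitions above.
UPPER_PREFIXES = ("cco:", "obo:bfo", "bfo:")


def root_connections(classes, subclass_pairs):
    """Map each root class to True if it belongs to CCO/BFO by prefix."""
    children = {child for child, _ in subclass_pairs}
    parents = {parent for _, parent in subclass_pairs}
    # Classes that take part in the hierarchy count as named classes too.
    nodes = set(classes) | children | parents
    roots = sorted(nodes - children)
    return {root: root.lower().startswith(UPPER_PREFIXES) for root in roots}
```

Every class below a root then inherits that root's connection status, which is how the hierarchy chains described above get classified as connected or disconnected.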

Author: Rishabh Kundu
Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.

Returns:

None. This function does not directly return values. It prints comprehensive hierarchy analysis to terminal/CLI. The function may exit early on errors (file not found, parsing errors, no classes found, or no hierarchy relationships found).

Output Information

When executed successfully, the analysis provides:
  • Total number of named classes
  • Number of classes with children (parent classes)
  • Total parent-child relationships
  • Number of root classes
  • Number of root classes connected to higher ontologies
  • Summary of connected vs disconnected hierarchy chains
  • Complete hierarchical tree view with connection status indicators

Error Handling

The function handles several error conditions:
  • FileNotFoundError: the specified TTL file cannot be found
  • Parsing errors: the TTL file cannot be parsed as valid Turtle
  • Empty ontology: no named classes are found
  • Missing hierarchy: no rdfs:subClassOf relationships are found

Notes

  • Only considers explicitly declared classes and rdfs:subClassOf relationships

  • Connection analysis based on URI prefix patterns (cco:, obo:bfo, bfo:)

  • Provides both statistical summary and detailed tree visualization

  • Includes namespace bindings for common ontology prefixes

Note

Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.

Examples

Basic usage:

python script.py ontology.ttl

mainClassNameSpaceCheck_v_0_0_1(ttl_file, export_template=None)[source]

Class Name Space Check Analysis

Scan an OWL ontology in Turtle (ttl) format for class names containing spaces. Spaces in prefixed class names make a TTL file unparseable, so this function works directly on the raw file text rather than parsing it through rdflib.

Handles both single-line and multi-line class declarations by grouping lines into declaration blocks before checking for spaces.

Author: Rishabh Kundu

Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze – input file

  • export_template (str, optional) – Export a CSV report of all class names containing spaces. Provide the desired output filename (e.g. “classes_with_spaces_in_names.csv”). Default is None (no export)

Returns:

  • None – Prints results to the terminal and optionally exports a CSV report.

Output Information

When executed successfully, the analysis provides:

  • Total number of class names with spaces detected

  • Each class name containing a space, with its line number and full line text

Error Handling

  • FileNotFoundError: the specified TTL file cannot be found

Notes

  • Does not use rdflib parsing - works on raw file text so it catches errors that would prevent parsing entirely

  • Detects owl:Class and rdfs:Class declarations

  • Handles multi-line declarations by grouping on ‘.’ block terminators

  • Comments and string literals are handled to avoid false matches
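
The raw-text approach in these notes can be illustrated as follows. This is a simplified sketch under stated assumptions, not the function's actual code: the names iter_blocks and find_spaced_class_names are hypothetical, the type declaration is assumed to be the first predicate of each block, and string literals and inline comments are not handled here.

```python
import re

# Subject text followed by "a" or "rdf:type" and owl:Class / rdfs:Class.
CLASS_DECL = re.compile(
    r"^(?P<subject>.+?)\s+(?:a|rdf:type)\s+(?:owl|rdfs):Class\b"
)


def iter_blocks(text):
    """Group lines into declaration blocks that end with a '.' terminator."""
    start, buf = None, []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blanks, full-line comments, and directive lines.
        if not stripped or stripped.startswith(("#", "@prefix", "@base")):
            continue
        if start is None:
            start = lineno
        buf.append(stripped)
        if stripped.endswith("."):
            yield start, " ".join(buf)
            start, buf = None, []


def find_spaced_class_names(text):
    """Return (line number, subject) pairs for class names containing a space."""
    hits = []
    for lineno, block in iter_blocks(text):
        match = CLASS_DECL.match(block)
        if match and " " in match.group("subject"):
            hits.append((lineno, match.group("subject")))
    return hits
```

Because the text is never handed to a Turtle parser, a declaration such as `ex:Bad Name a owl:Class .` is reported instead of aborting the whole analysis.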

Note

Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.

Examples

Basic usage:

mainClassNameSpaceCheck_v_0_0_1("ontology.ttl")

Export CSV report:

mainClassNameSpaceCheck_v_0_0_1("ontology.ttl", export_template="classes_with_spaces_in_names.csv")

mainLabelCheck_v_0_0_1(ttl_file, show='all', export_template=None)[source]

RDFS Label Coverage Analysis

Analyze an OWL ontology in Turtle (ttl) format to assess the coverage and quality of RDFS labels (rdfs:label) across all named classes.

This function loads an ontology file, identifies all named classes, and reports label coverage, with several display options and an optional CSV export.

Definitions

  • Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations

  • Valid labels: rdfs:label values that are non-empty strings after whitespace trimming. Empty strings and whitespace-only labels are not counted as valid labels

  • Coverage percentage: The proportion of named classes that have at least one valid rdfs:label
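
The definitions above can be made concrete with a small sketch. It assumes the labels have already been collected into a plain dictionary mapping each class to its list of rdfs:label strings; the helper names has_valid_label and label_coverage are hypothetical and are not part of OntoCheck.

```python
def has_valid_label(labels):
    """A label is valid if it is a non-empty string after whitespace trimming."""
    return any(isinstance(label, str) and label.strip() for label in labels)


def label_coverage(labels_by_class):
    """Return the labelled classes and the coverage percentage."""
    labelled = sorted(
        cls for cls, labels in labels_by_class.items() if has_valid_label(labels)
    )
    total = len(labels_by_class)
    pct = 100.0 * len(labelled) / total if total else 0.0
    return labelled, pct


if __name__ == "__main__":
    example = {
        "ex:A": ["Alpha"],
        "ex:B": ["   "],         # whitespace only: treated as missing
        "ex:C": [],              # no label at all
        "ex:D": ["", "Delta"],   # one valid label is enough
    }
    print(label_coverage(example))
```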

Author: Rishabh Kundu

Version: 0.0.1

Parameters:
  • ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.

  • show (str, optional) –

    Display option controlling what information to show:

    • "all" (default): Shows summary statistics, classes with labels, and classes without labels

    • "with": Shows only classes that have rdfs:label

    • "without": Shows only classes that lack rdfs:label

    • "summary": Shows only summary statistics

  • export_template (str, optional) – Export a CSV template file for classes missing rdfs:label. Provide the desired output filename (e.g. "missing_labels_in_classes.csv"). Default is None (no export).

Returns:
  • None – This function does not return a value. It prints the analysis results to the terminal and optionally exports a CSV template file. It may exit early on errors (file not found, parsing errors, or no classes found).

Output Information

When executed successfully, the analysis provides:

  • Total number of named classes analyzed

  • Number of classes with a valid rdfs:label

  • Number of classes lacking a valid rdfs:label

  • Coverage percentage of classes with rdfs:label

  • Prefixed class name and full URI/IRI for each class

Error Handling

  • FileNotFoundError: the specified TTL file cannot be found

  • Parsing errors: the TTL file cannot be parsed as valid Turtle

  • Empty ontology: no named classes are found in the ontology

Notes

  • Only named classes (URIRef instances) are considered in the analysis

  • Empty strings and whitespace-only labels are treated as missing labels

  • Classes are displayed with both their prefixed name and full URI/IRI

  • show and export_template default to "all" and None respectively, so CSV export must be requested explicitly

Note

Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.

Examples

Basic usage (show all):

mainLabelCheck_v_0_0_1("ontology.ttl")

Show only summary:

mainLabelCheck_v_0_0_1("ontology.ttl", show="summary")

Show only classes missing labels:

mainLabelCheck_v_0_0_1("ontology.ttl", show="without")

Export CSV template for missing labels:

mainLabelCheck_v_0_0_1("ontology.ttl", export_template="missing_labels.csv")

Export template while showing summary only:

mainLabelCheck_v_0_0_1("ontology.ttl", show="summary", export_template="missing_labels.csv")
  • In both export examples, the filename may include a directory path to control where the template is written

mainClassSearch_v_0_0_1(ontology_graph_or_path, search_term: str) → list[source]

Class Name Substring Search

Searches an ontology for all class names that contain a specified substring, ignoring capitalization.

This metric helps determine whether a concept can be located by string matching on class names, which is useful for identifying overlapping concepts, naming inconsistencies, or potential duplicates (e.g., ‘VoltageRating’ and ‘VoltAgeRange’).

Author: Redad Mehdi

Version: 0.0.1

Parameters:
  • ontology_graph_or_path (str or rdflib.Graph) – Path to the Turtle (.ttl) file to analyze, or a pre-loaded rdflib.Graph object.

  • search_term (str) – The string to search for within the class names (case-insensitive).

Returns:

A list of dictionaries, one per matched class. Each dictionary contains:

  • 'class_name' (str): The extracted local name of the class.

  • 'uri' (str): The full URI string of the class.

Returns an empty list if no matches are found.

Return type:

list of dict

Notes:

The function identifies class names by looking at subjects typed as owl:Class or rdfs:Class. It extracts the local name by splitting the URI at the last ‘#’ or ‘/’ character before performing the case-insensitive comparison.
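
The matching rule described in this note can be sketched without rdflib. The sketch assumes the class URIs have already been collected into a list of strings; local_name and search_class_names are hypothetical helper names, not part of the OntoCheck API.

```python
def local_name(uri: str) -> str:
    """Split the URI at the last '#' or '/' and keep the trailing part."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[cut + 1:]


def search_class_names(class_uris, search_term):
    """Case-insensitive substring match on each class's local name."""
    needle = search_term.lower()
    return [
        {"class_name": local_name(uri), "uri": uri}
        for uri in class_uris
        if needle in local_name(uri).lower()
    ]


if __name__ == "__main__":
    uris = [
        "http://example.org/onto#VoltageRating",
        "http://example.org/onto#VoltAgeRange",
        "http://example.org/onto/Current",
    ]
    for hit in search_class_names(uris, "VOLtage"):
        print(hit["class_name"], hit["uri"])
```

Matching on the local name rather than the full URI keeps the namespace (here `example.org`) from producing spurious hits.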

Example:

>>> matches = mainClassSearch_v_0_0_1('dataset.ttl', 'VOLtage')
>>> print(f"Found {len(matches)} matching classes.")

run_ontology_assessment(ttl_file, metrics, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')[source]

Run task-agnostic metrics on a single ontology (Mode 1).

Parameters:
  • ttl_file (str) – Path to the input Turtle (.ttl) ontology file.

  • metrics (list of str or str) – Metric names to execute, or "all" to run every metric in METRIC_DISPATCHER.

  • output_log_file (str, optional) – Output log file path.

  • output_csv_file (str, optional) – Output CSV file path.

run_task_based_assessment(ttl_files, questions, domain_prefixes, domain_ns_fragments=None, metrics=None, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')[source]

Assess one or more ontologies against competency questions (Modes 3/4).

When a single ontology is provided this corresponds to Mode 3 (task-based scientific assessment). When multiple ontologies are provided they are merged and evaluated jointly, corresponding to Mode 4 (cross-domain assessment).

Parameters:
  • ttl_files (str or list of str) – Path(s) to Turtle (.ttl) ontology file(s). A single path is accepted and will be wrapped in a list internally.

  • questions (str or list of str) – Path to a JSON/Markdown file of SPARQL queries, or a list of raw SPARQL query strings.

  • domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries (e.g., ["mds"]).

  • domain_ns_fragments (list of str or None, optional) – Namespace URI fragments to restrict domain-term filtering.

  • metrics (list of str or None, optional) – Task-agnostic metric names to run alongside the task-based assessment. Pass "all" for every available metric.

  • output_log_file (str, optional) – Output log file path.

  • output_csv_file (str, optional) – Output CSV file path.

Returns:

The result dictionary from task_based_metric_v_0_0_1.

Return type:

dict
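
The relationship between the ttl_files argument and the assessment mode can be sketched as below. This is an illustration only: normalize_ttl_files and assessment_mode are hypothetical helpers, and the runner's internal handling may differ.

```python
from pathlib import Path


def normalize_ttl_files(ttl_files):
    """Wrap a single path in a list; pass sequences through as a list."""
    if isinstance(ttl_files, (str, Path)):
        return [ttl_files]
    return list(ttl_files)


def assessment_mode(ttl_files):
    """One ontology corresponds to Mode 3; several merged ontologies to Mode 4."""
    return 3 if len(normalize_ttl_files(ttl_files)) == 1 else 4


if __name__ == "__main__":
    print(assessment_mode("materials.ttl"))                     # single ontology
    print(assessment_mode(["materials.ttl", "processes.ttl"]))  # merged ontologies
```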

run_web_ontology_assessment(ttl_file, questions, domain_prefixes, knowledge_graph, domain_ns_fragments=None, metrics=None, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')[source]

Assess a Web ontology against KGQA benchmark queries (Mode 2).

Runs the task-based Relevance/Accuracy assessment using competency queries drawn from a knowledge-graph question-answering benchmark (e.g., LC-QuAD over DBpedia). Optionally runs task-agnostic metrics as well.

Parameters:
  • ttl_file (str) – Path to the ontology Turtle file.

  • questions (str or list of str) – Path to a JSON/Markdown file of SPARQL queries, or a list of raw SPARQL query strings.

  • domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries (e.g., ["dbo"]).

  • knowledge_graph (str) – Path to the knowledge-graph file (Turtle/RDF) used for validation context.

  • domain_ns_fragments (list of str or None, optional) – Namespace URI fragments to restrict domain-term filtering.

  • metrics (list of str or None, optional) – Task-agnostic metric names to run alongside the task-based assessment. Pass "all" for every available metric.

  • output_log_file (str, optional) – Output log file path.

  • output_csv_file (str, optional) – Output CSV file path.

task_based_metric_v_0_0_1(ttl_file, questions, domain_prefixes, domain_ns_fragments=None)[source]

Compute task-based Relevance and Accuracy for an ontology.

Given an ontology (one or more Turtle files) and a set of competency questions expressed as SPARQL queries, this function computes two term-overlap metrics:

Relevance (Recall) = |T_a intersection T_o| / |T_a|

Accuracy (Precision) = |T_a intersection T_o| / |T_o|

where T_a is the union of domain terms extracted from all SPARQL queries and T_o is the set of domain terms defined in the ontology.
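
As a worked illustration of the two formulas, the following sketch computes both ratios from plain Python sets. The function name relevance_and_accuracy is hypothetical, and returning 0.0 for an empty set is an assumption made here to avoid division by zero, not documented behaviour.

```python
def relevance_and_accuracy(task_terms, onto_terms):
    """Compute the term-overlap metrics from T_a (tasks) and T_o (ontology)."""
    t_a, t_o = set(task_terms), set(onto_terms)
    common = t_a & t_o
    return {
        # Recall: share of task terms that the ontology defines.
        "relevance": len(common) / len(t_a) if t_a else 0.0,
        # Precision: share of ontology terms that the tasks reference.
        "accuracy": len(common) / len(t_o) if t_o else 0.0,
        "intersection": len(common),
        "missing_from_onto": t_a - t_o,
        "unused_in_onto": t_o - t_a,
    }


if __name__ == "__main__":
    scores = relevance_and_accuracy(
        {"Sample", "Voltage", "Current"},
        {"Sample", "Voltage", "Resistance", "Power"},
    )
    print(f"Relevance: {scores['relevance']:.2%}")
    print(f"Accuracy:  {scores['accuracy']:.2%}")
```

With three task terms, four ontology terms, and two shared terms, relevance is 2/3 and accuracy is 2/4.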

Parameters:
  • ttl_file (str, pathlib.Path, or list thereof) – Path(s) to Turtle (.ttl) ontology file(s). A single string or Path is automatically wrapped in a list.

  • questions (str, pathlib.Path, or list of str) –

    The competency questions to evaluate against. Accepted forms:

    • str / Path ending in .json – path to a JSON file where each array element has a sparql_query key.

    • str / Path ending in .md – path to a Markdown file with SPARQL queries inside fenced sparql code blocks.

    • list of str – raw SPARQL query strings.

  • domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries to identify domain terms (e.g., ["mds"]).

  • domain_ns_fragments (list of str or None, optional) – Sub-strings of namespace URIs used to restrict which ontology terms count as domain-specific. When None, every non-foundational term is included.
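
To show how domain_prefixes can identify task terms, here is a minimal regex-based sketch. It is an assumption about one workable approach rather than the library's extraction logic, and extract_task_terms is a hypothetical name.

```python
import re


def extract_task_terms(queries, domain_prefixes):
    """Collect local names of prefixed terms (e.g. mds:Sample) across queries."""
    if not domain_prefixes:
        return set()
    prefixes = "|".join(re.escape(p) for p in domain_prefixes)
    # The lookbehind stops "xmds:Foo" from matching the prefix "mds".
    # A letter or underscore is required after the colon, so PREFIX
    # declarations such as "PREFIX mds: <...>" are not counted as terms.
    pattern = re.compile(rf"(?<![\w-])(?:{prefixes}):([A-Za-z_][\w-]*)")
    terms = set()
    for query in queries:
        terms.update(pattern.findall(query))
    return terms


if __name__ == "__main__":
    queries = [
        "PREFIX mds: <https://example.org/mds/> "
        "SELECT ?s WHERE { ?s a mds:Sample ; mds:hasVoltage ?v }",
        "SELECT ?s WHERE { ?s a mds:Sample . ?s rdfs:label ?l }",
    ]
    print(sorted(extract_task_terms(queries, ["mds"])))
```

The union over all queries corresponds to T_a; terms under other prefixes, such as rdfs:label, are ignored.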

Returns:

A dictionary with the following keys:

  • relevance (float): Recall – fraction of task terms present in the ontology.

  • accuracy (float): Precision – fraction of ontology terms referenced by the tasks.

  • T_o_count (int): Number of ontology domain terms.

  • T_a_count (int): Number of unique task terms.

  • intersection (int): Number of terms in both sets.

  • missing_from_onto (set of str): Task terms absent from the ontology.

  • unused_in_onto (set of str): Ontology terms not referenced by any task query.

Return type:

dict

Raises:

ValueError – If questions is not a recognized type (list, JSON path, or Markdown path).
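
The three accepted forms of questions, and the ValueError for anything else, can be sketched as a small dispatcher. This is a hedged illustration: load_queries is a hypothetical helper, and the Markdown pattern assumes blocks opened by a fence tagged sparql on its own line.

```python
import json
import re
from pathlib import Path

FENCE = "`" * 3  # Markdown code fence, built this way to keep the example readable
SPARQL_BLOCK = re.compile(
    re.escape(FENCE) + r"sparql\s*\n(.*?)" + re.escape(FENCE), re.DOTALL
)


def load_queries(questions):
    """Return a list of SPARQL strings from a list, a .json path, or a .md path."""
    if isinstance(questions, list):
        return list(questions)
    if isinstance(questions, (str, Path)):
        path = Path(questions)
        suffix = path.suffix.lower()
        if suffix == ".json":
            # Each array element is expected to carry a "sparql_query" key.
            data = json.loads(path.read_text(encoding="utf-8"))
            return [item["sparql_query"] for item in data]
        if suffix == ".md":
            text = path.read_text(encoding="utf-8")
            return [block.strip() for block in SPARQL_BLOCK.findall(text)]
    raise ValueError(f"Unrecognized questions input: {questions!r}")
```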

Examples

>>> result = task_based_metric_v_0_0_1(
...     ttl_file="my_ontology.ttl",
...     questions="competency_questions.json",
...     domain_prefixes=["mds"],
...     domain_ns_fragments=["cwrusdle.bitbucket.io/mds"],
... )
>>> print(f"Relevance: {result['relevance']:.2%}")
>>> print(f"Accuracy:  {result['accuracy']:.2%}")