API Reference
This section documents the public API of OntoCheck, organized by assessment category. Each metric function takes a Turtle file path as input and either returns a score or prints a diagnostic report.
Assessment Runners
Functions for running each of the four assessment modes.
Ontology Assessment Runner
Provides runner functions for the four OntoCheck assessment modes:
- Mode 1 – Task-agnostic: structural, labeling, accessibility, and
naming-convention metrics applied to a single ontology.
- Mode 2 – Task-specific Web ontology: task-based Relevance/Accuracy
validated against a knowledge graph (e.g., DBpedia via LC-QuAD).
- Mode 3 – Task-based Scientific: domain ontology assessed against
competency questions encoded as SPARQL queries.
- Mode 4 – Cross-Domain: multiple ontologies merged and assessed against
cross-domain competency questions.
- run_ontology_assessment(ttl_file, metrics, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')[source]
Run task-agnostic metrics on a single ontology (Mode 1).
- run_web_ontology_assessment(ttl_file, questions, domain_prefixes, knowledge_graph, domain_ns_fragments=None, metrics=None, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')[source]
Assess a Web ontology against KGQA benchmark queries (Mode 2).
Runs the task-based Relevance/Accuracy assessment using competency queries drawn from a knowledge-graph question-answering benchmark (e.g., LC-QuAD over DBpedia). Optionally runs task-agnostic metrics as well.
- Parameters:
ttl_file (str) – Path to the ontology Turtle file.
questions (str or list of str) – Path to a JSON/Markdown file of SPARQL queries, or a list of raw SPARQL query strings.
domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries (e.g., ["dbo"]).
knowledge_graph (str) – Path to the knowledge-graph file (Turtle/RDF) used for validation context.
domain_ns_fragments (list of str or None, optional) – Namespace URI fragments to restrict domain-term filtering.
metrics (list of str or None, optional) – Task-agnostic metric names to run alongside the task-based assessment. Pass "all" for every available metric.
output_log_file (str, optional) – Output log file path.
output_csv_file (str, optional) – Output CSV file path.
- run_task_based_assessment(ttl_files, questions, domain_prefixes, domain_ns_fragments=None, metrics=None, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')[source]
Assess one or more ontologies against competency questions (Modes 3/4).
When a single ontology is provided this corresponds to Mode 3 (task-based scientific assessment). When multiple ontologies are provided they are merged and evaluated jointly, corresponding to Mode 4 (cross-domain assessment).
- Parameters:
ttl_files (str or list of str) – Path(s) to Turtle (.ttl) ontology file(s). A single path is accepted and will be wrapped in a list internally.
questions (str or list of str) – Path to a JSON/Markdown file of SPARQL queries, or a list of raw SPARQL query strings.
domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries (e.g., ["mds"]).
domain_ns_fragments (list of str or None, optional) – Namespace URI fragments to restrict domain-term filtering.
metrics (list of str or None, optional) – Task-agnostic metric names to run alongside the task-based assessment. Pass "all" for every available metric.
output_log_file (str, optional) – Output log file path.
output_csv_file (str, optional) – Output CSV file path.
- Returns:
The result dictionary from task_based_metric_v_0_0_1.
- Return type:
dict
Task-Based Metric
The underlying Relevance/Accuracy computation used by Modes 2, 3, and 4.
Task-Based Ontology Assessment Metric
Evaluates an ontology against a set of competency questions (encoded as SPARQL queries) by computing term-overlap metrics. For each question set, two scores are produced:
Relevance (Recall) = |T_a intersection T_o| / |T_a|
Accuracy (Precision) = |T_a intersection T_o| / |T_o|
where T_a is the set of domain terms referenced in the SPARQL queries (the “task vocabulary”) and T_o is the set of domain terms defined in the ontology.
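As a rough illustration of the two formulas above, the sketch below computes both scores from two plain Python sets. The function name `relevance_accuracy`, the sample terms, and the choice to return 0.0 for an empty set are assumptions made for this example; they are not taken from the OntoCheck source.

```python
def relevance_accuracy(task_terms, onto_terms):
    """Return (relevance, accuracy) for two collections of term names."""
    t_a = set(task_terms)   # T_a: terms referenced by the SPARQL queries
    t_o = set(onto_terms)   # T_o: terms defined in the ontology
    shared = t_a & t_o
    # Assumption: an empty set yields 0.0 instead of a division error.
    relevance = len(shared) / len(t_a) if t_a else 0.0
    accuracy = len(shared) / len(t_o) if t_o else 0.0
    return relevance, accuracy


# Hypothetical vocabularies: two of four task terms exist in the ontology,
# and two of three ontology terms are used by the tasks.
task_terms = {"mds:Sample", "mds:Furnace", "mds:Alloy", "mds:hasTemperature"}
onto_terms = {"mds:Sample", "mds:Alloy", "mds:Grain"}
relevance, accuracy = relevance_accuracy(task_terms, onto_terms)
print(f"Relevance: {relevance:.2%}, Accuracy: {accuracy:.2%}")
```

A low relevance points at task terms the ontology lacks, while a low accuracy points at ontology terms no competency question exercises.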
- Questions can be supplied as:
A path to a JSON file where each item contains a sparql_query key.
A path to a Markdown file with SPARQL queries inside fenced sparql blocks.
A plain list of SPARQL query strings.
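One way to handle the three input forms listed above is a small dispatcher on type and file suffix. This is a minimal sketch, not the library's loader: `load_queries`, the regular expression, and the error message are assumptions for illustration. Only the `sparql_query` key, the `.json`/`.md` suffixes, and the `ValueError` on unrecognized input come from this reference.

```python
import json
import re
from pathlib import Path

# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3
SPARQL_BLOCK = re.compile(FENCE + r"sparql[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def load_queries(questions):
    """Return a list of SPARQL query strings from any of the three forms."""
    if isinstance(questions, list):
        return list(questions)
    if not isinstance(questions, (str, Path)):
        raise ValueError("questions must be a list, a .json path, or a .md path")
    path = Path(questions)
    if path.suffix == ".json":
        # Each array element is expected to carry a "sparql_query" key.
        items = json.loads(path.read_text(encoding="utf-8"))
        return [item["sparql_query"] for item in items]
    if path.suffix == ".md":
        # Collect the body of every fenced sparql block.
        text = path.read_text(encoding="utf-8")
        return [block.strip() for block in SPARQL_BLOCK.findall(text)]
    raise ValueError("questions must be a list, a .json path, or a .md path")
```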
- task_based_metric_v_0_0_1(ttl_file, questions, domain_prefixes, domain_ns_fragments=None)[source]
Compute task-based Relevance and Accuracy for an ontology.
Given an ontology (one or more Turtle files) and a set of competency questions expressed as SPARQL queries, this function computes two term-overlap metrics:
Relevance (Recall) = |T_a intersection T_o| / |T_a|
Accuracy (Precision) = |T_a intersection T_o| / |T_o|
where T_a is the union of domain terms extracted from all SPARQL queries and T_o is the set of domain terms defined in the ontology.
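To make the construction of T_a concrete, the sketch below collects prefixed names from a list of query strings for a given set of domain prefixes. It is an approximation based on a regular expression, which is an assumption of this example; the actual extraction logic in OntoCheck is not documented here and may differ. The function name `extract_domain_terms` and the sample queries are hypothetical.

```python
import re


def extract_domain_terms(queries, domain_prefixes):
    """Return the union of prefix:LocalName terms found in all queries."""
    terms = set()
    for prefix in domain_prefixes:
        # Match e.g. "mds:Sample" but not "xmds:Sample" or the bare
        # "mds:" that appears in a PREFIX declaration.
        pattern = re.compile(
            r"(?<![\w-])" + re.escape(prefix) + r":([A-Za-z_][\w-]*)"
        )
        for query in queries:
            for local in pattern.findall(query):
                terms.add(f"{prefix}:{local}")
    return terms


queries = [
    "PREFIX mds: <https://example.org/mds/>\n"
    "SELECT ?s WHERE { ?s a mds:Sample ; mds:hasTemperature ?t . }",
    "SELECT ?x WHERE { ?x rdfs:label ?l ; a mds:Sample }",
]
task_terms = extract_domain_terms(queries, ["mds"])
print(sorted(task_terms))
```

Terms under prefixes that are not listed in `domain_prefixes` (here `rdfs:label`) are ignored, which is what keeps foundational vocabulary out of the task set.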
- Parameters:
ttl_file (str, pathlib.Path, or list thereof) – Path(s) to Turtle (.ttl) ontology file(s). A single string or Path is automatically wrapped in a list.
questions (str, pathlib.Path, or list of str) – The competency questions to evaluate against. Accepted forms:
str / Path ending in .json – path to a JSON file where each array element has a sparql_query key.
str / Path ending in .md – path to a Markdown file with SPARQL queries inside fenced sparql code blocks.
list of str – raw SPARQL query strings.
domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries to identify domain terms (e.g., ["mds"]).
domain_ns_fragments (list of str or None, optional) – Sub-strings of namespace URIs used to restrict which ontology terms count as domain-specific. When None, every non-foundational term is included.
- Returns:
A dictionary with the following keys:
relevance (float): Recall – fraction of task terms present in the ontology.
accuracy (float): Precision – fraction of ontology terms referenced by the tasks.
T_o_count (int): Number of ontology domain terms.
T_a_count (int): Number of unique task terms.
intersection (int): Number of terms in both sets.
missing_from_onto (set of str): Task terms absent from the ontology.
unused_in_onto (set of str): Ontology terms not referenced by any task query.
- Return type:
dict
- Raises:
ValueError – If questions is not a recognized type (list, JSON path, or Markdown path).
Examples
>>> result = task_based_metric_v_0_0_1(
...     ttl_file="my_ontology.ttl",
...     questions="competency_questions.json",
...     domain_prefixes=["mds"],
...     domain_ns_fragments=["cwrusdle.bitbucket.io/mds"],
... )
>>> print(f"Relevance: {result['relevance']:.2%}")
>>> print(f"Accuracy: {result['accuracy']:.2%}")
Labeling Metrics
Metrics that quantify the proportion of named classes carrying human-readable identifiers, synonyms, and formal definitions.
ontocheck.check_label
- mainLabelCheck_v_0_0_1(ttl_file, show='all', export_template=None)[source]
RDFS Label Coverage Analysis
Analyze an OWL ontology in Turtle (ttl) format to assess the coverage and quality of RDFS labels (rdfs:label) across all named classes
This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of label coverage with various display options and export capabilities
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Valid labels: rdfs:label values that are non-empty strings after whitespace trimming. Empty strings and whitespace-only labels are not counted as valid labels
Coverage percentage: The proportion of named classes that have at least one valid rdfs:label
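The definitions above translate into a short computation. The sketch below works on a plain dictionary that maps class names to their rdfs:label strings, so it does not depend on any RDF library. The function name `label_coverage`, the returned keys, and the sample classes are assumptions made for illustration.

```python
def label_coverage(labels_by_class):
    """Split classes by whether they carry at least one valid label."""
    with_label, without_label = [], []
    for cls, labels in sorted(labels_by_class.items()):
        # A label is valid only if it is non-empty after trimming whitespace.
        if any(label.strip() for label in labels):
            with_label.append(cls)
        else:
            without_label.append(cls)
    total = len(labels_by_class)
    coverage = 100.0 * len(with_label) / total if total else 0.0
    return {
        "total": total,
        "with": with_label,
        "without": without_label,
        "coverage_pct": coverage,
    }


report = label_coverage({
    "ex:Sample": ["Sample"],
    "ex:Furnace": ["   "],      # whitespace only: counted as missing
    "ex:Alloy": [],             # no label at all
    "ex:Grain": ["", "Grain"],  # one valid label is enough
})
print(f"{report['coverage_pct']:.1f}% of {report['total']} classes are labeled")
```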
Author: Rishabh Kundu Version: 0.0.1
- param ttl_file:
Path to the ontology Turtle (.ttl) file to analyze – input file
- type ttl_file:
str
- param show:
Display option controlling what information to show:
- “all” (default): Shows summary statistics, classes with labels, and classes without labels
- “with”: Shows only classes that have rdfs:label
- “without”: Shows only classes that lack rdfs:label
- “summary”: Shows only summary statistics
- type show:
str, optional
- param export_template:
Export a CSV template file for classes missing rdfs:label. Provide the desired output filename (e.g. “missing_labels_in_classes.csv”). Default is None (no export)
- type export_template:
str, optional
- returns:
None – This function does not directly return values. It prints analysis results to your terminal/CLI and optionally exports a CSV template file. The function may exit early on errors (file not found, parsing errors, or no classes found)
Output Information
When executed successfully, the analysis provides
- Total number of named classes analyzed
- Number of classes with valid rdfs:label properties
- Number of classes lacking valid rdfs:label properties
- Coverage percentage of classes with rdfs:label
- Prefixed class name and full URI/IRI for each class
Error Handling
- FileNotFoundError: when the specified TTL file cannot be found
- Parsing errors: when the TTL file cannot be parsed as valid Turtle
- Empty ontology: when no named classes are found in the ontology
Notes
Only named classes (URIRef instances) are considered in the analysis
Empty strings and whitespace-only labels are treated as missing labels
Classes are displayed with both their prefixed name and full URI/IRI
The show and export_template parameters default to “all” and None, so CSV export must be requested explicitly
Note
Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.
Examples
- Basic usage (show all):
mainLabelCheck_v_0_0_1("ontology.ttl")
- Show only summary:
mainLabelCheck_v_0_0_1("ontology.ttl", show="summary")
- Show only classes missing labels:
mainLabelCheck_v_0_0_1("ontology.ttl", show="without")
- Export CSV template for missing labels:
mainLabelCheck_v_0_0_1("ontology.ttl", export_template="missing_labels.csv")
- Export template while showing summary only:
mainLabelCheck_v_0_0_1("ontology.ttl", show="summary", export_template="missing_labels.csv")
In the two export examples above, export_template may also be given as a full output path.
ontocheck.altLabelCheck
mainAltLabelCheck_v_0_0_1 metric implementation.
- mainAltLabelCheck_v_0_0_1(ttl_file, show='all', export_template=None)[source]
SKOS Alternative Label Coverage Analysis
Analyze an OWL ontology in Turtle (ttl) format to assess the coverage and quality of SKOS alternative labels (skos:altLabel) across all named classes
This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of alternative label coverage with various display options and export capabilities
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Valid altLabels: SKOS altLabel values that are non-empty strings after whitespace trimming. Empty strings and whitespace-only altLabels are not counted as valid altLabels
Coverage percentage: The proportion of named classes that have at least one valid altLabel
Author: Rishabh Kundu Version: 0.0.1
- param ttl_file:
Path to the ontology Turtle (.ttl) file to analyze – input file
- type ttl_file:
str
- param show:
Display option controlling what information to show:
- “all” (default): Shows summary statistics, classes with altLabels, and classes without altLabels
- “with”: Shows only classes that have altLabels
- “without”: Shows only classes that lack altLabels
- “summary”: Shows only summary statistics
- type show:
str, optional
- param export_template:
Export a Turtle format template file for classes missing altLabels. Provide the desired output filename. Default is None (no export).
- type export_template:
str, optional
- returns:
None – This function does not (directly) return values. It prints analysis results to your terminal/CLI and optionally exports a template file. The function may exit early on errors (file not found, parsing errors, or no classes found)
Output Information
When executed successfully, the analysis provides
- Total number of named classes analyzed
- Number of classes with valid altLabel properties
- Number of classes lacking valid altLabel properties
- Coverage percentage of classes with altLabels
- Total count of altLabel instances across all classes
- Average number of altLabels per class (for classes with altLabels)
- Qualitative assessment based on coverage thresholds
Error Handling
The function handles several error conditions
- FileNotFoundError: when the specified TTL file cannot be found
- Parsing errors: when the TTL file cannot be parsed as valid Turtle
- Empty ontology: when no named classes are found in the ontology
- Template export errors: file I/O issues when exporting templates
Notes
Only named classes (URIRef instances) are considered in the analysis
Empty strings and whitespace-only altLabels are filtered out
Coverage assessment follows established ontology quality thresholds: ≥80%: Excellent, ≥60%: Good, ≥40%: Moderate, ≥20%: Low, <20%: Very low
Classes are displayed with their preferred labels (skos:prefLabel or rdfs:label) when available
Template export generates valid Turtle syntax for adding missing altLabels
The show and export_template parameters default to “all” and None, respectively
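The summary statistics and coverage thresholds described in the notes above can be sketched as follows. The input is a plain dictionary from class name to skos:altLabel strings. The function name `altlabel_summary`, the returned keys, and the sample data are assumptions for this example; the threshold values are the ones stated in the notes.

```python
def altlabel_summary(altlabels_by_class):
    """Compute coverage, altLabel counts, and a qualitative rating."""
    # Drop empty and whitespace-only altLabels before counting.
    valid = {
        cls: [a for a in labels if a.strip()]
        for cls, labels in altlabels_by_class.items()
    }
    covered = [cls for cls, labels in valid.items() if labels]
    total_classes = len(valid)
    total_altlabels = sum(len(labels) for labels in valid.values())
    coverage = 100.0 * len(covered) / total_classes if total_classes else 0.0
    # Average is taken over classes that have at least one altLabel.
    average = total_altlabels / len(covered) if covered else 0.0
    if coverage >= 80:
        rating = "Excellent"
    elif coverage >= 60:
        rating = "Good"
    elif coverage >= 40:
        rating = "Moderate"
    elif coverage >= 20:
        rating = "Low"
    else:
        rating = "Very low"
    return {
        "coverage_pct": coverage,
        "total_altlabels": total_altlabels,
        "average_per_covered_class": average,
        "rating": rating,
    }


summary = altlabel_summary({
    "ex:Sample": ["Specimen", "Test piece"],
    "ex:Furnace": ["Oven"],
    "ex:Alloy": [" "],
    "ex:Grain": [],
    "ex:Phase": [],
})
print(summary)
```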
Note
Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.
Examples
- Basic usage:
python script.py ontology.ttl
- Show only summary:
python script.py ontology.ttl --show summary
- Export template for missing labels:
python script.py ontology.ttl --export-template missing_labels.ttl
ontocheck.defCheck
mainDefCheck_v_0_0_1 metric implementation.
- mainDefCheck_v_0_0_1(ttl_file, show='all', full_definitions=False)[source]
SKOS Definition Coverage Analysis
Analyze an OWL ontology in Turtle (ttl) format and assess the coverage and quality of SKOS definitions (skos:definition) across all named classes
This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of definition coverage with various display options
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Valid definitions: SKOS definition values that exist as properties on classes. Empty or missing definitions are identified as gaps in coverage
Coverage percentage: The proportion of named classes that have at least one skos:definition property
Author: Rishabh Kundu Version: 0.0.1
- param ttl_file:
Path to the ontology Turtle (.ttl) file to analyze
- type ttl_file:
str
- param show:
Display option controlling what information to show:
- “all” (default): Shows summary statistics, classes with definitions, and classes without definitions
- “with”: Shows only classes that have definitions
- “without”: Shows only classes that lack definitions
- “summary”: Shows only summary statistics
- type show:
str, optional
- param full_definitions:
Show full definitions instead of truncated versions (default: False, truncated to 150 chars)
- type full_definitions:
bool, optional
- returns:
None – This function does not (directly) return values. It prints analysis results to terminal/CLI. The function may exit early on errors (file not found, parsing errors, or no classes found)
Output Information
When executed successfully, the analysis provides
- Total number of named classes analyzed
- Number of classes with definition properties
- Number of classes lacking definition properties
- Coverage percentage of classes with definitions
- Qualitative assessment based on coverage thresholds
Error Handling
The function handles several error conditions
- FileNotFoundError: when the specified TTL file cannot be found
- Parsing errors: when the TTL file cannot be parsed as valid Turtle
- Empty ontology: when no named classes are found in the ontology
Notes
Only named classes (URIRef instances) are considered in the analysis
Uses skos:definition property specifically for definition identification
Coverage assessment follows ontology quality thresholds with higher standards than altLabels: ≥90%: Excellent, ≥75%: Good, ≥50%: Moderate, ≥25%: Low, <25%: Very low
Classes are displayed with their preferred labels (skos:prefLabel or rdfs:label) when available
The show and full_definitions parameters default to “all” and False, respectively
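Because definitions are held to stricter thresholds than altLabels, the same coverage percentage can earn different ratings under the two metrics. The table-driven sketch below shows that difference. The names `DEFINITION_BANDS`, `ALTLABEL_BANDS`, and `rate_coverage` are assumptions for this example; the threshold values are those stated in the notes for each metric.

```python
# (minimum percentage, rating), listed from highest to lowest.
DEFINITION_BANDS = [(90, "Excellent"), (75, "Good"), (50, "Moderate"), (25, "Low")]
ALTLABEL_BANDS = [(80, "Excellent"), (60, "Good"), (40, "Moderate"), (20, "Low")]


def rate_coverage(pct, bands):
    """Return the rating of the first band whose floor pct reaches."""
    for floor, rating in bands:
        if pct >= floor:
            return rating
    return "Very low"


for pct in (95, 80, 55, 24.9):
    print(
        f"{pct:>5}%  definitions: {rate_coverage(pct, DEFINITION_BANDS):<10}"
        f" altLabels: {rate_coverage(pct, ALTLABEL_BANDS)}"
    )
```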
Note
Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.
Examples
- Basic usage:
python script.py ontology.ttl
- Show only summary:
python script.py ontology.ttl --show summary
- Show full definitions:
python script.py ontology.ttl --full-definitions
- Show only classes without definitions:
python script.py ontology.ttl --show without
Structural Metrics
Metrics that expose orphaned classes, disconnected subgraphs, undeclared domain and range restrictions, and hierarchy chains lacking grounding in upper-level ontologies.
ontocheck.check_for_isolated_elements
- check_for_isolated_elements(ttl_file: str)[source]
C1 - Number of isolated elements
Analyze an OWL ontology in Turtle format to identify isolated atomic classes and isolated properties.
Definitions
Atomic classes are named classes (with URI) that are NOT constructed classes (i.e., they do not have owl:unionOf, owl:intersectionOf, or owl:complementOf).
- A class (atomic or constructed with URI) is considered connected if it:
participates in rdfs:subClassOf, owl:equivalentClass, or owl:disjointWith relations involving atomic classes, OR
is used as domain or range of properties and contains at least one atomic class inside its construction.
A property is considered connected if it is related by any of: rdfs:subPropertyOf, owl:inverseOf, owl:propertyDisjointWith, or owl:equivalentProperty.
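The property half of this definition is simple enough to sketch on plain triples. In the example below, triples are tuples of prefixed-name strings rather than rdflib terms, and a property counts as connected when it appears on either side of one of the four listed predicates. Treating both subject and object positions as connected is an assumption of this sketch, as are the function name `isolated_properties` and the sample data.

```python
PROPERTY_TYPES = {"owl:ObjectProperty", "owl:DatatypeProperty"}
PROPERTY_LINKS = {
    "rdfs:subPropertyOf",
    "owl:inverseOf",
    "owl:propertyDisjointWith",
    "owl:equivalentProperty",
}


def isolated_properties(triples):
    """Return declared properties that no property-level axiom touches."""
    declared = {
        s for s, p, o in triples if p == "rdf:type" and o in PROPERTY_TYPES
    }
    linked = set()
    for s, p, o in triples:
        if p in PROPERTY_LINKS:
            linked.update((s, o))
    return sorted(declared - linked)


triples = [
    ("ex:hasPart", "rdf:type", "owl:ObjectProperty"),
    ("ex:partOf", "rdf:type", "owl:ObjectProperty"),
    ("ex:hasPart", "owl:inverseOf", "ex:partOf"),
    ("ex:hasMass", "rdf:type", "owl:DatatypeProperty"),
]
print(isolated_properties(triples))
```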
Author: Van Tran Version: 0.0.1
- param ttl_file:
File path to the ontology Turtle (.ttl) file.
- type ttl_file:
str
Prints
Lists of isolated atomic classes and isolated properties.
Notes
Only named classes explicitly declared as owl:Class are considered.
Only properties explicitly declared as owl:ObjectProperty or owl:DatatypeProperty are considered.
Relations checked for classes include rdfs:subClassOf, owl:equivalentClass, owl:disjointWith, and usage as domain or range of properties.
Relations checked for properties include rdfs:subPropertyOf, owl:inverseOf, owl:propertyDisjointWith, and owl:equivalentProperty.
References
Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.
ontocheck.count_class_connected_components
- count_class_connected_components(ttl_file_path: str) → int[source]
C3 - Count number of connected subgraphs
Count the number of connected components in the “class graph” (TBox) of an OWL ontology. The “class graph” is constructed by creating undirected edges between classes that are connected by any of the following OWL predicates:
rdfs:subClassOf
owl:equivalentClass
owl:disjointWith
Nodes in the graph represent named classes (URIRefs) from the ontology. Edges represent these relationships between named classes. Classes involved only in subclass/equivalent/disjointWith axioms pointing to blank nodes (i.e. constructed classes) are excluded.
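Counting connected components of an undirected graph is commonly done with a union-find structure, sketched below on plain strings. This illustrates the counting step only and swaps in union-find as one possible technique; how OntoCheck builds the graph and counts components internally is not shown in this reference. The sketch assumes the class list has already been filtered as described above, counts a class with no edges as its own component, and ignores edges whose endpoints are not named classes. The name `count_components` is hypothetical.

```python
def count_components(classes, edges):
    """Count connected components among classes linked by undirected edges."""
    parent = {c: c for c in classes}

    def find(node):
        # Follow parent links to the representative, halving the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        # Skip edges that touch anything outside the named-class set,
        # such as blank nodes standing for constructed classes.
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    return len({find(c) for c in classes})


classes = ["ex:A", "ex:B", "ex:C", "ex:D", "ex:E", "ex:F"]
edges = [
    ("ex:A", "ex:B"),   # e.g. rdfs:subClassOf
    ("ex:B", "ex:C"),   # e.g. owl:equivalentClass
    ("ex:D", "ex:E"),   # e.g. owl:disjointWith
    ("ex:A", "_:b1"),   # blank node: ignored
]
print(count_components(classes, edges))
```

A result greater than one indicates that the class graph splits into groups with no subclass, equivalence, or disjointness axiom between them.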
Author: Van Tran Version: 0.0.1
- Parameters:
ttl_file_path (str) – Path to the ontology Turtle (.ttl) file.
- Returns:
The number of connected components in the class graph.
- Return type:
int
Notes
The graph is undirected; directionality of subclass relations is ignored.
Only named OWL classes are considered.
Classes that participate in subclass/equivalent/disjoint axioms involving blank nodes are excluded.
ontocheck.get_properties_missing_domain_and_range
- get_properties_missing_domain_and_range(ttl_file_path: str)[source]
C2 - Missing Domain and Ranges in Properties
Parse an OWL ontology Turtle file and identify object and datatype properties that are missing domain or range declarations.
Author: Van Tran Version: 0.0.1
- Parameters:
ttl_file_path (str) – Path to the Turtle (.ttl) file containing the ontology.
- Returns:
A dictionary containing:
- 'count_missing_domain': int
Number of properties missing an rdfs:domain declaration.
- 'properties_missing_domain': list of rdflib.term.URIRef
List of properties (URIs) missing an rdfs:domain.
- 'count_missing_range': int
Number of properties missing an rdfs:range declaration.
- 'properties_missing_range': list of rdflib.term.URIRef
List of properties (URIs) missing an rdfs:range.
- Return type:
dict
Notes
Only properties explicitly typed as owl:ObjectProperty or owl:DatatypeProperty are considered.
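The check itself can be pictured on plain triples, as in the sketch below. It returns a dictionary with the same four keys as documented above, but holds strings instead of `rdflib.term.URIRef` values; that substitution, the function name `missing_domain_and_range`, and the sample triples are assumptions made so the example runs without rdflib.

```python
PROPERTY_TYPES = {"owl:ObjectProperty", "owl:DatatypeProperty"}


def missing_domain_and_range(triples):
    """Report declared properties that lack rdfs:domain or rdfs:range."""
    declared = sorted(
        {s for s, p, o in triples if p == "rdf:type" and o in PROPERTY_TYPES}
    )
    has_domain = {s for s, p, _ in triples if p == "rdfs:domain"}
    has_range = {s for s, p, _ in triples if p == "rdfs:range"}
    missing_domain = [prop for prop in declared if prop not in has_domain]
    missing_range = [prop for prop in declared if prop not in has_range]
    return {
        "count_missing_domain": len(missing_domain),
        "properties_missing_domain": missing_domain,
        "count_missing_range": len(missing_range),
        "properties_missing_range": missing_range,
    }


triples = [
    ("ex:hasPart", "rdf:type", "owl:ObjectProperty"),
    ("ex:hasPart", "rdfs:domain", "ex:Assembly"),
    ("ex:hasPart", "rdfs:range", "ex:Component"),
    ("ex:hasMass", "rdf:type", "owl:DatatypeProperty"),
    ("ex:hasMass", "rdfs:domain", "ex:Sample"),
    # Annotation properties are not considered by this check.
    ("ex:note", "rdf:type", "owl:AnnotationProperty"),
]
result = missing_domain_and_range(triples)
print(result)
```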
References
Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.
ontocheck.leafNodeCheck
mainLeafNodeCheck_v_0_0_1 metric implementation.
- mainLeafNodeCheck_v_0_0_1(ttl_file)[source]
Ontology Leaf Node Analysis
Analyze an OWL ontology in Turtle (ttl) format and identify all leaf nodes in the class hierarchy. Leaf nodes are classes that have no subclasses, representing the most specific classes in the ontology
This main function loads an ontology file, identifies all declared classes, and determines which classes are leaf nodes by finding classes that are never used as superclasses
Definitions
Leaf nodes: Classes that have no subclasses, meaning they do not appear as objects in rdfs:subClassOf relationships (or skos:broader)
Declared classes: Classes explicitly declared with rdf:type owl:Class or rdfs:Class
Hierarchy detection: Uses rdfs:subClassOf relationships to determine class hierarchy (also skos:broader)
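Under these definitions, the leaf nodes are the declared classes minus everything that appears as the object of a hierarchy predicate. The sketch below expresses that as a set difference over plain triples of prefixed-name strings. The function name `leaf_nodes` and the sample hierarchy are assumptions for illustration.

```python
CLASS_TYPES = {"owl:Class", "rdfs:Class"}
HIERARCHY_PREDICATES = {"rdfs:subClassOf", "skos:broader"}


def leaf_nodes(triples):
    """Return declared classes that never appear as a superclass."""
    declared = {s for s, p, o in triples if p == "rdf:type" and o in CLASS_TYPES}
    superclasses = {o for _, p, o in triples if p in HIERARCHY_PREDICATES}
    # Sorted alphabetically for consistent display.
    return sorted(declared - superclasses)


triples = [
    ("ex:Material", "rdf:type", "owl:Class"),
    ("ex:Metal", "rdf:type", "owl:Class"),
    ("ex:Alloy", "rdf:type", "owl:Class"),
    ("ex:Steel", "rdf:type", "owl:Class"),
    ("ex:Polymer", "rdf:type", "owl:Class"),
    ("ex:Metal", "rdfs:subClassOf", "ex:Material"),
    ("ex:Alloy", "rdfs:subClassOf", "ex:Metal"),
    ("ex:Steel", "rdfs:subClassOf", "ex:Alloy"),
]
print(leaf_nodes(triples))
```

Note that a class with no hierarchy relations at all (here `ex:Polymer`) also counts as a leaf under this definition.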
Author: Rishabh Kundu Version: 0.0.1
- param ttl_file:
Path to the ontology Turtle (.ttl) file to analyze
- type ttl_file:
str
- returns:
None – This function does not (directly) return values. It prints analysis results to terminal/CLI. The function may exit early on errors (file not found, parsing errors, or no leaf nodes found)
Output Information
When executed successfully, the analysis provides
- Total number of leaf nodes found
- Complete list of leaf nodes with their prefixed names (sorted alphabetically)
Error Handling
The function handles several error conditions
- FileNotFoundError: when the specified TTL file cannot be found
- Parsing errors: when the TTL file cannot be parsed as valid Turtle
- Empty results: when no leaf nodes are found in the ontology
Notes
Only considers explicitly declared classes (rdf:type owl:Class or rdfs:Class)
Uses namespace manager for clean URI representation in output
Leaf nodes are sorted alphabetically for consistent display
Coverage statistics may be implemented in future versions
Note
Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.
References
Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.
Examples
- Basic usage:
python script.py ontology.ttl
ontocheck.semanticConnection
mainSemanticConnection_v_0_0_1 metric implementation.
- mainSemanticConnection_v_0_0_1(ttl_file)[source]
Ontology Semantic Connection Analysis
Analyze an OWL ontology in Turtle (ttl) format and assess the semantic connection of class hierarchies to established upper-level ontologies (specifically, Common Core Ontology and Basic Formal Ontology)
This main function loads an ontology file, builds the complete class hierarchy, identifies root classes, and determines which hierarchy chains are semantically grounded in higher-level ontologies through naming convention analysis
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Class hierarchy: The tree structure of classes connected via rdfs:subClassOf relationships
Root classes: Classes that have no parent classes, representing the top level of independent hierarchy trees
Semantic connection: Connection to higher-level ontologies (CCO/BFO) determined by URI prefix analysis (cco:, obo:bfo, bfo:)
Hierarchy chains: Complete trees of classes rooted at root classes, inheriting the connection status of their root
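The definitions above suggest a three-step procedure: find the root classes, decide each root's connection status from its prefix, and let every class below a root inherit that status. The sketch below follows those steps on (child, parent) pairs of prefixed names. It is an approximation: the case-insensitive prefix comparison, the function name `hierarchy_chains`, and the output layout are assumptions of this example rather than documented behavior.

```python
from collections import defaultdict

# Prefix patterns named in the definitions above.
UPPER_PREFIXES = ("cco:", "obo:bfo", "bfo:")


def hierarchy_chains(subclass_pairs):
    """Map each root class to its connection status and chain members."""
    children = defaultdict(set)
    classes, has_parent = set(), set()
    for child, parent in subclass_pairs:
        children[parent].add(child)
        has_parent.add(child)
        classes.update((child, parent))

    chains = {}
    for root in sorted(classes - has_parent):
        # Depth-first walk collecting every class below this root.
        members, stack = set(), [root]
        while stack:
            node = stack.pop()
            if node not in members:
                members.add(node)
                stack.extend(children.get(node, ()))
        chains[root] = {
            # Assumption: prefixes are compared case-insensitively.
            "connected": root.lower().startswith(UPPER_PREFIXES),
            "members": sorted(members),
        }
    return chains


pairs = [
    ("ex:Furnace", "cco:Artifact"),
    ("ex:TubeFurnace", "ex:Furnace"),
    ("ex:Grain", "ex:Microstructure"),
]
for root, info in hierarchy_chains(pairs).items():
    status = "connected" if info["connected"] else "disconnected"
    print(f"{root} ({status}): {info['members']}")
```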
Author: Rishabh Kundu Version: 0.0.1
- param ttl_file:
Path to the ontology Turtle (.ttl) file to analyze
- type ttl_file:
str
- returns:
None – This function does not (directly) return values. It prints comprehensive hierarchy analysis to terminal/CLI. The function may exit early on errors (file not found, parsing errors, no classes found, or no hierarchy relationships found)
Output Information
When executed successfully, the analysis provides
- Total number of named classes
- Number of classes with children (parent classes)
- Total parent-child relationships
- Number of root classes
- Number of root classes connected to higher ontologies
- Summary of connected vs disconnected hierarchy chains
- Complete hierarchical tree view with connection status indicators
Error Handling
The function handles several error conditions
- FileNotFoundError: when the specified TTL file cannot be found
- Parsing errors: when the TTL file cannot be parsed as valid Turtle
- Empty ontology: when no named classes are found
- Missing hierarchy: when no rdfs:subClassOf relationships are found
Notes
Only considers explicitly declared classes and rdfs:subClassOf relationships
Connection analysis based on URI prefix patterns (cco:, obo:bfo, bfo:)
Provides both statistical summary and detailed tree visualization
Includes namespace bindings for common ontology prefixes
Note
Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.
Examples
- Basic usage:
python script.py ontology.ttl
Accessibility Metrics
Metrics that verify endpoint reachability, data dump availability, licensing fitness, and external link validity.
ontocheck.check_sparql_accessibility_ttl
- check_sparql_accessibility_ttl(ttl_file)[source]
A1 - Accessibility of the SPARQL endpoint and the server
Evaluates the accessibility of SPARQL endpoints referenced in a TTL file by attempting to execute a simple query against each discovered endpoint.
This metric is based on Flemming (2011) quality criteria for Linked Data sources, specifically addressing the availability and accessibility of query interfaces.
Author: Redad Mehdi Version: 0.0.1
Parameters:
- ttl_file (str)
Path to the Turtle (.ttl) file to analyze
Returns:
- float
Ratio of accessible SPARQL endpoints (0.0 to 1.0)
- 0.0: No accessible endpoints found
- 1.0: All discovered endpoints are accessible
Example:
>>> score = check_sparql_accessibility_ttl('dataset.ttl')
>>> print(f"SPARQL accessibility score: {score}")
References:
Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.
Flemming, A. (2011). Qualitätsmerkmale von Linked Data-veröffentlichenden Datenquellen. Diplomarbeit (Quality Criteria for Linked Data Sources).
ontocheck.check_rdf_dump_accessibility_ttl
- check_rdf_dump_accessibility_ttl(ttl_file)[source]
A2 - RDF dump accessibility
Evaluates the accessibility of RDF data dumps referenced in a TTL file by attempting to access each discovered dump URL via HTTP HEAD requests.
This metric assesses whether the raw RDF data is available for download, which is important for data consumers who need offline access or bulk processing capabilities.
Author: Redad Mehdi Version: 0.0.1
Parameters:
- ttl_file (str)
Path to the Turtle (.ttl) file to analyze
Returns:
- float
Ratio of accessible RDF dumps (0.0 to 1.0)
- 0.0: No accessible dumps found
- 1.0: All discovered dumps are accessible
Notes:
The function identifies potential dump URLs by looking for common RDF file extensions (.rdf, .ttl, .nt, .n3, .owl, .jsonld) in referenced URLs.
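The extension-based detection and the ratio score can be sketched without any network access by passing the reachability check in as a function. In the example below, `looks_like_dump`, `dump_accessibility`, and the `is_reachable` callback are hypothetical names; a real check would issue an HTTP HEAD request where the callback is called. Matching the extension against the URL path only, and ignoring case, are also assumptions.

```python
from urllib.parse import urlparse

DUMP_EXTENSIONS = (".rdf", ".ttl", ".nt", ".n3", ".owl", ".jsonld")


def looks_like_dump(url):
    """True if the URL path ends with a common RDF file extension."""
    return urlparse(url).path.lower().endswith(DUMP_EXTENSIONS)


def dump_accessibility(urls, is_reachable):
    """Return the fraction of candidate dump URLs that are reachable."""
    dumps = [url for url in urls if looks_like_dump(url)]
    if not dumps:
        return 0.0  # no dumps discovered
    reachable = sum(1 for url in dumps if is_reachable(url))
    return reachable / len(dumps)


urls = [
    "https://example.org/data/full.ttl",
    "https://example.org/dump.nt?download=1",
    "https://example.org/about.html",
    "https://example.org/onto.OWL",
]
# Stand-in for a HEAD request: pretend only one dump responds.
online = {"https://example.org/data/full.ttl"}
score = dump_accessibility(urls, lambda url: url in online)
print(f"RDF dump accessibility score: {score:.2f}")
```

Injecting the probe keeps the scoring logic testable offline and makes it easy to swap in timeouts or retries later.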
Example:
>>> score = check_rdf_dump_accessibility_ttl('dataset.ttl')
>>> print(f"RDF dump accessibility score: {score}")
References:
Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.
Flemming, A. (2011). Qualitätsmerkmale von Linked Data-veröffentlichenden Datenquellen. Diplomarbeit (Quality Criteria for Linked Data Sources).
ontocheck.check_human_readable_license_ttl
- check_human_readable_license_ttl(ttl_file)[source]
L2 - Human-readable license detection
Detects the presence of human-readable licensing information within a TTL file. This metric evaluates whether the dataset provides clear licensing terms that users can understand without legal expertise.
The function searches for common license-related keywords in both RDF literals and TTL file comments, including references to popular licenses like Creative Commons, GPL, MIT, Apache, and BSD.
Author: Redad Mehdi Version: 0.0.1
Parameters:
- ttl_file (str)
Path to the Turtle (.ttl) file to analyze
Returns:
- int
Binary score (0 or 1)
- 0: No human-readable license information found
- 1: License-related keywords detected
Notes:
Keywords searched include: ‘license’, ‘licence’, ‘copyright’, ‘terms of use’, ‘creative commons’, ‘GPL’, ‘MIT’, ‘Apache’, ‘BSD’
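A keyword search of this kind might look like the sketch below, which scans a block of text (literals or comments) and returns the binary score. The matching rules are assumptions of this example, not documented behavior: words and phrases are matched case-insensitively at a word start, while the short acronyms are matched case-sensitively as whole words so that, for instance, "submitted" does not count as "MIT". The name `has_license_text` is hypothetical.

```python
import re

# Words and phrases: case-insensitive, anchored at a word start so that
# "Licensed" and "licenses" also match.
PHRASES = re.compile(
    r"\b(?:license|licence|copyright|terms of use|creative commons|apache)",
    re.IGNORECASE,
)
# Acronyms: case-sensitive whole words to limit false positives.
ACRONYMS = re.compile(r"\b(?:GPL|MIT|BSD)\b")


def has_license_text(text):
    """Return 1 if any license-related keyword occurs in text, else 0."""
    return 1 if PHRASES.search(text) or ACRONYMS.search(text) else 0


print(has_license_text("# Licensed under the MIT License"))
print(has_license_text("Results may be submitted by permit holders."))
```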
Example:
>>> score = check_human_readable_license_ttl('dataset.ttl')
>>> if score:
...     print("Human-readable license information found")
... else:
...     print("No license information detected")
References:
Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.
Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., & Decker, S. (2012). An empirical survey of Linked Data conformance. Journal of Web Semantics, 14, 14-44.
ontocheck.check_external_data_provider_links_ttl
- check_external_data_provider_links_ttl(ttl_file, base_namespace=None)[source]
I2 - Detection of existence and usage of external URIs
Evaluates the degree to which a dataset links to external data providers through properties like owl:sameAs, rdfs:seeAlso, and SKOS mapping properties. This metric assesses the dataset’s integration with the broader Linked Data cloud.
External links enhance data discoverability and enable cross-dataset queries, representing a key principle of Linked Data publishing.
Author: Redad Mehdi Version: 0.0.1
Parameters:
- ttl_file (str)
Path to the Turtle (.ttl) file to analyze
- base_namespace (str, optional)
Base namespace of the ontology. If not provided, the function attempts to infer it from the graph’s namespace declarations.
Returns:
- float
Ratio of entities with external links (0.0 to 1.0)
- 0.0: No entities have external links
- 1.0: All entities have at least one external link
Notes:
The function examines the following linking predicates:
- owl:sameAs, rdfs:seeAlso
- SKOS mapping properties (exactMatch, closeMatch, etc.)
- owl:equivalentClass, owl:equivalentProperty
- dc:source, foaf:isPrimaryTopicOf
If no base namespace is provided, the function recognizes links to known external data providers like DBpedia, Wikidata, GeoNames, etc.
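For the case where a base namespace is given, the ratio can be sketched as below. Triples are plain tuples with full URIs for subjects and objects and prefixed names for predicates, which is a simplification for readability. What counts as an "entity" is an assumption here (every subject inside the base namespace), as are the function name `external_link_ratio`, the subset of SKOS properties listed, and the sample data.

```python
LINK_PREDICATES = {
    "owl:sameAs", "rdfs:seeAlso",
    "skos:exactMatch", "skos:closeMatch",
    "owl:equivalentClass", "owl:equivalentProperty",
    "dc:source", "foaf:isPrimaryTopicOf",
}


def external_link_ratio(triples, base_namespace):
    """Fraction of local entities with at least one outbound link."""
    entities = {s for s, _, _ in triples if s.startswith(base_namespace)}
    linked = {
        s
        for s, p, o in triples
        if s in entities
        and p in LINK_PREDICATES
        and not o.startswith(base_namespace)  # target must be external
    }
    return len(linked) / len(entities) if entities else 0.0


BASE = "http://example.org/"
triples = [
    (BASE + "Sample", "rdf:type", "owl:Class"),
    (BASE + "Sample", "owl:sameAs", "http://dbpedia.org/resource/Sample"),
    (BASE + "Alloy", "rdf:type", "owl:Class"),
    (BASE + "Alloy", "rdfs:seeAlso", BASE + "Metal"),  # internal link
    (BASE + "Metal", "rdf:type", "owl:Class"),
    (BASE + "Grain", "skos:closeMatch", "http://other.example.com/Grain"),
]
score = external_link_ratio(triples, BASE)
print(f"External linking score: {score:.2f}")
```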
Example:
>>> score = check_external_data_provider_links_ttl('dataset.ttl',
...                                                'http://example.org/')
>>> print(f"External linking score: {score:.2f}")
References:
Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.
Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., & Decker, S. (2012). An empirical survey of Linked Data conformance. Journal of Web Semantics, 14, 14-44.
Naming Convention Metrics
Metrics that detect and flag entity names that depart from standard ontology authoring conventions.
ontocheck.check_class_name_capital
- mainClassNameCapitalCheck_v_0_0_1(ttl_file, show='all', export_template=None)[source]
Class Name Capital Letter Check Analysis
Analyze an OWL ontology in Turtle (ttl) format to assess whether all named classes follow the convention of starting their local name with a capital letter
This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of capital letter compliance with various display options and export capabilities
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Local name: The fragment of a class URI after the last ‘#’ or ‘/’ character. Only this portion is checked, not the full namespace URI
Compliant class: A class whose local name begins with an uppercase letter as determined by Python’s str.isupper() check on the first character
Coverage percentage: The proportion of named classes whose local name starts with a capital letter
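The definitions above translate into a few lines of string handling. The following is a minimal sketch that works on a plain list of class URIs rather than a parsed ontology; the helper names `local_name` and `capital_coverage` are hypothetical and are not part of OntoCheck.

```python
def local_name(uri):
    """Return the part of a URI after the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[cut + 1:]


def capital_coverage(class_uris):
    """Return (compliant, non_compliant, coverage percentage)."""
    compliant, non_compliant = [], []
    for uri in class_uris:
        name = local_name(uri)
        # An empty local name is treated as non-compliant.
        if name and name[0].isupper():
            compliant.append(uri)
        else:
            non_compliant.append(uri)
    total = len(class_uris)
    coverage = 100.0 * len(compliant) / total if total else 0.0
    return compliant, non_compliant, coverage


classes = [
    "http://example.org/onto#Sensor",
    "http://example.org/onto#voltageRating",
    "http://example.org/onto/",
]
good, bad, pct = capital_coverage(classes)
print(len(good), len(bad), round(pct, 1))  # 1 2 33.3
```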
Author: Rishabh Kundu Version: 0.0.1
- Parameters:
ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.
show (str, optional) – Display option controlling what information to show:
- "all" (default): Shows summary statistics, compliant classes, and non-compliant classes
- "with": Shows only classes whose local name starts with a capital letter
- "without": Shows only classes whose local name does not start with a capital letter
- "summary": Shows only summary statistics
export_template (str, optional) – Export a CSV report file for classes that do not start with a capital letter. Provide the desired output filename (e.g. "non_capital_classes.csv"). Default is None (no export).
- Returns:
None – This function does not directly return values. It prints analysis results to the terminal and optionally exports a CSV report file. The function may exit early on errors (file not found, parsing errors, or no classes found).
Output Information
When executed successfully, the analysis provides
- Total no. of named classes analyzed
- No. of classes whose local name starts with a capital letter
- No. of classes whose local name does not start with a capital letter
- Coverage percentage of compliant classes
- Prefixed class name, full URI/IRI, and local name for each class
Error Handling
- FileNotFoundError (When the specified TTL file cannot be found)
- Parsing errors (When the TTL file cannot be parsed as valid Turtle)
- Empty ontology (When no named classes are found in the ontology)
Notes
Only named classes (URIRef instances) are considered in the analysis
Only the local name fragment is checked, not the full URI
Classes with an empty local name are treated as non-compliant
The show and export_template parameters default to "all" and None, so CSV export must be requested explicitly
Note
Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.
Examples
- Basic usage (show all):
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl")
- Show only summary:
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", show="summary")
- Show only non-compliant classes:
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", show="without")
- Export CSV report of non-compliant classes (the filename may include an export path):
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", export_template="non_capital_classes.csv")
- Export report while showing summary only (the filename may include an export path):
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", show="summary", export_template="non_capital_classes.csv")
ontocheck.check_class_name_space
- mainClassNameSpaceCheck_v_0_0_1(ttl_file, export_template=None)[source]
Class Name Space Check Analysis
Scan an OWL ontology in Turtle (ttl) format for class names containing spaces. Spaces in prefixed class names make a TTL file unparseable, so this function works directly on the raw file text rather than parsing it through rdflib.
Handles both single-line and multi-line class declarations by grouping lines into declaration blocks before checking for spaces.
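A raw-text scan of this kind can be sketched with the standard library alone. The sketch below is an illustration under simplifying assumptions: subjects are prefixed names, comments are removed by cutting each line at the first '#', and a line ending in '.' closes a declaration block. The names `DECLARATION` and `classes_with_spaces` are hypothetical, and the real function may treat comments and string literals more carefully.

```python
import re

# Subject of a class declaration: everything before "a owl:Class" or
# "rdf:type rdfs:Class" at the start of a declaration block.
DECLARATION = re.compile(
    r"^\s*(?P<subject>\w*:.*?)\s+(?:a|rdf:type)\s+(?:owl|rdfs):Class\b"
)


def classes_with_spaces(ttl_text):
    """Return (subject, first line number) pairs for class names containing whitespace."""
    findings, block, start = [], [], None
    for number, line in enumerate(ttl_text.splitlines(), start=1):
        stripped = line.split("#", 1)[0].strip()  # crude comment removal
        if not stripped or stripped.startswith("@prefix"):
            continue
        if start is None:
            start = number
        block.append(stripped)
        if stripped.endswith("."):  # '.' terminates the declaration block
            match = DECLARATION.match(" ".join(block))
            if match and re.search(r"\s", match.group("subject")):
                findings.append((match.group("subject"), start))
            block, start = [], None
    return findings


ttl = """@prefix ex: <http://example.org/onto#> .
ex:Voltage Rating a owl:Class ;
    rdfs:label "Voltage rating" .
ex:Sensor a owl:Class .
"""
print(classes_with_spaces(ttl))  # [('ex:Voltage Rating', 2)]
```

Because the text is never handed to a Turtle parser, the malformed declaration on line 2 is reported instead of aborting the run.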
Author: Rishabh Kundu Version: 0.0.1
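- Parameters:
ttl_file (str) – Path to the ontology Turtle (.ttl) file to scan.
export_template (str, optional) – Output filename for a CSV report of the class names found to contain spaces. Default is None (no export).
- Returns: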
None – Prints results to terminal/CLI and optionally exports a CSV report.
Output Information
When executed successfully, the analysis provides
- Total number of class names with spaces detected
- Each class name containing a space, with its line number and the full line text
Error Handling
- FileNotFoundError (When the specified TTL file cannot be found)
Notes
Does not use rdflib parsing - works on raw file text so it catches errors that would prevent parsing entirely
Detects owl:Class and rdfs:Class declarations
Handles multi-line declarations by grouping on ‘.’ block terminators
Comments and string literals are handled to avoid false matches
Note
Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.
Examples
- Basic usage:
mainClassNameSpaceCheck_v_0_0_1("ontology.ttl")
- Export CSV report:
mainClassNameSpaceCheck_v_0_0_1("ontology.ttl", export_template="classes_with_spaces_in_names.csv")
ontocheck.spell_check
ontocheck.find_duplicate_labels_from_graph
- find_duplicate_labels_from_graph(ttl_file)[source]
IO8 - Semantically Identical Classes
This metric identifies semantically identical classes by checking whether two or more IRIs in an ontology share the same rdfs:label value.
- Parameters:
ttl_file (str) – Path to the Turtle (.ttl) file.
- Returns:
duplicates – Dictionary of the URIs of terms with duplicated labels.
- Return type:
dict
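The grouping behind this check can be illustrated on plain (IRI, label) pairs. This is a sketch only: the helper name `duplicate_labels`, the exact-string comparison, and the shape of the returned dictionary (label mapped to a sorted list of IRIs) are assumptions, since the documentation does not spell out the structure of the result.

```python
from collections import defaultdict


def duplicate_labels(labelled_iris):
    """Group IRIs by label and keep only labels shared by more than one IRI."""
    by_label = defaultdict(set)
    for iri, label in labelled_iris:
        by_label[label].add(iri)
    return {label: sorted(iris) for label, iris in by_label.items() if len(iris) > 1}


pairs = [
    ("http://example.org/onto#Sensor", "Sensor"),
    ("http://example.org/onto#Detector", "Sensor"),
    ("http://example.org/onto#Cable", "Cable"),
]
print(duplicate_labels(pairs))
```

Here the label "Sensor" is reported with both IRIs, while "Cable" is used only once and is omitted.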
Author: Van Tran Version: 0.0.1
References
Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.
ontocheck.class_search
- mainClassSearch_v_0_0_1(ontology_graph_or_path, search_term: str) → list[source]
Class Name Substring Search
Evaluates an ontology file to find all class names that contain a specified substring, irrespective of capitalization.
This metric shows which classes share a given string in their names, which helps identify overlapping concepts, inconsistent naming, or potential duplicates (e.g., ‘VoltageRating’ and ‘VoltAgeRange’).
Author: Redad Mehdi Version: 0.0.1
Parameters:
- ontology_graph_or_path : str or rdflib.Graph
Path to the Turtle (.ttl) file to analyze, or a pre-loaded rdflib.Graph object.
- search_term : str
The string to search for within the class names (case-insensitive).
Returns:
- list of dict
A list of dictionaries, one per matched class. Each dictionary contains:
- 'class_name': The extracted local name of the class (str).
- 'uri': The full URI string of the class (str).
Returns an empty list if no matches are found.
Notes:
The function identifies class names by looking at subjects typed as owl:Class or rdfs:Class. It extracts the local name by splitting the URI at the last ‘#’ or ‘/’ character before performing the case-insensitive comparison.
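The matching step described in these notes can be sketched on a plain list of class URIs. The helper name `search_classes` is hypothetical; only the returned keys 'class_name' and 'uri' are taken from the documentation above.

```python
def search_classes(class_uris, search_term):
    """Return dicts for classes whose local name contains search_term, ignoring case."""
    needle = search_term.lower()
    matches = []
    for uri in class_uris:
        # Local name: the text after the last '#' or '/'.
        name = uri[max(uri.rfind("#"), uri.rfind("/")) + 1:]
        if needle in name.lower():
            matches.append({"class_name": name, "uri": uri})
    return matches


uris = [
    "http://example.org/onto#VoltageRating",
    "http://example.org/onto#VoltAgeRange",
    "http://example.org/onto#Sensor",
]
found = search_classes(uris, "VOLtage")
print([m["class_name"] for m in found])  # ['VoltageRating', 'VoltAgeRange']
```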
Example:
>>> matches = mainClassSearch_v_0_0_1('dataset.ttl', 'VOLtage')
>>> print(f"Found {len(matches)} matching classes.")
Other Modules
ontocheck.cli
OntoCheck Command-Line Interface
Provides a unified CLI for all four OntoCheck assessment modes:
- Mode 1 – Task-agnostic (default)
- Mode 2 – Task-specific Web ontology
- Mode 3 – Task-based Scientific ontology
- Mode 4 – Cross-Domain ontology
ontocheck.mds_design_check
Module Contents
- mainAltLabelCheck_v_0_0_1(ttl_file, show='all', export_template=None)[source]
SKOS Alternative Label Coverage Analysis
Analyze an OWL ontology in Turtle (ttl) format to assess the coverage and quality of SKOS alternative labels (skos:altLabel) across all named classes
This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of alternative label coverage with various display options and export capabilities
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Valid altLabels: SKOS altLabel values that are non-empty strings after whitespace trimming. Empty strings and whitespace-only altLabels are not counted as valid altLabels
Coverage percentage: The proportion of named classes that have at least one valid altLabel
Author: Rishabh Kundu Version: 0.0.1
- Parameters:
ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.
show (str, optional) – Display option controlling what information to show:
- "all" (default): Shows summary statistics, classes with altLabels, and classes without altLabels
- "with": Shows only classes that have altLabels
- "without": Shows only classes that lack altLabels
- "summary": Shows only summary statistics
export_template (str, optional) – Export a Turtle format template file for classes missing altLabels. Provide the desired output filename. Default is None (no export).
- Returns:
None – This function does not directly return values. It prints analysis results to the terminal and optionally exports a template file. The function may exit early on errors (file not found, parsing errors, or no classes found).
Output Information
When executed successfully, the analysis provides
- Total number of named classes analyzed
- Number of classes with valid altLabel properties
- Number of classes lacking valid altLabel properties
- Coverage percentage of classes with altLabels
- Total count of altLabel instances across all classes
- Average number of altLabels per class (for classes with altLabels)
- Qualitative assessment based on coverage thresholds
Error Handling
The function handles several error conditions
- FileNotFoundError (When the specified TTL file cannot be found)
- Parsing errors (When the TTL file cannot be parsed as valid Turtle)
- Empty ontology (When no named classes are found in the ontology)
- Template export errors (File I/O issues when exporting templates)
Notes
Only named classes (URIRef instances) are considered in the analysis
Empty strings and whitespace-only altLabels are filtered out
Coverage is rated against the following thresholds:
- ≥80%: Excellent
- ≥60%: Good
- ≥40%: Moderate
- ≥20%: Low
- <20%: Very low
Classes are displayed with their preferred labels (skos:prefLabel or rdfs:label) when available
Template export generates valid Turtle syntax for adding missing altLabels
The show and export_template parameters default to "all" and None, respectively
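The coverage rating listed in the notes above amounts to a threshold lookup. A minimal sketch follows; the names `ALT_LABEL_THRESHOLDS` and `rate_coverage` are hypothetical, and returning "Very low" for an ontology with no classes is an assumption (the documented function exits early in that case).

```python
# (minimum coverage in percent, rating), checked from highest to lowest.
ALT_LABEL_THRESHOLDS = [(80, "Excellent"), (60, "Good"), (40, "Moderate"), (20, "Low")]


def rate_coverage(classes_with_labels, total_classes, thresholds=ALT_LABEL_THRESHOLDS):
    """Return (coverage percentage, qualitative rating)."""
    if total_classes == 0:
        return 0.0, "Very low"
    coverage = 100.0 * classes_with_labels / total_classes
    for minimum, rating in thresholds:
        if coverage >= minimum:
            return coverage, rating
    return coverage, "Very low"


print(rate_coverage(13, 20))  # (65.0, 'Good')
```

Passing a different threshold list lets the same helper serve metrics with stricter cut-offs.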
Note
Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.
Examples
- Basic usage:
python script.py ontology.ttl
- Show only summary:
python script.py ontology.ttl --show summary
- Export template for missing labels:
python script.py ontology.ttl --export-template missing_labels.ttl
- check_for_isolated_elements(ttl_file: str)[source]
C1 - Number of isolated elements
Analyze an OWL ontology in Turtle format to identify isolated atomic classes and isolated properties.
Definitions
Atomic classes are named classes (with URI) that are NOT constructed classes (i.e., they do not have owl:unionOf, owl:intersectionOf, or owl:complementOf).
A class (atomic or constructed with URI) is considered connected if it:
- participates in rdfs:subClassOf, owl:equivalentClass, or owl:disjointWith relations involving atomic classes, OR
- is used as domain or range of properties and contains at least one atomic class inside its construction.
A property is considered connected if it is related by any of: rdfs:subPropertyOf, owl:inverseOf, owl:propertyDisjointWith, or owl:equivalentProperty.
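The connectivity rules above can be approximated with set operations over triples. The sketch below deliberately simplifies: triples are tuples of prefixed names, every class passed in is treated as atomic, and constructed classes are ignored. The names `isolated_elements`, `CLASS_LINKS`, `DOMAIN_RANGE`, and `PROPERTY_LINKS` are hypothetical.

```python
CLASS_LINKS = {"rdfs:subClassOf", "owl:equivalentClass", "owl:disjointWith"}
DOMAIN_RANGE = {"rdfs:domain", "rdfs:range"}
PROPERTY_LINKS = {"rdfs:subPropertyOf", "owl:inverseOf",
                  "owl:propertyDisjointWith", "owl:equivalentProperty"}


def isolated_elements(classes, properties, triples):
    """Return (isolated classes, isolated properties) as sorted lists."""
    connected_classes, connected_props = set(), set()
    for s, p, o in triples:
        if p in CLASS_LINKS:
            connected_classes.update((s, o))
        elif p in DOMAIN_RANGE:
            connected_classes.add(o)  # class used as a domain or range
        elif p in PROPERTY_LINKS:
            connected_props.update((s, o))
    return (sorted(set(classes) - connected_classes),
            sorted(set(properties) - connected_props))


triples = [
    ("ex:A", "rdfs:subClassOf", "ex:B"),
    ("ex:p", "rdfs:domain", "ex:C"),
    ("ex:p", "owl:inverseOf", "ex:q"),
]
print(isolated_elements(["ex:A", "ex:B", "ex:C", "ex:D"],
                        ["ex:p", "ex:q", "ex:r"], triples))
# (['ex:D'], ['ex:r'])
```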
Author: Van Tran Version: 0.0.1
- Parameters:
ttl_file (str) – File path to the ontology Turtle (.ttl) file.
- Prints:
Lists of isolated atomic classes and isolated properties.
Notes
Only named classes explicitly declared as owl:Class are considered.
Only properties explicitly declared as owl:ObjectProperty or owl:DatatypeProperty are considered.
Relations checked for classes include rdfs:subClassOf, owl:equivalentClass, owl:disjointWith, and usage as domain or range of properties.
Relations checked for properties include rdfs:subPropertyOf, owl:inverseOf, owl:propertyDisjointWith, and owl:equivalentProperty.
References
Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.
- check_human_readable_license_ttl(ttl_file)[source]
L2 - Human-readable license detection
Detects the presence of human-readable licensing information within a TTL file. This metric evaluates whether the dataset provides clear licensing terms that users can understand without legal expertise.
The function searches for common license-related keywords in both RDF literals and TTL file comments, including references to popular licenses like Creative Commons, GPL, MIT, Apache, and BSD.
Author: Redad Mehdi Version: 0.0.1
Parameters:
- ttl_file : str
Path to the Turtle (.ttl) file to analyze
Returns:
- int
Binary score (0 or 1)
- 0: No human-readable license information found
- 1: License-related keywords detected
Notes:
Keywords searched include: ‘license’, ‘licence’, ‘copyright’, ‘terms of use’, ‘creative commons’, ‘GPL’, ‘MIT’, ‘Apache’, ‘BSD’
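A keyword scan over the raw file text, which covers both literals and comments, might look like the sketch below. Matching on word boundaries and ignoring case are assumptions made here so that short keywords such as 'MIT' do not fire inside unrelated words; the documentation does not say how OntoCheck matches. The name `license_score` is hypothetical.

```python
import re

LICENSE_KEYWORDS = ("license", "licence", "copyright", "terms of use",
                    "creative commons", "GPL", "MIT", "Apache", "BSD")

# One alternation of all keywords, matched as whole words, case-insensitively.
LICENSE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in LICENSE_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def license_score(ttl_text):
    """Return 1 if a license-related keyword occurs in the text, else 0."""
    return 1 if LICENSE_PATTERN.search(ttl_text) else 0


print(license_score("# Released under the MIT license"))    # 1
print(license_score('ex:A rdfs:comment "upper limit" .'))   # 0
```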
Example:
>>> score = check_human_readable_license_ttl('dataset.ttl')
>>> if score:
...     print("Human-readable license information found")
... else:
...     print("No license information detected")
References:
Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.
Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., & Decker, S. (2012). An empirical survey of Linked Data conformance. Journal of Web Semantics, 14, 14-44.
- check_rdf_dump_accessibility_ttl(ttl_file)[source]
A2 - RDF dump accessibility
Evaluates the accessibility of RDF data dumps referenced in a TTL file by attempting to access each discovered dump URL via HTTP HEAD requests.
This metric assesses whether the raw RDF data is available for download, which is important for data consumers who need offline access or bulk processing capabilities.
Author: Redad Mehdi Version: 0.0.1
Parameters:
- ttl_file : str
Path to the Turtle (.ttl) file to analyze
Returns:
- float
Ratio of accessible RDF dumps (0.0 to 1.0)
- 0.0: No accessible dumps found
- 1.0: All discovered dumps are accessible
Notes:
The function identifies potential dump URLs by looking for common RDF file extensions (.rdf, .ttl, .nt, .n3, .owl, .jsonld) in referenced URLs.
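The two steps, finding candidate dump URLs by file extension and probing each with an HTTP HEAD request, can be sketched with the standard library. The regular expression, the helper names `head_ok` and `dump_accessibility`, and the rule that any status below 400 counts as accessible are assumptions for illustration. The probe is passed in as a callable so the ratio logic can be exercised without network access.

```python
import re
import urllib.request

# URLs ending in one of the RDF file extensions named in the notes above.
DUMP_URL = re.compile(r"https?://[^\s<>\"']+\.(?:rdf|ttl|nt|n3|owl|jsonld)\b",
                      re.IGNORECASE)


def head_ok(url, timeout=5):
    """Probe a URL with an HTTP HEAD request; errors count as inaccessible."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        return False


def dump_accessibility(ttl_text, is_accessible=head_ok):
    """Return the ratio of discovered dump URLs for which is_accessible(url) is true."""
    urls = sorted(set(DUMP_URL.findall(ttl_text)))
    if not urls:
        return 0.0
    return sum(1 for url in urls if is_accessible(url)) / len(urls)


text = """ex:ds void:dataDump <http://example.org/dump.ttl> ,
                       <http://example.org/dump.nt> ."""
# A stand-in probe that only accepts the Turtle dump.
print(dump_accessibility(text, is_accessible=lambda u: u.endswith(".ttl")))  # 0.5
```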
Example:
>>> score = check_rdf_dump_accessibility_ttl('dataset.ttl')
>>> print(f"RDF dump accessibility score: {score}")
References:
Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.
Flemming, A. (2011). Qualitätsmerkmale von Linked Data-veröffentlichenden Datenquellen. Diplomarbeit (Quality Criteria for Linked Data Sources).
- check_sparql_accessibility_ttl(ttl_file)[source]
A1 - Accessibility of the SPARQL endpoint and the server
Evaluates the accessibility of SPARQL endpoints referenced in a TTL file by attempting to execute a simple query against each discovered endpoint.
This metric is based on Flemming (2011) quality criteria for Linked Data sources, specifically addressing the availability and accessibility of query interfaces.
Author: Redad Mehdi Version: 0.0.1
Parameters:
- ttl_file : str
Path to the Turtle (.ttl) file to analyze
Returns:
- float
Ratio of accessible SPARQL endpoints (0.0 to 1.0)
- 0.0: No accessible endpoints found
- 1.0: All discovered endpoints are accessible
Example:
>>> score = check_sparql_accessibility_ttl('dataset.ttl')
>>> print(f"SPARQL accessibility score: {score}")
References:
Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2015). Quality assessment for Linked Data: A Survey: A systematic literature review and conceptual framework. Semantic Web, 7(1), 63-93.
Flemming, A. (2011). Qualitätsmerkmale von Linked Data-veröffentlichenden Datenquellen. Diplomarbeit (Quality Criteria for Linked Data Sources).
- count_class_connected_components(ttl_file_path: str) → int[source]
C3 - Count number of connected subgraphs
Count the number of connected components in the "class graph" (TBox) of an OWL ontology. The class graph is constructed by creating undirected edges between classes that are connected by any of the following predicates:
- rdfs:subClassOf
- owl:equivalentClass
- owl:disjointWith
Nodes in the graph represent named classes (URIRefs) from the ontology. Edges represent these relationships between named classes. Classes involved only in subclass/equivalent/disjointWith axioms pointing to blank nodes (i.e. constructed classes) are excluded.
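Counting components of an undirected graph is a standard union-find exercise. The sketch below assumes the named classes and the class-to-class edges have already been extracted; `count_components` is a hypothetical helper, and union-find is one possible technique rather than the one OntoCheck is known to use. Edges that touch anything outside the class list, such as blank nodes, are skipped, and deciding which classes belong in the list is left to the caller.

```python
def count_components(classes, edges):
    """Count connected components of an undirected graph with union-find."""
    parent = {c: c for c in classes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in edges:
        if a in parent and b in parent:  # ignore edges to blank nodes
            parent[find(a)] = find(b)
    return len({find(c) for c in classes})


classes = ["ex:A", "ex:B", "ex:C", "ex:D", "ex:E"]
edges = [("ex:A", "ex:B"), ("ex:B", "ex:C"), ("ex:D", "_:b0")]
print(count_components(classes, edges))  # 3
```

The three components are {A, B, C}, {D}, and {E}; edge direction plays no role.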
Author: Van Tran Version: 0.0.1
- Parameters:
ttl_file_path (str) – Path to the ontology Turtle (.ttl) file.
- Returns:
The number of connected components in the class graph.
- Return type:
int
Notes
The graph is undirected; directionality of subclass relations is ignored.
Only named OWL classes are considered.
Classes that participate in subclass/equivalent/disjoint axioms involving blank nodes are excluded.
- mainDefCheck_v_0_0_1(ttl_file, show='all', full_definitions=False)[source]
SKOS Definition Coverage Analysis
Analyze an OWL ontology in Turtle (ttl) format and assess the coverage and quality of SKOS definitions (skos:definition) across all named classes
This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of definition coverage with various display options
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Valid definitions: SKOS definition values that exist as properties on classes. Empty or missing definitions are identified as gaps in coverage
Coverage percentage: The proportion of named classes that have at least one skos:definition property
Author: Rishabh Kundu Version: 0.0.1
- Parameters:
ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.
show (str, optional) – Display option controlling what information to show:
- "all" (default): Shows summary statistics, classes with definitions, and classes without definitions
- "with": Shows only classes that have definitions
- "without": Shows only classes that lack definitions
- "summary": Shows only summary statistics
full_definitions (bool, optional) – Show full definitions instead of truncated versions (default: False, truncated to 150 characters).
- Returns:
None – This function does not directly return values. It prints analysis results to the terminal. The function may exit early on errors (file not found, parsing errors, or no classes found).
Output Information
When executed successfully, the analysis provides
- Total number of named classes analyzed
- Number of classes with definition properties
- Number of classes lacking definition properties
- Coverage percentage of classes with definitions
- Qualitative assessment based on coverage thresholds
Error Handling
The function handles several error conditions
- FileNotFoundError (When the specified TTL file cannot be found)
- Parsing errors (When the TTL file cannot be parsed as valid Turtle)
- Empty ontology (When no named classes are found in the ontology)
Notes
Only named classes (URIRef instances) are considered in the analysis
Uses skos:definition property specifically for definition identification
Coverage is rated against stricter thresholds than those used for altLabels:
- ≥90%: Excellent
- ≥75%: Good
- ≥50%: Moderate
- ≥25%: Low
- <25%: Very low
Classes are displayed with their preferred labels (skos:prefLabel or rdfs:label) when available
The show and full_definitions parameters default to "all" and False, respectively
Note
Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.
Examples
- Basic usage:
python script.py ontology.ttl
- Show only summary:
python script.py ontology.ttl --show summary
- Show full definitions:
python script.py ontology.ttl --full-definitions
- Show only classes without definitions:
python script.py ontology.ttl --show without
- get_properties_missing_domain_and_range(ttl_file_path: str)[source]
C2 - Missing Domain and Ranges in Properties
Parse an OWL ontology Turtle file and identify object and datatype properties that are missing domain or range declarations.
Author: Van Tran Version: 0.0.1
- Parameters:
ttl_file_path (str) – Path to the Turtle (.ttl) file containing the ontology.
- Returns:
A dictionary containing:
- 'count_missing_domain': int
Number of properties missing an rdfs:domain declaration.
- 'properties_missing_domain': list of rdflib.term.URIRef
List of properties (URIs) missing an rdfs:domain.
- 'count_missing_range': int
Number of properties missing an rdfs:range declaration.
- 'properties_missing_range': list of rdflib.term.URIRef
List of properties (URIs) missing an rdfs:range.
- Return type:
dict
Notes
Only properties explicitly typed as owl:ObjectProperty or owl:DatatypeProperty are considered.
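The shape of the returned dictionary can be illustrated with a small sketch over triples written as tuples of prefixed names. The helper name `missing_domain_and_range` is hypothetical, and plain strings stand in for rdflib.term.URIRef values; only the four dictionary keys come from the documentation above.

```python
def missing_domain_and_range(properties, triples):
    """Report which of the given properties lack rdfs:domain or rdfs:range."""
    has_domain = {s for s, p, _ in triples if p == "rdfs:domain"}
    has_range = {s for s, p, _ in triples if p == "rdfs:range"}
    no_domain = [prop for prop in properties if prop not in has_domain]
    no_range = [prop for prop in properties if prop not in has_range]
    return {
        "count_missing_domain": len(no_domain),
        "properties_missing_domain": no_domain,
        "count_missing_range": len(no_range),
        "properties_missing_range": no_range,
    }


triples = [
    ("ex:hasPart", "rdfs:domain", "ex:Device"),
    ("ex:hasPart", "rdfs:range", "ex:Component"),
    ("ex:hasMass", "rdfs:domain", "ex:Device"),
]
report = missing_domain_and_range(["ex:hasPart", "ex:hasMass", "ex:hasNote"], triples)
print(report["properties_missing_domain"])  # ['ex:hasNote']
print(report["properties_missing_range"])   # ['ex:hasMass', 'ex:hasNote']
```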
References
Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.
- mainLeafNodeCheck_v_0_0_1(ttl_file)[source]
Ontology Leaf Node Analysis
Analyze an OWL ontology in Turtle (ttl) format and identify all leaf nodes in the class hierarchy. Leaf nodes are classes that have no subclasses, representing the most specific classes in the ontology
This main function loads an ontology file, identifies all declared classes, and determines which classes are leaf nodes by finding classes that are never used as superclasses
Definitions
Leaf nodes: Classes that have no subclasses, meaning they do not appear as objects in rdfs:subClassOf relationships (or skos:broader)
Declared classes: Classes explicitly declared with rdf:type owl:Class or rdfs:Class
Hierarchy detection: Uses rdfs:subClassOf relationships to determine class hierarchy (also skos:broader)
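Under these definitions, leaf detection is a set difference: declared classes minus every class that appears as a superclass. The sketch below assumes the hierarchy has been extracted as (child, parent) pairs of prefixed names; `leaf_nodes` is a hypothetical helper name.

```python
def leaf_nodes(declared_classes, subclass_edges):
    """Return declared classes that never appear as a superclass, sorted by name."""
    superclasses = {parent for _, parent in subclass_edges}
    return sorted(c for c in declared_classes if c not in superclasses)


declared = ["ex:Device", "ex:Sensor", "ex:TemperatureSensor", "ex:Cable"]
edges = [("ex:Sensor", "ex:Device"), ("ex:TemperatureSensor", "ex:Sensor")]
print(leaf_nodes(declared, edges))  # ['ex:Cable', 'ex:TemperatureSensor']
```

A class with no hierarchy links at all, such as ex:Cable here, also counts as a leaf under this definition.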
Author: Rishabh Kundu Version: 0.0.1
- Parameters:
ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.
- Returns:
None – This function does not directly return values. It prints analysis results to the terminal. The function may exit early on errors (file not found, parsing errors, or no leaf nodes found).
Output Information
When executed successfully, the analysis provides
- Total number of leaf nodes found
- Complete list of leaf nodes with their prefixed names (sorted alphabetically)
Error Handling
The function handles several error conditions
- FileNotFoundError (When the specified TTL file cannot be found)
- Parsing errors (When the TTL file cannot be parsed as valid Turtle)
- Empty results (When no leaf nodes are found in the ontology)
Notes
Only considers explicitly declared classes (rdf:type owl:Class or rdfs:Class)
Uses namespace manager for clean URI representation in output
Leaf nodes are sorted alphabetically for consistent display
Coverage statistics may be implemented in future versions
Note
Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.
References
Mc Gurk, S., Abela, C., & Debattista, J. (2017). Towards ontology quality assessment. 4th Workshop on Linked Data Quality (LDQ2017), co-located with the 14th Extended Semantic Web Conference (ESWC), Portorož, 94-106.
Examples
- Basic usage:
python script.py ontology.ttl
- mainSemanticConnection_v_0_0_1(ttl_file)[source]
Ontology Semantic Connection Analysis
Analyze an OWL ontology in Turtle (ttl) format and assess the semantic connection of class hierarchies to established upper-level ontologies (specifically, Common Core Ontology and Basic Formal Ontology)
This main function loads an ontology file, builds the complete class hierarchy, identifies root classes, and determines which hierarchy chains are semantically grounded in higher-level ontologies through naming convention analysis
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Class hierarchy: The tree structure of classes connected via rdfs:subClassOf relationships
Root classes: Classes that have no parent classes, representing the top level of independent hierarchy trees
Semantic connection: Connection to higher-level ontologies (CCO/BFO) determined by URI prefix analysis (cco:, obo:bfo, bfo:)
Hierarchy chains: Complete trees of classes rooted at root classes, inheriting the connection status of their root
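Root detection and the prefix test can be sketched on prefixed class names. The sketch is an illustration only: it assumes classes are already rendered as prefixed names, compares prefixes case-insensitively, and uses the hypothetical names `UPPER_PREFIXES` and `root_connections`.

```python
# Prefix patterns named in the definitions above, in lower case.
UPPER_PREFIXES = ("cco:", "obo:bfo", "bfo:")


def root_connections(classes, subclass_edges):
    """Map each root class to True if its prefix points to CCO or BFO."""
    children = {child for child, _ in subclass_edges}
    roots = [c for c in classes if c not in children]  # classes without a parent
    return {root: root.lower().startswith(UPPER_PREFIXES) for root in roots}


classes = ["obo:BFO_0000040", "ex:Sensor", "ex:Thing", "ex:Widget"]
edges = [("ex:Sensor", "obo:BFO_0000040"), ("ex:Widget", "ex:Thing")]
print(root_connections(classes, edges))
# {'obo:BFO_0000040': True, 'ex:Thing': False}
```

Every class below a root inherits that root's status, so ex:Sensor sits in a connected chain and ex:Widget in a disconnected one.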
Author: Rishabh Kundu Version: 0.0.1
- Parameters:
ttl_file (str) – Path to the ontology Turtle (.ttl) file to analyze.
- Returns:
None – This function does not directly return values. It prints comprehensive hierarchy analysis to the terminal. The function may exit early on errors (file not found, parsing errors, no classes found, or no hierarchy relationships found).
Output Information
When executed successfully, the analysis provides
- Total number of named classes
- Number of classes with children (parent classes)
- Total parent-child relationships
- Number of root classes
- Number of root classes connected to higher ontologies
- Summary of connected vs disconnected hierarchy chains
- Complete hierarchical tree view with connection status indicators
Error Handling
The function handles several error conditions
- FileNotFoundError (When the specified TTL file cannot be found)
- Parsing errors (When the TTL file cannot be parsed as valid Turtle)
- Empty ontology (When no named classes are found)
- Missing hierarchy (When no rdfs:subClassOf relationships are found)
Notes
Only considers explicitly declared classes and rdfs:subClassOf relationships
Connection analysis based on URI prefix patterns (cco:, obo:bfo, bfo:)
Provides both statistical summary and detailed tree visualization
Includes namespace bindings for common ontology prefixes
Note
Claude AI (Sonnet 4) was employed chiefly to support documentation efforts.
Examples
- Basic usage:
python script.py ontology.ttl
- mainClassNameCapitalCheck_v_0_0_1(ttl_file, show='all', export_template=None)[source]
Class Name Capital Letter Check Analysis
Analyze an OWL ontology in Turtle (ttl) format to assess whether all named classes follow the convention of starting their local name with a capital letter
This main function loads an ontology file, identifies all named classes, and provides comprehensive analysis of capital letter compliance with various display options and export capabilities
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Local name: The fragment of a class URI after the last ‘#’ or ‘/’ character. Only this portion is checked, not the full namespace URI
Compliant class: A class whose local name begins with an uppercase letter as determined by Python’s str.isupper() check on the first character
Coverage percentage: The proportion of named classes whose local name starts with a capital letter
Author: Rishabh Kundu Version: 0.0.1
- param ttl_file:
Path to the ontology Turtle (.ttl) file to analyze – input file
- type ttl_file:
str
- param show:
Display option controlling what information to show: - “all” (default): Shows summary statistics, compliant classes, and non-compliant classes - “with”: Shows only classes whose local name starts with a capital letter - “without”: Shows only classes whose local name does not start with a capital letter - “summary”: Shows only summary statistics
- type show:
str, optional
- param export_template:
Export a CSV report file for classes that do not start with a capital letter. Provide the desired output filename (e.g. “non_capital_classes.csv”). Default is None (no export)
- type export_template:
str, optional
- returns:
None – This function does not directly return values. It prints analysis results to your terminal/CLI and optionally exports a CSV report file. The function may exit early on errors (file not found, parsing errors, or no classes found)
Output Information
——————
When executed successfully, the analysis provides
- Total no. of named classes analyzed
- No. of classes whose local name starts with a capital letter
- No. of classes whose local name does not start with a capital letter
- Coverage percentage of compliant classes
- Prefixed class name, full URI/IRI, and local name for each class
Error Handling
————–
- FileNotFoundError (When the specified TTL file cannot be found)
- Parsing errors (When the TTL file cannot be parsed as valid Turtle)
- Empty ontology (When no named classes are found in the ontology)
Notes
Only named classes (URIRef instances) are considered in the analysis
Only the local name fragment is checked, not the full URI
Classes with an empty local name are treated as non-compliant
The show and export_template parameters default to “all” and None respectively, so CSV export must be requested explicitly
Note
Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.
Examples
- Basic usage (show all):
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl")
- Show only summary:
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", show="summary")
- Show only non-compliant classes:
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", show="without")
- Export CSV report of non-compliant classes:
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", export_template="non_capital_classes.csv")
- Export report while showing summary only:
mainClassNameCapitalCheck_v_0_0_1("ontology.ttl", show="summary", export_template="non_capital_classes.csv")
In both export examples, the filename may include the desired export path.
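The local-name, compliant-class, and coverage definitions for mainClassNameCapitalCheck_v_0_0_1 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the OntoCheck implementation: the helper names local_name, is_compliant, and capital_coverage and the example.org URIs are hypothetical, and the class URIs are passed in as a list instead of being read from a Turtle file.

```python
def local_name(uri: str) -> str:
    """Return the part of a URI after the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[cut + 1:]


def is_compliant(uri: str) -> bool:
    """True when the local name starts with an uppercase character."""
    # Slicing keeps an empty local name safe: "".isupper() is False,
    # so such classes count as non-compliant.
    return local_name(uri)[:1].isupper()


def capital_coverage(class_uris: list[str]) -> float:
    """Percentage of classes whose local name starts with a capital."""
    if not class_uris:
        return 0.0  # assumption: report 0% when there is nothing to check
    compliant = sum(is_compliant(uri) for uri in class_uris)
    return 100.0 * compliant / len(class_uris)


# Hypothetical class URIs, for illustration only.
classes = [
    "http://example.org/onto#VoltageRating",
    "http://example.org/onto#voltageRange",
    "http://example.org/onto/Sample",
    "http://example.org/onto#",  # empty local name
]
print(f"{capital_coverage(classes):.1f}%")  # prints 50.0%
```

Taking only the first character before calling str.isupper() means digits and underscores at the start of a local name are reported as non-compliant, which matches the definition of a compliant class given above.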
- mainClassNameSpaceCheck_v_0_0_1(ttl_file, export_template=None)[source]
Class Name Space Check Analysis
Scan an OWL ontology in Turtle (ttl) format for class names containing spaces. Spaces in prefixed class names make a TTL file unparseable, so this function works directly on the raw file text rather than parsing it through rdflib.
Handles both single-line and multi-line class declarations by grouping lines into declaration blocks before checking for spaces.
Author: Rishabh Kundu Version: 0.0.1
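- Parameters:
ttl_file (str) – Path to the ontology Turtle (.ttl) file to scan.
export_template (str, optional) – Output filename for a CSV report of the class names that contain spaces. Default is None (no export).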
- Returns:
None – Prints results to terminal/CLI and optionally exports a CSV report.
Output Information
When executed successfully, the analysis provides
- Total number of class names containing spaces
- For each match: the class name, its line number, and the full text of the line
Error Handling
- FileNotFoundError (When the specified TTL file cannot be found)
Notes
Does not use rdflib parsing - works on raw file text so it catches errors that would prevent parsing entirely
Detects owl:Class and rdfs:Class declarations
Handles multi-line declarations by grouping on ‘.’ block terminators
Comments and string literals are handled to avoid false matches
Note
Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.
Examples
- Basic usage:
mainClassNameSpaceCheck_v_0_0_1("ontology.ttl")
- Export CSV report:
mainClassNameSpaceCheck_v_0_0_1("ontology.ttl", export_template="classes_with_spaces_in_names.csv")
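The raw-text approach described for mainClassNameSpaceCheck_v_0_0_1 (group lines into blocks ending in '.', then inspect the subject of each class declaration) can be sketched as follows. This is a simplified illustration, not the function's actual code: the names strip_comment, find_spaced_class_names, and CLASS_DECL are hypothetical, only prefixed subjects immediately followed by "a" or "rdf:type" are considered, and escaped quotes inside string literals are not handled.

```python
import re

# A prefixed subject followed directly by "a owl:Class" / "rdf:type rdfs:Class".
# The subject may not contain ';', ',', quotes, or angle brackets.
CLASS_DECL = re.compile(
    r'^(?P<subject>[^;,"<>]+?)\s+(?:a|rdf:type)\s+(?:owl|rdfs):Class\b'
)


def strip_comment(line: str) -> str:
    """Drop a trailing '#' comment, ignoring '#' inside <...> or "..."."""
    in_iri = in_str = False
    for i, ch in enumerate(line):
        if ch == '"' and not in_iri:
            in_str = not in_str
        elif ch == "<" and not in_str:
            in_iri = True
        elif ch == ">" and not in_str:
            in_iri = False
        elif ch == "#" and not in_iri and not in_str:
            return line[:i]
    return line


def find_spaced_class_names(text: str) -> list[tuple[int, str]]:
    """Return (first line number, subject) for class subjects with a space."""
    hits = []
    block, start = [], None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = strip_comment(raw).strip()
        if not line:
            continue
        if start is None:
            start = lineno  # first line of the current declaration block
        block.append(line)
        if line.endswith("."):  # '.' terminates the block
            match = CLASS_DECL.match(" ".join(block))
            if match and " " in match.group("subject"):
                hits.append((start, match.group("subject")))
            block, start = [], None
    return hits


sample = """@prefix ex: <http://example.org/onto#> .
ex:Voltage Rating a owl:Class ;   # name contains a space
    rdfs:label "Voltage rating" .
ex:Sample a owl:Class .
"""
print(find_spaced_class_names(sample))  # [(2, 'ex:Voltage Rating')]
```

Working on text rather than a parsed graph is what lets a check like this report the offending line even though the same file would fail to load in a Turtle parser.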
- mainLabelCheck_v_0_0_1(ttl_file, show='all', export_template=None)[source]
RDFS Label Coverage Analysis
Analyze an OWL ontology in Turtle (ttl) format to assess the coverage and quality of RDFS labels (rdfs:label) across all named classes.
This main function loads an ontology file, identifies all named classes, and reports label coverage, with several display options and an optional CSV export.
Definitions
Named classes: Classes with URIRef identifiers that are explicitly declared as owl:Class or rdfs:Class, or participate in rdfs:subClassOf relations
Valid labels: rdfs:label values that are non-empty strings after whitespace trimming. Empty strings and whitespace-only labels are not counted as valid labels
Coverage percentage: The proportion of named classes that have at least one valid rdfs:label
Author: Rishabh Kundu Version: 0.0.1
- param ttl_file:
Path to the ontology Turtle (.ttl) file to analyze – input file
- type ttl_file:
str
- param show:
Display option controlling what information to show:
- “all” (default): Shows summary statistics, classes with labels, and classes without labels
- “with”: Shows only classes that have rdfs:label
- “without”: Shows only classes that lack rdfs:label
- “summary”: Shows only summary statistics
- type show:
str, optional
- param export_template:
Export a CSV template file for classes missing rdfs:label. Provide the desired output filename (e.g. “missing_labels_in_classes.csv”). Default is None (no export)
- type export_template:
str, optional
- returns:
None – This function does not directly return values. It prints analysis results to your terminal/CLI and optionally exports a CSV template file. The function may exit early on errors (file not found, parsing errors, or no classes found)
Output Information
When executed successfully, the analysis provides
- Total number of named classes analyzed
- Number of classes with valid rdfs:label properties
- Number of classes lacking valid rdfs:label properties
- Coverage percentage of classes with rdfs:label
- Prefixed class name and full URI/IRI for each class
Error Handling
- FileNotFoundError (When the specified TTL file cannot be found)
- Parsing errors (When the TTL file cannot be parsed as valid Turtle)
- Empty ontology (When no named classes are found in the ontology)
Notes
Only named classes (URIRef instances) are considered in the analysis
Empty strings and whitespace-only labels are treated as missing labels
Classes are displayed with both their prefixed name and full URI/IRI
The show and export_template parameters default to “all” and None respectively, so CSV export must be requested explicitly
Note
Claude AI (Sonnet 4.6) was employed chiefly to support documentation efforts.
Examples
- Basic usage (show all):
mainLabelCheck_v_0_0_1("ontology.ttl")
- Show only summary:
mainLabelCheck_v_0_0_1("ontology.ttl", show="summary")
- Show only classes missing labels:
mainLabelCheck_v_0_0_1("ontology.ttl", show="without")
- Export CSV template for missing labels:
mainLabelCheck_v_0_0_1("ontology.ttl", export_template="missing_labels.csv")
- Export template while showing summary only:
mainLabelCheck_v_0_0_1("ontology.ttl", show="summary", export_template="missing_labels.csv")
In both export examples, the filename may include the desired export path.
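The valid-label and coverage definitions for mainLabelCheck_v_0_0_1 can be illustrated without loading an ontology. The sketch below is an assumption-laden simplification, not the function's implementation: the names valid_labels and label_coverage and the example.org URIs are hypothetical, and the labels are given as a plain dictionary mapping each class URI to its rdfs:label strings.

```python
def valid_labels(labels: list[str]) -> list[str]:
    """Keep only labels that are non-empty after whitespace trimming."""
    return [label.strip() for label in labels if label.strip()]


def label_coverage(labels_by_class: dict[str, list[str]]) -> tuple[float, list[str]]:
    """Return (coverage percentage, classes without a valid label)."""
    missing = [
        uri for uri, labels in labels_by_class.items()
        if not valid_labels(labels)
    ]
    total = len(labels_by_class)
    # Assumption: report 0% for an empty input rather than raising.
    pct = 100.0 * (total - len(missing)) / total if total else 0.0
    return pct, missing


# Hypothetical input, for illustration only.
labels_by_class = {
    "http://example.org/onto#Sample": ["Sample"],
    "http://example.org/onto#Furnace": ["   "],     # whitespace only
    "http://example.org/onto#Alloy": [],            # no label at all
    "http://example.org/onto#Test": ["", "Test"],   # one valid label suffices
}
pct, missing = label_coverage(labels_by_class)
print(f"{pct:.1f}% covered, {len(missing)} classes missing labels")
# prints: 50.0% covered, 2 classes missing labels
```

The list of missing classes corresponds to what the “without” display option and the CSV template report.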
- mainClassSearch_v_0_0_1(ontology_graph_or_path, search_term: str) → list[source]
Class Name Substring Search
Evaluates an ontology file to find all class names that contain a specified substring, irrespective of capitalization.
This metric assesses whether a concept can be located by matching a string against class names, which helps identify overlapping concepts, inconsistent naming, or potential duplicates (e.g., ‘VoltageRating’ and ‘VoltAgeRange’).
Author: Redad Mehdi Version: 0.0.1
Parameters:
- ontology_graph_or_path (str or rdflib.Graph)
Path to the Turtle (.ttl) file to analyze, or a pre-loaded rdflib.Graph object.
- search_term (str)
The string to search for within the class names (case-insensitive).
Returns:
- list of dict
A list of dictionaries, one per matched class. Each dictionary contains:
- ‘class_name’: The extracted local name of the class (str).
- ‘uri’: The full URI string of the class (str).
Returns an empty list if no matches are found.
Notes:
The function identifies class names by looking at subjects typed as owl:Class or rdfs:Class. It extracts the local name by splitting the URI at the last ‘#’ or ‘/’ character before performing the case-insensitive comparison.
Example:
>>> matches = mainClassSearch_v_0_0_1('dataset.ttl', 'VOLtage')
>>> print(f"Found {len(matches)} matching classes.")
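The matching rule described in the Notes (extract the local name, then compare case-insensitively) can be sketched on a plain list of URIs. This is only an illustration under stated assumptions: search_classes and the example.org URIs are hypothetical, and the step that collects subjects typed as owl:Class or rdfs:Class from the graph is omitted.

```python
def search_classes(class_uris: list[str], search_term: str) -> list[dict]:
    """Return class_name/uri records whose local name contains the term."""
    needle = search_term.lower()
    matches = []
    for uri in class_uris:
        # Local name: everything after the last '#' or '/'.
        cut = max(uri.rfind("#"), uri.rfind("/"))
        name = uri[cut + 1:]
        if needle in name.lower():
            matches.append({"class_name": name, "uri": uri})
    return matches


# Hypothetical class URIs, for illustration only.
uris = [
    "http://example.org/onto#VoltageRating",
    "http://example.org/onto#VoltAgeRange",
    "http://example.org/onto#CurrentRating",
]
found = search_classes(uris, "VOLtage")
print([m["class_name"] for m in found])  # ['VoltageRating', 'VoltAgeRange']
```

Because the comparison uses only the local name, a search term that happens to appear in the namespace part of the URI does not produce a match.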
- run_ontology_assessment(ttl_file, metrics, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')[source]
Run task-agnostic metrics on a single ontology (Mode 1).
- run_task_based_assessment(ttl_files, questions, domain_prefixes, domain_ns_fragments=None, metrics=None, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')[source]
Assess one or more ontologies against competency questions (Modes 3/4).
When a single ontology is provided this corresponds to Mode 3 (task-based scientific assessment). When multiple ontologies are provided they are merged and evaluated jointly, corresponding to Mode 4 (cross-domain assessment).
- Parameters:
ttl_files (str or list of str) – Path(s) to Turtle (.ttl) ontology file(s). A single path is accepted and will be wrapped in a list internally.
questions (str or list of str) – Path to a JSON/Markdown file of SPARQL queries, or a list of raw SPARQL query strings.
domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries (e.g., ["mds"]).
domain_ns_fragments (list of str or None, optional) – Namespace URI fragments to restrict domain-term filtering.
metrics (list of str or None, optional) – Task-agnostic metric names to run alongside the task-based assessment. Pass "all" for every available metric.
output_log_file (str, optional) – Output log file path.
output_csv_file (str, optional) – Output CSV file path.
- Returns:
The result dictionary from task_based_metric_v_0_0_1.
- Return type:
dict
- run_web_ontology_assessment(ttl_file, questions, domain_prefixes, knowledge_graph, domain_ns_fragments=None, metrics=None, output_log_file='assessment.log', output_csv_file='assessment_scores.csv')[source]
Assess a Web ontology against KGQA benchmark queries (Mode 2).
Runs the task-based Relevance/Accuracy assessment using competency queries drawn from a knowledge-graph question-answering benchmark (e.g., LC-QuAD over DBpedia). Optionally runs task-agnostic metrics as well.
- Parameters:
ttl_file (str) – Path to the ontology Turtle file.
questions (str or list of str) – Path to a JSON/Markdown file of SPARQL queries, or a list of raw SPARQL query strings.
domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries (e.g., ["dbo"]).
knowledge_graph (str) – Path to the knowledge-graph file (Turtle/RDF) used for validation context.
domain_ns_fragments (list of str or None, optional) – Namespace URI fragments to restrict domain-term filtering.
metrics (list of str or None, optional) – Task-agnostic metric names to run alongside the task-based assessment. Pass "all" for every available metric.
output_log_file (str, optional) – Output log file path.
output_csv_file (str, optional) – Output CSV file path.
- task_based_metric_v_0_0_1(ttl_file, questions, domain_prefixes, domain_ns_fragments=None)[source]
Compute task-based Relevance and Accuracy for an ontology.
Given an ontology (one or more Turtle files) and a set of competency questions expressed as SPARQL queries, this function computes two term-overlap metrics:
Relevance (Recall) = |T_a ∩ T_o| / |T_a|
Accuracy (Precision) = |T_a ∩ T_o| / |T_o|
where T_a is the union of domain terms extracted from all SPARQL queries and T_o is the set of domain terms defined in the ontology.
- Parameters:
ttl_file (str, pathlib.Path, or list thereof) – Path(s) to Turtle (.ttl) ontology file(s). A single string or Path is automatically wrapped in a list.
questions (str, pathlib.Path, or list of str) – The competency questions to evaluate against. Accepted forms:
- str / Path ending in .json – path to a JSON file where each array element has a sparql_query key.
- str / Path ending in .md – path to a Markdown file with SPARQL queries inside fenced sparql code blocks.
- list of str – raw SPARQL query strings.
domain_prefixes (list of str) – Namespace prefixes used in the SPARQL queries to identify domain terms (e.g., ["mds"]).
domain_ns_fragments (list of str or None, optional) – Sub-strings of namespace URIs used to restrict which ontology terms count as domain-specific. When None, every non-foundational term is included.
- Returns:
A dictionary with the following keys:
- relevance (float): Recall – fraction of task terms present in the ontology.
- accuracy (float): Precision – fraction of ontology terms referenced by the tasks.
- T_o_count (int): Number of ontology domain terms.
- T_a_count (int): Number of unique task terms.
- intersection (int): Number of terms in both sets.
- missing_from_onto (set of str): Task terms absent from the ontology.
- unused_in_onto (set of str): Ontology terms not referenced by any task query.
- Return type:
dict
- Raises:
ValueError – If questions is not a recognized type (list, JSON path, or Markdown path).
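The three accepted forms of questions, and the ValueError for anything else, can be sketched as a small normalization step. This is a hypothetical illustration rather than the library's loader: load_queries, FENCE, and SPARQL_BLOCK are invented names, and the exact fence-matching rules OntoCheck applies to Markdown files are an assumption.

```python
import json
import re
from pathlib import Path

# The fence marker is built at runtime so this listing contains no literal fence.
FENCE = "`" * 3
SPARQL_BLOCK = re.compile(FENCE + r"sparql\s*\n(.*?)" + FENCE, re.DOTALL)


def load_queries(questions) -> list[str]:
    """Normalize a list, a .json path, or a .md path into SPARQL strings."""
    if isinstance(questions, list):
        return list(questions)  # raw SPARQL strings, used as given
    if isinstance(questions, (str, Path)):
        path = Path(questions)
        if path.suffix == ".json":
            # Each array element is expected to carry a "sparql_query" key.
            entries = json.loads(path.read_text(encoding="utf-8"))
            return [entry["sparql_query"] for entry in entries]
        if path.suffix == ".md":
            # Collect the body of every fenced sparql code block.
            text = path.read_text(encoding="utf-8")
            return [body.strip() for body in SPARQL_BLOCK.findall(text)]
    raise ValueError(f"Unrecognized questions input: {questions!r}")


print(load_queries(["ASK { ?s ?p ?o }"]))  # ['ASK { ?s ?p ?o }']
```

Dispatching on the file suffix keeps the three input forms unambiguous: a string that ends in neither .json nor .md is rejected instead of being guessed at.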
Examples
>>> result = task_based_metric_v_0_0_1(
...     ttl_file="my_ontology.ttl",
...     questions="competency_questions.json",
...     domain_prefixes=["mds"],
...     domain_ns_fragments=["cwrusdle.bitbucket.io/mds"],
... )
>>> print(f"Relevance: {result['relevance']:.2%}")
>>> print(f"Accuracy: {result['accuracy']:.2%}")
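The Relevance and Accuracy formulas reduce to set arithmetic once the task terms T_a and ontology terms T_o are known. The sketch below works on two ready-made sets and mirrors the documented result keys; term_overlap and the sample terms are hypothetical, term extraction from SPARQL and Turtle is not shown, and returning 0.0 for an empty set is an assumption about edge-case behavior.

```python
def term_overlap(task_terms: set[str], onto_terms: set[str]) -> dict:
    """Compute Relevance (recall) and Accuracy (precision) from two term sets."""
    shared = task_terms & onto_terms
    return {
        # Assumption: an empty denominator yields 0.0 instead of an error.
        "relevance": len(shared) / len(task_terms) if task_terms else 0.0,
        "accuracy": len(shared) / len(onto_terms) if onto_terms else 0.0,
        "T_o_count": len(onto_terms),
        "T_a_count": len(task_terms),
        "intersection": len(shared),
        "missing_from_onto": task_terms - onto_terms,
        "unused_in_onto": onto_terms - task_terms,
    }


# Hypothetical term sets, for illustration only.
T_a = {"Sample", "Furnace", "hasTemperature", "Alloy"}
T_o = {"Sample", "Furnace", "hasTemperature", "Batch", "Operator"}

result = term_overlap(T_a, T_o)
print(f"Relevance: {result['relevance']:.2%}")  # Relevance: 75.00%
print(f"Accuracy: {result['accuracy']:.2%}")    # Accuracy: 60.00%
```

Here three of the four task terms appear in the ontology (Relevance 3/4), while three of the five ontology terms are used by the tasks (Accuracy 3/5); missing_from_onto is {"Alloy"} and unused_in_onto is {"Batch", "Operator"}.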