More on Ontology Engineering in the Enterprise

In a previous blog post I discussed ontology engineering with respect to enterprise technical architecture, and the use of well-understood design patterns in aligning semantic technologies with traditional ones. One of the common questions I hear in the semantic technology communities is why we continue to create new ontologies rather than re-using public domain ontologies and upper ontologies to model most domains. I will attempt to answer this, again from a practical, enterprise technical architecture viewpoint.

#### Control

When designing a traditional architecture with a relational database at or near the bottom of your tech stack, the relational model is constructed to deliver precisely the requirements and use cases of the target system. Control is retained within the organisation, by the technical and data architects. The use cases and business requirements inform the model, and conversely the data model informs the developers building services upon it to meet those use cases. Control of this data model is thus important in ensuring the quality and robustness of the software architecture, and in being able to react efficiently and with agility to changing requirements.

So with an ontology-driven data architecture, where the model is pervasive throughout the tech stack, it is equally important, if not more so, that this control is retained. Modelling purely with public domain upper ontologies as construction blocks, or with a cohesive ontology for a similar domain, may indeed deliver your business case, but there is some loss of control. A model constructed purely from upper ontology Lego will be less cohesive and harder to validate domain object data operations against (as the RDF will be more generic), and because the model is less domain specific it will be easier for bugs to creep into the software that binds to the ontology. When adopting a cohesive third-party domain-specific ontology that is a close fit to your own domain, control is lost purely in the fact that the model is not your own to change at will (physically of course you can, but divergence then brings its own issues). Unless your domain is a perfect match, it is likely you will need to diverge from the model.

#### Communication

As discussed, with an ontology-driven data architecture the model is pervasive throughout the tech stack. As such, a cohesive, rich domain model that accurately reflects your requirements is easy to communicate. Like well-designed and well-coded software that is self-documenting, a cohesive domain-specific ontology, with classes and properties named expressively in your domain and meeting your requirements, is self-informing. A less cohesive ontology constructed from a mash-up of upper ontologies will require significantly more explanatory documentation to communicate the ‘fit’ of the model to the business requirements, both to internal consumers of the ontology and to external ones, and it risks misuse or ambiguous use of the classes and properties.

#### Consumers

So what about your consumers who understand Schema.org, or rNews, or FOAF, etc.? Why should they have to deal with a new schema? Well, they don’t.

The fact that your internal architecture has been built upon a rich, cohesive ontology that fits your requirements and provides for robust, easy-to-maintain enterprise software need not determine (or compromise!) what you publish to any given consumer. Marking up content to a specific schema representing your linked data can be done as a post-publication (or post-content-creation) exercise. Your consumers are thus not coupled to the schema you have used to deliver your internal use cases, just as traditionally your consumers should not be coupled to your relational model. You might have consumers that are happy to accept content marked up in conformance with your cohesive domain ontology. Should a consumer ask for Schema.org instead, however, you can map the content markup to it via an outbound transformation. The late-binding ethos of the REST philosophy still applies.
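To make the outbound transformation a little more concrete, here is a minimal sketch in Python. It is an illustration only: the internal namespace `http://example.org/ontology/news#` and its terms (`Story`, `writtenBy`, `firstPublished`, `editorialPriority`) are hypothetical, as are the `TERM_MAP` table and the `to_schema_org` function. Only the Schema.org terms and the `rdf:type` IRI are real. Triples are modelled as plain tuples so the example has no dependencies; in practice you would more likely use an RDF library or a SPARQL `CONSTRUCT` query.

```python
from typing import Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]

# Hypothetical internal ontology namespace (an assumption for this sketch).
INTERNAL = "http://example.org/ontology/news#"
SCHEMA = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping from internal classes and properties to Schema.org terms.
TERM_MAP = {
    INTERNAL + "Story": SCHEMA + "NewsArticle",
    INTERNAL + "headline": SCHEMA + "headline",
    INTERNAL + "writtenBy": SCHEMA + "author",
    INTERNAL + "firstPublished": SCHEMA + "datePublished",
}


def to_schema_org(triples: Iterable[Triple]) -> Iterator[Triple]:
    """Rewrite internal-ontology triples into Schema.org terms.

    Triples whose class or property has no entry in TERM_MAP are dropped,
    so internal-only detail never reaches the consumer.
    """
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            # Type assertions: translate the class IRI in the object position.
            if obj in TERM_MAP:
                yield (subject, predicate, TERM_MAP[obj])
        elif predicate in TERM_MAP:
            # Ordinary properties: translate the predicate, keep the value.
            yield (subject, TERM_MAP[predicate], obj)


story = "http://example.org/content/story/123"
internal_triples = [
    (story, RDF_TYPE, INTERNAL + "Story"),
    (story, INTERNAL + "headline", "More on Ontology Engineering"),
    (story, INTERNAL + "editorialPriority", "high"),  # internal only, no mapping
]

published = list(to_schema_org(internal_triples))
for triple in published:
    print(triple)
```

The internal model stays as rich as it needs to be, and the mapping table is the only place that knows about the consumer's schema. Supporting rNews or FOAF would, under the same assumptions, mean adding another table rather than changing the internal ontology.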

This raises the question of how a consumer of your semantic content API should request content marked up to a different schema, and not just a different schema, but a different ontological model within that schema! A good subject for a follow-up blog post, maybe.