Press.Net News Ontology

I have recently been helping the Press Association build out a semantic news platform. As part of this we have built a set of ontologies to represent news assets and their relationships to the real world. These will allow news publishers to publish RDF feeds of news in a common, standard way, classified and annotated with linked data concepts. I have put together some example RDF to demonstrate their use.

The Press.net news ontology is made up of five smaller ontologies, each of which addresses a specific modeling use case for news.

The Asset ontology is a very lightweight ontology for referencing news assets (text articles, images and video) and some simple relationships between them.

The Classification Ontology lets us holistically classify an asset in some broad category - e.g. this is a “Politics” article. It lets us categorise our assets against some external taxonomy or vocabulary.

Using a recent BBC article about Steve Jobs' resignation and pseudo URIs for stuff instances, here is a simple example of how we would represent a classified news story:

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix pna:  <http://data.press.net/ontology/asset/> .
@prefix pnc:  <http://data.press.net/ontology/classification/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
<http://www.bbc.co.uk/news/business-14659127>
  rdf:type pna:Text;
  pna:title "Apple shares fall as Jobs quits"^^xsd:string;
  pna:created "2011-08-25T16:54:00Z"^^xsd:dateTime;
  pnc:isClassifiedBy <http://cv.iptc.org/newscodes/mediatopic/20000170>.

The Tag ontology lets us semantically associate news assets with stuff (a common pattern). Using just two predicates, we can say that a news article is primarily “about” something and “mentions” something else.

The Stuff ontology models tangible and intangible stuff from the real world (things that news assets are about or mention).

Using the Tag ontology with the previous example we now have:

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix pna:  <http://data.press.net/ontology/asset/> .
@prefix pnc:  <http://data.press.net/ontology/classification/> .
@prefix pnt:  <http://data.press.net/ontology/tag/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
<http://www.bbc.co.uk/news/business-14659127>  
  rdf:type pna:Text;
  pna:title "Apple shares fall as Jobs quits"^^xsd:string;
  pna:created "2011-08-25T16:54:00Z"^^xsd:dateTime;
  pnc:isClassifiedBy <http://cv.iptc.org/newscodes/mediatopic/20000170> ;
  pnt:about     <http://some.domain/person/steve_jobs> ;
  pnt:mentions  <http://some.domain/thing/iphone> ;
  pnt:mentions  <http://some.domain/organization/apple> ;
  pnt:mentions  <http://some.domain/organization/samsung_electronics> .

The Event ontology inherits much of its behaviour from a public domain event ontology (http://purl.org/NET/c4dm/event.owl#) and acknowledges that the majority of news is typically about real world events rather than directly about the stuff associated with the event. This is a subtle but important difference.

An article about “The Royal Wedding” may mention Prince William, Kate Middleton and Westminster Abbey. But in fact it is about the Royal Wedding event: the event has William and Kate as agents/actors, Westminster Abbey as its location, and of course a datetime. So the news article would be tagged as about the event rather than directly with William, Kate, and Westminster Abbey. I imagine there will be a number of news articles authored about the royal wedding, and each of these can then be associated with the same event. The Event class subEvent property is transitive, allowing us to create a hierarchy of events. We can then do some nice aggregations or feeds of news about a specific event or events.

Another subtle yet important distinction the Stuff vs Event association gives us is with locations. An article tagged as “about” a murder in Los Angeles is not really about LA; rather, it is about an event that has occurred in LA. An article written about the climate of LA, though, is genuinely about LA, and is what you would probably want to surface in a search or an aggregation of articles about Los Angeles.
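
The Royal Wedding case can be sketched roughly as follows. This is only an illustration under a few assumptions: every http://some.domain/ URI is a made-up placeholder, the time of day is illustrative, and I am assuming that event:place and event:sub_event from the C4DM Event Ontology are the properties to use for the location and the event hierarchy (the Press.net Event ontology may expose its own names for these).

```turtle
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix pna:   <http://data.press.net/ontology/asset/> .
@prefix pnt:   <http://data.press.net/ontology/tag/> .
@prefix pne:   <http://data.press.net/ontology/event/> .
@prefix event: <http://purl.org/NET/c4dm/event.owl#> .
@prefix tl:    <http://purl.org/NET/c4dm/timeline.owl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .

# Two hypothetical articles, both tagged as about the same event.
<http://some.domain/article/royal-wedding-preview>
    rdf:type pna:Text ;
    pnt:about <http://some.domain/event/royal-wedding> .

<http://some.domain/article/royal-wedding-crowds>
    rdf:type pna:Text ;
    pnt:about <http://some.domain/event/royal-wedding> .

# A third hypothetical article, about one part of the day only.
<http://some.domain/article/ceremony-report>
    rdf:type pna:Text ;
    pnt:about <http://some.domain/event/royal-wedding-ceremony> .

# The people, place and time hang off the event, not the articles.
<http://some.domain/event/royal-wedding>
    rdf:type pne:Event ;
    pne:title "The Royal Wedding" ;
    event:agent <http://some.domain/person/prince_william> ;
    event:agent <http://some.domain/person/kate_middleton> ;
    event:place <http://some.domain/place/westminster_abbey> ;
    event:time [
        a tl:Instant ;
        # Illustrative time of day on the wedding date.
        tl:at "2011-04-29T11:00:00+01:00"^^xsd:dateTime
    ] ;
    # Assumed hierarchy property: the ceremony is part of the wider event.
    event:sub_event <http://some.domain/event/royal-wedding-ceremony> .

<http://some.domain/event/royal-wedding-ceremony>
    rdf:type pne:Event ;
    pne:title "The Royal Wedding Ceremony" .
```

With data shaped like this, a feed for the Royal Wedding could pick up the first two articles directly and, by following the sub-event link, the ceremony report as well.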

So, returning to the previous example, adding the Event ontology and fleshing the RDF out further, we now have:

@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix pna:     <http://data.press.net/ontology/asset/> .
@prefix pns:     <http://data.press.net/ontology/stuff/> .
@prefix pnt:     <http://data.press.net/ontology/tag/> .
@prefix pne:     <http://data.press.net/ontology/event/> .
@prefix pnc:     <http://data.press.net/ontology/classification/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix event:   <http://purl.org/NET/c4dm/event.owl#> .
@prefix tl: <http://purl.org/NET/c4dm/timeline.owl#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.

<http://www.bbc.co.uk/news/business-14659127>  
  rdf:type pna:Text;
  pna:title "Apple shares fall as Jobs quits"^^xsd:string;
  pna:created "2011-08-25T16:54:00Z"^^xsd:dateTime;
  pnc:isClassifiedBy <http://cv.iptc.org/newscodes/mediatopic/20000170> ;
  pnc:isClassifiedBy <http://cv.iptc.org/newscodes/mediatopic/20000225> ;
  pnt:about <http://some.domain/event/steve-jobs-resigns> ;
  pnt:mentions <http://some.domain/thing/iphone> ;
  pnt:mentions <http://some.domain/organization/samsung_electronics> ;
  pnt:mentions <http://some.domain/organization/htc> .

<http://some.domain/event/steve-jobs-resigns>
    rdf:type pne:Event ;
    event:agent <http://some.domain/person/steve_jobs> ;
    event:agent <http://some.domain/person/tim_cook> ;
    event:agent <http://some.domain/organization/apple> ;
    event:time [
        a tl:Instant;
        tl:at "2011-08-25T12:00:00"^^xsd:dateTime;
        ];
    pne:title "Steve Jobs Resigns From Apple" .
    
<http://some.domain/person/tim_cook>
    rdf:type pns:Person ;
    pns:name "Tim Cook" ;
    foaf:givenName "Tim" ;
    foaf:familyName "Cook" ;
    pns:notablyAssociatedWith <http://some.domain/organization/apple> .
    
<http://some.domain/person/steve_jobs>
    rdf:type pns:Person ;
    pns:name "Steve Jobs" ;
    foaf:givenName "Steve" ;
    foaf:familyName "Jobs" ;
    pns:notablyAssociatedWith <http://some.domain/organization/apple>;
    pns:notablyAssociatedWith <http://some.domain/person/tim_cook>.

<http://some.domain/organization/apple>
    rdf:type pns:Organization ;
    pns:name "Apple Computer" .

<http://cv.iptc.org/newscodes/mediatopic/20000170>
    rdf:type pnc:Classification .

<http://cv.iptc.org/newscodes/mediatopic/20000225>    
    rdf:type pnc:Classification .
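
The Los Angeles distinction described earlier can be sketched in the same way. Again this is only an illustration: the http://some.domain/ URIs are placeholders, and event:place from the C4DM Event Ontology is my assumption for how the event's location would be recorded.

```turtle
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix pna:   <http://data.press.net/ontology/asset/> .
@prefix pnt:   <http://data.press.net/ontology/tag/> .
@prefix pne:   <http://data.press.net/ontology/event/> .
@prefix event: <http://purl.org/NET/c4dm/event.owl#> .

# Hypothetical crime report: about an event, which happens to be in LA.
<http://some.domain/article/la-murder-report>
    rdf:type pna:Text ;
    pnt:about <http://some.domain/event/la-murder> .

<http://some.domain/event/la-murder>
    rdf:type pne:Event ;
    # Assumed location property from the C4DM Event Ontology.
    event:place <http://some.domain/place/los_angeles> .

# Hypothetical climate feature: genuinely about LA itself.
<http://some.domain/article/la-climate-feature>
    rdf:type pna:Text ;
    pnt:about <http://some.domain/place/los_angeles> .
```

A query for assets that are pnt:about Los Angeles would return only the climate feature, while the murder report remains reachable through the place of the event it is about.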