DEIM Forum 2011 B3-4 Linked Data 112 8610 2 1 1 E-mail: {onishi.kanako,koba}@is.ocha.ac.jp Web Linked Data Linked Data Linked Data Web Linked Data DBpedia WordNet Abstract An Information Enhancement Technique by using Linked Data Kanako ONISHI and Ichiro KOBAYASHI Ochanomizu University, Graduate School of Humanities and Sciences, Advanced Sciences Ostuka 2 1 1, Bunkyo-ku, Tokyo, 112 8610 Japan E-mail: {onishi.kanako,koba}@is.ocha.ac.jp Recently, it has been a big issue of how we effectively utilize the huge amount of Web documents. With the background of this issue, many fundamental technologies such as search engines, recommendation systems etc. have so far been developed. In particular, as the next promising technology, Semantic Web technology has been widely noticed and nowadays many application systems based on the technology have been getting developed. In this paper, we propose a new information extraction technique from the Web which can provide a user with the information related to user s interests by using the resources of Semantic Web framework, especially Linked Data in this study. Key words Semantic Web Linked Data Information Enhancement DBpedia WordNet 1. Web 1998 Tim Berners-Lee [1] HTML XML XML Schema RDF RDF Schema OWL Tim Berners-Lee Linked Data [2] [3] Linked Data 800 Linked Data Geonames 1 MusicBrainz 2 WordNet [4] Dbpedia [5] DBpedia Wikipedia Web RDF URI URI Linked Data Web 1 http://www.geonames.org/ 2 http://musicbrainz.org/
Web 2. Linked Data Swoogle [6] Watson 3 SWME 4 Sindicec [7] Linked Data BBC BBC Linked Data DBpedia MusicBrainz [8] [9] DBpedia Flickr Google Image [10] Bernhard DBpedia [11] Linked Data Aastrand DBpedia [12] DBpedia Mobile [13] GPS DBpedia Linked Data [14] Linked Data [15] Tokyo Tokyo URI RDF RDF Domain, Property, Range 1 ➁ WordNet 1 ➂ URI RDF 1 ➃ RDF SPARQL 5 1 ➄ Linked Data 1 ➅ 1 ➆ RDF SPARQL 1 ➇ Linked Data 1 I 1 II 3. 1 D 3. 1 Web 1 ➀ URI 3 http://kmi-web05.open.ac.uk/watsonwui/ 4 http://swse.deri.org/ 1 5 http://www.w3.org/tr/rdf-sparql-query/
D N = {n 1, n 2,..., n i} opennlp 6 N n D f D(n) n D pos(n) = {p 1, p 2,..., p fd (i)} D n W (n) W (n) = 1 (f D(n) 1) α n ( p i+1 p i p) 2 (1) i=1 n D n n D (1) 1 f D (n) 1 α α 3 p n D X p = X f D W (n) n (n) W (n) n D n D 3. 2 D n k D n l n k K(n k, n l ) K(n k, n l ) = I(n k, n l ) f D(n l ) (2) I(n k, n l ) n k n l I(n k, n l ) = log p(n k, n l ) p(n k )p(n l ) (n k n l ) (3) (3) p(n k, n l ) p(n k ) p(n l ) p(n k, n l ) = f D(n k, n l ), p(n k ) = f D(n k ) X X, p(n l) = f D(n l ) X X D f D(n k, n l ) n k n l I(n k, n l ) n k n l 6 http://incubator.apache.org/opennlp/ n k K(n k, n l ) K(n k, n l ) N K(n k, n l ) n l 3. 3 WordNet Synset Synset WordNet Japan {Japanese Islands, Japanese Archipelago, Nippon, Nihon} Synset 3. 4 3. 5 T ij (i = {0, 1, 2}; 0 < = j < = m 1) {0, 1} i 0 1 2 Dmain Property Range j n RDF T 00 T 01 T 0m 1 T ij = T 10 T 12 T 1m 1 (4) T 20 T 23 T 2m 1 n RDF Domain T 0j = 1 Property T 1j = 1 Range T 2j = 1 T 1 T 1 1 Property ( T 1j > = T0j ) ( T 1j > = T2j ) T 1 2 ( T 0j > = T1j ) ( T 0j > = T2j ) ( T 0j < α) ( T 2j > = T0j ) ( T 2j > = T1j ) ( T 2j < α) T
α 40 α T = 0 T T = 0 K(n k, n l ) n l T n T = 0 W (n) n D Algorithm 1 1: for C = {c 1,..., c i } do 2: R = {r 1,..., r j } 3: for R do 4: T 5: if T then 6: Property 7: else if T then 8: Domain Range 9: else if then 10: S = {s 1,..., s j } 11: for S do 12: T 13: if T then 14: Property 15: else if T then 16: Domain Range 17: end if 18: end for 19: end if 20: end for 21: end for 22: if then 23: 24: end if 3. 6 Linked Data DBpedia RDF SPARQL DBpedia 7 Property R Property P D SPARQL <R> <P>?hasValue } 7 http://dbpedia.org/sparql 2 SELECT?isValueOf WHERE {?isvalueof <P> <R> } Domain Range D R R Property D SELECT?property WHERE { <R>?property <R > } SELECT?property WHERE { <R >?property <R> } P P Domain Range D SELECT?isValue?hasValue WHERE {?isvalue <P>?hasValue } 3. 7 DBpedia 2 Resource1 P P Domain Range SPARQL Linked Data
3 Wikipedia Supermarine Spitfire 1 W (n) top 10 W (n) spitfire 4 10.546 fighter 4 47.953 supermarine 2 49.000 aircraft 3 107.125 british 2 729.000 war 2 1156.000 chief 2 1936.000 designer 2 1936.000 mitchell 2 3844.000 speed 2 4356.000 2 R α β R Resource1 2 Resource1 B B α spitfire spitfire r air 5.010 4.317 3. 8 spitfire war 4.145 P spitfire speed 4.145 spitfire supermarine 4.145 spitfire british 4.145 spitfire front 3.624 spitfire line 3.624 <P> rdfs:label?hasvalue } L rine Spitfire.rdf 11 L of isvalue(domain) is hasvalue(range). 4. 1 4. Wikipedia Supermarine Spitfire 8 3 W (n) W (n) 10 1 W (n) spitfire DBpedia Spitfire 9 Spitfire dbpedia-owl:wikipageredirects Spitfire.rdf 10 <rdf:description rdf:about= http://dbpedia.org/resource/spitfire > <dbpedia-owl:wikipageredirects rdf:resource= http://dbpedia.org/resource/supermarine Spitfire /> Spitfire Supermarine Spitfire Spitfire RDF Superma- 2 K(n k, n l ) top 10 K(n k, n l ) spitfire fighter 6.673 spitfire aircraft 5.575 K(n k, n l ) K(n k, n l ) 10 2 spitfire K(n k, n l ) Fighter/fighter Supermarine Spitfire.rdf Domain Property Range fighter <rdf:description rdf:about= http://dbpedia.org/resource/spitfire fighter > <dbpedia-owl:wikipageredirects xmlns:dbpedia-owl= http://dbpedia.org/ontology/ rdf:resource= http://dbpedia.org/resource/supermarine Spitfire /> <rdf:description rdf:about= http://dbpedia.org/resource/kampfgeschwader 200 > <dbpedia-owl:aircraftfighter xmlns:dbpedia-owl= http://dbpedia.org/ontology/ rdf:resource= http://dbpedia.org/resource/supermarine Spitfire /> 8 http://en.wikipedia.org/wiki/supermarine Spitfire 9 http://dbpedia.org/resource/spitfire/ 10 http://dbpedia.org/data/spitfire.rdf <rdf:description 11 http://dbpedia.org/data/supermarine Spitfire.rdf
rdf:about= http://dbpedia.org/resource/supermarine Spitfire > <dcterms:subject xmlns:dcterms= http://purl.org/dc/terms/ rdf:resource= http://dbpedia.org/resource/ Category:British fighter aircraft 1930-1939 /> RDF fighter 1 fighter Synset combatant, battler, belligerent, scrapper, fighter aircraft, attack aircraft, champion, hero, paladin RDF K(n k, n l ) aircraft RDF W (n) fighter spitfire fighter fighter K(n k, n l ) Supermarine Spitfire.rdf Domain Property Range fighter 14 T 0 0 0 0 0 0 0 1 0 0 0 0 1 1 T = 1 1 1 0 0 0 1 1 1 1 1 1 0 1 0 0 0 1 1 1 0 0 0 0 0 0 0 0 ( T 1j > = T0j ) ( T 1j > = T2j ) dbpedia-owl:aircraftfighter dbpedia-owl:aircraftfighter Supermarine Spitfire aircraftfighter Domain Range SPARQL SELECT?isValue WHERE {?isvalue dbpedia-owl:aircraftfighter dbpedia:supermarine Spitfire } Wikipedia Supermarine Spitfire 3 dbpedia:supermarine Spitfire dbpedia-owl:aircraftfighter?hasvalue } speed 4 spitfire Supermarine Spitfire.rdf Domain Property Range speed 2 SPARQL T 0 0 T = 1 1 dbpedia-owl:aircraftfighter rdfs:label?hasvalue 0 0 } ( T 1j > = T0j) ( T 1j > = aircraftfighter aircraft T2j) fighter 4 aircraft fighter of No. 452 Squadron RAAF is Supermarine Spitfire. aircraft fighter of No. 80 Wing RAAF is Supermarine Spitfire. aircraft fighter of No. 303 Polish Fighter Squadron is Supermarine Spitfire. aircraft fighter of No. 453 Squadron RAAF is Supermarine Spitfire. aircraftfighter SPARQL SELECT?isValue?hasValue WHERE {?isvalue dbpedia-owl:aircraftfighter?hasvalue } aircraft fighter of Syrian Air Force is Mikoyan MiG-29. aircraft fighter of Royal Malaysian Air Force is Mikoyan MiG-29. aircraft fighter of SFR Yugoslav Air Force is Mikoyan MiG-29. aircraft fighter of Serbian Air Force and Air Defence is Mikoyan MiG-29. 4. 2 dbpprop:maxspeedmain dbpprop:maxspeedalt
dbpprop:maxspeedmain dbpprop:maxspeedalt Supermarine Spitfire dbpprop:maxspeedmain dbpprop:maxspeedalt Domain Range dbpprop:maxspeedmain SPARQL dbpedia:supermarine Spitfire dbpprop:maxspeedmain?hasvalue} SELECT?isValue WHERE {?isvalue dbpprop:maxspeedmain dbpedia:supermarine Spitfire} Wikipedia DBpedia WordNet 5. max speed main of Supermarine Spitfire is 22680.0. maxspeedmain Linked Data SPARQL SELECT?isValue?hasValue WHERE {?isvalue dbpprop:maxspeedmain?hasvalue} URI DBpedia SPARQL Supermarine Spitfire dcterms:subject WordNet Synset Linked dbpedia:supermarine Spitfire dcterms:subject?hasvalue} Category:1938 introductions Category:Propeller aircraft Category:Single-engine aircraft Category:Carrier-based aircraft Category:British fighter aircraft 1930-1939 Category:Supermarine aircraft Category:Low wing aircraft Category:Propeller aircraft Category:Propeller aircraft dbpprop:maxspeedmain SPARQL SELECT?isValue?hasValue WHERE {?isvalue dbpprop:maxspeedmain?hasvalue.?isvalue dcterms:subject <http://dbpedia.org/resource/ Category:Propeller aircraft> } max speed main of Hawker Hurricaneis 340 mph. max speed main of Macchi C.200 is 504.0. max speed main of Macchi C.205 is 640.0. max speed main of Aichi D3A is 430.0. dbpprop:maxspeedalt 2011 2 14 Data URI RDF Linked Data SPARQL Linked Data DBpedia Ontology
[1] Berners-Lee, Semantic Web Road map(1998), http://www.w3.org/designissues/semantic.html [2] Berners-Lee, T. 2006. Linked data design issues. In http://www.w3.org/designissues/linkeddata.html. [3] Bizer, C.; Cyganiak, R.; and Heath, T. 2007. How to publish linked data on the web. In http://www4.wiwiss.fuberlin.de/bizer/pub/linkeddatatutorial/. [4] Miller, G. A. 1995. Wordnet: a lexical database for english. Commun. ACM 38:39-41. [5] Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak, R.; and Ives, Z. 2007. Dbpedia: a nucleus for a web of open data. In Proceedings of the 6th international The semantic web and 2nd Asian conference on Asian semantic web conference, ISWC 07/ASWC 07, 722-735. Berlin, Heidelberg: Springer-Verlag. [6] Ding, L.; Finin, T.; Joshi, A.; Pan, R.; Cost, R. S.; Peng, Y.; Reddivari, P.; Doshi, V.; and Sachs, J. 2004. Swoogle: a search and metadata engine for the semantic web. In Proceedings of the thirteenth ACM international conference on Information and knowledge management, CIKM 04, 652-659. New York, NY, USA: ACM. [7] Oren, E.; Delbru, R.; Catasta, M.; Cyganiak, R.; Stenzhorn, H.; and Tummarello, G. 2008. Sindice.com; a document;oriented lookup index for open linked data. Int. J. Metadata Semant. Ontologies 3:37-52. [8] Kobilarov, G.; Scott, T.; Raimond, Y.; Oliver, S.; Sizemore, C.; Smethurst, M.; Bizer, C.; and Lee, R. 2009. Media meets semantic web? how the bbc uses dbpedia and linked data to make connections. In Proceedings of the 6th European Semantic Web Conference on The Semantic Web: Research and Applications, ESWC 2009 Heraklion, 723-737. Berlin, Heidelberg: Springer-Verlag. [9] Waitelonis, J., and Sack, H. 2009. Towards exploratory video search using linked data. In Proceedings of the 2009 11th IEEE International Symposium on Multimedia, ISM 09, 540-545. Washington, DC, USA: IEEE Computer Society. [10] Vallet, D.; Cantador, I.; and Jose, J. M. 2010. Exploiting external knowledge to improve video retrieval. In Proceedings of the international conference on Multimedia information retrieval, MIR 10, 101-110. New York, NY, USA: ACM. [11] Haslhofer, B.; Momeni, E.; Gay, M.; and Simon, R. 2010. Augmenting europeana content with linked data resources. In Proceedings of the 6th International Conference on Semantic Systems, I-SEMANTICS 10, 40:1-40:3. New York, NY, USA: ACM. [12] Aastrand, G.; Celebi, R.; and Sauermann, L. 2010. Using linked open data to bootstrap corporate knowledge manage- ment in the organik project. In Proceedings of the 6th Inter- national Conference on Semantic Systems, I-SEMANTICS 10, 18:1-18:8.. New York, NY, USA: ACM. [13] Christian, B., and Christian, B. 2008. Dbpedia mobile: A location-enabled linked data browser. In 1stWorkshop about Linked Data on the Web. [14] de León, A.; Saquicela, V.; Vilches, L. M.; Villazón- Terrazas, B.; Priyatna, F.; and Corcho, O. 2010. Geographical linked data: a spanish use case. In Proceedings of the 6th International Conference on Semantic Systems, I- SEMAN- TICS 10, 36:1-36:3. New York, NY, USA: ACM. [15] Iwazume, M.; Kaneiwa, K.; Zettsu, K.; Nakanishi, T.; Kidawara, Y.; and Kiyoki, Y. 2008. Kc3 browser: semantic mash-up and link-free browsing. In Proceeding of the 17th international conference on World Wide Web, WWW 08, 1209-1210. New York, NY, USA: ACM.