2019 talk:Technology outreach & innovation/GlobalFactSync

From Wikimania
partial notes on the presentation
  • Key goals: connect infoboxes across Wikipedia language editions to Wikidata, so that each benefits from the multiple sources
  • from the Knowledge Integration and Linked Data Technologies (AKSW/KILT) Center at the Institute for Applied Informatics (InfAI) in Leipzig; a DBpedia project
  • presenter was Johannes Frey
  • aid synchronization of Infobox facts across Wikipedias
  • enable import/upstream connections of facts from Wikipedia to Wikidata with references
  • detect contradictions across these sources and fill in missing information
  • GFS = GlobalFactSync here
  • example: Wikidata was not updated with the release date or publication date of "Boys Don't Cry", a Moulin Rouge song, Q030something
  • there is a GFS Data Browser, at https://global.dbpedia.org which shows an aggregated view of values and their sources for an attribute
  • it can give a template for an infobox based on what it sees across these systems about the topic/attributes
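
The aggregated view described above (values for an attribute, grouped with the sources that state them) could be sketched roughly as follows. This is only an illustration of the idea, not GFS code: the source names, attribute names, and values are invented placeholders, and `aggregate` and `classify` are hypothetical helpers.

```python
from collections import defaultdict

# Hypothetical records: (source, attribute, value). None means the source
# has no value for that attribute. All values are placeholders.
facts = [
    ("enwiki", "releaseDate", "2000-01-01"),
    ("dewiki", "releaseDate", "2000-01-01"),
    ("wikidata", "releaseDate", None),
    ("enwiki", "genre", "Pop"),
    ("dewiki", "genre", "Disco"),
]


def aggregate(records):
    """Group sources by attribute and then by the value they state."""
    view = defaultdict(lambda: defaultdict(list))
    for source, attribute, value in records:
        view[attribute][value].append(source)
    return view


def classify(values_to_sources):
    """Label one attribute as consistent, contradictory, or incomplete."""
    present = [v for v in values_to_sources if v is not None]
    if not present:
        return "no value in any source"
    if len(present) > 1:
        return "contradiction"
    if None in values_to_sources:
        return "missing in " + ", ".join(values_to_sources[None])
    return "consistent"


view = aggregate(facts)
report = {attr: classify(vals) for attr, vals in view.items()}
for attr, status in report.items():
    print(attr, "->", status)
```

A view like this covers both goals from the notes: the "missing in ..." cases are candidates for enrichment, and the "contradiction" cases need review.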
data flow for GFS
  • DBpedia and reference extraction from Wikipedias
  • clean and merge the data, then fuse it with Wikidata
  • import into MongoDB and GFS Data Browser
  • DBpedia started in 2007 as a crowd-sourced effort to semi-automatically extract structured RDF information that could be queried on the Web with SPARQL
  • a new mission since 2018: Databus Platform to integrate with other data
  • DBpedia is a large-scale multilingual knowledge base
  • for more see https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE
  • RDF has subject, predicate, and object
  • see mappings.dbpedia.org and http://dbpedia.org/ontology/* properties
  • DBpedia has 8 ontologies, including DBO and ...
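
To make the subject/predicate/object note concrete, here is a minimal sketch of one fact as an RDF triple, written out as an N-Triples line. The subject IRI and the date are placeholders invented for illustration. `releaseDate` is used as an example property under the http://dbpedia.org/ontology/ namespace mentioned above, and `Triple` and `to_ntriples` are hypothetical helpers, not part of any DBpedia tool.

```python
from typing import NamedTuple

DBO = "http://dbpedia.org/ontology/"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


class Triple(NamedTuple):
    subject: str    # IRI of the thing being described
    predicate: str  # IRI of the property
    obj: str        # literal value (lexical form)
    datatype: str   # IRI of the literal's datatype


def to_ntriples(t: Triple) -> str:
    """Serialize a triple with a typed literal as one N-Triples line.

    Assumes the literal needs no escaping (no quotes, backslashes, or
    newlines), which holds for simple values such as dates.
    """
    return f'<{t.subject}> <{t.predicate}> "{t.obj}"^^<{t.datatype}> .'


# Placeholder subject and value, for illustration only.
fact = Triple(
    "http://example.org/resource/Some_Song",
    DBO + "releaseDate",
    "2000-01-01",
    XSD_DATE,
)
line = to_ntriples(fact)
print(line)
```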

Their reference extraction software can get citations from infoboxes. It can parse de, en, es, fr, it, nl, pl, pt, ru, and sv.

  • they offer or help offer DBpedia services: fact extractions, reference extractions in JSON or TSV, and a MongoDB query endpoint with the entire GFS data for lookups and analytical queries
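
As a rough idea of how a TSV reference extraction might be consumed, the sketch below reads tab-separated rows and looks up the references attached to one property. The notes do not record the actual file layout, so the column names (`subject`, `predicate`, `value`, `reference`) and the sample rows are assumptions made up for this example.

```python
import csv
import io

# Hypothetical TSV content; the real column layout is not given in the notes.
SAMPLE_TSV = (
    "subject\tpredicate\tvalue\treference\n"
    "http://example.org/resource/Some_Song\t"
    "http://dbpedia.org/ontology/releaseDate\t2000-01-01\t"
    "http://example.org/cite/1\n"
    "http://example.org/resource/Some_Song\t"
    "http://dbpedia.org/ontology/genre\tPop\t"
    "http://example.org/cite/2\n"
)


def load_rows(text):
    """Parse tab-separated text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def references_for(rows, predicate):
    """Return the reference URLs recorded for one predicate."""
    return [row["reference"] for row in rows if row["predicate"] == predicate]


rows = load_rows(SAMPLE_TSV)
refs = references_for(rows, "http://dbpedia.org/ontology/releaseDate")
print(refs)
```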