Lorenz Bühmann
PhD Student
Agile Knowledge Engineering and Semantic Web (AKSW)
University of Leipzig

Profiles: LinkedIn, Google Scholar, DBLP

Hainstraße 11, 04109 Leipzig
buehmann@informatik.uni-leipzig.de
Phone: +49-341-9732292

Short CV


Lorenz Bühmann is a PhD student at the University of Leipzig. His research interests are in the area of structured machine learning.

Research Interests


  • Ontology Learning
  • Ontology Debugging
  • Reasoning
  • Question Answering
  • Natural Language Generation
  • Big Structured Machine Learning

Publications


2017

  • J. Lehmann, G. Sejdiu, L. Bühmann, P. Westphal, C. Stadler, I. Ermilov, S. Bin, N. Chakraborty, M. Saleem, A. Ngonga Ngomo, and H. Jabeen, “Distributed Semantic Analytics using the SANSA Stack,” in Proceedings of 16th International Semantic Web Conference – Resources Track (ISWC’2017), 2017.
    [BibTeX] [Abstract] [Download PDF]
    Over the past decade, vast amounts of machine-readable structured information have become available through the automation of research processes as well as the increasing popularity of knowledge graphs and semantic technologies. A major research challenge today is to perform scalable analysis of large-scale knowledge graphs to facilitate applications like link prediction, knowledge base completion and question answering. Most analytics approaches, which scale horizontally (i.e., can be executed in a distributed environment) work on simple feature-vector-based input rather than more expressive knowledge structures. On the other hand, analytics methods which exploit expressive structures usually do not scale well to very large knowledge bases. This software framework paper describes the ongoing project Semantic Analytics Stack (SANSA) which supports expressive and scalable semantic analytics by providing functionality for distributed in-memory computing for RDF data. The library provides APIs for RDF storage, querying using SPARQL and forward chaining inference. It includes several machine learning algorithms for RDF knowledge graphs. The article describes the vision, architecture and use cases of SANSA.

    @InProceedings{lehmann-2017-sansa-iswc,
    Title = {Distributed {S}emantic {A}nalytics using the {SANSA} {S}tack},
    Author = {Lehmann, Jens and Sejdiu, Gezim and B\"uhmann, Lorenz and Westphal, Patrick and Stadler, Claus and Ermilov, Ivan and Bin, Simon and Chakraborty, Nilesh and Saleem, Muhammad and Ngonga, Axel-Cyrille Ngomo and Jabeen, Hajira},
    Booktitle = {Proceedings of 16th International Semantic Web Conference - Resources Track (ISWC'2017)},
    Year = {2017},
    Abstract = {Over the past decade, vast amounts of machine-readable structured information have become available through the automation of research processes as well as the increasing popularity of knowledge graphs and semantic technologies. A major research challenge today is to perform scalable analysis of large-scale knowledge graphs to facilitate applications like link prediction, knowledge base completion and question answering. Most analytics approaches, which scale horizontally (i.e., can be executed in a distributed environment) work on simple feature-vector-based input rather than more expressive knowledge structures. On the other hand, analytics methods which exploit expressive structures usually do not scale well to very large knowledge bases. This software framework paper describes the ongoing project Semantic Analytics Stack (SANSA) which supports expressive and scalable semantic analytics by providing functionality for distributed in-memory computing for RDF data. The library provides APIs for RDF storage, querying using SPARQL and forward chaining inference. It includes several machine learning algorithms for RDF knowledge graphs. The article describes the vision, architecture and use cases of SANSA.},
    Added-at = {2017-07-17T14:46:26.000+0200},
    Biburl = {https://www.bibsonomy.org/bibtex/21ae18ac13750f9cf74227fe0a7c50104/aksw},
    Interhash = {eb99dff0ce6a9cdbce2c4cbea115fbee},
    Intrahash = {1ae18ac13750f9cf74227fe0a7c50104},
    Keywords = {2017 bde buehmann chakraborty group_aksw iermilov lehmann ngonga saleem sbin sejdiu stadler westphal},
    Owner = {iermilov},
    Timestamp = {2017-07-17T14:46:26.000+0200},
    Url = {http://svn.aksw.org/papers/2017/ISWC_SANSA_SoftwareFramework/public.pdf}
    }
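
    The SANSA entry above describes distributed in-memory processing of RDF data on Apache Spark (RDF storage, SPARQL querying, forward chaining inference). As a rough illustration of that idea only, a minimal plain-Spark Scala job (this is not the SANSA API; the input path and the naive N-Triples parsing are assumptions made for the sketch) that counts triples per predicate could look as follows:

    import org.apache.spark.sql.SparkSession

    object PredicateCount {
      def main(args: Array[String]): Unit = {
        // Generic Spark sketch, not the SANSA API: the path and the whitespace-based
        // N-Triples parsing are simplifying assumptions for illustration only.
        val spark = SparkSession.builder()
          .appName("PredicateCount")
          .getOrCreate()

        // Each non-comment line of an N-Triples file holds one triple: <s> <p> <o> .
        val triples = spark.sparkContext
          .textFile("hdfs:///data/knowledge-graph.nt") // hypothetical input location
          .filter(line => line.trim.nonEmpty && !line.startsWith("#"))
          .map(_.split("\\s+", 3)) // crude split into subject, predicate, rest
          .filter(_.length == 3)

        // Count how often each predicate occurs, distributed over the cluster.
        val predicateCounts = triples
          .map(parts => (parts(1), 1L))
          .reduceByKey(_ + _)

        predicateCounts.take(20).foreach { case (p, n) => println(s"$p\t$n") }
        spark.stop()
      }
    }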

  • I. Ermilov, J. Lehmann, G. Sejdiu, L. Bühmann, P. Westphal, C. Stadler, S. Bin, N. Chakraborty, H. Petzka, M. Saleem, A. Ngonga Ngomo, and H. Jabeen, “The Tale of Sansa Spark,” in Proceedings of 16th International Semantic Web Conference, Poster & Demos, 2017.
    [BibTeX] [Download PDF]
    @InProceedings{iermilov-2017-sansa-iswc-demo,
    Title = {The {T}ale of {S}ansa {S}park},
    Author = {Ermilov, Ivan and Lehmann, Jens and Sejdiu, Gezim and B\"uhmann, Lorenz and Westphal, Patrick and Stadler, Claus and Bin, Simon and Chakraborty, Nilesh and Petzka, Henning and Saleem, Muhammad and Ngonga, Axel-Cyrille Ngomo and Jabeen, Hajira},
    Booktitle = {Proceedings of 16th International Semantic Web Conference, Poster \& Demos},
    Year = {2017},
    Added-at = {2017-08-31T16:24:45.000+0200},
    Biburl = {https://www.bibsonomy.org/bibtex/2f9b5a69afa4755944984ae63f59ad146/aksw},
    Interhash = {ebabfe08f697304b399c9b6b89f2829e},
    Intrahash = {f9b5a69afa4755944984ae63f59ad146},
    Keywords = {2017 bde buehmann chakraborty group_aksw iermilov lehmann mole ngonga saleem sbin sejdiu stadler westphal},
    Owner = {iermilov},
    Timestamp = {2017-08-31T16:24:45.000+0200},
    Url = {http://jens-lehmann.org/files/2017/iswc_pd_sansa.pdf}
    }

2016

  • G. Rizzo, N. Fanizzi, J. Lehmann, and L. Bühmann, “Integrating New Refinement Operators in Terminological Decision Trees Learning,” in Knowledge Engineering and Knowledge Management: 20th International Conference, EKAW 2016, Bologna, Italy, November 19-23, 2016, Proceedings, 2016, pp. 511-526.
    [BibTeX] [Download PDF]
    @InProceedings{rizzo2016integrating,
    Title = {Integrating New Refinement Operators in Terminological Decision Trees Learning},
    Author = {Rizzo, Giuseppe and Fanizzi, Nicola and Lehmann, Jens and B{\"u}hmann, Lorenz},
    Booktitle = {Knowledge Engineering and Knowledge Management: 20th International Conference, EKAW 2016, Bologna, Italy, November 19-23, 2016, Proceedings},
    Year = {2016},
    Organization = {Springer},
    Pages = {511--526},
    Keywords = {lehmann MOLE group_aksw sys:relevantFor:infai sys:relevantFor:bis 2016 buehmann},
    Url = {http://jens-lehmann.org/files/2016/ekaw_terminological_decision_trees.pdf}
    }

  • S. Bin, L. Bühmann, J. Lehmann, and A. Ngonga Ngomo, “Towards SPARQL-Based Induction for Large-Scale RDF Data sets,” in ECAI 2016 – Proceedings of the 22nd European Conference on Artificial Intelligence, 2016, pp. 1551-1552. doi:10.3233/978-1-61499-672-9-1551
    [BibTeX] [Download PDF]
    @InProceedings{sparqllearner,
    Title = {Towards {SPARQL}-Based Induction for Large-Scale {RDF} Data sets},
    Author = {Bin, Simon and B{\"u}hmann, Lorenz and Lehmann, Jens and {Ngonga Ngomo}, Axel-Cyrille},
    Booktitle = {ECAI 2016 - Proceedings of the 22nd European Conference on Artificial Intelligence},
    Year = {2016},
    Editor = {Kaminka, Gal A. and Fox, Maria and Bouquet, Paolo and H{\"u}llermeier, Eyke and Dignum, Virginia and Dignum, Frank and van Harmelen, Frank},
    Pages = {1551--1552},
    Publisher = {IOS Press},
    Series = {Frontiers in Artificial Intelligence and Applications},
    Volume = {285},
    Doi = {10.3233/978-1-61499-672-9-1551},
    ISBN = {978-1-61499-672-9},
    Keywords = {2016 sbin buehmann lehmann ngonga sake group_aksw dllearner},
    Language = {English},
    Url = {http://svn.aksw.org/papers/2016/ECAI_SPARQL_Learner/public.pdf}
    }

  • L. Bühmann, J. Lehmann, and P. Westphal, “DL-Learner – A framework for inductive learning on the Semantic Web,” Web Semantics: Science, Services and Agents on the World Wide Web, vol. 39, pp. 15-24, 2016. doi:10.1016/j.websem.2016.06.001
    [BibTeX] [Abstract] [Download PDF]
    In this system paper, we describe the DL-Learner framework, which supports supervised machine learning using OWL and RDF for background knowledge representation. It can be beneficial in various data and schema analysis tasks with applications in different standard machine learning scenarios, e.g. in the life sciences, as well as Semantic Web specific applications such as ontology learning and enrichment. Since its creation in 2007, it has become the main OWL and RDF-based software framework for supervised structured machine learning and includes several algorithm implementations, usage examples and has applications building on top of the framework. The article gives an overview of the framework with a focus on algorithms and use cases.

    @Article{Buehmann2016,
    Title = {DL-Learner - A framework for inductive learning on the Semantic Web },
    Author = {Lorenz B{\"u}hmann and Jens Lehmann and Patrick Westphal},
    Journal = {Web Semantics: Science, Services and Agents on the World Wide Web },
    Year = {2016},
    Pages = {15 - 24},
    Volume = {39},
    Abstract = {Abstract In this system paper, we describe the DL-Learner framework, which supports supervised machine learning using \{OWL\} and \{RDF\} for background knowledge representation. It can be beneficial in various data and schema analysis tasks with applications in different standard machine learning scenarios, e.g. in the life sciences, as well as Semantic Web specific applications such as ontology learning and enrichment. Since its creation in 2007, it has become the main \{OWL\} and RDF-based software framework for supervised structured machine learning and includes several algorithm implementations, usage examples and has applications building on top of the framework. The article gives an overview of the framework with a focus on algorithms and use cases.},
    Doi = {http://dx.doi.org/10.1016/j.websem.2016.06.001},
    ISSN = {1570-8268},
    Keywords = {dllearner group_aksw group_mole mole buehmann lehmann westphal dllearner sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lmol MOLE},
    Owner = {me},
    Timestamp = {2016.10.13},
    Url = {http://www.sciencedirect.com/science/article/pii/S157082681630018X}
    }
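
    The DL-Learner article above describes supervised learning of OWL class expressions from positive and negative example individuals against RDF/OWL background knowledge. The toy Scala sketch below (not the DL-Learner API; the example individuals, candidate expressions and their instance sets are invented for illustration) shows the kind of accuracy scoring such a learner applies when ranking candidate class expressions:

    object ClassExpressionScoring {

      // A candidate OWL class expression (Manchester syntax) together with the set of
      // individuals it covers; in a real learner this set would come from a reasoner.
      final case class Candidate(expression: String, instances: Set[String])

      // Predictive accuracy: covered positives plus rejected negatives over all examples.
      def accuracy(c: Candidate, pos: Set[String], neg: Set[String]): Double = {
        val truePositives = (c.instances intersect pos).size
        val trueNegatives = (neg diff c.instances).size
        (truePositives + trueNegatives).toDouble / (pos.size + neg.size)
      }

      def main(args: Array[String]): Unit = {
        // Hypothetical example individuals for learning a definition of "father".
        val pos = Set(":markus", ":stefan", ":martin")
        val neg = Set(":heinz", ":anna", ":michelle")

        val candidates = Seq(
          Candidate("Person", pos ++ neg),
          Candidate("Male and (hasChild some Person)", Set(":markus", ":stefan", ":martin"))
        )

        // Rank the candidates by accuracy, best first.
        candidates
          .map(c => (c, accuracy(c, pos, neg)))
          .sortBy { case (_, acc) => -acc }
          .foreach { case (c, acc) => println(f"$acc%.2f  ${c.expression}") }
      }
    }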

2015

  • R. Usbeck, A. Ngonga Ngomo, L. Bühmann, and C. Unger, “HAWK – Hybrid Question Answering over Linked Data,” in 12th Extended Semantic Web Conference, Portorož, Slovenia, 31st May – 4th June 2015, 2015.
    [BibTeX] [Download PDF]
    @InProceedings{HAWK_2015,
    Title = {{HAWK} - Hybrid {Q}uestion {A}nswering over Linked Data},
    Author = {Usbeck, Ricardo and {Ngonga Ngomo}, Axel-Cyrille and B{\"u}hmann, Lorenz and Unger, Christina},
    Booktitle = {12th Extended Semantic Web Conference, Portoro{\v{z}}, Slovenia, 31st May - 4th June 2015},
    Year = {2015},
    Bdsk-url-1 = {http://svn.aksw.org/papers/2015/ESWC_HAWK/public.pdf},
    Keywords = {sys:relevantFor:infai sys:relevantFor:bis ngonga simba buehmann usbeck group_aksw hawk},
    Url = {http://svn.aksw.org/papers/2015/ESWC_HAWK/public.pdf}
    }

  • D. Gerber, D. Esteves, J. Lehmann, L. Bühmann, R. Usbeck, A. Ngonga Ngomo, and R. Speck, “DeFacto – Temporal and Multilingual Deep Fact Validation,” Web Semantics: Science, Services and Agents on the World Wide Web, 2015.
    [BibTeX] [Abstract] [Download PDF]
    One of the main tasks when creating and maintaining knowledge bases is to validate facts and provide sources for them in order to ensure correctness and traceability of the provided knowledge. So far, this task is often addressed by human curators in a three-step process: issuing appropriate keyword queries for the statement to check using standard search engines, retrieving potentially relevant documents and screening those documents for relevant content. The drawbacks of this process are manifold. Most importantly, it is very time-consuming as the experts have to carry out several search processes and must often read several documents. In this article, we present DeFacto (Deep Fact Validation) – an algorithm able to validate facts by finding trustworthy sources for them on the Web. DeFacto aims to provide an effective way of validating facts by supplying the user with relevant excerpts of web pages as well as useful additional information including a score for the confidence DeFacto has in the correctness of the input fact. To achieve this goal, DeFacto collects and combines evidence from web pages written in several languages. In addition, DeFacto provides support for facts with a temporal scope, i.e., it can estimate in which time frame a fact was valid. Given that the automatic evaluation of facts has not been paid much attention to so far, generic benchmarks for evaluating these frameworks were not previously available. We thus also present a generic evaluation framework for fact checking and make it publicly available.

    @Article{gerber2015,
    Title = {De{F}acto - {T}emporal and {M}ultilingual {D}eep {F}act {V}alidation},
    Author = {Daniel Gerber and Diego Esteves and Jens Lehmann and Lorenz B{\"u}hmann and Ricardo Usbeck and Axel-Cyrille {Ngonga Ngomo} and Ren{\'e} Speck},
    Journal = {Web Semantics: Science, Services and Agents on the World Wide Web},
    Year = {2015},
    Abstract = {One of the main tasks when creating and maintaining knowledge bases is to validate facts and provide sources for them in order to ensure correctness and traceability of the provided knowledge. So far, this task is often addressed by human curators in a three-step process: issuing appropriate keyword queries for the statement to check using standard search engines, retrieving potentially relevant documents and screening those documents for relevant content. The drawbacks of this process are manifold. Most importantly, it is very time-consuming as the experts have to carry out several search processes and must often read several documents.In this article, we present DeFacto (Deep Fact Validation) - an algorithm able to validate facts by finding trustworthy sources for them on the Web. DeFacto aims to provide an effective way of validating facts by supplying the user with relevant excerpts of web pages as well as useful additional information including a score for the confidence DeFacto has in the correctness of the input fact. To achieve this goal, DeFacto collects and combines evidence from web pages written in several languages. In addition, DeFacto provides support for facts with a temporal scope, i.e., it can estimate in which time frame a fact was valid. Given that the automatic evaluation of facts has not been paid much attention to so far, generic benchmarks for evaluating these frameworks were not previously available. We thus also present a generic evaluation framework for fact checking and make it publicly available.},
    Bdsk-url-1 = {http://svn.aksw.org/papers/2015/JWS_DeFacto/public.pdf},
    Keywords = {2015 group_aksw simba diesel defacto lehmann esteves gerber usbeck speck ngonga geoknow buehmann},
    Url = {http://svn.aksw.org/papers/2015/JWS_DeFacto/public.pdf}
    }

  • L. Bühmann, R. Usbeck, and A. Ngonga Ngomo, “ASSESS — Automatic Self-Assessment Using Linked Data,” in International Semantic Web Conference (ISWC), 2015.
    [BibTeX] [Download PDF]
    @InProceedings{ASSESS_iswc_2015,
    Title = {{ASSESS --- Automatic Self-Assessment Using Linked Data}},
    Author = {B{\"u}hmann, Lorenz and Usbeck, Ricardo and {Ngonga Ngomo}, Axel-Cyrille},
    Booktitle = {International Semantic Web Conference (ISWC)},
    Year = {2015},
    Keywords = {sys:relevantFor:infai sys:relevantFor:bis assess ngonga simba buehmann mole usbeck group_aksw},
    Url = {http://svn.aksw.org/papers/2015/ISWC_ASSESS/public.pdf}
    }

2014

  • J. Lehmann, N. Fanizzi, L. Bühmann, and C. d’Amato, “Concept Learning,” in Perspectives on Ontology Learning, J. Lehmann and J. Voelker, Eds., AKA / IOS Press, 2014, pp. 71-91.
    [BibTeX] [Download PDF]
    @InCollection{pol_concept_learning,
    Title = {Concept Learning},
    Author = {Jens Lehmann and Nicola Fanizzi and Lorenz B{\"u}hmann and Claudia d'Amato},
    Booktitle = {Perspectives on Ontology Learning},
    Publisher = {AKA / IOS Press},
    Year = {2014},
    Editor = {Jens Lehmann and Johanna Voelker},
    Pages = {71-91},
    Bdsk-url-1 = {http://jens-lehmann.org/files/2014/pol_concept_learning.pdf},
    Keywords = {2014 group_aksw dllearner MOLE sys:relevantFor:infai sys:relevantFor:bis ys:relevantFor:gold gold lehmann buehmann},
    Owner = {jl},
    Timestamp = {2014.04.12},
    Url = {http://jens-lehmann.org/files/2014/pol_concept_learning.pdf}
    }

  • A. Rula, M. Palmonari, A. Ngonga Ngomo, D. Gerber, J. Lehmann, and L. Bühmann, “Hybrid Acquisition of Temporal Scopes for RDF Data,” in Proc. of the Extended Semantic Web Conference 2014, 2014.
    [BibTeX] [Download PDF]
    @InProceedings{eswc_temporal_scopes,
    Title = {Hybrid Acquisition of Temporal Scopes for {RDF} Data},
    Author = {Anisa Rula and Matteo Palmonari and Axel-Cyrille Ngonga Ngomo and Daniel Gerber and Jens Lehmann and Lorenz B{\"u}hmann},
    Booktitle = {Proc. of the Extended Semantic Web Conference 2014},
    Year = {2014},
    Bdsk-url-1 = {http://jens-lehmann.org/files/2014/eswc_temporal_scoping.pdf},
    Keywords = {group_aksw MOLE SIMBA sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lod2 sys:relevantFor:geoknow lod2page 2014 lehmann ngonga gerber buehmann},
    Owner = {jl},
    Timestamp = {2014.04.12},
    Url = {http://jens-lehmann.org/files/2014/eswc_temporal_scoping.pdf}
    }

  • J. Lehmann and L. Bühmann, “Linked Data Reasoning,” in Linked Enterprise Data, X.media press, 2014.
    [BibTeX]
    @InCollection{led_reasoning,
    Title = {Linked Data Reasoning},
    Author = {Jens Lehmann and Lorenz B{\"u}hmann},
    Booktitle = {Linked Enterprise Data},
    Publisher = {X.media press},
    Year = {2014},
    Keywords = {2014 lehmann group_aksw group_mole sys:relevantFor:imole MOLE buehmann}
    }

  • S. Hellmann, V. Bryl, L. Bühmann, M. Dojchinovski, D. Kontokostas, J. Lehmann, U. Milošević, P. Petrovski, V. Svátek, M. Stanojević, and O. Zamazal, “Knowledge Base Creation, Enrichment and Repair,” in Linked Open Data–Creating Knowledge Out of Interlinked Data, Springer, 2014, pp. 45-69.
    [BibTeX] [Download PDF]
    @InCollection{lod2_wp3,
    Title = {Knowledge Base Creation, Enrichment and Repair},
    Author = {Hellmann, Sebastian and Bryl, Volha and B{\"u}hmann, Lorenz and Dojchinovski, Milan and Kontokostas, Dimitris and Lehmann, Jens and Milo{\v{s}}evi{\'c}, Uro{\v{s}} and Petrovski, Petar and Sv{\'a}tek, Vojt{\v{e}}ch and Stanojevi{\'c}, Mladen and Zamazal, Ondrej},
    Booktitle = {Linked Open Data--Creating Knowledge Out of Interlinked Data},
    Publisher = {Springer},
    Year = {2014},
    Pages = {45--69},
    Keywords = {2014 group_aksw group_mole mole hellmann kilt sys:relevantFor:infai sys:relevantFor:bis peer-reviewed lehmann MOLE dojchinovski kontokostas buehmann},
    Url = {http://link.springer.com/chapter/10.1007%2F978-3-319-09846-3_3}
    }

  • L. Bühmann, R. Usbeck, A. Ngonga Ngomo, M. Saleem, A. Both, V. Crescenzi, P. Merialdo, and D. Qiu, “Web-Scale Extension of RDF Knowledge Bases from Templated Websites,” in International Semantic Web Conference (ISWC), 2014.
    [BibTeX] [Download PDF]
    @InProceedings{rex2014,
    Title = {Web-Scale Extension of {RDF} Knowledge Bases from Templated Websites},
    Author = {Lorenz B{\"u}hmann and Ricardo Usbeck and Axel-Cyrille {Ngonga Ngomo} and Muhammad Saleem and Andreas Both and Valter Crescenzi and Paolo Merialdo and Disheng Qiu},
    Booktitle = {International Semantic Web Conference (ISWC)},
    Year = {2014},
    Bdsk-url-1 = {http://svn.aksw.org/papers/2014/ISWC_REX/public.pdf},
    Keywords = {group_aksw simba sys:relevantFor:infai sys:relevantFor:bis rex ngonga saleem buehmann usbeck extraction rdf},
    Owner = {ngonga},
    Timestamp = {2014.07.07},
    Url = {http://svn.aksw.org/papers/2014/ISWC_REX/public.pdf}
    }

  • L. Bühmann, D. Fleischhacker, J. Lehmann, A. Melo, and J. Völker, “Inductive Lexical Learning of Class Expressions,” in Knowledge Engineering and Knowledge Management, 2014, pp. 42-53. doi:10.1007/978-3-319-13704-9_4
    [BibTeX] [Download PDF]
    @InProceedings{Buehmann2014,
    Title = {Inductive Lexical Learning of Class Expressions},
    Author = {B{\"u}hmann, Lorenz and Fleischhacker, Daniel and Lehmann, Jens and Melo, Andre and V{\"o}lker, Johanna},
    Booktitle = {Knowledge Engineering and Knowledge Management},
    Year = {2014},
    Editor = {Janowicz, Krzysztof and Schlobach, Stefan and Lambrix, Patrick and Hyv{\"o}nen, Eero},
    Pages = {42-53},
    Publisher = {Springer International Publishing},
    Series = {Lecture Notes in Computer Science},
    Volume = {8876},
    Bdsk-url-1 = {http://dx.doi.org/10.1007/978-3-319-13704-9_4},
    Doi = {10.1007/978-3-319-13704-9_4},
    ISBN = {978-3-319-13703-2},
    Keywords = {2014 group_aksw event_ekaw group_mole mole buehmann lehmann dllearner ore sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lmol MOLE},
    Language = {English},
    Owner = {lorenz},
    Timestamp = {2014.11.23},
    Url = {http://dx.doi.org/10.1007/978-3-319-13704-9_4}
    }

2013

  • A. Ngonga Ngomo, L. Bühmann, C. Unger, J. Lehmann, and D. Gerber, “Sorry, I don’t speak SPARQL — Translating SPARQL Queries into Natural Language,” in Proceedings of WWW, 2013.
    [BibTeX] [Download PDF]
    @InProceedings{NGO+13a,
    Title = {Sorry, I don't speak SPARQL --- Translating SPARQL Queries into Natural Language},
    Author = {Axel-Cyrille {Ngonga Ngomo} and Lorenz B{\"u}hmann and Christina Unger and Jens Lehmann and Daniel Gerber.},
    Booktitle = {Proceedings of WWW},
    Year = {2013},
    Bdsk-url-1 = {http://jens-lehmann.org/files/2013/www_sparql2nl.pdf},
    Keywords = {2013 group_aksw SIMBA MOLE sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lod2 ngonga lehmann buehmann gerber bioasq},
    Owner = {ngonga},
    Timestamp = {2013.03.09},
    Url = {http://jens-lehmann.org/files/2013/www_sparql2nl.pdf}
    }

  • A. Zaveri, D. Kontokostas, M. A. Sherif, L. Bühmann, M. Morsey, S. Auer, and J. Lehmann, “User-driven Quality Evaluation of DBpedia,” in Proceedings of 9th International Conference on Semantic Systems, I-SEMANTICS ’13, Graz, Austria, September 4-6, 2013, 2013, pp. 97-104.
    [BibTeX] [Abstract] [Download PDF]
    Linked Open Data (LOD) comprises of an unprecedented volume of structured datasets on the Web. However, these datasets are of varying quality ranging from extensively curated datasets to crowdsourced and even extracted data of relatively low quality. We present a methodology for assessing the quality of linked data resources, which comprises of a manual and a semi-automatic process. The first phase includes the detection of common quality problems and their representation in a quality problem taxonomy. In the manual process, the second phase comprises of the evaluation of a large number of individual resources, according to the quality problem taxonomy via crowdsourcing. This process is accompanied by a tool wherein a user assesses an individual resource and evaluates each fact for correctness. The semi-automatic process involves the generation and verification of schema axioms. We report the results obtained by applying this methodology to DBpedia. We identified 17 data quality problem types and 58 users assessed a total of 521 resources. Overall, 11.93% of the evaluated DBpedia triples were identified to have some quality issues. Applying the semi-automatic component yielded a total of 222,982 triples that have a high probability to be incorrect. In particular, we found that problems such as object values being incorrectly extracted, irrelevant extraction of information and broken links were the most recurring quality problems. With this study, we not only aim to assess the quality of this sample of DBpedia resources but also adopt an agile methodology to improve the quality in future versions by regularly providing feedback to the DBpedia maintainers.

    @InProceedings{zaveri2013,
    Title = {User-driven Quality Evaluation of DBpedia},
    Author = {Amrapali Zaveri and Dimitris Kontokostas and Mohamed Ahmed Sherif and Lorenz B\"uhmann and Mohamed Morsey and S\"oren Auer and Jens Lehmann},
    Booktitle = {Proceedings of 9th International Conference on Semantic Systems, I-SEMANTICS '13, Graz, Austria, September 4-6, 2013},
    Year = {2013},
    Pages = {97-104},
    Publisher = {ACM},
    Abstract = {Linked Open Data (LOD) comprises of an unprecedented volume of structured datasets on the Web. However, these datasets are of varying quality ranging from extensively curated datasets to crowdsourced and even extracted data of relatively low quality. We present a methodology for assessing the quality of linked data resources, which comprises of a manual and a semi-automatic process. The first phase includes the detection of common quality problems and their representation in a quality problem taxonomy. In the manual process, the second phase comprises of the evaluation of a large number of individual resources, according to the quality problem taxonomy via crowdsourcing. This process is accompanied by a tool wherein a user assesses an individual resource and evaluates each fact for correctness. The semi-automatic process involves the generation and verification of schema axioms. We report the results obtained by applying this methodology to DBpedia. We identified 17 data quality problem types and 58 users assessed a total of 521 resources. Overall, 11.93\% of the evaluated DBpedia triples were identified to have some quality issues. Applying the semi-automatic component yielded a total of 222,982 triples that have a high probability to be incorrect. In particular, we found that problems such as object values being incorrectly extracted, irrelevant extraction of information and broken links were the most recurring quality problems. With this study, we not only aim to assess the quality of this sample of DBpedia resources but also adopt an agile methodology to improve the quality in future versions by regularly providing feedback to the DBpedia maintainers.},
    Bdsk-url-1 = {http://svn.aksw.org/papers/2013/ISemantics_DBpediaDQ/public.pdf},
    Date-modified = {2015-02-06 06:56:39 +0000},
    Ee = {http://doi.acm.org/10.1145/2506182.2506195},
    Keywords = {zaveri sherif morsey buemann kontokostas auer lehmann group_aksw sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lod2 lod2page 2013 event_I-Semantics dbpediadq sys:relevantFor:geoknow topic_QualityAnalysis dataquality MOLE buehmann},
    Owner = {soeren},
    Timestamp = {2013.06.01},
    Url = {http://svn.aksw.org/papers/2013/ISemantics_DBpediaDQ/public.pdf}
    }

  • A. Ngonga Ngomo, L. Bühmann, C. Unger, J. Lehmann, and D. Gerber, “SPARQL2NL – Verbalizing SPARQL queries,” in Proc. of WWW 2013 Demos, 2013, pp. 329-332.
    [BibTeX] [Download PDF]
    @InProceedings{sparql2nl-demo,
    Title = {SPARQL2NL - Verbalizing SPARQL queries},
    Author = {Axel-Cyrille {Ngonga Ngomo} and Lorenz B{\"u}hmann and Christina Unger and Jens Lehmann and Daniel Gerber},
    Booktitle = {Proc. of WWW 2013 Demos},
    Year = {2013},
    Pages = {329-332},
    Bdsk-url-1 = {http://jens-lehmann.org/files/2013/www_demo_sparql2nl.pdf},
    Keywords = {2013 MOLE group_aksw lehmann ngonga buehmann sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lod2 sys:relevantFor:geoknow topic_Exploration lod2page geoknow peer-reviewed bioasq},
    Owner = {jl},
    Timestamp = {2013.04.27},
    Url = {http://jens-lehmann.org/files/2013/www_demo_sparql2nl.pdf}
    }

  • K. Höffner, C. Unger, L. Bühmann, J. Lehmann, A. Ngonga Ngomo, D. Gerber, and P. Cimiano, “User Interface for a Template Based Question Answering System,” in Proceedings of the 4th Conference on Knowledge Engineering and Semantic Web, 2013, pp. 258-264.
    [BibTeX] [Download PDF]
    @InProceedings{hoeffner-2013-kesw,
    Title = {User Interface for a Template Based {Q}uestion {A}nswering System},
    Author = {Konrad H{\"o}ffner and Christina Unger and Lorenz B{\"u}hmann and Jens Lehmann and Axel-Cyrille Ngonga Ngomo and Daniel Gerber and Phillip Cimiano},
    Booktitle = {Proceedings of the 4th Conference on Knowledge Engineering and Semantic Web},
    Year = {2013},
    Pages = {258-264},
    Bdsk-url-1 = {http://svn.aksw.org/papers/2013/KESW_AutoSparqlTbsl_Demo/public.pdf},
    Ee = {http://dx.doi.org/10.1007/978-3-642-41360-5_21},
    Keywords = {group_aksw SIMBA MOLE sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:geoknow ngonga topic_Search topic_Querying lehmann hoeffner 2013 autosparql tbsl buehmann},
    Url = {http://svn.aksw.org/papers/2013/KESW_AutoSparqlTbsl_Demo/public.pdf}
    }

  • D. Gerber, A. Ngonga Ngomo, S. Hellmann, T. Soru, L. Bühmann, and R. Usbeck, “Real-time RDF extraction from unstructured data streams,” in Proceedings of ISWC, 2013.
    [BibTeX] [Download PDF]
    @InProceedings{GER+13,
    Title = {Real-time {RDF} extraction from unstructured data streams},
    Author = {Daniel Gerber and Axel-Cyrille {Ngonga Ngomo} and Sebastian Hellmann and Tommaso Soru and Lorenz B{\"u}hmann and Ricardo Usbeck},
    Booktitle = {Proceedings of ISWC},
    Year = {2013},
    Bdsk-url-1 = {https://www.researchgate.net/publication/256977222_Real-time_RDF_extraction_from_unstructured_data_streams},
    Keywords = {sys:relevantFor:infai sys:relevantFor:bis ngonga gerber simba hellmann kilt soru buehmann usbeck group_aksw kilt},
    Owner = {ngonga},
    Timestamp = {2013.07.05},
    Url = {https://www.researchgate.net/publication/256977222_Real-time_RDF_extraction_from_unstructured_data_streams}
    }

  • L. Bühmann and J. Lehmann, “Pattern Based Knowledge Base Enrichment,” in The Semantic Web — ISWC 2013, H. Alani, L. Kagal, A. Fokoue, P. Groth, C. Biemann, J. Parreira, L. Aroyo, N. Noy, C. Welty, and K. Janowicz, Eds., Springer Berlin Heidelberg, 2013, vol. 8218, pp. 33-48. doi:10.1007/978-3-642-41335-3_3
    [BibTeX] [Download PDF]
    @InCollection{pattern_enrichment,
    Title = {Pattern Based Knowledge Base Enrichment},
    Author = {B{\"u}hmann, Lorenz and Lehmann, Jens},
    Booktitle = {The Semantic Web -- ISWC 2013},
    Publisher = {Springer Berlin Heidelberg},
    Year = {2013},
    Editor = {Alani, Harith and Kagal, Lalana and Fokoue, Achille and Groth, Paul and Biemann, Chris and Parreira, JosianeXavier and Aroyo, Lora and Noy, Natasha and Welty, Chris and Janowicz, Krzysztof},
    Pages = {33-48},
    Series = {Lecture Notes in Computer Science},
    Volume = {8218},
    Bdsk-url-1 = {http://svn.aksw.org/papers/2013/ISWC_Pattern_Enrichment/public.pdf},
    Bdsk-url-2 = {http://dx.doi.org/10.1007/978-3-642-41335-3_3},
    Doi = {10.1007/978-3-642-41335-3_3},
    ISBN = {978-3-642-41334-6},
    Keywords = {buehmann lehmann group_aksw group_mole MOLE sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lod2 lod2page 2013 event_ISWC dllearner ore sys:relevantFor:geoknow topic_Enrichment},
    Language = {English},
    Owner = {lorenz},
    Timestamp = {2014.11.24},
    Url = {http://svn.aksw.org/papers/2013/ISWC_Pattern_Enrichment/public.pdf}
    }

2012

  • C. Unger, L. Bühmann, J. Lehmann, A. Ngonga Ngomo, D. Gerber, and P. Cimiano, “Template-based Question Answering over RDF data,” in Proceedings of the 21st international conference on World Wide Web, 2012, pp. 639-648.
    [BibTeX] [Download PDF]
    @InProceedings{unger2012template,
    Title = {Template-based {Q}uestion {A}nswering over {RDF} data},
    Author = {Unger, Christina and B{\"u}hmann, Lorenz and Lehmann, Jens and Ngonga Ngomo, Axel-Cyrille and Gerber, Daniel and Cimiano, Philipp},
    Booktitle = {Proceedings of the 21st international conference on World Wide Web},
    Year = {2012},
    Pages = {639--648},
    Bdsk-url-1 = {http://jens-lehmann.org/files/2012/tbsl_www.pdf},
    Keywords = {2012 group_aksw SIMBA sys:relevantFor:infai boa sys:relevantFor:bis ngonga lehmann geber MOLE buehmann autosparql},
    Owner = {ngonga},
    Url = {http://jens-lehmann.org/files/2012/tbsl_www.pdf}
    }

  • J. Lehmann, T. Furche, G. Grasso, A. Ngonga Ngomo, C. Schallhart, A. Sellers, C. Unger, L. Bühmann, D. Gerber, K. Höffner, D. Liu, and S. Auer, “DEQA: Deep Web Extraction for Question Answering,” in Proceedings of ISWC, 2012.
    [BibTeX] [Download PDF]
    @InProceedings{Lehmann2012,
    Title = {DEQA: Deep Web Extraction for Question Answering},
    Author = {Jens Lehmann and Tim Furche and Giovanni Grasso and Axel-Cyrille {Ngonga Ngomo} and Christian Schallhart and Andrew Sellers and Christina Unger and Lorenz B{\"u}hmann and Daniel Gerber and Konrad H{\"o}ffner and David Liu and S{\"o}ren Auer},
    Booktitle = {Proceedings of ISWC},
    Year = {2012},
    Bdsk-url-1 = {http://jens-lehmann.org/files/2012/iswc_deqa.pdf},
    Date-modified = {2012-12-02 12:51:46 +0000},
    Keywords = {2012 group_aksw SIMBA MOLE sys:relevantFor:infai sys:relevantFor:bis ngonga lehmann gerber buehmann boa auer hoeffner limes},
    Owner = {ngonga},
    Timestamp = {2012.09.18},
    Url = {http://jens-lehmann.org/files/2012/iswc_deqa.pdf}
    }

  • L. Bühmann and J. Lehmann, “Universal OWL Axiom Enrichment for Large Knowledge Bases,” in Proceedings of EKAW 2012, 2012, pp. 57-71.
    [BibTeX] [Download PDF]
    @InProceedings{Buhmann2012,
    Title = {Universal {OWL} Axiom Enrichment for Large Knowledge Bases},
    Author = {Lorenz B{\"u}hmann and Jens Lehmann},
    Booktitle = {Proceedings of EKAW 2012},
    Year = {2012},
    Pages = {57--71},
    Publisher = {Springer},
    Bdsk-url-1 = {http://jens-lehmann.org/files/2012/ekaw_enrichment.pdf},
    Date-modified = {2012-12-02 13:07:03 +0000},
    Keywords = {2012 group_aksw event_ekaw group_mole mole buehmann lehmann MOLE dllearner ore sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lod2 lod2page peer-reviewed},
    Owner = {jl},
    Timestamp = {2012.07.18},
    Url = {http://jens-lehmann.org/files/2012/ekaw_enrichment.pdf}
    }

  • S. Auer, L. Bühmann, C. Dirschl, O. Erling, M. Hausenblas, R. Isele, J. Lehmann, M. Martin, P. N. Mendes, B. van Nuffelen, C. Stadler, S. Tramp, and H. Williams, “Managing the life-cycle of Linked Data with the LOD2 Stack,” in Proceedings of International Semantic Web Conference (ISWC 2012), 2012.
    [BibTeX] [Download PDF]
    @InProceedings{Auer+ISWC-2012,
    Title = {Managing the life-cycle of Linked Data with the {LOD2} Stack},
    Author = {S\"{o}ren Auer and Lorenz B{\"u}hmann and Christian Dirschl and Orri Erling and Michael Hausenblas and Robert Isele and Jens Lehmann and Michael Martin and Pablo N. Mendes and Bert van Nuffelen and Claus Stadler and Sebastian Tramp and Hugh Williams},
    Booktitle = {Proceedings of International Semantic Web Conference (ISWC 2012)},
    Year = {2012},
    Note = {22\% acceptance rate},
    Bdsk-url-1 = {http://svn.aksw.org/lod2/Paper/ISWC2012-InUse_LOD2-Stack/public.pdf},
    Date-modified = {2012-12-02 12:25:29 +0000},
    Keywords = {auer buehmann lehmann tramp martin stadler dllearner group_aksw sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lod2 lod2page 2012 event_ISWC MOLE ES interlinking quality analysis search exploration browsing extraction storage querying manual revision authoring fusing},
    Owner = {soeren},
    Timestamp = {2012.08.14},
    Url = {http://iswc2012.semanticweb.org/sites/default/files/76500001.pdf}
    }

2011

  • J. Lehmann and L. Bühmann, “AutoSPARQL: Let Users Query Your Knowledge Base,” in Proceedings of ESWC 2011, 2011.
    [BibTeX] [Download PDF]
    @InProceedings{lehmann2011,
    Title = {{AutoSPARQL}: Let Users Query Your Knowledge Base},
    Author = {Jens Lehmann and Lorenz B{\"u}hmann},
    Booktitle = {Proceedings of ESWC 2011},
    Year = {2011},
    Bdsk-url-1 = {http://jens-lehmann.org/files/2011/autosparql_eswc.pdf},
    Date-modified = {2012-12-02 12:24:52 +0000},
    Keywords = {2011 group_aksw MOLE event_eswc lehmann buehmann sys:relevantFor:infai dllearner sys:relevantFor:bis sys:relevantFor:lod2 lod2page autosparql peerdllearner dllearner},
    Owner = {jl},
    Timestamp = {2011.03.22},
    Url = {http://jens-lehmann.org/files/2011/autosparql_eswc.pdf}
    }

  • J. Lehmann, S. Auer, L. Bühmann, and S. Tramp, “Class expression learning for ontology engineering,” Journal of Web Semantics, vol. 9, pp. 71-81, 2011.
    [BibTeX] [Download PDF]
    @Article{celoe,
    Title = {Class expression learning for ontology engineering},
    Author = {Jens Lehmann and S{\"o}ren Auer and Lorenz B{\"u}hmann and Sebastian Tramp},
    Journal = {Journal of Web Semantics},
    Year = {2011},
    Pages = {71 - 81},
    Volume = {9},
    Address = {Amsterdam, The Netherlands, The Netherlands},
    Bdsk-url-1 = {http://jens-lehmann.org/files/2011/celoe.pdf},
    Date-modified = {2012-12-02 12:25:06 +0000},
    Keywords = {2011 group_aksw tramp lehmann buehmann auer seebiproject_OntoWiki dllearner sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:lod2 lod2page peer-reviewed MOLE},
    Owner = {jl},
    Publisher = {Elsevier Science Publishers B. V.},
    Timestamp = {2011.02.18},
    Url = {http://jens-lehmann.org/files/2011/celoe.pdf}
    }

2010

  • J. Lehmann and L. Bühmann, “ORE – A Tool for Repairing and Enriching Knowledge Bases,” in Proceedings of the 9th International Semantic Web Conference (ISWC2010), 2010, pp. 177-193. doi:10.1007/978-3-642-17749-1_12
    [BibTeX] [Download PDF]
    @InProceedings{lehmann-2010-iswc,
    Title = {{ORE} - A Tool for Repairing and Enriching Knowledge Bases},
    Author = {Jens Lehmann and Lorenz B{\"u}hmann},
    Booktitle = {Proceedings of the 9th International Semantic Web Conference (ISWC2010)},
    Year = {2010},
    Pages = {177--193},
    Publisher = {Springer},
    Series = {Lecture Notes in Computer Science},
    Bdsk-url-1 = {http://svn.aksw.org/papers/2010/ORE/public.pdf},
    Bdsk-url-2 = {http://dx.doi.org/10.1007/978-3-642-17749-1_12},
    Date-modified = {2012-12-02 13:02:02 +0000},
    Doi = {doi:10.1007/978-3-642-17749-1_12},
    Keywords = {2010 event_iswc group_aksw MOLE lehmann buehmann sys:relevantFor:infai dllearner ore sys:relevantFor:bis sys:relevantFor:lod2 lod2page peer-reviewed ontowiki_eu},
    Owner = {seebi},
    Timestamp = {2010.08.30},
    Url = {http://svn.aksw.org/papers/2010/ORE/public.pdf}
    }
