Poster and Workshop papers accepted at ESWC 2019

We are very pleased to announce that our group got 2 poster papers accepted for presentation at ESWC 2019: the 16th edition of the Extended Semantic Web Conference, which will be held on June 2-6, 2019 in Portorož, Slovenia.

The ESWC is a major venue for discussing the latest scientific results and technology innovations around semantic technologies. Building on its past success, ESWC is seeking to broaden its focus to span other relevant research areas in which Web semantics plays an important role. ESWC 2019 will present the latest results in research, technologies and applications in its field. Besides the technical program organized over twelve tracks, the conference will feature a workshop and tutorial program, a dedicated track on Semantic Web challenges, system descriptions and demos, a posters exhibition and a doctoral symposium.

Here is the pre-print of the accepted papers with their abstract:

Abstract: Among the various domains involved in large RDF graphs, applications may rely on geographical information, which is often carried and presented via Points of Interest (POIs). In particular, one challenge is to extract patterns from POI sets to discover Areas of Interest (AOIs). To tackle it, a usual method is to aggregate various points according to specific distances (e.g. geographical) via clustering algorithms. In this study, we present a flexible architecture to design pipelines able to aggregate POIs from contextual to geographical dimensions in a single run. This solution allows any combination of state-of-the-art clustering algorithms to compute AOIs and is built on top of a Semantic Web stack which allows multiple-source querying and filtering through SPARQL.
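The two-stage aggregation the abstract describes can be illustrated with a minimal, self-contained sketch. All POIs, categories, and the distance threshold below are invented for illustration, and the actual system operates on RDF data via SPARQL; here, POIs are first grouped along a contextual dimension (category) and each group is then clustered geographically with a simple single-linkage threshold:

```python
from collections import defaultdict
import math

def geo_dist(a, b):
    # Euclidean distance on (lat, lon) pairs -- fine for a toy example;
    # a real pipeline would use a proper geodesic distance.
    return math.hypot(a[0] - b[0], a[1] - b[1])

def cluster(points, threshold):
    """Single-linkage clustering: points closer than `threshold` share a cluster."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if geo_dist(points[i][1], points[j][1]) < threshold:
                parent[find(i)] = find(j)
    clusters = defaultdict(list)
    for i, (name, _) in enumerate(points):
        clusters[find(i)].append(name)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical POIs: (name, (lat, lon), category)
pois = [
    ("cafe_a", (0.0, 0.0), "food"),
    ("cafe_b", (0.1, 0.1), "food"),
    ("museum", (5.0, 5.0), "culture"),
    ("cafe_c", (5.0, 5.1), "food"),
]

# Stage 1 (contextual): group by category; Stage 2 (geographical): cluster each group.
aois = {}
for cat in {c for _, _, c in pois}:
    group = [(n, p) for n, p, c in pois if c == cat]
    aois[cat] = cluster(group, threshold=1.0)

print(aois)
```

Running the sketch yields one AOI per geographically dense group within each category, e.g. the two nearby cafés form one "food" AOI while the distant café stands alone.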

Abstract: Due to the rapid expansion of multilingual data on the web, developing ontology enrichment approaches has become an interesting and active subject of research. In this paper, we propose a cross-lingual matching approach for ontology enrichment (OECM) in order to enrich an ontology using another one in a different natural language. A prototype for the proposed approach has been implemented and evaluated using the MultiFarm benchmark. Evaluation results are promising and show high precision and recall compared to state-of-the-art approaches.

Furthermore, we got a paper accepted at the LASCAR Workshop: the 1st Workshop on Large Scale RDF Analytics, co-located with ESWC 2019.

The LASCAR Workshop seeks original articles and posters describing theoretical and practical methods as well as techniques for performing scalable analytics on knowledge graphs.

Here is the pre-print of the accepted paper with its abstract:

Abstract: Controlling the usage of business-critical data is essential for every company. While the upcoming age of Industry 4.0 propagates a seamless data exchange between all participating devices, facilities and companies along the production chain, the required data control mechanisms are lagging behind. We claim that for effective protection, the enforcement of both access and usage control is a must-have for organizing Industry 4.0 collaboration networks. Formalized and machine-readable policies are one fundamental building block to achieve the needed trust level for real data-driven collaborations. We explain the current challenges of specifying access and usage control policies and outline respective approaches relying on Semantic Web of Things practices. We analyze the requirements and implications of existing technologies and discuss their shortcomings. Based on our experiences from the specification of the International Data Spaces Usage Control Language, the necessary next steps towards automatically monitored and enforced policies are outlined and research needs are formulated.

 

Acknowledgment

This work was partially funded by the European H2020 SLIPO project (GA. 731581).

Looking forward to seeing you at ESWC 2019.

Workshop proposal accepted at ECML 2019

We are pleased to announce a new workshop on “New Trends in Representation Learning with Knowledge Graphs”, which will be included in the program of ECML 2019: the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML 2019 will take place in Würzburg, Germany, from the 16th to the 20th of September 2019.

This conference is the premier European machine learning and data mining conference and builds upon over 17 years of successful events and conferences held across Europe.

Title: New Trends in Representation Learning with Knowledge Graphs

Organizers:

  • Volker Tresp (Ludwig-Maximilians University and Siemens, Germany)
  • Jens Lehmann (Bonn University and Fraunhofer IAIS, Germany)
  • Aditya Mogadala (Saarland University, Germany)
  • Achim Rettinger (Trier University, Germany)
  • Afshin Sadeghi (Fraunhofer IAIS and University of Bonn, Germany)
  • Mehdi Ali (University of Bonn and Fraunhofer IAIS, Germany)

Website: https://sites.google.com/view/kgrlfr-workshop/home

Abstract: Knowledge Graphs are becoming the standard for storing, retrieving and querying structured data. In academia and in industry, they are increasingly used to provide background knowledge. Over the last years, several research contributions have shown that machine learning, especially representation learning, can be successfully applied to knowledge graphs, enabling inductive inference about facts with unknown truth values. In this workshop we intend to focus on the most exciting new developments in Knowledge Graph learning, bridging the gap to recent developments in different fields. We also want to bring together researchers from different disciplines who are united by their adoption of the aforementioned machine learning techniques.

Invited speakers:

  • Maximilian Nickel (Facebook AI Research, USA)
  • Mathias Niepert (NEC Labs Europe, Germany)

 

Looking forward to seeing you at ECML/PKDD 2019.

Paper accepted at ICWE 2019

We are very pleased to announce that our group got a paper accepted for presentation at ICWE 2019: the 19th International Conference on Web Engineering, which will be held on June 11-14, 2019 at the Daejeon Convention Center (DCC) in Daejeon, Korea. ICWE is the prime yearly international conference on the different aspects of designing, building, maintaining and using Web applications.

The conference will cover the different aspects of Web Engineering, including the design, creation, maintenance and usage of Web applications. ICWE aims to bring together researchers and practitioners from various disciplines in academia and industry to tackle the emerging challenges in the engineering of Web applications and in the problems of its associated technologies, as well as the impact of those technologies on society, media and culture.

Here is the accepted paper with its abstract:

Abstract: Over the last decades the Web has evolved from a human–human communication network to a network of complex human–machine interactions. An increasing amount of data is available as Linked Data which allows machines to “understand” the data, but RDF is not meant to be understood by humans. With Jekyll RDF we present a method to close the gap between structured data and human accessible exploration interfaces by publishing RDF datasets as customizable static HTML sites. It consists of an RDF resource mapping system to serve the resources under their respective IRI, a template mapping based on schema classes, and a markup language to define templates to render customized resource pages. Using the template system, it is possible to create domain specific browsing interfaces for RDF data next to the Linked Data resources. This enables content management and knowledge management systems to serve datasets in a highly customizable, low effort, and scalable way to be consumed by machines as well as humans.
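The class-based template mapping can be illustrated with a small sketch. Jekyll RDF itself builds on Jekyll's Liquid templates; the Python below, with made-up resources and templates, only mimics the core idea of mapping each resource's class to an HTML template and rendering one static page per resource:

```python
from string import Template

# Toy RDF-like data: resource IRI -> {property: value}, including its class.
DATA = {
    "http://example.org/alice": {"type": "Person", "name": "Alice",
                                 "homepage": "http://alice.example"},
    "http://example.org/acme":  {"type": "Organization", "name": "ACME Corp"},
}

# Template mapping based on schema classes, as in the abstract:
# each class is mapped to an HTML template for its resource pages.
TEMPLATES = {
    "Person":       Template('<h1>$name</h1><a href="$homepage">homepage</a>'),
    "Organization": Template('<h1>$name</h1><p>an organization</p>'),
}

def render_site(data):
    """Render one static HTML page per resource, chosen by its class."""
    return {iri: TEMPLATES[props["type"]].substitute(props)
            for iri, props in data.items()}

site = render_site(DATA)
print(site["http://example.org/acme"])  # -> <h1>ACME Corp</h1><p>an organization</p>
```

In Jekyll RDF the equivalent mapping is configured declaratively and the pages are served under the resources' own IRIs; the sketch only shows the class-to-template dispatch.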

 

Looking forward to seeing you at ICWE 2019.

Dr. Philipp Senger and Dr. Daniel Burkow from Bayer Crop Science visit SDA

Dr. Philipp Senger and Dr. Daniel Burkow from Bayer Crop Science visited the SDA group on April 17, 2019.

On Wednesday, SDA had visitors from Bayer: Philipp Senger is the Head of CLS (Computational Life Science) Translational R&D at Bayer Crop Science. His main research interests comprise data science, semantic modeling, knowledge graphs (KGs), natural language processing and machine learning. He and his team apply their research results to develop digital products and services. Philipp earned a Ph.D. in computer science from the Eberhard Karls Universität Tübingen in cooperation with Robert Bosch GmbH, where he investigated data-based methods for the automatic generation of electrical behavior models in the automotive industry.

Daniel Burkow is a Post-Doctoral researcher at CLS Translational R&D at Bayer Crop Science.
His main research interests comprise applied mathematics for life sciences, machine learning, data modeling and knowledge graphs. At Translational R&D at Bayer Crop Science, he works on the development of machine learning applications and data modeling approaches.
Daniel earned his Ph.D. from Arizona State University in the field of Applied Mathematics for Life Sciences in which he examined intramyocellular lipids and the progression of muscular insulin resistance.

Dr. Senger and Dr. Burkow were invited to the bi-weekly “SDA colloquium presentations”, where they presented Bayer Crop Science and their main research topics. The goal of the visit was to exchange experience and ideas on combining machine learning approaches with knowledge graphs. In particular, they explained how field trials are modeled at Bayer Crop Science and presented a recent use case in which knowledge graph embeddings have been applied to gain more insights into their experimental data.

During the meeting, SDA’s core research topics and main research projects were presented, and future collaborations between Bayer Crop Science R&D and SDA were discussed. These include, in particular, the application of reasoning and inference methods to large-scale KGs describing field experiments, the development of KG embedding models that incorporate literals (e.g., numerical values), and the application of question answering.

We are looking forward to future collaborations with Bayer Crop Science.

Paper accepted at NAACL 2019

We are very pleased to announce that our group got a paper accepted for presentation at the 2019 edition of the NAACL conference, which will be held on June 2-7, 2019 in Minneapolis, USA.

NAACL aims to bring together researchers interested in the design and study of natural language processing technology as well as its applications to new problem areas. With this goal in mind, the 2019 edition invites the submission of long and short papers on creative, substantial and unpublished research in all aspects of computational linguistics. It covers a diverse technical program: in addition to traditional research results, papers may present negative findings, survey an area, announce the creation of a new resource, argue a position, report novel linguistic insights derived using existing techniques, and reproduce, or fail to reproduce, previous results.

Here is the pre-print of the accepted paper with its abstract:

Abstract: Short texts challenge NLP tasks such as named entity recognition, disambiguation, linking and relation inference because they do not provide sufficient context or are partially malformed (e.g. w.r.t. capitalization, long-tail entities, implicit relations). In this work, we present the Falcon approach, which effectively maps entities and relations within a short text to their mentions in a background knowledge graph. Falcon overcomes the challenges of short text using a light-weight linguistic approach relying on a background knowledge graph. Falcon performs joint entity and relation linking of a short text by leveraging several fundamental principles of English morphology (e.g. compounding, headword identification) and utilizes an extended knowledge graph created by merging entities and relations from various knowledge sources. It uses the context of entities for finding relations and does not require training data. Our empirical study using several standard benchmarks and datasets shows that Falcon significantly outperforms state-of-the-art entity and relation linking on short text query inventories.
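As a rough illustration of joint entity and relation linking (not Falcon's actual algorithm, which uses English morphology rules and a large merged knowledge graph), the following sketch matches token n-grams of a short text against a tiny hand-made catalog, preferring longer surface forms:

```python
# Toy surface-form catalogs; the identifiers mimic DBpedia-style IRIs
# and are purely illustrative.
ENTITIES = {"barack obama": "dbr:Barack_Obama", "berlin": "dbr:Berlin"}
RELATIONS = {"born in": "dbo:birthPlace", "mayor of": "dbo:mayor"}

def ngrams(tokens, max_n=3):
    """All contiguous token spans up to length max_n, longest first."""
    return [" ".join(tokens[i:i + n])
            for n in range(max_n, 0, -1)
            for i in range(len(tokens) - n + 1)]

def link(text):
    """Greedy joint lookup of entity and relation mentions in a short text."""
    tokens = text.lower().split()
    found_e, found_r, used = [], [], set()
    for span in ngrams(tokens):
        if span in used:
            continue
        if span in ENTITIES:
            found_e.append(ENTITIES[span]); used.add(span)
        elif span in RELATIONS:
            found_r.append(RELATIONS[span]); used.add(span)
    return found_e, found_r

entities, relations = link("Was Barack Obama born in Berlin")
print(entities, relations)
```

Matching longer spans first is what lets "Barack Obama" win over its individual tokens; Falcon's real candidate generation and ranking are considerably more sophisticated.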

 

Acknowledgment

This work was partially funded by Fraunhofer IAIS and the EU H2020 project IASIS.

 

Looking forward to seeing you at the NAACL 2019 conference.

Paper accepted at EvoStar 2019

We are very pleased to announce that our group got a paper accepted for presentation at EvoStar 2019: the leading European event on bio-inspired computation, which will be held on 24-26 April 2019 in Leipzig, Germany.

EvoStar comprises four co-located conferences run each spring at different locations throughout Europe. These events arose out of workshops originally developed by EvoNet, the Network of Excellence in Evolutionary Computing, established by the Information Societies Technology Programme of the European Commission, and they represent a continuity of research collaboration stretching back over 20 years.

Our paper got accepted at EvoMUSART, the 8th International Conference (and 13th European event) on Evolutionary and Biologically Inspired Music, Sound, Art and Design.

The main goal of EvoMUSART 2019 is to bring together researchers who are using Computational Intelligence techniques for artistic tasks, providing the opportunity to promote, present and discuss ongoing work in the area.

Here is the accepted paper with its abstract:

Abstract: Computational Intelligence (CI) has proven its artistry in the creation of music, graphics, and drawings. EvoChef demonstrates the creativity of CI in the artificial evolution of culinary arts. EvoChef takes input from well-rated recipes of different cuisines and evolves new recipes by recombining the instructions, spices, and ingredients. Each recipe is represented as a property graph containing ingredients, their status, spices, and cooking instructions. These recipes are evolved using recombination and mutation operators. Expert opinion (user ratings) has been used as the fitness function for the evolved recipes. It was observed that the overall fitness of the recipes improved with the number of generations, and almost all the resulting recipes were found to be conceptually correct. We also conducted a blind comparison of the original recipes with the EvoChef recipes, and EvoChef was rated to be more innovative. To the best of our knowledge, EvoChef is the first semi-automated, open source, and valid recipe generator that creates easy-to-follow and novel recipes.
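The evolutionary loop can be sketched in a few lines. Everything here is simplified for illustration: recipes are flat ingredient lists rather than the property graphs used in the paper, and the fitness function is a stand-in for the user ratings that EvoChef actually relies on:

```python
import random

random.seed(7)

# Hypothetical parent recipes as ingredient lists.
RECIPES = [
    ["pasta", "tomato", "basil", "garlic"],
    ["rice", "chicken", "curry", "garlic"],
    ["pasta", "cream", "mushroom", "pepper"],
]
PANTRY = ["onion", "chili", "lemon", "parsley"]

def crossover(a, b):
    """One-point recombination of two parent recipes (no duplicates)."""
    cut = random.randrange(1, max(2, min(len(a), len(b))))
    return a[:cut] + [x for x in b[cut:] if x not in a[:cut]]

def mutate(recipe, rate=0.25):
    """Occasionally swap one ingredient for a random pantry item."""
    if random.random() < rate:
        recipe = recipe[:]
        recipe[random.randrange(len(recipe))] = random.choice(PANTRY)
    return recipe

def fitness(recipe):
    # Stand-in for user ratings: reward ingredient variety.
    return len(set(recipe))

def evolve(population, generations=20):
    for _ in range(generations):
        offspring = [mutate(crossover(*random.sample(population, 2)))
                     for _ in range(len(population))]
        population = sorted(population + offspring,
                            key=fitness, reverse=True)[:len(population)]
    return population

best = evolve(RECIPES)[0]
print(best, fitness(best))
```

In the paper the selection pressure comes from human raters rather than an automatic score, which is what makes the setting semi-automated.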

Acknowledgment

This work was partially funded by the EU H2020 project Big Data Ocean (Gr. No 732310).

Looking forward to seeing you at EvoStar 2019.

Papers, workshop and tutorials accepted at ESWC 2019

We are very pleased to announce that our group got 2 papers accepted for presentation at ESWC 2019: the 16th edition of the Extended Semantic Web Conference, which will be held on June 2-6, 2019 in Portorož, Slovenia.


Here are the pre-prints of the accepted papers with their abstract:

Abstract: Attention-based encoder-decoder neural network models have recently shown promising results in goal-oriented dialogue systems. However, these models struggle to reason over and incorporate stateful knowledge while preserving their end-to-end text generation functionality. Since such models can greatly benefit from user intent and knowledge graph integration, in this paper we propose an RNN-based end-to-end encoder-decoder architecture which is trained with joint embeddings of the knowledge graph and the corpus as input. The model provides an additional integration of user intent along with text generation, trained with a multi-task learning paradigm and an additional regularization technique to penalize generating the wrong entity as output. The model further incorporates a knowledge graph entity lookup during inference to guarantee that the generated output is consistent with the local knowledge graph provided. We evaluated the model using the BLEU score; the empirical evaluation shows that our proposed architecture can aid in improving the performance of task-oriented dialogue systems.

Abstract: Nowadays the organization of scientific events, as well as the submission and publication of papers, has become considerably easier than before. Consequently, metadata of scientific events is increasingly available on the Web, albeit often as raw data in various formats, sacrificing its semantics and interlinking relations. This restricts the usability of this data for, e.g., subsequent analyses and reasoning. Therefore, there is a pressing need to represent this data as Linked Data. We present the new release of the EVENTSKG dataset, comprising comprehensive semantic descriptions of scientific events of eight computer science communities. Currently, EVENTSKG is a 5-star dataset containing metadata of 73 top-ranked event series (about 1,950 events in total) established over the last five decades. The new release is a Linked Open Dataset adhering to an updated version of the Scientific Events Ontology (SEO), a reference ontology for event metadata representation, leading to richer and cleaner data. To facilitate the maintenance of EVENTSKG and to ensure its sustainability, EVENTSKG is coupled with a Java API that enables users to create and update event metadata without going into the details of the representation of the dataset. We shed light on event characteristics by demonstrating an analysis of the EVENTSKG data, which provides a flexible means for customization in order to better understand the characteristics of top-ranked CS events.

Acknowledgment

This work was partly supported by the European Union’s Horizon 2020 funded projects WDAqua (grant no. 642795), ScienceGRAPH (GA no. 819536), and Cleopatra (grant no. 812997), as well as the BMBF funded project Simple-ML.

Furthermore, we are pleased to announce that we got a workshop and two tutorials accepted, which will be co-located with ESWC 2019.

Here are the accepted workshop and tutorials with their short descriptions:

 

  • Workshops
    • 1st Workshop on Large Scale RDF Analytics (LASCAR-19) by Hajira Jabeen, Damien Graux, Gezim Sejdiu, Muhammad Saleem and Jens Lehmann.
      Abstract: This workshop on Large Scale RDF Analytics (LASCAR) invites papers and posters related to the problems faced when dealing with the enormous growth of linked datasets, and with the advancement of semantic web technologies in the domain of large scale and distributed computing. LASCAR particularly welcomes research efforts exploring the use of generic big data frameworks such as Apache Spark and Apache Flink, or of specialized libraries such as Giraph, TinkerPop, and Spark SQL, for Semantic Web technologies. The goal is to demonstrate the use of existing frameworks and libraries for Knowledge Graph processing and to discuss solutions to the challenges and issues arising therein. There will be a keynote by an expert speaker and a panel discussion among experts and scientists working in the area of distributed semantic analytics. LASCAR targets a range of interesting research areas in large scale processing of Knowledge Graphs, such as querying, inference, and analytics; we therefore expect a wide audience interested in attending the workshop.
  • Tutorials
    • SANSA’s Leap of Faith: Scalable RDF and Heterogeneous Data Lakes by Hajira Jabeen, Mohamed Nadjib Mami, Damien Graux, Gezim Sejdiu, and Jens Lehmann.
      Abstract: Scalable processing of Knowledge Graphs (KG) is an important requirement for today’s KG engineers. The Scalable Semantic Analytics Stack (SANSA) is a library built on top of Apache Spark that offers several APIs tackling various facets of scalable KG processing. SANSA is organized into several layers: (1) RDF data handling, e.g. filtering, computation of RDF statistics, and quality assessment; (2) SPARQL querying; (3) inference/reasoning; (4) analytics over KGs. In addition to processing native RDF, SANSA also allows users to query a wide range of heterogeneous data sources (e.g. files stored in Hadoop or other popular NoSQL stores) uniformly using SPARQL. This tutorial aims to provide an overview, a detailed discussion, and a hands-on session on SANSA, covering all the aforementioned layers using simple use cases.
    • Build a Question Answering system overnight by Denis Lukovnikov, Gaurav Maheshwari, Jens Lehmann, Mohnish Dubey and Priyansh Trivedi
      Abstract: With this tutorial, we aim to provide the participants with an overview of the field of Question Answering over Knowledge Graphs, insights into commonly faced problems, its recent trends, and developments. In doing so, we hope to provide a suitable entry point for the people new to this field and ease their process of making informed decisions while creating their own QA systems. At the end of the tutorial, the audience would have hands-on experience of developing a working deep learning based QA system.

Looking forward to seeing you at ESWC 2019.

Demo and workshop papers accepted at The Web Conference (formerly WWW) 2019

We are very pleased to announce that our group got a demo paper accepted for presentation at the 2019 edition of The Web Conference (the 30th edition of the former WWW conference), which will be held on May 13-17, 2019 in San Francisco, USA.

The 2019 edition of The Web Conference will offer many opportunities to present and discuss the latest advances in academia and industry, with contributions across research tracks, workshops, tutorials, an exhibition, posters, demos, a developers’ track, a W3C track, an industry track, a PhD symposium, challenges, a minute of madness, an international project track, W4A, a hackathon, the BIG web, and a journal track.

Here is the pre-print of the accepted paper with its abstract:

Abstract: Squerall is a tool that allows the querying of heterogeneous, large-scale data sources by leveraging state-of-the-art Big Data processing engines: Spark and Presto. Queries are posed on-demand against a Data Lake, i.e., directly on the original data sources without requiring prior data transformation. We showcase Squerall’s ability to query five different data sources, including inter alia the popular Cassandra and MongoDB. In particular, we demonstrate how it can jointly query heterogeneous data sources, and how interested developers can easily extend it to support additional data sources. Graphical user interfaces (GUIs) are offered to support users in (1) building intra-source queries, and (2) creating required input files.
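The core idea of querying heterogeneous sources on demand can be illustrated with a toy mediator (invented data and wrappers; Squerall itself relies on Spark and Presto and accepts SPARQL queries): each source is wrapped to yield uniform records, which can then be joined without any prior data transformation:

```python
import csv, io, json

# Two heterogeneous "sources": a CSV document and a JSON document.
CSV_PRODUCTS = "id,name\n1,laptop\n2,phone\n"
JSON_ORDERS = '[{"product_id": 1, "qty": 3}, {"product_id": 2, "qty": 5}]'

# Per-source wrappers expose every source as a list of uniform records --
# loosely analogous to the connectors a mediator puts over each store.
def read_csv_source(text):
    return [{"id": int(r["id"]), "name": r["name"]}
            for r in csv.DictReader(io.StringIO(text))]

def read_json_source(text):
    return json.loads(text)

def join(left, right, on):
    """Naive hash join across two wrapped sources."""
    index = {row[on[0]]: row for row in left}
    return [{**index[r[on[1]]], **r} for r in right if r[on[1]] in index]

result = join(read_csv_source(CSV_PRODUCTS), read_json_source(JSON_ORDERS),
              on=("id", "product_id"))
print(result)
```

The join runs directly over the wrapped originals, which is the Data Lake principle the abstract refers to; Squerall additionally distributes this work and hides it behind SPARQL.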

 

Furthermore, we are pleased to announce that we got a workshop paper accepted at the 5th Workshop on Managing the Evolution and Preservation of the Data Web (MEPDaW), which will be co-located with The Web Conference 2019.

MEPDaW 2019 aims at addressing challenges and issues in managing Knowledge Graph evolution and preservation by providing a forum for researchers and practitioners to discuss, exchange and disseminate their ideas and work, to network, and to cross-fertilise new ideas.

Here is the accepted workshop paper with its abstract:

Abstract: Knowledge graphs are dynamic in nature; new facts about an entity are added or removed over time. Therefore, multiple versions of the same knowledge graph exist, each of which represents a snapshot of the knowledge graph at some point in time. Entities within the knowledge graph undergo evolution as new facts are added or removed. Automatically generating a summary out of different versions of a knowledge graph is a long-studied problem. However, most of the existing approaches are limited to pair-wise version comparison, which makes it difficult to capture the complete evolution across several versions of the same graph. To overcome this limitation, we envision an approach to create a summary graph capturing the temporal evolution of entities across different versions of a knowledge graph. The entity summary graphs may then be used for documentation generation, profiling or visualization purposes. First, we take different temporal versions of a knowledge graph and convert them into RDF molecules. Secondly, we perform Formal Concept Analysis on these molecules to generate summary information. Finally, we apply a summary fusion policy in order to generate a compact summary graph which captures the evolution of entities.
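The multi-version comparison the abstract envisions can be sketched with plain triple sets (toy data; the paper additionally uses RDF molecules and Formal Concept Analysis): diff consecutive snapshots and collect per-entity additions and removals into a summary:

```python
from collections import defaultdict

# Three toy snapshots of a knowledge graph as sets of (subject, predicate, object).
V1 = {("alice", "worksFor", "acme"), ("alice", "livesIn", "bonn")}
V2 = {("alice", "worksFor", "acme"), ("alice", "livesIn", "berlin")}
V3 = {("alice", "worksFor", "globex"), ("alice", "livesIn", "berlin")}

def evolution_summary(versions):
    """Per-entity log of added/removed facts across consecutive versions."""
    summary = defaultdict(list)
    for i in range(1, len(versions)):
        prev, curr = versions[i - 1], versions[i]
        for s, p, o in sorted(curr - prev):
            summary[s].append((i, "added", p, o))
        for s, p, o in sorted(prev - curr):
            summary[s].append((i, "removed", p, o))
    return dict(summary)

summary = evolution_summary([V1, V2, V3])
for change in summary["alice"]:
    print(change)
```

Unlike a pair-wise diff, the resulting log spans all versions at once, which is the property the summary graph is meant to preserve.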

Acknowledgment
This research was supported by the German Ministry of Education and Research (BMBF) in the context of the project MLwin (Maschinelles Lernen mit Wissensgraphen, grant no. 01IS18050F).


 

Looking forward to seeing you at The Web Conference 2019.

Paper accepted at Knowledge-Based Systems Journal

KBS-JournalWe are very pleased to announce that our group got a paper accepted at the Knowledge-Based Systems Journal.

Knowledge-Based Systems is an international, interdisciplinary and applications-oriented journal. It focuses on systems that use knowledge-based (KB) techniques to support human decision-making, learning, and action; it emphasizes the practical significance of such KB systems, their development and usage; and it covers their implementation: the design process, models and methods, software tools, decision-support mechanisms, user interactions, organizational issues, knowledge acquisition and representation, and system architectures.

Here is the accepted paper with its abstract:

Abstract: Noise is often present in real datasets used for training Machine Learning classifiers. Its disruptive effects on the learning process may include increasing the complexity of the induced models, a higher processing time, and a reduced predictive power in the classification of new examples. Therefore, treating noisy data in a preprocessing step is crucial for improving data quality and reducing its harmful effects on the learning process. There are various filters using different concepts for identifying noisy examples in a dataset. Their ability in noise preprocessing is usually assessed via the identification of artificial noise injected into one or more datasets. This is done to overcome the limitation that only a domain expert can guarantee whether a real example is indeed noisy. The most frequently used label noise injection method is the noise-at-random method, in which a percentage of the training examples have their labels randomly exchanged, regardless of the characteristics and positions in the example space of the selected examples. This paper proposes two novel methods to inject label noise in classification datasets. These methods, based on complexity measures, can produce more challenging and realistic noisy datasets by disturbing the labels of critical examples situated close to the decision borders, and can thereby improve the evaluation of noise filters. An extensive experimental evaluation of different noise filters is performed using public datasets with injected label noise, and the influence of the noise injection methods is compared in both the data preprocessing and classification steps.
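The baseline noise-at-random method, and the paper's idea of preferring examples near the decision border, can be sketched as follows (1-D toy data; the nearest-opposite-class distance below is only a crude stand-in for the complexity measures proposed in the paper):

```python
import random

random.seed(0)

def noise_at_random(labels, rate):
    """Flip the labels of a random fraction of examples (binary labels)."""
    labels = labels[:]
    for i in random.sample(range(len(labels)), int(rate * len(labels))):
        labels[i] = 1 - labels[i]
    return labels

def border_noise(points, labels, rate):
    """Flip the labels of the examples closest to the other class -- a crude
    stand-in for complexity-measure-guided injection."""
    def margin(i):
        return min(abs(points[i] - points[j])
                   for j in range(len(points)) if labels[j] != labels[i])
    order = sorted(range(len(labels)), key=margin)
    noisy = labels[:]
    for i in order[:int(rate * len(labels))]:
        noisy[i] = 1 - noisy[i]
    return noisy

# 1-D toy dataset: class 0 on the left, class 1 on the right,
# with two examples sitting right at the decision border.
points = [0.0, 1.0, 2.0, 4.9, 5.1, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

print(noise_at_random(labels, 0.25))
print(border_noise(points, labels, 0.25))
```

With the border-biased variant the two flipped examples are exactly the borderline ones at 4.9 and 5.1, which is what makes the resulting noisy dataset harder for a filter to clean than uniformly random flips.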

Paper accepted at EDBT 2019

We are very pleased to announce that our group got a paper accepted for presentation at the 2019 edition of the EDBT conference, which will be held on March 26-29, 2019 in Lisbon, Portugal.

The International Conference on Extending Database Technology is a leading international forum for database researchers, practitioners, developers, and users to discuss cutting-edge ideas, and to exchange techniques, tools, and experiences related to data management.

Here is the pre-print of the accepted paper with its abstract:

Abstract: Point of Interest (POI) data constitutes the cornerstone in many modern applications. From navigation to social networks, tourism, and logistics, we use POI data to search, communicate, decide and plan our actions. POIs are semantically diverse and spatio-temporally evolving entities, having geographical, temporal, and thematic relations. Currently, integrating POI datasets to increase their coverage, timeliness, accuracy and value is a resource-intensive and mostly manual process, with no specialized software available to address the specific challenges of this task. In this paper, we present an integrated toolkit for transforming, linking, fusing and enriching POI data, and extracting additional value from them. In particular, we demonstrate how Linked Data technologies can address the limitations, gaps and challenges of the current landscape in Big POI data integration. We have built a prototype application that enables users to define, manage and execute scalable POI data integration workflows built on top of state-of-the-art software for geospatial Linked Data. This application abstracts and hides away the underlying complexity, automates quality-assured integration, scales efficiently for world-scale integration tasks, and lowers the entry barrier for end-users. Validated against real-world POI datasets in several application domains, our system has shown great potential to address the requirements and needs of cross-sector, cross-border and cross-lingual integration of Big POI data.

Acknowledgment

This work was partially funded by the EU H2020 project SLIPO (#731581).

Looking forward to seeing you at the EDBT 2019 conference.