Researcher
Enterprise Information Systems
Fraunhofer IAIS

Profiles: LinkedIn, Google Scholar, DBLP, Website

Room B3-236
Schloss Birlinghoven, 53757 Sankt Augustin, Germany
m…@cs.uni-bonn.de

Short CV


Mohamed Nadjib Mami is a Researcher in the Enterprise Information Systems department at Fraunhofer IAIS.

Research Interests


  • Big Data
  • Data Management
  • Heterogeneous Databases
  • Linked Data

Projects


Publications


2018

  • G. Sejdiu, I. Ermilov, J. Lehmann, and M. Nadjib-Mami, “DistLODStats: Distributed Computation of RDF Dataset Statistics,” in Proceedings of 17th International Semantic Web Conference, 2018.
    [BibTeX] [Abstract] [Download PDF]
    Over the last years, the Semantic Web has been growing steadily. Today, we count more than 10,000 datasets made available online following Semantic Web standards. Nevertheless, many applications, such as data integration, search, and interlinking, may not take full advantage of the data without a priori statistical information about its internal structure and coverage. In fact, there are already a number of tools that offer such statistics, providing basic information about RDF datasets and vocabularies. However, those usually show severe performance deficiencies once the dataset size grows beyond the capabilities of a single machine. In this paper, we introduce a software component for statistical calculations of large RDF datasets, which scales out to clusters of machines. More specifically, we describe the first distributed in-memory approach for computing 32 different statistical criteria for RDF datasets using Apache Spark. The preliminary results show that our distributed approach improves upon a previous centralized approach we compare against and provides approximately linear horizontal scale-up. The set of criteria is extensible beyond the 32 defaults; the component is integrated into the larger SANSA framework and employed in at least four major usage scenarios beyond the SANSA community.

    @InProceedings{sejdiu-2018-dist-lod-stats-iswc,
    Title = {Dist{LODS}tats: {D}istributed {C}omputation of {RDF} {D}ataset {S}tatistics},
    Author = {Sejdiu, Gezim and Ermilov, Ivan and Lehmann, Jens and Nadjib-Mami, Mohamed},
    Booktitle = {Proceedings of 17th International Semantic Web Conference},
    Year = {2018},
    Abstract = {Over the last years, the Semantic Web has been growing steadily. Today, we count more than 10,000 datasets made available online following Semantic Web standards.
    Nevertheless, many applications, such as data integration, search, and interlinking, may not take the full advantage of the data without having a priori statistical information about its internal structure and coverage.
    In fact, there are already a number of tools, which offer such statistics, providing basic information about RDF datasets and vocabularies.
    However, those usually show severe deficiencies in terms of performance once the dataset size grows beyond the capabilities of a single machine.
    In this paper, we introduce a software component for statistical calculations of large RDF datasets, which scales out to clusters of machines.
    More specifically, we describe the first distributed in-memory approach for computing 32 different statistical criteria for RDF datasets using Apache Spark.
    The preliminary results show that our distributed approach improves upon a previous centralized approach we compare against and provides approximately linear horizontal scale-up.
    The criteria are extensible beyond the 32 default criteria, is integrated into the larger SANSA framework and employed in at least four major usage scenarios beyond the SANSA community.},
    Keywords = {2018 bde sejdiu lehmann group_aksw iermilov},
    Url = {http://jens-lehmann.org/files/2018/iswc_distlodstats.pdf}
    }

  • G. Sejdiu, I. Ermilov, J. Lehmann, and M. Mami, “STATisfy Me: What are my Stats?,” in Proceedings of 17th International Semantic Web Conference, Poster & Demos, 2018.
    [BibTeX] [Download PDF]
    @InProceedings{sejdiu-2018-statisfy-iswc-poster,
    Title = {{STAT}isfy {M}e: {W}hat are my {S}tats?},
    Author = {Sejdiu, Gezim and Ermilov, Ivan and Lehmann, Jens and Mami, Mohamed-Nadjib},
    Booktitle = {Proceedings of 17th International Semantic Web Conference, Poster \& Demos},
    Year = {2018},
    Keywords = {2018 bde sejdiu lehmann group_aksw iermilov mami},
    Url = {http://jens-lehmann.org/files/2018/iswc_statisfy_pd.pdf}
    }
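
    The DistLODStats entry above describes computing statistical criteria over RDF datasets in a distributed fashion with Apache Spark. As a rough illustration only, the per-criterion filter/aggregate idea can be sketched in plain Python over an in-memory list of triples; the triples and the two criteria below are invented for illustration and are not taken from the paper, whose actual implementation runs on Spark RDDs:

    ```python
    from collections import Counter

    # Toy RDF triples as (subject, predicate, object) tuples -- illustrative
    # data, not from the paper or any of its evaluation datasets.
    triples = [
        ("ex:alice", "rdf:type",    "ex:Person"),
        ("ex:alice", "ex:knows",    "ex:bob"),
        ("ex:bob",   "rdf:type",    "ex:Person"),
        ("ex:bob",   "ex:worksFor", "ex:acme"),
    ]

    def used_classes(triples):
        """One example criterion: distinct classes used via rdf:type."""
        return {o for s, p, o in triples if p == "rdf:type"}

    def property_usage(triples):
        """Another example criterion: occurrence count per predicate."""
        return Counter(p for s, p, o in triples)

    print(used_classes(triples))   # {'ex:Person'}
    print(property_usage(triples))
    ```

    In the distributed setting each criterion becomes a filter plus an aggregation over a partitioned triple collection, which is what makes the approach scale out horizontally.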

2017

  • S. Auer, S. Scerri, A. Versteden, E. Pauwels, A. Charalambidis, S. Konstantopoulos, J. Lehmann, H. Jabeen, I. Ermilov, G. Sejdiu, A. Ikonomopoulos, S. Andronopoulos, M. Vlachogiannis, C. Pappas, A. Davettas, I. A. Klampanos, E. Grigoropoulos, V. Karkaletsis, V. de Boer, R. Siebes, M. N. Mami, S. Albani, M. Lazzarini, P. Nunes, E. Angiuli, N. Pittaras, G. Giannakopoulos, G. Argyriou, G. Stamoulis, G. Papadakis, M. Koubarakis, P. Karampiperis, A. N. Ngomo, and M. Vidal, “The BigDataEurope Platform – Supporting the Variety Dimension of Big Data,” in 17th International Conference on Web Engineering (ICWE2017), 2017.
    [BibTeX] [Abstract] [Download PDF]
    The management and analysis of large-scale datasets — described with the term Big Data — involves the three classic dimensions volume, velocity and variety. While the former two are well supported by a plethora of software components, the variety dimension is still rather neglected. We present the BDE platform — an easy-to-deploy, easy-to-use and adaptable (cluster-based and standalone) platform for the execution of big data components and tools such as Hadoop, Spark, and Flink. The BDE platform was designed based upon the requirements gathered from the seven societal challenges put forward by the European Commission in the Horizon 2020 programme and targeted by the BigDataEurope pilots. As a result, the BDE platform supports a variety of Big Data flow tasks such as message passing (Kafka, Flume), storage (Hive, Cassandra) and publishing (GeoTriples). To facilitate the processing of heterogeneous data, a particular innovation of the platform is the semantic layer, which makes it possible to process RDF data directly and to map and transform arbitrary data into RDF.

    @InProceedings{Auer+ICWE-2017,
    Title = {{T}he {B}ig{D}ata{E}urope {P}latform - {S}upporting the {V}ariety {D}imension of {B}ig {D}ata},
    Author = {S\"oren Auer and Simon Scerri and Aad Versteden and Erika Pauwels and Angelos Charalambidis and Stasinos Konstantopoulos and Jens Lehmann and Hajira Jabeen and Ivan Ermilov and Gezim Sejdiu and Andreas Ikonomopoulos and Spyros Andronopoulos and Mandy Vlachogiannis and Charalambos Pappas and Athanasios Davettas and Iraklis A. Klampanos and Efstathios Grigoropoulos and Vangelis Karkaletsis and Victor de Boer and Ronald Siebes and Mohamed Nadjib Mami and Sergio Albani and Michele Lazzarini and Paulo Nunes and Emanuele Angiuli and Nikiforos Pittaras and George Giannakopoulos and Giorgos Argyriou and George Stamoulis and George Papadakis and Manolis Koubarakis and Pythagoras Karampiperis and Axel-Cyrille Ngonga Ngomo and Maria-Esther Vidal},
    Booktitle = {17th International Conference on Web Engineering (ICWE2017)},
    Year = {2017},
    Abstract = {The management and analysis of large-scale datasets -- described with the term Big Data -- involves the three classic dimensions volume, velocity and variety. While the former two are well supported by a plethora of software components, the variety dimension is still rather neglected. We present the BDE platform -- an easy-to-deploy, easy-to-use and adaptable (cluster-based and standalone) platform for the execution of big data components and tools like Hadoop, Spark, Flink. The BDE platform was designed based upon the requirements gathered from the seven societal challenges put forward by the European Commission in the Horizon 2020 programme and targeted by the BigDataEurope pilots. As a result, the BDE platform allows to perform a variety of Big Data flow tasks like message passing (Kafka, Flume), storage (Hive, Cassandra) or publishing (GeoTriples). In order to facilitate the processing of heterogeneous data, a particular innovation of the platform is the semantic layer, which allows to directly process RDF data and to map and transform arbitrary data into RDF.},
    Keywords = {group_aksw sys:relevantFor:infai sys:relevantFor:bis 2017 auer iermilov ngonga lehmann bde MOLE},
    Url = {http://jens-lehmann.org/files/2017/icwe_bde.pdf}
    }

  • F. Karim, M. N. Mami, M. Vidal, and S. Auer, “Large-scale storage and query processing for semantic sensor data,” in WIMS, 2017, pp. 8:1-8:12.
    [BibTeX] [Download PDF]
    @InProceedings{conf/wims/KarimMVA17,
    Title = {Large-scale storage and query processing for semantic sensor data.},
    Author = {Karim, Farah and Mami, Mohamed Nadjib and Vidal, Maria-Esther and Auer, Sören},
    Booktitle = {WIMS},
    Year = {2017},
    Editor = {Akerkar, Rajendra and Cuzzocrea, Alfredo and Cao, Jiannong and Hacid, Mohand-Said},
    Pages = {8:1-8:12},
    Publisher = {ACM},
    Biburl = {https://www.bibsonomy.org/bibtex/2d5202edc4bea31e063ad5bf341cc749f/dblp},
    Crossref = {conf/wims/2017},
    Ee = {http://doi.acm.org/10.1145/3102254.3102260},
    ISBN = {978-1-4503-5225-3},
    Keywords = {dblp},
    Timestamp = {2017-08-17T11:38:25.000+0200},
    Url = {http://dblp.uni-trier.de/db/conf/wims/wims2017.html#KarimMVA17}
    }

  • K. Endris, M. Galkin, I. Lytra, M. N. Mami, M. Vidal, and S. Auer, “MULDER: Querying the Linked Data Web by Bridging RDF Molecule Templates,” in Database and Expert Systems Applications (DEXA 2017), 2017.
    [BibTeX] [Download PDF]
    @InProceedings{Endris2017,
    Title = {MULDER: Querying the Linked Data Web by Bridging RDF Molecule Templates},
    Author = {Kemele Endris and Mikhail Galkin and Ioanna Lytra and Mohamed Nadjib Mami and Maria-Esther Vidal and S{\"{o}}ren Auer},
    Year = {2017},
    Bdsk-url-1 = {https://www.researchgate.net/publication/318362785_MULDER_Querying_the_Linked_Data_Web_by_Bridging_RDF_Molecule_Templates},
    Crossref = {DEXA2017-1},
    File = {https://github.com/EIS-Bonn/Papers/blob/master/2017/DEXA_Mulder/MulderDEXA2017_CR.pdf},
    Numpages = {15},
    Pubs = {mgalkin,ilytra,endris,auer,vidal},
    Timestamp = {2017.10.12},
    Url = {https://www.researchgate.net/publication/318362785_MULDER_Querying_the_Linked_Data_Web_by_Bridging_RDF_Molecule_Templates}
    }
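
    The semantic layer of the BDE platform above maps arbitrary data into RDF so that heterogeneous sources can be processed uniformly. A minimal sketch of that kind of transformation in plain Python — the records, column names, URIs, and mapping rules below are hypothetical and purely illustrative, not the platform's actual mapping machinery:

    ```python
    # Map tabular records to RDF triples -- a minimal, hypothetical
    # illustration of the transformation a semantic layer performs.
    rows = [
        {"id": "s1", "name": "Station One", "temp": 21.5},
        {"id": "s2", "name": "Station Two", "temp": 19.0},
    ]

    def row_to_triples(row, base="http://example.org/sensor/"):
        """Turn one record into (subject, predicate, object) triples."""
        subject = base + row["id"]
        return [
            (subject, "rdf:type", "ex:Sensor"),
            (subject, "ex:name",  row["name"]),
            (subject, "ex:temp",  row["temp"]),
        ]

    triples = [t for row in rows for t in row_to_triples(row)]
    print(len(triples))  # 6
    ```

    Once data from different sources is lifted into triples like these, it can be queried and interlinked with native RDF data through a single query interface.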

2016

  • M. N. Mami, S. Scerri, S. Auer, and M. Vidal, “Towards Semantification of Big Data Technology,” in Big Data Analytics and Knowledge Discovery – 18th International Conference, DaWaK 2016, Porto, Portugal, September 6-8, 2016, Proceedings, 2016, pp. 376-390. doi:10.1007/978-3-319-43946-4_25
    [BibTeX] [Download PDF]
    @InProceedings{Mami2016,
    Title = {Towards Semantification of Big Data Technology},
    Author = {Mohamed Nadjib Mami and Simon Scerri and S{\"{o}}ren Auer and Maria{-}Esther Vidal},
    Booktitle = {Big Data Analytics and Knowledge Discovery - 18th International Conference, DaWaK 2016, Porto, Portugal, September 6-8, 2016, Proceedings},
    Year = {2016},
    Pages = {376--390},
    Publisher = {Springer},
    Bibsource = {dblp computer science bibliography, http://dblp.org},
    Biburl = {http://dblp.uni-trier.de/rec/bib/conf/dawak/MamiSAV16},
    Crossref = {DBLP:conf/dawak/2016},
    Doi = {10.1007/978-3-319-43946-4_25},
    File = {https://github.com/EIS-Bonn/Papers/raw/65f5ed535a8f7035e088653113f456837b332a09/2015/Semantifying_Big_Data/SEMANTiCS_2015_Research_Track_submission_107.pdf},
    Pubs = {mami,vidal,scerri,auer},
    Timestamp = {Mon, 08 Aug 2016 14:53:45 +0200},
    Url = {http://dx.doi.org/10.1007/978-3-319-43946-4_25}
    }