List of all upcoming and past events organized by the SPARKS team.
List of past events
Thursday 5 July 2018 10:00 to 11:00
SPARQL query optimization, suggestion and its application
In this presentation, I will talk about my previous work on SPARQL query processing and one of its applications in the domain of financial regulation. First, I will talk about our work on selectivity estimation for RDF graph patterns, which is fundamental to SPARQL query processing. Previous work relied on the join uniformity assumption when estimating joined triple patterns, which leads to highly inaccurate estimates when the properties in RDF graph patterns are correlated. We take into account the dependencies among properties in graph patterns and propose a more accurate estimation model. Then, I will talk about our work on query suggestion over databases. Existing query suggestion methods over databases score candidate suggestions individually and return the top-k best of them. However, the top-k suggestions may be highly redundant with respect to topics. To be informative, the k returned suggestions are expected to be diverse, i.e., to simultaneously maximize relevance to the user query and diversity with respect to the topics the user might be interested in. We define an objective function that considers both factors to evaluate a suggestion set. Finally, I will present recent work on financial regulation. We transform legal rules and business activities into a set of SPARQL queries and RDF data graphs, so that compliance checking can be automated by posing SPARQL queries on RDF activity data graphs.
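As a rough illustration of the selectivity issue mentioned in this abstract (not the speaker's estimation model), the sketch below compares an independence-style estimate with the observed co-occurrence count for two triple patterns joined on the same subject. The toy triples and the function names are assumptions made up for this example.

```python
# Toy RDF graph as (subject, predicate, object) triples; all data is made up.
TRIPLES = [
    ("a", "author", "x"), ("a", "isbn", "1"),
    ("b", "author", "y"), ("b", "isbn", "2"),
    ("c", "name", "z"),
    ("d", "name", "w"),
]


def subjects_with(pred):
    """Return the set of subjects that have at least one triple with `pred`."""
    return {s for s, p, _ in TRIPLES if p == pred}


def independent_estimate(p1, p2):
    """Estimate the size of `?s p1 ?o1 . ?s p2 ?o2` as if p1 and p2 were independent."""
    n = len({s for s, _, _ in TRIPLES})  # number of distinct subjects
    return len(subjects_with(p1)) * len(subjects_with(p2)) / n


def correlated_estimate(p1, p2):
    """Count the subjects on which p1 and p2 actually co-occur."""
    return len(subjects_with(p1) & subjects_with(p2))


if __name__ == "__main__":
    # "author" and "isbn" always appear together, "author" and "name" never do,
    # yet the independence-style estimate is the same for both pairs.
    for pair in [("author", "isbn"), ("author", "name")]:
        print(pair, independent_estimate(*pair), correlated_estimate(*pair))
```

In this toy graph each subject has at most one value per property, so counting subjects is enough; a real estimator would also have to account for multi-valued properties.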
Wednesday 2 May 2018 10:00 to 12:00
High-Accuracy Semantic Information Extraction in Rakuten
In this talk, Martin Rezk will first introduce Rakuten and the Rakuten Institute of Technology, and give a brief overview of the different approaches to semantic information extraction and sharing at the company: from finding attributes and values in product descriptions and images to chat-bots. After this general introduction, Martin Rezk will present a bootstrapping approach for attribute value extraction from text that minimizes human intervention. This approach automatically extracts attribute names and values from semi-structured text, generates a small labelled dataset, and bootstraps it by extracting new values from unstructured text using different machine learning algorithms. The proposed framework is domain- and language-independent, and it is currently running for two languages: Japanese and German.
Research interests: Ontology Based Data Access for data integration, query optimization, Knowledge Representation and Reasoning, and Natural Language Processing.
Thursday 12 April 2018 15:00 to 16:00
Information visualization techniques for exploring multidimensional graphs.
In this presentation we will discuss the use of visualization techniques for exploring multidimensional data and, more specifically, multidimensional graphs. Whilst visualization techniques are a suitable alternative for helping users to gain knowledge about the internal structure of abstract data and the causal relationships in it, informational items stored in multidimensional graphs are often complex and cannot easily be displayed all at once. We present a tool called MGExplorer, which is aimed at helping users to explore large data sets using visualizations of multidimensional graphs. To this end, MGExplorer implements an exploratory process that allows users to inspect and compare data subsets in chained views. MGExplorer supports the comparison of data subsets using four different information visualization techniques: Node-Edge diagram, ClusterVis, GlyphMatrix and IRIS. Moreover, MGExplorer was conceived as an open framework (built on top of the D3 framework), allowing extensions and the possible inclusion of many other information visualization techniques. One of the ultimate goals of MGExplorer is to support the development of search engines based on visual queries.
Thursday 8 March 2018 15:00 to 17:00
A Line of Machine Learning Workflows: ROCKFlows
You have heard about ROCKFlows and, above all, about its goals. The basic building blocks of the line have now been built. You can already query the system to obtain classification workflows (WF). In this talk, after a brief reminder of the goals of ROCKFlows (RF), we will present: - its architecture and, above all, the many research questions around workflow comparison that it aims to answer; - the portfolio and the experiment factory; - our short- and long-term perspectives and all the exciting questions raised by the ROCKFlows approach.
Thursday 22 February 2018 14:00 to 16:00
User-centric approaches for streaming VR
Streaming Virtual Reality (VR), even in the simple form of 360°-videos, is much more complex than streaming regular videos because, to lower the required rates, the transmission decisions must take the user’s head position into account. The way the user exploits her/his freedom is therefore crucial for the network load. In turn, the way the user moves depends on the video content itself.
VR is however a whole new medium, for which the film-making language does not exist yet, its “grammar” only now being invented. We present a strongly inter-disciplinary approach to improve the streaming of 360°-videos: designing high-level content manipulations (film editing) to limit and even control the user’s motion in order to consume less bandwidth while maintaining the user’s experience.
We will discuss in particular the approaches to defining the so-called Quality of Experience in the networking community for video streaming, and the way we have carried out user experiments to confirm our hypothesis. Two sets of user experiments showed that editing indeed impacts head velocity (reduction of up to 30%), consumed bandwidth (reduction of up to 25%) and subjective assessment. Tools from other communities for driving the user’s attention can hence be designed in order to improve streaming, which opens up a whole new field of possibilities in defining degrees of freedom to be wielded for VR streaming optimization.
Thursday 23 November 2017 15:00 to 16:00
From chocolate cake to biodiversity facts: Google rich snippets, schema.org and Linked Data...
In recent years, major search engines have promoted the marking up of web pages using vocabularies such as schema.org. These data not only help them understand and rank web pages better, but more importantly search engines now leverage these data to literally respond to our searches, typically in the form of rich snippets.
In this talk, I will take the example of Google rich snippets to discuss the relationship between markup data and Linked Open Data, with a specific focus on biodiversity information. I will also show how extensions of schema.org could dramatically improve the discovery of biodiversity data on the web and, in turn, foster new data integration scenarios.
Friday 12 May 2017 14:00 to 15:30
Abstract Interpretation 101
Abstract Interpretation [Cousot/Cousot POPL77] is a very useful framework to infer properties of programs, which has shown its effectiveness in proving the safety of critical code such as airplane controllers. In this talk, Laure Gonnord gives a from-scratch course on Abstract Interpretation, from the very beginning in the late 70's to some recent developments, from formal verification to code optimisation. Laure Gonnord demonstrates that designing abstract domains tailored for specific applications can be fun and easy!
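To give a flavour of what an abstract domain looks like, here is a minimal sketch of the classic interval domain with widening, applied to a counter loop. It is not taken from the talk; the class and function names are assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract value: every integer between lo and hi (bounds may be infinite)."""
    lo: float
    hi: float

    def join(self, other):
        """Least upper bound: the smallest interval containing both operands."""
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def add(self, other):
        """Abstract addition: add the bounds pairwise."""
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def widen(self, other):
        """Widening: any bound that is still growing jumps to infinity."""
        lo = self.lo if other.lo >= self.lo else float("-inf")
        hi = self.hi if other.hi <= self.hi else float("inf")
        return Interval(lo, hi)


def analyze_counter_loop():
    """Abstractly run `i = 0; while ...: i = i + 1` until a fixpoint is reached."""
    state = Interval(0, 0)  # value of i at the loop head
    while True:
        after_body = state.add(Interval(1, 1))
        new_state = state.widen(state.join(after_body))
        if new_state == state:
            return state
        state = new_state


if __name__ == "__main__":
    print(analyze_counter_loop())
```

Widening trades precision for termination: without it, the upper bound would grow by one at every iteration and the analysis of this loop would never stabilise.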
Friday 3 March 2017 10:30 to 12:00
Datalog revisited for reasoning in Linked Data
Linked Data provides access to huge, continuously growing amounts of open data and ontologies in RDF format that describe entities, links and properties on those entities. Equipping Linked Data with inference paves the way to make the Semantic Web a reality.
In this presentation, I will describe a unifying framework for RDF ontologies and databases that we call deductive RDF triplestores. It consists in equipping RDF triplestores with Datalog inference rules. This rule language makes it possible to capture, in a uniform manner, OWL constraints that are useful in practice, such as property transitivity or symmetry, but also domain-specific rules with practical relevance for users in many domains of interest. I will illustrate the expressivity of this framework for modeling Linked Data applications and its genericity for developing inference algorithms. In particular, we will show how it makes it possible to model the problem of data linkage in Linked Data as a reasoning problem on possibly decentralized data. I will also explain how it makes it possible to efficiently extract expressive modules from Semantic Web ontologies and databases with formal guarantees, whilst effectively controlling their succinctness. Experiments conducted on real-world datasets have demonstrated the feasibility of this approach and its usefulness in practice for data integration and information extraction.
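The idea of equipping a triplestore with rules for transitivity and symmetry can be sketched as a naive forward-chaining loop over a set of triples. This is only an illustration under simplifying assumptions, not the inference algorithm presented in the talk; the function `saturate` and the property names in the usage example are hypothetical.

```python
def saturate(triples, transitive=(), symmetric=()):
    """Apply transitivity and symmetry rules until no new triple is derived.

    Rules, in Datalog style:
      (?x p ?z) :- (?x p ?y), (?y p ?z)   for every p in `transitive`
      (?y p ?x) :- (?x p ?y)              for every p in `symmetric`
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        derived = set()
        for s, p, o in facts:
            if p in symmetric:
                derived.add((o, p, s))
            if p in transitive:
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        derived.add((s, p, o2))
        if not derived <= facts:  # at least one genuinely new fact
            facts |= derived
            changed = True
    return facts


if __name__ == "__main__":
    data = {("a", "partOf", "b"), ("b", "partOf", "c"), ("x", "sameAs", "y")}
    for triple in sorted(saturate(data, transitive={"partOf"}, symmetric={"sameAs"})):
        print(triple)
```

The loop rescans every fact at each round, which is fine for a toy example; practical Datalog engines typically use semi-naive evaluation to consider only newly derived facts.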
Thursday 5 January 2017 14:00
Setting up an I3S/Géoazur project on seismological signals
COMRED, MDSC, SIS, SPARKS
We would like to set up a project between Géoazur and I3S to create a program that will detect, extract and store all these signals in a library. During this seminar, I will present the different types of signals we use and what we do with them, as well as the other signals recorded by offshore stations, and then describe what we would like to do with the help of your expertise.
Tuesday 8 November 2016 10:00 to 12:00
"Learning slowly to learn better: curriculum learning for legal ontology population" presented by Milagro Teruel
Monday 20 June 2016 14:00 to 15:30
Precise Model Composition Interfaces with Instantiation Cardinalities
Separation of concerns and multi-view modelling advocate that structural and behavioural properties of different facets of a system be modularized within separate models. To understand, analyze, simulate, execute or generate combined structure and behaviour from such separate modules, many model composition operators and approaches have been proposed. However, when the functionality provided by a module is needed repeatedly in a composed system, it is often not clear how the existing composition operators and approaches are to be applied to the models that are to be composed, and what the expected composed result will look like. This talk introduces instantiation cardinalities, a mechanism that removes potential ambiguities for developers who use model composition to integrate separate structure and behaviour. Instantiation cardinalities give the model designer fine-grained control over how many instances of each structural and behavioural element contained in a model are to be created in the target model during model composition. I will illustrate the approach by presenting the aspect-oriented design of a behavioural, a structural and a creational design pattern.
Friday 17 June 2016 11:00 to 12:30
Challenges for making the Semantic Web affordable to industries in the Energy sector. Sharing our experience in the European Smart Energy Aware Systems project.
The ITEA3 12004 Smart Energy Aware Systems project aims at developing a global ecosystem of distributed services that all target energy efficiency. The Ecole des Mines de Saint-Etienne was originally part of the ambitious Knowledge Modeling task, which was to ground the model on Semantic Web formalisms. Soon, industrial and academic partners also faced the challenge of making the best of the developed ontologies. This especially includes adapting their legacy Web services, sensors, and actuators at the lowest possible cost. In this talk, I will first share our experiences on these different challenges. Then, I will focus on two important results that aim at lowering the cost of making legacy systems interoperable on the Web:
1. How to concisely represent data with custom datatypes, while ensuring RDF processors and SPARQL engines can discover them on the fly and process them uniformly?
2. How to query an RDF dataset, along with a set of documents with heterogeneous formats?
3. How to enable servers and clients to use legacy (potentially very lightweight) data representation formats while ensuring one can still interpret messages as RDF?
Friday 27 May 2016 14:00 to 15:30
Concern-Oriented Reuse, and how to Support Delaying of Decisions
Concern-Oriented Reuse (CORE) is a new paradigm aimed at maximizing software reuse, in which software development is structured around new modules called concerns. Techniques from model-driven engineering (MDE), software product line engineering (SPL) and software composition (in particular aspect-orientation) allow concerns to encapsulate a variety of solutions (requirements, architecture, design, implementation models and code) to recurring software development problems in a versatile, generic way. Reusing the encapsulated solutions is streamlined through well-defined interfaces: a) the variation interface, which exposes the variants offered by the concern and allows the user to reason about the impact of selecting a solution on high-level goals and system properties, b) the customization interface, which designates the partial, generic elements of a solution which need to be adapted to the specific reuse context in order to be usable, and c) the usage interface, which allows the user to access the functionality offered by the concern.
In this talk I will briefly explain the main idea and concepts of CORE, and then focus on one of the most exciting advantages of CORE: the support for delayed decision making. With CORE, high-level decisions (e.g. architectural decisions) can be made at one point in time in order to allow development to continue, while low-level decisions (e.g., implementation choices) can be deferred to a later point when more detailed requirements are known. Likewise, when building a concern by reusing other low-level concerns, CORE allows developers to make partial feature selections from the reused concern's variation interface. Using the partial selection, the CORE tool generates partial customization and partial usage interfaces for the concern that the developer can use to finalize their development. Any variants encapsulated by the reused concern that are identified by the developer as potentially usable solutions in the current context are reexposed at the interface of the concern being designed. I will present and demonstrate the advanced software composition algorithms that make delaying of decisions possible in CORE: feature model composition, impact model composition, (class-, sequence- and state diagram composition if time permits), as well as the overall software composition algorithm that collects all decisions made at different points in time to generate the final application.
Thursday 19 May 2016 14:00 to 15:30
Dynamic Adaptation: from business processes to the operational environment. Application to the continuity of ambient services.
Tuesday 17 May 2016 10:00 to 11:30
Deep learning and weak supervision for image classification
Deep learning and Convolutional Neural Networks (CNNs) are state-of-the-art methods for various visual recognition tasks, e.g. image classification or object detection. To better identify or localize objects, bounding box annotations are often used. These rich annotations quickly become too costly to obtain, making the development of Weakly Supervised Learning (WSL) models appealing.
We discuss several strategies to automatically select relevant image regions from weak annotations (e.g. image-level labels) in deep CNNs. We also introduce our architecture WELDON for WEakly supervised Learning of Deep cOnvolutional neural Networks.
Our deep learning framework, leveraging recent improvements on the Multiple Instance Learning paradigm, is validated on several recognition tasks.
Tuesday 3 May 2016 15:00 to 16:30
Trying to make the best out of the least: experiments on optimizing supervision in machine learning approaches for Spanish NLP
Tuesday 22 March 2016 11:00 to 12:30
Exposing heterogeneous legacy data on the Web of Data
The Web of Data is now emerging thanks to the publication and interlinking of datasets on the Web. A key enabler of its advent is the ability to translate legacy data into RDF, the data representation language of the Semantic Web. In this talk, I will first briefly recall the principles of the Web of Data. Then, I will present an overview of the questions that come up when translating legacy data into RDF, notably the reuse vs. creation of vocabularies, and I will focus specifically on methods aimed at translating (i) relational data into RDF, and (ii) other types of data stores such as NoSQL systems.
Tuesday 1 March 2016 14:00 to 15:30
Linked Data visualization
- Olivier Corby will present a technology for generating hypertext browsers on the Web of Data, which relies on a SPARQL endpoint augmented with STTL, a transformation language for RDF graphs and an extension of SPARQL developed in the team.
- Erwan Demairy will present a Corese-based plugin for visualizing semantic graphs in Gephi.
- Raphael Boyer will present his work on the visualization of DBpedia history data, displayed using AngularJS, and on the 3D visualization of RDF graphs using the Unity game engine.
- Emilie Palagi will present her work on redesigning the user interfaces of Discovery Hub, an exploratory search engine over DBpedia developed in the team. These interfaces are called "explanatory": they allow users to make explicit and to understand the relations between the query and a result.
Thursday 11 February 2016 11:00 to 12:30
CORESE Semantic Web platform
This talk will introduce the platform's multiple capabilities: support for W3C standards (RDF, RDFS, SPARQL 1.1) and research extensions (inference rules language, RDF graph transformation language, SPARQL functional extensions…), and discuss its application domains.
Tuesday 12 January 2016 10:00 to 11:30
Possibilistic BDI agents
The next SPARKS seminar of the FORUM theme will take place on Tuesday 12 January 2016 at 10:00 in the council room (salle du conseil).
Celia, Serena and Andrea will present three pieces of work on possibilistic BDI agents.
Thursday 17 December 2015 14:00 to 16:00
Things heard at Supercomputing 2015: the foreseeable future of computing technologies
Computing power has grown exponentially over the last 40 years, following the development of CMOS technologies. The growth rate of computing power was surprisingly steady for several decades. However, several indicators point to a very near (if not already happening) slowdown in the growth of computing capacity, at a time when demand for computing has never been as high. Alternatives to the existing technology are not clear, in spite of recent progress in different areas: new materials, nanotechnologies, quantum computing, neuromorphic computing… This talk will discuss the current status and investigate what the future might be.
Friday 4 December 2015 10:00 to 12:00
Prediction of microRNA-disease associations through distributional analysis
In recent years, the central dogma of molecular biology (DNA → RNA → protein) has been shaken by the growing number of discoveries of non-coding RNA strands (ncRNAs) that play a critical role in many physiological processes. Dysregulations of these ncRNAs are also closely linked to the development and progression of various human diseases, including cancer.
The basic idea of my work is to process the available data on ncRNAs in order to bring out new knowledge about the functions of these molecules and to increase our understanding of pathogenicity mechanisms. As a first step, I focus on the study of particular ncRNAs called microRNAs (miRNAs).
The starting hypothesis is that distributional analysis can be used to reveal new information attached to miRNAs. My approach consists in combining textual and factual data in a high-dimensional vector space and defining the associations between miRNAs and diseases in terms of vector similarities.
Cross-validations carried out on different datasets demonstrate the excellent performance of this approach. In addition, a detailed study on breast cancer confirms the ability of the method to discover new miRNA-disease associations and to identify potentially false associations stored in databases or described in articles.
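As a minimal sketch of what defining associations in terms of vector similarities can mean, the example below ranks candidate diseases for a miRNA by cosine similarity. Cosine similarity is an assumption here, since the abstract does not name the measure; the function names and the toy vectors are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_diseases(mirna_vec, disease_vecs):
    """Return disease names sorted from most to least similar to the miRNA vector."""
    return sorted(
        disease_vecs,
        key=lambda name: cosine(mirna_vec, disease_vecs[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical 3-dimensional vectors; real spaces would be high-dimensional.
    mirna = [1.0, 0.0, 1.0]
    diseases = {"disease-X": [0.0, 1.0, 0.0], "disease-Y": [1.0, 0.0, 0.5]}
    print(rank_diseases(mirna, diseases))
```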
Tuesday 17 November 2015 16:00 to 17:30
Visulab, a tool for research
Thursday 24 September 2015 14:00 to 15:00
Distributed Data Management
Jean-Pierre Lozi presents his work on distributed data management, synchronization and cache coherence in multicore architectures.