MOSIG Master 2ND YEAR Research
YEAR 2020–2021

Fixpoint semantics for relational concept analysis

Master topic / Sujet de master recherche

Relational concept analysis is an extension of formal concept analysis that can extract simultaneously several interrelated concepts. Although, in absence of circularities, it provides the single grounded solutions; circular dependent concept descriptions call for a fixed-point semantics.

Formal concept analysis (FCA, [Ganter 1999]) is a data mining approach that identifies concepts within a data set. More precisely, from a set of data items, e.g. Researchers, identified by attributes, e.g. topic, age, it finds out concepts which are characterized by the set of objects sharing a set of attributes, e.g. Researchers in artificial intelligence of less than 25 years old.

Relational concept analysis (RCA, [Rouane-Hacene 2013]) enables to find interdependent concepts within related data sets, i.e. that Researchers are also characterised by the Papers they wrote and the Labs they belong to which, in turns, depend on other Researchers. For that purpose, it splits the data in related datasets (here Researchers, Papers, Labs) which are processed individually by formal concept analysis. The resulting concepts are then injected back to the initial FCA problems with the help of scaling operators and formal concept analysis is performed again on these problems, for instance new attributes would be: has-some-members-as-Reserachers-in-AI-of-less-than-25-years-old, or wrote-all-her-papers-with-AI-Researchers. This process is iterated until it reaches a fixed point, i.e. until no more concepts are added.

We use such techniques, in the ANR Elker project, to extract sameAs links in RDF data sets [Atencia 2019]. This led us to think that the semantics of RCA is underspecified.

As seen above, RCA is dealing with concepts which may eventually be defined in function of themselves. Circular dependencies are commonplace in ontologies, e.g. the class Person depending to itself through the attribute 'parent''. The use of a relation and its inverse ('authorOf'-'hasWritten') is already a simple case of circular dependency. Similarly, circular dependencies across link keys raise the problem of self-supported links, i.e. 'EEUU' is the same as 'USA' if 'Nueva York' is the same as 'New-York' which relies on 'EEUU' being the same as 'USA'. This allows for not-well-grounded definitions (when there is no circular dependency, there is only one fixed-point).

Since RCA had been initially defined for extracting description logic concepts, it naturally adopted the standard description logic semantics [Baader 2007]: the least fixed point semantics which only retain what holds in all fixed points. However, link key extraction is an inductive task rather than a deductive task. It may be worth generating as many links as possible, so self-supported links should not be discarded upfront. It is thus interesting to consider the greatest fixed point for link keys (which are not concept description in the description logic sense).

The goal of this topic is to investigate the definition of relational concept analysis and to define a fixed-point semantics for it. This also involves considering how classical FCA/RCA algorithms may be adapted to use different semantics. It could be useful for each formal contexts, to specify the semantics under which it has to be interpreted. In addition, it may be interesting considering how such semantics relate to each others.

References:

[Atencia 2019] Manuel Atencia, Jérôme David, Jérôme Euzenat, Amedeo Napoli, Jérémy Vizzini, Link key candidate extraction with relational concept analysis, Discrete applied mathematics, 2019
[Baader 2007] Franz Baader, Deborah McGuinness, Daniele Nardi, Peter Patel-Schneider (eds.), The description logic handbook: Theory, implementation, and applications, 2nd ed., Cambridge university press, Cambriddge (UK), 2007
[Ganter 1999] Bernhard Ganter, Rudolf Wille, Formal concept analysis: mathematical foundations, Springer, Berlin (DE), 1999
[Hacene 2013a] Mohamed Rouane Hacene, Marianne Huchard, Amedeo Napoli, Petko Valtchev, Relational concept analysis: mining concept lattices from multi-relational data, Annals of Mathematics and Artificial Intelligence

Links:

mOeX web site: https://moex.inria.fr
Elker web site: https://project.inria.fr/elker/

Reference number: Proposal n°2749

Master profile: M2R MOSIG, M2R MSIAM or M2R Informatics

Advisor: Jérôme Euzenat (Jerome:Euzenat#inria:fr) and Jérôme David (Jerome:David#inria:fr).

Team: The work will be carried out in the mOeX team common to INRIA & Université Grenoble Alpes. mOeX is dedicated to study knowledge evolution through adaptation. It gather permanent researchers from the Exmo team which has taken an active part these past 15 years in the development of the semantic web and more specifically ontology matching.

Laboratory: LIG.

Procedure: Contact us and provide vitæ and possibly motivation letter and references.