Conceptualization of good practices within a community of practice.
Date
17-11-2022
Abstract
Since the advent of the social and semantic web, new tools and sharing sites such as Meta,
Twitter and WikiHow have emerged, making the web a universal collection of knowledge in
which geographically dispersed users form online communities of practice (CoPs). The CoP is
originally a concept from sociology, but it finds its full development on the current web,
where users share and exchange their know-how in different fields in the form of procedural
knowledge (PK) called good practices.
These good practices are defined by a set of successive steps taken to achieve an objective.
Conceptualizing this procedural knowledge has become a major challenge in several fields
(information retrieval, intelligent applications, robotics, etc.), and knowledge discovery in
databases (KDD) is the field that is evolving to offer solutions. KDD combines different
learning and knowledge representation methods to explore unstructured data and facilitate
its exploitation. In this context, several works have explored procedural knowledge for
different purposes, such as creating a knowledge base or identifying instructions in
procedural knowledge. Most of this work belongs to the field of natural language processing.
Our goal is different: in this thesis we present a new approach to extract and conceptualize
good practices from the web, and to extract the best practice for a given query.
The proposed approach takes place in two phases. In the first phase, we extract good practices
from the web using a web scraping method, then represent them as directed graphs.
In the second phase, we extract the best practice for a given query by applying machine
learning and text summarization techniques to the graphs. This phase takes place in three steps:
(1) search for practices similar to the user's query, where we use a word embedding model to
identify sentences similar to the goal sought by the user; (2) grouping and fusion of similar
steps, where we use unsupervised learning (DBSCAN) and text summarization (PageRank)
techniques to group semantically close nodes, which we merge into a single step; (3) extraction
of the best practice, identified as the path in the graph that traverses the most important
steps to reach the objective. This importance is calculated with graph centrality measures,
which quantify the importance of each node in a directed graph by the number of its
incoming and outgoing arcs.
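
Step (3) can be illustrated with a minimal sketch. It is not the implementation used in the thesis: it assumes that node importance is plain degree centrality (incoming plus outgoing arcs), that the graph is acyclic, and that the best practice is the start-to-goal path with the highest summed importance. The function names and the example steps are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: number of incoming + outgoing arcs} for a directed graph."""
    score = defaultdict(int)
    for src, dst in edges:
        score[src] += 1  # outgoing arc of src
        score[dst] += 1  # incoming arc of dst
    return dict(score)


def best_path(edges, start, goal):
    """Return the start-to-goal path with the highest summed centrality.

    Assumption: the graph is a DAG, so the recursion terminates.
    Returns None when the goal cannot be reached from the start.
    """
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)
    score = degree_centrality(edges)
    memo = {}  # node -> (total score, path) or None if goal is unreachable

    def visit(node):
        if node == goal:
            return score.get(node, 0), [node]
        if node in memo:
            return memo[node]
        best = None
        for nxt in successors[node]:
            sub = visit(nxt)
            if sub is None:
                continue  # this branch never reaches the goal
            total = score.get(node, 0) + sub[0]
            if best is None or total > best[0]:
                best = (total, [node] + sub[1])
        memo[node] = best
        return best

    result = visit(start)
    return result[1] if result else None


# Hypothetical merged graph of steps for one goal (illustrative data only).
edges = [
    ("boil water", "add tea bag"),
    ("boil water", "warm the cup"),
    ("warm the cup", "add tea bag"),
    ("add tea bag", "steep"),
    ("boil water", "steep"),
    ("steep", "serve"),
]
print(best_path(edges, "boil water", "serve"))
```

Summing the scores favors longer paths that pass through many well-connected steps; averaging the scores along the path would be an equally plausible choice under these assumptions.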
The results obtained demonstrate the superiority of our approach for (1) capturing practices
similar to the goal sought by the user while optimizing execution time, and (2) extracting the
best practices for queries, compared to a search engine, on a real data set.
Description
Citation
Thesis room