An integrated framework of learning and evidential reasoning for user profiling using short texts
Document Title
An integrated framework of learning and evidential reasoning for user profiling using short texts
Author
Vo DV, Karnjana J, Huynh V
Name from Authors Collection
Affiliations
Japan Advanced Institute of Science & Technology (JAIST); National Science & Technology Development Agency - Thailand; National Electronics & Computer Technology Center (NECTEC)
Type
Article
Source Title
INFORMATION FUSION
ISSN
1566-2535
Year
2021
Volume
70
Issue
3
Open Access
hybrid
Publisher
ELSEVIER
DOI
10.1016/j.inffus.2020.12.004
Format
Abstract
Inferring user profiles based on texts created by users on social networks has a variety of applications in recommender systems, such as job offering, item recommendation, and targeted advertisement. The problem becomes more challenging when working with short texts like tweets on Twitter or posts on Facebook. This work proposes an integrated framework based on the Dempster-Shafer theory of evidence, word embedding, and k-means clustering for the user profiling problem, which is capable of not only working well with short texts but also dealing with the uncertainty inherent in user texts. The proposed framework is essentially composed of three phases: (1) learning abstract concepts at multiple levels of abstraction from user corpora; (2) evidential inference and combination for user modeling; and (3) user profile extraction. In the first phase, a word embedding technique is used to convert preprocessed texts into vectors that capture the semantics of words in the user corpus, and then k-means clustering is used to learn abstract concepts at multiple levels of abstraction, each of which reflects appropriate semantics of user profiles. In the second phase, by considering each document in the user corpus as an evidential source that carries some partial information for inferring user profiles, we first infer a mass function associated with each user document by maximum a posteriori estimation, and then apply Dempster's rule of combination to fuse all documents' mass functions into an overall one for the user corpus. Finally, in the third phase, we apply the so-called pignistic probability principle to extract the top-n keywords from the user's overall mass function to define the user profile. Thanks to its ability to combine pieces of information from many documents, the proposed framework is flexible enough to scale when input data come not only from multiple modes but also from different sources in web environments. Besides, the resulting profiles are interpretable, visualizable, and compatible with practical applications. The effectiveness of the proposed framework is validated by experimental studies conducted on datasets crawled from Twitter and Facebook.
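The second and third phases described in the abstract (fusing per-document mass functions with Dempster's rule, then ranking keywords by pignistic probability) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `dempster_combine`, `pignistic`, and `top_n`, the three-keyword frame, and the mass values in `doc1` and `doc2` are all hypothetical and chosen only for illustration. In the paper the mass functions are inferred by maximum a posteriori estimation, which is not reproduced here.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Fuse two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Mass of agreeing focal sets goes to their intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass assigned to the empty set counts as conflict.
            conflict += wa * wb
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise so the combined masses sum to one.
    return {s: w / norm for s, w in combined.items()}


def pignistic(m):
    """Pignistic transform: split each focal set's mass evenly over its elements."""
    betp = {}
    for focal, w in m.items():
        share = w / len(focal)
        for element in focal:
            betp[element] = betp.get(element, 0.0) + share
    return betp


def top_n(m, n):
    """Return the n elements with the highest pignistic probability."""
    betp = pignistic(m)
    return sorted(betp, key=betp.get, reverse=True)[:n]


# Hypothetical frame of candidate profile keywords and two document-level
# mass functions (illustrative values, not taken from the paper).
frame = frozenset({"sports", "music", "travel"})
doc1 = {frozenset({"sports"}): 0.6, frame: 0.4}
doc2 = {frozenset({"sports", "music"}): 0.5, frame: 0.5}

overall = dempster_combine(doc1, doc2)
profile = top_n(overall, 2)
```

Mass left on the whole frame represents ignorance, which is how a short, uninformative document can contribute little without distorting the fused result. Combining more than two documents amounts to applying `dempster_combine` repeatedly, for example with `functools.reduce`.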
Industrial Classification
Knowledge Taxonomy Level 1
Knowledge Taxonomy Level 2
Knowledge Taxonomy Level 3
Funding Sponsor
US Office of Naval Research Global [N62909-19-1-2031]
License
CC BY-NC-ND
Rights
Authors
Publication Source
WOS