SPTF: A Scalable Probabilistic Tensor Factorization Model for Semantic-Aware Behavior Prediction
Author(s)
Chen, Hongxu
Sun, Xiaoshuai
Wang, Hao
Wang, Yang
Quoc, Viet Hung Nguyen
Editor(s)
Raghavan, V
Aluru, S
Karypis, G
Miele, L
Wu, X
Location
New Orleans, LA
Abstract
With the rapid rise of e-commerce and social network platforms, users are generating large amounts of heterogeneous behavior data, such as purchase history, add-to-favorite, add-to-cart, and click activities. This kind of user behavior data is usually binary, reflecting only a user's action or inaction (i.e., implicit feedback data). Tensor factorization is a promising means of modeling heterogeneous user behaviors by distinguishing different behavior types. However, ambiguity arises in interpreting the unobserved user behavior records, which mix real negative examples with potential positive examples. Existing tensor factorization models either ignore unobserved examples or treat all of them as negative examples, leading to either poor prediction performance or huge computation cost. In addition, the distribution of positive examples w.r.t. behavior types is heavily skewed, so existing tensor factorization models are biased towards the behavior types with a large number of positive examples. In this paper, we propose a scalable probabilistic tensor factorization model (SPTF) for heterogeneous behavior data and develop a novel negative sampling technique that optimizes SPTF by leveraging both observed and unobserved examples, with much lower computational cost and higher modeling accuracy. To overcome the heavy skewness of the behavior data distribution, we propose a novel adaptive ranking-based positive sampling approach that speeds up model convergence and improves the prediction accuracy for sparse behavior types. These optimization techniques enable SPTF to scale to large behavior datasets. Extensive experiments have been conducted on a large-scale e-commerce dataset, and the experimental results show the superiority of our proposed SPTF model in terms of prediction accuracy and scalability.
Conference Title
2017 17th IEEE International Conference on Data Mining (ICDM)
Subject
Pattern recognition
Numerical computation and mathematical software