TABLE II. EXPERIMENTAL RESULTS OF THREE MODELS

Model     Error    AUC
LR        0.3062   0.7268
xgboost   0.2723   0.7663
LX        0.2376   0.8087
From the table, we can see that LR has the lowest AUC and the
highest error of the three models, i.e., the poorest prediction
result. The likely reason is that LR struggles to capture non-linear
feature interactions unless they are constructed manually through
large-scale feature engineering. The error and AUC of xgboost are
clearly better than those of LR (0.2723 vs. 0.3062 and 0.7663 vs.
0.7268), which shows that xgboost performs well even with the
current simple feature engineering and suggests that it handles
non-linear data well. However, as shown in Figure 7, xgboost
over-fits, and its generalization to the test set is limited. Among
the three models, LX performs best, with the lowest error (0.2376)
and an AUC of 0.8087. This indicates that the LX model not only
extracts the non-linear features of the data but also mitigates the
over-fitting problem of xgboost. The LX model therefore combines
the advantages of the xgboost and LR models and is the strongest
classifier of the three.
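As a reading aid for the two metrics in Table II, the following minimal Python sketch computes an error rate and an AUC from predicted scores. It assumes that Error means the misclassification rate at a 0.5 threshold, which is not stated in this section, and it computes AUC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The function names and the toy labels and scores are hypothetical and are not taken from the experiments.

```python
def error_rate(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded prediction differs from the label.

    Assumption: "Error" is the misclassification rate at the given threshold.
    """
    wrong = sum(1 for y, s in zip(labels, scores) if (s >= threshold) != bool(y))
    return wrong / len(labels)


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented purely for illustration.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
print(error_rate(labels, scores))
print(auc(labels, scores))
```

The pairwise loop is quadratic in the number of examples, so it is suitable only for small checks; a rank-based formulation would be the usual choice on a full test set.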
VI. CONCLUSION AND FUTURE WORK
This paper studies the prediction of users' music
preferences in the music recommendation field. We adopt a
fusion model of xgboost and LR as our classifier and optimize
the fusion part. In terms of features, we perform feature
engineering on the user's profile, and xgboost shows these
features to be effective. Comparing the overall performance of
the different models on the test set, we find that our LX model
achieves the best results, which indicates that our method is
practical for music recommendation.
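One common way to fuse a boosted-tree model with LR is to feed the index of the leaf that each tree assigns to a sample, one-hot encoded, into the LR model as its input features. The minimal Python sketch below illustrates only that generic scheme; it is an assumption that the LX model follows it, and the optimized fusion step of this paper is not reproduced. The two hand-written trees, the feature names (play_count, user_age, song_length), and the weights are all hypothetical stand-ins for what a trained xgboost model and a fitted LR would provide.

```python
import math


# Hypothetical hand-written "trees": each maps a feature dict to a leaf index.
# In practice these would come from a trained boosted-tree model.
def tree_0(x):
    return 0 if x["play_count"] < 5 else 1


def tree_1(x):
    if x["user_age"] < 25:
        return 0
    return 1 if x["song_length"] < 240 else 2


TREES = [(tree_0, 2), (tree_1, 3)]  # (tree function, number of leaves)


def leaf_one_hot(x):
    """Concatenate one one-hot block per tree, marking the leaf that x reaches."""
    encoded = []
    for tree, n_leaves in TREES:
        block = [0.0] * n_leaves
        block[tree(x)] = 1.0
        encoded.extend(block)
    return encoded


def lr_predict(features, weights, bias):
    """Logistic regression score on the encoded leaf features."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Invented weights; a real model would learn them from the encoded training set.
WEIGHTS = [-0.8, 0.9, 0.2, -0.3, 0.6]
BIAS = -0.1

user_song = {"play_count": 12, "user_age": 30, "song_length": 200}
encoded = leaf_one_hot(user_song)
print(encoded)
print(lr_predict(encoded, WEIGHTS, BIAS))
```

In this scheme the trees supply the non-linear feature combinations, while the linear LR layer on top produces the final score.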
This paper only studies the user's behavior information and
does not incorporate the user's social network information.
Social network-based methods have been shown to improve
prediction accuracy in recommendation systems. In the future,
we will therefore try to mine users' social relationships and
combine them with our LX model to recommend music to users.
ACKNOWLEDGMENT
This work was supported in part by the National Natural
Science Foundation of China under Grant 6167060382,
61602070, and in part by the New Academic Seedling Cultivation
and Exploration Innovation Project under Grant [2017]5789-21.
IJCNN 2019. International Joint Conference on Neural Networks. Budapest, Hungary. 14-19 July 2019