2020, Issue 5

Human Resource Recommendation Using K-means Clustering Algorithm Based on Spark Platform


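Abstract:

To improve the efficiency of human resource recommendation systems, a K-means clustering algorithm based on the Spark platform is proposed for human resource recommendation. The Spark platform runs the clustering iterations on in-memory resilient distributed datasets held across all nodes of the distributed system, which speeds up clustering. The K-means clustering algorithm is combined with the idea of clusters to improve clustering efficiency on large-scale data samples; once the clustering results are obtained, a dynamic recommendation algorithm delivers real-time human resource recommendations. The results show that the Spark platform computes clusterings more efficiently than a single machine, that the proposed algorithm outperforms single-machine K-means clustering in both clustering speed and accuracy, and that its dynamic recommendation performance is also better than that of commonly used recommendation algorithms.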
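
The partitioned iteration described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' implementation and it does not use Spark: plain Python lists stand in for the partitions of a resilient distributed dataset, and the names `nearest`, `partial_sums`, and `kmeans` are hypothetical helpers introduced only for this illustration. Each partition computes per-cluster sums and counts (the "map" side), the partial results are merged (the "reduce" side), and the centers are updated until they stop moving.

```python
def nearest(point, centers):
    """Return the index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
    )


def partial_sums(partition, centers):
    """Map side: per-cluster coordinate sums and point counts for one partition."""
    acc = {}
    for point in partition:
        k = nearest(point, centers)
        sums, count = acc.get(k, ([0.0] * len(point), 0))
        acc[k] = ([a + b for a, b in zip(sums, point)], count + 1)
    return acc


def kmeans(partitions, centers, max_iter=20, tol=1e-9):
    """Iterate assignment and update steps over partitioned data (assumed layout)."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        # Reduce side: merge the partial sums produced by every partition.
        merged = {}
        for part in partitions:
            for k, (sums, count) in partial_sums(part, centers).items():
                m_sums, m_count = merged.get(k, ([0.0] * len(sums), 0))
                merged[k] = ([a + b for a, b in zip(m_sums, sums)], m_count + count)
        # Update step: new center is the mean; an empty cluster keeps its old center.
        new_centers = [
            [v / merged[k][1] for v in merged[k][0]] if k in merged else centers[k]
            for k in range(len(centers))
        ]
        # Largest distance any center moved in this iteration.
        shift = max(
            sum((a - b) ** 2 for a, b in zip(old, new)) ** 0.5
            for old, new in zip(centers, new_centers)
        )
        centers = new_centers
        if shift <= tol:
            break
    return centers


if __name__ == "__main__":
    # Two toy "partitions" of 2-D points; the values are made up for illustration.
    partitions = [
        [(1.0, 1.0), (1.2, 0.8)],
        [(5.0, 5.0), (5.2, 4.8), (0.8, 1.2)],
    ]
    print(kmeans(partitions, [(0.0, 0.0), (6.0, 6.0)]))
```

Exchanging only per-cluster sums and counts between partitions, rather than the points themselves, is what keeps each iteration cheap when the data is spread across many nodes; on Spark the merge would typically be a reduce-by-key over cached data, which is an assumption about the setup rather than a detail stated in the abstract.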

Keywords: Spark platform; human resource recommendation; K-means clustering; clusters; dynamic recommendation

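Foundation: National Key Research and Development Program of China (2017YFF0106407); National Natural Science Foundation of China (61672077, 61702026); Hunan Provincial Teaching Reform Research Project for Regular Institutions of Higher Education (20161023)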

Authors: LI Yuxiang (李宇翔), LI Shuai (李帅), SONG Yanqiong (宋艳琼), ZHANG Fuquan (张福泉), ZHOU Xiangzhen (周湘贞)

DOI: 10.13349/j.cnki.jdxbn.20200515.003
