The advisor-advisee relationship is one of the most important relationship types in research collaboration networks, and identifying it accurately supports research exchange and collaboration, reviewer conflict-of-interest avoidance, and similar tasks. Researchers use collaboration networks to maintain and develop working relationships and to share results, and it is generally accepted that different relationship types influence collaboration in essentially different ways. In this paper we investigate the problem of identifying social relationship types from publication networks and propose a solution that is effective and easy to compute. Starting from the co-authorship network, and from the observation that graduate students usually co-author their papers with their advisors while the reverse does not hold, we extract features that characterize the advisor-advisee relationship and propose an identification algorithm based on a maximum entropy model with feature selection. We evaluate the algorithm on DBLP publication data from 1990 to 2011. The results show that 1) the accuracy of relationship type identification exceeds 95%; 2) the mean error of the estimated end time of the advisor-advisee relationship, relative to the graduation year, is 1.39 years. Because the maximum entropy model does not require the features to be mutually independent, the method is more accurate than comparable identification algorithms. The modeling approach may also be useful for identifying other relationship types in social networks, including online social networks.
Keywords: social network, relationship identification, maximum entropy, feature selection
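The abstract does not list the features or the training procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It treats the two-class maximum entropy model as logistic regression (the two are equivalent in the binary case) and fits it with plain gradient ascent. The three features (career head start, share of the candidate student's papers co-signed by the candidate advisor, and the reverse share), the function names, and the toy publication list are all hypothetical and are not taken from the paper.

```python
import math

def pair_features(papers, a, s):
    """Hypothetical features for the ordered pair (candidate advisor a, candidate student s).

    papers is a list of (year, set_of_author_names) tuples.
    """
    papers_a = [p for p in papers if a in p[1]]
    papers_s = [p for p in papers if s in p[1]]
    joint = [p for p in papers_s if a in p[1]]
    gap = min(y for y, _ in papers_s) - min(y for y, _ in papers_a)
    return [
        1.0,                          # bias term
        gap / 10.0,                   # head start of a over s, in decades
        len(joint) / len(papers_s),   # share of s's papers co-signed by a
        len(joint) / len(papers_a),   # share of a's papers co-signed by s
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    """Probability that the ordered pair is an advisor-advisee pair."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))

def train_maxent(X, y, lr=0.5, l2=0.01, epochs=2000):
    """Binary maximum entropy (logistic regression) via gradient ascent with L2 penalty."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi)  # observed minus expected feature count
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj + lr * (g / len(X) - l2 * wj) for wj, g in zip(w, grad)]
    return w

# Toy publication network (invented for illustration only).
papers = [
    (1995, {"A", "B"}), (1997, {"A"}), (1999, {"A", "B"}),
    (2003, {"A", "S"}), (2004, {"A", "S"}), (2005, {"A", "S", "T"}),
    (2006, {"A", "T"}), (2007, {"A", "T"}),
    (2000, {"P", "Q"}), (2001, {"P"}), (2002, {"Q"}),
    (2003, {"P", "Q"}), (2004, {"P"}), (2005, {"Q"}),
]

# Labeled ordered pairs: 1 = (advisor, advisee), 0 = anything else.
labeled = [
    (("A", "S"), 1), (("S", "A"), 0),
    (("P", "Q"), 0), (("Q", "P"), 0), (("A", "B"), 0),
]
X = [pair_features(papers, a, s) for (a, s), _ in labeled]
y = [label for _, label in labeled]
w = train_maxent(X, y)

# Score an unseen pair in both directions.
p_forward = predict(w, pair_features(papers, "A", "T"))
p_reverse = predict(w, pair_features(papers, "T", "A"))
print(f"P(A advises T) = {p_forward:.3f}, P(T advises A) = {p_reverse:.3f}")
```

Note that the second and third features are strongly correlated by construction; a maximum entropy model can use both without assuming they are independent, which is the property the abstract highlights. Feature selection and the estimation of the relationship's end year are not covered by this sketch.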