Churn Prediction Model in Retail Banking Using Fuzzy C-Means Clustering |
|
Course URL: | http://videolectures.net/sikdd08_popovic_cpm/ |
Lecturer: | Džulijana Popović |
Institution: | LinkedIn Corporation |
Date: | 2008-11-07 |
Language: | English |
Course description: | The paper presents a model based on fuzzy methods for churn prediction in retail banking. The study was carried out on real, anonymised data of 5000 clients of a retail bank. The use of real data is a major strength of the study, since many studies rely on old, irrelevant or artificial data. Canonical discriminant analysis was applied to reveal the variables that provide maximal separation between the clusters of churners and non-churners. A combination of standard deviation, canonical discriminant analysis and k-means clustering results was used for outlier detection. Because practical customer relationship management problems are fuzzy in nature, fuzzy methods were expected, and shown, to perform better than classical ones. Based on preliminary data exploration and on fuzzy clustering with different values of the input parameters of the fuzzy c-means algorithm, the best parameter combination was chosen and applied to the training data set. Four different prediction models, called prediction engines, were developed. The definitions of clients in fuzzy transitional conditions and of the distance of k instances fuzzy sums were introduced. The prediction engine using these sums performed best in churn prediction, on both balanced and non-balanced test sets. |
Keywords: | Computer science; fuzzy logic; clustering algorithms |
Source: | VideoLectures.NET open course |
Last reviewed: | 2020-06-04:dingaq |
Views: | 54 |
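
The description above names the fuzzy c-means algorithm without showing how it assigns clients to clusters. The following is a minimal sketch of the standard textbook algorithm, not the lecture's implementation. The function name `fuzzy_c_means`, the fuzzifier value `m = 2.0`, and the two-feature toy data are all assumptions made for illustration; the paper's variables, parameter choices, and prediction engines are not reproduced here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships).

    m is the fuzzifier (must be > 1). memberships[i][k] is the degree
    to which point i belongs to cluster k; each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Each center is the mean of all points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])

        # Memberships are recomputed from relative distances to all centers.
        exponent = 2.0 / (m - 1.0)
        new_u = []
        for p in points:
            dists = [max(euclidean(p, c), 1e-12) for c in centers]
            new_u.append([
                1.0 / sum((dists[k] / dists[j]) ** exponent
                          for j in range(n_clusters))
                for k in range(n_clusters)
            ])

        # Stop once no membership changes by more than tol.
        change = max(abs(a - b)
                     for old, new in zip(u, new_u)
                     for a, b in zip(old, new))
        u = new_u
        if change < tol:
            break
    return centers, u


# Hypothetical toy data: two tight groups plus one point between them.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
          (2.5, 2.6)]
centers, u = fuzzy_c_means(points)
for p, row in zip(points, u):
    print(p, [round(v, 3) for v in row])
```

Unlike k-means, every point receives a graded membership in each cluster, so a point lying between the two groups ends up with memberships that are split rather than forced into one cluster. That graded output is what makes a fuzzy approach a natural fit for clients who are neither clear churners nor clear non-churners.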