针对不同类型客户,进行精准营销,实现利润最大化。
建立客户价值评估模型,进行客户分类,是解决问题的办法。
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans #导入K均值聚类算法
import matplotlib.pyplot as plt
首先,明确目标是客户价值识别。
识别客户价值,应用最广泛的模型是三个指标(消费时间间隔(Recency),消费频率(Frequency),消费金额(Monetary))
以上指标简称RFM模型,作用是识别高价值的客户
消费金额,一般表示一段时间内,消费的总额。但是,因为航空票价收到距离和舱位等级的影响,同样金额对航空公司价值不同。
因此,需要修改指标。选定变量,舱位因素=舱位所对应的折扣系数的平均值=C,距离因素=一定时间内积累的飞行里程=M。
再考虑到,航空公司的会员系统,用户的入会时间长短能在一定程度上影响客户价值,所以增加指标L=入会时间长度=客户关系长度
总共确定了五个指标,消费时间间隔R,客户关系长度L,消费频率F,飞行里程M和折扣系数的平均值C
以上指标,作为航空公司识别客户价值指标,记为LRFMC模型
1、从航空公司,选择性抽取与新增数据抽取,形成历史数据和增量数据。
2、对步骤一的两个数据,进行数据探索性分析和预处理,主要有缺失值与异常值的分析处理,属性规约、清洗和变换。
3、利用步骤2中的已处理数据作为建模数据,基于旅客价值的LRFMC模型进行客户分群,对各个客户群再进行特征分析,识别有价值客户。
4、针对模型结果得到不同价值的客户,采用不同的营销手段,指定定制化的营销服务,或者针对性的优惠与关怀。(重点维护老客户)
pd.set_option('display.max_columns', None) # 强制显示所有属性值
# 读取原始数据,指定UTF-8编码(需要用文本编辑器将数据装换为UTF-8编码)
data = pd.read_csv('./data/air_data.csv', encoding = 'utf-8')
# 包括对数据的基本描述,percentiles参数是指定计算多少的分位数表(如1/4分位数、中位数等);T是转置,转置后更方便查阅
explore = data.describe(percentiles = [], include = 'all').T
explore.head()
count | unique | top | freq | mean | std | min | 50% | max | |
---|---|---|---|---|---|---|---|---|---|
MEMBER_NO | 62988 | NaN | NaN | NaN | 31494.5 | 18183.2 | 1 | 31494.5 | 62988 |
FFP_DATE | 62988 | 3068 | 2011/01/13 | 184 | NaN | NaN | NaN | NaN | NaN |
FIRST_FLIGHT_DATE | 62988 | 3406 | 2013/02/16 | 96 | NaN | NaN | NaN | NaN | NaN |
GENDER | 62985 | 2 | 男 | 48134 | NaN | NaN | NaN | NaN | NaN |
FFP_TIER | 62988 | NaN | NaN | NaN | 4.10216 | 0.373856 | 4 | 4 | 6 |
data.shape
(62988, 44)
data.head()
MEMBER_NO | FFP_DATE | FIRST_FLIGHT_DATE | GENDER | FFP_TIER | WORK_CITY | WORK_PROVINCE | WORK_COUNTRY | AGE | LOAD_TIME | FLIGHT_COUNT | BP_SUM | EP_SUM_YR_1 | EP_SUM_YR_2 | SUM_YR_1 | SUM_YR_2 | SEG_KM_SUM | WEIGHTED_SEG_KM | LAST_FLIGHT_DATE | AVG_FLIGHT_COUNT | AVG_BP_SUM | BEGIN_TO_FIRST | LAST_TO_END | AVG_INTERVAL | MAX_INTERVAL | ADD_POINTS_SUM_YR_1 | ADD_POINTS_SUM_YR_2 | EXCHANGE_COUNT | avg_discount | P1Y_Flight_Count | L1Y_Flight_Count | P1Y_BP_SUM | L1Y_BP_SUM | EP_SUM | ADD_Point_SUM | Eli_Add_Point_Sum | L1Y_ELi_Add_Points | Points_Sum | L1Y_Points_Sum | Ration_L1Y_Flight_Count | Ration_P1Y_Flight_Count | Ration_P1Y_BPS | Ration_L1Y_BPS | Point_NotFlight | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 54993 | 2006/11/02 | 2008/12/24 | 男 | 6 | . | 北京 | CN | 31.0 | 2014/03/31 | 210 | 505308 | 0 | 74460 | 239560.0 | 234188.0 | 580717 | 558440.14 | 2014/03/31 | 26.250 | 63163.500 | 2 | 1 | 3.483254 | 18 | 3352 | 36640 | 34 | 0.961639 | 103 | 107 | 246197 | 259111 | 74460 | 39992 | 114452 | 111100 | 619760 | 370211 | 0.509524 | 0.490476 | 0.487221 | 0.512777 | 50 |
1 | 28065 | 2007/02/19 | 2007/08/03 | 男 | 6 | NaN | 北京 | CN | 42.0 | 2014/03/31 | 140 | 362480 | 0 | 41288 | 171483.0 | 167434.0 | 293678 | 367777.20 | 2014/03/25 | 17.500 | 45310.000 | 2 | 7 | 5.194245 | 17 | 0 | 12000 | 29 | 1.252314 | 68 | 72 | 177358 | 185122 | 41288 | 12000 | 53288 | 53288 | 415768 | 238410 | 0.514286 | 0.485714 | 0.489289 | 0.510708 | 33 |
2 | 55106 | 2007/02/01 | 2007/08/30 | 男 | 6 | . | 北京 | CN | 40.0 | 2014/03/31 | 135 | 351159 | 0 | 39711 | 163618.0 | 164982.0 | 283712 | 355966.50 | 2014/03/21 | 16.875 | 43894.875 | 10 | 11 | 5.298507 | 18 | 3491 | 12000 | 20 | 1.254676 | 65 | 70 | 169072 | 182087 | 39711 | 15491 | 55202 | 51711 | 406361 | 233798 | 0.518519 | 0.481481 | 0.481467 | 0.518530 | 26 |
3 | 21189 | 2008/08/22 | 2008/08/23 | 男 | 5 | Los Angeles | CA | US | 64.0 | 2014/03/31 | 23 | 337314 | 0 | 34890 | 116350.0 | 125500.0 | 281336 | 306900.88 | 2013/12/26 | 2.875 | 42164.250 | 21 | 97 | 27.863636 | 73 | 0 | 0 | 11 | 1.090870 | 13 | 10 | 186104 | 151210 | 34890 | 0 | 34890 | 34890 | 372204 | 186100 | 0.434783 | 0.565217 | 0.551722 | 0.448275 | 12 |
4 | 39546 | 2009/04/10 | 2009/04/15 | 男 | 6 | 贵阳 | 贵州 | CN | 48.0 | 2014/03/31 | 152 | 273844 | 0 | 42265 | 124560.0 | 130702.0 | 309928 | 300834.06 | 2014/03/27 | 19.000 | 34230.500 | 3 | 5 | 4.788079 | 47 | 0 | 22704 | 27 | 0.970658 | 71 | 81 | 128448 | 145396 | 42265 | 22704 | 64969 | 64969 | 338813 | 210365 | 0.532895 | 0.467105 | 0.469054 | 0.530943 | 39 |
这里只选取部分探索结果。
describe()函数自动计算的字段有count(非空值数)、unique(唯一值数)、top(频数最高者)、freq(最高频数)、
mean(平均值)、std(方差)、min(最小值)、50%(中位数)、max(最大值)
explore['null'] = len(data)-explore['count'] # describe()函数自动计算非空值数,需要手动计算空值数
explore = explore[['null', 'max', 'min']]
explore.columns = [u'空值数', u'最大值', u'最小值'] # 表头重命名
print('-----------------------------------------------------------------以下是处理后数据')
explore
-----------------------------------------------------------------以下是处理后数据
空值数 | 最大值 | 最小值 | |
---|---|---|---|
MEMBER_NO | 0 | 62988 | 1 |
FFP_DATE | 0 | NaN | NaN |
FIRST_FLIGHT_DATE | 0 | NaN | NaN |
GENDER | 3 | NaN | NaN |
FFP_TIER | 0 | 6 | 4 |
WORK_CITY | 2269 | NaN | NaN |
WORK_PROVINCE | 3245 | NaN | NaN |
WORK_COUNTRY | 26 | NaN | NaN |
AGE | 420 | 110 | 6 |
LOAD_TIME | 0 | NaN | NaN |
FLIGHT_COUNT | 0 | 213 | 2 |
BP_SUM | 0 | 505308 | 0 |
EP_SUM_YR_1 | 0 | 0 | 0 |
EP_SUM_YR_2 | 0 | 74460 | 0 |
SUM_YR_1 | 551 | 239560 | 0 |
SUM_YR_2 | 138 | 234188 | 0 |
SEG_KM_SUM | 0 | 580717 | 368 |
WEIGHTED_SEG_KM | 0 | 558440 | 0 |
LAST_FLIGHT_DATE | 0 | NaN | NaN |
AVG_FLIGHT_COUNT | 0 | 26.625 | 0.25 |
AVG_BP_SUM | 0 | 63163.5 | 0 |
BEGIN_TO_FIRST | 0 | 729 | 0 |
LAST_TO_END | 0 | 731 | 1 |
AVG_INTERVAL | 0 | 728 | 0 |
MAX_INTERVAL | 0 | 728 | 0 |
ADD_POINTS_SUM_YR_1 | 0 | 600000 | 0 |
ADD_POINTS_SUM_YR_2 | 0 | 728282 | 0 |
EXCHANGE_COUNT | 0 | 46 | 0 |
avg_discount | 0 | 1.5 | 0 |
P1Y_Flight_Count | 0 | 118 | 0 |
L1Y_Flight_Count | 0 | 111 | 0 |
P1Y_BP_SUM | 0 | 246197 | 0 |
L1Y_BP_SUM | 0 | 259111 | 0 |
EP_SUM | 0 | 74460 | 0 |
ADD_Point_SUM | 0 | 984938 | 0 |
Eli_Add_Point_Sum | 0 | 984938 | 0 |
L1Y_ELi_Add_Points | 0 | 728282 | 0 |
Points_Sum | 0 | 985572 | 0 |
L1Y_Points_Sum | 0 | 728282 | 0 |
Ration_L1Y_Flight_Count | 0 | 1 | 0 |
Ration_P1Y_Flight_Count | 0 | 1 | 0 |
Ration_P1Y_BPS | 0 | 0.999989 | 0 |
Ration_L1Y_BPS | 0 | 0.999993 | 0 |
Point_NotFlight | 0 | 140 | 0 |
data = pd.read_csv('./data/air_data.csv', encoding = 'utf-8')
data = data[data['SUM_YR_1'].notnull() & data['SUM_YR_2'].notnull()] #票价非空值才保留
#只保留票价非零的,或者平均折扣率与总飞行公里数同时为0的记录。
index1 = data['SUM_YR_1'] != 0
index2 = data['SUM_YR_2'] != 0
index3 = (data['SEG_KM_SUM'] == 0) & (data['avg_discount'] == 0) #该规则是“与”,书上给的代码无法正常运行,修改'*'为'&'
data = data[index1 | index2 | index3] #该规则是“或”
data.head()
MEMBER_NO | FFP_DATE | FIRST_FLIGHT_DATE | GENDER | FFP_TIER | WORK_CITY | WORK_PROVINCE | WORK_COUNTRY | AGE | LOAD_TIME | FLIGHT_COUNT | BP_SUM | EP_SUM_YR_1 | EP_SUM_YR_2 | SUM_YR_1 | SUM_YR_2 | SEG_KM_SUM | WEIGHTED_SEG_KM | LAST_FLIGHT_DATE | AVG_FLIGHT_COUNT | AVG_BP_SUM | BEGIN_TO_FIRST | LAST_TO_END | AVG_INTERVAL | MAX_INTERVAL | ADD_POINTS_SUM_YR_1 | ADD_POINTS_SUM_YR_2 | EXCHANGE_COUNT | avg_discount | P1Y_Flight_Count | L1Y_Flight_Count | P1Y_BP_SUM | L1Y_BP_SUM | EP_SUM | ADD_Point_SUM | Eli_Add_Point_Sum | L1Y_ELi_Add_Points | Points_Sum | L1Y_Points_Sum | Ration_L1Y_Flight_Count | Ration_P1Y_Flight_Count | Ration_P1Y_BPS | Ration_L1Y_BPS | Point_NotFlight | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 54993 | 2006/11/02 | 2008/12/24 | 男 | 6 | . | 北京 | CN | 31.0 | 2014/03/31 | 210 | 505308 | 0 | 74460 | 239560.0 | 234188.0 | 580717 | 558440.14 | 2014/03/31 | 26.250 | 63163.500 | 2 | 1 | 3.483254 | 18 | 3352 | 36640 | 34 | 0.961639 | 103 | 107 | 246197 | 259111 | 74460 | 39992 | 114452 | 111100 | 619760 | 370211 | 0.509524 | 0.490476 | 0.487221 | 0.512777 | 50 |
1 | 28065 | 2007/02/19 | 2007/08/03 | 男 | 6 | NaN | 北京 | CN | 42.0 | 2014/03/31 | 140 | 362480 | 0 | 41288 | 171483.0 | 167434.0 | 293678 | 367777.20 | 2014/03/25 | 17.500 | 45310.000 | 2 | 7 | 5.194245 | 17 | 0 | 12000 | 29 | 1.252314 | 68 | 72 | 177358 | 185122 | 41288 | 12000 | 53288 | 53288 | 415768 | 238410 | 0.514286 | 0.485714 | 0.489289 | 0.510708 | 33 |
2 | 55106 | 2007/02/01 | 2007/08/30 | 男 | 6 | . | 北京 | CN | 40.0 | 2014/03/31 | 135 | 351159 | 0 | 39711 | 163618.0 | 164982.0 | 283712 | 355966.50 | 2014/03/21 | 16.875 | 43894.875 | 10 | 11 | 5.298507 | 18 | 3491 | 12000 | 20 | 1.254676 | 65 | 70 | 169072 | 182087 | 39711 | 15491 | 55202 | 51711 | 406361 | 233798 | 0.518519 | 0.481481 | 0.481467 | 0.518530 | 26 |
3 | 21189 | 2008/08/22 | 2008/08/23 | 男 | 5 | Los Angeles | CA | US | 64.0 | 2014/03/31 | 23 | 337314 | 0 | 34890 | 116350.0 | 125500.0 | 281336 | 306900.88 | 2013/12/26 | 2.875 | 42164.250 | 21 | 97 | 27.863636 | 73 | 0 | 0 | 11 | 1.090870 | 13 | 10 | 186104 | 151210 | 34890 | 0 | 34890 | 34890 | 372204 | 186100 | 0.434783 | 0.565217 | 0.551722 | 0.448275 | 12 |
4 | 39546 | 2009/04/10 | 2009/04/15 | 男 | 6 | 贵阳 | 贵州 | CN | 48.0 | 2014/03/31 | 152 | 273844 | 0 | 42265 | 124560.0 | 130702.0 | 309928 | 300834.06 | 2014/03/27 | 19.000 | 34230.500 | 3 | 5 | 4.788079 | 47 | 0 | 22704 | 27 | 0.970658 | 71 | 81 | 128448 | 145396 | 42265 | 22704 | 64969 | 64969 | 338813 | 210365 | 0.532895 | 0.467105 | 0.469054 | 0.530943 | 39 |
index1.head()
0 True 1 True 2 True 3 True 4 True Name: SUM_YR_1, dtype: bool
type(index1)
pandas.core.series.Series
data.shape # 少了大概800多条
(62044, 44)
pd.to_datetime(data['FFP_DATE']).head()
0 2006-11-02 1 2007-02-19 2 2007-02-01 3 2008-08-22 4 2009-04-10 Name: FFP_DATE, dtype: datetime64[ns]
data['FFP_DATE'].head()
0 2006/11/02 1 2007/02/19 2 2007/02/01 3 2008/08/22 4 2009/04/10 Name: FFP_DATE, dtype: object
意思是,将原始数据转换成“适当”的格式,用来适应算法和分析等等的需要。
本案例,主要采用数据变换的方式为属性构造和数据标准化 3.需要构造LRFMC的五个指标
L=LOAD_TIME-FFP_DATE(会员入会时间距观测窗口结束的月数=观测窗口的结束时间-入会时间(单位:月))
R=LAST_TO_END(客户最近一次乘坐公司距观测窗口结束的月数=最后一次。。。)
F=FLIGHT_COUNT(观测窗口内的飞行次数)
M=SEG_KM_SUM(观测窗口的总飞行里程)
C=AVG_DISCOUNT(评价折扣率)
def reduction_data(data):
data = data[['LOAD_TIME', 'FFP_DATE', 'LAST_TO_END', 'FLIGHT_COUNT', 'SEG_KM_SUM', 'avg_discount']]
# data['L']=pd.datetime(data['LOAD_TIME'])-pd.datetime(data['FFP_DATE'])
# data['L']=int(((parse(data['LOAD_TIME'])-parse(data['FFP_ADTE'])).days)/30)
d_ffp = pd.to_datetime(data['FFP_DATE'])
d_load = pd.to_datetime(data['LOAD_TIME'])
res = d_load - d_ffp
data2=data.copy()
data2['L'] = res.map(lambda x: x / np.timedelta64(30 * 24 * 60, 'm'))
data2['R'] = data['LAST_TO_END']
data2['F'] = data['FLIGHT_COUNT']
data2['M'] = data['SEG_KM_SUM']
data2['C'] = data['avg_discount']
data3 = data2[['L', 'R', 'F', 'M', 'C']]
return data3
data3 = reduction_data(data)
data3.head()
L | R | F | M | C | |
---|---|---|---|---|---|
0 | 90.200000 | 1 | 210 | 580717 | 0.961639 |
1 | 86.566667 | 7 | 140 | 293678 | 1.252314 |
2 | 87.166667 | 11 | 135 | 283712 | 1.254676 |
3 | 68.233333 | 97 | 23 | 281336 | 1.090870 |
4 | 60.533333 | 5 | 152 | 309928 | 0.970658 |
标准分数是一种不受原始测量单位影响的数值。其作用除了能够表明原数据在其分布中的位置外,还能对未来不能直接比较的各种不同单位的数据进行比较。如比较各个学生的成绩在班级成绩中的位置或比较某个学生在两种或多种测验中所得分数的优劣。
def zscore_data(data):
data = (data - data.mean(axis=0)) / data.std(axis=0)
data.columns = ['Z' + i for i in data.columns]
return data
data4=zscore_data(data3)
data4.head()
ZL | ZR | ZF | ZM | ZC | |
---|---|---|---|---|---|
0 | 1.435707 | -0.944948 | 14.034016 | 26.761154 | 1.295540 |
1 | 1.307152 | -0.911894 | 9.073213 | 13.126864 | 2.868176 |
2 | 1.328381 | -0.889859 | 8.718869 | 12.653481 | 2.880950 |
3 | 0.658476 | -0.416098 | 0.781585 | 12.540622 | 1.994714 |
4 | 0.386032 | -0.922912 | 9.923636 | 13.898736 | 1.344335 |
res.map(lambda x: x / np.timedelta64(30 * 24 * 60, 'm'))
0 90.200000 1 86.566667 2 87.166667 3 68.233333 4 60.533333 5 74.700000 6 97.700000 7 48.400000 8 34.266667 9 45.500000 10 40.966667 11 114.166667 12 89.500000 13 90.466667 14 50.633333 15 73.133333 16 45.166667 17 41.233333 18 97.200000 19 31.500000 20 69.000000 21 34.266667 22 109.233333 23 48.266667 24 50.200000 25 42.133333 26 87.166667 27 112.066667 28 100.166667 29 70.466667 ... 62948 12.833333 62949 24.400000 62950 37.266667 62951 13.966667 62952 17.633333 62953 68.700000 62954 76.800000 62956 107.966667 62957 21.433333 62958 13.400000 62959 13.100000 62960 74.600000 62961 52.333333 62962 32.933333 62963 113.066667 62964 14.133333 62965 64.733333 62966 20.900000 62967 23.933333 62968 45.433333 62969 17.466667 62970 110.233333 62971 13.733333 62972 39.066667 62973 12.233333 62974 108.300000 62975 65.366667 62976 45.400000 62977 15.533333 62978 36.066667 Length: 62044, dtype: float64
data = data4
k = 5 #需要进行的聚类类别数
#调用k-means算法,进行聚类分析
kmodel = KMeans(n_clusters = k, n_jobs = 1) #n_jobs是并行数,一般等于CPU数较好
kmodel.fit(data) #训练模型
r1 = pd.Series(kmodel.labels_).value_counts()
r2 = pd.DataFrame(kmodel.cluster_centers_)
r = pd.concat([r2, r1], axis=1)
r.columns = list(data.columns) + ['类别数目']
# print(r)
# r.to_excel(classoutfile,index=False)
r = pd.concat([data, pd.Series(kmodel.labels_, index=data.index)], axis=1)
r.columns = list(data.columns) + ['聚类类别']
print(kmodel.cluster_centers_)
print(kmodel.labels_)
r
[[-0.70022909 -0.41487223 -0.16115608 -0.16097537 -0.25507251] [ 0.48332845 -0.79938326 2.4832016 2.42472391 0.30863003] [ 1.16063114 -0.37723101 -0.08693538 -0.09485761 -0.15582812] [-0.31357211 1.68622472 -0.57397357 -0.53680287 -0.17306867] [ 0.05189162 -0.00324595 -0.22666647 -0.23105177 2.19223968]] [1 1 1 ... 0 3 3]
ZL | ZR | ZF | ZM | ZC | 聚类类别 | |
---|---|---|---|---|---|---|
0 | 1.435707 | -0.944948 | 14.034016 | 26.761154 | 1.295540 | 1 |
1 | 1.307152 | -0.911894 | 9.073213 | 13.126864 | 2.868176 | 1 |
2 | 1.328381 | -0.889859 | 8.718869 | 12.653481 | 2.880950 | 1 |
3 | 0.658476 | -0.416098 | 0.781585 | 12.540622 | 1.994714 | 1 |
4 | 0.386032 | -0.922912 | 9.923636 | 13.898736 | 1.344335 | 1 |
5 | 0.887281 | -0.515257 | 5.671519 | 13.169947 | 1.328291 | 1 |
6 | 1.701075 | -0.944948 | 6.309337 | 12.811656 | 1.315599 | 1 |
7 | -0.043274 | -0.933930 | 4.325015 | 12.820586 | 1.297873 | 1 |
8 | -0.543344 | -0.917403 | 3.120249 | 14.447881 | 0.575103 | 1 |
9 | -0.145883 | -0.867824 | 3.687198 | 16.993157 | -0.076664 | 1 |
10 | -0.306283 | -0.829262 | 2.198957 | 11.622784 | 1.441721 | 1 |
11 | 2.283704 | -0.917403 | 9.427556 | 12.070469 | 1.246284 | 1 |
12 | 1.410940 | -0.581364 | 1.206796 | 14.449781 | 0.416304 | 1 |
13 | 1.445143 | -0.933930 | 7.514103 | 7.704099 | 3.658441 | 1 |
14 | 0.035747 | -0.939439 | 2.695037 | 12.005347 | 1.081000 | 1 |
15 | 0.831849 | -0.592381 | 0.710716 | 10.326420 | 1.644209 | 1 |
16 | -0.157677 | -0.911894 | 6.309337 | 7.358158 | 3.594292 | 1 |
17 | -0.296847 | -0.702558 | 1.986351 | 12.674761 | 0.625775 | 1 |
18 | 1.683384 | -0.939439 | 3.687198 | 7.221691 | 3.675833 | 1 |
19 | -0.641235 | -0.818244 | 1.844614 | 14.989711 | -0.075174 | 1 |
20 | 0.685602 | -0.928421 | 6.663680 | 7.115054 | 3.501664 | 1 |
21 | -0.543344 | -0.917403 | 0.781585 | 9.370202 | 1.836524 | 1 |
22 | 2.109151 | -0.823753 | 5.813256 | 13.676532 | 0.106170 | 1 |
23 | -0.047992 | -0.542802 | 0.568979 | 9.740225 | 1.529618 | 1 |
24 | 0.020414 | -0.856806 | 2.269825 | 12.564419 | 0.352363 | 1 |
25 | -0.265003 | -0.856806 | 3.545461 | 10.751685 | 0.978232 | 1 |
26 | 1.328381 | -0.933930 | 14.246621 | 8.103240 | 2.294903 | 1 |
27 | 2.209401 | -0.889859 | 4.395884 | 6.239728 | 3.847707 | 1 |
28 | 1.788351 | -0.851297 | 6.309337 | 6.735816 | 3.332678 | 1 |
29 | 0.737496 | -0.889859 | 5.317176 | 6.266756 | 3.800909 | 1 |
... | ... | ... | ... | ... | ... | ... |
62948 | -1.301705 | 1.164941 | -0.706656 | -0.805297 | 0.691541 | 3 |
62949 | -0.892450 | -0.080058 | -0.706656 | -0.779837 | -2.067702 | 0 |
62950 | -0.437197 | 0.145804 | -0.706656 | -0.805297 | 0.610387 | 0 |
62951 | -1.261605 | -0.900877 | -0.706656 | -0.775277 | -2.245592 | 0 |
62952 | -1.131870 | 1.792950 | -0.706656 | -0.786677 | -1.743085 | 3 |
62953 | 0.674987 | 0.735251 | -0.706656 | -0.765397 | -2.554628 | 3 |
62954 | 0.961584 | 2.487065 | -0.706656 | -0.805297 | 0.421027 | 3 |
62956 | 2.064333 | -0.548311 | -0.706656 | -0.779837 | -2.175908 | 2 |
62957 | -0.997417 | 0.140295 | -0.706656 | -0.805297 | 0.258719 | 0 |
62958 | -1.281655 | -0.675014 | -0.706656 | -0.805297 | 0.204616 | 0 |
62959 | -1.292270 | -0.228797 | -0.706656 | -0.777272 | -2.338216 | 0 |
62960 | 0.883743 | -0.636452 | -0.706656 | -0.753617 | -2.879244 | 2 |
62961 | 0.095897 | 1.644211 | -0.706656 | -0.805297 | 0.150513 | 3 |
62962 | -0.590521 | 1.759897 | -0.706656 | -0.805297 | 0.150513 | 3 |
62963 | 2.244783 | 0.437773 | -0.706656 | -0.805297 | 0.150513 | 2 |
62964 | -1.255708 | 1.341224 | -0.706656 | -0.805297 | 0.150513 | 3 |
62965 | 0.534637 | 1.352242 | -0.706656 | -0.805297 | 0.150513 | 3 |
62966 | -1.016288 | 1.319189 | -0.706656 | -0.805297 | 0.150513 | 3 |
62967 | -0.908961 | 2.745980 | -0.706656 | -0.770717 | -2.554628 | 3 |
62968 | -0.148241 | 1.605649 | -0.706656 | -0.770717 | -2.554628 | 3 |
62969 | -1.137767 | 1.908636 | -0.706656 | -0.770717 | -2.554628 | 3 |
62970 | 2.144533 | 2.013304 | -0.706656 | -0.770717 | -2.554628 | 3 |
62971 | -1.269861 | 1.032729 | -0.706656 | -0.770717 | -2.554628 | 3 |
62972 | -0.373509 | 0.041136 | -0.706656 | -0.731862 | -3.149758 | 0 |
62973 | -1.322935 | 1.016202 | -0.706656 | -0.786677 | -2.013599 | 3 |
62974 | 2.076128 | -0.460169 | -0.706656 | -0.805297 | -0.065898 | 2 |
62975 | 0.557046 | -0.283886 | -0.706656 | -0.805297 | -0.282309 | 2 |
62976 | -0.149421 | -0.735611 | -0.706656 | -0.772332 | -2.689885 | 0 |
62977 | -1.206173 | 1.605649 | -0.706656 | -0.779837 | -2.554628 | 3 |
62978 | -0.479656 | 0.603039 | -0.706656 | -0.786677 | -2.392319 | 3 |
62044 rows × 6 columns