Machine Learning: Basic Methods of Feature Selection

Posted by fx1118 on May 10, 00:26 | Replies (1)

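This post introduces the basic feature selection methods in machine learning: filter, wrapper, and embedded. It also borrows the most widely referenced feature_selection approach on GitHub, with some modifications on top of it.
Three modules:
1. Data acquisition: the technical factors provided by JoinQuant, used without further processing
2. Data processing: outlier removal and standardization, plus PCA dimensionality reduction. PCA is not applied to the data in this post, but the code for it is written
3. Feature selection: the three basic approaches, plus removal based on correlation coefficients and selection based on LightGBM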
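As a rough illustration of the outlier removal and standardization step in module 2, here is a minimal sketch. The post does not say which outlier rule it uses, so the median +/- n * MAD clipping below is an assumption, and the function names `winsorize_mad` and `standardize` are hypothetical.

```python
import numpy as np

def winsorize_mad(x, n=3.0):
    """Clip each column to median +/- n * 1.4826 * MAD (an assumed outlier rule)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x, axis=0)
    mad = np.median(np.abs(x - med), axis=0)
    spread = n * 1.4826 * mad
    return np.clip(x, med - spread, med + spread)

def standardize(x):
    """Z-score each column; constant columns are mapped to zero."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # avoid division by zero
    return (x - mean) / std

# Synthetic stand-in for a factor matrix (rows = days, columns = factors)
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 3))
raw[0, 0] = 50.0  # inject one extreme value
clipped = winsorize_mad(raw)
clean = standardize(clipped)
```

Clipping before standardizing keeps a single extreme value from inflating the column's standard deviation.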
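1. Why do feature selection
With a limited number of samples, building a classifier on a large number of features is computationally expensive and tends to classify poorly.
2. What feature selection means precisely
Samples in a high-dimensional space are first mapped or transformed into a lower-dimensional space to reduce dimensionality; feature selection then removes redundant and irrelevant features to reduce it further.
3. Principles of feature selection
Obtain the smallest possible feature subset without significantly lowering classification accuracy or distorting the class distribution; the subset should also be stable and adapt well.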
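The "map to a lower-dimensional space" step in item 2 is what PCA does. Below is a minimal NumPy sketch of PCA via SVD on synthetic data; the helper name `pca_reduce` is hypothetical and this is not the author's PCA code.

```python
import numpy as np

def pca_reduce(x, n_components):
    """Project x onto its first n_components principal axes.

    Returns the projected data and the explained-variance ratio of each kept axis.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return centered @ vt[:n_components].T, explained[:n_components]

# Six observed columns generated from two latent factors plus small noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
x = latent @ mixing + 0.01 * rng.normal(size=(200, 6))

z, ratio = pca_reduce(x, 2)
```

Because the six columns are driven by only two latent factors, two components capture almost all of the variance here.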
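1. Filter methods
The main idea: score each feature dimension, i.e. assign each feature a weight that represents its importance, then rank the features by weight.
Main methods:
Chi-squared test
Information gain (for details see the article "Easy-to-Learn Machine Learning Algorithms: the ID3 Decision Tree Algorithm")
Correlation coefficient scores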
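A filter method can be sketched with scikit-learn's `SelectKBest`. The example uses synthetic data and the ANOVA F-score (`f_classif`) as the scoring function, which is an assumption for illustration. Note that `chi2` in scikit-learn requires non-negative feature values, so it would not apply directly to factors that take negative values.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic classification data: 10 features, 3 of them informative
X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)

# Score every feature independently and keep the 3 highest-scoring ones
selector = SelectKBest(score_func=f_classif, k=3)
X_selected = selector.fit_transform(X, y)

# Feature indices sorted from highest to lowest score
ranking = np.argsort(selector.scores_)[::-1]
kept = selector.get_support(indices=True)
```

Each feature is scored on its own, so a filter is cheap but cannot see interactions between features.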
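2. Wrapper methods
The main idea: treat subset selection as a search and optimization problem: generate different feature combinations, evaluate each one, and compare it against the others.
Main method: recursive feature elimination (RFE)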
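Recursive feature elimination can be sketched with scikit-learn's `RFE`. The choice of `LogisticRegression` as the base estimator and the target of 4 features are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)

# Fit the model, drop the weakest feature (step=1), refit, and repeat
# until only n_features_to_select features remain
rfe = RFE(estimator=LogisticRegression(max_iter=1000),
          n_features_to_select=4, step=1)
rfe.fit(X, y)

selected_mask = rfe.support_   # boolean mask of the kept features
elimination_rank = rfe.ranking_  # 1 = kept; larger = eliminated earlier
X_reduced = rfe.transform(X)
```

Because the model is refit after every elimination, a wrapper costs far more than a filter but takes feature interactions into account.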
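3. Embedded methods
The main idea: with the model fixed, learn which attributes contribute most to its accuracy. The number of features kept is controlled by comparing each feature_importance against a threshold. For L1-regularized models the default threshold is 1e-5; for other estimators the threshold can be set explicitly (for example to "median"), and SelectFromModel falls back to "mean" when none is given.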
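The threshold behaviour of scikit-learn's `SelectFromModel` can be checked directly. When `threshold` is left as `None`, an L1-penalised estimator gets a threshold of 1e-5 and other estimators get the mean importance; "median" has to be requested explicitly. The estimators and hyperparameters below are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)

# L1-penalised linear model: coefficients driven to (near) zero are dropped
l1_selector = SelectFromModel(
    LinearSVC(C=0.05, penalty="l1", dual=False, max_iter=5000)
).fit(X, y)

# Tree ensemble with no explicit threshold: the mean importance is used
rf_default = SelectFromModel(
    RandomForestClassifier(n_estimators=50, random_state=0)
).fit(X, y)

# Tree ensemble with an explicit "median" threshold
rf_median = SelectFromModel(
    RandomForestClassifier(n_estimators=50, random_state=0),
    threshold="median",
).fit(X, y)

print(l1_selector.threshold_, rf_default.threshold_, rf_median.threshold_)
print(rf_median.get_support())
```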
GitHub link: https://github.com/WillKoehrsen/feature-selector
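The correlation-based removal mentioned in the overview can be sketched in a few lines of pandas. This is a simplified illustration of the idea, not the linked project's code; the function name `drop_collinear` and the 0.95 threshold are hypothetical.

```python
import numpy as np
import pandas as pd

def drop_collinear(df, threshold=0.95):
    """Drop one column from every pair whose |Pearson correlation| exceeds threshold."""
    corr = df.corr().abs()
    # Keep only the strict upper triangle so each pair is examined once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop), to_drop

# Synthetic frame: "b" is nearly a copy of "a", "c" is independent
rng = np.random.default_rng(0)
base = rng.normal(size=500)
frame = pd.DataFrame({
    "a": base,
    "b": base + rng.normal(scale=0.01, size=500),
    "c": rng.normal(size=500),
})

reduced, dropped = drop_collinear(frame)
```

Of each highly correlated pair, the column that appears later in the frame is the one removed, so column order affects which feature survives.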

import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest,SelectPercentile,SelectFromModel,chi2,f_classif,mutual_info_classif,RFE
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor,RandomForestClassifier
from sklearn.svm import SVC,LinearSVC,LinearSVR,SVR
from sklearn.tree import DecisionTreeClassifier
import lightgbm as lgb
from sklearn.model_selection import train_test_split
import gc
import datetime
from jqdata import *
from jqlib.technical_analysis import *

Data acquisition

start_date = '2016-01-01'
end_date = '2018-11-01'
trade_days = get_trade_days(start_date=start_date,end_date=end_date).tolist()
date = trade_days[0]

lookback = 5 # length of the LSTM time window
stocks = '000905.XSHG' # CSI 500 index
# Prepare the technical factor data

def get_factors_one_stock(stocks,date):
    '''
    Get one day's factor values for a single security
    input:
    stocks: a single security code
    date: the date
    output:
    df: dataframe, the value of each factor on that day
    '''
    if not isinstance(date, str):
        date = date.strftime('%Y-%m-%d')
    
    price = get_price(stocks,end_date=date,count=1)
    price.index = [date]
    
    accer = ACCER(stocks,check_date=date,N=5)
    accer_df = pd.DataFrame(list(accer.values()),columns=['ACCER'])
    
    # ADTM: dynamic buying/selling momentum indicator
    adtm,maadtm = ADTM(stocks, date, N = 23, M = 8)
    adtm_df = pd.DataFrame(list(adtm.values()),columns=['ADTM'])
    maadtm_df = pd.DataFrame(list(maadtm.values()),columns=['MAADTM'])

    
    # ATR: average true range
    mtr,atr = ATR(stocks, date, timeperiod=14)
    mtr_df = pd.DataFrame(list(mtr.values()),columns=['MTR'])
    atr_df = pd.DataFrame(list(atr.values()),columns=['ATR'])


    # BIAS: deviation rate
    bias,bias_ma = BIAS_QL(stocks, date, M = 6)
    bias_df = pd.DataFrame(list(bias.values()),columns=['BIAS'])


    # CCI: commodity channel index
    cci = CCI(stocks, date, N=14)
    cci_df = pd.DataFrame(list(cci.values()),columns=['CCI'])
    
    # DKX: long-short line
    dkx,madkx = DKX(stocks, date, M = 10)
    dkx_df = pd.DataFrame(list(dkx.values()),columns=['DKX'])

    # SKDJ: slow stochastic oscillator
    k,d = SKDJ(stocks, date, N = 9, M = 3)
    k_df = pd.DataFrame(list(k.values()),columns=['KBJ'])
    
    # CYE: market trend
    cye,_ = CYE(stocks, date)
    cye_df = pd.DataFrame(list(cye.values()),columns=['CYE'])
    
    # MFI: money flow index
    mfi = MFI(stocks, date, timeperiod=14)
    mfi_df = pd.DataFrame(list(mfi.values()),columns=['MFI'])

    # MTM: momentum line
    mtm = MTM(stocks, date, timeperiod=14)
    mtm_df = pd.DataFrame(list(mtm.values()),columns=['MTM'])

    
    # EMV: ease of movement
    emv,_ = EMV(stocks, date, N = 14, M = 9)
    emv_df = pd.DataFrame(list(emv.values()),columns=['EMV'])
    
    # ROC: rate of change
    roc = ROC(stocks, date, timeperiod=12)
    roc_df = pd.DataFrame(list(roc.values()),columns=['ROC'])
    
    # RSI: relative strength index
    rsi = RSI(stocks, date, N1=6)
    rsi_df = pd.DataFrame(list(rsi.values()),columns=['RSI'])
    
    # MARSI: moving average of the relative strength index
    rsi10,rsi6 = MARSI(stocks, date, M1 = 10, M2 = 6)
    rsi10_df = pd.DataFrame(list(rsi10.values()),columns=['RSI10'])
    rsi6_df = pd.DataFrame(list(rsi6.values()),columns=['RSI6'])

    
    # OSC: rate-of-change oscillator
    osc,maosc = OSC(stocks, date, N = 20, M = 6)
    osc_df = pd.DataFrame(list(osc.values()),columns=['OSC'])
    maosc_df = pd.DataFrame(list(maosc.values()),columns=['MAOSC'])

    
    # UDL: gravity line
    udl,maudl = UDL(stocks, date, N1 = 3, N2 = 5, N3 = 10, N4 = 20, M = 6)
    udl_df = pd.DataFrame(list(udl.values()),columns=['UDL'])
    maudl_df = pd.DataFrame(list(maudl.values()),columns=['MAUDL'])


    wr,mawr = WR(stocks, date, N = 10, N1 = 6)
    wr_df = pd.DataFrame(list(wr.values()),columns=['WR'])
    mawr_df = pd.DataFrame(list(mawr.values()),columns=['MAWR'])

    # FSL: watershed line
    fsl,mafsl = FSL(stocks, date)
    fsl_df = pd.DataFrame(list(fsl.values()),columns=['FSL'])
    mafsl_df = pd.DataFrame(list(mafsl.values()),columns=['MAFSL'])

    
    # Trend-type indicators
    cho,macho = CHO(stocks, date, N1 = 10, N2 = 20, M = 6)
    cho_df = pd.DataFrame(list(cho.values()),columns=['CHO'])
    macho_df = pd.DataFrame(list(macho.values()),columns=['MACHO'])

    dif,difma = DMA(stocks, date, N1 = 10, N2 = 50, M = 10)
    dif_df = pd.DataFrame(list(dif.values()),columns=['DIF'])
    difma_df = pd.DataFrame(list(difma.values()),columns=['DIFMA'])
    
   
    emv,maemv = EMV(stocks, date, N = 14, M = 9)
    emv_df = pd.DataFrame(list(emv.values()),columns=['EMV'])
    maemv_df = pd.DataFrame(list(maemv.values()),columns=['MAEMV'])


    # Energy-type indicators
    # Relative strength
    br, ar = BRAR(stocks, date, N=26)
    br_df = pd.DataFrame(list(br.values()),columns=['BR'])
    ar_df = pd.DataFrame(list(ar.values()),columns=['AR'])
    
    cr,M1,M2,M3,M4 = CR(stocks, date, N=26, M1=10, M2=20, M3=40, M4=62)
    cr_df = pd.DataFrame(list(cr.values()),columns=['CR'])
    
    mass,mamass = MASS(stocks, date, N1=9, N2=25, M=6)
    mass_df = pd.DataFrame(list(mass.values()),columns=['MASS'])
    mamass_df = pd.DataFrame(list(mamass.values()),columns=['MAMASS'])

    # Volume-type indicators
    
    amo,amo1,amo2 = AMO(stocks, date, M1 = 5, M2 = 10)
    amo_df = pd.DataFrame(list(amo.values()),columns=['AMO'])
    amo1_df = pd.DataFrame(list(amo1.values()),columns=['AMO1'])
    amo2_df = pd.DataFrame(list(amo2.values()),columns=['AMO2'])

    df = pd.concat([accer_df,adtm_df,maadtm_df,mtr_df,atr_df,bias_df,cci_df,dkx_df,k_df,cye_df,
                    mfi_df,mtm_df,emv_df,roc_df,rsi_df,rsi10_df,rsi6_df,osc_df,udl_df,maudl_df,wr_df,
                    mawr_df,fsl_df,mafsl_df,cho_df,macho_df,dif_df,difma_df,cr_df,mass_df,mamass_df,
                    amo_df,amo1_df,amo2_df,
                    br_df,ar_df],axis=1)
    df.index = [date]
    df = pd.concat([price,df],axis=1)
    
    return df
def get_data_from_date(start_date,end_date,stocks):
    '''
    Build the time-series dataset: one row of factor values per trading day
    '''
    trade_date = get_trade_days(start_date=start_date,end_date=end_date)
    # Collect the daily frames first, then concatenate once instead of inside the loop
    frames = [get_factors_one_stock(stocks,date) for date in trade_date]
    return pd.concat(frames)
data = get_data_from_date(start_date,end_date,stocks)
def get_day_profit(stocks,end_date,start_date=None,count=-1,pre_num=1):
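    '''
    Get the direction label of each day's return
    input:
    stocks: list or Series, security codes
    start_date: start date
    end_date: end date
    count: alternative to start_date (use one or the other), number of bars to take going back from end_date
    pre_num: int, number of days over which the return is computed
    output:
    profit: dataframe, index is the date, values are labels: 1 if the return is greater than 0, otherwise 0
    '''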
    if count == -1:
        price = get_price(stocks,start_date,end_date,fields=['close'])['close']
    else:
        price = get_price(stocks,end_date=end_date,count=count,fields=['close'])['close']
    profit = price.pct_change(periods=pre_num).dropna()
    profit[profit > 0] = 1
    profit[profit < 0] = 0
    profit = profit.to_frame()
    profit.columns=['profit_dis']
    return profit
profit_dis = get_day_profit(stocks,start_date=start_date,end_date=end_date)
def get_day_profit_data(stocks,end_date,start_date=None,count=-1,pre_num=1):
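    '''
    Get each day's raw return
    input:
    stocks: list or Series, security codes
    start_date: start date
    end_date: end date
    count: alternative to start_date (use one or the other), number of bars to take going back from end_date
    pre_num: int, number of days over which the return is computed
    output:
    profit: dataframe, index is the date, values are the unlabelled returns
    '''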
    if count == -1:
        price = get_price(stocks,start_date,end_date,fields=['close'])['close']
    else:
        price = get_price(stocks,end_date=end_date,count=count,fields=['close'])['close']
    profit = price.pct_change(periods=pre_num).dropna()
    profit = profit.to_frame()
    profit.columns=['profit']
    return profit
profit = get_day_profit_data(stocks,start_date=start_date,end_date=end_date)
data_profit = pd.concat([profit,profit_dis],axis=1)
# Convert the DatetimeIndex to 'YYYY-MM-DD' strings so it aligns with the index of `data`
index = [ind.strftime('%Y-%m-%d') for ind in data_profit.index]
data_profit.index = index
# Join factors with returns and labels; drop all-NaN columns, then any row that still has a NaN
data_concat = pd.concat([data,data_profit],axis=1).dropna(axis=1,how='all').dropna()
data_concat
date open close high low volume money ACCER ADTM MAADTM MTR ... CR MASS MAMASS AMO AMO1 AMO2 BR AR profit profit_dis
2016-01-05 6663.7260 6889.7410 7020.5410 6627.6420 1.134282e+10 1.567173e+11 -0.033774 0.103386 0.195367 392.8990 ... 61.328407 24.405686 23.756116 1.567173e+07 1.328749e+07 1.467738e+07 53.962648 78.039721 -0.013412 0.0
2016-01-06 6916.5320 7065.6070 7066.0850 6878.6540 1.009304e+10 1.363515e+11 -0.029422 0.263709 0.220436 187.4310 ... 81.048517 24.740819 23.977806 1.363515e+07 1.359704e+07 1.428262e+07 63.770513 92.237866 0.025526 1.0
2016-01-07 6937.4160 6462.2430 6937.4160 6454.9730 2.842674e+09 3.478154e+10 -0.034488 0.208112 0.231059 610.6340 ... 61.128983 25.155129 24.254879 3.478154e+06 1.150453e+07 1.276140e+07 49.294591 72.750838 -0.085395 0.0
2016-01-08 6618.0810 6570.4320 6689.1410 6222.5930 1.111296e+10 1.389340e+11 -0.019077 -0.051259 0.214857 466.5480 ... 57.078890 25.634080 24.604707 1.389340e+07 1.163747e+07 1.273742e+07 56.334665 70.071847 0.016742 1.0
2016-01-11 6435.8940 6128.6560 6529.8790 6128.2230 1.156007e+10 1.331196e+11 -0.032917 -0.240532 0.195078 442.2090 ... 48.078305 26.073200 25.007304 1.331196e+07 1.199808e+07 1.278055e+07 45.064586 61.183308 -0.067237 0.0
2016-01-12 6126.6700 6136.6630 6205.5910 5994.6360 9.190522e+09 1.057272e+11 -0.035711 -0.277086 0.108378 210.9550 ... 42.163158 26.443142 25.408676 1.057272e+07 1.097828e+07 1.213288e+07 45.997216 60.367115 0.001306 1.0
2016-01-13 6168.9870 5924.0440 6226.5410 5924.0440 8.612340e+09 9.845670e+10 -0.025492 -0.280338 -0.015562 302.4970 ... 42.284528 26.774535 25.803484 9.845670e+06 1.022038e+07 1.190871e+07 43.868450 56.349530 -0.034647 0.0
2016-01-14 5734.6080 6124.8200 6137.5070 5726.8830 9.225145e+09 1.042551e+11 -0.017892 -0.255879 -0.066236 410.6240 ... 41.392104 27.101585 26.196945 1.042551e+07 1.160985e+07 1.155719e+07 49.446404 71.322609 0.033892 1.0
2016-01-15 6078.6150 5893.6820 6135.2710 5852.2410 8.863054e+09 1.022773e+11 -0.008175 -0.048379 -0.085206 283.0300 ... 46.500592 27.425037 26.575263 1.022773e+07 1.087672e+07 1.125709e+07 46.717747 67.425849 -0.037738 0.0
2016-01-18 5768.7577 5978.9076 6051.1817 5765.1458 7.246979e+09 8.698031e+10 -0.005785 -0.181086 -0.140806 286.0359 ... 44.627386 27.805402 26.937150 8.698031e+06 9.953933e+06 1.097601e+07 48.000275 74.629467 0.014461 1.0
2016-01-19 5965.3551 6197.0947 6203.3998 5956.2253 8.798102e+09 1.061855e+11 0.006458 -0.085336 -0.177487 247.1745 ... 52.272253 28.100521 27.275037 1.061855e+07 9.963099e+06 1.047069e+07 54.364819 82.225748 0.036493 1.0
2016-01-20 6161.2319 6138.5031 6244.9612 6095.7216 9.194752e+09 1.133945e+11 0.005389 -0.056109 -0.178093 149.2396 ... 53.839254 28.319922 27.587834 1.133945e+07 1.026186e+07 1.024112e+07 51.571102 77.053864 -0.009455 0.0
2016-01-21 6031.1695 5886.8835 6191.3173 5886.8835 8.188276e+09 1.005553e+11 0.002480 -0.203549 -0.173470 304.4338 ... 46.794107 28.498707 27.875195 1.005553e+07 1.018786e+07 1.089886e+07 47.664853 76.054134 -0.040990 0.0
2016-01-22 5954.3643 5980.6111 6013.9706 5799.7117 6.983071e+09 8.442937e+10 -0.005130 -0.262440 -0.171639 214.2589 ... 42.922207 28.578580 28.121361 8.442937e+06 9.830900e+06 1.035381e+07 47.868975 73.463078 0.015921 1.0
2016-01-25 6034.0844 6041.3261 6093.2693 5976.3363 6.530851e+09 7.867768e+10 -0.007770 -0.278809 -0.171448 116.9330 ... 43.096433 28.615792 28.319820 7.867768e+06 9.664847e+06 9.809390e+06 46.149629 69.997841 0.010152 1.0
2016-01-26 5966.7049 5589.3069 5982.4993 5573.8266 8.544392e+09 9.810410e+10 -0.016888 -0.417006 -0.191589 467.4995 ... 37.635734 28.689761 28.467214 9.810410e+06 9.503219e+06 9.733159e+06 41.701218 63.667711 -0.074821 0.0
2016-01-27 5604.1667 5514.1886 5628.1856 5240.6457 8.792412e+09 9.535493e+10 -0.020614 -0.469818 -0.244269 387.5399 ... 32.690003 28.724038 28.571133 9.535493e+06 9.142428e+06 9.702141e+06 37.805510 56.806062 -0.013440 0.0
2016-01-28 5441.7378 5271.2325 5511.3206 5257.0812 6.733808e+09 7.315274e+10 -0.036915 -0.500799 -0.284233 257.1074 ... 31.198836 28.711882 28.636460 7.315274e+06 8.594376e+06 9.391118e+06 34.996236 55.746143 -0.044060 0.0
2016-01-29 5262.4596 5469.1254 5515.4864 5248.2150 7.235909e+09 7.830371e+10 -0.026741 -0.523053 -0.338948 267.2714 ... 31.634291 28.659395 28.663241 7.830371e+06 8.471863e+06 9.151382e+06 39.619276 62.037772 0.037542 1.0
2016-02-01 5458.6048 5399.4771 5493.9818 5319.7007 6.471611e+09 7.086333e+10 -0.007866 -0.444234 -0.387463 174.2811 ... 34.060319 28.722404 28.687212 7.086333e+06 8.315576e+06 8.990212e+06 39.890095 62.215496 -0.012735 0.0
2016-02-02 5407.7068 5586.6266 5594.3607 5407.6681 6.657196e+09 7.567418e+10 0.004889 -0.480827 -0.422123 194.8836 ... 35.509556 28.672878 28.696726 7.567418e+06 7.866978e+06 8.685099e+06 42.675524 65.688524 0.034661 1.0
2016-02-03 5520.6415 5610.1734 5634.9198 5482.2247 6.328073e+09 7.062492e+10 0.014178 -0.458559 -0.446638 152.6951 ... 37.517234 28.527640 28.669706 7.062492e+06 7.372378e+06 8.257403e+06 43.414685 70.510162 0.004215 1.0
2016-02-04 5631.3198 5721.4853 5744.0236 5631.3198 7.108398e+09 8.322622e+10 0.012504 -0.245624 -0.442490 133.8502 ... 41.793586 28.408172 28.617062 8.322622e+06 7.573847e+06 8.084112e+06 45.092782 71.972026 0.019841 1.0
2016-02-05 5734.4807 5664.1971 5747.9794 5662.4365 5.607554e+09 6.532197e+10 0.011728 -0.183561 -0.413309 85.5429 ... 40.314220 28.245359 28.539308 6.532197e+06 7.314212e+06 7.893038e+06 43.784455 70.286227 -0.010013 0.0
2016-02-15 5485.7230 5668.0398 5700.5221 5482.7957 5.363125e+09 6.335130e+10 0.003826 -0.304867 -0.392691 217.7264 ... 38.465993 27.619241 28.365949 6.335130e+06 7.163972e+06 7.739774e+06 43.572896 77.166748 0.000678 1.0
2016-02-16 5694.7487 5902.5341 5918.5932 5694.7487 8.287714e+09 9.853490e+10 0.009001 -0.208391 -0.356140 250.5534 ... 49.712291 26.835981 28.051545 9.853490e+06 7.621186e+06 7.744082e+06 54.739266 97.180582 0.041371 1.0
2016-02-17 5889.5021 5971.4123 5979.4073 5854.2197 9.058252e+09 1.078570e+11 0.012362 0.081631 -0.280554 125.1876 ... 60.656122 26.102972 27.623228 1.078570e+07 8.365828e+06 7.869103e+06 59.516647 89.603707 0.011669 1.0
2016-02-18 6009.7485 5957.1305 6034.8519 5945.2040 9.371480e+09 1.101097e+11 0.014927 0.292758 -0.188430 89.6479 ... 57.990047 25.232855 27.074097 1.101097e+07 8.903498e+06 8.238673e+06 56.726834 85.412110 -0.002392 0.0
2016-02-19 5939.6973 5979.5164 6009.7240 5919.6832 7.038050e+09 8.684912e+10 0.011331 0.349324 -0.084661 90.0408 ... 64.627884 24.312137 26.391424 8.684912e+06 9.334041e+06 8.324127e+06 66.676823 100.525871 0.003758 1.0
2016-02-22 6055.6751 6103.7225 6111.6899 6029.4589 9.432052e+09 1.108016e+11 0.006725 0.367970 0.018655 132.1735 ... 76.778650 23.469802 25.595498 1.108016e+07 1.028305e+07 8.723510e+06 68.506189 113.681365 0.020772 1.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2018-09-13 4727.4300 4721.2100 4752.9000 4655.2100 5.569210e+09 4.727921e+10 -0.003221 0.011175 0.200413 97.6900 ... 69.946715 23.967377 24.337838 4.727921e+06 4.532770e+06 4.619689e+06 74.260672 98.480900 0.005803 1.0
2018-09-14 4720.7800 4670.0800 4724.0000 4669.9400 5.644793e+09 4.493192e+10 -0.001098 -0.162627 0.156551 54.0600 ... 64.427541 23.802394 24.204614 4.493192e+06 4.366552e+06 4.648480e+06 61.099788 79.808969 -0.010830 0.0
2018-09-17 4649.9500 4602.3900 4649.9500 4586.3200 4.933895e+09 3.790586e+10 -0.004887 -0.104195 0.106659 83.7600 ... 53.579840 23.604222 24.056001 3.790586e+06 4.217149e+06 4.562581e+06 55.468585 72.928115 -0.014494 0.0
2018-09-18 4588.2300 4683.5100 4684.2900 4584.4300 5.443233e+09 4.457916e+10 -0.002984 -0.046867 0.049788 99.8600 ... 55.431872 23.562772 23.907221 4.457916e+06 4.264419e+06 4.458680e+06 61.550877 75.261002 0.017626 1.0
2018-09-19 4673.9500 4736.8900 4772.9500 4664.1300 6.705586e+09 5.567138e+10 0.000946 0.042052 0.003803 108.8200 ... 61.326228 23.667366 23.795992 5.567138e+06 4.607351e+06 4.542994e+06 69.087141 83.588936 0.011397 1.0
2018-09-20 4734.4800 4733.4800 4761.4100 4723.3000 5.639897e+09 4.357434e+10 0.005520 0.216915 0.006135 38.1100 ... 69.225693 23.656107 23.710040 4.357434e+06 4.533253e+06 4.533012e+06 76.719923 93.487073 -0.000720 0.0
2018-09-21 4743.8100 4805.2200 4813.2900 4721.6900 6.589520e+09 5.250771e+10 0.009482 0.216408 0.013203 91.6000 ... 81.780129 23.656792 23.658276 5.250771e+06 4.684769e+06 4.525661e+06 89.122726 95.465800 0.015156 1.0
2018-09-25 4774.7900 4792.5500 4810.3400 4763.2100 5.265424e+09 4.264165e+10 0.005976 0.204063 0.047116 47.1300 ... 87.418677 23.694994 23.640376 4.264165e+06 4.779485e+06 4.498317e+06 89.051036 109.867677 -0.002637 0.0
2018-09-26 4794.0700 4815.6300 4841.7000 4782.3900 6.471956e+09 5.170976e+10 0.004497 0.273951 0.079963 59.3100 ... 103.199804 23.731866 23.661650 5.170976e+06 4.922097e+06 4.593258e+06 96.160201 116.778116 0.004816 1.0
2018-09-27 4812.6400 4755.3800 4817.6200 4754.2800 6.143767e+09 4.709783e+10 0.001140 0.265048 0.133422 63.3400 ... 87.797973 23.831006 23.706355 4.709783e+06 4.750626e+06 4.678988e+06 85.369181 103.925847 -0.012511 0.0
2018-09-28 4762.3300 4800.7300 4802.1600 4746.1800 5.726864e+09 4.593874e+10 -0.000961 0.105008 0.159572 55.9800 ... 86.656692 23.900306 23.745179 4.593874e+06 4.797914e+06 4.665583e+06 94.883650 111.995422 0.009537 1.0
2018-10-08 4722.4200 4646.4900 4737.2900 4639.4000 6.327566e+09 4.949483e+10 -0.006608 -0.209880 0.139196 161.3300 ... 76.833532 24.047857 23.810470 4.949483e+06 4.737656e+06 4.711213e+06 78.120821 99.340290 -0.032128 0.0
2018-10-09 4652.4900 4634.8900 4685.2200 4612.6400 5.043745e+09 4.124402e+10 -0.010148 -0.269989 0.100190 72.5800 ... 71.410050 24.074624 23.880109 4.124402e+06 4.709704e+06 4.744594e+06 79.902892 98.088716 -0.002497 0.0
2018-10-10 4634.6400 4629.8900 4655.5500 4586.8900 5.133964e+09 3.954108e+10 -0.009003 -0.235881 0.043591 68.6600 ... 60.490738 24.133435 23.953182 3.954108e+06 4.466330e+06 4.694213e+06 68.749785 85.112210 -0.001079 0.0
2018-10-11 4483.2700 4307.4200 4503.2000 4274.7000 9.056360e+09 6.448032e+10 -0.023291 -0.392253 -0.032492 355.1900 ... 43.731282 24.506854 24.082347 6.448032e+06 4.813980e+06 4.782303e+06 51.307174 71.237939 -0.069650 0.0
2018-10-12 4267.3600 4278.0400 4300.0100 4136.6300 8.152515e+09 5.670063e+10 -0.024880 -0.447213 -0.113901 170.7900 ... 38.549977 24.980842 24.273986 5.670063e+06 5.029218e+06 4.913566e+06 47.363092 68.756647 -0.006821 0.0
2018-10-15 4283.3600 4199.1500 4317.7100 4195.8300 6.229185e+09 4.484900e+10 -0.029133 -0.490331 -0.209436 121.8800 ... 45.391382 25.518870 24.543747 4.484900e+06 4.936301e+06 4.836979e+06 49.093801 69.663600 -0.018441 0.0
2018-10-16 4193.3800 4099.6800 4221.3400 4076.8700 6.107666e+09 4.401588e+10 -0.028507 -0.596143 -0.317085 144.4700 ... 43.525290 26.058143 24.878795 4.401588e+06 4.991738e+06 4.850721e+06 48.656020 67.771137 -0.023688 0.0
2018-10-17 4147.3500 4133.3800 4167.7600 4038.4200 6.952388e+09 4.713772e+10 -0.012736 -0.627584 -0.408659 129.3400 ... 43.450150 26.528915 25.287843 4.713772e+06 5.143671e+06 4.805000e+06 51.131251 64.354712 0.008220 1.0
2018-10-18 4104.3300 4018.4600 4104.3300 4013.6500 5.978391e+09 4.038261e+10 -0.014556 -0.645559 -0.463119 119.7300 ... 36.282002 26.914567 25.751365 4.038261e+06 4.661717e+06 4.737848e+06 44.393792 56.040237 -0.027803 0.0
2018-10-19 3958.4500 4127.9900 4130.9300 3948.5600 6.956158e+09 4.993703e+10 -0.005415 -0.617363 -0.506541 182.3700 ... 37.777959 27.247880 26.208203 4.993703e+06 4.526445e+06 4.777831e+06 50.752837 69.198765 0.027257 1.0
2018-10-22 4165.7600 4333.1400 4375.6000 4165.7600 9.295832e+09 6.951100e+10 0.010651 -0.392518 -0.526121 247.6100 ... 55.956711 27.581421 26.641633 6.951100e+06 5.019685e+06 4.977993e+06 64.027530 81.750285 0.049697 1.0
2018-10-23 4336.4500 4253.5600 4350.7300 4228.3300 7.466806e+09 5.444839e+10 0.013049 -0.204236 -0.502618 122.4000 ... 57.432321 27.747775 27.013117 5.444839e+06 5.228335e+06 5.110037e+06 58.471927 74.170288 -0.018365 0.0
2018-10-24 4233.8400 4245.4900 4297.3700 4207.1600 6.282348e+09 4.426471e+10 0.013653 -0.252139 -0.478234 90.2100 ... 58.702171 27.705536 27.287683 4.426471e+06 5.170875e+06 5.157273e+06 62.230431 81.798145 -0.001897 0.0
2018-10-25 4124.1900 4234.4200 4240.5800 4103.5600 6.968721e+09 4.837184e+10 0.002957 -0.228771 -0.445539 141.9300 ... 56.672138 27.677517 27.479116 4.837184e+06 5.330659e+06 4.996188e+06 56.982746 88.190540 -0.002607 0.0
2018-10-26 4260.9100 4233.9600 4294.5600 4217.7900 6.798362e+09 4.682410e+10 -0.005137 -0.023063 -0.373904 76.7700 ... 61.913304 27.553745 27.585646 4.682410e+06 5.268401e+06 4.897423e+06 58.693241 86.840421 -0.000109 0.0
2018-10-29 4220.6400 4162.5300 4227.6400 4144.8100 5.745157e+09 4.127810e+10 -0.004651 -0.095220 -0.307358 89.1500 ... 57.950275 27.438440 27.617406 4.127810e+06 4.703743e+06 4.861714e+06 54.140324 85.357584 -0.016871 0.0
2018-10-30 4145.4000 4204.5400 4233.7900 4096.1400 7.283465e+09 5.230816e+10 -0.003658 -0.235358 -0.256083 137.6500 ... 57.724272 27.358520 27.580256 5.230816e+06 4.660938e+06 4.944636e+06 57.217529 91.302167 0.010092 1.0
2018-10-31 4207.7600 4272.5500 4298.2400 4204.3600 7.468776e+09 5.324331e+10 0.001096 -0.206124 -0.204678 93.8800 ... 67.485252 27.255069 27.498138 5.324331e+06 4.840510e+06 5.005692e+06 64.763499 101.728797 0.016175 1.0
2018-11-01 4299.4600 4298.9800 4367.7300 4294.8700 8.605914e+09 6.500469e+10 0.005584 -0.184435 -0.178668 95.1800 ... 71.161960 27.045829 27.388187 6.500469e+06 5.173167e+06 5.251913e+06 66.104624 99.680511 0.006186 1.0

689 rows × 42 columns

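columns = data_concat.columns
# Features: every column except the last two (profit, profit_dis)
data_x = data_concat[columns[:-2]]
# Label: the last column, profit_dis (1 if the day's return is positive, otherwise 0)
data_y = data_concat[columns[-1]]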
data_x
date open close high low volume money ACCER ADTM MAADTM MTR ... DIF DIFMA CR MASS MAMASS AMO AMO1 AMO2 BR AR
2016-01-05 6663.7260 6889.7410 7020.5410 6627.6420 1.134282e+10 1.567173e+11 -0.033774 0.103386 0.195367 392.8990 ... 63.850480 186.924138 61.328407 24.405686 23.756116 1.567173e+07 1.328749e+07 1.467738e+07 53.962648 78.039721
2016-01-06 6916.5320 7065.6070 7066.0850 6878.6540 1.009304e+10 1.363515e+11 -0.029422 0.263709 0.220436 187.4310 ... -11.933040 170.916110 81.048517 24.740819 23.977806 1.363515e+07 1.359704e+07 1.428262e+07 63.770513 92.237866
2016-01-07 6937.4160 6462.2430 6937.4160 6454.9730 2.842674e+09 3.478154e+10 -0.034488 0.208112 0.231059 610.6340 ... -131.304000 141.078470 61.128983 25.155129 24.254879 3.478154e+06 1.150453e+07 1.276140e+07 49.294591 72.750838
2016-01-08 6618.0810 6570.4320 6689.1410 6222.5930 1.111296e+10 1.389340e+11 -0.019077 -0.051259 0.214857 466.5480 ... -238.793640 98.111196 57.078890 25.634080 24.604707 1.389340e+07 1.163747e+07 1.273742e+07 56.334665 70.071847
2016-01-11 6435.8940 6128.6560 6529.8790 6128.2230 1.156007e+10 1.331196e+11 -0.032917 -0.240532 0.195078 442.2090 ... -386.014140 36.896780 48.078305 26.073200 25.007304 1.331196e+07 1.199808e+07 1.278055e+07 45.064586 61.183308
2016-01-12 6126.6700 6136.6630 6205.5910 5994.6360 9.190522e+09 1.057272e+11 -0.035711 -0.277086 0.108378 210.9550 ... -516.688560 -37.896410 42.163158 26.443142 25.408676 1.057272e+07 1.097828e+07 1.213288e+07 45.997216 60.367115
2016-01-13 6168.9870 5924.0440 6226.5410 5924.0440 8.612340e+09 9.845670e+10 -0.025492 -0.280338 -0.015562 302.4970 ... -671.801780 -128.938366 42.284528 26.774535 25.803484 9.845670e+06 1.022038e+07 1.190871e+07 43.868450 56.349530
2016-01-14 5734.6080 6124.8200 6137.5070 5726.8830 9.225145e+09 1.042551e+11 -0.017892 -0.255879 -0.066236 410.6240 ... -810.561120 -233.916080 41.392104 27.101585 26.196945 1.042551e+07 1.160985e+07 1.155719e+07 49.446404 71.322609
2016-01-15 6078.6150 5893.6820 6135.2710 5852.2410 8.863054e+09 1.022773e+11 -0.008175 -0.048379 -0.085206 283.0300 ... -955.306440 -351.075974 46.500592 27.425037 26.575263 1.022773e+07 1.087672e+07 1.125709e+07 46.717747 67.425849
2016-01-18 5768.7577 5978.9076 6051.1817 5765.1458 7.246979e+09 8.698031e+10 -0.005785 -0.181086 -0.140806 286.0359 ... -1026.397292 -468.494953 44.627386 27.805402 26.937150 8.698031e+06 9.953933e+06 1.097601e+07 48.000275 74.629467
2016-01-19 5965.3551 6197.0947 6203.3998 5956.2253 8.798102e+09 1.061855e+11 0.006458 -0.085336 -0.177487 247.1745 ... -1068.727796 -581.752781 52.272253 28.100521 27.275037 1.061855e+07 9.963099e+06 1.047069e+07 54.364819 82.225748
2016-01-20 6161.2319 6138.5031 6244.9612 6095.7216 9.194752e+09 1.133945e+11 0.005389 -0.056109 -0.178093 149.2396 ... -1132.708088 -693.830286 53.839254 28.319922 27.587834 1.133945e+07 1.026186e+07 1.024112e+07 51.571102 77.053864
2016-01-21 6031.1695 5886.8835 6191.3173 5886.8835 8.188276e+09 1.005553e+11 0.002480 -0.203549 -0.173470 304.4338 ... -1154.288388 -796.128724 46.794107 28.498707 27.875195 1.005553e+07 1.018786e+07 1.089886e+07 47.664853 76.054134
2016-01-22 5954.3643 5980.6111 6013.9706 5799.7117 6.983071e+09 8.442937e+10 -0.005130 -0.262440 -0.171639 214.2589 ... -1179.233660 -890.172726 42.922207 28.578580 28.121361 8.442937e+06 9.830900e+06 1.035381e+07 47.868975 73.463078
2016-01-25 6034.0844 6041.3261 6093.2693 5976.3363 6.530851e+09 7.867768e+10 -0.007770 -0.278809 -0.171448 116.9330 ... -1157.451012 -967.316414 43.096433 28.615792 28.319820 7.867768e+06 9.664847e+06 9.809390e+06 46.149629 69.997841
2016-01-26 5966.7049 5589.3069 5982.4993 5573.8266 8.544392e+09 9.810410e+10 -0.016888 -0.417006 -0.191589 467.4995 ... -1169.595660 -1032.607124 37.635734 28.689761 28.467214 9.810410e+06 9.503219e+06 9.733159e+06 41.701218 63.667711
2016-01-27 5604.1667 5514.1886 5628.1856 5240.6457 8.792412e+09 9.535493e+10 -0.020614 -0.469818 -0.244269 387.5399 ... -1167.857352 -1082.212681 32.690003 28.724038 28.571133 9.535493e+06 9.142428e+06 9.702141e+06 37.805510 56.806062
2016-01-28 5441.7378 5271.2325 5511.3206 5257.0812 6.733808e+09 7.315274e+10 -0.036915 -0.500799 -0.284233 257.1074 ... -1208.101912 -1121.966760 31.198836 28.711882 28.636460 7.315274e+06 8.594376e+06 9.391118e+06 34.996236 55.746143
2016-01-29 5262.4596 5469.1254 5515.4864 5248.2150 7.235909e+09 7.830371e+10 -0.026741 -0.523053 -0.338948 267.2714 ... -1205.567640 -1146.992880 31.634291 28.659395 28.663241 7.830371e+06 8.471863e+06 9.151382e+06 39.619276 62.037772
2016-02-01 5458.6048 5399.4771 5493.9818 5319.7007 6.471611e+09 7.086333e+10 -0.007866 -0.444234 -0.387463 174.2811 ... -1215.374912 -1165.890642 34.060319 28.722404 28.687212 7.086333e+06 8.315576e+06 8.990212e+06 39.890095 62.215496
2016-02-02 5407.7068 5586.6266 5594.3607 5407.6681 6.657196e+09 7.567418e+10 0.004889 -0.480827 -0.422123 194.8836 ... -1233.270614 -1182.344924 35.509556 28.672878 28.696726 7.567418e+06 7.866978e+06 8.685099e+06 42.675524 65.688524
2016-02-03 5520.6415 5610.1734 5634.9198 5482.2247 6.328073e+09 7.062492e+10 0.014178 -0.458559 -0.446638 152.6951 ... -1242.337092 -1193.307824 37.517234 28.527640 28.669706 7.062492e+06 7.372378e+06 8.257403e+06 43.414685 70.510162
2016-02-04 5631.3198 5721.4853 5744.0236 5631.3198 7.108398e+09 8.322622e+10 0.012504 -0.245624 -0.442490 133.8502 ... -1214.858938 -1199.364879 41.793586 28.408172 28.617062 8.322622e+06 7.573847e+06 8.084112e+06 45.092782 71.972026
2016-02-05 5734.4807 5664.1971 5747.9794 5662.4365 5.607554e+09 6.532197e+10 0.011728 -0.183561 -0.413309 85.5429 ... -1202.302680 -1201.671781 40.314220 28.245359 28.539308 6.532197e+06 7.314212e+06 7.893038e+06 43.784455 70.286227
2016-02-15 5485.7230 5668.0398 5700.5221 5482.7957 5.363125e+09 6.335130e+10 0.003826 -0.304867 -0.392691 217.7264 ... -1205.469626 -1206.473643 38.465993 27.619241 28.365949 6.335130e+06 7.163972e+06 7.739774e+06 43.572896 77.166748
2016-02-16 5694.7487 5902.5341 5918.5932 5694.7487 8.287714e+09 9.853490e+10 0.009001 -0.208391 -0.356140 250.5534 ... -1144.029608 -1203.917037 49.712291 26.835981 28.051545 9.853490e+06 7.621186e+06 7.744082e+06 54.739266 97.180582
2016-02-17 5889.5021 5971.4123 5979.4073 5854.2197 9.058252e+09 1.078570e+11 0.012362 0.081631 -0.280554 125.1876 ... -1068.856584 -1194.016961 60.656122 26.102972 27.623228 1.078570e+07 8.365828e+06 7.869103e+06 59.516647 89.603707
2016-02-18 6009.7485 5957.1305 6034.8519 5945.2040 9.371480e+09 1.101097e+11 0.014927 0.292758 -0.188430 89.6479 ... -971.292194 -1170.335989 57.990047 25.232855 27.074097 1.101097e+07 8.903498e+06 8.238673e+06 56.726834 85.412110
2016-02-19 5939.6973 5979.5164 6009.7240 5919.6832 7.038050e+09 8.684912e+10 0.011331 0.349324 -0.084661 90.0408 ... -888.433262 -1138.622551 64.627884 24.312137 26.391424 8.684912e+06 9.334041e+06 8.324127e+06 66.676823 100.525871
2016-02-22 6055.6751 6103.7225 6111.6899 6029.4589 9.432052e+09 1.108016e+11 0.006725 0.367970 0.018655 132.1735 ... -789.849672 -1096.070027 76.778650 23.469802 25.595498 1.108016e+07 1.028305e+07 8.723510e+06 68.506189 113.681365
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2018-09-13 4727.4300 4721.2100 4752.9000 4655.2100 5.569210e+09 4.727921e+10 -0.003221 0.011175 0.200413 97.6900 ... -212.223968 -171.924877 69.946715 23.967377 24.337838 4.727921e+06 4.532770e+06 4.619689e+06 74.260672 98.480900
2018-09-14 4720.7800 4670.0800 4724.0000 4669.9400 5.644793e+09 4.493192e+10 -0.001098 -0.162627 0.156551 54.0600 ... -220.166968 -177.095321 64.427541 23.802394 24.204614 4.493192e+06 4.366552e+06 4.648480e+06 61.099788 79.808969
2018-09-17 4649.9500 4602.3900 4649.9500 4586.3200 4.933895e+09 3.790586e+10 -0.004887 -0.104195 0.106659 83.7600 ... -233.711768 -184.254765 53.579840 23.604222 24.056001 3.790586e+06 4.217149e+06 4.562581e+06 55.468585 72.928115
2018-09-18 4588.2300 4683.5100 4684.2900 4584.4300 5.443233e+09 4.457916e+10 -0.002984 -0.046867 0.049788 99.8600 ... -245.328368 -193.098868 55.431872 23.562772 23.907221 4.457916e+06 4.264419e+06 4.458680e+06 61.550877 75.261002
2018-09-19 4673.9500 4736.8900 4772.9500 4664.1300 6.705586e+09 5.567138e+10 0.000946 0.042052 0.003803 108.8200 ... -247.156968 -202.494888 61.326228 23.667366 23.795992 5.567138e+06 4.607351e+06 4.542994e+06 69.087141 83.588936
2018-09-20 4734.4800 4733.4800 4761.4100 4723.3000 5.639897e+09 4.357434e+10 0.005520 0.216915 0.006135 38.1100 ... -244.363568 -211.431888 69.225693 23.656107 23.710040 4.357434e+06 4.533253e+06 4.533012e+06 76.719923 93.487073
2018-09-21 4743.8100 4805.2200 4813.2900 4721.6900 6.589520e+09 5.250771e+10 0.009482 0.216408 0.013203 91.6000 ... -235.040168 -219.683408 81.780129 23.656792 23.658276 5.250771e+06 4.684769e+06 4.525661e+06 89.122726 95.465800
2018-09-25 4774.7900 4792.5500 4810.3400 4763.2100 5.265424e+09 4.264165e+10 0.005976 0.204063 0.047116 47.1300 ... -218.220168 -224.568828 87.418677 23.694994 23.640376 4.264165e+06 4.779485e+06 4.498317e+06 89.051036 109.867677
2018-09-26 4794.0700 4815.6300 4841.7000 4782.3900 6.471956e+09 5.170976e+10 0.004497 0.273951 0.079963 59.3100 ... -199.335768 -225.841168 103.199804 23.731866 23.661650 5.170976e+06 4.922097e+06 4.593258e+06 96.160201 116.778116
2018-09-27 4812.6400 4755.3800 4817.6200 4754.2800 6.143767e+09 4.709783e+10 0.001140 0.265048 0.133422 63.3400 ... -185.169568 -224.071728 87.797973 23.831006 23.706355 4.709783e+06 4.750626e+06 4.678988e+06 85.369181 103.925847
2018-09-28 4762.3300 4800.7300 4802.1600 4746.1800 5.726864e+09 4.593874e+10 -0.000961 0.105008 0.159572 55.9800 ... -171.005568 -219.949888 86.656692 23.900306 23.745179 4.593874e+06 4.797914e+06 4.665583e+06 94.883650 111.995422
2018-10-08 4722.4200 4646.4900 4737.2900 4639.4000 6.327566e+09 4.949483e+10 -0.006608 -0.209880 0.139196 161.3300 ... -162.789368 -214.212128 76.833532 24.047857 23.810470 4.949483e+06 4.737656e+06 4.711213e+06 78.120821 99.340290
2018-10-09 4652.4900 4634.8900 4685.2200 4612.6400 5.043745e+09 4.124402e+10 -0.010148 -0.269989 0.100190 72.5800 ... -147.957168 -205.636668 71.410050 24.074624 23.880109 4.124402e+06 4.709704e+06 4.744594e+06 79.902892 98.088716
2018-10-10 4634.6400 4629.8900 4655.5500 4586.8900 5.133964e+09 3.954108e+10 -0.009003 -0.235881 0.043591 68.6600 ... -139.824168 -195.086248 60.490738 24.133435 23.953182 3.954108e+06 4.466330e+06 4.694213e+06 68.749785 85.112210
2018-10-11 4483.2700 4307.4200 4503.2000 4274.7000 9.056360e+09 6.448032e+10 -0.023291 -0.392253 -0.032492 355.1900 ... -162.746368 -186.645188 43.731282 24.506854 24.082347 6.448032e+06 4.813980e+06 4.782303e+06 51.307174 71.237939
2018-10-12 4267.3600 4278.0400 4300.0100 4136.6300 8.152515e+09 5.670063e+10 -0.024880 -0.447213 -0.113901 170.7900 ... -188.381568 -181.046988 38.549977 24.980842 24.273986 5.670063e+06 5.029218e+06 4.913566e+06 47.363092 68.756647
2018-10-15 4283.3600 4199.1500 4317.7100 4195.8300 6.229185e+09 4.484900e+10 -0.029133 -0.490331 -0.209436 121.8800 ... -227.985768 -180.341548 45.391382 25.518870 24.543747 4.484900e+06 4.936301e+06 4.836979e+06 49.093801 69.663600
2018-10-16 4193.3800 4099.6800 4221.3400 4076.8700 6.107666e+09 4.401588e+10 -0.028507 -0.596143 -0.317085 144.4700 ... -275.563368 -186.075868 43.525290 26.058143 24.878795 4.401588e+06 4.991738e+06 4.850721e+06 48.656020 67.771137
2018-10-17 4147.3500 4133.3800 4167.7600 4038.4200 6.952388e+09 4.713772e+10 -0.012736 -0.627584 -0.408659 129.3400 ... -322.681568 -198.410448 43.450150 26.528915 25.287843 4.713772e+06 5.143671e+06 4.805000e+06 51.131251 64.354712
2018-10-18 4104.3300 4018.4600 4104.3300 4013.6500 5.978391e+09 4.038261e+10 -0.014556 -0.645559 -0.463119 119.7300 ... -374.635368 -217.357028 36.282002 26.914567 25.751365 4.038261e+06 4.661717e+06 4.737848e+06 44.393792 56.040237
2018-10-19 3958.4500 4127.9900 4130.9300 3948.5600 6.956158e+09 4.993703e+10 -0.005415 -0.617363 -0.506541 182.3700 ... -424.548968 -242.711368 37.777959 27.247880 26.208203 4.993703e+06 4.526445e+06 4.777831e+06 50.752837 69.198765
2018-10-22 4165.7600 4333.1400 4375.6000 4165.7600 9.295832e+09 6.951100e+10 0.010651 -0.392518 -0.526121 247.6100 ... -443.951768 -270.827608 55.956711 27.581421 26.641633 6.951100e+06 5.019685e+06 4.977993e+06 64.027530 81.750285
2018-10-23 4336.4500 4253.5600 4350.7300 4228.3300 7.466806e+09 5.444839e+10 0.013049 -0.204236 -0.502618 122.4000 ... -470.650968 -303.096988 57.432321 27.747775 27.013117 5.444839e+06 5.228335e+06 5.110037e+06 58.471927 74.170288
2018-10-24 4233.8400 4245.4900 4297.3700 4207.1600 6.282348e+09 4.426471e+10 0.013653 -0.252139 -0.478234 90.2100 ... -494.863368 -338.600908 58.702171 27.705536 27.287683 4.426471e+06 5.170875e+06 5.157273e+06 62.230431 81.798145
2018-10-25 4124.1900 4234.4200 4240.5800 4103.5600 6.968721e+09 4.837184e+10 0.002957 -0.228771 -0.445539 141.9300 ... -489.240368 -371.250308 56.672138 27.677517 27.479116 4.837184e+06 5.330659e+06 4.996188e+06 56.982746 88.190540
2018-10-26 4260.9100 4233.9600 4294.5600 4217.7900 6.798362e+09 4.682410e+10 -0.005137 -0.023063 -0.373904 76.7700 ... -478.232168 -400.235368 61.913304 27.553745 27.585646 4.682410e+06 5.268401e+06 4.897423e+06 58.693241 86.840421
2018-10-29 4220.6400 4162.5300 4227.6400 4144.8100 5.745157e+09 4.127810e+10 -0.004651 -0.095220 -0.307358 89.1500 ... -464.534568 -423.890248 57.950275 27.438440 27.617406 4.127810e+06 4.703743e+06 4.861714e+06 54.140324 85.357584
2018-10-30 4145.4000 4204.5400 4233.7900 4096.1400 7.283465e+09 5.230816e+10 -0.003658 -0.235358 -0.256083 137.6500 ... -437.033768 -440.037288 57.724272 27.358520 27.580256 5.230816e+06 4.660938e+06 4.944636e+06 57.217529 91.302167
2018-10-31 4207.7600 4272.5500 4298.2400 4204.3600 7.468776e+09 5.324331e+10 0.001096 -0.206124 -0.204678 93.8800 ... -407.810168 -448.550148 67.485252 27.255069 27.498138 5.324331e+06 4.840510e+06 5.005692e+06 64.763499 101.728797
2018-11-01 4299.4600 4298.9800 4367.7300 4294.8700 8.605914e+09 6.500469e+10 0.005584 -0.184435 -0.178668 95.1800 ... -367.081568 -447.794768 71.161960 27.045829 27.388187 6.500469e+06 5.173167e+06 5.251913e+06 66.104624 99.680511

689 rows × 40 columns

Data processing

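def winsorize_and_standarlize(data,qrange=[0.05,0.95],axis=0):
    '''
    Winsorize (replace extreme values with quantiles), then standardize to z-scores.
    input:
    data: DataFrame or Series, input data
    qrange: list, qrange[0] is the lower quantile and qrange[1] the upper quantile;
            values beyond them are replaced by the quantile values
    axis: 0 treats each column as one variable, 1 treats each row as one variable
    Note: with a DataFrame and axis=0 the clipped values are written back into the
    input, so the caller's frame is winsorized (but not standardized) after the call.
    Pass data.copy() to leave the original untouched.
    '''
    if isinstance(data,pd.DataFrame):
        if axis == 0:
            q_down = data.quantile(qrange[0])
            q_up = data.quantile(qrange[1])
            for n in data.columns:
                # assign the whole column back instead of using chained indexing
                data[n] = data[n].clip(lower=q_down[n],upper=q_up[n])
            data = (data - data.mean())/data.std()
            data = data.fillna(0)
        else:
            data = data.T  # rows become columns, so each original row is one variable
            q_down = data.quantile(qrange[0])
            q_up = data.quantile(qrange[1])
            data = data.clip(lower=q_down,upper=q_up,axis=1)
            data = (data - data.mean())/data.std()
            data = data.T
            data = data.fillna(0)
            
    elif isinstance(data,pd.Series):
        q_down = data.quantile(qrange[0])
        q_up = data.quantile(qrange[1])
        data = data.clip(lower=q_down,upper=q_up)
        data = (data - data.mean())/data.std()
    return data
datax_new = winsorize_and_standarlize(data_x)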
/opt/conda/envs/python3new/lib/python3.6/site-packages/pandas/core/generic.py:5233: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._update_inplace(new_data)
/opt/conda/envs/python3new/lib/python3.6/site-packages/ipykernel_launcher.py:37: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
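To make the two steps above concrete, here is a minimal sketch that winsorizes and then standardizes a made-up series with a single extreme value. The data and the helper name `winsorize_zscore` are hypothetical and not part of the original notebook; the 5%/95% quantiles mirror the default `qrange`.

```python
import numpy as np
import pandas as pd

def winsorize_zscore(s, lower=0.05, upper=0.95):
    """Clip a Series to its [lower, upper] quantiles, then z-score it."""
    lo, hi = s.quantile(lower), s.quantile(upper)
    clipped = s.clip(lower=lo, upper=hi)  # returns a new Series; the input is untouched
    return (clipped - clipped.mean()) / clipped.std()

# Hypothetical factor values: 0..98 plus one extreme observation
raw = pd.Series(np.arange(100, dtype=float))
raw.iloc[-1] = 1e6

out = winsorize_zscore(raw)

print("raw max:", raw.max())
print("z-score range after winsorizing:", out.min(), out.max())
```

Without the clipping step, the single outlier would dominate the mean and standard deviation, and almost every other z-score would collapse toward the same value.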
from sklearn.decomposition import PCA

# PCA dimensionality reduction
def pca_analysis(data,n_components='mle'):
    index = data.index
    model = PCA(n_components=n_components)
    model.fit(data)
    data_pca = model.transform(data)
    df = pd.DataFrame(data_pca,index=index)
    return df
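
For reference, a minimal sketch of how a PCA step of this kind behaves on standardized, strongly correlated inputs. The synthetic data, the 0.95 explained-variance target, and all variable names are assumptions for illustration; the post itself does not apply PCA to its data.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# Synthetic "factors": 3 independent signals plus 3 noisy linear mixes of them
base = rng.normal(size=(200, 3))
mixed = base @ rng.normal(size=(3, 3)) + 0.01 * rng.normal(size=(200, 3))
factors = pd.DataFrame(np.hstack([base, mixed]),
                       columns=['f%d' % i for i in range(6)])
factors = (factors - factors.mean()) / factors.std()

# A float n_components in (0, 1) keeps just enough components
# to explain that share of the total variance
model = PCA(n_components=0.95)
reduced = pd.DataFrame(model.fit_transform(factors), index=factors.index)

print(reduced.shape)
print(model.explained_variance_ratio_.round(3))
```

Because the six columns are built from only three underlying signals, the reduced frame needs at most three components here.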

Feature selection

class FeatureSelection():
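    '''
    Feature selection:
    identify_collinear: based on correlation coefficients; for each pair of features whose absolute correlation exceeds correlation_threshold, one of the pair is dropped
    identify_importance_lgbm: uses LightGBM to obtain feature importances and keeps the top-ranked features whose cumulative normalized importance stays within p_importance
    filter_select: univariate selection; if k is given, SelectKBest keeps the top k features scored by method, otherwise SelectPercentile keeps the top p percent
    wrapper_select: RFE, recursive feature elimination based on estimator, keeping n_features_to_select features
    embedded_select: SelectFromModel, keeps the features whose importance in the fitted estimator reaches threshold
    '''
    def __init__(self):
        self.supports = None # boolean mask: whether each feature is selected
        self.columns = None  # names of the selected features
        self.record_collinear = None # feature pairs whose correlation exceeds the threshold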
        
    def identify_collinear(self, data, correlation_threshold):
        """
        Finds collinear features based on the correlation coefficient between features. 
        For each pair of features with a correlation coefficient greater than `correlation_threshold`,
        only one of the pair is identified for removal. 

        Using code adapted from: https://gist.github.com/Swarchal/e29a3a1113403710b6850590641f046c
        
        Parameters
        --------

        data : dataframe
            Data observations in the rows and features in the columns

        correlation_threshold : float between 0 and 1
            Value of the Pearson correlation coefficient above which a pair of features is treated as collinear

        """
        columns = data.columns
        self.correlation_threshold = correlation_threshold

        # Calculate the correlations between every column
        corr_matrix = data.corr()
        
        self.corr_matrix = corr_matrix
    
        # Extract the upper triangle of the correlation matrix
        upper = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k = 1).astype(bool))
        # Select the features with correlations above the threshold
        # Need to use the absolute value
        to_drop = [column for column in upper.columns if any(upper[column].abs() > correlation_threshold)]
        obtain_columns = [column for column in columns if column not in to_drop]
        self.columns = obtain_columns
        # Dataframe to hold correlated pairs
        record_collinear = pd.DataFrame(columns = ['drop_feature', 'corr_feature', 'corr_value'])

        # Iterate through the columns to drop
        for column in to_drop:

            # Find the correlated features
            corr_features = list(upper.index[upper[column].abs() > correlation_threshold])

            # Find the correlated values
            corr_values = list(upper[column][upper[column].abs() > correlation_threshold])
            drop_features = [column for _ in range(len(corr_features))]    

            # Record the information (need a temp df for now)
            temp_df = pd.DataFrame.from_dict({'drop_feature': drop_features,
                                             'corr_feature': corr_features,
                                             'corr_value': corr_values})

            # Add to dataframe
            record_collinear = pd.concat([record_collinear, temp_df], ignore_index = True)

        self.record_collinear = record_collinear
        return data[obtain_columns]
     
        
    def identify_importance_lgbm(self, features, labels,p_importance=0.8, eval_metric='auc', task='classification', 
                                 n_iterations=10, early_stopping = True):
        """
        
        Rank features by importance according to a gradient boosting machine and keep the
        top-ranked ones whose cumulative normalized importance stays within `p_importance`.
        When `early_stopping` is True, a validation set is split off and passed as `eval_set`.
        The feature importances are averaged over n_iterations to reduce variance. 
        
        Uses the LightGBM implementation (http://lightgbm.readthedocs.io/en/latest/index.html)

        Parameters 
        --------
        features : dataframe
            Data for training the model with observations in the rows
            and features in the columns

        labels : array, shape = (1, )
            Array of labels for training the model. These can be either binary 
            (if task is 'classification') or continuous (if task is 'regression')
            
        p_importance : float in [0, 1], default = 0.8
            Features are ranked by importance and kept while their cumulative
            normalized importance is at most this value

        eval_metric : string
            Evaluation metric to use for the gradient boosting machine

        task : string, default = 'classification'
            The machine learning task, either 'classification' or 'regression'

        n_iterations : int, default = 10
            Number of iterations to train the gradient boosting machine
            
        early_stopping : boolean, default = True
            Whether or not to use early stopping with a validation set when training
        
        
        Notes
        --------
        
        - Features are one-hot encoded to handle the categorical variables before training.
        - The gbm is not optimized for any particular task and might need some hyperparameter tuning
        - Feature importances, including zero importance features, can change across runs

        """

        # One hot encoding
        data = features
        features = pd.get_dummies(features)

        # Extract feature names
        feature_names = list(features.columns)

        # Convert to np array
        features = np.array(features)
        labels = np.array(labels).reshape((-1, ))

        # Empty array for feature importances
        feature_importance_values = np.zeros(len(feature_names))
        
        print('Training Gradient Boosting Model\n')
        
        # Iterate through each fold
        for _ in range(n_iterations):

            if task == 'classification':
                model = lgb.LGBMClassifier(n_estimators=100, learning_rate = 0.05, verbose = -1)

            elif task == 'regression':
                model = lgb.LGBMRegressor(n_estimators=100, learning_rate = 0.05, verbose = -1)

            else:
                raise ValueError('Task must be either "classification" or "regression"')
                
            # If training using early stopping need a validation set
            if early_stopping:
                
                train_features, valid_features, train_labels, valid_labels = train_test_split(features, labels, test_size = 0.15)

                # Train the model, evaluating on the validation set
                # (no early_stopping_rounds is passed, so all n_estimators rounds are trained)
                model.fit(train_features, train_labels, eval_metric = eval_metric,
                          eval_set = [(valid_features, valid_labels)],
                           verbose = -1)
                
                # Clean up memory
                gc.enable()
                del train_features, train_labels, valid_features, valid_labels
                gc.collect()
                
            else:
                model.fit(features, labels)

            # Record the feature importances
            feature_importance_values += model.feature_importances_ / n_iterations

        feature_importances = pd.DataFrame({'feature': feature_names, 'importance': feature_importance_values})

        # Sort features according to importance
        feature_importances = feature_importances.sort_values('importance', ascending = False).reset_index(drop = True)

        # Normalize the feature importances to add up to one
        feature_importances['normalized_importance'] = feature_importances['importance'] / feature_importances['importance'].sum()
        feature_importances['cumulative_importance'] = np.cumsum(feature_importances['normalized_importance'])
        select_df = feature_importances[feature_importances['cumulative_importance']<=p_importance]
        select_columns = select_df['feature']
        self.columns = list(select_columns.values)
        res = data[self.columns]
        return res
        
    def filter_select(self, data_x, data_y, k=None, p=50,method=f_classif):
        columns = data_x.columns
        if k is not None:
            # keep the k highest-scoring features
            model = SelectKBest(score_func=method,k=k)
        else:
            # keep the top p percent of features
            model = SelectPercentile(score_func=method,percentile=p)
        res = model.fit_transform(data_x,data_y)
        supports = model.get_support()
        self.supports = supports
        self.columns = columns[supports]
        return res
    
    def wrapper_select(self,data_x,data_y,n,estimator):
        columns = data_x.columns
        model = RFE(estimator=estimator,n_features_to_select=n)
        res = model.fit_transform(data_x,data_y)
        supports = model.get_support() # boolean mask marking the positions of the selected features in the original data
        self.supports = supports
        self.columns = columns[supports]
        return res
    
    def embedded_select(self,data_x,data_y,estimator,threshold=None):
        '''
        threshold : string, float, optional default None
        The threshold value to use for feature selection. Features whose importance is greater or
        equal are kept while the others are discarded. If “median” (resp. “mean”), then the 
        threshold value is the median (resp. the mean) of the feature importances. 
        A scaling factor (e.g., “1.25*mean”) may also be used. If None and if the estimator
        has a parameter penalty set to l1, either explicitly or implicitly (e.g, Lasso),
        the threshold used is 1e-5. Otherwise, “mean” is used by default.
        '''
        columns = data_x.columns
        model = SelectFromModel(estimator=estimator,prefit=False,threshold=threshold)
        res = model.fit_transform(data_x,data_y)
        supports = model.get_support()
        self.supports = supports
        self.columns = columns[supports]
        return res
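
The upper-triangle approach used by `identify_collinear` is easier to follow on a toy frame. The sketch below uses made-up columns and a hypothetical threshold of 0.9; it reproduces only the core idea (inspect each pair once, drop the later column of a highly correlated pair) and not the bookkeeping in `record_collinear`.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(42)
a = rng.normal(size=300)
toy = pd.DataFrame({
    'a': a,
    'a_copy': a + 0.01 * rng.normal(size=300),  # almost identical to 'a'
    'b': rng.normal(size=300),                  # unrelated to the others
})

corr = toy.corr()
# Keep only the entries above the diagonal so each pair is inspected once
mask = np.triu(np.ones(corr.shape), k=1).astype(bool)
upper = corr.where(mask)

threshold = 0.9
to_drop = [c for c in upper.columns if (upper[c].abs() > threshold).any()]
kept = [c for c in toy.columns if c not in to_drop]

print("drop:", to_drop)
print("keep:", kept)
```

Since only the upper triangle is scanned column by column, the column that appears later in the frame is the one removed from each correlated pair.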
f = FeatureSelection()

lgbm_res = f.identify_importance_lgbm(data_x,data_y)
print(f.columns)
print(lgbm_res)
Training Gradient Boosting Model

['MAWR', 'BIAS', 'ACCER', 'MTR', 'CYE', 'KBJ', 'RSI', 'WR', 'CCI', 'MASS', 'ADTM', 'open', 'MAMASS', 'ATR', 'volume', 'money', 'CHO']
                 MAWR      BIAS     ACCER       MTR       CYE        KBJ  \
2016-01-05  78.338711 -3.653359 -0.014579  244.0453 -1.265312  25.995501   
2016-01-06  61.434997 -3.589190 -0.014579  187.4310 -1.265312  26.969961   
2016-01-07  96.373261 -3.653359 -0.014579  244.0453 -1.265312  20.602534   
2016-01-08  77.423358 -3.653359 -0.014579  244.0453 -1.265312  19.244427   
2016-01-11  96.373261 -3.653359 -0.014579  244.0453 -1.265312  16.128881   
2016-01-12  86.744399 -3.653359 -0.014579  210.9550 -1.265312  16.128881   
2016-01-13  96.373261 -3.653359 -0.014579  244.0453 -1.265312  16.128881   
2016-01-14  67.127125 -1.601040 -0.014579  244.0453 -1.080721  16.128881   
2016-01-15  82.665875 -3.653359 -0.008175  244.0453 -1.265312  16.128881   
2016-01-18  68.614464 -0.865861 -0.005785  244.0453 -0.495727  16.128881   
2016-01-19   5.893291  2.081724  0.006458  244.0453  0.201050  20.389337   
2016-01-20  20.548655  1.583050  0.005389  149.2396  0.712050  27.703703   
2016-01-21  69.116535 -2.480929  0.002480  244.0453 -0.784414  27.587728   
2016-01-22  55.094126 -0.532257 -0.005130  214.2589  0.288848  32.905520   
2016-01-25  42.440301  0.067996 -0.007770  116.9330  0.206807  41.182122   
2016-01-26  96.373261 -3.653359 -0.014579  244.0453 -1.265312  33.532389   
2016-01-27  72.763250 -3.653359 -0.014579  244.0453 -1.265312  30.046047   
2016-01-28  96.373261 -3.653359 -0.014579  244.0453 -1.265312  22.424333   
2016-01-29  73.202747 -3.103540 -0.014579  244.0453 -1.265312  20.600271   
2016-02-01  81.371452 -2.667277 -0.007866  174.2811 -1.265312  18.947910   
2016-02-02  53.362645  2.081724  0.004889  194.8836 -0.009838  22.896170   
2016-02-03   6.276446  2.081724  0.008691  152.6951  0.352359  28.994213   
2016-02-04   4.545766  2.081724  0.008691  133.8502  0.770239  37.368993   
2016-02-05  16.764359  1.596653  0.008691   85.5429  0.702028  44.393854   
2016-02-15  18.665322  1.064605  0.003826  217.7264  0.770239  56.112406   
2016-02-16   3.143142  2.081724  0.008691  244.0453  0.770239  69.415061   
2016-02-17   1.608061  2.081724  0.008691  125.1876  0.770239  80.113602   
2016-02-18  14.078530  2.081724  0.008691   89.6479  0.770239  85.042880   
2016-02-19  10.023527  2.081724  0.008691   90.0408  0.770239  87.808769   
2016-02-22   1.266890  2.081724  0.006725  132.1735  0.770239  91.231612   
...               ...       ...       ...       ...       ...        ...   
2018-09-13  68.914846 -0.284532 -0.003221   97.6900 -0.304557  16.128881   
2018-09-14  92.996420 -0.934565 -0.001098   54.0600 -0.515356  16.128881   
2018-09-17  92.262507 -1.712343 -0.004887   83.7600 -0.436257  16.128881   
2018-09-18  41.188342  0.096174 -0.002984   99.8600 -0.082897  16.717529   
2018-09-19  19.127944  1.114592  0.000946  108.8200  0.183645  27.302842   
2018-09-20  20.936771  0.899971  0.005520   38.1100  0.052404  40.903068   
2018-09-21   3.526173  2.081724  0.008691   91.6000  0.576872  58.195815   
2018-09-25   9.062309  1.415178  0.005976   47.1300  0.770239  70.704470   
2018-09-26  10.133323  1.142916  0.004497   59.3100  0.556256  78.622184   
2018-09-27  48.611815 -0.373161  0.001140   63.3400  0.077417  77.557991   
2018-09-28  34.138822  0.353238 -0.000961   55.9800  0.281354  78.921221   
2018-10-08  96.373261 -2.575692 -0.006608  161.3300 -0.662216  65.562352   
2018-10-09  90.286388 -2.237001 -0.010148   72.5800 -0.662137  48.260449   
2018-10-10  83.124681 -1.780822 -0.009003   68.6600 -0.785266  36.088690   
2018-10-11  93.973329 -3.653359 -0.014579  244.0453 -1.265312  25.466259   
2018-10-12  78.752273 -3.653359 -0.014579  170.7900 -1.265312  21.458130   
2018-10-15  89.591449 -3.653359 -0.014579  121.8800 -1.265312  17.308367   
2018-10-16  96.250514 -3.653359 -0.014579  144.4700 -1.265312  16.128881   
2018-10-17  84.612642 -3.303550 -0.012736  129.3400 -1.265312  16.128881   
2018-10-18  96.373261 -3.653359 -0.014556  119.7300 -1.265312  16.128881   
2018-10-19  51.393742 -0.357087 -0.005415  182.3700 -0.723875  16.128881   
2018-10-22   9.942862  2.081724  0.008691  244.0453  0.651111  23.176810   
2018-10-23  28.578119  2.081724  0.008691  122.4000  0.742928  33.999893   
2018-10-24  30.467872  1.437240  0.008691   90.2100  0.537272  45.588723   
2018-10-25  33.060135  0.767301  0.002957  141.9300  0.770239  53.823716   
2018-10-26  33.167853 -0.097528 -0.005137   76.7700  0.499986  59.134571   
2018-10-29  78.323041 -1.916185 -0.004651   89.1500 -0.800964  58.204987   
2018-10-30  57.421737 -0.423375 -0.003658  137.6500 -0.231993  58.407060   
2018-10-31  12.711529  1.111524  0.001096   93.8800  0.128362  62.823006   
2018-11-01  25.313892  1.522810  0.005584   95.1800  0.305856  66.366949   

                  RSI         WR         CCI       MASS      ADTM       open  \
2016-01-05  18.903721  79.093178 -186.121428  24.405686  0.103386  6617.1761   
2016-01-06  30.910499  65.064894 -158.915266  24.740819  0.263709  6617.1761   
2016-01-07  18.903721  95.045375 -176.589372  25.155129  0.208112  6617.1761   
2016-01-08  24.735370  78.462475 -147.819294  25.634080 -0.051259  6617.1761   
2016-01-11  18.903721  95.045375 -142.946447  26.073200 -0.240532  6435.8940   
2016-01-12  18.903721  91.969796 -132.997057  26.443142 -0.277086  6126.6700   
2016-01-13  18.903721  95.045375 -119.558891  26.774535 -0.280338  6168.9870   
2016-01-14  28.535727  80.458925 -104.787337  27.101585 -0.255879  5734.6080   
2016-01-15  23.418016  91.208232  -94.795975  27.425037 -0.048379  6078.6150   
2016-01-18  29.048292  81.180987  -86.669834  27.730879 -0.181086  5768.7577   
2016-01-19  42.120968  64.888665  -57.527264  27.730879 -0.085336  5965.3551   
2016-01-20  39.760280  65.996788  -44.887638  27.730879 -0.056109  6161.2319   
2016-01-21  30.850088  83.372391  -65.671969  27.730879 -0.203549  6031.1695   
2016-01-22  37.146213  68.402321  -73.744777  27.730879 -0.262440  5954.3643   
2016-01-25  41.300762  39.305862  -43.544486  27.730879 -0.278809  6034.0844   
2016-01-26  25.966830  95.045375 -149.960439  27.730879 -0.417006  5966.7049   
2016-01-27  24.176785  72.763250 -186.121428  27.730879 -0.469818  5604.1667   
2016-01-28  19.073601  95.045375 -186.121428  27.730879 -0.500799  5441.7378   
2016-01-29  32.914253  77.250207 -138.482280  27.730879 -0.504185  5262.4596   
2016-02-01  30.696959  84.185109 -110.057693  27.730879 -0.444234  5458.6048   
2016-02-02  43.064463  65.550576  -64.709546  27.730879 -0.480827  5407.7068   
2016-02-03  44.558259  61.129827  -45.422665  27.730879 -0.458559  5520.6415   
2016-02-04  51.740809  43.604704   -9.993732  27.730879 -0.245624  5631.3198   
2016-02-05  47.907693  50.323754   -7.885303  27.730879 -0.183561  5734.4807   
2016-02-15  48.216483  42.388350  -21.501290  27.619241 -0.304867  5485.7230   
2016-02-16  63.890686   2.368782   65.064572  26.835981 -0.208391  5694.7487   
2016-02-17  67.371797   1.093420  103.822963  26.102972  0.081631  5889.5021   
2016-02-18  65.793588   9.880213  116.760709  25.232855  0.292758  6009.7485   
2016-02-19  67.237167   7.737595  118.193982  24.312137  0.349324  5939.6973   
2016-02-22  74.423715   1.131698  131.103834  23.469802  0.367970  6055.6751   
...               ...        ...         ...        ...       ...        ...   
2018-09-13  33.637028  73.579921 -106.862058  23.967377  0.011175  4803.9160   
2018-09-14  25.771144  94.047476 -110.417843  23.802394 -0.162627  4803.9160   
2018-09-17  18.903721  94.957640 -145.284757  23.604222 -0.104195  4803.9160   
2018-09-18  41.556399  67.174662  -97.722838  23.562772 -0.046867  4803.9160   
2018-09-19  52.148974  46.146238  -24.527022  23.667366  0.042052  4803.9160   
2018-09-20  51.434355  47.350759   -3.961792  23.656107  0.216915  4803.9160   
2018-09-21  63.917286   3.526173   45.710485  23.656792  0.216408  4803.9160   
2018-09-25  60.615366   9.062309   71.915944  23.694994  0.204063  4803.9160   
2018-09-26  64.611590  10.133323  111.400793  23.731866  0.273951  4803.9160   
2018-09-27  49.027941  33.552299   65.320924  23.831006  0.265048  4812.6400   
2018-09-28  58.145910  15.924904   81.175132  23.900306  0.105008  4803.9160   
2018-10-08  33.608914  75.877483  -70.958232  24.047857 -0.209880  4803.9160   
2018-10-09  32.375905  80.386365  -98.655856  24.074624 -0.269989  4803.9160   
2018-10-10  31.772980  83.124681 -105.865570  24.133435 -0.235881  4803.9160   
2018-10-11  18.903721  94.229277 -186.121428  24.506854 -0.392253  4803.9160   
2018-10-12  18.903721  79.943835 -186.121428  24.980842 -0.447213  4803.9160   
2018-10-15  18.903721  91.132795 -172.803014  25.518870 -0.490331  4803.9160   
2018-10-16  18.903721  95.045375 -153.141503  26.058143 -0.504185  4803.9160   
2018-10-17  18.903721  87.813142 -122.271966  26.528915 -0.504185  4803.9160   
2018-10-18  18.903721  95.045375 -113.257550  26.914567 -0.504185  4803.9160   
2018-10-19  30.198127  77.250770  -91.039403  27.247880 -0.504185  4803.9160   
2018-10-22  52.654170  47.794098  -31.437480  27.581421 -0.392518  4803.9160   
2018-10-23  45.796012  56.859362  -28.459389  27.730879 -0.204236  4803.9160   
2018-10-24  45.081477  46.464373  -29.723522  27.705536 -0.252139  4803.9160   
2018-10-25  43.952627  33.060135  -43.311376  27.677517 -0.228771  4803.9160   
2018-10-26  43.897815  33.167853   -9.379660  27.553745 -0.023063  4803.9160   
2018-10-29  35.620466  49.894623  -38.266562  27.438440 -0.095220  4803.9160   
2018-10-30  43.181657  40.057138  -21.493525  27.358520 -0.235358  4803.9160   
2018-10-31  53.737102  24.131229   67.666944  27.255069 -0.206124  4803.9160   
2018-11-01  57.425542  17.942113  115.840314  27.045829 -0.184435  4803.9160   

               MAMASS         ATR        volume         money           CHO  
2016-01-05  23.756116  180.783071  1.091625e+10  1.322633e+11  6.065459e+05  
2016-01-06  23.977806  188.709000  1.009304e+10  1.322633e+11  3.236381e+05  
2016-01-07  24.254879  204.201451  4.925074e+09  4.728852e+10  5.836638e+04  
2016-01-08  24.604707  204.201451  1.091625e+10  1.322633e+11 -1.813213e+05  
2016-01-11  25.007304  204.201451  1.091625e+10  1.322633e+11 -6.299919e+05  
2016-01-12  25.408676  204.201451  9.190522e+09  1.057272e+11 -6.304158e+05  
2016-01-13  25.803484  204.201451  8.612340e+09  9.845670e+10 -6.304158e+05  
2016-01-14  26.196945  204.201451  9.225145e+09  1.042551e+11 -6.304158e+05  
2016-01-15  26.575263  204.201451  8.863054e+09  1.022773e+11 -6.304158e+05  
2016-01-18  26.937150  204.201451  7.246979e+09  8.698031e+10 -6.304158e+05  
2016-01-19  27.275037  204.201451  8.798102e+09  1.061855e+11 -6.304158e+05  
2016-01-20  27.587834  204.201451  9.194752e+09  1.133945e+11 -6.304158e+05  
2016-01-21  27.605577  204.201451  8.188276e+09  1.005553e+11 -6.304158e+05  
2016-01-22  27.605577  204.201451  6.983071e+09  8.442937e+10 -6.304158e+05  
2016-01-25  27.605577  204.201451  6.530851e+09  7.867768e+10 -6.304158e+05  
2016-01-26  27.605577  204.201451  8.544392e+09  9.810410e+10 -6.304158e+05  
2016-01-27  27.605577  204.201451  8.792412e+09  9.535493e+10 -5.435846e+05  
2016-01-28  27.605577  204.201451  6.733808e+09  7.315274e+10 -5.641900e+05  
2016-01-29  27.605577  204.201451  7.235909e+09  7.830371e+10 -4.158293e+05  
2016-02-01  27.605577  204.201451  6.471611e+09  7.086333e+10 -5.297102e+05  
2016-02-02  27.605577  204.201451  6.657196e+09  7.567418e+10 -6.304158e+05  
2016-02-03  27.605577  204.201451  6.328073e+09  7.062492e+10 -6.304158e+05  
2016-02-04  27.605577  204.201451  7.108398e+09  8.322622e+10 -5.949604e+05  
2016-02-05  27.605577  204.201451  5.607554e+09  6.532197e+10 -4.587986e+05  
2016-02-15  27.605577  204.201451  5.363125e+09  6.335130e+10 -4.760784e+05  
2016-02-16  27.605577  204.201451  8.287714e+09  9.853490e+10 -1.181450e+05  
2016-02-17  27.605577  204.201451  9.058252e+09  1.078570e+11  4.531126e+04  
2016-02-18  27.074097  204.201451  9.371480e+09  1.101097e+11  4.740938e+05  
2016-02-19  26.391424  204.201451  7.038050e+09  8.684912e+10  7.202841e+05  
2016-02-22  25.595498  182.750086  9.432052e+09  1.108016e+11  1.044428e+06  
...               ...         ...           ...           ...           ...  
2018-09-13  24.337838   76.595714  5.569210e+09  4.728852e+10 -1.301020e+05  
2018-09-14  24.204614   72.445714  5.644793e+09  4.728852e+10 -2.033731e+05  
2018-09-17  24.056001   76.302143  4.933895e+09  4.728852e+10 -3.048443e+05  
2018-09-18  23.907221   80.238571  5.443233e+09  4.728852e+10 -3.900431e+05  
2018-09-19  23.795992   81.472143  6.705586e+09  5.567138e+10 -4.271360e+05  
2018-09-20  23.710040   79.772143  5.639897e+09  4.728852e+10 -4.472038e+05  
2018-09-21  23.658276   80.219286  6.589520e+09  5.250771e+10 -4.188528e+05  
2018-09-25  23.640376   78.121429  5.265424e+09  4.728852e+10 -3.123108e+05  
2018-09-26  23.661650   76.977857  6.471956e+09  5.170976e+10 -1.995860e+05  
2018-09-27  23.706355   76.123571  6.143767e+09  4.728852e+10 -9.968492e+04  
2018-09-28  23.745179   72.405000  5.726864e+09  4.728852e+10 -2.608133e+04  
2018-10-08  23.810470   77.305714  6.327566e+09  4.949483e+10  4.215561e+04  
2018-10-09  23.880109   77.572857  5.043745e+09  4.728852e+10  1.424186e+05  
2018-10-10  23.953182   78.730714  5.133964e+09  4.728852e+10  2.045088e+05  
2018-10-11  24.082347   97.123571  9.056360e+09  6.448032e+10  1.349822e+05  
2018-10-12  24.273986  105.461429  8.152515e+09  5.670063e+10  1.250535e+05  
2018-10-15  24.543747  108.184286  6.229185e+09  4.728852e+10  8.717362e+03  
2018-10-16  24.878795  111.370714  6.107666e+09  4.728852e+10 -1.734683e+05  
2018-10-17  25.287843  112.836429  6.952388e+09  4.728852e+10 -3.353128e+05  
2018-10-18  25.751365  118.666429  5.978391e+09  4.728852e+10 -4.938103e+05  
2018-10-19  26.208203  125.150000  6.956158e+09  4.993703e+10 -5.980369e+05  
2018-10-22  26.641633  139.470000  9.295832e+09  6.951100e+10 -5.939554e+05  
2018-10-23  27.013117  143.976429  7.466806e+09  5.444839e+10 -6.143589e+05  
2018-10-24  27.287683  145.895714  6.282348e+09  4.728852e+10 -6.205348e+05  
2018-10-25  27.479116  152.035000  6.968721e+09  4.837184e+10 -3.933668e+05  
2018-10-26  27.585646  145.995000  6.798362e+09  4.728852e+10 -3.046357e+05  
2018-10-29  27.605577  147.178571  5.745157e+09  4.728852e+10 -1.217705e+05  
2018-10-30  27.580256  152.106429  7.283465e+09  5.230816e+10  1.716301e+05  
2018-10-31  27.498138  133.441429  7.468776e+09  5.324331e+10  4.347835e+05  
2018-11-01  27.388187  128.040714  8.605914e+09  6.500469e+10  7.059026e+05  

[689 rows x 17 columns]
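
How `p_importance` turns averaged importances into a feature list can be traced on a few numbers. The importances below are hypothetical, not the values from the run above; the steps follow the sort, normalize, and cumulative-sum logic at the end of `identify_importance_lgbm`.

```python
import numpy as np
import pandas as pd

# Hypothetical averaged importances for five features
imp = pd.DataFrame({
    'feature': ['f1', 'f2', 'f3', 'f4', 'f5'],
    'importance': [40.0, 30.0, 15.0, 10.0, 5.0],
})

imp = imp.sort_values('importance', ascending=False).reset_index(drop=True)
imp['normalized_importance'] = imp['importance'] / imp['importance'].sum()
imp['cumulative_importance'] = np.cumsum(imp['normalized_importance'])

p_importance = 0.8
selected = list(imp.loc[imp['cumulative_importance'] <= p_importance, 'feature'])

print(imp)
print(selected)
```

Note that the feature whose importance pushes the cumulative sum past `p_importance` is excluded, so the selected set can cover noticeably less than the requested share.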
estimator = LinearSVC()
res = f.wrapper_select(data_x=data_x,data_y=data_y,n=5,estimator=estimator)
print(f.columns)
Index(['volume', 'money', 'CHO', 'AMO1', 'AMO2'], dtype='object')
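
RFE with a linear estimator ranks features by the magnitude of the fitted coefficients, which depends on feature scale, and the call above is run on `data_x` rather than the standardized `datax_new`. As a sanity check of the mechanism itself, the sketch below runs RFE on synthetic data where only two features carry signal. The data, column names, and estimator settings are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X = pd.DataFrame(rng.normal(size=(400, 6)),
                 columns=['x%d' % i for i in range(6)])
# Only x0 and x1 drive the label; x2..x5 are pure noise
y = (X['x0'] + X['x1'] > 0).astype(int)

# Each round fits the estimator and discards the feature with the smallest coefficient
rfe = RFE(estimator=LinearSVC(dual=False, random_state=0), n_features_to_select=2)
rfe.fit(X, y)

selected = list(X.columns[rfe.get_support()])
print(selected)
print(rfe.ranking_)  # 1 marks a selected feature; larger numbers were eliminated earlier
```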
est = LinearSVC(C=0.01,penalty='l1',dual=False)
est1 = RandomForestClassifier()
e_res = f.embedded_select(data_x=data_x,data_y=data_y,estimator=est1)
print(f.columns)
Index(['ACCER', 'BIAS', 'CCI', 'KBJ', 'CYE', 'ROC', 'RSI', 'RSI6', 'WR',
       'MAWR'],
      dtype='object')
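
The effect of the `threshold` argument in `embedded_select` can be seen by comparing "mean" with "median" on synthetic data. This is a sketch under stated assumptions: the data, the random seeds, and the forest size are made up, and only two of the six features carry signal.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

rng = np.random.RandomState(1)
X = pd.DataFrame(rng.normal(size=(500, 6)),
                 columns=['x%d' % i for i in range(6)])
y = (X['x0'] - X['x1'] > 0).astype(int)  # only x0 and x1 matter

forest = RandomForestClassifier(n_estimators=100, random_state=0)
kept = {}
for threshold in ['mean', 'median']:
    # Features whose importance is >= the threshold are kept
    sfm = SelectFromModel(estimator=forest, threshold=threshold)
    sfm.fit(X, y)
    kept[threshold] = list(X.columns[sfm.get_support()])

print(kept)
```

With "median", at least half of the features always pass regardless of how weak they are, whereas "mean" can be much stricter when a few features dominate the importances.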
