
Multi-Factor Model (Part 2): Factor Testing

Posted by 美联储 on May 10, 16:56 · Replies (1)

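A few days ago I posted Multi-Factor Model (Part 1): Factor Generation (written for JoinQuant's new multi-factor features).
This week, as planned, I reorganized the second part, Multi-Factor Model (Part 2): Factor Testing. Let's get started.
Step 1: load the factors generated in the previous post. If this is the first post of the series you have seen, please read Multi-Factor Model (Part 1): Factor Generation first.
Step 2: run a series of calculations on each factor to decide which factors are effective. This covers the following:

  1. The absolute value of the mean IC, and the ICIR. The IC (information coefficient) is the correlation, at each cross-section in time, between factor exposure and next-period return. For example, take each stock's ROE factor on March 1 and its return from March 1 to April 1; the IC is the correlation between these two sets of numbers (the code below uses the Spearman rank correlation). The ICIR is the mean IC divided by the standard deviation of the IC. In theory, the larger the absolute mean IC, the more significant the factor, and the same holds for the absolute ICIR.
  2. Group returns. For a single factor, we sort the stocks by factor value (the code sorts in ascending order, so group 0 holds the smallest values) and split them into ten portfolios of 10% each: the first 10%, then 10%-20%, 20%-30%, and so on. We then compute the return of each portfolio. There are two functions for this: one computes daily portfolio returns, the other computes portfolio returns at the rebalancing frequency. The daily version is used later to compute portfolio metrics.
  3. Compute the index return over the backtest period, so that excess returns can be calculated.
  4. Run the factor effectiveness tests:
    4.1 The correlation between group rank and group return must exceed 0.5 in absolute value.
    4.2 The winner group must clearly beat the market and the loser group must clearly trail it, each by more than 5% (annualized).
    4.3 The probability that the winner group beats the benchmark, and the probability that the loser group trails it, must each exceed 0.5.
    Note that winners and losers have to be identified per factor, because for some factors larger is better and for others smaller is better.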
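The IC and ICIR definitions in item 1 can be illustrated on synthetic data. This is only a sketch: the helper `ic_series`, the random factor, and the return model are made up for illustration and are not part of the post's pipeline.

```python
import numpy as np
import pandas as pd
from scipy import stats


def ic_series(factor, fwd_ret):
    """Spearman rank IC per date. Both inputs: rows are dates, columns are stocks."""
    ics = {}
    for date in factor.index:
        pair = pd.DataFrame({"f": factor.loc[date], "r": fwd_ret.loc[date]}).dropna()
        ic, _ = stats.spearmanr(pair["f"], pair["r"])
        ics[date] = ic
    return pd.Series(ics)


# Synthetic data (hypothetical): 12 monthly cross-sections of 200 stocks
rng = np.random.default_rng(0)
dates = pd.date_range("2018-01-01", periods=12, freq="MS")
stocks = ["S%03d" % i for i in range(200)]
factor = pd.DataFrame(rng.normal(size=(12, 200)), index=dates, columns=stocks)
noise = pd.DataFrame(rng.normal(size=(12, 200)), index=dates, columns=stocks)
fwd_ret = 0.02 * factor + 0.10 * noise  # next-period return, weakly driven by the factor

ic = ic_series(factor, fwd_ret)
ic_mean_abs = abs(ic.mean())   # |mean IC|
icir = ic.mean() / ic.std()    # signed ICIR; screen on abs(icir)
print(round(ic_mean_abs, 3), round(icir, 2))
```

Keeping the sign of the ICIR is useful later, because it tells you whether large or small factor values are the "winner" side.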
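Step 3: look at the statistics in EffectTestresult (the figure below does not show the full table).
1.png
Then look at how the mean IC and the ICIR are distributed.
2.png
3.png
After looking at these, I subjectively settled on the following screening criteria:
The absolute mean IC must exceed 0.07, the absolute ICIR must exceed 0.4, and Test 1, Test 2 (winner group), and Test 3 (winner group) must all pass.
At least three of the Test 2 and Test 3 checks must pass.
You are welcome to use other screening criteria to optimize the portfolio. You can also adjust the test thresholds in item 4 of Step 2; the scores from those tests are stored in the variable effect_test_score_dict.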
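The screening rule above can be written as boolean masks. The sketch below uses four rows copied from the result table in this post; the short column names (`T1`, `T2_win`, ...) are placeholders chosen here, and "at least three" is read as three of the four Test 2 / Test 3 columns.

```python
import pandas as pd

cols = ["IC", "ICIR", "T1", "T2_win", "T2_lose", "T3_win", "T3_lose"]
rows = {
    "EBITDA":            [0.082209, 0.470460, True, True, True, True, False],
    "OperateNetIncome":  [0.081799, 0.412676, True, True, False, True, True],
    "retained_earnings": [0.098462, 0.514553, True, True, True, True, True],
    "roe_ttm":           [0.061408, 0.361940, True, True, True, True, False],
}
res = pd.DataFrame.from_dict(rows, orient="index", columns=cols)

flags = res[["T1", "T2_win", "T2_lose", "T3_win", "T3_lose"]].astype(bool)
ic_ok = res["IC"].astype(float) > 0.07
icir_ok = res["ICIR"].astype(float).abs() > 0.4
must_pass = flags[["T1", "T2_win", "T3_win"]].all(axis=1)
n_pass = flags[["T2_win", "T2_lose", "T3_win", "T3_lose"]].sum(axis=1)

keep = ic_ok & icir_ok & must_pass & (n_pass >= 3)
print(list(res.index[keep]))
```

Under this reading OperateNetIncome passes, although it does not appear in the screened list shown in the post. The posted screening code counts only three of the four columns (`iloc[:,3:6]`), which effectively requires Test 2 winner, Test 2 loser, and Test 3 winner to all pass.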
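Screening leaves the following factors:
4.png
Many of these factors are very similar, for example vol over different windows and the various profit measures, so we need to drop some highly correlated ones.
How do we measure the correlation between factors? We replace each stock's factor value with the number of the group the stock falls into, then compute, for each period, the correlation between the group-number sequences of the different factors. Since there are many periods, we average these correlation matrices, which gives the following result:
5.png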
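A minimal sketch of this group-number correlation, on synthetic factors. The helpers `group_numbers` and `mean_group_corr` are hypothetical names, and `pd.qcut` on ranks is used here as a stand-in for the slicing logic in the post's own code.

```python
import numpy as np
import pandas as pd


def group_numbers(values, n_groups=10):
    """Map one cross-section of factor values to group numbers 0..n_groups-1."""
    ranks = values.rank(method="first")  # unique ranks, so qcut bins are well defined
    return pd.qcut(ranks, n_groups, labels=False)


def mean_group_corr(factors, n_groups=10):
    """factors: dict name -> DataFrame (dates x stocks). Average per-date correlation."""
    dates = next(iter(factors.values())).index
    mats = []
    for date in dates:
        groups = pd.DataFrame(
            {name: group_numbers(f.loc[date].dropna(), n_groups) for name, f in factors.items()}
        )
        mats.append(groups.corr())
    return sum(mats) / len(mats)


# Synthetic example: b is a noisy copy of a, c is independent of both
rng = np.random.default_rng(0)
dates = pd.date_range("2018-01-01", periods=6, freq="MS")
stocks = ["S%03d" % i for i in range(100)]
base = rng.normal(size=(6, 100))
factors = {
    "a": pd.DataFrame(base, index=dates, columns=stocks),
    "b": pd.DataFrame(base + 0.1 * rng.normal(size=(6, 100)), index=dates, columns=stocks),
    "c": pd.DataFrame(rng.normal(size=(6, 100)), index=dates, columns=stocks),
}
corr = mean_group_corr(factors)
print(corr.round(2))
```

Working with group numbers instead of raw values makes the measure insensitive to outliers and to the scale of each factor.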
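Combining the factors above with the test results, I finally removed four factors, ['operating_profit_ttm','net_profit_ttm','np_parent_company_owners_ttm','VOL120'], because their IC and ICIR are slightly worse than those of similar factors.
For the factors that remain, plot the NAV curves of the 10 groups, for example:
6.png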
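The ten-group split behind these charts can be sketched for a single rebalancing date as follows. The function `decile_returns` and the toy data are illustrative only; the slicing mirrors the `int(N/GroupNum*i)` pattern used in the post's code.

```python
import numpy as np
import pandas as pd


def decile_returns(factor_row, ret_row, n_groups=10):
    """Equal-weight mean return of each factor group on one date (group 0 = smallest values)."""
    common = factor_row.dropna().index.intersection(ret_row.dropna().index)
    ordered = factor_row[common].sort_values().index
    n = len(ordered)
    out = {}
    for i in range(n_groups):
        members = ordered[int(n / n_groups * i): int(n / n_groups * (i + 1))]
        out[i] = ret_row[members].mean()
    return pd.Series(out)


# Toy cross-section: 100 stocks whose return is exactly proportional to the factor
rng = np.random.default_rng(1)
stocks = ["S%03d" % i for i in range(100)]
factor_row = pd.Series(rng.permutation(100).astype(float), index=stocks)
ret_row = 0.001 * factor_row

groups = decile_returns(factor_row, ret_row)
print(groups)
```

With a perfectly monotonic toy factor the group returns rise steadily from group 0 to group 9, which is the pattern one hopes to see (in either direction) for a real factor.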
Group portfolio statistics for a factor, for example:
7.png
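Two of the statistics in that table, maximum drawdown and annualized return, can be sketched on a toy NAV series. The function names are placeholders; `max_drawdown_loop` repeats the list-comprehension form used in the post's code so the two versions can be compared.

```python
import numpy as np
import pandas as pd


def max_drawdown(nav):
    """Largest fall from the running peak; the peak is floored at 1, the starting NAV."""
    peak = nav.cummax().clip(lower=1.0)
    return float((1 - nav / peak).max())


def max_drawdown_loop(nav):
    """Same quantity written as an explicit scan over every prefix of the series."""
    return max(1 - v / max(1, max(nav.iloc[:i + 1])) for i, v in enumerate(nav))


def annualized_return(nav, n_days):
    """Compound the final NAV to a 365-day horizon."""
    return float(np.power(nav.iloc[-1], 365.0 / n_days) - 1)


nav = pd.Series([1.0, 1.1, 0.99, 1.05, 1.2, 0.9])
mdd = max_drawdown(nav)          # peak 1.2 followed by 0.9
ann = annualized_return(pd.Series([1.0, 1.21]), 730)  # 21% over two years
print(round(mdd, 4), round(ann, 4))
```

The `cummax` version does the same work in a single pass, which matters when the NAV series is long.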
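Excess return chart for a factor, for example:
8.png
The charts all look reasonably decent.

Next time, we will feed these effective factors into the backtest module. Good luck!

Discussion is welcome, and so are criticism and corrections.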

# Step I: load the data produced by the factor-generation step (Part 1)
import time
import datetime
import pickle
from multiprocessing.dummy import Pool as ThreadPool

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as st
import statsmodels.api as sm
import jqdata
from jqfactor import Factor,calc_factors

with open('MyPackage.pkl', 'rb') as pkl_file:
    load_Package = pickle.load(pkl_file)
g_univ_dict,return_df,all_return_df,raw_factor_dict,all_factor_dict,all_industry_df=load_Package

univ_dict=g_univ_dict

# Optional: uncomment to run the tests on a single factor only
#factor_dict={}
#factor_dict['cfo_to_ev']=all_factor_dict['cfo_to_ev']
#all_factor_dict=factor_dict
# Step II: functions used for factor screening
def ic_calculator(factor,return_df,univ_dict):
    """Spearman rank IC between factor values and returns on each rebalancing date."""
    ic_list=[]
    p_value_list=[]
    for date in list(univ_dict.keys()):   # loop over rebalancing dates
        univ=univ_dict[date]
        # keep only stocks that have both a factor value and a return on this date
        univ=list(set(univ)&set(factor.loc[date].dropna().index)&set(return_df.loc[date].dropna().index))
        if len(univ)<10:
            continue   # too few stocks for a meaningful correlation
        factor_se=factor.loc[date,univ]
        return_se=return_df.loc[date,univ]
        ic,p_value=st.spearmanr(factor_se,return_se)
        ic_list.append(ic)
        p_value_list.append(p_value)
    return ic_list

# 1. Base data for the backtest
def all_Group_Return_calculator(factor,univ_dict,all_return_df,GroupNum=10):
    """Daily returns of each factor group, rebalanced on the dates in univ_dict."""
    all_date_list=list(all_return_df.index)
    date_list=list(univ_dict.keys())
    all_Group_Ret_df=pd.DataFrame(index=all_date_list,columns=list(range(GroupNum)))
    for n in range(len(date_list)-1):
        start=date_list[n]
        end=date_list[n+1]
        univ=univ_dict[start]
        univ=list(set(univ)&set(factor.loc[start].dropna().index))
        # ascending sort: group 0 holds the smallest factor values
        factor_se_stock=list(factor.loc[start,univ].dropna().sort_values().index)
        N=len(factor_se_stock)
        for i in range(GroupNum):
            group_stock=factor_se_stock[int(N/GroupNum*i):int(N/GroupNum*(i+1))]
            # The next two lines are the key step: average the cumulative NAV of the
            # group members, then convert that NAV back into daily returns
            cumret=(all_return_df.loc[start:end,group_stock]+1).cumprod().mean(axis=1)
            all_Group_Ret_df.loc[start:end,i]=cumret.shift(1).fillna(1).pct_change().shift(-1)
            #(((all_return_df.loc[start:end,group_stock]+1).cumprod()-1).mean(axis=1)+1).pct_change().shift(-1)
    all_Group_Ret_df=all_Group_Ret_df[date_list[0]:].shift(1).fillna(0)
    return all_Group_Ret_df.astype(float)

def Group_Return_calculator(factor,univ_dict,return_df,GroupNum=10):
    """Returns of each factor group at the rebalancing frequency."""
    GroupRet_df=pd.DataFrame(index=list(univ_dict.keys()),columns=list(range(GroupNum)))
    for date in list(univ_dict.keys()):    # loop over rebalancing dates
        univ=univ_dict[date]
        univ=list(set(univ)&set(factor.loc[date].dropna().index)&set(return_df.loc[date].dropna().index))
        factor_se_stock=list(factor.loc[date,univ].sort_values().index)
        N=len(factor_se_stock)
        for i in range(GroupNum):
            group_stock=factor_se_stock[int(N/GroupNum*i):int(N/GroupNum*(i+1))]
            GroupRet_df.loc[date,i]=return_df.loc[date,group_stock].mean()
    return GroupRet_df.shift(1).fillna(0).astype(float)

def get_index_return(univ_dict,index,count=250):
    """Index returns, daily and by rebalancing date (get_price comes from the JoinQuant environment)."""
    trade_date_list=list(univ_dict.keys())
    date=max(trade_date_list)
    price=get_price(index,end_date=date,count=count,fields=['close'])['close']
    price_return=price.loc[trade_date_list[0]:].pct_change().fillna(0)
    price_return_by_tradeday=price.loc[trade_date_list].pct_change().fillna(0)
    return price_return,price_return_by_tradeday

def effect_test(univ_dict,key,group_return,group_excess_return):
    """Run the three effectiveness tests for one factor.
    Reads the global index_return, which must be computed before this is called."""

    daylength=(list(univ_dict.keys())[-1]-list(univ_dict.keys())[0]).days
    # annualized returns; the -1 cancels in Test Two and does not affect the correlation in Test One
    annual_return=np.power((group_return+1).cumprod().iloc[-1,:],365/daylength)-1
    index_annual_return=np.power((index_return+1).cumprod().iloc[-1],365/daylength)-1

    # Test One: correlation between group rank and group return, |corr| > 0.5
    sequence=pd.Series(np.array(range(10)))
    test_one_corr=annual_return.corr(sequence)
    test_one_passgrade=0.5
    test_one_pass=abs(test_one_corr)>test_one_passgrade
    
    # the sign of the correlation decides which end group is the winner
    if test_one_corr<0:
        wingroup,losegroup=0,9
    else:
        wingroup,losegroup=9,0
        
    # Test Two: winner group clearly beats the market and loser group clearly trails it, by more than 5%
    test_two_passgrade=0.05
    test_two_win_excess=annual_return[wingroup]-index_annual_return
    test_two_win_pass=test_two_win_excess>test_two_passgrade
    test_two_lose_excess=index_annual_return-annual_return[losegroup]
    test_two_lose_pass=test_two_lose_excess>test_two_passgrade
    test_two_pass=test_two_win_pass&test_two_lose_pass

    # Test Three: probability that the winner group beats the benchmark and that
    # the loser group trails it, each > 0.5
    test_three_grade=0.5
    test_three_win_prob=(group_excess_return[wingroup]>0).sum()/len(group_excess_return[wingroup])
    test_three_win_pass=test_three_win_prob>test_three_grade
    test_three_lose_prob=(group_excess_return[losegroup]<0).sum()/len(group_excess_return[losegroup])
    test_three_lose_pass=test_three_lose_prob>test_three_grade
    test_three_pass=test_three_win_pass&test_three_lose_pass

    test_result=[test_one_pass,test_two_win_pass,test_two_lose_pass,test_three_win_pass,test_three_lose_pass]
    test_score=[test_one_corr,test_two_win_excess,test_two_lose_excess,test_three_win_prob,test_three_lose_prob]
    
    return test_result,test_score
# Compute the scores and screening results for every factor

starttime=time.perf_counter()   # time.clock() was removed in Python 3.8

print('\nComputing IC and ICIR:')
count=1
ic_list_dict={}
for key,factor in all_factor_dict.items():
    ic_list=ic_calculator(factor,return_df,univ_dict)
    ic_list_dict[key]=ic_list
    print(count,end=',')
    count=count+1
# Collect the results (this indexing assumes ic_calculator skipped only the last date)
ic_df=pd.DataFrame(ic_list_dict,index=list(univ_dict.keys())[:-1])
ic_ir_se=ic_df.mean()/ic_df.std()
ic_avg_se=ic_df.mean().abs()

print('\nComputing group returns:')
count=1
GroupNum=10
# Daily returns, used for NAV; it would be more efficient to compute these only for the screened factors
all_Factor_Group_Return_dict={}
Factor_Group_Return_dict={}
for key,factor in all_factor_dict.items():
    # daily returns
    all_GroupRet_df=all_Group_Return_calculator(factor,univ_dict,all_return_df,GroupNum)
    all_Factor_Group_Return_dict[key]=all_GroupRet_df
    # returns at the rebalancing frequency
    GroupRet_df=Group_Return_calculator(factor,univ_dict,return_df,GroupNum)   
    Factor_Group_Return_dict[key]=GroupRet_df
    print(count,end=',')
    count=count+1
    
print('\nComputing index returns:')
count=1
index='000300.XSHG'
index_return,index_return_by_tradeday=get_index_return(univ_dict,index)
Factor_Group_Excess_Return_dict={}
for key,group_return in Factor_Group_Return_dict.items():
    Factor_Group_Excess_Return_dict[key]=group_return.subtract(index_return_by_tradeday,axis=0)
    print(count,end=',')
    count=count+1
    
print('\nRunning factor effectiveness tests:')
count=1
effect_test_result_dict={}
effect_test_score_dict={}
for key,group_return in Factor_Group_Return_dict.items():
    group_excess_return=Factor_Group_Excess_Return_dict[key]   
    effect_test_result_dict[key],effect_test_score_dict[key]=effect_test(univ_dict,key,group_return,group_excess_return)
    print(count,end=',')
    count=count+1

print('\nPickling results')
Package_ET=[ic_avg_se,ic_ir_se,effect_test_result_dict,effect_test_score_dict,\
           all_Factor_Group_Return_dict,Factor_Group_Return_dict,index_return,index_return_by_tradeday,\
           Factor_Group_Excess_Return_dict]
with open('MyPackage_ET.pkl', 'wb') as pkl_file:
    pickle.dump(Package_ET,pkl_file,0)
  
endtime=time.perf_counter()
runtime=endtime-starttime
print('Factor tests finished in %.2f seconds' % runtime)
Computing IC and ICIR:
1,2,3,...,155,156,
Computing group returns:
1,2,3,...,155,156,
Computing index returns:
1,2,3,...,155,156,
Running factor effectiveness tests:
1,2,3,...,155,156,
Pickling results
Factor tests finished in 721.53 seconds
# Load the test results
import pickle
with open('MyPackage_ET.pkl', 'rb') as pkl_file:
    load_Package = pickle.load(pkl_file)
ic_avg_se,ic_ir_se,effect_test_result_dict,effect_test_score_dict,\
all_Factor_Group_Return_dict,Factor_Group_Return_dict,index_return,index_return_by_tradeday,\
Factor_Group_Excess_Return_dict=load_Package
# EffectTestresult holds the pass/fail flags, EffectTestresult2 the underlying scores
columns=['IC','ICIR','Test1','Test2_win','Test2_lose','Test3_win','Test3_lose']
EffectTestresult=pd.concat([ic_avg_se.to_frame('a'),ic_ir_se.to_frame('b'),pd.DataFrame(effect_test_result_dict).T],axis=1)
EffectTestresult.columns=columns
EffectTestresult2=pd.concat([ic_avg_se.to_frame('a'),ic_ir_se.to_frame('b'),pd.DataFrame(effect_test_score_dict).T],axis=1)
EffectTestresult2.columns=columns
EffectTestresult
IC ICIR Test1 Test2_win Test2_lose Test3_win Test3_lose
ACCA 0.009401 0.139931 False False False False False
AR 0.013732 -0.078806 False False True True False
ARBR 0.013672 -0.127091 False False False True True
ATR14 0.021501 0.136290 True True False True True
ATR6 0.011871 0.077995 True False False True False
BR 0.016859 -0.109888 False False True True True
DAVOL10 0.001823 -0.012251 False False True False False
DAVOL20 0.004819 0.030647 False False True True True
DAVOL5 0.003037 -0.022354 False False True False False
DEGM 0.012169 0.171164 False False False True False
EBIT 0.063861 0.358740 True True True True False
EBITDA 0.082209 0.470460 True True True True False
Kurtosis120 0.053173 -0.462859 True True False True True
Kurtosis20 0.018192 -0.193767 True False False False False
Kurtosis60 0.042104 -0.364970 True False True False True
MAWVAD 0.006172 0.030128 False False True True False
MLEV 0.008373 0.076413 False False False True False
OperateNetIncome 0.081799 0.412676 True True False True True
OperatingCycle 0.021735 -0.247960 False False False True False
ROAEBITTTM 0.046961 0.311143 True True False True False
Skewness120 0.005261 -0.042912 False False True False True
Skewness20 0.061784 -0.532256 True False False False True
Skewness60 0.043946 -0.376223 True False True False True
TVMA20 0.006112 0.038347 True True True True False
TVMA6 0.002190 -0.013872 True False True True True
TVSTD20 0.016284 -0.108580 False False True True True
TVSTD6 0.011001 -0.077822 False False True True True
VDEA 0.035765 -0.247212 False False True True True
VDIFF 0.030910 -0.196195 False False True True True
VEMA10 0.037545 -0.221708 True False False True False
... ... ... ... ... ... ... ...
operating_profit_ttm 0.087495 0.419978 True True True True False
operating_revenue_growth_rate 0.026559 0.252864 True False False True False
operating_revenue_per_share 0.054008 0.435991 True True True True False
operating_revenue_per_share_ttm 0.055637 0.475639 True True True True True
operating_revenue_ttm 0.068502 0.406241 True False False True False
operating_tax_to_operating_revenue_ratio_ttm 0.013722 0.146272 False False False True False
quick_ratio 0.016010 0.319672 False False False True False
retained_earnings 0.098462 0.514553 True True True True True
retained_earnings_per_share 0.084912 0.602959 True True False True False
retained_profit_per_share 0.085415 0.604552 True True False True False
roa_ttm 0.036300 0.265477 True True False True False
roe_ttm 0.061408 0.361940 True True True True False
sale_expense_to_operating_revenue 0.023429 0.231789 True False False True False
sale_expense_ttm 0.051854 0.345959 True True False True True
sharpe_ratio_120 0.047788 0.261143 True False True True False
sharpe_ratio_20 0.049660 0.277153 False False True True False
sharpe_ratio_60 0.097423 0.609138 True True True True False
super_quick_ratio 0.015263 0.304757 False False False True False
surplus_reserve_fund_per_share 0.060968 0.576047 True True False True False
total_asset_growth_rate 0.022427 0.227200 True False True True False
total_asset_turnover_rate 0.025566 0.299996 False False True True False
total_operating_cost_ttm 0.060801 0.384181 True True False True False
total_operating_revenue_per_share 0.053949 0.435877 True True True True False
total_operating_revenue_per_share_ttm 0.055321 0.474900 True True True True True
total_operating_revenue_ttm 0.068386 0.406090 True False False True False
total_profit_growth_rate 0.019088 0.221173 True False False True False
total_profit_to_cost_ratio 0.003624 -0.045772 False False False True False
total_profit_ttm 0.086498 0.425162 True True True True True
turnover_volatility 0.103638 -0.604179 True False True True True
value_change_profit_ttm 0.011354 0.115067 False True False True False

156 rows × 7 columns

EffectTestresult['IC'].sort_values(ascending=False).hist()
EffectTestresult['ICIR'].abs().sort_values(ascending=False).hist()
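# Screen for effective factors
# Mean |IC| > 0.07, |ICIR| > 0.4, and Test 1, Test 2 (winner), Test 3 (winner) must all pass
# At least 3 of the Test 2 / Test 3 checks must pass
index_ic=EffectTestresult['IC']>0.07
index_icir=EffectTestresult['ICIR'].abs()>0.4
# columns 2, 3, 5 are Test 1, Test 2 (winner), Test 3 (winner)
test_index=EffectTestresult.iloc[:,[2,3,5]].all(axis=1)
# Note: iloc[:,3:6] covers columns 3-5 only (Test 2 winner, Test 2 loser, Test 3 winner),
# so this requires all three to pass. Use iloc[:,3:7] to count Test 3 (loser) as well;
# doing so would change the list of factors shown below.
test2_index=EffectTestresult.iloc[:,3:6].sum(axis=1)>=3
filter_index=index_ic&index_icir&test_index&test2_index
EffectFactorresult=EffectTestresult.loc[filter_index,:]
# Build the dictionary of effective factors
EffectFactor=list(EffectFactorresult.index)
Effect_factor_dict={key:value for key,value in all_factor_dict.items() if key in EffectFactor}
EffectFactorresult
IC ICIR Test1 Test2_win Test2_lose Test3_win Test3_lose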
EBITDA 0.082209 0.470460 True True True True False
VOL120 0.104377 -0.571005 True True True True True
VOL240 0.114056 -0.592371 True True True True True
net_profit_ttm 0.084266 0.414061 True True True True False
np_parent_company_owners_ttm 0.083697 0.409387 True True True True False
operating_profit_per_share_ttm 0.083348 0.477100 True True True True False
operating_profit_ttm 0.087495 0.419978 True True True True False
retained_earnings 0.098462 0.514553 True True True True True
sharpe_ratio_60 0.097423 0.609138 True True True True False
total_profit_ttm 0.086498 0.425162 True True True True True
def Group_Score_calculator(factor,univ_dict,signal,GroupNum=20):
    """Replace each stock's factor value with the number of the group it falls into."""
    
    Score_df=pd.DataFrame(index=list(factor.index),columns=list(factor.columns))

    for date in list(univ_dict.keys()):    # loop over rebalancing dates
        univ=univ_dict[date]
        univ=list(set(univ)&set(factor.loc[date].dropna().index))
        factor_se_stock=list(factor.loc[date,univ].sort_values().index)
        N=len(factor_se_stock)
        for i in range(GroupNum):
            group_stock=factor_se_stock[int(N/GroupNum*i):int(N/GroupNum*(i+1))]
            if signal=='ascending':
                Score_df.loc[date,group_stock]=i
            else:
                Score_df.loc[date,group_stock]=GroupNum-i  
   
    return Score_df  

# Compute the correlation matrix
def factor_corr_calculator(Group_Score_dict,univ_dict):
    """Average, over all dates, of the correlation between the factors' group scores."""

    Group_Score_dict_by_day={}
    Group_Score_Corr_dict_by_day={}
    
    # group scores of every factor on each date (rows: stocks, columns: factors)
    for Date in list(univ_dict.keys()):
        univ=univ_dict[Date]
        Group_Score_df=pd.DataFrame({Factor:Group_Score_dict[Factor].loc[Date,univ]
                                     for Factor in Group_Score_dict})
        # 4.5 is the midpoint for 10 groups; with 20 groups a neutral fill would be about 10
        Group_Score_dict_by_day[Date]=Group_Score_df.fillna(4.5).astype(float)
        Group_Score_Corr_dict_by_day[Date]=Group_Score_dict_by_day[Date].corr()

    # average the per-date matrices
    N=len(list(univ_dict.keys()))
    Group_Score_Corr=Group_Score_Corr_dict_by_day[list(univ_dict.keys())[0]]
    for Date in list(univ_dict.keys())[1:]:
        Group_Score_Corr=Group_Score_Corr+Group_Score_Corr_dict_by_day[Date]

    return (Group_Score_Corr/N).round(2)

# Assign group scores to every screened factor
Group_Score_dict={}
for key,factor in Effect_factor_dict.items():
    signal='ascending' if ic_ir_se[key]>0 else 'descending'
    Group_Score_dict[key]=Group_Score_calculator(factor,univ_dict,signal,20)

# Compute the factor correlation matrix
factor_corrmatrix=factor_corr_calculator(Group_Score_dict,univ_dict)
factor_corrmatrix    
operating_profit_ttm retained_earnings total_profit_ttm net_profit_ttm EBITDA np_parent_company_owners_ttm VOL120 VOL240 sharpe_ratio_60 operating_profit_per_share_ttm
operating_profit_ttm 1.00 0.76 0.96 0.96 0.54 0.95 0.17 0.22 0.02 0.69
retained_earnings 0.76 1.00 0.77 0.76 0.53 0.76 0.25 0.31 0.02 0.48
total_profit_ttm 0.96 0.77 1.00 0.99 0.56 0.98 0.18 0.24 0.02 0.64
net_profit_ttm 0.96 0.76 0.99 1.00 0.55 0.98 0.18 0.24 0.02 0.64
EBITDA 0.54 0.53 0.56 0.55 1.00 0.52 0.07 0.12 -0.03 0.28
np_parent_company_owners_ttm 0.95 0.76 0.98 0.98 0.52 1.00 0.18 0.23 0.03 0.65
VOL120 0.17 0.25 0.18 0.18 0.07 0.18 1.00 0.93 0.16 0.03
VOL240 0.22 0.31 0.24 0.24 0.12 0.23 0.93 1.00 0.13 0.07
sharpe_ratio_60 0.02 0.02 0.02 0.02 -0.03 0.03 0.16 0.13 1.00 0.01
operating_profit_per_share_ttm 0.69 0.48 0.64 0.64 0.28 0.65 0.03 0.07 0.01 1.00
# After comparing, drop some factors:
# operating_profit_ttm, net_profit_ttm, np_parent_company_owners_ttm, VOL120
removed_factor=['operating_profit_ttm','net_profit_ttm','np_parent_company_owners_ttm','VOL120']
Effect_factor_dict={key:value for key,value in Effect_factor_dict.items() if key not in removed_factor}
def plot_nav(all_return_df,index_return,key):
    """Plot the NAV and the index-hedged NAV of every factor group."""
    # set up the figure
    fig = plt.figure(figsize=(12,12))
    fig.set_facecolor('white')
    fig.set_tight_layout(True)
    ax1 = fig.add_subplot(211)
    ax2 = fig.add_subplot(212)
    ax1.grid()
    ax2.grid()
    ax1.set_ylabel("NAV", fontsize=16)
    ax2.set_ylabel("Hedged NAV", fontsize=16)
    ax1.set_title("Factor selection {} - NAV".format(key),fontsize=16)
    ax2.set_title("Factor selection - NAV after hedging the index", fontsize=16)
    # prepare the data
    date=list(all_return_df.index)
    sequence=all_return_df.columns
    # plot one NAV curve per group
    for sq in sequence:
        nav=(1+all_return_df[sq]).cumprod()
        nav_excess=(1+all_return_df[sq]-index_return).cumprod()
        ax1.plot(date,nav,label=str(sq))
        ax2.plot(date,nav_excess,label=str(sq))
    ax1.legend(loc=0,fontsize=12)
    ax2.legend(loc=0,fontsize=12)
    
def polish(x):
    """Format a fraction as a percentage string with two decimals."""
    return '%.2f%%' % (x*100)

def result_stats(key,all_return_df,index_return):  
    """Risk and return statistics for every group of one factor."""
    # Note: the tables printed further down come from the original run, which used
    # beta*(yearly_return-yearly_index_return) in the alpha formula and overwrote beta
    # with round(ir,2); their Alpha and Beta columns differ from what this version produces.

    # result table
    sequences=all_return_df.columns

    cols = [('Risk', 'Alpha'), ('Risk', 'Beta'), ('Risk', 'IR'), ('Risk', 'Sharpe'),
            ('Long-only', 'AnnRet'), ('Long-only', 'MaxDD'), ('Long-only', 'Vol'), 
            ('Hedged', 'AnnRet'), ('Hedged', 'Vol')]
    columns = pd.MultiIndex.from_tuples(cols)
    result_df = pd.DataFrame(index = sequences,columns=columns)
    result_df.index.name = "%s" % (key)

    for sq in sequences:  # loop over groups

        # NAV
        return_data=all_return_df[sq]
        return_data_excess=return_data-index_return
        nav=(1+return_data).cumprod()
        nav_excess=(1+return_data_excess).cumprod()
        nav_index=(1+index_return).cumprod()

        # Beta against the index
        beta=return_data.corr(index_return)*return_data.std()/index_return.std()
        beta_excess=return_data_excess.corr(index_return)*return_data_excess.std()/index_return.std()

        # annualized returns
        daylength=(return_data.index[-1]-return_data.index[0]).days
        yearly_return=np.power(nav.iloc[-1],1.0*365/daylength)-1
        yearly_return_excess=np.power(nav_excess.iloc[-1],1.0*365/daylength)-1
        yearly_index_return=np.power(nav_index.iloc[-1],1.0*365/daylength)-1

        # maximum drawdown: largest fall of the NAV below its running peak (peak floored at 1)
        max_drawdown=(1-nav/nav.cummax().clip(lower=1)).max()

        # annualized volatility
        vol=return_data.std()*np.sqrt(252)
        vol_excess=return_data_excess.std()*np.sqrt(252)

        # Alpha (Jensen's alpha, risk-free rate 4%)
        rf=0.04
        alpha=yearly_return-(rf+beta*(yearly_index_return-rf))
        alpha_excess=yearly_return_excess-(rf+beta_excess*(yearly_index_return-rf))

        # information ratio
        ir=(yearly_return-yearly_index_return)/(return_data_excess.std()*np.sqrt(252))

        # Sharpe ratio
        sharpe=(yearly_return-rf)/vol

        # format for printing
        alpha,yearly_return,max_drawdown,vol,yearly_return_excess,vol_excess=\
        map(polish,[alpha,yearly_return,max_drawdown,vol,yearly_return_excess,vol_excess])
        sharpe=round(sharpe,2)
        ir=round(ir,2)
        beta=round(beta,2)

        result_df.loc[sq]=[alpha,beta,ir,sharpe,yearly_return,max_drawdown,vol,yearly_return_excess,vol_excess]
    return result_df

def draw_excess_return(excess_return,key):
    """Bar chart of the mean excess return of each group (first row skipped: it is a zero fill)."""
    excess_return_mean=excess_return[1:].mean()
    excess_return_mean.index = [int(x)+1 for x in excess_return_mean.index]
    excess_plus=excess_return_mean[excess_return_mean>0]
    excess_minus=excess_return_mean[excess_return_mean<0]

    fig = plt.figure(figsize=(12, 6))
    fig.set_facecolor('white')
    ax1 = fig.add_subplot(111)
    ax1.bar(excess_plus.index, excess_plus.values, align='center', color='r', width=0.35)
    ax1.bar(excess_minus.index, excess_minus.values, align='center', color='g', width=0.35)
    ax1.set_xlim(left=0.5, right=len(excess_return_mean)+0.5)
    ax1.set_ylabel('Excess return', fontsize=16)
    ax1.set_xlabel('Decile group', fontsize=16)
    ax1.set_xticks(excess_return_mean.index)
    ax1.set_xticklabels([int(x) for x in ax1.get_xticks()], fontsize=14)
    ax1.set_yticklabels(['%.2f%%' % (x*100) for x in ax1.get_yticks()], fontsize=14)
    ax1.set_title("Mean excess return by factor group: {}".format(key), fontsize=16)
    ax1.grid()
for key in list(Effect_factor_dict.keys()):
    plot_nav(all_Factor_Group_Return_dict[key],index_return,key)  
result_dict={}
for key in list(Effect_factor_dict.keys()):
    result_df=result_stats(key,all_Factor_Group_Return_dict[key],index_return)
    result_dict[key]=result_df
    print(result_df)
retained_earnings
          Risk                           Long-only                  Hedged
group    Alpha     Beta       IR   Sharpe   AnnRet    MaxDD      Vol   AnnRet      Vol
0      -10.34%    -1.05    -1.05    -1.26  -16.28%   37.83%   16.11%  -12.57%   12.01%
1      -11.26%    -1.24    -1.24     -1.4  -19.52%   39.35%   16.75%  -13.16%   12.81%
2       -9.64%    -0.74    -0.74    -1.03  -12.53%   29.36%   16.08%   -7.32%   11.96%
3       -7.44%     0.06     0.06    -0.46   -2.92%   16.07%   15.09%    2.80%   10.87%
4       -9.39%    -0.46    -0.46    -0.85  -10.32%   27.16%   16.87%   -4.15%   14.65%
5       -6.34%     0.82     0.82    -0.03    3.52%   21.94%   14.95%   -0.40%    8.66%
6       -6.73%     0.65     0.65     -0.2    1.14%   21.91%   14.35%    0.02%    7.38%
7       -6.30%     1.13     1.13     -0.1    2.64%   21.07%   13.35%   -0.80%    5.56%
8       -6.60%     1.23     1.23    -0.01    3.88%   24.56%   14.72%   -1.78%    6.11%
9       -4.46%     2.56     2.56     0.76   15.15%   23.02%   14.73%    2.09%    7.33%

total_profit_ttm
          Risk                           Long-only                  Hedged
group    Alpha     Beta       IR   Sharpe   AnnRet    MaxDD      Vol   AnnRet      Vol
0       -9.85%    -0.83    -0.83    -1.08  -14.61%   37.17%   17.17%  -11.52%   13.31%
1      -10.28%    -0.98    -0.98     -1.2  -16.41%   36.61%   17.03%  -10.47%   13.02%
2      -10.96%    -1.06    -1.06     -1.3  -15.16%   32.79%   14.77%   -7.87%   10.93%
3       -7.66%    -0.02    -0.02    -0.47   -3.83%   18.41%   16.59%    2.38%   12.59%
4       -9.00%    -0.38    -0.38    -0.81   -8.38%   26.34%   15.36%   -6.14%   12.55%
5       -7.28%     0.13     0.13    -0.44   -2.32%   23.34%   14.27%   -2.11%    9.85%
6       -5.88%     1.07     1.07     0.06    4.85%   22.11%   14.29%   -0.18%    7.92%
7       -6.83%     0.99     0.99    -0.16    1.75%   23.82%   14.29%   -1.46%    5.44%
8       -6.55%     1.15     1.15     0.05    4.74%   24.59%   15.23%   -0.87%    7.26%
9       -5.37%     2.45     2.45     0.61   13.20%   24.32%   15.05%    2.59%    6.86%

EBITDA
          Risk                           Long-only                  Hedged
group    Alpha     Beta       IR   Sharpe   AnnRet    MaxDD      Vol   AnnRet      Vol
0      -10.17%     -0.8     -0.8     -1.1  -13.70%   31.39%   16.12%   -6.04%   12.68%
1       -8.83%    -0.32    -0.32    -0.74   -7.91%   22.01%   16.16%   -1.23%   13.26%
2      -10.68%    -0.76    -0.76    -1.13  -12.90%   34.44%   14.98%  -10.99%   12.27%
3       -9.73%     -0.7     -0.7    -1.03  -11.87%   28.83%   15.35%   -4.73%   11.79%
4       -7.43%     0.09     0.09    -0.45   -2.73%   20.98%   15.07%   -0.90%    9.45%
5       -9.50%    -0.93    -0.93    -1.15  -15.52%   35.02%   17.04%  -12.86%   12.85%
6       -7.96%     -0.2     -0.2     -0.6   -5.97%   25.39%   16.61%   -5.38%   11.58%
7       -6.21%     2.53     2.53     0.58   13.01%   21.61%   15.59%    3.41%    6.56%
8       -4.46%     2.06     2.06     0.61   12.90%   25.12%   14.61%    3.22%    8.02%
9       -6.42%     1.38     1.38     0.19    6.91%   22.73%   15.59%    0.17%    7.65%

VOL240
          Risk                           Long-only                  Hedged
group    Alpha     Beta       IR   Sharpe   AnnRet    MaxDD      Vol   AnnRet      Vol
0       -2.11%     1.22     1.22     0.39    8.15%   19.09%   10.61%    0.79%    9.65%
1       -4.45%     1.31     1.31     0.15    5.77%   22.09%   11.74%   -1.89%    7.15%
2       -5.52%     1.29     1.29     0.14    5.96%   23.99%   13.91%   -2.02%    7.43%
3       -6.20%     0.75     0.75    -0.12    2.39%   16.60%   13.92%    2.96%    7.99%
4       -6.82%     0.73     0.73    -0.17    1.57%   24.36%   14.70%   -2.49%    7.16%
5       -7.28%     0.24     0.24    -0.35   -1.48%   21.83%   15.54%   -0.50%    9.01%
6       -8.17%    -0.31    -0.31    -0.69   -6.43%   24.71%   15.11%   -5.89%    9.20%
7       -8.20%    -0.49    -0.49    -0.76   -8.86%   26.16%   17.01%   -2.02%   10.77%
8       -8.72%    -1.18    -1.18    -1.25  -19.68%   42.15%   18.91%  -11.61%   13.59%
9       -9.08%    -1.14    -1.14    -1.29  -21.83%   43.53%   20.04%  -12.20%   16.04%

sharpe_ratio_60
          Risk                           Long-only                  Hedged
group    Alpha     Beta       IR   Sharpe   AnnRet    MaxDD      Vol   AnnRet      Vol
0      -10.48%     -0.9     -0.9    -1.15  -17.76%   37.04%   18.86%   -9.38%   15.71%
1       -9.69%    -0.98    -0.98    -1.13  -13.35%   30.02%   15.33%   -7.37%    9.92%
2       -8.27%    -0.53    -0.53    -0.78   -8.24%   26.17%   15.61%   -6.71%    8.65%
3       -6.82%     0.65     0.65    -0.14    1.80%   19.22%   15.41%    2.97%    8.28%
4       -8.01%    -0.26    -0.26    -0.67   -5.63%   25.32%   14.46%   -5.12%    7.84%
5       -6.55%     0.59     0.59    -0.15    1.69%   16.03%   14.93%    3.96%    9.00%
6       -7.02%     0.33     0.33    -0.33   -0.81%   24.80%   14.43%   -4.21%    8.51%
7       -7.69%    -0.03    -0.03    -0.54   -3.94%   32.14%   14.81%   -8.39%   10.05%
8       -5.56%     0.72     0.72        0    4.02%   23.05%   14.54%   -1.92%   10.67%
9       -5.04%      0.7      0.7     0.09    5.43%   17.26%   15.28%    1.24%   12.86%

operating_profit_per_share_ttm
          Risk                           Long-only                  Hedged
group    Alpha     Beta       IR   Sharpe   AnnRet    MaxDD      Vol   AnnRet      Vol
0       -9.42%    -0.56    -0.56    -0.94   -9.84%   28.07%   14.68%   -7.39%   11.03%
1       -9.00%    -0.59    -0.59    -0.89  -11.09%   31.56%   16.86%   -6.36%   12.69%
2      -10.18%    -0.89    -0.89    -1.15  -14.33%   32.06%   15.94%   -7.35%   11.97%
3      -10.20%    -0.85    -0.85    -1.12  -14.16%   33.57%   16.15%  -11.20%   12.42%
4       -7.72%    -0.03    -0.03    -0.55   -3.98%   17.62%   14.57%   -1.54%   10.31%
5       -7.95%    -0.15    -0.15    -0.64   -4.96%   24.35%   13.99%   -4.09%    8.74%
6       -7.33%     0.17     0.17    -0.44   -2.24%   25.58%   14.28%   -3.78%    8.04%
7       -6.52%     1.01     1.01    -0.13    2.16%   18.71%   13.74%   -1.76%    5.72%
8       -7.02%     1.37     1.37      0.1    5.56%   26.15%   15.85%    1.55%    6.70%
9       -4.90%     3.24     3.24      0.9   17.42%   15.38%   14.94%    6.96%    6.50%
result_dict['VOL240']
VOL240
          Risk                           Long-only                  Hedged
group    Alpha     Beta       IR   Sharpe   AnnRet    MaxDD      Vol   AnnRet      Vol
0       -2.11%     1.22     1.22     0.39    8.15%   19.09%   10.61%    0.79%    9.65%
1       -4.45%     1.31     1.31     0.15    5.77%   22.09%   11.74%   -1.89%    7.15%
2       -5.52%     1.29     1.29     0.14    5.96%   23.99%   13.91%   -2.02%    7.43%
3       -6.20%     0.75     0.75    -0.12    2.39%   16.60%   13.92%    2.96%    7.99%
4       -6.82%     0.73     0.73    -0.17    1.57%   24.36%   14.70%   -2.49%    7.16%
5       -7.28%     0.24     0.24    -0.35   -1.48%   21.83%   15.54%   -0.50%    9.01%
6       -8.17%    -0.31    -0.31    -0.69   -6.43%   24.71%   15.11%   -5.89%    9.20%
7       -8.20%    -0.49    -0.49    -0.76   -8.86%   26.16%   17.01%   -2.02%   10.77%
8       -8.72%    -1.18    -1.18    -1.25  -19.68%   42.15%   18.91%  -11.61%   13.59%
9       -9.08%    -1.14    -1.14    -1.29  -21.83%   43.53%   20.04%  -12.20%   16.04%
for key in list(Effect_factor_dict.keys()):
    draw_excess_return(Factor_Group_Excess_Return_dict[key],key)
 
