
[Research Report Replication] Revisiting the Momentum Factor

Posted by 不做外汇索罗斯 on September 24 at 20:00

The momentum factor is a well-recognized style factor with solid positive returns in overseas markets (the US and Europe), and it has been studied and applied widely in both academia and industry. In the US market, for example, stocks with strong long-term performance (top-ranked returns over the past 24 months) tend to keep delivering relatively high returns over the following period (one month).

Many studies point out that the momentum factor behaves very differently in the A-share market than in other mature markets. Here we follow the Everbright Securities research report "Multi-Factor Series No. 22: Revisiting the Momentum Factor" and replicate its analysis. Note that this post slightly rearranges the report's structure: the main content is unchanged, but some parts reflect my own interpretation. The overall outline is as follows:

Part 1: Basic momentum factor analysis
Covers sections 1 and 2 of the report. After removing suspended, ST and recently listed stocks, and applying standardization and neutralization, the plain momentum factor and the moving-average trend momentum factor are replicated;

Part 2: Purified momentum factor after stripping out liquidity
Corresponds to section 3 of the report. Starting from the plain momentum factor of Part 1, a purified momentum factor is built by further stripping out the liquidity component;

Part 3: Residual momentum factor after style neutralization
Corresponds to section 4 of the report. The liquidity-purified momentum factor is regressed on the Fama-French three factors and the residual is used as the residual momentum factor;

Part 4: Momentum factors on reconstructed K-lines
Corresponds to section 5 of the report. Since the report does not spell out how these factors are computed, I design two factors empirically on equal-turnover K-lines: the average return over N reconstructed K-lines, and the probability of an up move over N reconstructed K-lines;

Part 5: Conclusion
A comparative analysis of the factors studied here.


# Import required libraries
from jqfactor import *
from jqdata import *
import pandas as pd
import numpy as np
import warnings  
warnings.filterwarnings('ignore') 
from tqdm import tqdm_notebook # progress bar widget
from tqdm import tqdm
import time # for timing
from datetime import date, datetime, timedelta
from sklearn import preprocessing, linear_model

# Build a list of trading dates (note: unlike get_trade_days(), the sampling frequency is configurable)
def get_tradeday_list(start,end,frequency=None,count=None):
    if count != None:
        df = get_price('000001.XSHG',end_date=end,count=count)
    else:
        df = get_price('000001.XSHG',start_date=start,end_date=end)
    if frequency == None or frequency =='day':
        return df.index
    else:
        df['year-month'] = [str(i)[0:7] for i in df.index]
        if frequency == 'month':
            return df.drop_duplicates('year-month').index # note: drop_duplicates keeps the first occurrence by default
        elif frequency == 'quarter':
            df['month'] = [str(i)[5:7] for i in df.index]
            df = df[(df['month']=='01') | (df['month']=='04') | (df['month']=='07') | (df['month']=='10') ]
            return df.drop_duplicates('year-month').index
        elif frequency =='halfyear':
            df['month'] = [str(i)[5:7] for i in df.index]
            df = df[(df['month']=='01') | (df['month']=='06')]
            return df.drop_duplicates('year-month').index 
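
As a quick usage sketch (the dates here are chosen purely for illustration), the helper can also sample month-start trading days:

# Usage sketch: first trading day of each month in 2018 (illustrative date range)
monthly_days = get_tradeday_list('2018-01-01','2018-12-31',frequency='month')
print(len(monthly_days)) # expect one date per month, i.e. 12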

===Basic parameter setup===¶

# Start and end dates (restricted to 2015 onward to speed things up; the window still covers both bull and bear phases)
start='2015-01-01'
end='2019-05-31'
# Holding periods (in trading days) used in the factor analysis
periods=(5,10,20)
# Number of quantile groups
quantiles=10
# Stock pool (CSI 500 constituents, to keep computation time manageable)
stkpool = '000905.XSHG'
# Stocks that are ST/PT on the selection day, listed for less than one year,
# or suspended during the momentum window or on the selection day are excluded
# Factor preprocessing and neutralization: neutralize against JoinQuant level-1 industries and market cap
nutural_way = ['jq_l1', 'market_cap']
# Build the list of trading dates
date_list = get_tradeday_list(start=start,end=end,count=None) # all trading days in the study window
date_list[1]
Timestamp('2015-01-06 00:00:00')

===1 Basic momentum factor analysis===¶

  • Define the basic factor functions
  • Loop over dates to compute factor values
  • Preprocess and neutralize the factors
  • Run single-factor analysis on the basic momentum factors

As mentioned at the start, the commonly used momentum factors suffer from poor monotonicity and unstable long-side returns. Following the report, we therefore try to modify the momentum factor from several angles and look for ways to improve its stock-selection power and stability.

Earlier reports in this multi-factor series already published test results for momentum-type factors; the commonly used momentum factors tested there are also the basic momentum factors we encounter most often, listed below:

(Image from the report: the list of commonly used momentum factors tested there; not reproduced here.)

Since the plain momentum factor and the moving-average trend momentum factor mentioned in the report (together referred to below as the "basic momentum factors") are both simple to compute, we explore the two of them below with a 21-day lookback window.

===1.1 Computing the basic momentum factors===¶

Because suspensions are usually accompanied by large price swings just before and after, stocks that were suspended at any point during the momentum window are simply dropped.
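
For reference, the two definitions implemented in factor_cal below can be written as follows (my own notation: $N$ is the lookback window in trading days and $P_t$ the post-adjusted close):

$$\text{Mom}^{org}_{T} = \frac{P_{T}}{P_{T-N+1}} - 1,\qquad \text{Mom}^{MA}_{T} = \frac{\frac{1}{N}\sum_{t=T-N+1}^{T} P_{t}}{P_{T}}$$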

# Define the basic momentum factors: plain momentum (org) and moving-average trend momentum (MA)
def factor_cal(pool,date,N,how=None):
    '''
    N: number of days of price history to look back
    how='org' or None: plain momentum factor
    how='MA': moving-average trend momentum factor
    '''
    df0 = get_price(pool,end_date=date,fq='post',count=N,fields=['close','paused'])
    # Drop stocks suspended at any point in the momentum window
    dfp = df0.iloc[1,:,:].sum() # df0.iloc[1] is the 'paused' field; paused=1 on suspension days
    d_stk = dfp[dfp>0].index.tolist() # stocks to drop because of suspensions in the window
#     print(str(date)+' stocks suspended during the window include '+str(d_stk[0:4])+' ...')
    df =df0['close'].drop(d_stk,axis=1)
#     print('Stocks remaining after dropping suspended names: '+str(df.shape[1]))
    if N==0:
        raise ValueError('N (lookback window) must be greater than 0')
    if how=='org' or how==None:
        far = df.iloc[-1,:]/df.iloc[0,:] - 1
    elif how=='MA':
        far = df.iloc[:,:].sum()/N/df.iloc[-1,:]
    return far
factor_cal(['000006.XSHE','000008.XSHE','000009.XSHE'],'2017-12-28',21,'MA')
000008.XSHE    0.980975
000009.XSHE    1.016148
dtype: float64
# Drop stocks listed for less than one year or flagged ST/PT on the selection day
def screener(pool_org, date):
    '''
    Note: date must be a date string ('YYYY-MM-DD') here
    '''
    pool = []
#     print('Stocks before removing recently listed and ST/PT names: '+str(len(pool_org)))
    # Drop stocks listed for less than one year
    for s in pool_org:
        ipo_date = get_security_info(s).start_date
        period = datetime.strptime(date,'%Y-%m-%d').date()-ipo_date
        if period.days>=360:
            pool.append(s)
#     print('Stocks after removing recently listed names: '+str(len(pool)))
    # Drop stocks flagged ST/PT on the selection day
    is_st = get_extras('is_st',pool,count=1,end_date=date)
    pool = [s for s in pool if not is_st[s][0]]
#     print('Stocks after removing ST/PT names: '+str(len(pool)))
    return pool

To avoid fetching the stock pool repeatedly inside the loop, the raw values of both basic momentum factors are computed in a single pass over the dates.

# Pre-create empty DataFrames to hold the factor values
factor_df_org = pd.DataFrame()
factor_df_MA = pd.DataFrame()
# Loop over the date range and compute the factor values
mark = 1
for d in tqdm(date_list):
    pool_org = get_index_stocks(stkpool,date=d)
    pool = screener(pool_org,str(d.date()))
    far_org = factor_cal(pool,d,21,'org')
    far_MA = factor_cal(pool,d,21,'MA')
    if mark == 1:
        factor_df_org = far_org
        factor_df_MA = far_MA
        mark = 0
    else:
        factor_df_org = pd.concat([factor_df_org,far_org],axis=1,join='outer') # note: the original template had the concat arguments reversed here
        factor_df_MA = pd.concat([factor_df_MA,far_MA],axis=1)
# Relabel the columns with the corresponding dates
factor_df_org.columns = date_list
factor_df_MA.columns = date_list
factor_df_MA.head(3)
100%|██████████| 1074/1074 [07:16<00:00,  2.32it/s]
(Output: factor_df_MA.head(3) — a 3 × 1074 preview; rows are stock codes, columns are trading dates from 2015-01-05 to 2019-05-31.)
factor_df_org.head(3)
(Output: factor_df_org.head(3) — a 3 × 1074 preview; rows are stock codes, columns are trading dates.)

===1.2 Standardization and neutralization===¶

# Standardize and neutralize the plain momentum factor (org)
for date in tqdm(date_list):
    se = standardlize(factor_df_org[date], inf2nan=True) # cross-sectional standardization, i.e. standardize each date's column
    se = neutralize(se, how=nutural_way, date=date) # following the report, strip out the part of the raw factor explained by market cap and industry
    factor_df_org[date] = se
# Transpose into the format expected by the factor analyzer (columns = stock codes, rows = dates)
factor_df_org = factor_df_org.T
factor_df_org.head(3)
100%|██████████| 1074/1074 [06:41<00:00,  2.89it/s]
(Output: factor_df_org.head(3) after standardization and neutralization — rows are dates, columns are stock codes.)
# Standardize and neutralize the moving-average trend momentum factor (MA)
for date in tqdm(date_list):
    se = standardlize(factor_df_MA[date], inf2nan=True)
    se = neutralize(se, how=nutural_way, date=date)
    factor_df_MA[date] = se
factor_df_MA = factor_df_MA.T
factor_df_MA.head(3)
100%|██████████| 1074/1074 [06:23<00:00,  2.86it/s]
(Output: factor_df_MA.head(3) after standardization and neutralization — rows are dates, columns are stock codes.)

===1.3 Examining the basic momentum factors' performance===¶

# Single-factor analysis of the plain momentum factor (org)
t0 = time.time() # start timer
far = analyze_factor(factor=factor_df_org, start_date=date_list[0], end_date=date_list[-1], weight_method='avg', industry='jq_l1', quantiles=quantiles, periods=periods,max_loss=0.3)
t1 = time.time()
print('Single-factor analysis finished in %.2f minutes' %round((t1-t0)/60,2))
far.create_full_tear_sheet(demeaned=False, group_adjust=False, by_group=False, turnover_periods=None, avgretplot=(5, 15), std_bar=False)
Single-factor analysis finished in 0.74 minutes
Quantile statistics
min max mean std count count %
factor_quantile
1 -7.284595 -0.743610 -1.430808 0.469322 47729 10.101589
2 -1.279000 -0.489865 -0.822297 0.113447 47199 9.989418
3 -0.832998 -0.299444 -0.557822 0.082225 47082 9.964655
4 -0.611668 -0.096137 -0.356250 0.074784 47201 9.989841
5 -0.401478 0.065747 -0.177068 0.074974 47320 10.015027
6 -0.223962 0.255828 -0.000367 0.077188 46962 9.939258
7 -0.057693 0.471181 0.193111 0.085139 47082 9.964655
8 0.128140 0.831028 0.433438 0.103703 47201 9.989841
9 0.340346 1.435803 0.795767 0.162497 47080 9.964232
10 0.631898 13.686161 1.856274 0.918769 47634 10.081483
-------------------------

Returns analysis
period_5 period_10 period_20
Ann. alpha -0.138 -0.121 -0.092
beta -0.046 -0.045 -0.027
Mean Period Wise Return Top Quantile (bps) -7.326 -5.886 -3.213
Mean Period Wise Return Bottom Quantile (bps) 7.742 6.907 6.183
Mean Period Wise Spread (bps) -14.779 -12.537 -9.463
-------------------------

IC analysis
period_5 period_10 period_20
IC Mean -0.060 -0.065 -0.066
IC Std. 0.133 0.130 0.123
IR -0.452 -0.500 -0.533
t-stat(IC) -14.801 -16.377 -17.472
p-value(IC) 0.000 0.000 0.000
IC Skew -0.396 -0.444 -0.440
IC Kurtosis 0.263 0.354 0.071
-------------------------

Turnover analysis
period_10 period_20 period_5
Quantile 1 Mean Turnover 0.700 0.900 0.520
Quantile 2 Mean Turnover 0.834 0.911 0.750
Quantile 3 Mean Turnover 0.862 0.907 0.806
Quantile 4 Mean Turnover 0.868 0.898 0.823
Quantile 5 Mean Turnover 0.874 0.896 0.831
Quantile 6 Mean Turnover 0.869 0.888 0.823
Quantile 7 Mean Turnover 0.866 0.896 0.817
Quantile 8 Mean Turnover 0.860 0.908 0.797
Quantile 9 Mean Turnover 0.830 0.912 0.736
Quantile 10 Mean Turnover 0.647 0.890 0.464
period_5 period_10 period_20
Mean Factor Rank Autocorrelation 0.678 0.415 -0.064
-------------------------

# Single-factor analysis of the trend momentum factor (MA)
t0 = time.time()
far = analyze_factor(factor=factor_df_MA, start_date=date_list[0], end_date=date_list[-1], weight_method='avg', industry='jq_l1', quantiles=quantiles, periods=periods,max_loss=0.3)
t1 = time.time()
print('Single-factor analysis finished in %.2f minutes' %round((t1-t0)/60,2))
far.create_full_tear_sheet(demeaned=False, group_adjust=False, by_group=False, turnover_periods=None, avgretplot=(5, 15), std_bar=False)
Single-factor analysis finished in -1.79 minutes
Quantile statistics
min max mean std count count %
factor_quantile
1 -8.836144 -0.747926 -1.786318 0.718698 47729 10.101589
2 -1.558564 -0.390727 -0.842835 0.162386 47199 9.989418
3 -0.898396 -0.132429 -0.476033 0.105573 47082 9.964655
4 -0.553705 0.057143 -0.228207 0.090370 47201 9.989841
5 -0.328192 0.208110 -0.028845 0.085349 47320 10.015027
6 -0.122795 0.453388 0.153216 0.086615 46962 9.939258
7 0.008312 0.646818 0.338246 0.090208 47082 9.964655
8 0.179132 0.908924 0.550091 0.098819 47201 9.989841
9 0.360729 1.343562 0.834682 0.128127 47080 9.964232
10 0.674404 12.106571 1.547719 0.636463 47634 10.081483
-------------------------

Returns analysis
period_5 period_10 period_20
Ann. alpha 0.153 0.123 0.096
beta 0.064 0.053 0.036
Mean Period Wise Return Top Quantile (bps) 7.587 6.470 6.085
Mean Period Wise Return Bottom Quantile (bps) -7.285 -5.051 -3.271
Mean Period Wise Spread (bps) 14.340 11.016 9.217
-------------------------

IC analysis
period_5 period_10 period_20
IC Mean 0.064 0.063 0.066
IC Std. 0.147 0.137 0.130
IR 0.434 0.457 0.504
t-stat(IC) 14.233 14.990 16.528
p-value(IC) 0.000 0.000 0.000
IC Skew 0.428 0.660 0.554
IC Kurtosis 0.370 1.197 0.289
-------------------------

Turnover analysis
period_10 period_20 period_5
Quantile 1 Mean Turnover 0.764 0.891 0.575
Quantile 2 Mean Turnover 0.866 0.911 0.794
Quantile 3 Mean Turnover 0.881 0.906 0.834
Quantile 4 Mean Turnover 0.884 0.899 0.846
Quantile 5 Mean Turnover 0.880 0.888 0.847
Quantile 6 Mean Turnover 0.886 0.896 0.852
Quantile 7 Mean Turnover 0.884 0.899 0.847
Quantile 8 Mean Turnover 0.883 0.910 0.831
Quantile 9 Mean Turnover 0.863 0.911 0.781
Quantile 10 Mean Turnover 0.758 0.899 0.558
period_5 period_10 period_20
Mean Factor Rank Autocorrelation 0.584 0.26 -0.055
-------------------------


===2 Purified momentum factor after stripping out liquidity===¶

  • Strip the liquidity component from the plain momentum factor
  • Run single-factor analysis on the purified momentum factor
# Here only the plain momentum factor (org) is purified of liquidity and analyzed
factor_df_org = factor_df_org.T # transpose back so the neutralization loop keeps the same structure as before
for date in tqdm(date_list):
    se = neutralize(factor_df_org[date], how='liquidity', date=date)
    factor_df_org[date] = se
factor_df_org = factor_df_org.T
factor_df_org.head(3)
100%|██████████| 1074/1074 [06:41<00:00,  2.72it/s]
(Output: the liquidity-purified org factor, head(3) — rows are dates, columns are stock codes.)
# Single-factor analysis of the purified momentum factor
t0 = time.time() # start timer
far = analyze_factor(factor=factor_df_org, start_date=date_list[0], end_date=date_list[-1], weight_method='avg', industry='jq_l1', quantiles=quantiles, periods=periods,max_loss=0.3)
t1 = time.time()
print('Single-factor analysis finished in %.2f minutes' %round((t1-t0)/60,2))
# Print the IC summary table
far.plot_information_table(group_adjust=False, method='rank')
Single-factor analysis finished in 0.79 minutes
IC analysis
period_5 period_10 period_20
IC Mean -0.056 -0.060 -0.059
IC Std. 0.131 0.129 0.123
IR -0.427 -0.468 -0.485
t-stat(IC) -14.003 -15.348 -15.884
p-value(IC) 0.000 0.000 0.000
IC Skew -0.374 -0.439 -0.422
IC Kurtosis 0.252 0.436 0.160
# Purified momentum factor: mean return by quantile (bar chart)
far.plot_quantile_returns_bar(by_group=False, demeaned=0, group_adjust=False)
# Purified momentum factor: turnover tables
far.plot_turnover_table()
Turnover analysis
period_10 period_20 period_5
Quantile 1 Mean Turnover 0.690 0.893 0.512
Quantile 2 Mean Turnover 0.830 0.907 0.745
Quantile 3 Mean Turnover 0.860 0.906 0.802
Quantile 4 Mean Turnover 0.868 0.900 0.822
Quantile 5 Mean Turnover 0.870 0.897 0.829
Quantile 6 Mean Turnover 0.868 0.891 0.827
Quantile 7 Mean Turnover 0.863 0.894 0.819
Quantile 8 Mean Turnover 0.857 0.903 0.796
Quantile 9 Mean Turnover 0.830 0.913 0.738
Quantile 10 Mean Turnover 0.652 0.894 0.469
period_5 period_10 period_20
Mean Factor Rank Autocorrelation 0.682 0.422 -0.052
# Purified momentum factor: cumulative return curves
# Compute daily cumulative returns by quantile for the chosen holding period
df = far.calc_cumulative_return_by_quantile(period=10)
# Plot
df.plot(figsize=(15,6))

===3 Residual momentum factor after style neutralization===¶

  • Define a function for the Fama-French factors
  • Strip the three-factor exposure from the purified momentum factor
  • Run single-factor analysis on the residual momentum factor

The traditional price momentum effect in fact embeds sizable exposure to the classic Fama-French three factors. A residual-based momentum portfolio, by contrast, ranks stocks on the excess return left after removing the compensation for bearing systematic risk, so the resulting portfolio carries no systematic exposure to those risk factors; it is built purely on each stock's idiosyncratic behaviour. Below, the liquidity-purified momentum factor is regressed on the Fama-French three factors and the residual is taken as the residual momentum factor.

Note that, unlike the cross-sectional factor-versus-return regressions used for factor evaluation and neutralization, the Fama-French model here is a time-series regression of the dependent variable (the purified momentum factor) on the factors (the market risk premium MP, the size-sorted return spread SMB, and the B/P-sorted return spread HMI).
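
Concretely, for each stock $i$ the code below runs a rolling 21-day time-series regression (a sketch in my own notation, with reg_period = 21):

$$F_{i,t} = \alpha_i + \beta^{MP}_i\,MP_t + \beta^{SMB}_i\,SMB_t + \beta^{HMI}_i\,HMI_t + \varepsilon_{i,t},\qquad t = T-20,\dots,T$$

The residual momentum factor on day $T$ is then the last residual, $\varepsilon_{i,T} = F_{i,T} - \hat{F}_{i,T}$.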

# Compute the Fama-French factors for a single day
def FFfactor(date,mkt_bench='000905.XSHG',rf=0.04):
    '''
    The market benchmark defaults to the CSI 500 and can be swapped for another index.
    The annualized risk-free rate defaults to 4%.
    '''
    pool = get_index_stocks(mkt_bench,date=date)
    q = query(
        valuation.code,
        valuation.market_cap,
        (balance.total_owner_equities/valuation.market_cap).label("BTM")
    ).filter(valuation.code.in_(pool))
    cap_BTM = get_fundamentals(q,date)
    # Previous day's return for every stock in the pool (note: count=3 to avoid look-ahead bias)
    p = get_price(pool,end_date=date,fq='post',count=3,fields=['close'])['close'].iloc[0:2,:]
    r = pd.DataFrame(p.iloc[1,:]).T # container for each stock's one-day return
    r.iloc[0,:]=np.diff(p,axis=0)/np.array(p.iloc[0,:])
    # Smallest 1/3 and largest 1/3 of stocks by market cap
    small_cap = cap_BTM.sort_values('market_cap',ascending=True)['code'][:int(len(cap_BTM)/3)].tolist()
    big_cap = cap_BTM.sort_values('market_cap',ascending=True)['code'][len(cap_BTM)-int(len(cap_BTM)/3):].tolist()
    # Lowest 1/3 and highest 1/3 of stocks by B/P
    LBP = cap_BTM.sort_values('BTM',ascending=True)['code'][:int(len(cap_BTM)/3)].tolist()
    HBP = cap_BTM.sort_values('BTM',ascending=True)['code'][len(cap_BTM)-int(len(cap_BTM)/3):].tolist()
    # Compute the factor values SMB, HMI and MP
    SMB = r[small_cap].mean(axis=1)-r[big_cap].mean(axis=1)
    HMI = r[LBP].mean(axis=1)-r[HBP].mean(axis=1)
    bench_price = get_price(mkt_bench,end_date=date,fq='post',count=3).iloc[0:2,:]['close']
    MP = np.diff(bench_price)/bench_price[0]-rf/360
    return pd.DataFrame({"MP":MP,"SMB":SMB,"HMI":HMI})
# Demo
FFfactor('2019-08-08') 
# Tip: the output is the previous trading day's data, indexed by that date
# If you pass the current trading day during trading hours and the result changes on every run, a look-ahead bias has crept in
MP SMB HMI
2019-08-07 -0.005315 0.003247 -0.002462
# Strip the Fama-French three factors from the liquidity-purified momentum factor
factor_df = factor_df_org
# Market benchmark used to compute the three factors: CSI 500
mkt_bench='000905.XSHG'
# Regression window (number of factor/response observations): 21 days
reg_period=21

# Compute the three-factor values for every trading day
FF_factor = pd.DataFrame()
for i in tqdm(factor_df.index):
    FF_factor = FF_factor.append(FFfactor(i,mkt_bench=mkt_bench))
FF_factor.head(3)
100%|██████████| 1074/1074 [08:29<00:00,  2.54it/s]
MP SMB HMI
2014-12-31 0.008916 -0.006210 -0.005595
2015-01-05 0.017607 0.003128 -0.026897
2015-01-06 0.011489 -0.000875 0.016359
# Compute the residuals after removing the three-factor exposure
# Note: factor_df must have columns = stock codes and rows = dates
reg = linear_model.LinearRegression()
postFFfactor = pd.DataFrame()

for s in tqdm(factor_df.columns):
    fct = []
    for i in range(reg_period,len(factor_df)+1):
        y = np.array(factor_df[s][i-reg_period:i]) # factor values for the reg_period days ending at day i
        x = FF_factor.iloc[i-reg_period:i,:].values
        x = preprocessing.scale(x, axis=1)
        # Handle missing values in y
        if np.isnan(y[reg_period-1]): # if the factor is NaN on the last day, the residual factor is NaN too
            fct = fct+[np.nan]
        else: # otherwise fill any NaN inside the window with the window mean
            y = np.array(pd.DataFrame(y).fillna(np.nanmean(y))[0])
            reg.fit(x,y)
            fct = fct+[y[reg_period-1]-reg.predict(x[reg_period-1].reshape(1,-1))[0]] # single-sample predict needs a reshape
    postFFfactor[s] = fct
postFFfactor.index = factor_df.index[reg_period-1:] # note: the first reg_period-1 (= 20) rows are lost because a full regression window is needed
postFFfactor.head(3)
100%|██████████| 835/835 [12:40<00:00,  1.32it/s]
(Output: postFFfactor.head(3) — the residual momentum factor; rows are dates starting 2015-02-02, columns are stock codes.)
# Single-factor analysis of the residual momentum factor
t0 = time.time() # start timer
far = analyze_factor(factor=postFFfactor, start_date=date_list[0], end_date=date_list[-1], weight_method='avg', industry='jq_l1', quantiles=quantiles, periods=periods,max_loss=0.3)
t1 = time.time()
print('Single-factor analysis finished in %.2f minutes' %round((t1-t0)/60,2))
# Print the IC summary table
far.plot_information_table(group_adjust=False, method='rank')
Single-factor analysis finished in 0.81 minutes
IC analysis
period_5 period_10 period_20
IC Mean -0.037 -0.040 -0.050
IC Std. 0.104 0.099 0.095
IR -0.353 -0.405 -0.528
t-stat(IC) -11.468 -13.137 -17.126
p-value(IC) 0.000 0.000 0.000
IC Skew -0.289 -0.373 -0.207
IC Kurtosis 0.716 1.738 0.127
# Residual momentum factor: mean return by quantile (bar chart)
far.plot_quantile_returns_bar(by_group=False, demeaned=0, group_adjust=False)
# Residual momentum factor: cumulative return curves
# Compute daily cumulative returns by quantile for the chosen holding period
df = far.calc_cumulative_return_by_quantile(period=10)
# Plot
df.plot(figsize=(15,6))

===4 Momentum factors on reconstructed K-lines===¶

  • Define the reconstructed-K-line factor function
  • Loop over dates to compute factor values
  • Run single-factor analysis on the reconstructed-K-line momentum factor

Intuitively, for the same traded amount, a rising price points to buyer dominance and a falling price to seller dominance. Building on this intuition, the analysis below reconstructs K-lines by slicing the price series into bars of equal turnover. On these equal-turnover bars we design two factors to gauge buyer versus seller dominance:

(1) the average return over the N reconstructed K-lines; (2) the probability of an up move over the N reconstructed K-lines.

To stay comparable with the earlier sections we again use a 21-day observation window. To give the bars within the window statistical weight, the 21 time-equal daily K-lines are reconstructed into roughly 42 equal-turnover K-lines (too coarse a slicing could leave every bar's turnover below the threshold; and because carrying a bar across days could introduce look-ahead bias, an unfinished final bar is closed at the previous day's close). The threshold is set as:

Turnover threshold ≈ total turnover over 21 days / 42
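
In my notation, with $M$ reconstructed bars in the window and $r_k$ the return of bar $k$ (its close over its open, minus one), the two factors computed by fct_alterK below are:

$$\text{avg\_return} = \frac{1}{M}\sum_{k=1}^{M} r_k,\qquad \text{prob\_up} = \frac{1}{M}\sum_{k=1}^{M}\mathbf{1}\{r_k > 0\}$$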

# Reconstructed-K-line factors for a single stock
def fct_alterK(date,s,N=21,frq='5m'):
    '''
    N: number of trading days before `date` used as the observation window (default 21)
    frq: granularity of the underlying bars used to accumulate turnover; supports '1m', '5m', '15m', '30m', '60m' (default '5m')
    '''
    if get_price(s, count=1, end_date=date, fields=['paused'])['paused'].values[0]==1:
        fct_alterK = pd.DataFrame({"avg_return":np.nan,"prob_up":np.nan},index=[date])
    else:
        k_cnt = int(N*240/float(frq.split('m')[0])) # total number of fine-grained bars in the observation window
        df0 = get_price([s], count = k_cnt, end_date=date, frequency=frq, fields=['open','close','money'],fq='post')
        # Collect each reconstructed bar's turnover, return and up/down sign
        limit = df0['money'][s].sum()/N # turnover threshold for closing a reconstructed bar (total turnover in the window / N)
        k_info = pd.DataFrame()
        agr_money = 0
        p_open = df0['open'][s][0]
        for i in range(k_cnt):
            agr_money = agr_money +df0['money'][s][i]
            if i<k_cnt-1:
                if agr_money+df0['money'][s][i+1]>limit:
                    p_close = df0['close'][s][i]
                    return_rate = p_close/p_open-1
                    k_info = k_info.append({'agr_money':agr_money,'return_rate':return_rate,
                                            'sign':np.sign(return_rate)},ignore_index=True)
                    agr_money = 0
                    p_open = p_close
            elif i==k_cnt-1:
                p_close = df0['close'][s][i]
                return_rate = p_close/p_open-1
                k_info = k_info.append({'agr_money':agr_money,'return_rate':return_rate,
                                        'sign':np.sign(return_rate)},ignore_index=True)
        # Average return and up-move probability over the reconstructed K-lines
        avg_return = np.mean(k_info['return_rate'])
        prob_up = len(k_info.loc[k_info['sign']>0])/len(k_info)
        fct_alterK = pd.DataFrame({"avg_return":avg_return,"prob_up":prob_up},index=[date])
    return fct_alterK
fct_alterK('2017-12-28','000006.XSHE',N=21,frq='5m')
avg_return prob_up
2017-12-28 NaN NaN
# Compute the reconstructed-K-line factor for every stock and trading day
factor_df_alterK = pd.DataFrame()
# Loop over the date range and compute the factor values
# Special note: to make the submission deadline, only the average-return factor is run here; the up-probability factor is left commented out because it is too time-consuming (T_T)
fct_avg_r = pd.DataFrame()
#--fct_prob_up = pd.DataFrame()
# Special note: since the reconstructed-K-line factor is expensive to compute, only the most recent 150 trading days are used (which still covers a small bull and a small bear phase)
for d in tqdm(date_list[-150:]):
    pool_org = get_index_stocks(stkpool,date=d)
    pool = screener(pool_org,str(d.date()))
    avg_r = pd.DataFrame()
#--    prob_up = pd.DataFrame()
    for s in pool:
        # Special note: to cut run time further, the observation window is temporarily set to one week (5 trading days) and the bar granularity to 30 minutes
        fct = fct_alterK(d,s,5,'30m')
        avg_r = avg_r.append(fct[['avg_return']])
#--        prob_up = prob_up.append(fct[['prob_up']])
    avg_r.index = pool
    avg_r.columns = [d]
#--    prob_up.index = pool
#--    prob_up.columns = [d]
    fct_avg_r = fct_avg_r.append(avg_r.T)
#--    fct_prob_up = fct_prob_up.append(prob_up.T)
fct_avg_r.head(3)
(Output: fct_avg_r.head(3) — rows are trading dates starting 2018-10-19, columns are stock codes; values are the average return per reconstructed K-line.)
# Single-factor analysis of the reconstructed-K-line factor
t0 = time.time() # start timer
far = analyze_factor(factor=fct_avg_r, start_date=date_list[0], end_date=date_list[-1], weight_method='avg', industry='jq_l1', quantiles=quantiles, periods=periods,max_loss=0.5)
t1 = time.time()
print('Single-factor analysis finished in %.2f minutes' %round((t1-t0)/60,2))
# Print the IC summary table
far.plot_information_table(group_adjust=False, method='rank')
Single-factor analysis finished in 0.11 minutes
IC analysis
period_5 period_10 period_20
IC Mean -0.027 -0.033 -0.062
IC Std. 0.172 0.160 0.160
IR -0.154 -0.207 -0.390
t-stat(IC) -1.885 -2.539 -4.776
p-value(IC) 0.061 0.012 0.000
IC Skew -0.828 -0.537 -0.453
IC Kurtosis 1.349 0.304 1.091
# Reconstructed-K-line factor: mean return by quantile (bar chart)
far.plot_quantile_returns_bar(by_group=False, demeaned=0, group_adjust=False)
# Reconstructed-K-line factor: cumulative return curves
# Compute daily cumulative returns by quantile for the chosen holding period
df = far.calc_cumulative_return_by_quantile(period=20)
# Plot
df.plot(figsize=(15,6))

===Conclusion===¶

The study covers January 1, 2015 through May 31, 2019 (factor performance can differ markedly across periods, which is especially visible for the residual momentum factor). Calling create_full_tear_sheet gives a complete factor-analysis overview, but here the factors are compared only on the three most intuitive dimensions.

(1) IC/IR: this replication did not reach the same conclusions as the report. With a 21-day lookback, the best-performing factor is the plain momentum factor: at a 20-day holding period its absolute IC reaches 6.6% and its absolute IR 0.533, the strongest of the whole field. Close behind is the trend momentum factor at a 20-day holding period, whose IR of about 0.5 is only slightly lower.
(2) Monotonicity: across the 5-, 10- and 20-day holding periods, the clearest monotonicity in quantile mean returns again belongs to the plain and trend momentum factors; the purified momentum factor comes next, also showing good monotonicity at every holding period.
(3) Cumulative return curves: how far the quantile curves separate indicates discriminating power, and their ordering indicates monotonicity. On both counts the plain and trend momentum factors come out ahead.

As for the residual momentum factor, which performed poorly here, the quantile mean-return chart explains most of it: sorted by quantile, returns first rise and then fall rather than trending monotonically, though the curvature of that pattern is stable. Reworking it into a dynamic factor that exploits this shape is, in my view, a direction worth researching.

Finally, although the reconstructed-K-line factor was only trial-run on 150 trading days, it shows some monotonicity at the 20-day holding period, and the corresponding cumulative return curves show some separation as well. Better results might be achievable with a larger sample of days.

 
