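Quantitative investing can be divided by holding horizon into long-term and short-term investing, and the two horizons call for different data and methods. Long-term investing usually means value investing: it analyzes macroeconomic data, company fundamentals, and low-frequency price-volume relationships, using the macro outlook for market timing and the fundamentals and low-frequency price-volume relationships for stock selection. Short-term investing trades at a higher frequency, so it requires in-depth analysis of short-horizon data to uncover regularities. Its timing decisions rely mostly on technical factors, while stock selection analyzes price, volume, and similar information to find profitable factors. This article explores stock-selection factors for short-term investing.
Traditional multi-factor models are widely used in the A-share market. Factors that worked for long periods have been so heavily used that they have stopped working in recent years, which creates an urgent need to mine new factors.
Mining a stock-selection factor essentially means finding a dimension along which stocks differ, classifying stocks by it, and earning excess returns. Commonly used data include open, close, high, and low prices and trading volume. Common factor construction approaches include momentum, trend, reversal, correlation analysis, analysis of statistical regularities, goodness-of-fit and residual analysis, and higher-order moment analysis. This article applies higher-order moment analysis to intraday high-frequency data to look for effective factors.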
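1. Use intraday high-frequency data and test whether high-frequency data is useful for factor construction.
2. Compute higher-order moments on differenced high-frequency data and test whether higher-order moments are useful for factor construction.
3. Rebalance weekly. The implementation in this article exposes the rebalancing period as a parameter, so the factor can also be tested under other rebalancing periods.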
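When computing the factors, the research report uses intraday 5-minute prices and takes log differences, which are in essence the 5-minute returns. Higher-order moments are then computed on these short-horizon returns. Skewness measures how asymmetric a distribution is relative to the normal distribution. A negatively skewed distribution has a long left tail and typically satisfies mode > median > mean; positive skew is the opposite. The shape suggests that under negative skew more observations fall on the right, that is, on the side of larger values, so relatively high returns are more frequent. If this historical pattern persists, selecting stocks on it should earn higher returns, which would explain why lower skewness goes with larger excess returns.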
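A minimal sketch of the daily realized moments described above, run on made-up prices rather than market data. The function name `realized_moments` and the sample prices are illustrative assumptions; the formulas follow the ones used in the factor code of this article.

```python
import numpy as np

def realized_moments(close):
    """Return (variance, skewness, kurtosis) of one day's intraday log returns."""
    close = np.asarray(close, dtype=float)
    r = np.diff(np.log(close))            # log differences = short-horizon returns
    rd_var = np.sum(r ** 2)               # realized variance
    if rd_var == 0:                       # flat price: higher moments are undefined
        return 0.0, np.nan, np.nan
    m = len(r)
    rd_skew = np.sqrt(m) * np.sum(r ** 3) / rd_var ** 1.5
    rd_kurt = m * np.sum(r ** 4) / rd_var ** 2
    return rd_var, rd_skew, rd_kurt

# Hypothetical day: several small gains and one large drop -> negative skewness
prices = [10.00, 10.02, 10.04, 10.06, 10.08, 9.90]
var_, skew_, kurt_ = realized_moments(prices)
print(var_, skew_, kurt_)
```

In this toy day most returns are small and positive while the mean is dragged down by a single large loss, which is the left-tailed shape the paragraph above associates with negative skewness.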
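To keep the running time manageable, this article does not use the full period and sample of the research report; it uses a subset instead.
Sample period: January 1, 2015 to March 27, 2019
Sample universe: historical constituents of the CSI 500, excluding stocks listed for less than one year, ST and *ST stocks, and stocks suspended on the trading day
Data frequency: 5-minute closing prices of each stock on each trading day
Grouping: the factor values computed for each stock in the current period, namely realized volatility (RVol_t), realized skewness (RSkew_t), and realized kurtosis (RKurt_t), are sorted in ascending order and split into 5 groups with equal numbers of stocks
Rebalancing period: weekly; group Q1 holds the smallest factor values and group Q5 the largest.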
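For reference, the three factors can be written out as below. This is a transcription of the formulas as they are implemented in the code of this article, not a quotation from the research report. The symbol M (the number of 5-minute log returns in one day) and the price notation P_{t,i} are introduced here for illustration; n is the look-back in days and 242 is the annualization constant used in the code.

```latex
\begin{aligned}
r_{t,i} &= \ln P_{t,i} - \ln P_{t,i-1}, \qquad i = 1,\dots,M \\
RDVar_t &= \sum_{i=1}^{M} r_{t,i}^{2} \\
RDSkew_t &= \frac{\sqrt{M}\,\sum_{i=1}^{M} r_{t,i}^{3}}{RDVar_t^{3/2}} \\
RDKurt_t &= \frac{M \sum_{i=1}^{M} r_{t,i}^{4}}{RDVar_t^{2}} \\
RVol_t &= \sqrt{\frac{242}{n} \sum_{i=0}^{n} RDVar_{t-i}} \\
RSkew_t &= \frac{1}{n} \sum_{i=0}^{n} RDSkew_{t-i} \\
RKurt_t &= \frac{1}{n} \sum_{i=0}^{n} RDKurt_{t-i}
\end{aligned}
```

Note that the sums over days run over n+1 daily values (the current day and the previous n days) while the divisor is n, which matches the code.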
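1. With weekly rebalancing, the skewness-factor stock-selection strategy shows clear excess returns.
2. From 2015 onward, the share of negative ICs is about 60% or higher in every year (the lowest is 59.6% in 2018), so the skewness factor is negatively correlated with next-period returns.
3. The long-short portfolio clearly improves maximum drawdown and volatility in most years of the sample.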
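The final results here follow the same trend as the research report, but the exact numbers differ. Possible reasons:
1. Rebalancing dates are sampled every five trading days from the trading-day series, so because of holidays the rebalancing day is not a fixed weekday (for example, always Monday).
2. Other factors that were not taken into account.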
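The effect of the first point can be illustrated with a small sketch. The calendar below is a simplified, assumed one (weekdays from 2019-03-04 to 2019-04-26 with a single holiday on 2019-04-05), not an exchange calendar, and the variable names are hypothetical.

```python
import pandas as pd

# Assumed trading calendar: weekdays minus one holiday (a Friday)
weekdays = pd.bdate_range('2019-03-04', '2019-04-26')
holidays = pd.DatetimeIndex(['2019-04-05'])
trade_days = weekdays.difference(holidays)

# Sampling used in this article: every 5th trading day
every_5th = trade_days[::5]

# Alternative: the first trading day of each calendar week
s = pd.Series(trade_days, index=trade_days)
first_of_week = s.groupby(trade_days.to_period('W')).first()

print(sorted(set(every_5th.day_name())))
print(sorted(set(first_of_week.dt.day_name())))
```

After the holiday, the every-5th-day schedule shifts from Monday to Tuesday, while the first-trading-day-of-week schedule stays on Monday in this example.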
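1. The research report applies higher-order moment analysis to the data directly. A basic correlation analysis is also possible: compute the correlation between the differenced prices of a stock and those of the benchmark. This correlation reflects, to some extent, how idiosyncratic the stock is, and correlation analysis performs well in some high-frequency strategies.
2. The computation windows can be varied, including N and n in the factor calculation as well as the rebalancing period.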
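A minimal sketch of the correlation idea in the first point, on synthetic prices. The data-generating process, the helper name `benchmark_corr`, and all parameter values are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic 5-minute log returns (assumed, not market data)
bench_r = rng.normal(0, 0.001, 240)
follower_r = 0.9 * bench_r + rng.normal(0, 0.0003, 240)  # tracks the benchmark
loner_r = rng.normal(0, 0.001, 240)                       # mostly idiosyncratic

def to_price(log_returns, start=10.0):
    """Turn log returns into a price path."""
    return pd.Series(start * np.exp(np.cumsum(log_returns)))

def benchmark_corr(stock_price, bench_price):
    """Pearson correlation between the log-price differences of stock and benchmark."""
    stock_diff = np.log(stock_price).diff().dropna()
    bench_diff = np.log(bench_price).diff().dropna()
    return stock_diff.corr(bench_diff)

bench_price = to_price(bench_r, start=5000.0)
corr_follower = benchmark_corr(to_price(follower_r), bench_price)
corr_loner = benchmark_corr(to_price(loner_r), bench_price)
print(corr_follower, corr_loner)
```

A stock whose correlation with the benchmark is low moves largely on its own, which is the kind of stock-specific behaviour a factor could try to capture.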
import numpy as np
import pandas as pd
import datetime
import statsmodels.api as sm
from jqdata import *
import matplotlib
import matplotlib.pyplot as plt
import time
import warnings
import pickle
import seaborn as sns
warnings.filterwarnings('ignore')
matplotlib.rcParams['axes.unicode_minus']=False
# Parameters
start_date = '2015-01-01'
end_date = '2019-03-27'
N = 48 # number of 5-minute bars per trading day
n = 5 # number of days to look back
cycle = 5 # rebalancing period, in trading days
def get_tradeday_list(start, end, frequency=None, count=None):
    '''
    Get a list of trading dates.
    input:
    start: str or datetime, start date; use either start or count
    end: str or datetime, end date
    frequency: str (day, month, quarter, halfyear; default day), or int to keep every frequency-th trading day
    count: int, use either count or start; start is used by default
    '''
    if isinstance(frequency, int):
        all_trade_days = get_trade_days(start, end)
        trade_days = all_trade_days[::frequency]
        days = [datetime.datetime.strftime(i, '%Y-%m-%d') for i in trade_days]
        return days
    if count is not None:
        df = get_price('000001.XSHG', end_date=end, count=count)
    else:
        df = get_price('000001.XSHG', start_date=start, end_date=end)
    if frequency is None or frequency == 'day':
        days = df.index
    else:
        df['year-month'] = [str(i)[0:7] for i in df.index]
        if frequency == 'month':
            days = df.drop_duplicates('year-month').index
        elif frequency == 'quarter':
            df['month'] = [str(i)[5:7] for i in df.index]
            df = df[(df['month'] == '01') | (df['month'] == '04') | (df['month'] == '07') | (df['month'] == '10')]
            days = df.drop_duplicates('year-month').index
        elif frequency == 'halfyear':
            df['month'] = [str(i)[5:7] for i in df.index]
            df = df[(df['month'] == '01') | (df['month'] == '06')]
            days = df.drop_duplicates('year-month').index
    trade_days = [datetime.datetime.strftime(i, '%Y-%m-%d') for i in days]
    return trade_days
def cut_data_with_quantile(data,ncut= 5):
    '''
    Group the data by quantiles.
    input:
    data: pd.Series, index is the stock code, values are the factor values
    ncut: number of groups
    output:
    res: list of groups; each group is a list of stock codes, ordered from smallest to largest factor value
    '''
    if isinstance(data,pd.DataFrame):
        col = list(data.columns)[0]
        data = data[col]
    q = 1/ncut
    l_q = []
    l_q.append(data.min()-1)  # lower bound below the minimum so the smallest value is included
    for i in range(ncut):
        qan = data.quantile(q*(i+1))
        l_q.append(qan)
    res = []
    for k in range(ncut):
        r = data[(data>l_q[k])&(data<=l_q[k+1])]
        ind = list(r.index)
        res.append(ind)
    return res
def cut_data_with_num(data,ncut=5):
    '''
    Group by count: sort in ascending order and split into groups of equal size.
    input:
    data: pd.Series, index is the stock code, values are the factor values
    ncut: number of groups
    output:
    res: list of groups; each group is a list of stock codes, ordered from smallest to largest factor value
    '''
    if isinstance(data,pd.Series):
        data = data.to_frame()
    col = list(data.columns)[0]
    data = data.sort_values(by=col)
    length = len(data)
    ind = list(data.index)
    res = []
    for i in range(ncut):
        r = ind[int((i*length)/ncut):int((i+1)*length/ncut)]
        res.append(r)
    return res
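A small standalone sketch of equal-count grouping on a toy factor series, to show what the grouping functions above return. The function name `split_equal_count`, the stock labels, and the factor values are made up for illustration.

```python
import pandas as pd

def split_equal_count(factor, ncut=5):
    """Sort by factor value and cut the index into ncut groups of near-equal size."""
    ordered = list(factor.sort_values().index)
    size = len(ordered)
    return [ordered[i * size // ncut:(i + 1) * size // ncut] for i in range(ncut)]

# Hypothetical factor values for ten stocks
factor = pd.Series({'A': 0.3, 'B': -1.2, 'C': 0.8, 'D': -0.1, 'E': 2.0,
                    'F': 0.0, 'G': -0.7, 'H': 1.1, 'I': 0.5, 'J': -2.5})
groups = split_equal_count(factor)
print(groups[0])   # Q1: smallest factor values
print(groups[-1])  # Q5: largest factor values
```

Grouping by count guarantees that every group holds the same number of stocks, whereas grouping by quantile boundaries can produce uneven groups when many stocks share the same factor value.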
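# Maximum drawdown
def find_max_drawdown(returns):
    '''
    returns: Series of cumulative returns (net value)
    '''
    # largest drawdown seen so far
    result = 0
    # highest cumulative return seen so far
    historical_return = 0
    # walk through all dates
    for i in range(len(returns)):
        # update the running peak
        historical_return = max(historical_return, returns.iloc[i])
        # drawdown relative to the running peak
        drawdown = 1 - (returns.iloc[i]) / (historical_return)
        # keep the largest drawdown
        result = max(drawdown, result)
    # return the maximum drawdown
    return result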
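The same quantity can be computed without an explicit loop. The sketch below is one possible vectorized variant, not the author's implementation; the name `max_drawdown` and the sample net-value series are assumptions. It takes the first value as the initial peak, which gives the same result as the loop above for positive net values.

```python
import numpy as np

def max_drawdown(cum_returns):
    """Largest peak-to-trough decline of a cumulative return (net value) series."""
    values = np.asarray(cum_returns, dtype=float)
    running_peak = np.maximum.accumulate(values)  # highest value seen so far
    drawdowns = 1 - values / running_peak          # decline from that peak
    return drawdowns.max()

# Hypothetical net-value path: peak 1.20, trough 0.90, new peak 1.30, then 1.04
nav = [1.00, 1.20, 0.90, 1.10, 1.30, 1.04]
print(max_drawdown(nav))
```

Here the largest decline is the fall from 1.20 to 0.90, a drawdown of about 25%; the later fall from 1.30 to 1.04 is only about 20%.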
Stock filter function
def stocks_filter(stocks,date,n=250):
    '''
    Remove ST and *ST stocks, suspended stocks, and stocks listed for fewer than n days.
    input:
    stocks: list, stock codes
    date: str, date
    n: int, minimum number of days since listing
    output:
    set, the stocks that pass the filter
    '''
    # ST stocks
    st_data = get_extras('is_st', stocks, count = 1, end_date=date)
    st_stocks = [stock for stock in stocks if st_data[stock][0]]
    # suspended stocks
    paused = get_price(stocks,end_date=date,count=1,fields='paused')['paused']
    paused_stocks = [stock for stock in stocks if paused[stock][0]]
    tmpList = []
    # stocks listed for fewer than n days
    date = datetime.datetime.strptime(date,'%Y-%m-%d').date()
    for stock in stocks:
        days_public = date - get_security_info(stock).start_date
        days_public = days_public.days
        if days_public < n:
            tmpList.append(stock)
    remove_stocks = set(st_stocks) | set(paused_stocks) | set(tmpList)
    sel_stocks = set(stocks) - remove_stocks
    # what remains excludes ST, suspended, and newly listed stocks
    return sel_stocks
def get_filter_stocks_period(date_list,index='000982.XSHG',n=250):
    '''
    Get the filtered stocks for every date in a date series.
    input:
    date_list: list, date series
    index: index code, default CSI 500; use 'all' to select from the whole market
    n: int, remove stocks listed for fewer than n days
    output:
    dict, key is the date, value is the filtered stocks for that date
    '''
    dic = {}
    if index != 'all':  # index constituents (CSI 500 by default)
        for date in date_list:
            stocks = get_index_stocks(index,date)
            filter_stocks = stocks_filter(stocks,date,n)
            dic[date] = filter_stocks
    else:
        stocks = list(get_all_securities().index)  # whole market
        for date in date_list:
            filter_stocks = stocks_filter(stocks,date,n)
            dic[date] = filter_stocks
    return dic
# Build the date series
date_list = get_tradeday_list(start_date,end_date)
# Filter the stocks on every date and save the result
stocks_dic = get_filter_stocks_period(date_list,index='000982.XSHG',n=250)
with open('stocks_dic.pkl','wb') as pk_file:
    pickle.dump(stocks_dic,pk_file)
with open('stocks_dic.pkl','rb') as pk_file:
    stocks_dic = pickle.load(pk_file)
# Small subset, used for testing only
date_list_sel = date_list[:8]
stocks_dic_sel = {}
for date in date_list_sel:
    stocks_dic_sel[date] = stocks_dic[date]
# The factor uses the current day plus the previous n days of data; returns use the close of the current day and of the fifth trading day ahead. In practice, sell the holdings and buy the new stocks before the close of the current day.
def caculate_factor(stocks_dic,N=48,n=5,fre='5m'):
    '''
    Compute the factor values.
    input:
    stocks_dic: dict, key is the trading date, value is the filtered stock codes for that date
    N: int, tied to fre; with one bar every 5 minutes and 4 trading hours per day, N=48
    n: int, number of days to look back
    fre: bar frequency, default 5m
    output:
    dict, key is the date, value is a DataFrame indexed by stock code whose columns are the factor values
    '''
    st = time.time()
    n = n + 1  # the report sums the current day and the previous n days, so n+1 daily values are added
    date_list = list(stocks_dic.keys())
    len_date = len(date_list)
    dic = {}
    for i in range(n,len_date-1):
        date = date_list[i-1]
        pre_date = date_list[i-n : i]
        pre_date_for_minute_fre = date_list[i-n+1 : i+1]  # with minute bars, get_price returns the data of the day before end_date
        # keep only stocks that pass the filter on all of the past n days (intersection)
        stocks = set(stocks_dic[pre_date[0]])
        for d in pre_date[1:]:
            stocks = stocks & set(stocks_dic[d])
        rvol_l = []
        rskew_l = []
        rkurt_l = []
        stock_l = []
        for stock in stocks:
            date_var = []
            date_skew = []
            date_kurt = []
            date_l = []
            for date_minute in pre_date_for_minute_fre:
                price = get_price(stock,end_date=date_minute,count=N,frequency=fre,fields='close')['close']
                price = np.log(price)
                last_price = price.shift()
                r = (price - last_price).dropna()  # 5-minute log returns
                # realized variance of the day
                RDVar = sum(r ** 2)
                if RDVar == 0:  # a stock locked at its price limit has zero variance
                    continue
                RDSkew = (len(r)**(1/2)) * sum(r**3) / (RDVar ** (3/2))  # daily skewness
                RDKurt = len(r) * sum(r**4) / (RDVar ** 2)  # daily kurtosis
                date_var.append(RDVar)
                date_skew.append(RDSkew)
                date_kurt.append(RDKurt)
                date_l.append(date_minute)  # record the usable days, since some days may be skipped
            if len(date_var) == 0:
                continue  # no usable day for this stock, skip it
            date_df = pd.DataFrame(date_var,index=date_l,columns=['var'])
            date_df['skew'] = date_skew
            date_df['kurt'] = date_kurt
            length = len(date_df) - 1
            if length == 0:
                length = 1
            rvol = ((242/length)*sum(date_df['var'])) ** (1/2)
            rskew = sum(date_df['skew'])/length
            rkurt = sum(date_df['kurt'])/length
            rvol_l.append(rvol)
            rskew_l.append(rskew)
            rkurt_l.append(rkurt)
            stock_l.append(stock)
        stocks_df = pd.DataFrame(rvol_l,index=stock_l,columns=['RVol'])
        stocks_df['RSkew'] = rskew_l
        stocks_df['RKurt'] = rkurt_l
        dic[date] = stocks_df
    et = time.time()
    t = (et - st) / 60
    print('time (minutes):',t)
    return dic
# This function is slow, so save the result for later reuse
res_day = caculate_factor(stocks_dic,N=N,n=n)
with open('factor_dic_day_new.pkl','wb') as pk_file:
    pickle.dump(res_day,pk_file)
with open('factor_dic_day_new.pkl','rb') as pk_file:
    res = pickle.load(pk_file)
keys = list(res.keys())
# Distribution of the factors over the historical sample
resample_keys = keys[::10] # keep every 10th date (10% sample)
resample_l = []
for key in resample_keys:
    resample_l.append(res[key])
resample_df = pd.concat(resample_l)
figure = plt.figure(figsize=(18,8))
ax1 = plt.subplot(131)
plt.title('Market Volatility')
sns.kdeplot(resample_df['RVol'], shade=True, color="g", label="RVol",alpha=.7)
ax2 = plt.subplot(132)
plt.title('Market Skewness')
plt.yticks()
sns.kdeplot(resample_df['RSkew'], shade=True, color="g", label="RSkew",alpha=.7)
ax3 = plt.subplot(133)
plt.title('Market Kurtosis')
sns.kdeplot(resample_df['RKurt'], shade=True, color="g", label="RKurt",alpha=.7)
Percentile data and plots
q1 = []
q2 = []
q3 = []
for key in resample_keys:
    quantile = res[key].quantile([0.1,0.25,0.5,0.75,0.9])
    qvol = quantile['RVol']
    qvol.name = key
    q1.append(qvol)
    qskew = quantile['RSkew']
    qskew.name = key
    q2.append(qskew)
    qkurt = quantile['RKurt']
    qkurt.name = key
    q3.append(qkurt)
q1_df = pd.concat(q1,axis=1)
q1_df = q1_df.stack().unstack(0)
q2_df = pd.concat(q2,axis=1)
q2_df = q2_df.stack().unstack(0)
q3_df = pd.concat(q3,axis=1)
q3_df = q3_df.stack().unstack(0)
figure = plt.figure(figsize=(12,15))
ax1 = plt.subplot(311)
plt.title('Volatility percentiles')
q1_df.plot(ax=ax1,xticks=range(0,len(q1_df),20))
ax2 = plt.subplot(312)
plt.title('Skewness percentiles')
q2_df.plot(ax=ax2,xticks=range(0,len(q2_df),20))
ax3 = plt.subplot(313)
plt.title('Kurtosis percentiles')
q3_df.plot(ax=ax3,xticks=range(0,len(q3_df),20))
def group_profit(week_list,stocks_dic,factor_dic):
    '''
    Compute the returns of each factor group.
    input:
    week_list: list, rebalancing dates; this strategy rebalances weekly
    stocks_dic: dict, key is the date, value is the filtered stocks
    factor_dic: dict, key is the date, value is the factor values
    output:
    six DataFrames: per-period returns and cumulative returns for each of the three factors
    '''
    length = len(week_list)
    profit_vol_list = []
    profit_skew_list = []
    profit_kurt_list = []
    date_l = []
    for i in range(length-1):
        date = week_list[i]
        date_pro = week_list[i+1]
        stocks = list(stocks_dic[date])
        # return over the next period
        price = get_price(stocks,end_date=date_pro,count=2,frequency='5d',fields=['close'])['close']
        profit = price.pct_change().dropna(how='all')
        profit = profit.stack().unstack(0)
        factor = factor_dic[date]
        factor_vol = factor['RVol']
        factor_skew = factor['RSkew']
        factor_kurt = factor['RKurt']
        cut_list_vol = cut_data_with_num(factor_vol)
        cut_list_skew = cut_data_with_num(factor_skew)
        cut_list_kurt = cut_data_with_num(factor_kurt)
        # equal-weighted mean return of each volatility group
        cut_vol_l = []
        for sel_stocks in cut_list_vol:
            stocks_mean_profit = profit.loc[sel_stocks].mean()
            cut_vol_l.append(stocks_mean_profit)
        cut_vol_df = pd.concat(cut_vol_l,axis=1)
        profit_vol_list.append(cut_vol_df)
        # equal-weighted mean return of each skewness group
        cut_skew_l = []
        for sel_stocks in cut_list_skew:
            stocks_mean_profit = profit.loc[sel_stocks].mean()
            cut_skew_l.append(stocks_mean_profit)
        cut_skew_df = pd.concat(cut_skew_l,axis=1)
        profit_skew_list.append(cut_skew_df)
        # equal-weighted mean return of each kurtosis group
        cut_kurt_l = []
        for sel_stocks in cut_list_kurt:
            stocks_mean_profit = profit.loc[sel_stocks].mean()
            cut_kurt_l.append(stocks_mean_profit)
        cut_kurt_df = pd.concat(cut_kurt_l,axis=1)
        profit_kurt_list.append(cut_kurt_df)
        date_l.append(date)
    profit_df_vol = pd.concat(profit_vol_list)
    profit_df_vol.index = date_l
    profit_df_vol.columns = range(1,6)
    profit_cump_vol = (profit_df_vol + 1).cumprod()
    profit_df_skew = pd.concat(profit_skew_list)
    profit_df_skew.index = date_l
    profit_df_skew.columns = range(1,6)
    profit_cump_skew = (profit_df_skew + 1).cumprod()
    profit_df_kurt = pd.concat(profit_kurt_list)
    profit_df_kurt.index = date_l
    profit_df_kurt.columns = range(1,6)
    profit_cump_kurt = (profit_df_kurt + 1).cumprod()
    return profit_df_vol,profit_cump_vol,profit_df_skew,profit_cump_skew,profit_df_kurt,profit_cump_kurt
# Weekly rebalancing
week_list = keys[::cycle] # sample every 5th trading day
profit_df_vol,profit_cump_vol,profit_df_skew,profit_cump_skew,profit_df_kurt,profit_cump_kurt = group_profit(week_list,stocks_dic,res)
# Note: if the rebalancing period is changed, the fre argument of this function must be changed to match
def get_base_profit(week_list,fre=str(cycle)+'d',base_index='000982.XSHG'):
    '''
    Compute the benchmark returns.
    input:
    week_list: list, date series
    fre: '5d' in this strategy; it must match the spacing of the dates in week_list
    base_index: index code, default CSI 500
    output:
    DataFrame, index is the date, values are the returns
    '''
    length = len(week_list)
    base_profit_l = []
    date_l = []
    # benchmark return for each period
    for i in range(length-1):
        date = week_list[i]
        date_pro = week_list[i+1]
        price = get_price(base_index,end_date=date_pro,count=2,frequency=fre,fields=['close'])['close']
        profit = price.pct_change().dropna(how='all')
        base_profit_l.append(profit)
        date_l.append(date)
    base_profit = pd.concat(base_profit_l)
    base_profit.index = date_l
    return base_profit
figure = plt.figure(figsize=(12,16))
ax1 = plt.subplot(311)
plt.title('Volatility profit')
profit_cump_vol.plot(ax=ax1)
ax2 = plt.subplot(312)
plt.title('Skewness profit')
profit_cump_skew.plot(ax=ax2)
ax3 = plt.subplot(313)
plt.title('Kurtosis profit')
profit_cump_kurt.plot(ax=ax3,xticks=range(0,len(profit_cump_kurt),30))
# Compute the IC
# Weekly rebalancing
def caculate_IC(week_list,factor_dic):
    '''
    Compute the IC.
    input:
    week_list: list, rebalancing dates; this strategy rebalances weekly
    factor_dic: dict, key is the date, value is the factor values
    output:
    DataFrame, index is the date, values are the IC
    '''
    length = len(week_list)
    ic_l = []
    date_l = []
    for i in range(length-1):
        date = week_list[i]
        date_pro = week_list[i+1]
        stocks = list(stocks_dic[date])
        price = get_price(stocks,end_date=date_pro,count=2,frequency='5d',fields=['close'])['close']
        profit = price.pct_change().dropna(how='all')  # the index is the next period's date, but the return belongs to the current period
        profit = profit.stack().unstack(0)
        factor = factor_dic[date]['RSkew']
        ic_day = profit.corrwith(factor)
        ic_l.append(ic_day.values)
        date_l.append(date)
    ic_df = pd.DataFrame(ic_l,index=date_l,columns=['IC'])
    return ic_df
ic_df = caculate_IC(week_list,res)
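To make the IC concrete, here is a minimal sketch for a single rebalancing date on synthetic data. The data-generating process (next-period return decreasing slightly in the factor) and all names are assumptions for illustration; the Pearson version corresponds to the `corrwith` call above, and the rank version is a common alternative that is less sensitive to outliers.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
stocks = ['S%03d' % i for i in range(300)]

# Assumed cross-section: factor values and noisy next-period returns
factor = pd.Series(rng.normal(size=300), index=stocks)
next_ret = pd.Series(-0.01 * factor.values + rng.normal(0, 0.03, 300), index=stocks)

# Pearson IC: linear correlation between factor and next-period return
ic = factor.corr(next_ret)

# Rank IC: Pearson correlation of the ranks (equivalent to Spearman correlation)
rank_ic = factor.rank().corr(next_ret.rank())

print(ic, rank_ic)
```

A negative IC means that stocks with lower factor values tended to earn higher returns in the following period, which is the pattern reported for realized skewness in this article.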
IC statistics over the full period
# IC over the full period
max_ic = ic_df.max()
min_ic = ic_df.min()
std_ic = ic_df.std()
nag_value = (ic_df < 0).astype(int)
nag_ratio = nag_value.sum() / len(nag_value)
print('Min IC:',min_ic.values[0])
print('Max IC:',max_ic.values[0])
print('IC std:',std_ic.values[0])
print('Negative IC ratio:',nag_ratio.values[0])
Min IC: -0.3173311157771765
Max IC: 0.22862344636517615
IC std: 0.09732507632155726
Negative IC ratio: 0.6617647058823529
# IC and its rolling mean
ic_rolling_mean = ic_df.rolling(12).mean()
ic_merge_df = pd.concat([ic_df,ic_rolling_mean],axis=1).dropna()
ic_merge_df.columns = ['IC','IC rolling mean']
figure = plt.figure(figsize=(12,6))
ax = plt.subplot(111)
plt.title('IC and rolling mean')
ic_merge_df.plot(ax=ax,xticks=range(0,len(ic_merge_df),20))
# Benchmark returns
base_profit = get_base_profit(week_list)
base_profit_cump = (base_profit + 1).cumprod()
# Strategy returns
buy_profit = profit_df_vol.iloc[:,0]
buy_profit_cump = (buy_profit + 1).cumprod()
# Return difference
delta_profit = buy_profit - base_profit
figure = plt.figure(figsize=(12,6))
ax = plt.subplot(111)
ax_sub = ax.twinx()
plt.title('Return curves')
ax.plot(base_profit_cump,'r',label='Benchmark return')
ax.plot(buy_profit_cump,'g',label='Long return')
ax_sub.plot(delta_profit,label='Return difference (right axis)')
plt.xticks(list(base_profit_cump.index)[::30])
ax.legend(loc='upper left')
ax_sub.legend(loc='upper right')
Yearly performance indicators
# Indicators by year
def indicator_caculate(week_list,profit_df_vol,factor_dic,base_index='000982.XSHG'):
    base_profit = get_base_profit(week_list,base_index=base_index)
    ind = list(profit_df_vol.index)
    profit_buy = profit_df_vol.iloc[:,0]    # long leg: group Q1
    profit_sell = profit_df_vol.iloc[:,-1]  # short leg: group Q5
    week_date_set = list(set(week_list) & set(ind))
    week_date_set.sort()
    year_num = set([y[:4] for y in week_date_set])
    year_num = list(year_num)
    year_num.sort()
    cumprod_profit_l = []
    buy_sell_cumprod_profit_l = []
    buy_maxdrowdown_l = []
    buy_sell_maxdrowdowm_l = []
    buy_volitility_l = []
    buy_sell_volitility_l = []
    ir_l = []
    for y in year_num:
        year_date = [i for i in week_date_set if y == i[:4]]
        profit_buy_year = profit_buy.loc[year_date]
        base_profit_year = base_profit.loc[year_date]
        profit_buy_year_cump = (profit_buy_year + 1).cumprod()
        profit_sell_year = profit_sell.loc[year_date]
        # returns
        buy_sell = profit_buy_year - profit_sell_year
        buy_sell_cump = (buy_sell + 1).cumprod()
        cumprod_profit = profit_buy_year_cump.iloc[-1] -1
        cumprod_profit_l.append(cumprod_profit)
        buy_sell_cumprod_profit = buy_sell_cump.iloc[-1] -1
        buy_sell_cumprod_profit_l.append(buy_sell_cumprod_profit)
        # maximum drawdown
        buy_maxdrowdown = find_max_drawdown(profit_buy_year_cump)
        buy_maxdrowdown_l.append(buy_maxdrowdown)
        buy_sell_maxdrowdowm = find_max_drawdown(buy_sell_cump)
        buy_sell_maxdrowdowm_l.append(buy_sell_maxdrowdowm)
        # annualized volatility
        buy_volitility = profit_buy_year.std()
        buy_volitility_l.append(buy_volitility*len(profit_buy_year)**(1/2))
        buy_sell_volitility = buy_sell.std()
        buy_sell_volitility_l.append(buy_sell_volitility*len(buy_sell)**(1/2))
        # information ratio
        base_profit_cump = (base_profit_year + 1).cumprod()  # cumulative benchmark return
        base_last_profit = base_profit_cump.iloc[-1] -1  # benchmark return for the year
        delta_last_profit = cumprod_profit - base_last_profit  # strategy return minus benchmark return for the year
        delta_profit = profit_buy_year - base_profit_year  # per-period return difference between strategy and benchmark
        ir = delta_last_profit / (delta_profit.std()*len(delta_profit)**(1/2))  # standard deviation scaled to the year
        ir_l.append(ir)
    indicator_df = pd.DataFrame(cumprod_profit_l,index=year_num,columns=['Return'])
    indicator_df['Long-short return'] = buy_sell_cumprod_profit_l
    indicator_df['Max drawdown'] = buy_maxdrowdown_l
    indicator_df['Long-short max drawdown'] = buy_sell_maxdrowdowm_l
    indicator_df['Annualized volatility'] = buy_volitility_l
    indicator_df['Long-short annualized volatility'] = buy_sell_volitility_l
    indicator_df['Information ratio'] = ir_l
    columns = indicator_df.columns
    # format every column except the information ratio as a percentage
    for col in columns[:-1]:
        indicator_df[col] = indicator_df[col].apply(lambda x: '%.2f%%' % (x*100))
    return indicator_df
indicator_df = indicator_caculate(week_list,profit_df_vol,res)
indicator_df
Year | Return | Long-short return | Max drawdown | Long-short max drawdown | Annualized volatility | Long-short annualized volatility | Information ratio |
---|---|---|---|---|---|---|---|
2015 | 72.48% | 62.31% | 41.11% | 9.56% | 52.10% | 24.25% | 3.625162 |
2016 | 5.13% | 23.44% | 11.96% | 7.93% | 26.21% | 14.46% | 1.637960 |
2017 | -6.79% | -5.10% | 12.61% | 13.18% | 14.50% | 14.72% | -0.644916 |
2018 | -31.77% | 9.33% | 34.73% | 8.92% | 24.99% | 11.95% | 0.644675 |
2019 | 28.35% | -9.45% | 2.51% | 16.15% | 9.58% | 11.09% | -1.110282 |
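The information ratio in the last column follows a specific convention: the difference in cumulative return over the year divided by the standard deviation of per-period active returns scaled by the square root of the number of periods. A minimal standalone sketch of that calculation is shown below; the function name `information_ratio` and the sample returns are assumptions, and other definitions of the information ratio exist.

```python
import numpy as np

def information_ratio(strategy_r, benchmark_r):
    """Excess cumulative return over scaled tracking error, for one year of per-period returns."""
    strategy_r = np.asarray(strategy_r, dtype=float)
    benchmark_r = np.asarray(benchmark_r, dtype=float)
    # difference of cumulative returns over the whole window
    excess = np.prod(1 + strategy_r) - np.prod(1 + benchmark_r)
    # per-period active returns and their sample standard deviation (ddof=1, as in pandas)
    active = strategy_r - benchmark_r
    tracking_error = active.std(ddof=1) * np.sqrt(len(active))
    return excess / tracking_error

# Hypothetical weekly returns for four periods
strategy = [0.02, -0.01, 0.03, 0.00]
benchmark = [0.01, -0.02, 0.01, 0.01]
ir = information_ratio(strategy, benchmark)
print(ir)
```

With only a few periods in a year (as in the partial year 2019), the standard deviation in the denominator is estimated from very few observations, so the ratio can swing widely.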
Yearly IC indicators
def IC_indicator(week_list,factor_dic):
    year_num = set([y[:4] for y in week_list])
    year_num = list(year_num)
    year_num.sort()
    ic_mean_l = []
    ic_min_l = []
    ic_max_l = []
    ic_std_l = []
    ic_nag_l = []
    for y in year_num:
        year_date = [i for i in week_list if y == i[:4]]
        ic = caculate_IC(year_date,factor_dic)
        ic_mean_l.append(ic.mean().values[0])
        ic_min_l.append(ic.min().values[0])
        ic_max_l.append(ic.max().values[0])
        ic_std_l.append(ic.std().values[0])
        # share of periods with a negative IC
        nag_value = (ic < 0).astype(int)
        nag_ratio = nag_value.sum().values[0] / len(nag_value)
        ic_nag_l.append(nag_ratio)
    ic_indicator = pd.DataFrame(ic_mean_l,index=year_num,columns=['IC mean'])
    ic_indicator['IC min'] = ic_min_l
    ic_indicator['IC max'] = ic_max_l
    ic_indicator['IC std'] = ic_std_l
    ic_indicator['Negative IC ratio'] = ic_nag_l
    return ic_indicator
IC_indicator(week_list,res)
Year | IC mean | IC min | IC max | IC std | Negative IC ratio |
---|---|---|---|---|---|
2015 | -0.045475 | -0.262305 | 0.171658 | 0.101779 | 0.723404 |
2016 | -0.044675 | -0.317331 | 0.228623 | 0.098602 | 0.666667 |
2017 | -0.034935 | -0.182469 | 0.130435 | 0.079650 | 0.687500 |
2018 | -0.032929 | -0.253466 | 0.202934 | 0.094992 | 0.595745 |
2019 | 0.018365 | -0.196028 | 0.223170 | 0.150785 | 0.600000 |
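1. Higher-order moments of individual stocks are constructed from high-frequency price data.
2. The empirical results show that, with weekly rebalancing, volatility and kurtosis do little to separate stock returns, whereas skewness separates the group returns of CSI 500 constituents clearly and monotonically.
3. From 2015 onward, the share of negative ICs is about 60% or higher in every year (the lowest is 59.6% in 2018), so the skewness factor is negatively correlated with next-period returns.
4. The long-short portfolio clearly improves maximum drawdown and volatility in most years of the sample.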