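Applied to futures. I took apart @一梦春秋's code; in the end I could not be bothered to copy over the conclusions.
References: GF Securities (广发证券), "Short-Horizon Stock Selection Factors Based on Intraday High-Frequency Data: High-Frequency Data Factor Research Series, Part 1" (《基于日内高频数据的短周期选股因子研究-高频数据因子研究系列一》);
@一梦春秋 https://www.joinquant.com/view/community/detail/417356f8e6a03f2b42952844e0c587b8?type=1
The factor construction is taken from the research report. The factors are built as follows: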
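For each stock on trading day t, first compute the return over the i-th bar at the chosen minute frequency, $r_{t,i} = p_{t,i} - p_{t,i-1}$, where $p_{t,i}$ is the log price of the stock at the i-th bar on trading day t and $p_{t,i-1}$ is the log price at the (i-1)-th bar on the same day.
For each stock, use $r_{t,i}$ to compute the realized variance $RDVar_t$, the realized skewness $RDSkew_t$ and the realized kurtosis $RDKurt_t$ on trading day t: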
$$RDVar_t = \sum\limits_{i=1}^{N} r_{t,i}^2$$
$$RDSkew_t = \frac{\sqrt{N}\sum\limits_{i=1}^{N} r_{t,i}^3}{RDVar_t^{3/2}}$$
$$RDKurt_t = \frac{N \sum\limits_{i=1}^{N} r_{t,i}^4}{RDVar_t^2}$$
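Here N is the number of intraday observations for the stock at the chosen bar frequency on trading day t. For example, with 1-minute bars N = 60*4 = 240, and with 5-minute bars N = 240/5 = 48. These counts are for the four-hour stock session used in the report; the futures code below uses the number of bars actually returned for each day.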
$$RSkew_t = \frac{1}{n}\sum\limits_{i=0}^{n-1} RDSkew_{t-i}$$
$$RKurt_t = \frac{1}{n}\sum\limits_{i=0}^{n-1} RDKurt_{t-i}$$
where n is the length of the averaging window in trading days (n = 5 in the code below).
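The daily formulas above can be sketched in vectorized form. This is a minimal illustration rather than the report's or the original author's implementation: the function name `realized_moments` is hypothetical, and it takes N to be the number of returns, whereas the loop later in this post uses the number of bars.

```python
import numpy as np

def realized_moments(prices):
    """Return (RDVar, RDSkew, RDKurt) for one day of intraday prices."""
    log_p = np.log(np.asarray(prices, dtype=float))
    r = np.diff(log_p)          # r_{t,i} = p_{t,i} - p_{t,i-1}
    n_obs = len(r)              # assumption: N is the number of returns
    rd_var = np.sum(r ** 2)
    if rd_var == 0:             # flat prices: ratios undefined, return zeros
        return 0.0, 0.0, 0.0
    rd_skew = np.sqrt(n_obs) * np.sum(r ** 3) / rd_var ** 1.5
    rd_kurt = n_obs * np.sum(r ** 4) / rd_var ** 2
    return rd_var, rd_skew, rd_kurt

# Hypothetical prices, for illustration only
rd_var, rd_skew, rd_kurt = realized_moments([100.0, 101.0, 100.0, 102.0, 101.5])
```

Two returns of equal size and opposite sign give zero realized skewness, which is a quick sanity check on the sign convention.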
import numpy as np
import pandas as pd
import math
from jqdata import *
import matplotlib.pyplot as plt
from datetime import date, timedelta
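# Shanghai Futures Exchange (SHFE)
shang = {'AG8888.XSGE':'Silver futures index', 'PB8888.XSGE':'Lead futures index',
         'AU8888.XSGE':'Gold futures index', 'RB8888.XSGE':'Rebar futures index',
         'AL8888.XSGE':'Aluminum futures index', 'RU8888.XSGE':'Natural rubber futures index',
         'BU8888.XSGE':'Bitumen futures index', 'SN8888.XSGE':'Tin futures index',
         'CU8888.XSGE':'Copper futures index',
         'FU8888.XSGE':'Fuel oil futures index', 'ZN8888.XSGE':'Zinc futures index',
         'HC8888.XSGE':'Hot-rolled coil futures index', 'NI8888.XSGE':'Nickel futures index',
         'SP8888.XSGE':'Pulp main contract',}
# Zhengzhou Commodity Exchange (CZCE)
zheng = {'RM8888.XZCE':'Rapeseed meal futures index',
         'CF8888.XZCE':'Cotton futures index', 'FG8888.XZCE':'Glass futures index',
         'SF8888.XZCE':'Ferrosilicon futures index',
         'SM8888.XZCE':'Silicomanganese futures index', 'MA8888.XZCE':'Methanol futures index',
         'SR8888.XZCE':'White sugar futures index',
         'TA8888.XZCE':'PTA futures index', 'OI8888.XZCE':'Rapeseed oil futures index',
         'ZC8888.XZCE':'Thermal coal futures index',
         'AP8888.XZCE':'Apple futures index', 'CJ8888.XZCE':'Red date contract',}
# Dalian Commodity Exchange (DCE)
da = {'A8888.XDCE':'No. 1 soybean futures index', 'JD8888.XDCE':'Egg futures index',
      'B8888.XDCE':'No. 2 soybean futures index', 'JM8888.XDCE':'Coking coal futures index',
      'L8888.XDCE':'Polyethylene futures index',
      'C8888.XDCE':'Corn futures index', 'M8888.XDCE':'Soybean meal futures index',
      'CS8888.XDCE':'Corn starch futures index', 'P8888.XDCE':'Palm oil futures index',
      'PP8888.XDCE':'Polypropylene futures index',
      'I8888.XDCE':'Iron ore futures index', 'V8888.XDCE':'PVC futures index',
      'J8888.XDCE':'Coke futures index', 'Y8888.XDCE':'Soybean oil futures index',
      'EG8888.XDCE':'Ethylene glycol futures index',}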
futures = list(shang.keys()) + list(zheng.keys()) + list(da.keys())
futures[:5]
['AG8888.XSGE', 'PB8888.XSGE', 'AU8888.XSGE', 'RB8888.XSGE', 'AL8888.XSGE']
# Keep only contracts that started trading more than a year before 2014-01-01
future_list = []
date_ = date(2014,1,1)
for future in futures:
    start_date = get_security_info(future).start_date
    if start_date < (date_ - timedelta(days=365)):
        future_list.append(future)
print(len(future_list))
future_list[:5]
24
['AG8888.XSGE', 'PB8888.XSGE', 'AU8888.XSGE', 'RB8888.XSGE', 'AL8888.XSGE']
# Averaging window in trading days (the rolling calls below use 5 directly)
n=5
trade_days = get_trade_days(start_date='2018-01-01', end_date='2019-06-01')
panel_dict = {}
for i in range(1, len(trade_days)):
    # Window from the previous trading day at 21:31 to the current trading day at 15:05
    daily_start = str(trade_days[i - 1])+' 21:31:00'
    daily_end = str(trade_days[i])+' 15:05:00'
    factor_df_index = []
    factor_df_data = []
    for future in future_list:
        price = get_price(future,start_date=daily_start,end_date=daily_end,frequency='5m',fields=['close'],
                          fq='pre')
        # Running sums of the squared, cubed and fourth-power log returns
        sum_rt2 = 0.0
        sum_rt3 = 0.0
        sum_rt4 = 0.0
        for j in range(1, len(price)):
            pi = math.log(price.iloc[j]['close'])
            pi_1 = math.log(price.iloc[j - 1]['close'])
            rt = pi - pi_1
            sum_rt2 += math.pow(rt, 2)
            sum_rt3 += math.pow(rt, 3)
            sum_rt4 += math.pow(rt, 4)
        rd_var = sum_rt2
        # A zero numerator (for example, no price movement) is stored as 0,
        # which also avoids dividing by a zero variance
        if sum_rt3 == 0:
            rd_skew = 0
        else:
            rd_skew = math.sqrt(len(price)) * sum_rt3 / (math.pow(rd_var, 3 / 2))
        if sum_rt4 == 0:
            rd_kurt = 0
        else:
            rd_kurt = len(price) * sum_rt4 / (math.pow(rd_var, 2))
        factor_df_index.append(future)
        factor_df_data.append([price.close.iloc[-1], rd_var, rd_skew, rd_kurt])
    # One frame per trading day: contracts as rows, factors as columns
    factor_df = pd.DataFrame(data=factor_df_data, index=factor_df_index,
                             columns=['close', 'rd_var', 'rd_skew', 'rd_kurt'])
    panel_dict[trade_days[i]] = factor_df
panel = pd.Panel(panel_dict)
panel
/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py:3267: FutureWarning: Panel is deprecated and will be removed in a future version. The recommended way to represent these types of 3-dimensional data are with a MultiIndex on a DataFrame, via the Panel.to_frame() method Alternatively, you can use the xarray package http://xarray.pydata.org/en/stable/. Pandas provides a `.to_xarray()` method to help automate this conversion. exec(code_obj, self.user_global_ns, self.user_ns)
<class 'pandas.core.panel.Panel'> Dimensions: 341 (items) x 24 (major_axis) x 4 (minor_axis) Items axis: 2018-01-03 to 2019-05-31 Major_axis axis: AG8888.XSGE to Y8888.XDCE Minor_axis axis: close to rd_kurt
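Since `pd.Panel` is deprecated, as the warning above says, the same data could be held in a DataFrame with a MultiIndex instead. The sketch below is only an illustration under assumptions: `demo_dict` is a hypothetical stand-in for `panel_dict` with made-up numbers, and the level names `date` and `contract` are my own choice.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for panel_dict: {trade_day: DataFrame(contracts x factors)}
days = pd.to_datetime(['2018-01-03', '2018-01-04'])
contracts = ['AG8888.XSGE', 'AU8888.XSGE']
cols = ['close', 'rd_var', 'rd_skew', 'rd_kurt']
demo_dict = {
    day: pd.DataFrame(np.arange(8, dtype=float).reshape(2, 4) + k,
                      index=contracts, columns=cols)
    for k, day in enumerate(days)
}

# Stack the daily frames into one long frame indexed by (date, contract)
long_df = pd.concat(demo_dict, names=['date', 'contract'])

# Counterpart of panel.minor_xs('rd_var').T: dates as rows, contracts as columns
rd_var_wide = long_df['rd_var'].unstack('contract')

# Counterpart of panel.major_xs('AG8888.XSGE'): factors as rows, dates as columns
ag = long_df.xs('AG8888.XSGE', level='contract').T
```

With this layout, the rolling calculations later in the post would operate on `rd_var_wide` and its siblings directly.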
panel.major_xs('AG8888.XSGE')
| | 2018-01-03 | 2018-01-04 | 2018-01-05 | 2018-01-08 | 2018-01-09 | 2018-01-10 | 2018-01-11 | 2018-01-12 | 2018-01-15 | 2018-01-16 | 2018-01-17 | 2018-01-18 | 2018-01-19 | 2018-01-22 | 2018-01-23 | 2018-01-24 | 2018-01-25 | 2018-01-26 | 2018-01-29 | 2018-01-30 | 2018-01-31 | 2018-02-01 | 2018-02-02 | 2018-02-05 | 2018-02-06 | 2018-02-07 | 2018-02-08 | 2018-02-09 | 2018-02-12 | 2018-02-13 | 2018-02-14 | 2018-02-22 | 2018-02-23 | 2018-02-26 | 2018-02-27 | 2018-02-28 | 2018-03-01 | 2018-03-02 | 2018-03-05 | 2018-03-06 | ... | 2019-04-02 | 2019-04-03 | 2019-04-04 | 2019-04-08 | 2019-04-09 | 2019-04-10 | 2019-04-11 | 2019-04-12 | 2019-04-15 | 2019-04-16 | 2019-04-17 | 2019-04-18 | 2019-04-19 | 2019-04-22 | 2019-04-23 | 2019-04-24 | 2019-04-25 | 2019-04-26 | 2019-04-29 | 2019-04-30 | 2019-05-06 | 2019-05-07 | 2019-05-08 | 2019-05-09 | 2019-05-10 | 2019-05-13 | 2019-05-14 | 2019-05-15 | 2019-05-16 | 2019-05-17 | 2019-05-20 | 2019-05-21 | 2019-05-22 | 2019-05-23 | 2019-05-24 | 2019-05-27 | 2019-05-28 | 2019-05-29 | 2019-05-30 | 2019-05-31 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
close | 3896.929000 | 3884.829000 | 3889.934000 | 3891.012000 | 3895.981000 | 3873.110000 | 3885.140000 | 3890.233000 | 3926.186000 | 3907.923000 | 3877.015000 | 3853.076000 | 3835.058000 | 3839.049000 | 3848.178000 | 3837.149000 | 3917.340000 | 3890.659000 | 3876.701000 | 3824.738000 | 3833.709000 | 3831.796000 | 3826.264000 | 3731.234000 | 3769.407000 | 3733.506000 | 3676.579000 | 3689.624000 | 3692.757000 | 3714.779000 | 3720.807000 | 3677.940000 | 3699.089000 | 3729.238000 | 3722.266000 | 3687.433000 | 3679.198000 | 3689.655000 | 3704.325000 | 3695.468000 | ... | 3565.935000 | 3584.237000 | 3586.444000 | 3594.315000 | 3607.918000 | 3603.018000 | 3593.508000 | 3557.054000 | 3536.476000 | 3548.285000 | 3549.532000 | 3539.680000 | 3548.146000 | 3565.847000 | 3543.072000 | 3520.911000 | 3548.540000 | 3569.960000 | 3564.559000 | 3557.919000 | 3558.014000 | 3570.186000 | 3581.351000 | 3577.475000 | 3570.274000 | 3575.036000 | 3611.352000 | 3595.858000 | 3606.801000 | 3565.411000 | 3545.635000 | 3550.679000 | 3549.994000 | 3564.392000 | 3577.204000 | 3577.151000 | 3570.683000 | 3554.734000 | 3551.236000 | 3572.920000 |
rd_var | 0.000028 | 0.000051 | 0.000015 | 0.000020 | 0.000013 | 0.000014 | 0.000010 | 0.000051 | 0.000038 | 0.000015 | 0.000024 | 0.000039 | 0.000026 | 0.000017 | 0.000011 | 0.000045 | 0.000051 | 0.000071 | 0.000021 | 0.000027 | 0.000024 | 0.000019 | 0.000028 | 0.000098 | 0.000112 | 0.000033 | 0.000049 | 0.000049 | 0.000091 | 0.000032 | 0.000049 | 0.000015 | 0.000023 | 0.000029 | 0.000026 | 0.000061 | 0.000021 | 0.000063 | 0.000028 | 0.000015 | ... | 0.000019 | 0.000017 | 0.000015 | 0.000009 | 0.000012 | 0.000012 | 0.000017 | 0.000022 | 0.000026 | 0.000018 | 0.000019 | 0.000015 | 0.000008 | 0.000011 | 0.000010 | 0.000020 | 0.000020 | 0.000017 | 0.000015 | 0.000017 | 0.000008 | 0.000016 | 0.000012 | 0.000015 | 0.000016 | 0.000013 | 0.000019 | 0.000026 | 0.000010 | 0.000032 | 0.000012 | 0.000013 | 0.000018 | 0.000013 | 0.000017 | 0.000011 | 0.000010 | 0.000017 | 0.000037 | 0.000019 |
rd_skew | -0.095043 | -4.612178 | -0.602814 | -1.070272 | -0.511970 | -1.511876 | -0.426560 | -1.922123 | 3.371732 | -1.203940 | -0.831105 | -4.053758 | -0.198022 | 2.548430 | 0.557565 | -2.177223 | 0.646749 | -5.753084 | -0.471969 | -0.922546 | 1.094639 | 1.014053 | 1.290568 | -1.661635 | 2.478196 | 1.192341 | -0.773851 | 0.292206 | 5.248286 | 0.789771 | 2.318899 | 0.367297 | 0.160604 | 1.241179 | 1.280568 | -1.161667 | -0.382821 | 3.096876 | 0.958635 | 0.284492 | ... | -0.687768 | 1.530637 | 1.065321 | 0.487290 | -0.497250 | 0.392057 | -0.222383 | -1.309289 | -2.145156 | 1.165266 | 0.694311 | -0.565016 | -0.425492 | 0.539512 | -1.331245 | -0.412768 | 0.525464 | 0.700912 | -0.358176 | 1.481828 | -0.240613 | 0.484989 | 1.365373 | -0.824448 | 0.745331 | -0.408234 | 0.303964 | -1.572724 | -0.171120 | -1.217455 | -0.296289 | -1.535254 | 0.219072 | 0.311837 | 0.425111 | 0.546481 | -0.176504 | 1.168907 | -1.742897 | 1.595885 |
rd_kurt | 3.628721 | 34.083203 | 4.045513 | 6.473492 | 4.336460 | 7.774328 | 3.220249 | 15.008555 | 21.212018 | 6.424850 | 5.267951 | 27.223348 | 5.279877 | 18.254578 | 4.824025 | 17.177066 | 4.211245 | 47.769682 | 3.741693 | 3.477501 | 7.859363 | 9.918721 | 7.878246 | 6.548559 | 15.890135 | 9.007872 | 5.369693 | 4.543329 | 43.210418 | 5.970432 | 14.208564 | 4.585190 | 4.178374 | 5.324524 | 11.035961 | 14.480319 | 5.628787 | 18.476909 | 4.583459 | 5.970725 | ... | 3.671092 | 5.198868 | 4.020347 | 3.710831 | 6.239751 | 4.043122 | 4.190775 | 11.694857 | 13.463961 | 6.641241 | 6.581335 | 3.466956 | 3.202595 | 3.486636 | 5.666463 | 5.929159 | 5.626616 | 4.646697 | 4.673658 | 7.751242 | 4.040043 | 4.143220 | 7.647355 | 7.431755 | 5.194859 | 3.527151 | 3.473310 | 12.271466 | 2.912957 | 4.486629 | 2.771105 | 12.777991 | 5.051282 | 4.585726 | 3.388234 | 3.775833 | 3.487026 | 5.840408 | 12.939675 | 8.240106 |
rvol = np.sqrt(panel.minor_xs('rd_var').T.rolling(5).mean()*242).shift(1)
rvol.head(10)
| | AG8888.XSGE | PB8888.XSGE | AU8888.XSGE | RB8888.XSGE | AL8888.XSGE | RU8888.XSGE | CU8888.XSGE | FU8888.XSGE | ZN8888.XSGE | RM8888.XZCE | CF8888.XZCE | FG8888.XZCE | SR8888.XZCE | TA8888.XZCE | OI8888.XZCE | A8888.XDCE | B8888.XDCE | L8888.XDCE | C8888.XDCE | M8888.XDCE | P8888.XDCE | V8888.XDCE | J8888.XDCE | Y8888.XDCE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2018-01-03 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-01-04 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-01-05 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-01-08 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-01-09 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-01-10 | 0.078532 | 0.105625 | 0.060853 | 0.163671 | 0.116236 | 0.161929 | 0.082414 | 0.0 | 0.098106 | 0.086055 | 0.078949 | 0.177890 | 0.061341 | 0.122266 | 0.078103 | 0.090980 | 0.098327 | 0.120537 | 0.076204 | 0.076046 | 0.082018 | 0.145638 | 0.246453 | 0.081067 |
2018-01-11 | 0.074320 | 0.106632 | 0.056683 | 0.168901 | 0.111546 | 0.180640 | 0.080016 | 0.0 | 0.098854 | 0.087240 | 0.078779 | 0.171566 | 0.062342 | 0.127703 | 0.082621 | 0.087136 | 0.159171 | 0.127870 | 0.073437 | 0.075131 | 0.087239 | 0.145208 | 0.255219 | 0.088806 |
2018-01-12 | 0.059529 | 0.102069 | 0.046859 | 0.168358 | 0.098163 | 0.168535 | 0.074416 | 0.0 | 0.100184 | 0.087271 | 0.076613 | 0.168290 | 0.058443 | 0.108304 | 0.082248 | 0.089587 | 0.169835 | 0.127178 | 0.073047 | 0.076681 | 0.084758 | 0.142930 | 0.238597 | 0.085892 |
2018-01-15 | 0.072649 | 0.105241 | 0.046285 | 0.166876 | 0.099854 | 0.159496 | 0.076432 | 0.0 | 0.100922 | 0.085507 | 0.076038 | 0.162798 | 0.054206 | 0.104343 | 0.079189 | 0.083143 | 0.153919 | 0.119289 | 0.073983 | 0.073661 | 0.091048 | 0.143651 | 0.224717 | 0.084393 |
2018-01-16 | 0.078394 | 0.103543 | 0.043652 | 0.149066 | 0.094901 | 0.166233 | 0.082110 | 0.0 | 0.098571 | 0.106478 | 0.081554 | 0.129357 | 0.055094 | 0.098014 | 0.085089 | 0.093870 | 0.167715 | 0.121203 | 0.063419 | 0.103468 | 0.098195 | 0.149519 | 0.247190 | 0.088007 |
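The five leading NaN rows above follow from the 5-day rolling mean combined with `shift(1)`: the value on day t averages days t-5 through t-1, so the factor does not use the same day's data. A small sketch on made-up numbers (the series `daily` is hypothetical) shows the effect:

```python
import pandas as pd

# Hypothetical daily values, for illustration only
daily = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
                  index=pd.date_range('2018-01-01', periods=7, freq='D'))
window = 5

same_day = daily.rolling(window).mean()   # includes the current day
lagged = same_day.shift(1)                # uses only days before the current one

print(pd.DataFrame({'same_day': same_day, 'lagged': lagged}))
```

The first usable lagged value lands on the sixth row, matching the table above where 2018-01-10 is the first date with data.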
rskew = panel.minor_xs('rd_skew').T.rolling(5).mean().shift(1)
rskew.tail()
| | AG8888.XSGE | PB8888.XSGE | AU8888.XSGE | RB8888.XSGE | AL8888.XSGE | RU8888.XSGE | CU8888.XSGE | FU8888.XSGE | ZN8888.XSGE | RM8888.XZCE | CF8888.XZCE | FG8888.XZCE | SR8888.XZCE | TA8888.XZCE | OI8888.XZCE | A8888.XDCE | B8888.XDCE | L8888.XDCE | C8888.XDCE | M8888.XDCE | P8888.XDCE | V8888.XDCE | J8888.XDCE | Y8888.XDCE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2019-05-27 | -0.175104 | -0.098253 | -0.067989 | -0.126786 | -0.776771 | -1.223528 | -0.050564 | -0.043689 | 0.022579 | 0.120305 | -0.535873 | 0.367370 | -0.696847 | -0.097802 | 0.075797 | -0.119445 | -0.372945 | 0.637237 | -0.393494 | -0.163280 | 0.075275 | 0.714072 | -0.196096 | -0.483185 |
2019-05-28 | -0.006550 | -0.748347 | 0.133832 | -0.461609 | -0.317813 | -0.210288 | 0.293991 | 0.105746 | 0.190589 | -0.014780 | -0.014979 | 0.281588 | -0.558965 | 1.058356 | 0.351417 | -0.046069 | -0.363897 | 0.709613 | -0.570543 | -0.067322 | 0.241694 | 0.908246 | -0.372345 | -0.349835 |
2019-05-29 | 0.265200 | -1.010981 | 0.356060 | -0.850822 | -0.205528 | -0.665941 | -0.010998 | 0.466548 | 0.012581 | -0.387606 | -0.203280 | 0.113810 | -0.508173 | 0.844013 | 0.261749 | -0.162345 | -0.402435 | 0.893345 | -0.300750 | 0.088379 | 0.113034 | 1.102814 | -0.733780 | -0.499796 |
2019-05-30 | 0.455167 | -1.282686 | 0.772734 | -0.400378 | 0.020745 | -0.301800 | 0.082120 | 0.375143 | -0.143340 | 0.849013 | 0.335597 | -0.236364 | 0.012732 | 0.914475 | 0.508935 | 0.807184 | 1.057709 | 0.906341 | 0.358698 | 1.797040 | 0.954387 | 1.186858 | -0.458667 | 0.227020 |
2019-05-31 | 0.044220 | -1.042543 | 0.236909 | -0.673350 | 0.108086 | -0.186491 | 0.338462 | 1.008258 | 0.145217 | 0.793839 | 0.125658 | -0.559639 | 0.429029 | 1.009303 | 0.470637 | 0.880507 | 1.265849 | 0.284821 | -0.002156 | 1.864890 | 1.071081 | 1.222379 | -0.505347 | 0.292943 |
rkurt = panel.minor_xs('rd_kurt').T.rolling(5).mean().shift(1)
rkurt.tail()
| | AG8888.XSGE | PB8888.XSGE | AU8888.XSGE | RB8888.XSGE | AL8888.XSGE | RU8888.XSGE | CU8888.XSGE | FU8888.XSGE | ZN8888.XSGE | RM8888.XZCE | CF8888.XZCE | FG8888.XZCE | SR8888.XZCE | TA8888.XZCE | OI8888.XZCE | A8888.XDCE | B8888.XDCE | L8888.XDCE | C8888.XDCE | M8888.XDCE | P8888.XDCE | V8888.XDCE | J8888.XDCE | Y8888.XDCE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2019-05-27 | 5.714868 | 5.263076 | 4.175594 | 5.434553 | 8.984689 | 14.605471 | 6.469306 | 8.652241 | 6.597865 | 7.190487 | 8.400903 | 4.350366 | 8.539037 | 5.570203 | 5.104087 | 5.356956 | 4.898469 | 4.481038 | 7.210029 | 5.263679 | 4.323041 | 5.571098 | 8.871790 | 4.862249 |
2019-05-28 | 5.915813 | 8.190477 | 4.475250 | 5.714088 | 9.321683 | 8.728391 | 6.623149 | 9.779292 | 6.630368 | 6.719982 | 9.439344 | 4.459139 | 9.884842 | 12.135892 | 5.551983 | 5.146355 | 4.721424 | 4.640307 | 7.243167 | 5.504183 | 4.582042 | 5.952238 | 8.419125 | 4.833600 |
2019-05-29 | 4.057620 | 8.103784 | 5.011199 | 5.429023 | 9.786781 | 8.017176 | 5.610571 | 9.621175 | 6.497516 | 6.578347 | 8.446911 | 4.381520 | 10.637737 | 11.701365 | 5.238671 | 5.357968 | 5.186984 | 5.361547 | 4.999149 | 6.607914 | 4.152856 | 7.096729 | 9.038384 | 4.691304 |
2019-05-30 | 4.215445 | 8.806066 | 6.570760 | 4.096090 | 7.248866 | 6.798045 | 4.945815 | 8.878251 | 6.544527 | 7.979803 | 9.368163 | 3.938108 | 9.928262 | 10.666351 | 5.783691 | 6.469350 | 9.963528 | 5.673635 | 5.970576 | 13.926214 | 6.608399 | 7.063728 | 7.509983 | 4.911892 |
2019-05-31 | 5.886235 | 8.139018 | 8.681616 | 5.786130 | 7.281216 | 5.842953 | 4.987463 | 7.730391 | 7.101016 | 8.029187 | 9.445432 | 4.831540 | 7.433998 | 10.217890 | 5.842374 | 6.384113 | 9.910771 | 7.673921 | 6.090323 | 13.627367 | 7.032325 | 6.679847 | 7.705261 | 4.456813 |
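import matplotlib.dates as mdate
# Font setting so that CJK labels render correctly
plt.rcParams['font.sans-serif'] = ['Arial Unicode MS']
# Render the minus sign correctly
plt.rcParams['axes.unicode_minus'] = False
"""
Drawing a histogram
data: required, the data to plot
bins: number of bars, optional, default 10
density: whether to normalize the histogram, optional; 0 (the default) shows raw counts, 1 shows a normalized density
facecolor: fill color of the bars
edgecolor: edge color of the bars
alpha: opacity
"""
# This color was picked with a color picker from a screenshot of the report's chart. Why does the plot still look slightly different?
color = "#1F77B4"
plt.hist(rvol, bins=40, density=0, facecolor=color, edgecolor=None, alpha=1)
# x-axis label
plt.xlabel("Bin")
# y-axis label
plt.ylabel("Count")
# Chart title
plt.title("Volatility distribution across futures contracts")
plt.show()
plt.hist(rskew, bins=40, density=0, facecolor=color, edgecolor=None, alpha=1)
# x-axis label
plt.xlabel("Bin")
# y-axis label
plt.ylabel("Count")
# Chart title
plt.title("Skewness distribution across futures contracts")
plt.show()
plt.hist(rkurt, bins=40, density=0, facecolor=color, edgecolor=None, alpha=1)
# x-axis label
plt.xlabel("Bin")
# y-axis label
plt.ylabel("Count")
# Chart title
plt.title("Kurtosis distribution across futures contracts")
plt.show()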
/opt/conda/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6575: RuntimeWarning: All-NaN slice encountered xmin = min(xmin, np.nanmin(xi)) /opt/conda/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6576: RuntimeWarning: All-NaN slice encountered xmax = max(xmax, np.nanmax(xi)) /opt/conda/lib/python3.6/site-packages/numpy/lib/function_base.py:780: RuntimeWarning: invalid value encountered in greater_equal keep = (tmp_a >= first_edge) /opt/conda/lib/python3.6/site-packages/numpy/lib/function_base.py:781: RuntimeWarning: invalid value encountered in less_equal keep &= (tmp_a <= last_edge)
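The three factor distribution charts above show the following. Across futures products, the volatility distribution is right-skewed overall. The skewness distribution is centered near zero and has clearly fat tails. The kurtosis distribution resembles that of volatility: it is right-skewed overall, and most in-sample kurtosis values are above 3, which indicates fat tails.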
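The visual reading above could be backed by a couple of summary numbers. The sketch below is not part of the original analysis: `pooled_shape_stats` is a hypothetical helper that pools every non-NaN entry of a dates-by-contracts frame such as `rvol` or `rkurt`, then reports the sample skewness and the share of values above 3. The `demo` frame is made-up data.

```python
import numpy as np
import pandas as pd
from scipy import stats

def pooled_shape_stats(factor_df):
    """Pool all non-NaN entries of a dates x contracts frame and summarize them."""
    values = factor_df.to_numpy(dtype=float).ravel()
    values = values[~np.isnan(values)]      # drop the warm-up NaN rows
    return {
        'count': int(values.size),
        'skewness': float(stats.skew(values)),       # > 0 means right-skewed
        'share_above_3': float(np.mean(values > 3)),
    }

# Hypothetical frame, for illustration only
demo = pd.DataFrame({'X': [np.nan, 1.0, 1.0], 'Y': [1.0, 1.0, 10.0]})
summary = pooled_shape_stats(demo)
print(summary)
```

Calling it as `pooled_shape_stats(rkurt)` would give the fraction of kurtosis values above 3 directly, instead of reading it off the histogram.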
# Five colors for the percentile lines: blue, orange, green, red, purple
color_list = ['#5698c6', '#ff9e4a', '#60b760', '#e05c5d', '#ae8ccd']
label_list = ['10', '25', 'median', '75', '90']
all_df = [rvol, rskew, rkurt]
title_name = ['rvol', 'rskew', 'rkurt']
num = 0
for df in all_df:
    # Bucket each day's cross-section of factor values to get the five lines
    y_list = [[], [], [], [], []]
    q_list = [0.10, 0.25, 0.50, 0.75, 0.90]
    for i in range(len(df)):
        factor = df.iloc[i].dropna().sort_values()
        for j in range(len(q_list)):
            if len(factor) == 0:
                # Warm-up rows have no factor values yet
                y_list[j].append(np.nan)
                continue
            # Position of the q-th percentile within the sorted cross-section
            num_signal = int(len(factor) * q_list[j])
            factor_value = factor.iloc[num_signal]
            y_list[j].append(factor_value)
    # Figure size can be set here
    fig = plt.figure(figsize=(12, 8))
    plt.title(title_name[num]+' percentile trend')
    num += 1
    for i in range(len(y_list)):
        plt.plot(y_list[i], color_list[i], label=label_list[i])
    # Label every 50th trading day on the x-axis
    x = np.arange(0, len(df),50)
    x_label = [df.index[k] for k in x]
    plt.xticks(x, x_label, rotation='vertical')
    plt.xticks(rotation=360)
    plt.xlabel("TRADE_DT")
    plt.ylabel("Factor value")
    plt.legend()
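The daily percentile lines can also be obtained with pandas' built-in row-wise quantile. This is a sketch of an alternative, not the code used for the charts in this post: `percentile_lines` is a hypothetical helper, and `DataFrame.quantile` interpolates linearly between neighbors by default, so its values can differ slightly from picking an element out of the sorted cross-section.

```python
import numpy as np
import pandas as pd

def percentile_lines(factor_df, quantiles=(0.10, 0.25, 0.50, 0.75, 0.90)):
    """Return a dates x quantiles frame of cross-sectional percentiles."""
    return factor_df.quantile(list(quantiles), axis=1).T

# Hypothetical factor values: two days, five contracts
demo = pd.DataFrame([[0.0, 1.0, 2.0, 3.0, 4.0],
                     [10.0, 20.0, 30.0, 40.0, 50.0]],
                    index=pd.to_datetime(['2018-01-10', '2018-01-11']),
                    columns=list('ABCDE'))
lines = percentile_lines(demo)
print(lines)
```

Each column of `lines` can then be passed to `plt.plot` to draw one percentile line.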
all_df = [rvol, rskew, rkurt]
Q_list = ['Q1', 'Q2', 'Q3', 'Q4', 'Q5']
df_list = []
# Daily close-to-close returns, trimmed so the rows line up with the factor rows left after dropna()
price_close = panel.minor_xs('close').T
price_pct = price_close.pct_change().dropna().iloc[4:]
for i in range(len(all_df)):
    factor_df = all_df[i].dropna()
    groups = [[], [], [], [], []]
    for j in range(len(price_pct)):
        # Contracts sorted by factor value, lowest first
        index_list = list(factor_df.iloc[j].sort_values().index)
        daily_price = price_pct[index_list].iloc[j]
        # The 24 contracts are split into five groups of 5, 4, 5, 5 and 5
        groups[0].append(daily_price.iloc[0:5].mean())
        groups[1].append(daily_price.iloc[5:9].mean())
        groups[2].append(daily_price.iloc[9:14].mean())
        groups[3].append(daily_price.iloc[14:19].mean())
        groups[4].append(daily_price.iloc[19:].mean())
    df_group = pd.DataFrame(groups).T
    df_group.index = price_pct.index
    # Cumulative sum of the equal-weighted daily group returns
    df_group = df_group.cumsum()
    df_group.columns = Q_list
    df_list.append(df_group)
    fig = plt.figure(figsize=(12, 8))
    plt.title(title_name[i]+' cumulative return')
    for q in Q_list:
        plt.plot(df_group[q], label=q)
    plt.legend()
num = 0
for df in df_list:
    # Spread between the cumulative returns of groups Q3 and Q2
    spread = df['Q3'] - df['Q2']
    plt.figure(figsize=(12, 8))
    plt.title(title_name[num]+' cumulative return')
    plt.plot(spread, label=title_name[num])
    plt.legend()
    num += 1
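The grouping step can be written without hard-coded slice boundaries. The sketch below is an assumption-laden alternative, not the original method: `quantile_group_returns` is a hypothetical helper that splits the ranked contracts with `np.array_split`, which for 24 contracts gives groups of 5, 5, 5, 5 and 4 rather than the 5, 4, 5, 5, 5 split used above. It expects the factor and return frames to share the same dates and contract columns.

```python
import numpy as np
import pandas as pd

def quantile_group_returns(factor_df, return_df, n_groups=5):
    """Equal-weighted daily return of each factor-sorted group."""
    rows = {}
    for day in factor_df.index:
        f = factor_df.loc[day].dropna()
        if len(f) < n_groups:
            continue                      # not enough contracts to form groups
        ranked = f.sort_values().index    # lowest factor value first
        chunks = np.array_split(np.arange(len(ranked)), n_groups)
        rows[day] = [return_df.loc[day, ranked[idx]].mean() for idx in chunks]
    names = ['Q%d' % (k + 1) for k in range(n_groups)]
    return pd.DataFrame.from_dict(rows, orient='index', columns=names)

# Hypothetical one-day example with four contracts and two groups
day = pd.Timestamp('2018-01-10')
factor = pd.DataFrame([[4.0, 1.0, 3.0, 2.0]], index=[day], columns=list('ABCD'))
returns = pd.DataFrame([[0.04, 0.01, 0.03, 0.02]], index=[day], columns=list('ABCD'))
grouped = quantile_group_returns(factor, returns, n_groups=2)
print(grouped.cumsum())
```

A long-short series is then the difference between any two columns of the cumulative frame.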
import scipy.stats as st
all_df = [rvol, rskew, rkurt]
name = ['rvol', 'rskew', 'rkurt']
color_list = ['#2B4C80', '#B00004']
label_list = ['IC', 'IC mean (12 periods)']
for i in range(len(all_df)):
    # Daily IC
    ic_list = []
    # 12-period mean of the IC
    ic_ma_list = []
    y_list = [ic_list, ic_ma_list]
    # Skip the five warm-up rows so the factor rows line up with price_pct
    factor_df = all_df[i].iloc[5:]
    for j in range(len(factor_df)):
        ic = st.pearsonr(price_pct.iloc[j].values, factor_df.iloc[j].values)[0]
        ic_list.append(ic)
    ic_list = np.array(ic_list)
    print("%s share of IC values below 0: %s" % (name[i], np.sum(ic_list < 0) / len(ic_list)))
    for z in range(len(ic_list)):
        if z < 12:
            ic_ma_list.append(np.nan)
            continue
        # Mean of the 12 IC values before the current one
        ic_ma = np.array(ic_list[z - 12:z]).mean()
        ic_ma_list.append(ic_ma)
    fig = plt.figure(figsize=(12, 8))
    ax = fig.add_subplot(1, 1, 1)
    ax.set_title(name[i] + " factor IC")
    for k in range(len(y_list)):
        yi = y_list[k]
        ax.plot(yi, color_list[k], label=label_list[k])
    ax.legend()
    # Horizontal grid lines make the IC level easier to read
    plt.grid(axis='y')
    plt.show()
rvol share of IC values below 0: 0.48214285714285715
rskew share of IC values below 0: 0.5327380952380952
rkurt share of IC values below 0: 0.5178571428571429
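The daily IC and its 12-period mean can be computed without explicit loops. The sketch below is an illustration under assumptions, not the code that produced the numbers above: `daily_ic` is a hypothetical helper, the inputs are made-up, and the rank variant (Pearson correlation of cross-sectional ranks, which equals the Spearman correlation) is an addition the original post does not use.

```python
import numpy as np
import pandas as pd

def daily_ic(factor_df, return_df, use_ranks=False):
    """Row-wise correlation between factor values and same-row returns."""
    common = factor_df.index.intersection(return_df.index)
    f, r = factor_df.loc[common], return_df.loc[common]
    if use_ranks:
        # Pearson correlation of ranks equals the Spearman rank correlation
        f, r = f.rank(axis=1), r.rank(axis=1)
    return f.corrwith(r, axis=1)

# Hypothetical data: two days, three contracts
idx = pd.to_datetime(['2018-01-10', '2018-01-11'])
factor = pd.DataFrame([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]], index=idx, columns=list('ABC'))
returns = pd.DataFrame([[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]], index=idx, columns=list('ABC'))

ic = daily_ic(factor, returns)
share_negative = (ic < 0).mean()
# Mean of the 12 values before the current one, as in the loop above
ic_ma = ic.rolling(12).mean().shift(1)
```

Here `rolling(12).mean().shift(1)` reproduces the loop's convention of averaging the 12 IC values that precede the current day.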