请 [注册] 或 [登录]  | 返回主站

量化交易吧 /  量化平台 帖子:3365780 新帖:12

因子正交与系统性风险*

外汇工厂发表于:6 月 20 日 02:36回复(1)

因子正交与系统性风险*¶


在文中,作者提出了一种新颖的因子正交方法,该方法相对于传统的主成分分宜以及施密特政教方法具有显著的有点,一是对所有的因子平等对待,而是该方法政教后得到的因子与原始因子保持最大的相似性。基于该方法正交后的因子,我们可以方便地将系统性风险*到各因子上,并衡量各因子对于资产系统性风险的相对贡献。

下面我们开始编写新的正交方法¶

假设一个风险资产j的收益可以被K个因子 $f^k$ 线性解释,例如市场(RM),规模 (SMB),价值(HML),动量(Mom),长期反转(Rev),线性因子模型表达式如下: $$ 𝑟^𝑡_j= 𝛼_𝑗 + 𝛴^K_{𝑘=1}𝛽_{𝑘_𝑗}𝑓_𝑡^k+ 𝜀_𝑡^j $$ 其中$ 𝑓_𝑘$ 和残差项 $𝜀_𝑗 $不相关,但是因子之间可能相关。 资产 𝑗 的收益方差可以表示为 $$ 𝜎_{𝑠𝑗}^2 = 𝛴^K_{k=1}𝛽_{𝑘_𝑗}𝛽_{𝑙_𝑗}𝐶𝑜𝑣(𝑓^𝑘 , 𝑓^𝑙 ) $$

我们通过正交的方式获得正交后的系数𝛽T,则收益方差 $$ 𝜎̂_{𝑠_𝑗}^2 = 𝛴^K_{k=1} (𝛽T_{k_j}𝜎̂T_{f^k})^2=𝜎̂^2_j-𝜎̂^2_{𝜀_j} $$

以下为实现步骤:¶

1.输入A,A的维数为T*K,K表示因子个数,T表示时间节点

2.用A的各因子列减去因子的均值得到$Fmean$

3.计算$Fmean$的协方差阵$𝛴_{K \times K}$,则重叠矩阵为$M_{K \times K}=(T-1)𝛴_{K \times K}$

4.为了获得Fmean正交矩阵FmeanT: $$ FmeanT' \times FmeanT=(FmeanT \times S_{K \times K})' FmeanS_{K \times K}=S_{K \times K}'(FmeanT'Fmean)S_{K \times K}=S_{K \times K}'M_{K \times K}S_{K \times K}=I_{K \times K} $$ $$ S_{K \times K}S_{K \times K}'=M_{K \times K}^{-1} $$

5.这里我们通过对$M_{K \times K}$特征*得到特征向量$O_{K \times K}$和特征根 对角阵$D_{K \times K}$,满足$M_{K \times K}=O_{K \times K}D_{K \times K}O_{K \times K}'$ 可以求出$S_{K \times K}=O_{K \times K}D_{K \times K}^{-1/2}O_{K \times K}'$

6.进行缩放 $S_{K \times K}->S_{K \times K}\sqrt{T-1}L_{K \times K}$ 其中$L_{K \times K}$为K个因子标准差构成的的对角阵

7.最后A的正交阵$AT=AS_{K \times K}$

更具体的过程请参考:Klein, Rudolf F., and V. K. Chow. "Orthogonalized factors and systematic risk decomposition." Quarterly Review of Economics & Finance 53.2(2013):175-187.

import pandas as pdimport numpy as npimport scipy.stats as scsfrom sklearn.decomposition import PCA  from statsmodels.formula.api import olsfrom itertools import combinations
###新的正交方法###def new_zj(A):T=len(A[:,0])K=len(A[0,:])fmean=np.zeros_like(A)FF=np.zeros_like(A)tzg=np.zeros((K,K))for i in arange(K):fmean[:,i]=A[:,i]-mean(A[:,i])fmean=mat(fmean)M=(T-1)*numpy.cov(fmean.T) #获得重叠矩阵M=mat(M)u,v=np.linalg.eig(M) #u是特征值,v是特征向量sk1=np.dot( np.dot(v ,np.linalg.inv(np.diag(u**0.5)) ), v.T)sk2=np.dot(sk1*((T-1)**0.5),np.diag(A.var(axis=0)**0.5))####F=np.dot(A,sk2)for i in arange(K):FF[:,i]=[float(j) for j in F[:,i]]return FF
####pca正交化###def pca2(A):T,K=shape(A)fmean=np.zeros_like(A)#FF=np.zeros_like(A)#tzg=np.zeros((K,K))for i in arange(K):fmean[:,i]=A[:,i]-mean(A[:,i])getcov=np.cov(fmean.T)u,v=np.linalg.eig(getcov) #u是特征值,v是特征向量b = np.argsort(u)v2=v.T[b,:]result=np.dot(v2,A.T)result=result*(np.sign(A[0,0])/np.sign(result[0,0]))##做一个符号统一化return result.T
####pca 正交###def pca(A):pca=PCA(n_components=2)pca.fit(A)result=pca.fit_transform(A)#   result=result*(sign(A[0,0]*sign(result[0,0])))return result
###函数,gram-schmidt正交化def gram_schmidt(A):T,K=shape(A)Q=np.zeros_like(A)Q[:,0]=A[:,0]for i in arange(1,K):for j in arange(i):uu=np.dot(np.dot(Q[:,j].T,A[:,i]),Q[:,j])/np.dot(Q[:,j],Q[:,j].T)hh=A[:,i]-uuQ[:,i]=hhreturn Q
###函数,获得三种正交化后变量的相关系数def three_orth(name1):A=factors[name1].as_matrix()index=[]index2=[]####PCA主成分回归,输出为第一个因子与第二个因子正交前后的相关系数result=pca(A)index.append(np.corrcoef(list(factors[name1[0]]),list(result[:,0]))[0][1])index.append(np.corrcoef(list(factors[name1[1]]),list(result[:,1]))[0][1])index2.append(np.linalg.norm(result))###Gram-Schmidt正交Q=gram_schmidt(A)index.append(np.corrcoef(list(factors[name1[0]]),Q[:,0])[0][1])index.append(np.corrcoef(list(factors[name1[1]]),Q[:,1])[0][1])index2.append(np.linalg.norm(Q))###新的正交new=new_zj(A)index.append(np.corrcoef(new[:,0],list(factors[name1[0]]))[0][1])index.append(np.corrcoef(new[:,1],list(factors[name1[1]]))[0][1])index2.append(np.linalg.norm(new))return index,index2
###输入数据factors=pd.read_csv('Five Factor.CSV')factors.head()factors.columns=['time','RM-RF','SMB','HML','RF','Mom','Rev']RM_RF=list(factors['RM-RF'])RF=list(factors['RF'])RM=[RM_RF[i]+RF[i] for i in arange(len(RF))]factors['RM']=RM
###输出相关系数test_data = ['RM','Mom','Rev','HML','SMB']print 'Note: corrcoef of the original factor and orthogonal factor'print 120*'-'print "%5s %30s %30s %30s" %('','PCA','GS','New')print "%14s  %30s  %30s  %30s" % ('','-','-','-')print "%14s %15s %15s %15s %15s %15s %15s" % ('','(1)','(2)','(1)','(2)','(1)','(2)')for i in combinations(test_data, 2):i=list(i)corrindex=three_orth(i)[0]print "%14s %15.5f %15.5f %15.5f %15.5f %15.5f %15.5f" % (i[0]+'&'+i[1],corrindex[0],corrindex[1],corrindex[2],corrindex[3],corrindex[4],corrindex[5])
Note: corrcoef of the original factor and orthogonal factor

                                 PCA                             GS                            New
                       -         -         -
                           (1)             (2)             (1)             (2)             (1)             (2)
        RM&Mom         0.89315        -0.68569         1.00000         0.95121         0.98645         0.98313
        RM&Rev         0.98675         0.91315         1.00000         0.96517         0.99484         0.98797
        RM&HML         0.98505         0.90869         1.00000         0.96374         0.99469         0.98810
        RM&SMB         0.98789         0.88404         1.00000         0.94446         0.99222         0.97890
       Mom&Rev        -0.97695         0.90147         1.00000         0.97644         0.99519         0.99098
       Mom&HML        -0.95773         0.75221         1.00000         0.91919         0.98345         0.97003
       Mom&SMB        -0.99423         0.97119         1.00000         0.99271         0.99855         0.99686
       Rev&HML         0.89920        -0.41602         1.00000         0.76766         0.94008         0.94245
       Rev&SMB         0.87917        -0.61790         1.00000         0.91611         0.98078         0.97768
       HML&SMB         0.91833         0.86050         1.00000         0.99086         0.99815         0.99777
#####输出范数#####print 'Note:Frobenius norm values'print '-'*70print "%14s %15s %15s %15s" %('','PCA','GS','New')for i in combinations(test_data, 2):i=list(i)corrindex=three_orth(i)[1]print "%14s %15.5f %15.5f %15.5f" % (i[0]+'&'+i[1],corrindex[0],corrindex[1],corrindex[2])
Note:Frobenius norm values

                           PCA              GS             New
        RM&Mom       229.28022       227.13690       233.40926
        RM&Rev       204.09788       204.52081       206.31689
        RM&HML       205.32148       205.74586       207.59102
        RM&SMB       199.82695       199.39138       202.05923
       Mom&Rev       189.66259       189.34010       191.11510
       Mom&HML       190.97870       186.98608       193.24317
       Mom&SMB       185.05881       185.82403       186.30073
       Rev&HML       159.87305       142.96471       160.35140
       Rev&SMB       152.75239       147.49523       153.08458
       HML&SMB       154.38348       154.53620       155.02464

从范数来看,新方法的范数更大¶

##函数,用来打印统计特征##def print_stat(dt):sta=scs.describe(dt,axis=0)print (100*"-")print "%14s %15s %15s %15s %15s %15s" % ('statistics',dt.columns[0],dt.columns[1],dt.columns[2],dt.columns[3],dt.columns[4])print "%14s %15.5f %15.5f %15.5f %15.5f% 15.5f"  % ('min',sta[1][0][0],sta[1][0][1],sta[1][0][2],sta[1][0][3],sta[1][0][4])print "%14s %15.5f %15.5f %15.5f %15.5f% 15.5f"  % ('max',sta[1][1][0],sta[1][1][1],sta[1][1][2],sta[1][1][3],sta[1][1][4])print "%14s %15.5f %15.5f %15.5f %15.5f% 15.5f"  % ('mean',sta[2][0],sta[2][1],sta[2][2],sta[2][3],sta[2][4])print "%14s %15.5f %15.5f %15.5f %15.5f% 15.5f"  % ('std',sta[3][0],sta[3][1],sta[3][2],sta[3][3],sta[3][4])print "%14s %15.5f %15.5f %15.5f %15.5f% 15.5f"  % ('skew',sta[4][0],sta[4][1],sta[4][2],sta[4][3],sta[4][4])print "%14s %15.5f %15.5f %15.5f %15.5f% 15.5f"  % ('kurtosis',sta[5][0],sta[5][1],sta[5][2],sta[5][3],sta[5][4])
##数据的处理test=factors[['RM','Mom','Rev','HML','SMB']].as_matrix()before=pd.DataFrame(test)before.columns=['RM','Mom','Rev','HML','SMB']after=pd.DataFrame(new_zj(test))after.columns=['RMT','MomT','RevT','HMLT','SMBT']
##打印统计特征##print 'Original'print_stat(before)print print print 'Orthogonal'print_stat(after)
Original

    statistics              RM             Mom             Rev             HML             SMB
           min       -29.10000       -52.26000       -13.95000       -13.28000      -17.28000
           max        38.95000        18.33000        36.36000        35.46000       36.70000
          mean         0.95748         0.59543         0.30870         0.40792        0.26307
           std        27.87247        22.43319        11.98968        12.46907       10.33883
          skew         0.26774        -3.13415         2.94240         2.18544        2.01414
      kurtosis         8.37924        28.14925        25.05201        19.15865       19.98738


Orthogonal

    statistics             RMT            MomT            RevT            HMLT            SMBT
           min       -29.07851       -43.45487       -18.41538       -14.30052      -18.09761
           max        35.23136        18.50452        37.40221        26.74093       32.00083
          mean         1.04070         0.91282         0.10737         0.48244        0.15287
           std        27.84582        22.41174        11.97822        12.45715       10.32895
          skew        -0.25533        -2.20092         2.05794         0.73530        1.29000
      kurtosis         4.89541        16.10147        18.54849         6.10281       12.93129
print before.corr().round(2)print after.corr().round(2)
       RM   Mom   Rev   HML   SMB
RM   1.00 -0.34  0.25  0.25  0.32
Mom -0.34  1.00 -0.23 -0.41 -0.13
Rev  0.25 -0.23  1.00  0.64  0.40
HML  0.25 -0.41  0.64  1.00  0.13
SMB  0.32 -0.13  0.40  0.13  1.00
      RMT  MomT  RevT  HMLT  SMBT
RMT   1.0   0.0  -0.0  -0.0   0.0
MomT  0.0   1.0   0.0   0.0   0.0
RevT -0.0   0.0   1.0  -0.0  -0.0
HMLT -0.0   0.0  -0.0   1.0   0.0
SMBT  0.0   0.0  -0.0   0.0   1.0

上面是正交前后五因子的协方差矩阵¶

###读取资产组合数据Portfolios=pd.read_csv('Fama_Portfolio.csv')Portfolios.tail()Portfolios=Portfolios.rename(columns={Portfolios.columns.values[0]:'time'})newdata1=pd.merge(Portfolios,factors)after['time']=newdata1[['time']]newdata=pd.merge(newdata1,after)newdata.columns
Index([u'time', u'SmallGrowth', u'SmallNeutral', u'SmallValue', u'BigGrowth',
       u'BigNeutral', u'BigValue', u'SMALLDown', u'SmallMedium', u'SmallUp',
       u'BigDown', u'BigMedium', u'BigUp', u'SmallLowRev', u'ME1 PRIOR2',
       u'SmallHiRev', u'BigLowRev', u'ME2 PRIOR2', u'BigHiRv', u'RM-RF',
       u'SMB', u'HML', u'RF', u'Mom', u'Rev', u'RM', u'RMT', u'MomT', u'RevT',
       u'HMLT', u'SMBT'],
      dtype='object')
inputnames=['SmallValue','SmallGrowth','SMALLDown','SmallLowRev','BigValue','BigGrowth','BigDown','BigLowRev']print "%14s %15s %15s %15s %15s %15s" % ('Original factors','RM','Mom','Rev','HML','SMB')print '-'*110for i in inputnames:reg=ols(i+'~ RM+Mom+Rev+HML+SMB',data=newdata).fit()reg.params=reg.params[1:]print "%14s    %15.3f %15.3f %15.3f %15.3f %15.3f"  % (i, reg.params[0],reg.params[1],reg.params[2],reg.params[3],reg.params[4])print "%14s           (%6.3f)         (%6.3f)      (%6.3f)         (%6.3f)         (%6.3f)"  % ('(tvalue)', reg.tvalues[0],reg.tvalues[1],reg.tvalues[2],reg.tvalues[3],reg.tvalues[4])print printprint "%14s%15s %15s %15s %15s %15s" % ('Orthogonal factors','RMT','MomT','RevT','HMLT','SMBT')print '-'*110for i in inputnames:reg=ols(i+'~ RMT+MomT+RevT+HMLT+SMBT',data=newdata).fit()reg.params=reg.params[1:]print "%14s    %15.3f %15.3f %15.3f %15.3f %15.3f"  % (i, reg.params[0],reg.params[1],reg.params[2],reg.params[3],reg.params[4])print "%14s           (%6.3f)         (%6.3f)      (%6.3f)         (%6.3f)         (%6.3f)"  % ('(tvalue)', reg.tvalues[0],reg.tvalues[1],reg.tvalues[2],reg.tvalues[3],reg.tvalues[4])
Original factors              RM             Mom             Rev             HML             SMB

    SmallValue              1.012          -0.028           0.029           0.761           0.918
      (tvalue)           ( 1.801)         (219.495)      (-5.284)         ( 3.246)         (87.182)
   SmallGrowth              1.090          -0.048           0.011          -0.227           1.033
      (tvalue)           (-3.699)         (145.284)      (-5.634)         ( 0.757)         (-15.956)
     SMALLDown              1.072          -0.587           0.013           0.239           0.992
      (tvalue)           (-2.307)         (122.253)      (-58.905)         ( 0.760)         (14.405)
   SmallLowRev              1.068          -0.102           0.389           0.382           1.051
      (tvalue)           ( 0.633)         (131.973)      (-11.132)         (24.807)         (24.970)
      BigValue              1.092          -0.034          -0.010           0.778           0.020
      (tvalue)           (-2.537)         (135.808)      (-3.669)         (-0.646)         (51.094)
     BigGrowth              1.014          -0.013           0.008          -0.235          -0.096
      (tvalue)           ( 3.365)         (217.349)      (-2.485)         ( 0.873)         (-26.586)
       BigDown              1.077          -0.673          -0.008           0.038          -0.062
      (tvalue)           ( 3.216)         (132.585)      (-72.766)         (-0.509)         ( 2.441)
     BigLowRev              1.058          -0.039           0.806           0.038          -0.127
      (tvalue)           (-0.529)         (117.992)      (-3.850)         (46.347)         ( 2.225)


Orthogonal factors            RMT            MomT            RevT            HMLT            SMBT

    SmallValue              1.163          -0.364           0.577           0.844           1.094
      (tvalue)           ( 1.801)         (282.919)      (-79.558)         (92.022)         (137.393)
   SmallGrowth              1.174          -0.234           0.266          -0.045           1.202
      (tvalue)           (-3.699)         (175.525)      (-31.408)         (26.059)         (-4.549)
     SMALLDown              1.273          -0.830           0.460           0.498           1.194
      (tvalue)           (-2.307)         (162.899)      (-95.341)         (38.647)         (42.664)
   SmallLowRev              1.242          -0.411           0.818           0.642           1.303
      (tvalue)           ( 0.633)         (172.255)      (-51.138)         (74.438)         (59.553)
      BigValue              1.134          -0.353           0.388           0.845           0.240
      (tvalue)           (-2.537)         (158.328)      (-44.143)         (35.534)         (78.910)
     BigGrowth              0.961          -0.146           0.036          -0.086           0.102
      (tvalue)           ( 3.365)         (231.151)      (-31.480)         ( 5.679)         (-13.905)
       BigDown              1.150          -0.841           0.185           0.314           0.182
      (tvalue)           ( 3.216)         (158.816)      (-104.236)         (16.777)         (28.967)
     BigLowRev              1.090          -0.276           0.857           0.432           0.247
      (tvalue)           (-0.529)         (136.433)      (-31.000)         (70.334)         (36.169)

上表为8个投资组合分别关于正交前以及正交后的因子回归得到的beta结果,对比t值我们可以发现,用正交后的因子做回归得到的beta的t值绝对值更大,因此回归系数更加显著,即beta更为稳定。¶

print 'Note: value= β^2 * σ^2'print "%14s %15s %15s %15s %15s %15s %15s %15s" % ('Original factors','RM','Mom','Rev','HML','SMB','Sum','σ^2(sj)')print '-'*130names=['RM','Mom','Rev','HML','SMB']for i in inputnames:reg=ols(i+'~ RM+Mom+Rev+HML+SMB',data=newdata).fit()reg.params=reg.params[1:]sumindex=(reg.params[0]**2)*var(newdata[names[0]])+(reg.params[1]**2)*var(newdata[names[1]])+(reg.params[2]**2)*var(newdata[names[2]])+(reg.params[3]**2)*var(newdata[names[3]])+(reg.params[4]**2)*var(newdata[names[4]])print "%14s    %15.3f %15.3f %15.3f %15.3f %15.3f %15.3f %15.3f"  % (i, (reg.params[0]**2)*var(newdata[names[0]]),(reg.params[1]**2)*var(newdata[names[1]]),(reg.params[2]**2)*var(newdata[names[2]]),(reg.params[3]**2)*var(newdata[names[3]]),(reg.params[4]**2)*var(newdata[names[4]]),sumindex,var(newdata[[i]]))print printprint "%14s%15s %15s %15s %15s %15s %15s %15s" % ('Orthogonal factors','RMT','MomT','RevT','HMLT','SMBT','Sum','σ^2(sj)')print '-'*130names=['RMT','MomT','RevT','HMLT','SMBT']for i in inputnames:reg=ols(i+'~ RMT+MomT+RevT+HMLT+SMBT',data=newdata).fit()reg.params=reg.params[1:]sumindex=(reg.params[0]**2)*var(newdata[names[0]])+(reg.params[1]**2)*var(newdata[names[1]])+(reg.params[2]**2)*var(newdata[names[2]])+(reg.params[3]**2)*var(newdata[names[3]])+(reg.params[4]**2)*var(newdata[names[4]])print "%14s    %15.3f %15.3f %15.3f %15.3f %15.3f %15.3f %15.3f"  % (i, (reg.params[0]**2)*var(newdata[names[0]]),(reg.params[1]**2)*var(newdata[names[1]]),(reg.params[2]**2)*var(newdata[names[2]]),(reg.params[3]**2)*var(newdata[names[3]]),(reg.params[4]**2)*var(newdata[names[4]]),sumindex,var(newdata[[i]]))
Note: value= β^2 * σ^2
Original factors              RM             Mom             Rev             HML             SMB             Sum        σ^2(sj)

    SmallValue             28.541           0.017           0.010           7.215           8.700          44.483          66.258
   SmallGrowth             33.096           0.052           0.001           0.640          11.024          44.812          56.608
     SMALLDown             31.991           7.735           0.002           0.712          10.158          50.598          82.579
   SmallLowRev             31.771           0.235           1.817           1.823          11.400          47.046          78.876
      BigValue             33.209           0.025           0.001           7.532           0.004          40.772          51.354
     BigGrowth             28.646           0.004           0.001           0.687           0.095          29.432          26.888
       BigDown             32.316          10.138           0.001           0.018           0.039          42.511          56.106
     BigLowRev             31.198           0.035           7.790           0.018           0.166          39.207          48.386


Orthogonal factors            RMT            MomT            RevT            HMLT            SMBT             Sum        σ^2(sj)

    SmallValue             37.606           2.974           3.978           8.869          12.342          65.770          66.258
   SmallGrowth             38.312           1.227           0.844           0.026          14.906          55.315          56.608
     SMALLDown             45.048          15.431           2.535           3.090          14.709          80.813          82.579
   SmallLowRev             42.927           3.783           8.016           5.131          17.514          77.371          78.876
      BigValue             35.797           2.783           1.803           8.892           0.594          49.869          51.354
     BigGrowth             25.696           0.477           0.016           0.093           0.107          26.388          26.888
       BigDown             36.774          15.841           0.410           1.223           0.341          54.590          56.106
     BigLowRev             33.082           1.708           8.792           2.325           0.631          46.537          48.386
正交后风险成功*为各部分因子风险之和¶
 

全部回复

0/140

量化课程

    移动端课程