Predicting the Gold Main-Contract Closing Price with a Keras LSTM Deep-Learning Model

Posted by 英雄就是我 on May 9, 19:30 | Replies (1)

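Deep-learning framework: Keras. Model: LSTM.

1 Data source: gold main-contract data from JQData (data provided by JQData)

2 Data cleaning

3 Two prediction experiments using the gold main-contract data

Dataset: the first 70% is the training set used to fit the model; the remaining 30% is the test set.

Model: Keras framework, with an LSTM model predicting the closing price.
LSTM (Long Short-Term Memory) is a kind of recurrent neural network, RNN (Recurrent Neural Network).

Results: all reported results are on the test set. test is the true closing price of the test set; pred is the closing price predicted by the model.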

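Experiment 1:

Use the closing prices of the previous 5 time steps

to predict the closing price at the current time step.
Each input sample has 5 steps, each step holding one closing price, and the output is one-dimensional, i.e. [None, 5, 1] => [None, 1]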
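
The windowing described for Experiment 1 can be illustrated on a tiny made-up series. This is a minimal sketch, not the code from the post: the helper name make_univariate_windows and the toy values are assumptions used only to show how a 1-D series becomes inputs of shape [None, 5, 1] and targets of shape [None].

```python
import numpy as np

def make_univariate_windows(series, seq_len):
    """Turn a 1-D series into (X, y): seq_len past values -> next value."""
    series = np.asarray(series, dtype=float)
    n = len(series) - seq_len  # number of complete (window, target) pairs
    # Each row holds seq_len consecutive values; add a trailing feature axis.
    X = np.array([series[i:i + seq_len] for i in range(n)])[:, :, np.newaxis]
    # The target is the value immediately after each window.
    y = series[seq_len:]
    return X, y

# Made-up closing prices, for illustration only.
toy_close = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0])
X, y = make_univariate_windows(toy_close, seq_len=5)
print(X.shape, y.shape)   # (2, 5, 1) (2,)
print(X[0, :, 0], y[0])   # first window and its target
```

With 7 values and a window of 5, only 2 samples remain, which matches the pattern in the post where 37317 training rows yield 37312 samples.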

Results (test set): test is the true closing price, pred is the closing price predicted by the model.
[Figure: 实验1.png]

Experiment 2:

Use open, close, high, low, volume and money from the previous 5 time steps
to predict the closing price at the current time step,
i.e. [None, 5, 6] => [None, 1]
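
For the multi-feature case, the same windows can also be built without a Python loop. The sketch below is an alternative to the list comprehension used in the post, assuming NumPy 1.20 or newer for sliding_window_view; the helper name make_multivariate_windows and the toy matrix are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_multivariate_windows(data, seq_len, target_col=0):
    """Build X of shape (n, seq_len, n_features) and y of shape (n,)."""
    data = np.asarray(data, dtype=float)
    # sliding_window_view appends the window axis last: (n + 1, n_features, seq_len).
    view = sliding_window_view(data, window_shape=seq_len, axis=0)
    # Reorder to (samples, steps, features) and drop the last window,
    # which has no following row to serve as its target.
    X = np.ascontiguousarray(view.transpose(0, 2, 1)[:-1])
    # Target: the chosen column of the row right after each window.
    y = data[seq_len:, target_col]
    return X, y

# Made-up matrix with 8 rows and 6 feature columns, for illustration only.
data = np.arange(8 * 6, dtype=float).reshape(8, 6)
X, y = make_multivariate_windows(data, seq_len=5)
print(X.shape, y.shape)  # (3, 5, 6) (3,)
```

Here column 0 plays the role of close, which is why the post moves close to the first column before building the windows.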

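Results (test set): test is the true closing price, pred is the closing price predicted by the model.
[Figure: 实验2.png]
See the source code below for the results.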

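from jqdatasdk import *
# JQData account and password (replace the placeholders with your own credentials)
auth('your_username', 'your_password')
# Questions are welcome; the author's email is jiaohaibin@ruc.edu.cn
df_data_5minute = get_price('AU9999.XSGE', start_date='2016-01-01', end_date='2018-01-01', frequency='5m')
auth success
df_data_5minute.to_csv('黄金主力5分钟数据.csv')
df_data_5minute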
open close high low volume money
2016-01-04 09:05:00 226.70 226.65 226.85 226.45 5890.0 1.335146e+09
2016-01-04 09:10:00 226.75 226.50 226.75 226.40 2562.0 5.804133e+08
2016-01-04 09:15:00 226.45 226.45 226.60 226.40 1638.0 3.709666e+08
2016-01-04 09:20:00 226.45 226.25 226.50 226.20 3162.0 7.157891e+08
2016-01-04 09:25:00 226.25 226.25 226.30 226.20 1684.0 3.809907e+08
2016-01-04 09:30:00 226.25 226.30 226.35 226.20 922.0 2.086313e+08
2016-01-04 09:35:00 226.30 226.35 226.40 226.20 2476.0 5.603541e+08
2016-01-04 09:40:00 226.30 226.45 226.45 226.25 2516.0 5.695246e+08
2016-01-04 09:45:00 226.45 226.35 226.45 226.30 1344.0 3.042327e+08
2016-01-04 09:50:00 226.30 226.30 226.35 226.20 1414.0 3.199363e+08
2016-01-04 09:55:00 226.35 226.45 226.50 226.30 1610.0 3.645328e+08
2016-01-04 10:00:00 226.45 226.40 226.50 226.40 972.0 2.200957e+08
2016-01-04 10:05:00 226.40 226.50 226.55 226.35 2004.0 4.538166e+08
2016-01-04 10:10:00 226.50 226.45 226.55 226.40 780.0 1.766423e+08
2016-01-04 10:15:00 226.45 226.45 226.50 226.40 1530.0 3.464690e+08
2016-01-04 10:35:00 226.55 226.45 226.65 226.45 2564.0 5.807784e+08
2016-01-04 10:40:00 226.45 226.50 226.55 226.45 900.0 2.038475e+08
2016-01-04 10:45:00 226.55 226.70 226.80 226.50 3008.0 6.817039e+08
2016-01-04 10:50:00 226.70 226.65 226.85 226.60 2510.0 5.691306e+08
2016-01-04 10:55:00 226.65 226.60 226.65 226.60 930.0 2.107595e+08
2016-01-04 11:00:00 226.65 226.75 226.75 226.60 1184.0 2.683818e+08
2016-01-04 11:05:00 226.75 226.65 226.75 226.60 1044.0 2.366603e+08
2016-01-04 11:10:00 226.65 226.60 226.70 226.60 342.0 7.751130e+07
2016-01-04 11:15:00 226.60 226.60 226.65 226.55 640.0 1.450196e+08
2016-01-04 11:20:00 226.60 226.65 226.70 226.60 502.0 1.137778e+08
2016-01-04 11:25:00 226.65 226.95 226.95 226.65 3222.0 7.308042e+08
2016-01-04 11:30:00 226.90 226.90 226.95 226.80 1472.0 3.339398e+08
2016-01-04 13:35:00 227.10 227.25 227.25 227.00 4894.0 1.111496e+09
2016-01-04 13:40:00 227.25 227.55 227.60 227.20 5338.0 1.214103e+09
2016-01-04 13:45:00 227.60 227.75 228.00 227.50 8612.0 1.961599e+09
... ... ... ... ... ... ...
2017-12-29 10:35:00 278.05 277.95 278.05 277.90 448.0 1.245318e+08
2017-12-29 10:40:00 277.90 277.95 278.00 277.90 506.0 1.406423e+08
2017-12-29 10:45:00 277.95 277.95 278.00 277.95 180.0 5.003790e+07
2017-12-29 10:50:00 277.95 278.00 278.05 277.95 936.0 2.602273e+08
2017-12-29 10:55:00 278.05 277.90 278.05 277.90 942.0 2.618281e+08
2017-12-29 11:00:00 277.85 277.90 277.95 277.85 518.0 1.439454e+08
2017-12-29 11:05:00 277.95 277.95 277.95 277.90 614.0 1.706443e+08
2017-12-29 11:10:00 277.90 277.90 277.95 277.85 1046.0 2.906776e+08
2017-12-29 11:15:00 277.95 277.90 277.95 277.90 206.0 5.725350e+07
2017-12-29 11:20:00 277.90 277.90 277.95 277.85 740.0 2.056435e+08
2017-12-29 11:25:00 277.90 277.85 277.90 277.85 200.0 5.557570e+07
2017-12-29 11:30:00 277.90 277.90 277.95 277.85 756.0 2.100840e+08
2017-12-29 13:35:00 277.90 278.00 278.00 277.90 490.0 1.362097e+08
2017-12-29 13:40:00 278.00 278.05 278.15 278.00 768.0 2.135675e+08
2017-12-29 13:45:00 278.10 278.15 278.15 278.05 252.0 7.008070e+07
2017-12-29 13:50:00 278.10 278.05 278.10 278.00 800.0 2.224430e+08
2017-12-29 13:55:00 278.00 278.00 278.05 277.95 184.0 5.115390e+07
2017-12-29 14:00:00 278.00 277.95 278.00 277.90 474.0 1.317464e+08
2017-12-29 14:05:00 277.95 277.95 277.95 277.90 334.0 9.282880e+07
2017-12-29 14:10:00 277.95 277.90 277.95 277.90 332.0 9.226560e+07
2017-12-29 14:15:00 277.90 277.95 277.95 277.90 672.0 1.867720e+08
2017-12-29 14:20:00 277.90 277.85 277.95 277.85 994.0 2.762458e+08
2017-12-29 14:25:00 277.90 277.90 277.95 277.85 352.0 9.781830e+07
2017-12-29 14:30:00 277.90 277.80 277.95 277.80 784.0 2.178426e+08
2017-12-29 14:35:00 277.85 277.80 277.85 277.75 920.0 2.555711e+08
2017-12-29 14:40:00 277.80 277.80 277.85 277.75 606.0 1.683349e+08
2017-12-29 14:45:00 277.80 277.85 277.85 277.80 560.0 1.555840e+08
2017-12-29 14:50:00 277.85 277.85 277.90 277.80 802.0 2.228271e+08
2017-12-29 14:55:00 277.85 277.75 277.90 277.75 1236.0 3.433855e+08
2017-12-29 15:00:00 277.80 277.80 277.90 277.70 1790.0 4.972797e+08

53310 rows × 6 columns

# Move 'close' to the first column so that the prediction target is column 0
df = df_data_5minute
close = df['close']
df.drop(labels=['close'], axis=1, inplace=True)
df.insert(0, 'close', close)
df
close open high low volume money
2016-01-04 09:05:00 226.65 226.70 226.85 226.45 5890.0 1.335146e+09
2016-01-04 09:10:00 226.50 226.75 226.75 226.40 2562.0 5.804133e+08
2016-01-04 09:15:00 226.45 226.45 226.60 226.40 1638.0 3.709666e+08
2016-01-04 09:20:00 226.25 226.45 226.50 226.20 3162.0 7.157891e+08
2016-01-04 09:25:00 226.25 226.25 226.30 226.20 1684.0 3.809907e+08
2016-01-04 09:30:00 226.30 226.25 226.35 226.20 922.0 2.086313e+08
2016-01-04 09:35:00 226.35 226.30 226.40 226.20 2476.0 5.603541e+08
2016-01-04 09:40:00 226.45 226.30 226.45 226.25 2516.0 5.695246e+08
2016-01-04 09:45:00 226.35 226.45 226.45 226.30 1344.0 3.042327e+08
2016-01-04 09:50:00 226.30 226.30 226.35 226.20 1414.0 3.199363e+08
2016-01-04 09:55:00 226.45 226.35 226.50 226.30 1610.0 3.645328e+08
2016-01-04 10:00:00 226.40 226.45 226.50 226.40 972.0 2.200957e+08
2016-01-04 10:05:00 226.50 226.40 226.55 226.35 2004.0 4.538166e+08
2016-01-04 10:10:00 226.45 226.50 226.55 226.40 780.0 1.766423e+08
2016-01-04 10:15:00 226.45 226.45 226.50 226.40 1530.0 3.464690e+08
2016-01-04 10:35:00 226.45 226.55 226.65 226.45 2564.0 5.807784e+08
2016-01-04 10:40:00 226.50 226.45 226.55 226.45 900.0 2.038475e+08
2016-01-04 10:45:00 226.70 226.55 226.80 226.50 3008.0 6.817039e+08
2016-01-04 10:50:00 226.65 226.70 226.85 226.60 2510.0 5.691306e+08
2016-01-04 10:55:00 226.60 226.65 226.65 226.60 930.0 2.107595e+08
2016-01-04 11:00:00 226.75 226.65 226.75 226.60 1184.0 2.683818e+08
2016-01-04 11:05:00 226.65 226.75 226.75 226.60 1044.0 2.366603e+08
2016-01-04 11:10:00 226.60 226.65 226.70 226.60 342.0 7.751130e+07
2016-01-04 11:15:00 226.60 226.60 226.65 226.55 640.0 1.450196e+08
2016-01-04 11:20:00 226.65 226.60 226.70 226.60 502.0 1.137778e+08
2016-01-04 11:25:00 226.95 226.65 226.95 226.65 3222.0 7.308042e+08
2016-01-04 11:30:00 226.90 226.90 226.95 226.80 1472.0 3.339398e+08
2016-01-04 13:35:00 227.25 227.10 227.25 227.00 4894.0 1.111496e+09
2016-01-04 13:40:00 227.55 227.25 227.60 227.20 5338.0 1.214103e+09
2016-01-04 13:45:00 227.75 227.60 228.00 227.50 8612.0 1.961599e+09
... ... ... ... ... ... ...
2017-12-29 10:35:00 277.95 278.05 278.05 277.90 448.0 1.245318e+08
2017-12-29 10:40:00 277.95 277.90 278.00 277.90 506.0 1.406423e+08
2017-12-29 10:45:00 277.95 277.95 278.00 277.95 180.0 5.003790e+07
2017-12-29 10:50:00 278.00 277.95 278.05 277.95 936.0 2.602273e+08
2017-12-29 10:55:00 277.90 278.05 278.05 277.90 942.0 2.618281e+08
2017-12-29 11:00:00 277.90 277.85 277.95 277.85 518.0 1.439454e+08
2017-12-29 11:05:00 277.95 277.95 277.95 277.90 614.0 1.706443e+08
2017-12-29 11:10:00 277.90 277.90 277.95 277.85 1046.0 2.906776e+08
2017-12-29 11:15:00 277.90 277.95 277.95 277.90 206.0 5.725350e+07
2017-12-29 11:20:00 277.90 277.90 277.95 277.85 740.0 2.056435e+08
2017-12-29 11:25:00 277.85 277.90 277.90 277.85 200.0 5.557570e+07
2017-12-29 11:30:00 277.90 277.90 277.95 277.85 756.0 2.100840e+08
2017-12-29 13:35:00 278.00 277.90 278.00 277.90 490.0 1.362097e+08
2017-12-29 13:40:00 278.05 278.00 278.15 278.00 768.0 2.135675e+08
2017-12-29 13:45:00 278.15 278.10 278.15 278.05 252.0 7.008070e+07
2017-12-29 13:50:00 278.05 278.10 278.10 278.00 800.0 2.224430e+08
2017-12-29 13:55:00 278.00 278.00 278.05 277.95 184.0 5.115390e+07
2017-12-29 14:00:00 277.95 278.00 278.00 277.90 474.0 1.317464e+08
2017-12-29 14:05:00 277.95 277.95 277.95 277.90 334.0 9.282880e+07
2017-12-29 14:10:00 277.90 277.95 277.95 277.90 332.0 9.226560e+07
2017-12-29 14:15:00 277.95 277.90 277.95 277.90 672.0 1.867720e+08
2017-12-29 14:20:00 277.85 277.90 277.95 277.85 994.0 2.762458e+08
2017-12-29 14:25:00 277.90 277.90 277.95 277.85 352.0 9.781830e+07
2017-12-29 14:30:00 277.80 277.90 277.95 277.80 784.0 2.178426e+08
2017-12-29 14:35:00 277.80 277.85 277.85 277.75 920.0 2.555711e+08
2017-12-29 14:40:00 277.80 277.80 277.85 277.75 606.0 1.683349e+08
2017-12-29 14:45:00 277.85 277.80 277.85 277.80 560.0 1.555840e+08
2017-12-29 14:50:00 277.85 277.85 277.90 277.80 802.0 2.228271e+08
2017-12-29 14:55:00 277.75 277.85 277.90 277.75 1236.0 3.433855e+08
2017-12-29 15:00:00 277.80 277.80 277.90 277.70 1790.0 4.972797e+08

53310 rows × 6 columns

df.head()
close open high low volume money
2016-01-04 09:05:00 226.65 226.70 226.85 226.45 5890.0 1.335146e+09
2016-01-04 09:10:00 226.50 226.75 226.75 226.40 2562.0 5.804133e+08
2016-01-04 09:15:00 226.45 226.45 226.60 226.40 1638.0 3.709666e+08
2016-01-04 09:20:00 226.25 226.45 226.50 226.20 3162.0 7.157891e+08
2016-01-04 09:25:00 226.25 226.25 226.30 226.20 1684.0 3.809907e+08
#df.drop('money', axis=1, inplace=True)
# Chronological split: first 70% of rows for training, last 30% for testing
data_train = df.iloc[:int(df.shape[0] * 0.7), :]
data_test = df.iloc[int(df.shape[0] * 0.7):, :]
print(data_train.shape, data_test.shape)
(37317, 6) (15993, 6)
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.preprocessing import MinMaxScaler
import time
# Scale every column to [-1, 1]; fit on the training set only,
# then apply the same transform to both sets
scaler = MinMaxScaler(feature_range=(-1, 1))
scaler.fit(data_train)
MinMaxScaler(copy=True, feature_range=(-1, 1))
data_train = scaler.transform(data_train)
data_test = scaler.transform(data_test)
data_train
array([[-0.98877193, -0.98736842, -0.98459384, -0.99297259, -0.82504604,
        -0.85978547],
       [-0.99298246, -0.98596491, -0.98739496, -0.99437807, -0.92389948,
        -0.93904608],
       [-0.99438596, -0.99438596, -0.99159664, -0.99437807, -0.95134557,
        -0.96104178],
       ...,
       [ 0.61263158,  0.61824561,  0.61484594,  0.61349262, -0.90916652,
        -0.90885626],
       [ 0.61684211,  0.61403509,  0.61204482,  0.61630358, -0.94754352,
        -0.94737162],
       [ 0.6154386 ,  0.6154386 ,  0.61064426,  0.61349262, -0.94445435,
        -0.9442865 ]])
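
Because the scaler is fitted on all six columns, the model's single-column predictions cannot be passed straight to scaler.inverse_transform, and the plots and MSE values in this post stay in scaled units. The sketch below shows one way to map a single scaled column back to prices. It reproduces the min-max formula in plain NumPy on made-up numbers; with a fitted MinMaxScaler the same two per-column values are exposed as scaler.scale_ and scaler.min_. The helper unscale_column and the toy data are assumptions, not part of the original code.

```python
import numpy as np

feature_range = (-1.0, 1.0)
# Made-up training data with columns [close, volume], for illustration only.
toy_train = np.array([[226.0, 1000.0],
                      [230.0, 3000.0],
                      [250.0, 2000.0]])

lo, hi = feature_range
data_min = toy_train.min(axis=0)
data_max = toy_train.max(axis=0)
# Min-max scaling is an affine map per column: scaled = raw * scale + offset.
scale = (hi - lo) / (data_max - data_min)
offset = lo - data_min * scale
scaled = toy_train * scale + offset

def unscale_column(values, col):
    """Invert the affine map for one column, e.g. col=0 for close."""
    return (np.asarray(values, dtype=float) - offset[col]) / scale[col]

restored_close = unscale_column(scaled[:, 0], col=0)
print(restored_close)
```

The same reasoning converts an MSE from scaled units to squared price units: divide it by scale[0] ** 2.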
from keras.layers import Input, Dense, LSTM
from keras.models import Model

# Experiment 1: previous 5 closing prices -> current closing price
output_dim = 1
batch_size = 256
epochs = 10
seq_len = 5
hidden_size = 128

# Sliding windows over column 0 (close); X gets a trailing feature axis of size 1
X_train = np.array([data_train[i : i + seq_len, 0] for i in range(data_train.shape[0] - seq_len)])[:, :, np.newaxis]
y_train = np.array([data_train[i + seq_len, 0] for i in range(data_train.shape[0] - seq_len)])
X_test = np.array([data_test[i : i + seq_len, 0] for i in range(data_test.shape[0] - seq_len)])[:, :, np.newaxis]
y_test = np.array([data_test[i + seq_len, 0] for i in range(data_test.shape[0] - seq_len)])
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

# LSTM layer followed by a single-unit dense output
X = Input(shape=[X_train.shape[1], X_train.shape[2],])
h = LSTM(hidden_size, activation='relu')(X)
# Note: sigmoid outputs lie in (0, 1), while the targets are scaled to (-1, 1),
# so negative targets cannot be matched exactly
Y = Dense(output_dim, activation='sigmoid')(h)
model = Model(X, Y)

model.compile(loss='mean_squared_error', optimizer='adam')
# shuffle=False keeps the samples in chronological order during training
model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, shuffle=False)
y_pred = model.predict(X_test)
print('MSE Train:', model.evaluate(X_train, y_train, batch_size=batch_size))
print('MSE Test:', model.evaluate(X_test, y_test, batch_size=batch_size))

# Plot true vs. predicted close on the test set (scaled units)
plt.plot(y_test, label='test')
plt.plot(y_pred, label='pred')
plt.legend()
plt.show()
Using TensorFlow backend.
(37312, 5, 1) (37312,) (15988, 5, 1) (15988,)
Epoch 1/10
37312/37312 [==============================] - 4s 110us/step - loss: 0.1995
Epoch 2/10
37312/37312 [==============================] - 4s 94us/step - loss: 0.0612
Epoch 3/10
37312/37312 [==============================] - 4s 109us/step - loss: 0.0441
Epoch 4/10
37312/37312 [==============================] - 4s 114us/step - loss: 0.0423
Epoch 5/10
37312/37312 [==============================] - 4s 106us/step - loss: 0.0418
Epoch 6/10
37312/37312 [==============================] - 4s 96us/step - loss: 0.0415
Epoch 7/10
37312/37312 [==============================] - 4s 99us/step - loss: 0.0412
Epoch 8/10
37312/37312 [==============================] - 4s 110us/step - loss: 0.0410
Epoch 9/10
37312/37312 [==============================] - 4s 116us/step - loss: 0.0409
Epoch 10/10
37312/37312 [==============================] - 4s 98us/step - loss: 0.0408
37312/37312 [==============================] - 1s 30us/step
MSE Train: 0.04086427927292725
15988/15988 [==============================] - 0s 28us/step
MSE Test: 6.807026879235516e-05
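
One possible explanation for the gap between the training loss (about 0.041) and the test MSE (about 6.8e-05) is the output activation: the targets are scaled to (-1, 1), but a sigmoid can only produce values in (0, 1), so any negative target contributes an irreducible error. This has not been verified on the actual data. The sketch below computes that error floor on made-up targets; the function name sigmoid_mse_floor is hypothetical.

```python
import numpy as np

def sigmoid_mse_floor(y_true):
    """Lowest MSE reachable when every prediction is confined to [0, 1]."""
    y_true = np.asarray(y_true, dtype=float)
    # The closest reachable prediction for each target is the target clipped to [0, 1].
    best_pred = np.clip(y_true, 0.0, 1.0)
    return float(np.mean((y_true - best_pred) ** 2))

# Made-up scaled targets: half negative, half positive.
toy_targets = np.array([-0.5, -0.5, 0.5, 0.5])
print(sigmoid_mse_floor(toy_targets))  # 0.125
```

Applying the same function to y_train would show how much of the training loss comes from this limit. If the floor turns out to be large, an output layer with activation='linear' or activation='tanh' covers the full (-1, 1) target range and may be worth trying.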
from keras.layers import Input, Dense, LSTM
from keras.models import Model

# Experiment 2: previous 5 steps of all 6 columns -> current closing price
output_dim = 1
batch_size = 256
epochs = 10
seq_len = 5
hidden_size = 128

# Sliding windows over all columns; the target is still column 0 (close)
X_train = np.array([data_train[i : i + seq_len, :] for i in range(data_train.shape[0] - seq_len)])
y_train = np.array([data_train[i + seq_len, 0] for i in range(data_train.shape[0] - seq_len)])
X_test = np.array([data_test[i : i + seq_len, :] for i in range(data_test.shape[0] - seq_len)])
y_test = np.array([data_test[i + seq_len, 0] for i in range(data_test.shape[0] - seq_len)])

print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

# Same architecture as Experiment 1; only the input feature dimension changes
X = Input(shape=[X_train.shape[1], X_train.shape[2],])
h = LSTM(hidden_size, activation='relu')(X)
# Note: the sigmoid output range (0, 1) does not cover negative scaled targets
Y = Dense(output_dim, activation='sigmoid')(h)
model = Model(X, Y)

model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, shuffle=False)
y_pred = model.predict(X_test)
print('MSE Train:', model.evaluate(X_train, y_train, batch_size=batch_size))
print('MSE Test:', model.evaluate(X_test, y_test, batch_size=batch_size))

# Plot true vs. predicted close on the test set (scaled units)
plt.plot(y_test, label='test')
plt.plot(y_pred, label='pred')
plt.legend()
plt.show()
(37312, 5, 6) (37312,) (15988, 5, 6) (15988,)
Epoch 1/10
37312/37312 [==============================] - 4s 114us/step - loss: 0.1633
Epoch 2/10
37312/37312 [==============================] - 4s 114us/step - loss: 0.0472
Epoch 3/10
37312/37312 [==============================] - 5s 142us/step - loss: 0.0430
Epoch 4/10
37312/37312 [==============================] - 4s 114us/step - loss: 0.0421
Epoch 5/10
37312/37312 [==============================] - 5s 121us/step - loss: 0.0417
Epoch 6/10
37312/37312 [==============================] - 4s 120us/step - loss: 0.0416
Epoch 7/10
37312/37312 [==============================] - 4s 111us/step - loss: 0.0414
Epoch 8/10
37312/37312 [==============================] - 4s 104us/step - loss: 0.0417
Epoch 9/10
37312/37312 [==============================] - 4s 104us/step - loss: 0.0412
Epoch 10/10
37312/37312 [==============================] - 5s 128us/step - loss: 0.0413
37312/37312 [==============================] - 2s 43us/step
MSE Train: 0.04110309858989641
15988/15988 [==============================] - 1s 37us/step
MSE Test: 0.00014048067397460243
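
To judge whether a test MSE of 6.8e-05 or 1.4e-04 (in scaled units) is good, it can help to compare it against a naive persistence forecast that simply repeats the last observed close. This baseline is not part of the original post. The sketch below uses made-up values, and the function name persistence_mse is hypothetical.

```python
import numpy as np

def persistence_mse(close_scaled, seq_len):
    """MSE of predicting each target with the close one step before it."""
    close_scaled = np.asarray(close_scaled, dtype=float)
    # Targets line up with y_test in the post: rows seq_len, seq_len + 1, ...
    y_true = close_scaled[seq_len:]
    # Naive forecast: the last value inside each input window.
    y_naive = close_scaled[seq_len - 1:-1]
    return float(np.mean((y_true - y_naive) ** 2))

# Made-up scaled closing prices, for illustration only.
toy = np.array([0.0, 0.1, 0.2, 0.1, 0.2, 0.3, 0.2, 0.3])
baseline = persistence_mse(toy, seq_len=5)
print(baseline)
```

On the real data this would be called as persistence_mse(data_test[:, 0], seq_len), and the result compared with the MSE Test values printed above. A model that does not beat this number has learned little beyond copying the previous close.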