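This post collects data-retrieval methods shared by community members. It is for learning and discussion only, and there is no conflict of interest.
This series will be updated on an ongoing basis; posts are titled 【共享函数】 (Shared Functions).
Scraping Sina hot stocks (by 股票疯赢)
Scraping limit-up reasons from Xuangubao (by 包希仁)
Fetching Hong Kong IPO data to tally IPO subscription returns (by 止一之路)
Scraping industry quotes and valuation data from the Shenwan (SW) official site (by ssk)
Fetching government bond yield data with a scraper (by tinysnowing)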
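Author: 股票疯赢

    import json

    import pandas as pd
    import requests

    # normalize_code is supplied by the JoinQuant research environment;
    # it is not defined in this snippet.


    def get_hot_stock_from_sina():
        '''Fetch the hot-stock list from Sina.'''
        url = ('https://ssl-data.sina.com.cn/api/openapi.php/'
               'WeiboReferService.getListSymbol'
               '?code=CNHOUR6&callback=var%20AHM=')
        html = requests.get(url).content.decode()
        # The response is JSON wrapped in a callback: keep what sits between
        # the first '(' and the last ')'.
        payload = html[html.index('(') + 1:html.rindex(')')]
        # The standard json module replaces the obsolete anyjson package.
        h = json.loads(payload)
        data = pd.DataFrame(h['result']['data'])
        data.SYMBOL = data.SYMBOL.apply(normalize_code)
        return data


    get_hot_stock_from_sina().head()
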
| | NAME | REF | SYMBOL |
---|---|---|---|
0 | 东方通信 | 891768 | 600776.XSHG |
1 | 银之杰 | 735779 | 300085.XSHE |
2 | 东方财富 | 654869 | 300059.XSHE |
3 | 网宿科技 | 592289 | 300017.XSHE |
4 | 安控科技 | 498015 | 300370.XSHE |
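
The function above calls `normalize_code`, which the JoinQuant environment supplies. To try the snippet elsewhere, a rough stand-in might look like the sketch below. The name `normalize_code_sketch` and its rules are assumptions for illustration, not JoinQuant's actual implementation: `sh`/`sz` prefixes and `.SH`/`.SZ` suffixes are recognised, and bare 6-digit codes starting with 6 or 9 are treated as Shanghai, everything else as Shenzhen.

```python
import re

# Hypothetical stand-in for JoinQuant's normalize_code; the real function
# accepts more input formats than the ones handled here.
_SUFFIX = {'SH': 'XSHG', 'SZ': 'XSHE', 'XSHG': 'XSHG', 'XSHE': 'XSHE'}


def normalize_code_sketch(code):
    """Map 'sh600776', '600776.SH' or '600776' to the '600776.XSHG' style."""
    text = str(code).strip().upper()
    match = re.fullmatch(r'(SH|SZ)?(\d{6})(?:\.(SH|SZ|XSHG|XSHE))?', text)
    if match is None:
        raise ValueError('unrecognised stock code: {!r}'.format(code))
    prefix, digits, suffix = match.groups()
    market = prefix or suffix
    if market is not None:
        return digits + '.' + _SUFFIX[market]
    # Assumption: bare codes starting with 6 or 9 trade in Shanghai,
    # all other bare codes are treated as Shenzhen.
    exchange = 'XSHG' if digits[0] in '69' else 'XSHE'
    return digits + '.' + exchange
```

With this sketch, `normalize_code_sketch('002450.SZ')` gives `'002450.XSHE'`, the same form as the SYMBOL column in the table above.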
Author: 包希仁

    import datetime
    import json
    import urllib.request

    import pandas as pd

    # normalize_code is supplied by the JoinQuant research environment.


    def Xuangubao():
        url = 'https://flash-api.xuangubao.cn/api/pool/detail?pool_name=limit_up'  # limit-up pool
        # url = 'https://flash-api.xuangubao.cn/api/pool/detail?pool_name=limit_up_broken'  # broken limit-up pool
        header_dict = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko'}
        # On Python 2, use urllib2.Request / urllib2.urlopen instead.
        req = urllib.request.Request(url, headers=header_dict)
        df = pd.DataFrame(json.loads(urllib.request.urlopen(req).read())['data'])
        df['stock_reason'] = df.surge_reason.apply(lambda x: x['stock_reason'])
        df['plate_name'] = df.surge_reason.apply(lambda x: x['related_plates'][0]['plate_name'])

        def get_plate_reason(x):
            # Some entries carry no plate_reason; return None for those.
            try:
                return x['related_plates'][0]['plate_reason']
            except (KeyError, IndexError, TypeError):
                return None

        df['plate_reason'] = df.surge_reason.apply(get_plate_reason)
        # Time of the first limit event, converted from epoch seconds
        df['limit_timeline'] = df.limit_timeline.apply(
            lambda x: datetime.datetime.fromtimestamp(x['items'][0]['timestamp']))
        df.index = df.surge_reason.apply(lambda x: normalize_code(x['symbol']))
        df.index.name = None
        return df.drop('surge_reason', axis=1)


    Xuangubao().head()

| | break_limit_down_times | break_limit_up_times | buy_lock_volume_ratio | change_percent | first_break_limit_down | first_break_limit_up | first_limit_down | first_limit_up | is_new_stock | issue_price | last_break_limit_down | last_break_limit_up | last_limit_down | last_limit_up | limit_down_days | limit_timeline | limit_up_days | listed_date | m_days_n_boards_boards | m_days_n_boards_days | mtm | nearly_new_acc_* | nearly_new_break_days | new_stock_acc_* | new_stock_break_limit_up | new_stock_limit_up_days | new_stock_limit_up_price_before_broken | non_restricted_capital | price | sell_lock_volume_ratio | stock_chi_name | symbol | total_capital | turnover_ratio | volume_bias_ratio | yesterday_break_limit_up_times | yesterday_first_limit_up | yesterday_last_limit_up | yesterday_limit_down_days | yesterday_limit_up_days | stock_reason | plate_name | plate_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
002450.XSHE | 0 | 0 | 0.009174 | 0.050382 | 0 | 0 | 0 | 1551662703 | False | 14.20 | 0 | 0 | 0 | 1551662703 | 0 | 2019-03-04 09:25:03 | 5 | 1279209600 | 9 | 15 | 0.0 | 0.0 | 0 | -0.515493 | 0 | 0 | 0.0 | 2.224831e+10 | 6.88 | 0 | ST康得新 | 002450.SZ | 2.436139e+10 | 0.000768 | 0.015875 | 0 | 1551403503 | 1551403503 | 0 | 4 | 2018年度实现净利润4.02亿元 | ST股 | 年报披露高峰期,扭亏个股有望摘帽 |
300538.XSHE | 0 | 0 | 0.187240 | 0.100110 | 0 | 0 | 0 | 1551662703 | False | 15.85 | 0 | 0 | 0 | 1551662703 | 0 | 2019-03-04 09:25:03 | 3 | 1472140800 | 3 | 3 | 0.0 | 0.0 | 0 | 0.899685 | 0 | 0 | 0.0 | 7.225360e+08 | 30.11 | 0 | 同益股份 | 300538.SZ | 2.538039e+09 | 0.046596 | 0.630724 | 0 | 1551403503 | 1551403503 | 0 | 2 | 18年年报10转8 | 高送转 | None |
002207.XSHE | 0 | 8 | 0.000931 | 0.050325 | 0 | 1551679479 | 0 | 1551679467 | False | 7.85 | 0 | 1551681900 | 0 | 1551681981 | 0 | 2019-03-04 14:04:27 | 1 | 1201449600 | 3 | 6 | 0.0 | 0.0 | 0 | -0.175796 | 0 | 0 | 0.0 | 1.536120e+09 | 6.47 | 0 | ST准油 | 002207.SZ | 1.547478e+09 | 0.044471 | 1.641923 | 3 | 1551405519 | 1551406431 | 0 | 0 | 主营石油技术服务、建筑*、运输服务和化工产品销售,属于上游石油天然气采掘服务业 | ST股 | 年报披露高峰期,扭亏个股有望摘帽 |
002552.XSHE | 0 | 4 | 0.001281 | 0.050649 | 0 | 1551681207 | 0 | 1551681189 | False | 20.00 | 0 | 1551681444 | 0 | 1551682110 | 0 | 2019-03-04 14:33:09 | 1 | 1298563200 | 0 | 0 | 0.0 | 0.0 | 0 | -0.595500 | 0 | 0 | 0.0 | 1.619205e+09 | 8.09 | 0 | *ST宝鼎 | 002552.SZ | 2.477420e+09 | 0.030213 | 1.326435 | 0 | 0 | 0 | 0 | 0 | 三季报扭亏,主营大型铸锻件 | ST股 | 年报披露高峰期,扭亏个股有望摘帽 |
000727.XSHE | 0 | 2 | 0.007109 | 0.099585 | 0 | 1551663813 | 0 | 1551663003 | False | 6.16 | 0 | 1551666750 | 0 | 1551667500 | 0 | 2019-03-04 09:30:03 | 1 | 864057600 | 4 | 8 | 0.0 | 0.0 | 0 | -0.569805 | 0 | 0 | 0.0 | 7.766236e+09 | 2.65 | 0 | 华东科技 | 000727.SZ | 1.200335e+10 | 0.125476 | 1.183715 | 0 | 0 | 0 | 0 | 0 | 广东聚华印刷显示技术有限公司为参股公司,目前聚华公司建成了“国家印刷及柔性显示创新中心”,开... | 柔性屏 | 华为发布MATE X折叠屏手机 |
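
Columns such as `first_limit_up` and `listed_date` in this output are raw Unix timestamps in seconds, and 0 seems to mark "no such event". The sketch below converts them to Beijing time with the standard library. The helper name `epoch_to_beijing` and the choice to return `None` for 0 are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+8 offset; China does not observe daylight saving time.
BEIJING = timezone(timedelta(hours=8))


def epoch_to_beijing(ts):
    """Convert epoch seconds to an aware datetime in UTC+8 (None for 0)."""
    if not ts:
        return None
    return datetime.fromtimestamp(int(ts), tz=BEIJING)


# first_limit_up of 002450.XSHE in the table above
print(epoch_to_beijing(1551662703))
```

For that first row the result is 09:25:03 on 2019-03-04, which agrees with its `limit_timeline` value. Passing an explicit time zone avoids depending on the local time zone of the machine, which is what a bare `datetime.fromtimestamp(x)` uses.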
Author: 止一之路

Author: ssk

    # Fetch Shenwan (SW) industry index data from the official SW website
    import json
    from datetime import date

    import pandas as pd
    import requests


    # code: industry code(s), see https://www.joinquant.com/help/api/help?name=plateData#申万行业
    # frequency: day/week/month
    # start_date: None means the earliest available date
    # end_date: None means today
    # fields: None means all fields
    def get_sw_data(code=None, start_date=None, end_date=None, frequency='day', fields=None):
        # Request headers
        header = {
            'HOST': 'www.swsindex.com',
            'Referer': 'http://www.swsindex.com/idx0200.aspx?columnid=8838&type=Day',
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                          'Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4482.400 QQBrowser/9.7.13001.400',
        }
        # Column names of the result table
        sw_columns_list = ['SwIndexCode', 'SwIndexName', 'BargainDate', 'OpenIndex', 'CloseIndex',
                           'MaxIndex', 'MinIndex', 'BargainAmount', 'BargainSum', 'Markup',
                           'TurnoverRate', 'PE', 'PB', 'MeanPrice', 'BargainSumRate',
                           'NegotiablesShareSum', 'NegotiablesShareSum2', 'DP']
        # Supported frequencies
        frequency_list = ['day', 'week', 'month']
        # Request parameters
        param = {
            'tablename': 'V_Report',
            'key': 'id',
            # Page number; each page returns 20 rows
            'p': '1',
            # Query (codes, date range, frequency); overwritten below
            'where': "swindexcode in ('801020') and BargainDate>='2018-04-02' and BargainDate<='2018-04-24' and type='Day'",
            # Sort order: swindexcode asc = ascending by code;
            # BargainDate_1 = descending by date, _2 = ascending
            'orderby': 'swindexcode asc,BargainDate_2',
            # Returned fields; overwritten below
            'fieldlist': ','.join(sw_columns_list),
            'pagecount': '1',
            'timed': '1524497094532',
        }

        # Build the code list; fall back to a default code when none is given
        if code is None:
            code = '801010'
        if isinstance(code, (list, tuple)):
            code_str = ','.join("'{}'".format(c) for c in code)
        else:
            code_str = "'{}'".format(code)

        # Clamp the date range to [1999-12-30, today]
        today_str = date.today().strftime('%Y-%m-%d')
        if (start_date is None) or (start_date < '1999-12-30') or (start_date > today_str):
            start_date = '1999-12-30'
        if (end_date is None) or (end_date > today_str) or (end_date < '1999-12-30'):
            end_date = today_str

        # Fall back to daily data for an unknown frequency
        if frequency not in frequency_list:
            frequency = 'day'
        param['where'] = ("swindexcode in ({}) and BargainDate>='{}' and BargainDate<='{}' and type='{}'"
                          .format(code_str, start_date, end_date, frequency))

        # Select fields; the three key columns always come first, without duplicates
        columns = sw_columns_list
        if fields is not None and set(fields).issubset(set(sw_columns_list)):
            keys = ['SwIndexCode', 'SwIndexName', 'BargainDate']
            columns = keys + [f for f in fields if f not in keys]
        param['fieldlist'] = ','.join(columns)

        url = 'http://www.swsindex.com/handler.aspx'
        frames = []
        page = 1
        while True:
            # As in the original, the parameters are sent in the request body
            ret = requests.get(url, data=param, headers=header)
            if not ret.ok:
                break
            # Normalise quotes and the date format so the text parses as JSON
            text = ret.text.replace("'", '"').replace(' 0:00:00', '').replace('/', '-')
            data = json.loads(text).get('root')
            if not data:
                break
            frames.append(pd.DataFrame(data, columns=columns))
            # Move on to the next page
            page += 1
            param['p'] = str(page)

        # DataFrame.append was removed in pandas 2.0, so concatenate the pages
        df = pd.concat(frames) if frames else pd.DataFrame()
        if len(df) != 0:
            df.BargainDate = pd.to_datetime(df.BargainDate, format='%Y-%m-%d')
        return df


    df = get_sw_data('850111', start_date='2019-02-23')
    df

| | SwIndexCode | SwIndexName | BargainDate | OpenIndex | CloseIndex | MaxIndex | MinIndex | BargainAmount | BargainSum | Markup | TurnoverRate | PE | PB | MeanPrice | BargainSumRate | NegotiablesShareSum | NegotiablesShareSum2 | DP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 850111 | 种子生产 | 2019-02-25 | 2493.65 | 2603.85 | 2612.97 | 2469.67 | 18544 | 109029 | 4.57 | 3.6043 | 45.49 | 2.68 | 6.70 | 0.10 | 3496540.23 | 437067.53 | 0.57 |
1 | 850111 | 种子生产 | 2019-02-26 | 2601.41 | 2577.02 | 2643.98 | 2534.83 | 20089 | 115323 | -1.03 | 3.9047 | 45.02 | 2.65 | 6.65 | 0.11 | 3470405.45 | 433800.68 | 0.58 |
2 | 850111 | 种子生产 | 2019-02-27 | 2571.52 | 2547.74 | 2603.46 | 2530.19 | 13651 | 78331 | -1.14 | 2.6533 | 44.51 | 2.62 | 6.58 | 0.09 | 3430769.77 | 428846.22 | 0.59 |
3 | 850111 | 种子生产 | 2019-02-28 | 2550.00 | 2559.18 | 2584.13 | 2523.73 | 8255 | 50326 | 0.45 | 1.6044 | 44.71 | 2.64 | 6.62 | 0.08 | 3449990.37 | 431248.80 | 0.58 |
4 | 850111 | 种子生产 | 2019-03-01 | 2567.25 | 2570.26 | 2590.56 | 2519.63 | 9037 | 53291 | 0.43 | 1.7564 | 44.91 | 2.65 | 6.64 | 0.08 | 3462555.34 | 432819.42 | 0.58 |
5 | 850111 | 种子生产 | 2019-03-04 | 2581.01 | 2590.01 | 2636.93 | 2559.27 | 14772 | 91984 | 0.77 | 2.8711 | 45.25 | 2.67 | 6.70 | 0.09 | 3489184.93 | 436148.12 | 0.58 |
6 | 850111 | 种子生产 | 2019-03-05 | 2588.81 | 2651.59 | 2662.00 | 2560.38 | 16510 | 94081 | 2.38 | 3.2090 | 46.33 | 2.73 | 6.89 | 0.11 | 3577580.69 | 447197.59 | 0.56 |
7 | 850111 | 种子生产 | 2019-03-06 | 2669.46 | 2688.30 | 2715.86 | 2620.91 | 19228 | 115049 | 1.38 | 3.7373 | 46.97 | 2.77 | 6.98 | 0.10 | 3627549.76 | 453443.72 | 0.56 |
8 | 850111 | 种子生产 | 2019-03-07 | 2691.75 | 2723.26 | 2791.49 | 2645.81 | 18851 | 115497 | 1.30 | 3.6640 | 47.58 | 2.81 | 7.12 | 0.10 | 3680740.95 | 460092.62 | 0.55 |
9 | 850111 | 种子生产 | 2019-03-08 | 2672.42 | 2600.62 | 2751.95 | 2574.58 | 18283 | 105621 | -4.50 | 3.5536 | 45.44 | 2.68 | 6.73 | 0.09 | 3506135.99 | 438267.00 | 0.58 |
10 | 850111 | 种子生产 | 2019-03-11 | 2602.07 | 2717.11 | 2721.98 | 2593.20 | 15197 | 91143 | 4.48 | 2.9538 | 47.47 | 2.80 | 7.14 | 0.10 | 3682245.16 | 460280.65 | 0.55 |
11 | 850111 | 种子生产 | 2019-03-12 | 2738.21 | 2723.11 | 2783.13 | 2673.28 | 17973 | 115338 | 0.22 | 3.4933 | 47.58 | 2.81 | 7.15 | 0.10 | 3686361.67 | 460795.21 | 0.55 |
12 | 850111 | 种子生产 | 2019-03-13 | 2755.56 | 2686.94 | 2826.68 | 2650.85 | 20313 | 126599 | -1.33 | 3.9482 | 46.95 | 2.77 | 7.05 | 0.12 | 3631513.24 | 453939.15 | 0.56 |
13 | 850111 | 种子生产 | 2019-03-14 | 2658.12 | 2567.43 | 2679.73 | 2527.97 | 13592 | 78529 | -4.45 | 2.6419 | 44.68 | 2.65 | 6.72 | 0.10 | 3469352.20 | 433669.02 | 0.59 |
14 | 850111 | 种子生产 | 2019-03-15 | 2578.16 | 2594.87 | 2640.55 | 2554.47 | 9537 | 61640 | 1.07 | 1.8536 | 45.16 | 2.67 | 6.83 | 0.08 | 3508430.54 | 438553.82 | 0.58 |
15 | 850111 | 种子生产 | 2019-03-18 | 2617.85 | 2751.26 | 2759.71 | 2587.89 | 12125 | 98633 | 6.03 | 2.3567 | 47.88 | 2.83 | 7.10 | 0.12 | 3701939.95 | 462742.49 | 0.55 |
16 | 850111 | 种子生产 | 2019-03-19 | 2776.05 | 2799.75 | 2830.07 | 2727.96 | 11911 | 108091 | 1.76 | 2.3151 | 48.72 | 2.88 | 7.17 | 0.14 | 3752415.20 | 469051.90 | 0.54 |
17 | 850111 | 种子生产 | 2019-03-20 | 2792.86 | 2796.13 | 2858.79 | 2746.40 | 12165 | 90645 | -0.13 | 2.3645 | 48.66 | 2.88 | 7.17 | 0.12 | 3751792.49 | 468974.06 | 0.54 |
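
The query sent to swsindex.com is a SQL-like `where` string assembled from the code list, the date range and the frequency. A minimal sketch of that step as a separate function, which can be checked without any network access, is shown here. The name `build_sw_where` and the digit-only check on codes are assumptions added for illustration; they are not part of the original function.

```python
def build_sw_where(codes, start_date, end_date, frequency='day'):
    """Build the `where` parameter in the format used by get_sw_data."""
    if isinstance(codes, str):
        codes = [codes]
    for c in codes:
        # Assumed safeguard: SW index codes are plain digits.
        if not str(c).isdigit():
            raise ValueError('invalid index code: {!r}'.format(c))
    code_str = ','.join("'{}'".format(c) for c in codes)
    return ("swindexcode in ({}) and BargainDate>='{}' and "
            "BargainDate<='{}' and type='{}'").format(
                code_str, start_date, end_date, frequency)


print(build_sw_where('801020', '2018-04-02', '2018-04-24', 'Day'))
```

Called this way, the helper reproduces the sample `where` string that appears in the original request parameters.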
Author: tinysnowing

    import json
    import time

    import pandas as pd
    import requests


    def get_bnd_yield(year=10):
        # investing.com pair ids for the China 10-, 5- and 1-year government bonds
        ids = {10: '29227', 5: '29234', 1: '29231'}
        url = ('https://cn.investing.com/common/modules/js_instrument_chart/api/data.php'
               '?pair_id={0}&pair_id_for_news={0}'
               '&chart_type=area&pair_interval=month&candle_count=120'
               '&events=yes&volume_series=yes&period=5-years').format(ids[year])
        headers = {
            'X-Requested-With': 'XMLHttpRequest',
            'Host': 'cn.investing.com',
            'Referer': 'https://cn.investing.com/rates-bonds/china-{}-year-bond-yield'.format(year),
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)',
        }
        res = requests.get(url, headers=headers)
        res = json.loads(res.content.decode('utf-8').replace("'", '"'))
        data = pd.DataFrame(res['candles'])
        # Keep the first two columns: timestamp and yield
        data = data.iloc[:, :2]
        data.columns = ['date', 'y' + str(year)]
        # The first 10 digits of the timestamp are taken as epoch seconds
        data['date'] = data['date'].map(
            lambda x: time.strftime('%Y-%m-%d', time.localtime(int(str(x)[:10]))))
        data.set_index('date', inplace=True)
        return data


    def get_bnd_yields(years=(1, 5, 10)):
        # Join the yield series of each maturity side by side on the date index
        bag = pd.DataFrame()
        for yr in years:
            bag = pd.concat([bag, get_bnd_yield(year=yr)], axis=1)
        return bag


    df = get_bnd_yields()
    df

date | y1 | y5 | y10 |
---|---|---|---|
2014-04-01 | 3.650 | 4.160 | 4.330 |
2014-05-01 | 3.360 | 4.010 | 4.160 |
2014-06-01 | 3.370 | 3.860 | 4.060 |
2014-07-01 | 3.763 | 4.031 | 4.298 |
2014-08-01 | 3.799 | 3.998 | 4.248 |
2014-09-01 | 3.767 | 3.931 | 4.028 |
2014-10-01 | 3.406 | 3.565 | 3.786 |
2014-11-01 | 3.070 | 3.413 | 3.546 |
2014-12-01 | 3.263 | 3.538 | 3.648 |
2015-01-01 | 3.151 | 3.409 | 3.514 |
2015-02-01 | 3.077 | 3.257 | 3.379 |
2015-03-01 | 3.190 | 3.479 | 3.623 |
2015-04-01 | 2.869 | 3.288 | 3.422 |
2015-05-01 | 1.960 | 3.253 | 3.591 |
2015-06-01 | 1.767 | 3.247 | 3.629 |
2015-07-01 | 2.246 | 3.178 | 3.474 |
2015-08-01 | 2.307 | 3.165 | 3.394 |
2015-09-01 | 2.360 | 3.087 | 3.276 |
2015-10-01 | 2.379 | 2.903 | 3.087 |
2015-11-01 | 2.599 | 2.926 | 3.088 |
2015-12-01 | 2.329 | 2.713 | 2.862 |
2016-01-01 | 2.393 | 2.789 | 2.909 |
2016-02-01 | 2.279 | 2.655 | 2.909 |
2016-03-01 | 2.163 | 2.531 | 2.886 |
2016-04-01 | 2.218 | 2.769 | 2.946 |
2016-05-01 | 2.338 | 2.767 | 2.995 |
2016-06-01 | 2.390 | 2.700 | 2.875 |
2016-07-01 | 2.240 | 2.606 | 2.805 |
2016-08-01 | 2.150 | 2.594 | 2.805 |
2016-09-01 | 2.185 | 2.565 | 2.769 |
2016-10-01 | 2.190 | 2.480 | 2.744 |
2016-11-01 | 2.300 | 2.765 | 2.943 |
2016-12-01 | 2.751 | 2.883 | 3.066 |
2017-01-01 | 2.770 | 3.037 | 3.363 |
2017-02-01 | 2.783 | 3.000 | 3.358 |
2017-03-01 | 2.885 | 3.085 | 3.310 |
2017-04-01 | 3.160 | 3.347 | 3.477 |
2017-05-01 | 3.475 | 3.653 | 3.670 |
2017-06-01 | 3.453 | 3.502 | 3.578 |
2017-07-01 | 3.428 | 3.574 | 3.629 |
2017-08-01 | 3.428 | 3.635 | 3.675 |
2017-09-01 | 3.460 | 3.630 | 3.638 |
2017-10-01 | 3.583 | 3.963 | 3.916 |
2017-11-01 | 3.700 | 3.876 | 3.917 |
2017-12-01 | 3.803 | 3.860 | 3.915 |
2018-01-01 | 3.583 | 3.845 | 3.944 |
2018-02-01 | 3.313 | 3.759 | 3.857 |
2018-03-01 | 3.350 | 3.690 | 3.778 |
2018-04-01 | 3.007 | 3.175 | 3.653 |
2018-05-01 | 3.185 | 3.452 | 3.646 |
2018-06-01 | 3.210 | 3.410 | 3.543 |
2018-07-01 | 2.893 | 3.227 | 3.533 |
2018-08-01 | 2.836 | 3.386 | 3.600 |
2018-09-01 | 2.990 | 3.470 | 3.655 |
2018-10-01 | 2.837 | 3.364 | 3.533 |
2018-11-01 | 2.645 | 3.168 | 3.398 |
2018-12-01 | 2.575 | 3.014 | 3.270 |
2019-01-01 | 2.415 | 2.923 | 3.130 |
2019-02-01 | 2.409 | 3.033 | 3.208 |
2019-03-01 | 2.445 | 3.040 | 3.148 |
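
The code above turns each candle timestamp into a date with `time.localtime`, so the result depends on the time zone of the machine running it. A sketch of a variant with an explicit time zone is given here. It assumes the timestamps are epoch milliseconds (which the `[:10]` truncation in the original suggests), and the helper name `ms_to_date` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ms_to_date(ms, utc_offset_hours=8):
    """Format epoch milliseconds as YYYY-MM-DD in a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    # Integer division drops the millisecond part, like the [:10] slice.
    return datetime.fromtimestamp(int(ms) // 1000, tz=tz).strftime('%Y-%m-%d')
```

Under that assumption, `data['date'].map(ms_to_date)` could replace the `lambda` in `get_bnd_yield`, and the dates would come out the same on any machine.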