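I want to concatenate DataFrames in a for loop and write them to CSV

Background / what I want to achieve

I am using yahoo_finance_api2 to fetch stock price data for several tickers.

After getting the trading dates (data) and closing prices (price) for each ticker code (codes),
I build one DataFrame per ticker in a for loop.
I want to stack these DataFrames vertically into a single DataFrame and then write it to CSV, but I am stuck on the following three points.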

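① An error message appears
I suspect that code, n_data, and price have different types.
Should I add a line somewhere that converts the type of code before storing it?

② I cannot swap rows and columns
I want the result to look like "Output (target)" below, but I cannot transpose the data.
I think this is related to the error in ①.

③ I do not know how to combine the data inside the for loop
From reading other questions, I understand that pd.concat should be able to do it,
but how should I write it in this case?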

Problem / error message

Error message
/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:30: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray

Output (current)
0
0 9983
1 [2021-09-24 06:15:00]
2 [77100.0]
0
0 7203
1 [2021-09-24 06:15:01]
2 [10100.0]
0
0 9020
1 [2021-09-24 06:15:00]
2 [7136.0]

Output (target)
0 1 2
0 9983 [2021-09-24 06:15:00] [77100.0]
1 7203 [2021-09-24 06:15:01] [10100.0]
2 9020 [2021-09-24 06:15:00] [7136.0]

Relevant source code

python
from datetime import datetime
import pandas as pd
import sys
import numpy as np
from yahoo_finance_api2 import share
from yahoo_finance_api2.exceptions import YahooFinanceError

codes = [9983, 7203, 9020]
S_year = 0
S_month = 3

def kabuka():

    result = []
    for code in codes:
        my_share = share.Share(str(code) + ".T")
        symbol_data = None

        try:
            symbol_data = my_share.get_historical(share.PERIOD_TYPE_YEAR,S_year,share.FREQUENCY_TYPE_MONTH,S_month)

        except YahooFinanceError as e:
            print(e.message)
            sys.exit(1)

        data = symbol_data['timestamp']
        price = symbol_data['close']

        n_data = [datetime.utcfromtimestamp(int(data[i]/1000)) for i in range(len(data))]
        all_data = np.array([code, n_data, price]).T
        all_data = pd.DataFrame(all_data)

        result.append((all_data))

    return result

result = kabuka()

for i in result:
    print(i)

# I want to write the combined DataFrame to CSV
#result.to_csv('test1.csv', mode='w', header=True)
#data = pd.read_csv('test1.csv')

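What I tried

I added the T attribute (.T) to swap the rows and columns of the NumPy ndarray, but nothing was transposed. (Probably because the error has not been resolved.)

Additional information (framework/tool versions, etc.)

This is the data before processing.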
{'timestamp': [1632464100000], 'open': [77150.0], 'high': [77340.0], 'low': [76530.0], 'close': [77100.0], 'volume': [521000]}
/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:32: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
{'timestamp': [1632464101000], 'open': [9985.0], 'high': [10100.0], 'low': [9973.0], 'close': [10100.0], 'volume': [6981600]}
{'timestamp': [1632464100000], 'open': [7158.0], 'high': [7162.0], 'low': [7088.0], 'close': [7136.0], 'volume': [1808400]}

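1 answer

Best answer

The intermediate processing is quite odd and should be fixed, but because the original data was not shown, I have not tried to rewrite it the proper way.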
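
For readers wondering what is odd about it, here is a minimal sketch. It assumes the values shown in the question for ticker 9983 instead of a live API call. `np.array([code, n_data, price])` mixes a scalar with two lists, so NumPy cannot form a 2-D array from it. Older versions emit the VisibleDeprecationWarning quoted in the question, and newer releases refuse unless `dtype=object` is given. Either way the array is 1-D, so `.T` has nothing to swap, and `pd.DataFrame` then turns it into a single column of three rows.

```python
from datetime import datetime

import numpy as np

# Values copied from the question's output for ticker 9983 (no API call here).
code = 9983
n_data = [datetime(2021, 9, 24, 6, 15, 0)]
price = [77100.0]

# A scalar next to two lists is a "ragged" sequence. dtype=object is passed
# explicitly so the sketch behaves the same on old and new NumPy versions.
ragged = np.array([code, n_data, price], dtype=object)

print(ragged.shape)    # (3,)
print(ragged.T.shape)  # (3,)  transposing a 1-D array is a no-op
```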

Here is how to get the desired output by post-processing result.

python
>>> for i in result:
...     print(i)
...
                       0
0                   9983
1  [2021-09-24 06:15:00]
2              [77100.0]
                       0
0                   7203
1  [2021-09-24 06:15:01]
2              [10100.0]
                       0
0                   9020
1  [2021-09-24 06:15:00]
2               [7136.0]
>>> modified_result = pd.concat(result, axis=1).T.reset_index(drop=True)
>>> modified_result[0] = modified_result[0].astype(np.int64)
>>> modified_result[1] = modified_result[1].apply(lambda x: x[0])
>>> modified_result[2] = modified_result[2].apply(lambda x: x[0])
>>> print(modified_result)
      0                   1        2
0  9983 2021-09-24 06:15:00  77100.0
1  7203 2021-09-24 06:15:01  10100.0
2  9020 2021-09-24 06:15:00   7136.0
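
As an alternative to repairing result after the fact, the loop itself could build a regular table, which also addresses point ③ of the question. The sketch below is not the asker's code. It assumes each response looks like the raw dicts shown in the question, uses a hypothetical `responses` dict in place of live `get_historical` calls, and swaps in `pd.to_datetime(..., unit="ms")` for `datetime.utcfromtimestamp`.

```python
import pandas as pd

# Sample responses copied from the question; in the real script each dict
# would come from my_share.get_historical(...). The name is hypothetical.
responses = {
    9983: {"timestamp": [1632464100000], "close": [77100.0]},
    7203: {"timestamp": [1632464101000], "close": [10100.0]},
    9020: {"timestamp": [1632464100000], "close": [7136.0]},
}

frames = []
for code, symbol_data in responses.items():
    frame = pd.DataFrame({
        # Millisecond epoch values -> timestamps.
        "timestamp": pd.to_datetime(symbol_data["timestamp"], unit="ms"),
        "close": symbol_data["close"],
    })
    # Inserting a scalar broadcasts the ticker code to every row.
    frame.insert(0, "code", code)
    frames.append(frame)

# Stack the per-ticker frames vertically and renumber the rows from 0.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Because every column is built from equal-length lists and the code is broadcast, no ragged array is created, and the same loop should also cope with responses that contain more than one row per ticker.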

Addendum

First, reshape the raw data into the following form.

python
indata = {9983:{'timestamp': [1632464100000], 'open': [77150.0], 'high': [77340.0], 'low': [76530.0], 'close': [77100.0], 'volume': [521000]},
          7203:{'timestamp': [1632464101000], 'open': [9985.0], 'high': [10100.0], 'low': [9973.0], 'close': [10100.0], 'volume': [6981600]},
          9020:{'timestamp': [1632464100000], 'open': [7158.0], 'high': [7162.0], 'low': [7088.0], 'close': [7136.0], 'volume': [1808400]}}

Then process it as follows.

python
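# Ticker codes become the index; every cell holds a one-element list
df = pd.DataFrame(indata).T
# Take the first timestamp (milliseconds) and convert it to a datetime
result = pd.DataFrame(df['timestamp'].apply(lambda ts: datetime.utcfromtimestamp(ts[0]//1000)))
# Unwrap the one-element list of closing prices
result['close'] = df['close'].apply(lambda x: x[0])
result.index.name = 'code'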

Running this produces the following DataFrames.

python
>>> print(df)
            timestamp       open       high        low      close     volume
code
9983  [1632464100000]  [77150.0]  [77340.0]  [76530.0]  [77100.0]   [521000]
7203  [1632464101000]   [9985.0]  [10100.0]   [9973.0]  [10100.0]  [6981600]
9020  [1632464100000]   [7158.0]   [7162.0]   [7088.0]   [7136.0]  [1808400]
>>> print(result)
               timestamp    close
code
9983 2021-09-24 06:15:00  77100.0
7203 2021-09-24 06:15:01  10100.0
9020 2021-09-24 06:15:00   7136.0

Here code is the index; if you want it as a column, call reset_index.

python
>>> print(result.reset_index())
   code           timestamp    close
0  9983 2021-09-24 06:15:00  77100.0
1  7203 2021-09-24 06:15:01  10100.0
2  9020 2021-09-24 06:15:00   7136.0
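
The question's final goal, writing the table to CSV, is not covered above. Here is a minimal sketch, assuming the three-column table shown just above. It is rebuilt by hand as a hypothetical `table` so that the block runs on its own, and it writes to an in-memory buffer where a real script would use a file path.

```python
import io

import pandas as pd

# Stand-in for result.reset_index() from the answer above.
table = pd.DataFrame({
    "code": [9983, 7203, 9020],
    "timestamp": pd.to_datetime(
        ["2021-09-24 06:15:00", "2021-09-24 06:15:01", "2021-09-24 06:15:00"]
    ),
    "close": [77100.0, 10100.0, 7136.0],
})

# An in-memory buffer keeps the sketch free of side effects; pass a path such
# as "test1.csv" instead to write a real file.
buffer = io.StringIO()
table.to_csv(buffer, index=False)  # index=False omits the row labels

# Read it back to check the round trip; parse_dates restores the datetimes.
buffer.seek(0)
loaded = pd.read_csv(buffer, parse_dates=["timestamp"])
print(loaded)
```

Passing `index=False` is a choice that fits a table where code is already a column. If code is kept as the index, as in result before reset_index, the default `index=True` would write it as the first column.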