Reading the data in a CSV file (for machine learning) correctly

I prepared a CSV file for machine learning myself and want to load its data properly. I have been looking into how to do this and tried the following code.

import pandas as pd
df = pd.read_csv('emotion3.csv')

# The label printed below reads "Checking the dataset keys (feature names)"
print("データセットのキー（特徴量名）の確認==>:\n", df.keys())
#print('Rows and columns of the dataframe==>\n', df.shape)

# Check the non-null count and the dtype of each dataframe column
#df.info()

# Extract the elements that are not numeric
# ('特徴量名を入れる' is a placeholder meaning "put the feature name here")
objectlist = df[['特徴量名を入れる']][df['特徴量名を入れる'].apply(lambda s:pd.to_numeric(s, errors='coerce')).isnull()]
#objectlist

import sklearn
facemotion = sklearn.utils.Bunch()

# Use 'Score' (the happiness score) as the target variable 'target'
facemotion['target'] = df['Score']

# Put the explanatory variables in 'data'
facemotion['data'] = df.loc[:, ['GDP per capita',
       'Social support', 'Healthy life expectancy',
       'Freedom to make life choices', 'Generosity',
       'Perceptions of corruption']]

# Storing the feature names as well is handy for plot legends and the like (optional)
facemotion['feature_names'] = ['GDP per capita',
       'Social support', 'Healthy life expectancy',
       'Freedom to make life choices', 'Generosity',
       'Perceptions of corruption']

# Split into a training set and a test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    facemotion['data'], facemotion['target'], random_state=0)

print("X_train shape:", X_train.shape)
print("X_test shape:", X_test.shape)

When I ran this code, it printed:

データセットのキー（特徴量名）の確認==>:
Index(['Unnamed: 0', '{'face_token': '8fe676a25b8b7fabde0143c41a5857e9', 'face_rectangle': {'top': 64, 'left': 15, 'width': 215, 'height': 215}, 'landmark': {'contour_chin': {'x': 121, 'y': 278}, 'contour_left1': {'x': 18, 'y': 99}, 'contour_left2': {'x': 18, 'y': 124}, 'contour_left3': {'x': 19, 'y': 149}, 'contour_left4': {'x': 23, 'y': 173}, 'contour_left5': {'x': 29, 'y': 197}, 'contour_left6': {'x': 39, 'y': 220}, 'contour_left7': {'x': 53, 'y': 240}, 'contour_left8': {'x': 70, 'y': 258}, 'contour_left9': {'x': 92, 'y': 272}, 'contour_right1': {'x': 233, 'y': 110}, 'contour_right2': {'x': 231, 'y': 134}, 'contour_right3': {'x': 227, 'y': 157}, 'contour_right4': {'x': 221, 'y': 181}, 'contour_right5': {'x': 214, 'y': 204}, 'contour_right6': {'x': 204, 'y': 226}, 'contour_right7': {'x': 189, 'y': 245}, 'contour_right8': {'x': 170, 'y': 261}, 'contour_right9': {'x': 148, 'y': 273}, 'left_eye_bottom': {'x': 75, 'y': 109}, 'left_eye_center': {'x': 77, 'y': 103}, 'left_eye_left_corner': {'x': 55, 'y': 103}, 'left_eye_lower_left_quarter': {'x': 64, 'y': 107}, 'left_eye_lower_right_quarter': {'x': 88, 'y': 109},

That was the output, but as explanatory variables I only want face_rectangle and landmark. How should I specify the columns so that only this information is retrieved?

Thank you in advance.

Error message

KeyError: "None of [Index(['特徴量名を入れる'], dtype='object')] are in the [columns]"
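The KeyError above says that no column is literally named '特徴量名を入れる': the placeholder was passed to pandas unchanged. As a minimal sketch of the same non-numeric check with a real column name, assuming a toy DataFrame and a hypothetical column called `width` (neither comes from the actual file):

```python
import pandas as pd

# Hypothetical toy frame; "width" stands in for a real feature column.
df = pd.DataFrame({"width": ["215", "abc", "180"]})

col = "width"  # replace with a name that actually appears in df.columns

# Fail early with a readable message if the column does not exist.
if col not in df.columns:
    raise KeyError(f"{col!r} is not a column; available: {list(df.columns)}")

# Keep the rows whose value cannot be parsed as a number.
non_numeric = df[pd.to_numeric(df[col], errors="coerce").isnull()]
print(non_numeric)
```

Printing `df.columns` first, as the question's code already does, is the quickest way to see which names are valid.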

t_obara

2021/03/30 09:48

Judging from the screen capture you posted, isn't this a list of JSON objects rather than CSV?

Kokku

2021/03/30 09:54

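The documentation of the API I used said that the results are output as a JSON file, so I used .to_csv to give the file a csv extension. Is that alone not enough? Also, how can I convert a JSON-format file into a CSV-format file?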


1 answer

Best answer

"The documentation of the API I used said that the results are output as a JSON file"

In that case, you should read it as JSON.
https://note.nkmk.me/python-pandas-read-json/
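As a rough sketch of what reading the data as JSON could look like: the record below is a shortened, hypothetical stand-in built from the values visible in the question's output, and the real response layout may differ. `pd.json_normalize` flattens the nested dictionaries into dotted column names, after which the face_rectangle and landmark columns can be picked out by prefix.

```python
import json
import pandas as pd

# Hypothetical, shortened stand-in for the API response.
# The real data has many more landmark points and may be nested differently.
raw = """
[
  {
    "face_token": "8fe676a25b8b7fabde0143c41a5857e9",
    "face_rectangle": {"top": 64, "left": 15, "width": 215, "height": 215},
    "landmark": {
      "contour_chin": {"x": 121, "y": 278},
      "contour_left1": {"x": 18, "y": 99}
    }
  }
]
"""

# With a file on disk, json.load(f) on the opened file would replace this line.
records = json.loads(raw)

# Flatten nested dicts into columns such as "landmark.contour_chin.x".
flat = pd.json_normalize(records)

# Keep only the face_rectangle and landmark columns as explanatory variables.
feature_cols = [c for c in flat.columns
                if c.startswith(("face_rectangle.", "landmark."))]
features = flat[feature_cols]
print(features.columns.tolist())

# Only the flattened table is written out as CSV; without a path,
# to_csv returns the CSV text instead of writing a file.
csv_text = features.to_csv(index=False)
```

Calling `.to_csv` on the flattened frame gives one column per value, whereas renaming or dumping the raw nested response leaves the whole dictionary inside a single cell or header, which matches the output shown in the question.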

Posted 2021/03/30 11:13

t_obara

Overall score: 5488
