Data still looks wrong even after `df.reset_index(drop=False)`?

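pivot_table gives me roughly the shape I want, but the column names are odd.
I want to get rid of the blank cells that show up on export and end up with a csv-like table.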

Data before reshaping (test)

category	cnt	dept	item	ymd
A	12	100100	100	2
B	3	100200	200	2
A	4	100100	200	3
C	30	100200	300	4

Expected

Move the values of category {A, B, C} into columns and store cnt in them.
Replace NaN with 0.

ymd	dept	item	A	B	C
2	100100	100	12	0	0
2	100200	200	0	3	0
3	100100	200	4	0	0
4	100200	300	0	0	30

python
ymd     4 non-null object
item    4 non-null object
A       4 non-null float64
B       4 non-null float64
C       4 non-null float64

Actual

Something is misaligned.
Are the column names wrong?

	ymd	item	cnt	cnt	cnt
category			A	B	C
0	2	100	12	0	0
1	2	200	0	3	0
2	3	200	4	0	0
3	4	300	0	0	30

python
(ymd, )     4 non-null object
(item, )    4 non-null object
(cnt, A)    4 non-null float64
(cnt, B)    4 non-null float64
(cnt, C)    4 non-null float64
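
The tuple-shaped names such as `(cnt, A)` are what pandas shows when the columns are a two-level MultiIndex. A minimal sketch of where the extra level likely comes from, assuming the same sample data as in the question: passing `values` as a list (`['cnt']`) keeps `cnt` as an outer column level, while passing the scalar `'cnt'` does not. The variable names `nested` and `flat` are made up for this illustration.

```python
import pandas as pd

# Same sample data as in the question
test = pd.DataFrame({'ymd': ['2', '2', '3', '4'],
                     'category': ['A', 'B', 'A', 'C'],
                     'dept': ['100100', '100200', '100100', '100200'],
                     'item': ['100', '200', '200', '300'],
                     'cnt': [12, 3, 4, 30]})

# values given as a list: the columns become a two-level MultiIndex
nested = test.pivot_table(values=['cnt'], index=['ymd', 'item'],
                          columns='category', aggfunc='sum')

# values given as a scalar: the columns stay a single-level Index
flat = test.pivot_table(values='cnt', index=['ymd', 'item'],
                        columns='category', aggfunc='sum')

print(nested.columns.tolist())
print(flat.columns.tolist())
```

With the scalar form, the column labels are plain strings, so the header row of an export has a single line.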

Source

python
import pandas as pd

# Build the sample data
test = pd.DataFrame(data={'ymd':['2','2','3','4'],
                          'category':['A','B','A','C'],
                          'dept':['100100','100200','100100','100200'],
                          'item':['100','200','200','300'],
                          'cnt':[12,3,4,30]
                         })
# Aggregate with group by
test2=test.groupby(['ymd','category','dept','item'])['cnt'].sum()

# Convert to a DataFrame and remove the MultiIndex
test3 = pd.DataFrame(test2).reset_index(drop=False)

# Move category into the columns and replace NaN with 0
test4=test.pivot_table(values=['cnt'],
                 index=['ymd','item'],
                 columns='category',
                 aggfunc='sum').fillna(0)

# Remove the MultiIndex
test4.reset_index(drop=True)
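
As a side note on the last step, here is a small sketch of what the `drop` argument of `reset_index` changes. It assumes the pivot is built with the scalar `values='cnt'` (an assumption made for this illustration, not the code as posted) so the columns stay single-level. The names `pivoted`, `kept` and `dropped` are hypothetical.

```python
import pandas as pd

test = pd.DataFrame({'ymd': ['2', '2', '3', '4'],
                     'category': ['A', 'B', 'A', 'C'],
                     'dept': ['100100', '100200', '100100', '100200'],
                     'item': ['100', '200', '200', '300'],
                     'cnt': [12, 3, 4, 30]})

pivoted = test.pivot_table(values='cnt', index=['ymd', 'item'],
                           columns='category', aggfunc='sum').fillna(0)

# drop=False (the default) turns the index levels back into ordinary columns
kept = pivoted.reset_index(drop=False)

# drop=True discards the index levels, so ymd and item are lost
dropped = pivoted.reset_index(drop=True)

# reset_index returns a new DataFrame; pivoted itself is unchanged
print(kept.columns.tolist())
print(dropped.columns.tolist())
print(pivoted.index.names)
```

Because the call does not modify the frame in place, its result has to be assigned to a variable (or `inplace=True` passed) for it to have any effect.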


2 answers

Since nothing in the sample actually needs aggregating, another option is to fill in one column per category value mechanically.

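python
test2 = test.copy()
# Build a unique index by concatenating dept and item
test2.index = test2[['dept', 'item']].apply(
    lambda x: '{}{}'.format(*x), axis=1)

# Add one column per category, holding cnt only on the matching rows
for category in test2['category'].unique():
    test2[category] = test2.loc[test2['category'] == category, 'cnt']

# Restore a default index, drop the helper columns, and replace NaN with 0
test2.reset_index(inplace=True)
test2.drop(['index', 'category', 'cnt'], axis=1, inplace=True)
test2.fillna(0, inplace=True)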

Posted 2017/09/25 02:11

driller

Total score 720

tf23yh8df3

2017/09/25 02:47

Thank you. I confirmed that this gives the result I expected. My real data is aggregated by ymd, item and dept. Why do you use dept and item as the index?

driller

2017/09/25 03:24

I wanted a unique index for assigning the data, so I concatenated dept and item and used that as the index, but a MultiIndex would work just as well.
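
A rough sketch of the MultiIndex variant mentioned here, assuming the (dept, item) pairs are unique as they are in the sample data. It uses `set_index` instead of the concatenated string index; the variable name `wide` is made up for this illustration.

```python
import pandas as pd

test = pd.DataFrame({'ymd': ['2', '2', '3', '4'],
                     'category': ['A', 'B', 'A', 'C'],
                     'dept': ['100100', '100200', '100100', '100200'],
                     'item': ['100', '200', '200', '300'],
                     'cnt': [12, 3, 4, 30]})

# Use (dept, item) as a two-level index; assumed unique for this data
wide = test.set_index(['dept', 'item'])

# One column per category; rows of other categories are left as NaN
for category in wide['category'].unique():
    wide[category] = wide.loc[wide['category'] == category, 'cnt']

# Drop the helper columns, replace NaN with 0, and restore dept/item as columns
wide = wide.drop(columns=['category', 'cnt']).fillna(0).reset_index()
print(wide)
```

If the index pairs were not unique, the assignment inside the loop could no longer align rows reliably, so this approach depends on that assumption.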

tf23yh8df3

2017/09/25 04:46

I see. Since test.cnt held the data for columns A through C, another option was to concatenate it with the columns I needed.


Self-resolved

Solution

python
test5 = test.cnt
pd.concat([test.ymd, test.item, test5], axis=1)
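
For comparison, a minimal sketch of one way to reach the ideal table from the question directly with `pivot_table`. This is not the method posted above. It assumes a scalar `values='cnt'`, the `fill_value` argument in place of `fillna`, and `dept` added to the index so it appears in the output; `result` and `csv_text` are hypothetical names.

```python
import pandas as pd

test = pd.DataFrame({'ymd': ['2', '2', '3', '4'],
                     'category': ['A', 'B', 'A', 'C'],
                     'dept': ['100100', '100200', '100100', '100200'],
                     'item': ['100', '200', '200', '300'],
                     'cnt': [12, 3, 4, 30]})

result = (
    test.pivot_table(values='cnt',                   # scalar, so columns stay single-level
                     index=['ymd', 'dept', 'item'],
                     columns='category',
                     aggfunc='sum',
                     fill_value=0)                   # missing combinations become 0
        .reset_index()                               # ymd, dept, item back to ordinary columns
)

# Remove the leftover "category" label on the column axis
result.columns.name = None

# index=False keeps the row numbers out of the exported text
csv_text = result.to_csv(index=False)
print(csv_text)
```

The header of the exported text is then a single row, without the blank cells that the two-level columns produced.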

Posted 2017/09/25 04:44

tf23yh8df3

Total score 60

How to convert a pandas MultiIndex into a csv-like table
