

Computing the average of a CSV file in Python

Answer edit history

1

Update

2021/11/10 21:29


answer CHANGED

@@ -11,4 +11,52 @@
 2,7.00,1.00,2.00
 3,3.33,5.67,6.67
 7,6.00,5.00,5.00
+```
+**Addendum**
+> What should I do when the data contains strings?
+```python
+import numpy as np
+# load: four integer columns plus a string column of up to 10 characters
+cols = ('num', 'a', 'b', 'c', 'name')
+tbl = np.loadtxt(
+    'data.csv', delimiter=',', skiprows=1,
+    dtype={'names': cols,
+           'formats': (*(np.int64,)*4, (np.str_, 10))})
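+# mean: average a, b, c within each num group
+# (assumes every row of a given num carries the same name)
+nums = np.unique(tbl['num'])
+names = [tbl['name'][tbl['num'] == i][0] for i in nums]
+tbl = np.array([tbl[n] for n in cols[:-1]]).T
+result = np.array([tbl[tbl[:,0]==i].mean(axis=0) for i in nums])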
+# save
+fmt = ['{:.0f}'] + ['{:.2f}']*3
+result = [[fmt[m].format(i) for m, i in enumerate(l)] + [names[n]] for n, l in enumerate(result.tolist())]
+# comments='' stops np.savetxt from prefixing the header line with '# '
+np.savetxt('result.csv', result, delimiter=',', header=','.join(cols), fmt='%s', comments='')
+# result.csv
+# num,a,b,c,name
+# 1,3.00,3.00,6.00,aa
+# 2,7.00,1.00,2.00,bb
+# 3,3.33,5.67,6.67,cc
+# 7,6.00,5.00,5.00,dd
+```
+On the other hand, this is simple with pandas.
+```python
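+import pandas as pd
+df = pd.read_csv('data.csv')
+# Average only the numeric columns; including the string column raises an error in pandas 2.0 and later
+means = df.groupby('num').mean(numeric_only=True)
+# Take each group's name (assumes one name per num) so the join aligns on num
+dfx = means.join(df.groupby('num')['name'].first())
+dfx.to_csv('result.csv', float_format='%.2f')
+# result.csv
+# num,a,b,c,name
+# 1,3.00,3.00,6.00,aa
+# 2,7.00,1.00,2.00,bb
+# 3,3.33,5.67,6.67,cc
+# 7,6.00,5.00,5.00,dd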
 ```
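
To try the pandas approach without a `data.csv` on disk, here is a minimal self-contained sketch. The sample rows are invented for illustration (the original `data.csv` is not shown in this edit), and the helper name `grouped_means` is hypothetical. It assumes each `num` value maps to a single `name`, and uses named aggregation so the numeric columns are averaged while the first `name` in each group is kept.

```python
import io

import pandas as pd

# Hypothetical sample data, made up for this sketch; not the original data.csv.
SAMPLE_CSV = """num,a,b,c,name
1,2,4,5,aa
1,4,2,7,aa
2,7,1,2,bb
3,1,5,6,cc
3,4,6,7,cc
3,5,6,7,cc
7,6,5,5,dd
"""


def grouped_means(csv_text: str) -> pd.DataFrame:
    """Average a, b, c per num and keep the first name seen in each group."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Named aggregation: output column = (input column, aggregation function).
    return df.groupby('num').agg(
        a=('a', 'mean'),
        b=('b', 'mean'),
        c=('c', 'mean'),
        name=('name', 'first'),
    )


result = grouped_means(SAMPLE_CSV)
# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_out = result.to_csv(float_format='%.2f')
print(csv_out)
```

With this made-up input, the printed CSV has the same rows as the `result.csv` listing above. Because the name is aggregated inside the same `groupby`, it stays aligned with the group key rather than with the row positions of the input.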