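Error when zero-padding data in a DataFrame

Background / what I want to achieve

I am trying to zero-pad the data in a DataFrame.
About zero padding
I want right-aligned zero fill (padding on the left with zeros). Specifically, I want to achieve the following.

Input data (contains no values with four or more digits)

Output

Problem / error message

I can't work out what causes the error below or how to fix it.

Error message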

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-37-6146b0b60791> in <module>()
      1 s_zero = []
      2 for i in range(len(df[4])):
----> 3     s_zero.append(str(df[4][i]).zfill(4))
      4 print(s_zero)

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/core/series.py in __getitem__(self, key)
    599         key = com._apply_if_callable(key, self)
    600         try:
--> 601             result = self.index.get_value(self, key)
    602 
    603             if not is_scalar(result):

~/.pyenv/versions/anaconda3-5.0.1/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
   2475         try:
   2476             return self._engine.get_value(s, k,
-> 2477                                           tz=getattr(series.dtype, 'tz', None))
   2478         except KeyError as e1:
   2479             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 0

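Relevant source code

The DataFrame has multiple columns, and I want to zero-pad the data in the fifth column, df[4]. I ran the code below step by step: checking the dtype, padding one arbitrary row, then padding the whole column.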

python
import pandas as pd
print(df[4].dtypes)  # check the dtype

s_zero = str(df[4][10000]).zfill(4)  # check on an arbitrary row of the column; the value is 295
print(s_zero)

# apply to the whole column
s_zero = []
for i in range(len(df[4])):
    s_zero.append(str(df[4][i]).zfill(4))
print(s_zero)

Output

int64

0295

Error message (the traceback shown above)

What I tried

With a simple example like the following, I get the result I want.

python
import pandas as pd
s = pd.Series([1100,32,1515,9,72,6011,4567])
print(s.dtypes)

s_zero = []
for i in range(len(s)):
    s_zero.append(str(s[i]).zfill(4))
print(s_zero)

Output

dtype('int64')

['1100', '0032', '1515', '0009', '0072', '6011', '4567']

Addendum

df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 107883 entries, 1 to 107897
Data columns (total 15 columns):
0     107883 non-null object
1     107883 non-null datetime64[ns]
2     107883 non-null object
3     107883 non-null object
4     107883 non-null int64
5     107883 non-null object
6     107883 non-null object
7     28646 non-null object
8     104828 non-null object
9     104892 non-null object
10    107883 non-null int64
11    107883 non-null object
12    107600 non-null datetime64[ns]
13    107883 non-null int64
14    107883 non-null int64
dtypes: datetime64[ns](2), int64(4), object(9)
memory usage: 18.2+ MB

Supplementary information (framework/tool versions, etc.)

Python 3.6.0 :: Anaconda 4.3.0

tachikoma

2018/08/18 05:13

Please post the output of df.info().

Deactivated user

2018/08/18 05:46

Thank you for your question. I have added the output of df.info() in the addendum.

1 answer

Best answer

The index looks a little off. It may work if you stop accessing the elements by index.

Python
# apply to the whole column
s_zero = []
for value in df[4]:  # iterate over the values directly, without index lookups
    s_zero.append(str(value).zfill(4))
print(s_zero)
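
As a possible alternative to the loop, the same padding can be sketched with pandas string methods, which never look rows up by index label. The small `df` below is a hypothetical stand-in for the DataFrame in the question (integer column label 4, index starting at 1 with gaps), not the real data.

```python
import pandas as pd

# Hypothetical stand-in for the question's DataFrame:
# integer column label 4 and a 1-based index with missing numbers.
df = pd.DataFrame({4: [295, 32, 9, 1515]}, index=[1, 2, 5, 7])

# Convert the int64 column to strings, then left-pad each value with zeros to width 4.
s_zero = df[4].astype(str).str.zfill(4)
print(s_zero.tolist())  # ['0295', '0032', '0009', '1515']
```

The result is a Series that keeps the original index, so it can be assigned back with `df[4] = s_zero` if the padded strings should replace the column.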

Posted 2018/08/18 07:02

tachikoma

Total score: 3601

Deactivated user

2018/08/18 07:42

Thank you for your answer. That solved it. Why was accessing by index a problem?

tachikoma

2018/08/18 07:53

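The info output shows Int64Index: 107883 entries, 1 to 107897. The index starts at 1, and it also runs up to a number 14 larger than the number of elements. I suspect the index column you specified when reading the CSV (or similar) has missing numbers. Because df[4][i] looks rows up by index label rather than by position, i = 0 matches no label, which is the KeyError: 0.

Deactivated user

2018/08/18 08:36

Thank you very much.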
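
To illustrate, here is a minimal sketch with a hypothetical three-element Series whose index starts at 1 and skips a number, similar in shape to the index reported by df.info(). It shows the failing label lookup and two ways to keep an index-based loop, assuming positional access is what was intended: `.iloc`, or rebuilding a 0-based index with `reset_index(drop=True)`.

```python
import pandas as pd

# Hypothetical Series: index starts at 1 and has a gap (3 is missing).
s = pd.Series([295, 32, 9], index=[1, 2, 4])

# s[0] searches for the *label* 0, which does not exist in this index.
try:
    s[0]
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# Option 1: access by position with .iloc, regardless of the index labels.
by_position = [str(s.iloc[i]).zfill(4) for i in range(len(s))]

# Option 2: discard the old labels so the index becomes 0, 1, 2, ...
s_reset = s.reset_index(drop=True)
by_label = [str(s_reset[i]).zfill(4) for i in range(len(s_reset))]

print(by_position)  # ['0295', '0032', '0009']
print(by_label)     # ['0295', '0032', '0009']
```

This is also consistent with df[4][10000] working in the question: a row with label 10000 exists in that index, while a row with label 0 does not.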
