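I used fillna (in the function below) to fill in missing values in a DataFrame, but I am not getting the result I expected and I am stuck. I would appreciate it if you could tell me why the values are not being filled.
When I check unique() on the Series after fillna, nan is gone, but the values have not actually been filled (the "NA" I supposedly filled in does not appear). Also, comparing isnull().sum() before and after fillna shows that the number of nulls has not changed.
---- Imputation function ----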
def small_missing_values(train,test):
    for i in [train,test]:
        for col in ['GarageType', 'GarageQual','GarageFinish','GarageCond',"GarageYrBlt"]:
            i.ix[i["GarageArea"] == 0][col].fillna("NA", inplace = True)
            #train.ix[train["GarageArea"] == 0]["GarageType"].fillna("NA")
            if i[col].isnull().sum() > 0:
                i[col].drop(i[i[col].isnull()].index, inplace = True)
        for col in [ 'BsmtQual','BsmtFinType1','BsmtFinType2','BsmtCond','BsmtExposure']:
            if col != "BsmtFinType2" :
                i.ix[i["BsmtFinSF1"] == 0][col].fillna("NA", inplace = True)
                if i[col].isnull().sum() > 0:
                    i[col].drop(i[i[col].isnull()].index , inplace = True)
            else:
                i.ix[i["BsmtFinSF2"] == 0][col].fillna("NA", inplace = True)
                if i[col].isnull().sum() > 0:
                    i[col].drop(i[i[col].isnull()].index, inplace = True)
        i.ix[i["MasVnrType"].isnull()]["MasVnrArea"].fillna(0, inplace = True)
        i.ix[i["MasVnrArea"] == 0]["MasVnrType"].fillna("None", inplace = True)
    return train,test
"""
---- Before imputation ----
[Input]
print(train.isnull().sum(), "\n")
print(train["GarageType"].unique())
[Output]
Id 0
MSSubClass 0
MSZoning 0
LotFrontage 259
LotArea 0
Street 0
LotShape 0
LandContour 0
Utilities 0
LotConfig 0
LandSlope 0
Neighborhood 0
Condition1 0
Condition2 0
BldgType 0
HouseStyle 0
OverallQual 0
OverallCond 0
YearBuilt 0
YearRemodAdd 0
RoofStyle 0
RoofMatl 0
Exterior1st 0
Exterior2nd 0
MasVnrType 8
MasVnrArea 8
ExterQual 0
ExterCond 0
Foundation 0
BsmtQual 37
...
FullBath 0
HalfBath 0
BedroomAbvGr 0
KitchenAbvGr 0
KitchenQual 0
TotRmsAbvGrd 0
Functional 0
Fireplaces 0
FireplaceQu 0
GarageType 81
GarageYrBlt 81
GarageFinish 81
GarageCars 0
GarageArea 0
GarageQual 81
GarageCond 81
PavedDrive 0
WoodDeckSF 0
OpenPorchSF 0
EnclosedPorch 0
3SsnPorch 0
ScreenPorch 0
PoolArea 0
PoolQC 0
MiscVal 0
MoSold 0
YrSold 0
SaleType 0
SaleCondition 0
SalePrice 0
Length: 78, dtype: int64
['Attchd' 'Detchd' 'BuiltIn' 'CarPort' nan 'Basment' '2Types']
---- After imputation ----
[Input]
print(train.isnull().sum(), "\n")
print(train["GarageType"].unique())
[Output]
Id 0
MSSubClass 0
MSZoning 0
LotFrontage 259
LotArea 0
Street 0
LotShape 0
LandContour 0
Utilities 0
LotConfig 0
LandSlope 0
Neighborhood 0
Condition1 0
Condition2 0
BldgType 0
HouseStyle 0
OverallQual 0
OverallCond 0
YearBuilt 0
YearRemodAdd 0
RoofStyle 0
RoofMatl 0
Exterior1st 0
Exterior2nd 0
MasVnrType 8
MasVnrArea 8
ExterQual 0
ExterCond 0
Foundation 0
BsmtQual 37
...
FullBath 0
HalfBath 0
BedroomAbvGr 0
KitchenAbvGr 0
KitchenQual 0
TotRmsAbvGrd 0
Functional 0
Fireplaces 0
FireplaceQu 0
GarageType 81
GarageYrBlt 81
GarageFinish 81
GarageCars 0
GarageArea 0
GarageQual 81
GarageCond 81
PavedDrive 0
WoodDeckSF 0
OpenPorchSF 0
EnclosedPorch 0
3SsnPorch 0
ScreenPorch 0
PoolArea 0
PoolQC 0
MiscVal 0
MoSold 0
YrSold 0
SaleType 0
SaleCondition 0
SalePrice 0
Length: 78, dtype: int64
['Attchd' 'Detchd' 'BuiltIn' 'CarPort' 'Basment' '2Types']
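
A possible explanation, offered as a hedged sketch rather than a confirmed diagnosis: selecting rows with a boolean mask returns a new object, so a chained `...[mask][col].fillna(..., inplace=True)` fills a temporary copy and leaves the original DataFrame untouched. Likewise, `i[col].drop(...)` acts on a Series, not on the rows of the DataFrame. The toy frame, the variable names (`df`, `mask`, `cleaned`) and the values below are hypothetical, and `.loc` is used because `.ix` was removed in pandas 1.0.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real train/test frames.
df = pd.DataFrame({
    "GarageArea": [0, 0, 480, 300],
    "GarageType": [np.nan, np.nan, "Attchd", np.nan],
})

# Chained form, as in the question. df[...] with a boolean mask returns a
# copy, so the in-place fill lands on that temporary object, not on df.
# (Recent pandas versions may emit a warning on this line.)
df[df["GarageArea"] == 0]["GarageType"].fillna("NA", inplace=True)
nulls_after_chained = int(df["GarageType"].isnull().sum())

# Single .loc assignment: pick rows and column in one step and assign back.
mask = (df["GarageArea"] == 0) & df["GarageType"].isnull()
df.loc[mask, "GarageType"] = "NA"
nulls_after_loc = int(df["GarageType"].isnull().sum())

# Drop the rows that are still missing from the DataFrame itself,
# instead of calling drop() on the column Series.
cleaned = df.dropna(subset=["GarageType"])

print(nulls_after_chained, nulls_after_loc, len(cleaned))
```

Because `dropna` returns a new DataFrame here, a function built this way would need to rebind and return the cleaned frames (for example `train = train.dropna(subset=[col])`) rather than rely on in-place mutation inside the loop.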
2018/04/02 14:57