Filling in missing values in a PySpark DataFrame (sdf)

Suppose I have a column day that should be a consecutive sequence but has gaps, as shown below (I want it to run from 1 to 10, but 3, 6, 8, 9, and 10 are missing).

import pandas as pd
import numpy as np
df = pd.DataFrame({'day':['null', 1,2,4,5,7],'count':[100,4,9,12,5,10]})
	day	count
0	null	100
1	1	4
2	2	9
3	4	12
4	5	5
5	7	10

How can I fill in the missing sequence values with count=0?
(I would also like to move null to the bottom, but sort and orderBy did not work.)
Sorry for such a basic question; any help would be appreciated.

1 answer

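It seems a DataFrame cannot be sorted on a column that mixes numbers and strings: Python cannot order an int against a str, so sort_values raises a TypeError.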
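To make the sorting problem concrete, here is a minimal sketch using the DataFrame from the question. It first shows the failure, then one possible workaround that is not part of the original answer: passing a `key` to `sort_values` (available in pandas 1.1 and later) that converts the column with `pd.to_numeric(errors='coerce')`, so that 'null' becomes NaN and is placed last by default. The variable names `sort_error` and `sorted_df` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'day': ['null', 1, 2, 4, 5, 7], 'count': [100, 4, 9, 12, 5, 10]})

# 'day' has object dtype because it mixes str and int, so a plain sort fails.
try:
    df.sort_values('day')
    sort_error = None
except TypeError as exc:
    sort_error = exc
print(type(sort_error).__name__)

# Workaround (an assumption, not the answerer's method): sort on a numeric view
# of the column. Non-numeric entries become NaN, which sort_values puts last.
sorted_df = df.sort_values('day', key=lambda s: pd.to_numeric(s, errors='coerce'))
print(sorted_df)
```

This only handles the ordering; it does not fill in the missing days.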

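A brute-force way to do it looks like this: split off the non-integer rows, fill the gaps in the integer rows, then concatenate the two parts.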

python
>>> import pandas as pd
>>>
>>> def split_df(df, column):
...     int_df = df[df[column].apply(lambda x: type(x) == int)]
...     non_int_df = df[df[column].apply(lambda x: type(x) != int)]
...     return int_df, non_int_df
...
>>> def interpolate(df, column, min_n, max_n):
...     return pd.merge(df, pd.DataFrame({column:range(min_n,max_n+1)}), on=column, how='outer').fillna(0).sort_values(column)
...
>>> df = pd.DataFrame({'day':['null', 1,2,4,5,7],'count':[100,4,9,12,5,10]})
>>>
>>> df_int, df_non_int = split_df(df, 'day')
>>> df_result = pd.concat([interpolate(df_int, 'day', 1, 10).astype(int), df_non_int])
>>> print(df_result)
    day  count
0     1      4
1     2      9
5     3      0
2     4     12
3     5      5
6     6      0
4     7     10
7     8      0
8     9      0
9    10      0
0  null    100
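The same split, fill, and concatenate idea can also be sketched with `reindex` instead of an outer merge. This is an alternative I am assuming fits the question, not the answerer's code; the range 1 to 10 is taken from the question, and the names `day_num`, `is_num`, `filled`, and `result` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'day': ['null', 1, 2, 4, 5, 7], 'count': [100, 4, 9, 12, 5, 10]})

# Numeric view of 'day'; anything non-numeric (here 'null') becomes NaN.
day_num = pd.to_numeric(df['day'], errors='coerce')
is_num = day_num.notna()

# Numeric rows: index by day, reindex onto the full 1..10 range,
# and give the days that were missing a count of 0.
filled = (
    df[is_num]
    .assign(day=day_num[is_num].astype(int))
    .set_index('day')
    .reindex(range(1, 11), fill_value=0)
    .rename_axis('day')
    .reset_index()
)

# Non-numeric rows are appended unchanged, so 'null' ends up at the bottom.
result = pd.concat([filled, df[~is_num]], ignore_index=True)
print(result)
```

Because `ignore_index=True` is used, the result gets a fresh 0 to 10 index rather than the shuffled index labels left over from the merge.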