Using weights loaded from a pickle file for GPU inference with LightGBM

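I am currently using LightGBM for predictions in a Kaggle competition. Writing training and inference in the same notebook makes the total run time long, so I created a separate notebook for submission, uploaded the trained models to it, and wrote a program to run inference there. The program is below.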

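import glob
import pickle

import numpy as np
import janestreet
from tqdm.notebook import tqdm

# Note: `features` and `f_mean` are used below but are not defined in this snippet.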

janestreet.competition.make_env.__called__ = False
env = janestreet.make_env()
iter_test = env.iter_test()
files = glob.glob("path to the directory where the pickle files are saved")

for (test_df, sample_prediction_df) in tqdm(iter_test):
    if (test_df.iloc[0].weight > 0):
        test_np = test_df.loc[:, features].values
        if np.isnan(test_np[:, 1:].sum()):
            test_np[:, 1:] = np.nan_to_num(test_np[:, 1:]) + np.isnan(test_np[:, 1:]) * f_mean
            test_df[features] = test_np
        #action = model.predict(xgb.DMatrix(test_df[features]))[0]
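        # Average the predictions of all saved models.
        # Note: every model file is unpickled again for each test batch.
        actions = []
        for file in files:
            with open(file, mode='rb') as f:
                model = pickle.load(f)
            actions.append(model.predict(test_df[features])[0])
        action = sum(actions) / len(actions)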
        if (action > 0.5):
            sample_prediction_df.action = 1
        else:
            sample_prediction_df.action = 0
    else:
        sample_prediction_df.action = 0
    env.predict(sample_prediction_df)
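
One thing that stands out in the loop above is that each model file is opened and unpickled again for every test batch. The following is only a sketch of loading the models once before the loop, not a confirmed fix for the slowdown. `load_models` and `average_prediction` are hypothetical helper names, and the models are assumed to expose a `predict` method that returns one value per row.

```python
import glob
import pickle


def load_models(pattern):
    """Unpickle every file matching `pattern` once and return the loaded objects."""
    models = []
    for path in sorted(glob.glob(pattern)):
        with open(path, mode="rb") as f:
            models.append(pickle.load(f))
    return models


def average_prediction(models, rows):
    """Average the first-row prediction over all models."""
    preds = [model.predict(rows)[0] for model in models]
    return sum(preds) / len(preds)
```

With helpers like these, `models = load_models(...)` would run once before the `for (test_df, sample_prediction_df) in tqdm(iter_test):` loop, and the inner loop over `files` would become `action = average_prediction(models, test_df[features])`.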

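The competition uses janestreet to download the actual data and run inference on it.
When training and inference are done in the same notebook, inference takes about 15 to 30 minutes, but when I save the trained models and run inference in a separate notebook, it takes more than 3 hours.
I suspected that the GPU is not being used and looked into LightGBM and GPUs, but every resource I found does training and inference in the same notebook, so I could not solve the problem.
(When I checked the params of the LightGBM model loaded from the saved weights, they did show device: gpu.)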
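
For reference, the parameter check mentioned above could be done roughly as follows. This is a sketch: `device_setting` is a hypothetical helper, and it assumes the loaded object exposes its parameters either as a `params` dict (as a LightGBM `Booster` does) or through `get_params()` (as the scikit-learn style wrappers do). `device_type` is the LightGBM parameter name and `device` is its alias.

```python
def device_setting(model):
    """Return the stored device setting of a loaded model, or None if absent."""
    # Booster-like objects keep their parameters in a `params` dict.
    params = getattr(model, "params", None)
    # scikit-learn style wrappers expose them through get_params() instead.
    if params is None and hasattr(model, "get_params"):
        params = model.get_params()
    params = params or {}
    for key in ("device_type", "device"):
        if key in params:
            return params[key]
    return None
```

Note that this only reports what is stored in the model's parameters; it does not by itself show which device is actually used at prediction time.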

I would therefore like to know how to solve this.
I am still new to Kaggle and LightGBM, so my basic approach may be wrong in places. Please point out anything like that as well.