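The following code, which uses the pandas `query` method, raises an error on Google Colab.
Could you tell me how to fix it?
・PC used: Mac
・What I have tried:
Ran it on a different PC (Windows 10 ⇨ works correctly)
Reinstalled pandas, numexpr, and Anaconda
Compared the pandas version with the Windows 10 machine ⇨ same version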
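The version check in the list above compares pandas across machines. A related check that may be worth running inside the Colab session itself is whether the pandas already loaded in memory matches the one installed on disk after `pip install -U`, since a running interpreter keeps the modules it has already imported. The following is a minimal sketch under that assumption, not a confirmed diagnosis; `loaded_vs_installed` is a hypothetical helper name, and on Python 3.7 it assumes the `importlib_metadata` backport is available.

```python
import sys

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Python 3.7: assumes the importlib_metadata backport is installed
    import importlib_metadata as metadata


def loaded_vs_installed(dist_name, module_name=None):
    """Return (version loaded in memory, version installed on disk).

    Either value is None when the module has not been imported yet or
    when no installed distribution with that name is found.
    """
    module = sys.modules.get(module_name or dist_name)
    loaded = getattr(module, "__version__", None)
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed = None
    return loaded, installed


# Compare both versions for the two packages named in the question.
for name in ("pandas", "numexpr"):
    loaded, installed = loaded_vs_installed(name)
    print(f"{name}: loaded={loaded} installed={installed}")
```

If the two values differ for pandas, restarting the runtime and re-running the cells is one thing to try before reinstalling anything else.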
python
# Install the libraries needed for the original answers with pip
!pip install --upgrade pip
!pip install -U pandas numpy scikit-learn imbalanced-learn

# Import the libraries needed for the original answers
import os
import pandas as pd
import numpy as np
from datetime import datetime, date
from dateutil.relativedelta import relativedelta
import math
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from imblearn.under_sampling import RandomUnderSampler


# Read the data from the github/noguhiro2002/100knocks-preprocess/work/data folder as DataFrames
df_customer = pd.read_csv('https://raw.githubusercontent.com/The-Japan-DataScientist-Society/100knocks-preprocess/master/docker/work/data/customer.csv')
df_category = pd.read_csv('https://raw.githubusercontent.com/The-Japan-DataScientist-Society/100knocks-preprocess/master/docker/work/data/category.csv')
df_product = pd.read_csv('https://raw.githubusercontent.com/The-Japan-DataScientist-Society/100knocks-preprocess/master/docker/work/data/product.csv')
df_receipt = pd.read_csv('https://raw.githubusercontent.com/The-Japan-DataScientist-Society/100knocks-preprocess/master/docker/work/data/receipt.csv')
df_store = pd.read_csv('https://raw.githubusercontent.com/The-Japan-DataScientist-Society/100knocks-preprocess/master/docker/work/data/store.csv')
df_geocode = pd.read_csv('https://raw.githubusercontent.com/noguhiro2002/100knocks-preprocess_ForColab-AzureNotebook/master/data/geocode.csv')

df_sales_amount = df_receipt.query('not customer_id.str.startswith("Z")', engine='python')
df_sales_amount = df_sales_amount[['customer_id', 'amount']].groupby('customer_id').sum().reset_index()
df_sales_amount['sales_flg'] = df_sales_amount['amount'].apply(lambda x: 1 if x > 2000 else 0)
df_sales_amount.head(10)
The error is as follows:
python
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-24-303c66cfc474> in <module>()
----> 1 df_sales_amount = df_receipt.query('not customer_id.str.startswith("Z")', engine='python')
      2 df_sales_amount = df_sales_amount[['customer_id', 'amount']].groupby('customer_id').sum().reset_index()
      3 df_sales_amount['sales_flg'] = df_sales_amount['amount'].apply(lambda x: 1 if x > 2000 else 0)
      4 df_sales_amount.head(10)

3 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/computation/eval.py in _check_engine(engine)
     39         Engine name.
     40     """
---> 41     from pandas.core.computation.check import NUMEXPR_INSTALLED
     42
     43     if engine is None:

ImportError: cannot import name '_NUMEXPR_INSTALLED' from 'pandas.core.computation.check' (/usr/local/lib/python3.7/dist-packages/pandas/core/computation/check.py)

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------
Material used: the Google Colab / Azure Notebooks port of データサイエンス100本ノック(構造化データ加工編) (Data Science 100 Knocks, structured data processing edition)
https://github.com/noguhiro2002/100knocks-preprocess_ForColab-AzureNotebook/blob/master/preprocess_knock_Python_Colab.ipynb