error: unbalanced parenthesis at position 3

Background / What I want to achieve

I am writing code that extracts place names from text, and it fails with the error below.

Problem / Error message

---------------------------------------------
error                                     Traceback (most recent call last)
<timed exec> in <module>

~\AppData\Local\Temp/ipykernel_18428/3108238957.py in tyushutu(user)
     12     A=Shu.iloc[:,1]
     13     for j in A:
---> 14         b=find_pre(j)# determine the place name
     15         for k in b:
     16             kouho.append(k)# store the place name

~\AppData\Local\Temp/ipykernel_18428/721366613.py in find_pre(city)
      5 
      6     if len(city_candidates)==0:
----> 7         hit_area = todohuken[todohuken['市区町村'].str.contains(city)]
      8 
      9         # Several municipalities can share a name, so there may be two or more hits

~\anaconda3\lib\site-packages\pandas\core\strings\accessor.py in wrapper(self, *args, **kwargs)
    114                 )
    115                 raise TypeError(msg)
--> 116             return func(self, *args, **kwargs)
    117 
    118         wrapper.__name__ = func_name

~\anaconda3\lib\site-packages\pandas\core\strings\accessor.py in contains(self, pat, case, flags, na, regex)
   1151         dtype: bool
   1152         """
-> 1153         if regex and re.compile(pat).groups:
   1154             warnings.warn(
   1155                 "This pattern has match groups. To actually get the "

~\anaconda3\lib\re.py in compile(pattern, flags)
    250 def compile(pattern, flags=0):
    251     "Compile a regular expression pattern, returning a Pattern object."
--> 252     return _compile(pattern, flags)
    253 
    254 def purge():

~\anaconda3\lib\re.py in _compile(pattern, flags)
    302     if not sre_compile.isstring(pattern):
    303         raise TypeError("first argument must be string or compiled pattern")
--> 304     p = sre_compile.compile(pattern, flags)
    305     if not (flags & DEBUG):
    306         if len(_cache) >= _MAXCACHE:

~\anaconda3\lib\sre_compile.py in compile(p, flags)
    762     if isstring(p):
    763         pattern = p
--> 764         p = sre_parse.parse(p, flags)
    765     else:
    766         pattern = None

~\anaconda3\lib\sre_parse.py in parse(str, flags, state)
    960     if source.next is not None:
    961         assert source.next == ")"
--> 962         raise source.error("unbalanced parenthesis")
    963 
    964     if flags & SRE_FLAG_DEBUG:

error: unbalanced parenthesis at position 3

Relevant source code

```python
import pandas as pd

# nlp, ans_pre, pre, todohuken and mode are defined elsewhere in the notebook (not shown in the post).

def tyushutu(user):
    contents_list = []
    contents = ans_pre[ans_pre['user_id'] == user].iloc[:, 1].values.tolist()
    for j in range(len(contents)):
        doc = nlp(contents[j])
        for ent in doc.ents:
            contents_list.append([user, ent.text, ent.label_])
    df = pd.DataFrame(contents_list, columns=['UserID', 'Word', 'Type'])
    Shu = df[df['Type'] == 'Province']  # DataFrame of the extracted place names

    kouho = []
    A = Shu.iloc[:, 1]
    for j in A:
        b = find_pre(j)  # determine the place name
        for k in b:
            kouho.append(k)  # store the place name
    if len(kouho) != 0:
        ken = mode(kouho)  # keep the place name that occurs most often
        print("抽出")  # prints "extracted"
        r1 = 1
    else:
        ...  # rest omitted in the original post


def find_pre(city):
    return_obj = ''

    city_candidates = [x for x in pre if x in city]

    if len(city_candidates) == 0:
        # '市区町村' is the municipality column, '都道府県' the prefecture column
        hit_area = todohuken[todohuken['市区町村'].str.contains(city)]

        # Several municipalities can share a name, so there may be two or more hits
        city_candidates = hit_area['都道府県'].unique()
        # Look up the prefecture index

    return city_candidates
```


1 answer

Best answer

The source code is incomplete, and it is not clear what kind of data you are working with.

Judging from the error message, the problem is in this line:

```python
        hit_area = todohuken[todohuken['市区町村'].str.contains(city)]
```

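When `todohuken['市区町村'].str.contains(city)` runs, the string held in `city` contains a round bracket. `str.contains` treats its argument as a regular expression by default, and that string is not a valid pattern, so the error is raised.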
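The cause can be reproduced without pandas. The sketch below uses a made-up string, `abc)`, as a stand-in for the unknown value of `city`. A closing parenthesis at index 3 with no matching opening one produces the same message as in the traceback.

```python
import re

# Hypothetical stand-in; the real value of `city` is not shown in the question.
city = "abc)"

try:
    # The traceback shows str.contains calling re.compile(pat) on its argument.
    re.compile(city)
    message = None
except re.error as exc:
    message = str(exc)

print(message)  # unbalanced parenthesis at position 3
```

The position in the message is the zero-based index of the offending `)`, which can help locate the bad character in the extracted text.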

If you only want to get rid of the error, changing the line as follows should make it go away:

```python
        hit_area = todohuken[todohuken['市区町村'].str.contains(city, regex=False)]
```
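To see what `regex=False` changes, here is a small sketch on a made-up table. The rows of `todohuken` and the values of `city` are illustrative assumptions, not the asker's data. `re.escape` is included as an alternative that stays in regex mode but neutralizes special characters.

```python
import re

import pandas as pd

# Illustrative data only; the real `todohuken` table is not shown in the question.
todohuken = pd.DataFrame({
    "都道府県": ["神奈川県", "大阪府"],
    "市区町村": ["横浜市", "大阪市"],
})

# Hypothetical extracted text with a stray closing parenthesis.
city = "横浜市)"

# Literal substring search: the pattern is never compiled, so no error is raised.
literal_mask = todohuken["市区町村"].str.contains(city, regex=False)

# Alternative: escape the special characters and keep regex mode.
escaped_mask = todohuken["市区町村"].str.contains(re.escape(city))

# A clean value matches as expected.
clean_mask = todohuken["市区町村"].str.contains("横浜", regex=False)

print(literal_mask.tolist())  # [False, False]
print(escaped_mask.tolist())  # [False, False]
print(clean_mask.tolist())    # [True, False]
```

With the stray parenthesis, both searches run without error but match nothing, because no municipality name literally contains `横浜市)`.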

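However, it is doubtful whether this gives you the result you actually want.
The right fix depends on what you are trying to do, so it will be hard to get a definitive answer by asking on teratail.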
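As one possible reading of the goal: if the brackets are just noise picked up by the entity extractor and the place name itself is what should be looked up, the text could be cleaned before the search. This is a minimal sketch under that assumption; `strip_brackets` is a hypothetical helper, not part of the original code.

```python
import re


def strip_brackets(text: str) -> str:
    """Remove ASCII and full-width round brackets from text (hypothetical helper)."""
    return re.sub(r"[()（）]", "", text)


print(strip_brackets("横浜市)"))  # 横浜市
```

The cleaned string could then be passed to `str.contains(..., regex=False)`. Whether dropping the brackets is appropriate depends on the data.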

Posted 2022/01/19 08:36

ppaul

Total score: 24672

mu08

2022/01/19 14:37

Thank you very much.
