Question edit history

Ported the script to Python 3.5; I would appreciate your guidance once more.

2018/02/21 02:57

Posted

akakage13

Score 89

title CHANGED Viewed

File without changes

body CHANGED Viewed

@@ -73,4 +73,115 @@
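 I looked into this TypeError, but could not get the script to work.
-Could you please show me how to get the 16 on the second line?
+Could you please show me how to get the 16 on the second line?
+umyu, thank you for the detailed explanation.
+It worked. However,
+as your advice also mentioned, I am in the middle of migrating to Python 3.5.
+So I ported this script to Python 3.5,
+and it now looks like this.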
+```python
+# -*- coding:utf-8 -*-
+import urllib.request
+import codecs
+import time
+from bs4 import BeautifulSoup
+f1 = codecs.open('tokyo1.csv', 'w', 'utf-8')
+f1.write('other_race_name'+u"\n")
+url_1='http://race.netkeiba.com/?pid=race_old&id=c201805010801'
+soup_1 = BeautifulSoup(urllib.request.urlopen(url_1), "html.parser")
+#race1
+tr_arr_1 = soup_1.select("table.race_table_old > tr ")
+for tr_1 in tr_arr_1:
+    #time.sleep(0.25)
+    tds_1 = tr_1.findAll("td")
+    if len( tds_1 ) > 1:
+        other_race_name_tag_1 = soup_1.find('div',{'class':'race_otherdata'}).find('p')
+        other_race_name_1 = "".join([x for x in other_race_name_tag_1.text if not x == u'\xa0' and not x == u'\n'])
+        cols = [other_race_name_1.strip()]
+        f1.write(",".join(cols) + "\n")
+        print (other_race_name_1.strip())
+f1.close()
+```
+The script above works as far as the first line.
+However, to also get the second line, which is my goal here,
+I changed it as you suggested, as shown below, but it did not work.
+```python
+# -*- coding:utf-8 -*-
+import urllib.request
+import codecs
+import time
+from bs4 import BeautifulSoup
+f1 = codecs.open('tokyo1.csv', 'w', 'utf-8')
+f1.write('other_race_name'+u"\n")
+url_1='http://race.netkeiba.com/?pid=race_old&id=c201805010801'
+soup_1 = BeautifulSoup(urllib.request.urlopen(url_1), "html.parser")
+#race1
+tr_arr_1 = soup_1.select("table.race_table_old > tr ")
+for tr_1 in tr_arr_1:
+    #time.sleep(0.25)
+    tds_1 = tr_1.findAll("td")
+    if len( tds_1 ) > 1:
+        other_race_name_tag_1 = soup_1.find('div',{'class':'race_otherdata'}).find_all('p')
+        other_race_name_1 = "".join([x for x in other_race_name_tag_1.text if not x == u'\xa0' and not x == u'\n'])
+        cols = [other_race_name_1.strip()]
+        f1.write(",".join(cols) + "\n")
+        print (other_race_name_1.strip())
+f1.close()
+```
+I got the following error.
+```
+Traceback (most recent call last):
+  File "C:\Users\satoru\satoru_system\race_data_scan\tokyo\1_tokyo_race_data_scan.py", line 23, in <module>
+    other_race_name_1 = "".join([x for x in other_race_name_tag_1.text if not x == u'\xa0' and not x == u'\n'])
+  File "C:\Users\satoru\AppData\Local\Programs\Python\Python36\lib\site-packages\bs4\element.py", line 1807, in __getattr__
+    "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key
+AttributeError: ResultSet object has no attribute 'text'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
+```
+I would appreciate your advice once more on how to use find in Python 3.5.
+Thank you in advance.
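
A minimal sketch of one way to reach the second `<p>`, assuming the page still has the `div.race_otherdata` structure quoted in the question. `find_all('p')` returns a `ResultSet` (a list of tags), so `.text` has to be read from each element rather than from the result itself. The HTML fragment is inlined so the example runs without network access; the helper names `other_data_lines` and `head_count` are hypothetical, not part of BeautifulSoup.

```python
import re

from bs4 import BeautifulSoup

# Fragment copied from the page source quoted in the question.
HTML = """
<div class="race_otherdata">
<p>1回東京3日目&nbsp;３歳&nbsp;</p>
<p>牝[指定]&nbsp;16頭</p>
<p>本賞金：500、200、130、75、50万円</p>
</div>
"""


def other_data_lines(html):
    """Return the text of every <p> inside div.race_otherdata (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", {"class": "race_otherdata"})
    if div is None:
        return []
    # find_all() gives a list-like ResultSet, so read the text tag by tag.
    # Non-breaking spaces (\xa0, from &nbsp;) become ordinary spaces.
    return [p.get_text().replace("\xa0", " ").strip() for p in div.find_all("p")]


def head_count(line):
    """Extract the number in front of 頭, or None if absent (hypothetical helper)."""
    match = re.search(r"(\d+)頭", line)
    return int(match.group(1)) if match else None


lines = other_data_lines(HTML)
second_line = lines[1]          # the second <p>
print(second_line)
print(head_count(second_line))
```

In the original script the same idea would mean indexing the result, for example `find_all('p')[1]`, before reading `.text`.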

317

Added what I tried and the result that did not work.

2018/02/21 02:57

Posted

akakage13

Score 89

title CHANGED Viewed

File without changes

body CHANGED Viewed

@@ -27,10 +27,50 @@
 ```
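-With the source code above, I can get the first line, 1回 東京３日目 ３歳,
+With the script above, I can get the first line, 1回 東京３日目 ３歳,
 but I cannot get the second line, 16頭, which is what I am after, and I am stuck.
-I looked into various ways to get the second line, but could not make any of them work.
+What I tried
+Looking for a way to get the second line, I checked the page's source and found the following structure.
+```html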
+1R
+</dt>
+<dd>
+<h1>３歳未勝利</h1>
+<p><span>ダ1300m&nbsp;/&nbsp;天気：晴&nbsp;/&nbsp;馬場：不良&nbsp;/&nbsp;発走：11:00</span></p>
+</dd>
+</dl>
+<div class="race_otherdata">
+<p>1回東京3日目&nbsp;３歳&nbsp;</p>
+<p>牝[指定]&nbsp;16頭</p>
+<p>本賞金：500、200、130、75、50万円</p>
+</div>
+<ul class="btn_link_list fc">
+```
+Since I can get 1回 東京３日目 ３歳 above, I decided to fetch everything enclosed in <p> </p> and changed the find call in the script above as follows:
+```python
+other_race_name_tag_1 = soup_1.find('div',{'class':'race_otherdata'}).findall('p')
+other_race_name_1 = "".join([x for x in other_race_name_tag_1.text if not x == u'\xa0' and not x == u'\n'])
+```
+With find changed to findall, running it gave:
+```
+    other_race_name_tag_1 = soup_1.find('div',{'class':'race_otherdata'}).findall('p')
+TypeError: 'NoneType' object is not callable
+```
+I looked into this TypeError, but could not get the script to work.
 Could you please show me how to get the 16 on the second line?
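
For context on the `TypeError`: BeautifulSoup 4 has no `findall` method. The methods are `find_all` and the older alias `findAll`. An unknown attribute on a tag is treated as a child-tag lookup, so `.findall` looks for a `<findall>` element, finds none, and evaluates to `None`; calling that `None` raises the error shown. A small sketch on a stand-in fragment (the fragment is made up for illustration):

```python
from bs4 import BeautifulSoup

# Stand-in fragment with the same shape as the quoted div (illustrative only).
soup = BeautifulSoup(
    '<div class="race_otherdata"><p>a</p><p>b</p></div>', "html.parser"
)
div = soup.find("div", {"class": "race_otherdata"})

# Attribute access falls back to a child-tag search: there is no <findall>
# tag, so this is None, and div.findall('p') would try to call None.
missing = div.findall
print(missing)

# The method with the underscore exists and returns every matching tag.
texts = [p.get_text() for p in div.find_all("p")]
print(texts)
```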

317