Answer edit history

Addendum

2019/11/21 14:17

Posted

Score 1286

answer CHANGED

@@ -25,4 +25,28 @@
 ```
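 The sample contains no div whose class is "__shi_m_txt", and
-I think fetching it this way would only return "資格" (Qualifications) anyway.
+I think fetching it this way would only return "資格" (Qualifications) anyway.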
+# Addendum
+With BeautifulSoup:
+```python
+import requests
+from bs4 import BeautifulSoup
+import re
+url = 'https://shingakunet.com/gakko/SC003048/gakubugakka/00000000000160337/'
+r = requests.get(url)
+r.raise_for_status()
+soup = BeautifulSoup(r.content, 'html5lib')
+# Get the text under the "資格" (Qualifications) heading:
+# heading tag (find) -> parent tag (parent) -> following sibling tag (find_next_sibling)
+# -> text fragments with leading/trailing whitespace stripped (stripped_strings)
+# -> remove whitespace inside each fragment -> join with newlines
+heading = soup.find("span", class_="__shi_m_txt", string="資格")
+detail = heading.parent.find_next_sibling("div", class_="__shi_gakkaDetail_cont")
+t = "\n".join(re.sub(r"\s", "", s) for s in detail.stripped_strings)
+# Remove the newline after each "、" so the output matches the web page
+print(re.sub("、\n", "、", t))
+```
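
The text clean-up at the end of the snippet above can be tried without network access. The following is a minimal sketch using only the standard library. The `fragments` list is made-up sample data standing in for what `stripped_strings` might yield, and `clean_fragments` is a hypothetical helper name; neither comes from the original answer or from the real page.

```python
import re


def clean_fragments(fragments):
    """Remove all whitespace inside each fragment, join the fragments
    with newlines, then drop the newline that follows a "、"."""
    joined = "\n".join(re.sub(r"\s", "", s) for s in fragments)
    return re.sub("、\n", "、", joined)


# Made-up sample data (an assumption, not taken from the real page)
fragments = ["看護師 国家資格、", "保健師\u3000国家資格", "目標とする 資格"]
result = clean_fragments(fragments)
print(result)
```

Because `\s` matches Unicode whitespace in Python 3 string patterns, full-width spaces are removed as well as ordinary spaces and tabs.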