Conditional selection with BeautifulSoup

Question

### Background / what I want to achieve

I am using Python's BeautifulSoup to extract the content portion of an HTML page together with its tags. Extracting only the text is easy to find by searching, but I could not find how to extract the content with its tags intact.

### The problem / error message

I tried to solve this in the following two ways.

1. Extract the `<div>` elements that contain both a `<p>` tag and an `<h2>` tag.

```python
content_with_tag = soup.select('div:has(>p h2)')
print(content_with_tag)
```

Running this prints an empty list. With `div:has(>p)` or `div:has(>h2)`, each condition works on its own, so I suspect the way I wrote "p and h2" is wrong. I also looked through the BeautifulSoup documentation, but that did not resolve it.

2. Since 1 did not work, I tried the following code instead.

```python
content_with_tag_p = soup.select('div:has(>p)')
content_with_tag_p_h2 = content_with_tag_p.select('div:has(h2)')
print(content_with_tag_p_h2)
```

I tried to first select the `<div>` elements that contain a `<p>` tag, and then extract from those the `<div>` elements that contain an `<h2>` tag, but an AttributeError is raised at `content_with_tag_p_h2` and it does not work.

### Source code in question

```python
from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome('path to chromedriver')
target_url = "https://www.verisign.com/en_US/website-presence/online/what-is-a-url/index.xhtml"
driver.get(target_url)
html = driver.page_source

soup = BeautifulSoup(html, "lxml")
content_with_tag_p = soup.select('div:has(>p)')
content_with_tag_p_h2 = content_with_tag_p.select('div:has(h2)')
print(content_with_tag_p_h2)
```

It would help if you could explain how to solve both 1 and 2. Thank you in advance.

Accepted Answer

```python
content_with_tag_p = soup.select('div:has(> p + h2)')
```
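To compare how the selectors in this thread behave, here is a minimal sketch on a small sample document. The HTML, the `ids` helper, and the variable names are made up for illustration and are not taken from the page in the question. It assumes a Beautiful Soup version whose `select()` supports `:has()` (bs4 4.7 or later, which uses soupsieve). The sketch also covers approach 2: `select()` returns a list-like result, not a single tag, so calling `.select()` on that result raises the AttributeError. Filtering element by element is one way around it.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not the page from the question.
html = """
<div id="a"><p>text</p><h2>title</h2></div>
<div id="b"><h2>title</h2><p>text</p></div>
<div id="c"><p>only p</p></div>
<div id="d"><h2>only h2</h2></div>
"""
soup = BeautifulSoup(html, "html.parser")


def ids(tags):
    """Return the id attribute of each tag, to keep the output short."""
    return [tag["id"] for tag in tags]


# '> p h2' means "a child <p> that itself contains an <h2>",
# so no div in this sample matches.
nested = ids(soup.select("div:has(> p h2)"))

# '> p + h2' requires an <h2> immediately after a child <p>.
adjacent = ids(soup.select("div:has(> p + h2)"))

# Chaining :has() requires both children, in any order.
both = ids(soup.select("div:has(> p):has(> h2)"))

# Approach 2: select() returns a list-like ResultSet,
# so test each div individually instead of calling .select() on the list.
divs_with_p = soup.select("div:has(> p)")
two_step = ids([div for div in divs_with_p if div.find("h2") is not None])

print(nested, adjacent, both, two_step)
```

On this sample, `div:has(> p + h2)` matches only the first `<div>`, because the `<h2>` has to come directly after the `<p>`. The chained form and the two-step filter match the first two. Which one fits depends on whether the order of `<p>` and `<h2>` matters on the real page.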
