Error when running a Python scraping script

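What I want to do

・Get the URLs of <a> tags from a Nikkei news site

The book cited below contains Python code that scrapes URLs from Google News (per the book's supplement, a Nikkei news site is used instead because Google News changed its specification). I want to run that code,
but I get the error below and cannot get any further.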

Environment: Python 3.10.0

if (isinstance(value, str) or isinstance(value, collections.Callable) or hasattr(value, 'match')
AttributeError: module 'collections' has no attribute 'Callable'

Code

Source of the code
独学プログラマー Pythonの言語の基本から仕事のやり方まで

Python
import urllib.request
from urllib.parse import urljoin  # Added: module for working with URLs
from bs4 import BeautifulSoup as BS

class Scraper:
    def __init__(self, site):
        self.site = site
        self.urls = set()  # URLs collected so far

    def scrape(self):
        r = urllib.request.urlopen(self.site)
        html = r.read()
        parser = 'html.parser'
        sp = BS(html, parser)
        for tag in sp.find_all('a'):
            url = tag.get('href')
            if url is None:
                continue
            if 'atcl/news' not in url:  # Skip URLs that do not contain 'atcl/news'
                continue
            full_url = urljoin(self.site, url)  # Convert to an absolute URL that includes the domain
            if full_url in self.urls:  # Skip URLs that were already collected
                continue
            self.urls.add(full_url)  # Record the URL as collected
            print('\n' + full_url)  # Print the URL


news = 'https://trendy.nikkeibp.co.jp/news/'  # Changed: site to fetch news from
Scraper(news).scrape()
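
The filtering steps in the code above (skip missing href, keep only URLs containing 'atcl/news', convert to absolute, skip duplicates) can be tried without network access. The following is only a sketch: it swaps BeautifulSoup for the standard library's html.parser.HTMLParser, and the LinkCollector class, the SAMPLE_HTML string, and the example.com base URL are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects <a href> URLs that contain a keyword."""

    def __init__(self, base_url, keyword):
        super().__init__()
        self.base_url = base_url
        self.keyword = keyword
        self.urls = []  # collected absolute URLs, in first-seen order

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href is None or self.keyword not in href:
            return  # no href, or not a news article link
        full_url = urljoin(self.base_url, href)  # make the URL absolute
        if full_url not in self.urls:  # skip duplicates
            self.urls.append(full_url)


# Made-up sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<!DOCTYPE html>
<html><body>
<a href="/atcl/news/001/">first</a>
<a href="/atcl/news/001/">duplicate</a>
<a href="/about/">other</a>
<a>no href</a>
</body></html>
"""

collector = LinkCollector("https://example.com/news/", "atcl/news")
collector.feed(SAMPLE_HTML)
for url in collector.urls:
    print(url)
```

Because this version does not import bs4 at all, it is not affected by the error below, but it is not a replacement for the book's code.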

Error details

Terminal output
Traceback (most recent call last):
  File "/Users/nakajimataichi/Downloads/hosoku20181016/scraping_test.py", line 29, in <module>
    Scraper(news).scrape()
  File "/Users/nakajimataichi/Downloads/hosoku20181016/scraping_test.py", line 14, in scrape
    sp = BS(html, parser)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/__init__.py", line 228, in __init__
    self._feed()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/__init__.py", line 289, in _feed
    self.builder.feed(self.markup)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/builder/_htmlparser.py", line 215, in feed
    parser.feed(markup)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/html/parser.py", line 110, in feed
    self.goahead(0)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/html/parser.py", line 178, in goahead
    k = self.parse_html_declaration(i)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/html/parser.py", line 269, in parse_html_declaration
    self.handle_decl(rawdata[i+2:gtpos])
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/builder/_htmlparser.py", line 160, in handle_decl
    self.soup.endData(Doctype)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/__init__.py", line 365, in endData
    self.object_was_parsed(o)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/__init__.py", line 370, in object_was_parsed
    previous_element = most_recent_element or self._most_recent_element
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/element.py", line 1054, in __getattr__
    return self.find(tag)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/element.py", line 1292, in find
    l = self.find_all(name, attrs, recursive, text, 1, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/element.py", line 1313, in find_all
    return self._find_all(name, attrs, text, limit, generator, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/element.py", line 528, in _find_all
    strainer = SoupStrainer(name, attrs, text, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/element.py", line 1610, in __init__
    self.text = self._normalize_search_value(text)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/bs4/element.py", line 1615, in _normalize_search_value
    if (isinstance(value, str) or isinstance(value, collections.Callable) or hasattr(value, 'match')
AttributeError: module 'collections' has no attribute 'Callable'

Question

As I read it, the error says that Python's built-in module 'collections' has no attribute named 'Callable'.
Does that mean this code cannot be run on Python 3.10?

If so, how can I find out which Python version this code can be run on?

2 answers

What's New In Python 3.10 has related information:

Removed

Remove deprecated aliases to Collections Abstract Base Classes from the collections module. (Contributed by Victor Stinner in bpo-37324.)

Improved Modules
collections.abc

The __args__ of the parameterized generic for collections.abc.Callable are now consistent with typing.Callable. collections.abc.Callable generic now flattens type parameters, similar to what typing.Callable currently does.
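
To make the "Removed" note above concrete: the abstract base class itself still exists as collections.abc.Callable; only the old alias collections.Callable is gone in 3.10. The sketch below checks both on whatever interpreter runs it. The helper names legacy_alias_available and is_callable_object are invented for this example.

```python
import collections
import collections.abc
import sys
import warnings


def legacy_alias_available():
    """Return True if the deprecated alias collections.Callable still resolves."""
    with warnings.catch_warnings():
        # Python 3.7 to 3.9 emit a DeprecationWarning when the alias is accessed.
        warnings.simplefilter("ignore", DeprecationWarning)
        return hasattr(collections, "Callable")


def is_callable_object(value):
    """Same kind of check as the failing bs4 line, using the supported name."""
    return isinstance(value, collections.abc.Callable)


print("Python", sys.version_info[:2])
print("collections.Callable available:", legacy_alias_available())
print("len is callable:", is_callable_object(len))
```

On 3.10 or later the second line should report False, which matches the AttributeError in the question.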

Posted 2021/11/16 03:05

melian

Total score: 21181

Nakajima_Taichi

2021/11/16 11:23

Thank you for your answer. I was indeed able to confirm this in the documentation.

Best answer

Older versions of BeautifulSoup do not work on Python 3.10.
Either upgrade BeautifulSoup or use Python 3.9 or earlier.
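
Before choosing between those two options, it may help to see which BeautifulSoup version the current interpreter has installed. The sketch below uses importlib.metadata from the standard library (Python 3.8 or later) and assumes the package was installed from PyPI under the distribution name beautifulsoup4; installed_version is a hypothetical helper name.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


python_version = ".".join(str(part) for part in sys.version_info[:3])
bs4_version = installed_version("beautifulsoup4")

print("Python:", python_version)
if bs4_version is None:
    print("beautifulsoup4 is not installed for this interpreter")
else:
    print("beautifulsoup4:", bs4_version)
```

If the reported version is old, running python -m pip install --upgrade beautifulsoup4 with the same interpreter is the usual way to upgrade it.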

Posted 2021/11/16 02:50

int32_t

Total score: 21929

Nakajima_Taichi

2021/11/16 08:41

Thank you for your answer. I tried the following, but I could not resolve the problem.

Python 3.10.0 was already installed, so I installed 3.8.12 with pyenv and ran pyenv local 3.8.12 to set Python 3.8.12 for the directory.

pyenv versions
  system
  3.10.0
* 3.8.12 (set by /Users/nakajimataichi/Downloads/hosoku20181016/.python-version)

In this state I ran
python scraping_test.py
(scraping_test.py is the scraping code), but I get the same error.

Error:
if (isinstance(value, str) or isinstance(value, collections.Callable) or hasattr(value, 'match')
AttributeError: module 'collections' has no attribute 'Callable'

Running python -V prints Python 3.10.0, so I suspect the script is not actually running under 3.8.12. Could you tell me how to use (switch to) Python 3.9 or earlier?
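
One way to confirm which interpreter actually runs the script is to print its path and version from inside Python. This is only a diagnostic sketch; interpreter_report and is_older_than are hypothetical helper names, and the lines could be placed temporarily at the top of scraping_test.py.

```python
import sys


def interpreter_report():
    """Return the path and version of the interpreter running this code."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return {"executable": sys.executable, "version": version}


def is_older_than(major, minor):
    """Return True if the running interpreter is older than major.minor."""
    return sys.version_info[:2] < (major, minor)


report = interpreter_report()
print("executable:", report["executable"])
print("version:   ", report["version"])
if not is_older_than(3, 10):
    print("Running on Python 3.10 or later; collections.Callable is not available.")
```

If the executable path still points under /Library/Frameworks/Python.framework/Versions/3.10, the python command is probably not going through pyenv, which usually means pyenv's shell initialization is missing or its shims are not first on PATH. Note also that packages such as beautifulsoup4 are installed per interpreter, so they would need to be installed again for 3.8.12.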