What I want to achieve
I am learning Scrapy. I want to store the scraped data in MongoDB, but I am stuck.
Background
The virtual environment is set up, the required modules are installed, and the Python files are written.
Problem / error message
When I run the spider, the following error message appears:

```
TypeError: 'MongoClient' object is not callable
```
In pipelines.py I create the client with self.client = pymongo.MongoClient(...), yet the traceback says the object is not callable. I cannot see what is wrong.
Any guidance would be appreciated.
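For reference, a minimal sketch of what triggers this exact TypeError: constructing a MongoClient is fine, but a MongoClient *instance* is not callable, so writing `client('BOOKDB')` instead of `client['BOOKDB']` raises it (the database name BOOKDB is taken from the pipeline code below):

```python
import pymongo

client = pymongo.MongoClient()  # constructing the client is fine
db_ok = client['BOOKDB']        # selecting a database uses [] or client.get_database('BOOKDB')
db_ng = client('BOOKDB')        # calling the instance raises TypeError: 'MongoClient' object is not callable
```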
Relevant source code
The full console output is below; the last traceback ends with "TypeError: 'MongoClient' object is not callable".
```
(scrapy) E:\scraping\projects\kinokuniya>scrapy crawl computer_books
2023-09-16 10:33:12 [scrapy.utils.log] INFO: Scrapy 2.4.1 started (bot: kinokuniya)
2023-09-16 10:33:12 [scrapy.utils.log] INFO: Versions: lxml 4.9.3.0, libxml2 2.10.3, cssselect 1.2.0, parsel 1.7.0, w3lib 2.1.2, Twisted 22.10.0, Python 3.8.18 (default, Sep 11 2023, 13:39:12) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 22.0.0 (OpenSSL 3.0.5 5 Jul 2022), cryptography 38.0.1, Platform Windows-10-10.0.19045-SP0
2023-09-16 10:33:12 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2023-09-16 10:33:12 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'kinokuniya',
 'CONCURRENT_REQUESTS': 1,
 'DEPTH_PRIORITY': 1,
 'DOWNLOAD_DELAY': 3,
 'FEED_EXPORT_ENCODING': 'utf-8',
 'HTTPCACHE_ENABLED': True,
 'HTTPCACHE_EXPIRATION_SECS': 86400,
 'NEWSPIDER_MODULE': 'kinokuniya.spiders',
 'ROBOTSTXT_OBEY': True,
 'SCHEDULER_DISK_QUEUE': 'scrapy.squeues.PickleFifoDiskQueue',
 'SCHEDULER_MEMORY_QUEUE': 'scrapy.squeues.FifoMemoryQueue',
 'SPIDER_MODULES': ['kinokuniya.spiders']}
2023-09-16 10:33:12 [scrapy.extensions.telnet] INFO: Telnet Password: 2931899aede90ae7
2023-09-16 10:33:12 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
2023-09-16 10:33:13 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats',
 'scrapy.downloadermiddlewares.httpcache.HttpCacheMiddleware']
2023-09-16 10:33:13 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2023-09-16 10:33:13 [scrapy.middleware] INFO: Enabled item pipelines:
['kinokuniya.pipelines.CheckItemPipeline',
 'kinokuniya.pipelines.MongoPipeline']
2023-09-16 10:33:13 [scrapy.core.engine] INFO: Spider opened
2023-09-16 10:33:14 [scrapy.core.engine] INFO: Closing spider (shutdown)
2023-09-16 10:33:14 [scrapy.utils.signal] ERROR: Error caught on signal handler: <bound method CoreStats.spider_closed of <scrapy.extensions.corestats.CoreStats object at 0x000002185A6B1640>>
Traceback (most recent call last):
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\scrapy\crawler.py", line 89, in crawl
    yield self.engine.open_spider(self.spider, start_requests)
TypeError: 'MongoClient' object is not callable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\scrapy\utils\defer.py", line 157, in maybeDeferred_coro
    result = f(*args, **kw)
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\pydispatch\robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\scrapy\extensions\corestats.py", line 31, in spider_closed
    elapsed_time = finish_time - self.start_time
TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'NoneType'
2023-09-16 10:33:14 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'log_count/ERROR': 1, 'log_count/INFO': 8}
2023-09-16 10:33:14 [scrapy.core.engine] INFO: Spider closed (shutdown)
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Peer did not staple an OCSP response
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Requesting OCSP data
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Trying http://r3.o.lencr.org
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Peer did not staple an OCSP response
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Requesting OCSP data
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Trying http://r3.o.lencr.org
2023-09-16 10:33:14 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): r3.o.lencr.org:80
2023-09-16 10:33:14 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): r3.o.lencr.org:80
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Peer did not staple an OCSP response
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Requesting OCSP data
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Trying http://r3.o.lencr.org
2023-09-16 10:33:14 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): r3.o.lencr.org:80
2023-09-16 10:33:14 [urllib3.connectionpool] DEBUG: http://r3.o.lencr.org:80 "POST / HTTP/1.1" 200 503
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: OCSP response status: <OCSPResponseStatus.SUCCESSFUL: 0>
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Verifying response
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Responder is issuer
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Caching OCSP response.
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: OCSP cert status: <OCSPCertStatus.GOOD: 0>
2023-09-16 10:33:14 [urllib3.connectionpool] DEBUG: http://r3.o.lencr.org:80 "POST / HTTP/1.1" 200 503
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: OCSP response status: <OCSPResponseStatus.SUCCESSFUL: 0>
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Verifying response
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Responder is issuer
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Caching OCSP response.
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: OCSP cert status: <OCSPCertStatus.GOOD: 0>
2023-09-16 10:33:14 [urllib3.connectionpool] DEBUG: http://r3.o.lencr.org:80 "POST / HTTP/1.1" 200 503
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: OCSP response status: <OCSPResponseStatus.SUCCESSFUL: 0>
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Verifying response
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Responder is issuer
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: Caching OCSP response.
2023-09-16 10:33:14 [pymongo.ocsp_support] DEBUG: OCSP cert status: <OCSPCertStatus.GOOD: 0>
Unhandled error in Deferred:
2023-09-16 10:33:14 [twisted] CRITICAL: Unhandled error in Deferred:
Traceback (most recent call last):
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\scrapy\crawler.py", line 192, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\scrapy\crawler.py", line 196, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\twisted\internet\defer.py", line 1947, in unwindGenerator
    return _cancellableInlineCallbacks(gen)
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\twisted\internet\defer.py", line 1857, in _cancellableInlineCallbacks
    _inlineCallbacks(None, gen, status, _copy_context())
--- <exception caught here> ---
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\twisted\internet\defer.py", line 1697, in _inlineCallbacks
    result = context.run(gen.send, result)
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\scrapy\crawler.py", line 89, in crawl
    yield self.engine.open_spider(self.spider, start_requests)
builtins.TypeError: 'MongoClient' object is not callable

2023-09-16 10:33:14 [twisted] CRITICAL:
Traceback (most recent call last):
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\twisted\internet\defer.py", line 1697, in _inlineCallbacks
    result = context.run(gen.send, result)
  File "C:\Users\Owner\anaconda3\envs\scrapy\lib\site-packages\scrapy\crawler.py", line 89, in crawl
    yield self.engine.open_spider(self.spider, start_requests)
TypeError: 'MongoClient' object is not callable
```
What I tried
Contents of pipelines.py
```python
from itemadapter import ItemAdapter
from scrapy.exceptions import DropItem
import pymongo


import certifi

class CheckItemPipeline:
    def process_item(self, item, spider):
        if not item.get('isbn'):
            raise DropItem('Missing ISBN')
        return item

class MongoPipeline:
    collection_name = 'computer_books'
    def open_spider(self, spider):
        self.client = pymongo.MongoClient('mongodb+srv://hisa:pasword@cluster0.iveymqq.mongodb.net/?retryWrites=true&w=majority')

        self.db = self.client('BOOKDB')

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        self.db[self.collection_name].insert(dict(item))
```
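The line self.db = self.client('BOOKDB') is what runs when Scrapy calls open_spider, which matches the traceback: a MongoClient instance is not callable, and databases are selected with subscript access instead. A minimal corrected sketch of MongoPipeline (same URI, database, and collection as above; insert() is also replaced with insert_one(), since Collection.insert() was deprecated in PyMongo 3 and removed in PyMongo 4, and process_item now returns the item as Scrapy expects):

```python
import pymongo

class MongoPipeline:
    collection_name = 'computer_books'

    def open_spider(self, spider):
        self.client = pymongo.MongoClient(
            'mongodb+srv://hisa:pasword@cluster0.iveymqq.mongodb.net/?retryWrites=true&w=majority'
        )
        # Select the database with [], not a call:
        # self.client('BOOKDB') raises TypeError: 'MongoClient' object is not callable.
        self.db = self.client['BOOKDB']

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # insert_one() is the supported call in current PyMongo.
        self.db[self.collection_name].insert_one(dict(item))
        return item  # hand the item on to any later pipeline stages
```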
Contents of settings.py
```python
ITEM_PIPELINES = {
    'kinokuniya.pipelines.CheckItemPipeline': 100,
    'kinokuniya.pipelines.MongoPipeline': 300,
}

CONCURRENT_REQUESTS = 1

DEPTH_PRIORITY = 1
SCHEDULER_DISK_QUEUE = 'scrapy.squeues.PickleFifoDiskQueue'
SCHEDULER_MEMORY_QUEUE = 'scrapy.squeues.FifoMemoryQueue'
```
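As a side note, the connection URI (password included) does not have to be hard-coded in pipelines.py: the MongoDB pipeline example in the Scrapy documentation reads it from settings.py via from_crawler. A sketch of that pattern, assuming hypothetical setting names MONGO_URI and MONGO_DATABASE added to settings.py:

```python
import pymongo

class MongoPipeline:
    collection_name = 'computer_books'

    def __init__(self, mongo_uri, mongo_db):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this with the running crawler, giving access to settings.py.
        return cls(
            mongo_uri=crawler.settings.get('MONGO_URI'),
            mongo_db=crawler.settings.get('MONGO_DATABASE', 'BOOKDB'),
        )

    def open_spider(self, spider):
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.db = self.client[self.mongo_db]
```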