
Collecting images of a specific person by scraping

Question edit history

Revision 2

Information added

2018/12/03 08:25

Posted

Score: 170

test CHANGED

@@ -1,6 +1,6 @@
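 I want to collect images of a specific person by scraping.
-I am building it while following https://qiita.com/kenmaz/items/4b60ea00b159b3e00100 as a reference.
+I am building it while following https://note.mu/kokoperikyo/n/n8023c7e9e262 as a reference.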
 ```ここに言語を入力

Revision 1

Information added

2018/12/03 08:25

Posted

Score: 170

test CHANGED

@@ -4,9 +4,31 @@
 ```ここに言語を入力
-url = 'https://search.yahoo.co.jp/image/search?p=藤田ニコル&oq=藤田&ei=UTF-8&b={}&ktot=5'.format(num_self)
+def img_url_list(num):
+    """
+    Uses Yahoo image search (this script does not work with Google).
+    """
+    num_self = num
+    url = 'https://search.yahoo.co.jp/image/search?p=藤田ニコル&oq=藤田&ei=UTF-8&b={}&ktot=5'.format(num_self)
-byte_content, _ = fetcher.fetch(url)
+    byte_content, _ = fetcher.fetch(url)
+    structured_page = BeautifulSoup(byte_content.decode('UTF-8'), 'html.parser')
+    img_link_elems = structured_page.find_all('a', attrs={'target': 'imagewin'})
+    img_urls = [e.get('href') for e in img_link_elems if e.get('href').startswith('http')]
+    img_urls = list(set(img_urls))
+    num_self += 20
+    return img_urls,num_self
 ```
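
For readers who want to try the paging logic from the revision above without the `fetcher` helper (which is not shown in the question), here is a minimal sketch. It is not the asker's code: it swaps BeautifulSoup for the standard-library `html.parser` so it has no third-party dependency, and the names `build_search_url`, `ImageLinkCollector`, and `parse_page` are hypothetical. It also assumes that result links still carry `target="imagewin"`, as in the 2018 code; the current Yahoo markup may differ. Unlike `list(set(...))`, it removes duplicates while keeping the original order.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

PAGE_SIZE = 20  # the original code advances the offset by 20 per call


def build_search_url(query, offset):
    """Build a Yahoo image-search URL for a query and result offset (the `b` parameter)."""
    params = {"p": query, "ei": "UTF-8", "b": offset, "ktot": 5}
    return "https://search.yahoo.co.jp/image/search?" + urlencode(params)


class ImageLinkCollector(HTMLParser):
    """Collect href values of <a target="imagewin"> tags (assumed 2018 markup)."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        # Skip anchors without an href and relative links, as the original filter intends.
        if attr_map.get("target") == "imagewin" and href and href.startswith("http"):
            if href not in self.urls:  # de-duplicate while preserving order
                self.urls.append(href)


def parse_page(html_text, offset):
    """Return (image URLs found in html_text, offset for the next page)."""
    collector = ImageLinkCollector()
    collector.feed(html_text)
    return collector.urls, offset + PAGE_SIZE


if __name__ == "__main__":
    # Offline demonstration with a hand-written HTML fragment (no network access).
    sample = (
        '<a target="imagewin" href="http://example.com/a.jpg">a</a>'
        '<a target="imagewin" href="/relative.jpg">b</a>'
        '<a target="imagewin">no href</a>'
    )
    urls, next_offset = parse_page(sample, 1)
    print(urls, next_offset)
    print(build_search_url("藤田ニコル", next_offset))
```

To use this against the live site, you would download each page yourself (for example with `urllib.request`), pass the decoded HTML to `parse_page`, and feed the returned offset into the next `build_search_url` call.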