Edit history

Question edit history

Made a few small changes.

2020/07/28 12:15

Posted

onosan

Score 62

title CHANGED

File without changes

body CHANGED

@@ -5,8 +5,9 @@
 https://nintei.nurse.or.jp/certification/General/(X(1)S(efl0y555pect3x45oxjzfw3x))/General/GCPP01LS/GCPP01LS.aspx?AspxAutoDetectCookieSupport=1
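-I am using the following code, based on an approach I tried before, as a template. It runs without errors, but it does not retrieve the data.```python
+I am using the following code, based on an approach I tried before, as a template. It runs without errors, but it does not retrieve the data.
+```Python3
 # For saving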
@@ -83,5 +84,4 @@
 dw_xlsx = [f for f in dw_list]
 for dw in dw_xlsx:
     shutil.move(r'C:\Users\akira\Documents\Python\会社')
 ```

Made a few small changes.

2020/07/28 12:15

Posted

onosan

Score 62

title CHANGED

File without changes

body CHANGED

@@ -40,7 +40,7 @@
 soup = bs4.BeautifulSoup(driver.page_source, 'html5lib')
-base = 'https://web.shionogi.co.jp/mstr/microstrategy/asp/'
+base = 'https://nintei.nurse.or.jp/certification/General/'
 soup_file1 = soup.find_all('a')
 href_list = []
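
A note on the `base` change in this hunk: building `path` as `base + s.get('href')` only works when every link is relative to exactly that directory, and `s.get('href')` returns `None` for anchors that have no `href` attribute. A minimal sketch of a more forgiving approach, assuming the links should be resolved against the URL of the page they were scraped from, uses the standard library's `urllib.parse.urljoin`. The helper name `absolute_link`, the shortened `PAGE_URL`, and the sample `href` values are hypothetical and not taken from the actual page.

```python
from urllib.parse import urljoin

# Page the links were scraped from. The session token segment of the real URL
# is left out here for brevity (an assumption made for this sketch).
PAGE_URL = 'https://nintei.nurse.or.jp/certification/General/GCPP01LS/GCPP01LS.aspx'


def absolute_link(page_url, href):
    """Resolve href against page_url; return None if there is nothing to fetch."""
    if not href:
        return None  # <a> tag without an href attribute
    if href.lower().startswith(('javascript:', '#')):
        return None  # postback or in-page anchor, not a URL for driver.get()
    return urljoin(page_url, href)


if __name__ == '__main__':
    # Hypothetical href values of the kinds often seen on ASP.NET pages.
    samples = ['Detail.aspx?id=1', '/certification/Top.aspx',
               "javascript:__doPostBack('ctl00','')", None]
    for href in samples:
        print(href, '->', absolute_link(PAGE_URL, href))
```

If the search control on the page turns out to be a button or a `javascript:` postback rather than a plain link, no URL can be built from it, and clicking the element through Selenium is probably the remaining option.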

Removed unnecessary parts

2020/07/28 11:33

Posted

onosan

Score 62

title CHANGED

File without changes

body CHANGED

@@ -8,7 +8,7 @@
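 I am using the following code, based on an approach I tried before, as a template. It runs without errors, but it does not retrieve the data.```python
-# For saving (can download SHIFT data)
+# For saving
 driver_path = r'C:\Anaconda3\chromedriver.exe'  # location of your own chromedriver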

Removed unnecessary parts

2020/07/28 11:26

Posted

onosan

Score 62

title CHANGED

File without changes

body CHANGED

@@ -7,6 +7,7 @@
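 I am using the following code, based on an approach I tried before, as a template. It runs without errors, but it does not retrieve the data.```python
 # For saving (can download SHIFT data)
 driver_path = r'C:\Anaconda3\chromedriver.exe'  # location of your own chromedriver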
@@ -44,8 +45,8 @@
 soup_file1 = soup.find_all('a')
 href_list = []
-file_num = 1  # because len(os.listdir(r'C:\Users\{0}\Downloads'.format(myNo))) starts at 1
+file_num = 1
-sum_file = 1  # same reason as above
+sum_file = 1
 cc = 0
 for s in soup_file1:

Added sample code

2020/07/28 11:24

Posted

onosan

Score 62

title CHANGED

File without changes

body CHANGED

@@ -56,8 +56,8 @@
         print(path)
         driver.get(path)
-        WebDriverWait(driver, 300).until(EC.element_to_be_clickable((By.XPATH,'//*[@id="3131"]')))
+        WebDriverWait(driver, 300).until(EC.element_to_be_clickable((By.XPATH,'//*[@id="ctl00_plhContent_btnSearchMain"]')))
-        driver.find_element_by_xpath('//*[@id="3131"]').click()
+        driver.find_element_by_xpath('//*[@id="ctl00_plhContent_btnSearchMain"]').click()
         while sum_file == file_num :
             sum_file = len(os.listdir(r'C:\Users\akira\Downloads'))
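
The `while sum_file == file_num` loop in this hunk polls the folder with no pause and no upper time limit, so it would spin indefinitely if the click never starts a download. A minimal sketch of a bounded wait follows, assuming Chrome's convention of giving in-progress downloads a `.crdownload` suffix. The function name `wait_for_new_files` and its parameters are hypothetical.

```python
import os
import time


def wait_for_new_files(folder, known, timeout=300.0, poll=0.5):
    """Return the set of finished files that appeared in folder since `known`.

    Returns an empty set if nothing new shows up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = {
            name for name in os.listdir(folder)
            if not name.endswith(('.crdownload', '.tmp'))  # skip partial downloads
        }
        new = current - set(known)
        if new:
            return new
        if time.monotonic() >= deadline:
            return set()
        time.sleep(poll)


# Usage sketch (folder path as in the question):
#   before = set(os.listdir(r'C:\Users\akira\Downloads'))
#   ... click the button ...
#   new = wait_for_new_files(r'C:\Users\akira\Downloads', before)
#   if not new: the click did not produce a file, so the selector is worth checking
```

Comparing sets of file names rather than counts also avoids relying on the folder holding exactly one file at the start.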

Added the code I ran

2020/07/28 11:19

Posted

onosan

Score 62

title CHANGED

@@ -1,1 +1,1 @@
-I want to retrieve information all at once from a web page with many menus
+Cannot retrieve data when scraping with Python

body CHANGED

@@ -3,4 +3,84 @@
 The web page is the following.
-https://nintei.nurse.or.jp/certification/General/(X(1)S(efl0y555pect3x45oxjzfw3x))/General/GCPP01LS/GCPP01LS.aspx?AspxAutoDetectCookieSupport=1
+https://nintei.nurse.or.jp/certification/General/(X(1)S(efl0y555pect3x45oxjzfw3x))/General/GCPP01LS/GCPP01LS.aspx?AspxAutoDetectCookieSupport=1
+I am using the following code, based on an approach I tried before, as a template. It runs without errors, but it does not retrieve the data.```python
+# For saving (can download SHIFT data)
+driver_path = r'C:\Anaconda3\chromedriver.exe'  # location of your own chromedriver
+# Location to read from
+URL = 'https://nintei.nurse.or.jp/certification/General/(X(1)S(efl0y555pect3x45oxjzfw3x))/GCPP01LS/GCPP01LS.aspx'
+# Location of the folder to store files in
+send_path = r'C:\Users\akira\Documents\Python\会社'
+from selenium import webdriver
+import time
+import bs4
+import re
+import os
+import time
+import shutil
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.support import expected_conditions as EC
+from selenium.webdriver.common.by import By
+start = time.time()
+driver = webdriver.Chrome(driver_path)
+driver.get(URL)
+time.sleep(3)
+soup = bs4.BeautifulSoup(driver.page_source, 'html5lib')
+base = 'https://web.shionogi.co.jp/mstr/microstrategy/asp/'
+soup_file1 = soup.find_all('a')
+href_list = []
+file_num = 1  # because len(os.listdir(r'C:\Users\{0}\Downloads'.format(myNo))) starts at 1
+sum_file = 1  # same reason as above
+cc = 0
+for s in soup_file1:
+    if s.string=='検索':  # '検索' is the link text "Search"
+        path = base+s.get('href')
+        href_list.append(path)
+        print(path)
+        driver.get(path)
+        WebDriverWait(driver, 300).until(EC.element_to_be_clickable((By.XPATH,'//*[@id="3131"]')))
+        driver.find_element_by_xpath('//*[@id="3131"]').click()
+        while sum_file == file_num :
+            sum_file = len(os.listdir(r'C:\Users\akira\Downloads'))
+        else:
+            print("現在のダウンロードファイル数_{}枚".format(sum_file-1))  # prints the current number of downloaded files
+            file_num += 1
+        cc += 1
+# Temporary files can get in the way, so wait a little
+time.sleep(60)
+# Move the files
+dw_path = r'C:\Users\akira\Documents\Python\会社'
+dw_list = os.listdir(dw_path)
+dw_xlsx = [f for f in dw_list]
+for dw in dw_xlsx:
+    shutil.move(r'C:\Users\akira\Documents\Python\会社')
+```
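
One detail in the final block of this revision: `shutil.move` takes both a source and a destination, so the single-argument call would raise a `TypeError` if the loop body ever ran, and `dw_path` points at the destination folder rather than the download folder. A minimal sketch of the move step follows, assuming the intent is to move every finished file from the download folder into `send_path`. The function name `move_downloads` is hypothetical.

```python
import os
import shutil


def move_downloads(src_dir, dst_dir):
    """Move every regular file from src_dir into dst_dir; return the moved names."""
    os.makedirs(dst_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src) or name.endswith('.crdownload'):
            continue  # skip subfolders and unfinished Chrome downloads
        shutil.move(src, os.path.join(dst_dir, name))
        moved.append(name)
    return moved
```

With the paths from the question, this would presumably be called as `move_downloads(r'C:\Users\akira\Downloads', send_path)`.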