[Question] Scraping the Pchome stock website

Board: Python  Author: s8607142004 (挖哩勒)  Time: 3 years ago (2021/12/08 14:13)  Score: 0 (0 push, 0 boo, 3 →)

3 comments, 2 participants, latest 3 years ago. Thread 1/1

Hello everyone. I have just started learning web scraping and want to scrape the list of concept stocks from Pchome Stock. The URL is:
https://pchome.megatime.com.tw/group/sto3

Here is my code first:

import time
import requests
from bs4 import BeautifulSoup

header = {'Referer': 'http://pchome.megatime.com.tw/stock/sto3/',
          'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36'}
url = "https://pchome.megatime.com.tw/group/sto3"
r = requests.post(url,header)
r.encoding = 'UTF-8'
sp = BeautifulSoup(r.text, 'html5lib')
sp

In the sto3 Document I can see the data I need, but the scraped result contains only the following lines:

<html><head>
</head>
<body>
<form action="https://pchome.megatime.com.tw/group/sto3" id="submit_form" method="post" name="submit_form">
<input name="is_check" type="hidden" value="1"/>
</form>
<script type="text/javascript">
document.getElementById('submit_form').submit();
</script>
</body></html>

I found an earlier post saying that this is caused by incorrect header settings:
https://pttdigit.com/python/M.1485354796.A.810.html

I set my header by analogy with the method described in that post, but it still does not work. I also tried pyppeteer, and that could not get the data either. Could anyone here point me in the right direction? Thanks.

--
※ Origin: PTT (ptt.cc), From: 220.135.101.62 (Taiwan)
※ Article URL: https://www.ptt.cc/bbs/Python/M.1638972815.A.1BC.html
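One possible reading of the response above: the server returns an interstitial page whose form re-submits itself by POST with a hidden field is_check=1, and the real content may only be served once that field is sent. Note also that in requests.post(url, header) the second positional argument is data, not headers, so the dictionary is sent as the form body rather than as request headers. The sketch below is untested against the live site and is only an assumption about how it behaves. The names FormFieldParser, extract_form and fetch_group_page are hypothetical helpers introduced for illustration.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the action URL and the <input> name/value pairs of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._in_form and not self._done:
            self._in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self._in_form and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def extract_form(html):
    """Return (action, fields) for the first form found in html."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.action, parser.fields


def fetch_group_page(url, headers):
    """Hypothetical helper: POST once, then replay the interstitial form if present.

    Assumption: the site serves the real page after is_check=1 is posted back.
    """
    import requests  # third-party; imported here so extract_form works without it

    with requests.Session() as session:
        session.headers.update(headers)  # headers go here, not into data
        first = session.post(url)
        action, fields = extract_form(first.text)
        if action is None or "is_check" not in fields:
            first.encoding = "utf-8"
            return first.text  # no interstitial form detected
        second = session.post(action, data=fields)  # sends is_check=1 as form data
        second.encoding = "utf-8"
        return second.text

# Usage (performs network requests):
#   html = fetch_group_page(url, header)
#   sp = BeautifulSoup(html, 'html5lib')
```

A Session is used so that any cookies set by the first response are carried over to the second request. If the returned page is still the interstitial, the site probably relies on something else (for example JavaScript-set cookies), and this approach would not be sufficient.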

→ Woqeker  12/10 02:42, 3 years ago, 1F

→ blc  12/10 20:30, 3 years ago, 2F

→ blc  12/10 20:33, 3 years ago, 3F


Article ID (AID): #1XiBsF6y (Python)