[Question] requests.get on a URL containing Chinese characters raises BadStatusLine

Board: Python | Author: tides (monet) | Time: 2019/01/27 21:03 (5 years ago) | Votes: 0 (0 push, 0 boo, 1 →)

1 comment, 1 participant, thread 1/1

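I'm new to crawlers and want to fetch a URL that contains Chinese characters (e.g. https://dictionary.cambridge.org/zht/詞典/英語-漢語-繁體/tuple). I percent-encode the path as UTF-8 in the way shown below, but I still run into http.client.BadStatusLine. What are some possible directions for solving this?

Here is the sample code:

import requests
# import sys
from safeprint import print
import urllib.parse  # import the submodule explicitly; a bare "import urllib" does not guarantee urllib.parse is loaded

# Raw URL with unencoded Chinese characters
url1 = "https://dictionary.cambridge.org/zht/詞典/英語-漢語-繁體/tuple"
# Same URL, percent-encoded by hand
url5 = "https://dictionary.cambridge.org/zht/%E8%A9%9E%E5%85%B8/%E8%8B%B1%E8%AA%9E-%E6%BC%A2%E8%AA%9E-%E7%B9%81%E9%AB%94/tuple"

# Same URL, built by percent-encoding only the Chinese part of the path
url6a = "https://dictionary.cambridge.org/zht/"
url6b = urllib.parse.quote("詞典/英語-漢語-繁體")
url6c = "/tuple"
url6 = url6a + url6b + url6c
# url6 = url5

print(url6)
print(url5)

r = requests.get(url5)  # get error here
r.encoding = 'utf-8'
print(r.text)

--
Sent from my Windows
--
※ Origin: 批踢踢實業坊(ptt.cc), From: 114.34.37.144
※ Article URL: https://www.ptt.cc/bbs/Python/M.1548594218.A.D82.html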
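The encoding step in the post can be checked on its own, without touching the network. The sketch below uses only the standard library and compares the hand-encoded URL from the post (url5) with one built through urllib.parse.quote (url6). The names BASE, RAW_PATH and build_url are illustrative and not from the original post. If the two strings match, the percent-encoding itself is probably not what triggers BadStatusLine. This check does not diagnose the error; it only rules out an encoding mismatch.

```python
from urllib.parse import quote, unquote, urlsplit

# Illustrative constants (hypothetical names, values taken from the post)
BASE = "https://dictionary.cambridge.org/zht/"
RAW_PATH = "詞典/英語-漢語-繁體"

# Hand-encoded URL copied verbatim from the post (url5)
url5 = (
    "https://dictionary.cambridge.org/zht/"
    "%E8%A9%9E%E5%85%B8/%E8%8B%B1%E8%AA%9E-%E6%BC%A2%E8%AA%9E-%E7%B9%81%E9%AB%94"
    "/tuple"
)


def build_url(base: str, raw_path: str, tail: str) -> str:
    """Percent-encode raw_path as UTF-8 and join the three pieces.

    quote() leaves '/' untouched by default (safe='/'), so the path
    separators inside raw_path survive the encoding.
    """
    return base + quote(raw_path) + tail


url6 = build_url(BASE, RAW_PATH, "/tuple")

# Do the hand-encoded and the generated URL agree?
print("identical:", url6 == url5)

# Is the generated URL pure ASCII, as a request line needs to be?
print("ascii only:", url6.isascii())

# Round trip: decoding the path should give back the Chinese text
print("decoded path:", unquote(urlsplit(url6).path))
```

If the comparison prints True, url5 and url6 are interchangeable, and it may be more productive to look at what the server actually sends back for the request than at the URL encoding.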

→

01/31 05:51, 5 years ago, 1F


Article ID (AID): #1SJQmgs2 (Python)