Re: [Question] Parsing text
(Original post omitted.)
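I'd like to ask the more experienced people here.
I've run into a new problem while parsing a web page.
When I use the tool I was already using, simple_parser_dom, to parse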
http://tour.taitung.gov.tw/zh-tw/Home/Index
it fails with an error.
Question 1: How can I fix this?
After that I did some digging around
and tried a different approach, curl:
<?php
# Use the cURL extension to fetch the page
$url = "http://tour.taitung.gov.tw/zh-tw/Home/Index";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$html = curl_exec($ch);
curl_close($ch);
# Create a DOM parser object
$dom = new DOMDocument();
# Parse the fetched HTML.
# The @ before the method call suppresses any warnings that
# loadHTML might raise because of invalid HTML in the page.
@$dom->loadHTML($html);
# Iterate over all the <a> tags
foreach ($dom->getElementsByTagName('a') as $link) {
    # Show the <a href>
    echo $link->getAttribute('href');
    echo "<br />";
}
# Show the title attribute of each <a> tag
foreach ($dom->getElementsByTagName('a') as $v) {
    echo $v->getAttribute('title');
    echo "<br />";
}
?>
The code above does parse the page, but the text that comes back is garbled.
I tried adding
$v = mb_convert_encoding($v,"BIG5","UTF-8");
but that throws an error.
How can I fix this?
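
One possible direction for the garbled text, offered as a sketch rather than a confirmed fix. It assumes the page is served as UTF-8, which the post does not state. When the markup carries no charset hint that libxml recognizes, DOMDocument::loadHTML treats the bytes as ISO-8859-1, and this commonly produces this kind of garbling. Prefixing the markup with an XML encoding declaration is a widely used workaround. Separately, the mb_convert_encoding call fails because $v is a DOMElement, not a string; the function needs the string returned by getAttribute. The helper name extractLinkTitles and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper (not from the original post): parse HTML that is
// assumed to be UTF-8 and return the title attribute of every <a> element.
function extractLinkTitles(string $html): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings about malformed markup instead of using @.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix tells libxml the input bytes are UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = [];
    foreach ($dom->getElementsByTagName('a') as $link) {
        // getAttribute returns a plain string; $link itself is a DOMElement.
        $titles[] = $link->getAttribute('title');
    }
    return $titles;
}

// Sample markup standing in for the fetched page (assumed UTF-8).
$sample = '<html><body><a href="/a" title="台東觀光">A</a></body></html>';
foreach (extractLinkTitles($sample) as $title) {
    echo $title, "<br />";
}
```

If the output page really has to be Big5, the conversion would be applied to each string, for example mb_convert_encoding($title, "BIG5", "UTF-8"), and not to the DOM node.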
--
※ Posted via: PTT (ptt.cc), from: 49.158.112.110
※ Article URL: http://www.ptt.cc/bbs/PHP/M.1411883164.A.113.html