[Question] rmr/rhdfs problems with R + Hadoop
[Goal]
Set up three virtual machines,
install RHadoop on them,
and run a simple rmr example.
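[Problem summary]=============================================================
I built a Hadoop environment on three virtual machines:
master, node1, node2
Hadoop is installed and works.
After installing R, however, I cannot load rmr or rhdfs with library(),
and install.packages() fails for the rJava package.
I have set up this environment twice.
The first time (environment 1), install.packages() somehow succeeded for
rJava, but on the second install (environment 2) rJava will not install no matter what I try.
Without rJava, rmr and rhdfs cannot be used either.
[Environment 1] (side note: this one somehow worked earlier)=====================================
I don't remember exactly how I installed it, but it was about the same as the second time.
It still seems to have some problems (video link below).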
https://www.youtube.com/watch?v=ByAisA_dQxI&feature=youtu.be
[Environment 2] (the main problem: installation currently fails)=============================
[Installation steps]
Here is how I installed it (long post; includes a video of the steps):
https://www.youtube.com/watch?v=QTHfV_xYr8A&t=145s
The video is unedited; 7:04-14:30 can be skipped (a long download).
Start Hadoop
cd ~/hadoop && sbin/start-all.sh
------------------------------------------------
sudo vim .bashrc
sudo vim /etc/environment
sudo vim /etc/profile
Add the following paths to all three files above:
export JAVA_HOME=/usr/lib/jvm/jdk/
export HADOOP_CMD=/home/hduser/hadoop/bin/hadoop
export HADOOP_HOME=/home/hduser/hadoop
export HADOOP_STREAMING=/home/hduser/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar
Reload the three files
. /etc/environment
. /etc/profile
source .bashrc
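Sourcing these files only affects the current shell, so it can help to confirm that each variable is set and points at something that exists before starting R. A minimal sketch, assuming a POSIX shell; `check_hadoop_env` is a hypothetical helper name, not part of Hadoop or R:

```shell
# Report whether each Hadoop-related variable is set and points at an existing path.
# Returns 0 only when every variable passes both checks.
check_hadoop_env() {
    rc=0
    for name in JAVA_HOME HADOOP_CMD HADOOP_HOME HADOOP_STREAMING; do
        # Indirect lookup: read the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name: not set"
            rc=1
        elif [ ! -e "$value" ]; then
            echo "$name: $value does not exist"
            rc=1
        else
            echo "$name: ok"
        fi
    done
    return "$rc"
}
```

Calling `check_hadoop_env` right after the `source` commands prints one line per variable. Note that `sudo` may reset the environment, so a `sudo R` session does not necessarily see the same values as the calling shell.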
-----------------------------------------------------------------------------
R must be installed on all three machines
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install r-base
sudo apt-get install r-base r-base-dev
Java configuration------------------
echo $JAVA_HOME
sudo JAVA_HOME=/usr/lib/jvm/jdk/ R CMD javareconf
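Building rJava from source needs a full JDK (compiler and JNI headers), not only a Java runtime, so one thing worth ruling out is whether the directory passed as JAVA_HOME really is a JDK. A rough sketch under that assumption; `check_jdk` is a hypothetical helper, and the two paths it tests are the conventional JDK layout rather than anything taken from this setup:

```shell
# Check that a JAVA_HOME candidate looks like a full JDK rather than a JRE.
# Usage: check_jdk /usr/lib/jvm/jdk/
check_jdk() {
    jdk_dir=${1%/}          # drop one trailing slash, if present
    rc=0
    if [ ! -x "$jdk_dir/bin/javac" ]; then
        echo "missing: $jdk_dir/bin/javac"
        rc=1
    fi
    if [ ! -f "$jdk_dir/include/jni.h" ]; then
        echo "missing: $jdk_dir/include/jni.h"
        rc=1
    fi
    if [ "$rc" -eq 0 ]; then
        echo "looks like a JDK: $jdk_dir"
    fi
    return "$rc"
}
```

If either file is reported missing, `R CMD javareconf` may not be able to produce settings that rJava can compile against. On Ubuntu, the prebuilt `r-cran-rjava` package from apt is another possible route that avoids compiling rJava at all.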
In R--------------------
Start R
sudo R
This step fails: rJava cannot be installed (this is the actual problem) (note 1)
install.packages(c("codetools","R","Rcpp","RJSONIO","bitops","digest","functional","stringr","plyr","reshape2","rJava","caTools"))
Download rmr and rhdfs
wget --no-check-certificate https://raw.github.com/RevolutionAnalytics/rmr2/3.3.0/build/rmr2_3.3.0.tar.gz
wget --no-check-certificate https://raw.github.com/RevolutionAnalytics/rhdfs/master/build/rhdfs_1.0.8.tar.gz
In R----------------------
The second install fails here (note 2)
install.packages("/home/hduser/rhdfs_1.0.8.tar.gz", repos=NULL, type="source")
install.packages("/home/hduser/rmr2_2.2.2.tar.gz", repos = NULL, type="source")
Sys.setenv(HADOOP_HOME="/home/hduser/hadoop")
Sys.setenv(HADOOP_PREFIX="/home/hduser/hadoop")
Sys.setenv(HADOOP_CMD="/home/hduser/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/home/hduser/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar")
Sys.getenv("HADOOP_CMD")
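One detail in the steps above: the wget step saves rmr2_3.3.0.tar.gz, while the install command names rmr2_2.2.2.tar.gz. If only the 3.3.0 file is present, that install would fail regardless of rJava. A small sketch for checking the tarballs before starting R; `check_tarballs` is a hypothetical helper, and it assumes the files were downloaded into /home/hduser:

```shell
# Report which of the given source tarballs exist.
# Returns 0 only when every path names a regular file.
check_tarballs() {
    rc=0
    for tarball in "$@"; do
        if [ -f "$tarball" ]; then
            echo "found: $tarball"
        else
            echo "not found: $tarball"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `check_tarballs /home/hduser/rhdfs_1.0.8.tar.gz /home/hduser/rmr2_3.3.0.tar.gz` should print two "found" lines if both downloads landed where the install commands expect them.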
No need to read the rest... none of it works
------------------------------------
Once rJava failed to install, every later step failed as well.
[Error messages]=============================================================
(Installing environment 2)
https://www.youtube.com/watch?v=QTHfV_xYr8A&t=145s
7:04-14:30 can be skipped (a long download).
(Note 1) 14:30
After install.packages(c("codetools","R","Rcpp","rJava"... it fails with these messages:
...Warning messages:
1: package 'R' is not available (for R version 3.2.3)
2: In install.packages(c(......) 'rJava' had non-zero exit status
(Note 2) 16:38
install.packages("/home/hduser/rhdfs_1.0.8.tar.gz", repos=NULL, type="source")
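install.packages("/home/hduser/rmr2_2.2.2.tar.gz", repos = NULL, type="source")
Both show:
Would you like to use a personal library instead?
I could only answer y.
I suspect something went wrong here too,
because the RDataMining site says the library should be set up for all users (I don't really understand what that means):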
http://www.rdatamining.com/big-data/r-hadoop-setup-guide
See section 7.1, "Install relevant R packages".
Original text:
RHadoop packages are dependent on above packages,
which should be installed for all users, instead of in personal library.
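....(long; the rest is omitted)
But I don't know how to make those packages available to all users
before running install.packages (I assume "all users" here means the three machines?)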
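As far as I understand it, "all users" in the quoted guide most likely means all user accounts on one machine, i.e. a system-wide library rather than a per-user one, and R offers a personal library when the default library directory is not writable by whoever started R. A sketch for checking that from the shell; `lib_status` is a hypothetical helper, and /usr/local/lib/R/site-library is assumed to be the site library location on a Debian/Ubuntu install of R:

```shell
# Classify an R library directory by whether the current user can write to it.
# Prints one of: missing, writable, read-only.
lib_status() {
    lib_dir=$1
    if [ ! -d "$lib_dir" ]; then
        echo "missing"
    elif [ -w "$lib_dir" ]; then
        echo "writable"
    else
        echo "read-only"
    fi
}
```

Running `lib_status /usr/local/lib/R/site-library` as the normal user and again under `sudo` shows whether the personal-library prompt is expected for that session. Separately, the packages presumably still have to be installed on each of the three machines, since each node runs its own R.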
[Versions]=============================================================
(same on all three machines)
ubuntu 16.04.1
hadoop-2.7.3
R 3.2.3
rmr2 3.3.0
rhdfs 1.0.8
--
※ Posted via: PTT (ptt.cc), from: 140.128.101.143
※ Article URL: https://www.ptt.cc/bbs/Linux/M.1482850033.A.98F.html