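Re: [Program] Counting the number of points in R
※ Quoting jackhzt (巴克球):
: [Software category]: R
: [Type of question]: Data processing
: [Familiarity]: Familiar
: [Problem description]: Suppose I have data (x1,y1),(x2,y2),(x3,y3)... and I draw the points with plot.
: Now I want to partition the coordinate plane they lie in into 4 regions of equal area.
: Is there code that can count the total number of points inside each region?
: [Code example]:
: EX:
: x=rnorm(10,0,1)
: y=rnorm(10,2,1)
: z=data.frame(x,y)
: plot(z) # draw the points
: Suppose I now want to split the plane into four blocks along x = 0 and y = 0.
: Is there a way to count the points in each of these 4 blocks?
: My current approach is to take the points and compare them against the x split first, then the y split,
: but with many records or many splits this gets complicated. Is there a better way?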
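1. If you know dplyr
library(dplyr)
library(magrittr)
set.seed(2)
x = rnorm(10, 0, 1)
y = rnorm(10, 1, 1)
z = data.frame(x, y)
a = 0; b = 0  # split lines: x = a and y = b
# flag which side of each split line a point is on, then count rows per group
z_append = z %>% mutate(x_gt_a = x > a, y_gt_b = y > b) %>%
  group_by(x_gt_a, y_gt_b) %>% summarise(count = n())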
#
# Source: local data frame [4 x 3]
# Groups: x_gt_a y_gt_b
#
# x_gt_a y_gt_b count
# (lgl) (lgl) (int)
# 1 FALSE FALSE 1
# 2 FALSE TRUE 4
# 3 TRUE FALSE 1
# 4 TRUE TRUE 4
# For multiple cut points, see item 3.
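As a shorter variant of item 1, the mutate / group_by / summarise chain can be collapsed into a single count() call. This is only a sketch and assumes a reasonably recent dplyr (one where count() accepts the name argument); the variable name quadrant_counts is made up for illustration.

```r
library(dplyr)

set.seed(2)
z <- data.frame(x = rnorm(10, 0, 1), y = rnorm(10, 1, 1))
a <- 0; b <- 0  # split lines: x = a and y = b

# count() groups by the given expressions and tallies the rows in one step;
# name = "count" sets the name of the tally column (the default is "n")
quadrant_counts <- z %>%
  count(x_gt_a = x > a, y_gt_b = y > b, name = "count")
print(quadrant_counts)
```

Like the group_by version, this only lists combinations that actually occur, so an empty quadrant would simply be missing from the result.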
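2. If you don't know dplyr
# rows of mat correspond to x > a, columns to y > b
mat = tapply(rep(1, nrow(z)), list(z$x > a, z$y > b), length)
# as.vector() unrolls mat column by column, so the row (x) labels cycle fastest
z_count = data.frame(x_gt_a = rep(rownames(mat), 2),
  y_gt_b = rep(colnames(mat), each = 2), count = as.vector(mat))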
# x_gt_a y_gt_b count
# 1 FALSE FALSE 1
# 2 TRUE FALSE 1
# 3 FALSE TRUE 4
# 4 TRUE TRUE 4
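One caveat with the tapply approach is that a quadrant with no points comes back as NA rather than 0. A possible base R alternative, sketched here under the same setup as item 1, is to cross-tabulate the two conditions with table(); the names tab and z_count are just illustrative.

```r
set.seed(2)
x <- rnorm(10, 0, 1)
y <- rnorm(10, 1, 1)
z <- data.frame(x, y)
a <- 0; b <- 0  # split lines: x = a and y = b

# table() cross-tabulates the two logical conditions;
# a combination with no points shows up as 0 instead of NA
tab <- table(x_gt_a = z$x > a, y_gt_b = z$y > b)

# as.data.frame() on a table gives one row per cell;
# responseName renames the count column (default "Freq")
z_count <- as.data.frame(tab, responseName = "count")
print(z_count)
```

The tab object is itself a 2 x 2 matrix-like table, so it can be used directly when a matrix layout is preferred over the long data frame.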
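3. Multiple cut points: cut (findInterval works along the same lines)
library(dplyr)
library(magrittr)
set.seed(2)
x = rnorm(1000, 0, 1)
y = rnorm(1000, 1, 1)
z = data.frame(x = x, y = y)
# -Inf and Inf at the ends make sure every point falls into some interval
x_cutPoints = c(-Inf, seq(-1, 1, by = 0.1), Inf)
y_cutPoints = c(-Inf, seq(-1, 1, by = 0.1), Inf)
# cut() labels each value with its interval; then count rows per (x, y) interval pair
count_df = z %>% mutate(cutPointsGroup_x = cut(x, x_cutPoints),
  cutPointsGroup_y = cut(y, y_cutPoints)) %>%
  group_by(cutPointsGroup_x, cutPointsGroup_y) %>%
  summarise(count = n())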
#
# Source: local data frame [276 x 3]
# Groups: cutPointsGroup_x, cutPointsGroup_y
#
# cutPointsGroup_x cutPointsGroup_y count
# (fctr) (fctr) (int)
# 1 (-Inf,-1] (-Inf,-1] 3
# 2 (-Inf,-1] (-1,-0.9] 2
# 3 (-Inf,-1] (-0.8,-0.7] 1
# 4 (-Inf,-1] (-0.7,-0.6] 4
# 5 (-Inf,-1] (-0.6,-0.5] 2
# 6 (-Inf,-1] (-0.5,-0.4] 2
# 7 (-Inf,-1] (-0.4,-0.3] 4
# 8 (-Inf,-1] (-0.3,-0.2] 2
# 9 (-Inf,-1] (-0.2,-0.1] 2
# 10 (-Inf,-1] (-0.1,0] 6
# .. ... ... ...
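findInterval() is a base R alternative to cut() for the same binning: it returns integer bin indices instead of factor labels. The following is a rough sketch rather than a drop-in replacement; cut_points, ix, iy, n_bins and grid_counts are names invented for this example.

```r
set.seed(2)
z <- data.frame(x = rnorm(1000, 0, 1), y = rnorm(1000, 1, 1))
cut_points <- c(-Inf, seq(-1, 1, by = 0.1), Inf)

# findInterval() returns, for each value, the index i such that
# cut_points[i] <= value < cut_points[i + 1] (left-closed by default)
ix <- findInterval(z$x, cut_points)
iy <- findInterval(z$y, cut_points)

# Fixing the factor levels keeps empty cells in the table as 0
n_bins <- length(cut_points) - 1
grid_counts <- table(x_bin = factor(ix, levels = seq_len(n_bins)),
                     y_bin = factor(iy, levels = seq_len(n_bins)))
dim(grid_counts)   # one row per x bin, one column per y bin
sum(grid_counts)   # every point is counted exactly once
```

Note that cut() uses right-closed intervals (a, b] by default while findInterval() uses left-closed ones [a, b); passing left.open = TRUE to findInterval() should give the same convention as cut(). The result is already in matrix form, with all bins present.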
--
Series of posts on R data-wrangling packages:
magrittr #1LhSWhpH (R_Language) http://tinyurl.com/1LhSWhpH
data.table #1LhW7Tvj (R_Language) http://tinyurl.com/1LhW7Tvj
dplyr (part 1) #1LhpJCfB (R_Language) http://tinyurl.com/1LhpJCfB
dplyr (part 2) #1Lhw8b-s (R_Language)
tidyr #1Liqls1R (R_Language) http://tinyurl.com/1Liqls1R
--
※ Posted from: PTT (ptt.cc), from: 140.109.73.231
※ Article URL: https://www.ptt.cc/bbs/Statistics/M.1451496691.A.D63.html
Push 12/31 01:43, 1F
→ 12/31 01:43, 2F
→ 12/31 01:44, 3F
If it gets cut into more pieces, I'll have to think about it some more XD
→ 12/31 01:45, 4F
→ 12/31 01:47, 5F
Please see item 3.
Push 12/31 02:00, 6F
Also, to turn the result into a matrix, look at reshape2::dcast, tidyr::spread or
data.table::dcast.data.table
※ Edited: celestialgod (140.109.73.231), 12/31/2015 02:09:43
Push 12/31 02:20, 7F