"grouped/clustered" Regions In Vector In R/python
I am struggling a bit with the following problem. I would like to find 'grouped/clustered' regions of 1s based on the following criteria: Starting with the position of the first 1, if in window after
Solution 1:
You can do this easily by run-length encoding the vector with rle():
x <- c(1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,0,1,0,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,1,0,0,1,1,1,
1,0,1,0,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,1)
which(x == 1)
#[1] 1 15 62 63 67 88 89 91 98 102 103 104 111 114 115 116 117 119 121 125 127 145 150 160 164 166
window <- 5
#run-length encode the vector and print the result
y <- rle(x)
y
#Run Length Encoding
# lengths: int [1:37] 1 13 1 46 2 3 1 20 2 1 ...
# values : num [1:37] 1 0 1 0 1 0 1 0 1 0 ...
#if a run of zeros is shorter than window, replace its value with 1
y$values[(y$values == 0) & (y$lengths < window)] <- 1
#decode and re-encode so that adjacent runs of ones are combined
y <- rle(inverse.rle(y))
start <- cumsum(y$lengths)[y$values == 1] - y$lengths[y$values == 1] + 1
#[1] 1 15 62 88 98 111 145 160
end <- cumsum(y$lengths)[y$values == 1]
#[1] 1 15 67 91 104 127 150 166
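Since the question also mentions Python, here is a rough port of the same idea, offered as a sketch rather than a tested equivalent of the R code. The function name `clustered_regions` is made up for illustration, and `itertools.groupby` stands in for `rle()`. Positions are reported 1-based to match the R output, and `x` is rebuilt from the `which(x == 1)` positions shown above.

```python
from itertools import groupby


def clustered_regions(x, window):
    """Return 1-based (start, end) pairs for regions of 1s.

    Runs of zeros shorter than `window` are treated as part of a region,
    mirroring the rle()-based R code above. (Hypothetical helper name.)
    """
    # Step 1: run-length encode and overwrite short zero runs with ones.
    merged = []
    for value, group in groupby(x):
        length = len(list(group))
        if value == 0 and length < window:
            value = 1
        merged.extend([value] * length)

    # Step 2: re-encode the merged vector and record each run of ones.
    regions = []
    pos = 0  # number of elements consumed so far
    for value, group in groupby(merged):
        length = len(list(group))
        if value == 1:
            regions.append((pos + 1, pos + length))
        pos += length
    return regions


# Rebuild the example vector from the positions of the 1s (1-based).
ones = [1, 15, 62, 63, 67, 88, 89, 91, 98, 102, 103, 104, 111, 114, 115,
        116, 117, 119, 121, 125, 127, 145, 150, 160, 164, 166]
x = [0] * 166
for p in ones:
    x[p - 1] = 1

regions = clustered_regions(x, window=5)
print(regions)
# [(1, 1), (15, 15), (62, 67), (88, 91), (98, 104), (111, 127), (145, 150), (160, 166)]
```

One thing to be aware of: like the R version, this also turns a short run of zeros at the very start or end of the vector into 1s. The example vector starts and ends with a 1, so it is not affected, but you may want to handle that case separately for other inputs.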