A multidimensional "which" function

update Henrik Bengtsson commented that which(x, arr.ind=TRUE) gives the same result, rendering the blog below academic (thanks for the comment!). So, for academic interest, I'll leave it. In my defense, I implemented this kind of functionality in C some time ago, so I did not even think about RTFM. (So there's a lesson in this blog after all 😉 )

The well-known which function accepts a logical vector and returns the indices where its value equals TRUE. Actually, which also accepts matrices or multidimensional arrays. Internally, R uses a single index to run through such two- or higher-dimensional structures, in a column-first fashion. This is easy for computers, but for us poor humans it is less readable. The following function gives the multi-index of TRUE values into any d1 x d2 x ... x dn - dimensional array.

# A which for multidimensional arrays.
# Mark van der Loo 16.09.2011
#
# A Array of booleans
# returns a sum(A) x length(dim(A)) array of multi-indices where A == TRUE
#
multi.which <- function(A){
if ( is.vector(A) ) return(which(A))
d <- dim(A)
T <- which(A) - 1
nd <- length(d)
t( sapply(T, function(t){
I <- integer(nd)
I <- t %% d
sapply(2:nd, function(j){
I[j] <<- (t %/% prod(d[1:(j-1)])) %% d[j]
})
I
}) + 1 )
}

For example. Let's create a 2x3x2 logical array (2 rows, three columns, and this structure times 2):

> set.seed(1)
> set.seed(1)
> (B <- array(sample(c(TRUE,FALSE),12,replace=TRUE),dim=c(2,3,2)) )
, , 1

[,1]  [,2]  [,3]
[1,] TRUE FALSE  TRUE
[2,] TRUE FALSE FALSE

, , 2

[,1]  [,2] [,3]
[1,] FALSE FALSE TRUE
[2,] FALSE  TRUE TRUE

The standard which function gives 1-dimensional indices:

> which(B)
  1  2  5 10 11 12

If you don't need to see the result, this is fine. However, sometimes it is convenient to have the multi-index available. For example, the element in the first row of the first column of the first matrix of B equals TRUE. That is, element (1,1,1). The multi.which function returns all multi-indices where coefficients are TRUE:

> multi.which(B)
[,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    1    1
[3,]    1    3    1
[4,]    2    2    2
[5,]    1    3    2
[6,]    2    3    2

The result is a 2-dimensional array, where each row is a single multi-index. You can check the last row by confirming that the second row of the third column of the second matrix indeed has coefficient TRUE. As noted, the function works for any multidimensional array (including vectors and matrices).

So, how does it all work? I will just give the basic equation here, but see this paper for a more thorough description and the inverse relation. Basically, you can regard the multi-index as a positional number system, with the first index running fastest. (Remember, that our decimal notation system is a positional system, but with the first number running slowest).

Denote the single index in a d1 x d2 x ... x dn - dimensional array with t. The multi-index I may be written as

where

and the product equals 1 if j=1. The symbols div and mod stand for integer division and remainder upon division. This equation assumes base 0 indexing, meaning that both the single and multi-indexing start at 0. Since R uses base 1 indexing, the first and 11th line in multi.which first subtract, then add one to correct for this.

This entry was posted in R and tagged . Bookmark the permalink.

1 Response to A multidimensional "which" function

1. Henrik Bengtsson says:

Found this post via the R-bloggers feed:

I don't want to take away your efforts, and I think I did the same fruitful exercise early on, but sometime reading the help pages will save you lots of time 😉 So...

which(B, arr.ind=TRUE)