Extracting Columns from a Tibble

February 18, 2020    tibble

This post is both a follow-up to my earlier post, "tibbles, data frames and vectors", and an answer to a query that came up at work.

The central question was: how can you extract a column from a tibble as part of a pipe? That might sound straightforward, but let’s take a look.

# load packages and create a tibble to work with
library(tibble)
library(dplyr)
test <- tibble(id = c("a", "b", "a"), val = c(1, 2, 3))
test
# A tibble: 3 × 2
  id      val
  <chr> <dbl>
1 a         1
2 b         2
3 a         3

Let’s run a quick filter() and then select() the values.

test %>%
  filter(id == "a") %>%
  select(val)
# A tibble: 2 × 1
    val
  <dbl>
1     1
2     3

But as we saw in my previous post, this is still a tibble (select() returns a tibble even when only one column is kept), so if we pass it to something expecting a vector it doesn’t work.

test %>%
  filter(id == "a") %>%
  select(val) %>%
  mean()
Warning in mean.default(.): argument is not numeric or logical: returning NA
[1] NA
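
To see why mean() gives up here, it can help to check what select() hands over. The following is a minimal sketch that rebuilds the same test tibble so it runs on its own; the variable name selected is just a label chosen for this example.

```r
library(tibble)
library(dplyr)

test <- tibble(id = c("a", "b", "a"), val = c(1, 2, 3))

# select() keeps the tibble wrapper, even for a single column
selected <- test %>%
  filter(id == "a") %>%
  select(val)

is_tibble(selected)       # TRUE: still a tibble
is.numeric(selected)      # FALSE: this is what mean() complains about
is.numeric(selected$val)  # TRUE: the column inside is a plain numeric vector
```

So the data is fine; it is only the tibble wrapper around the column that gets in the way.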

The Best Solution

The best solution to this problem is to use the pull() function. I missed this function when it first came out in dplyr 0.7.0 but now use it a lot.

# pull() gives us a vector 
test %>%
  filter(id == "a") %>%
  pull(val) 
[1] 1 3
# mean() now receives a numeric vector, so there is no warning
test %>%
  filter(id == "a") %>%
  pull(val) %>%
  mean()
[1] 2
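
pull() can also pick a column by position rather than by name. The sketch below shows a few variations on the same test tibble; it assumes dplyr 1.0.0 or later for the name argument, and the variable names on the left are only labels for this example.

```r
library(tibble)
library(dplyr)

test <- tibble(id = c("a", "b", "a"), val = c(1, 2, 3))

# by name, as in the example above
by_name <- test %>% pull(val)

# by position, counting from the left
by_position <- test %>% pull(2)

# negative positions count from the right; -1 is the last column
last_column <- test %>% pull(-1)

# name = takes the vector's names from another column
# (assumes dplyr >= 1.0.0)
named <- test %>% pull(val, name = id)
```

Pulling by name is usually the safer choice in a long pipe, since it does not depend on the column order surviving earlier steps.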

Other Solutions

As with many, many things in R, there are alternatives, and for the moment we are going to keep working within a pipe. Some neat alternatives come from a special type of function called an infix function. You use these all the time: + and %>% are both examples. Infix functions sit between their arguments, as in x + y, but they can also be called in the more traditional f(x, y) form. To call them this way you wrap the name in backticks, so 2 ^ 3 becomes `^`(2, 3), and both return 8:

# Power
2 ^ 3
[1] 8
`^`(2,3)
[1] 8
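
The same idea applies to any infix operator, not just ^. As a small base R sketch: the operator %paste% below is a hypothetical name made up for this example, as are the variable names.

```r
# an infix operator and its backticked function form give the same result
sum_infix  <- 1 + 2
sum_prefix <- `+`(1, 2)

# because operators are functions, they can be passed to other functions
running_total <- Reduce(`+`, 1:4)

# user-defined infix operators are wrapped in percent signs, like %>%;
# %paste% is a hypothetical operator defined only for this example
`%paste%` <- function(lhs, rhs) paste(lhs, rhs)
greeting <- "hello" %paste% "world"
```

This is also how %>% itself is built: it is an ordinary function whose name happens to start and end with a percent sign.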

The extraction operators [[ and $ can be called in the same way. Take our tibble.

test$val
[1] 1 2 3
`$`(test, val)
[1] 1 2 3

or

test[[2]]
[1] 1 2 3
`[[`(test, 2)
[1] 1 2 3

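This means we can put either operator, written in its backticked function form, at the end of our pipe. The pipe supplies the filtered tibble as the first argument.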

test %>%
  filter(id == "a") %>%
  `[[`(2)
[1] 1 3
test %>%
  filter(id == "a") %>%
  `$`(val)
[1] 1 3
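
One thing to watch with `[[`(2) is that it depends on column order. `[[` also accepts a column name as a string, which is sturdier. The sketch below assumes the same test tibble; reordered and the other variable names are labels chosen for this example.

```r
library(tibble)
library(dplyr)

test <- tibble(id = c("a", "b", "a"), val = c(1, 2, 3))

# by position, as above
by_position <- test %>%
  filter(id == "a") %>%
  `[[`(2)

# by name, which does not depend on where the column sits
by_name <- test %>%
  filter(id == "a") %>%
  `[[`("val")

# swap the column order: position 2 is now id, not val
reordered <- test %>%
  select(val, id) %>%
  filter(id == "a")

second_column <- reordered %>% `[[`(2)      # the id column
val_column    <- reordered %>% `[[`("val")  # still the val column
```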

Finally, we could just use getElement().

test %>%
  filter(id == "a") %>%
  getElement("val")
[1] 1 3
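
As a quick sanity check, the approaches in this post can be compared side by side. This is only a sketch on the small test tibble, with filtered, results and all_same as names chosen for the example.

```r
library(tibble)
library(dplyr)

test <- tibble(id = c("a", "b", "a"), val = c(1, 2, 3))
filtered <- test %>% filter(id == "a")

# one entry per extraction approach used in this post
results <- list(
  pull       = filtered %>% pull(val),
  dollar     = filtered %>% `$`(val),
  brackets   = filtered %>% `[[`("val"),
  getElement = filtered %>% getElement("val")
)

# compare every result against the pull() version
all_same <- vapply(results, identical, logical(1), y = results$pull)
all(all_same)
```

If every approach returns the same plain vector, all(all_same) is TRUE, and the choice between them comes down to readability. pull() states the intent most clearly.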