This is both a follow-up to my earlier post, tibbles, data frames and vectors, and a response to a query that came up at work.
The central question was: how can you extract a column from a tibble as part of a pipe? That might sound straightforward, but let’s take a look.
# load packages and create a tibble to work with
library(tibble)
library(dplyr)
test = tibble(id = c("a", "b", "a"), val = c(1, 2, 3))
test
# A tibble: 3 × 2
id val
<chr> <dbl>
1 a 1
2 b 2
3 a 3
Let’s run a quick filter() and then select() the values.
test %>%
filter(id == "a") %>%
select(val)
# A tibble: 2 × 1
val
<dbl>
1 1
2 3
But as we saw in my previous post, this is still a tibble, so if we pass it to something expecting a vector it doesn’t work.
test %>%
filter(id == "a") %>%
select(val) %>%
mean()
Warning in mean.default(.): argument is not numeric or logical: returning NA
[1] NA
The best solution to this problem is to use the pull() function. I missed this function when it first came out in dplyr 0.7.0 but now use it a lot.
# pull() gives us a vector
test %>%
filter(id == "a") %>%
pull(val)
[1] 1 3
# now mean() gets a numeric vector, so there is no warning and no NA
test %>%
filter(id == "a") %>%
pull(val) %>%
mean()
[1] 2
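As a small aside, pull() does not have to be given a column name: it also accepts a column position, and with no argument it falls back to the last column. A minimal sketch, assuming the same test tibble as above (the names by_name, by_position and last_column are only there for illustration):

```r
library(tibble)
library(dplyr)

test <- tibble(id = c("a", "b", "a"), val = c(1, 2, 3))

# pull() by column name
by_name <- test %>% pull(val)

# pull() by position, counting from the left
by_position <- test %>% pull(2)

# with no argument, pull() takes the last column
last_column <- test %>% pull()
```

All three return the same numeric vector here, because val is both the second and the last column of test.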
As with many, many things in R, there are alternatives, and for the moment we are just going to continue to work with a pipe. Some neat alternatives come from a special type of function called an infix function. You use these all the time, + and %>% being examples. These infix functions sit between their arguments, as in x + y, but they can also be called in the more traditional f(x, y) manner. To call them this way you need to wrap the name in backticks, so 2^3 = 8 becomes `^`(2,3) = 8:
# Power
2 ^ 3
[1] 8
`^`(2,3)
[1] 8
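The same idea applies to infix functions you write yourself. The sketch below defines a made-up operator, %then% (a hypothetical name chosen for this example, not something from dplyr or magrittr), and calls it both ways:

```r
# define a user-made infix function; the name must be wrapped in % signs
`%then%` <- function(lhs, rhs) paste(lhs, rhs)

# infix form: the operator sits between its arguments
infix_result <- "filter" %then% "pull"

# prefix form: backticks let us call it like any other function
prefix_result <- `%then%`("filter", "pull")
```

Both calls give the string "filter pull", which shows that the infix form is just another way of writing an ordinary function call.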
This means that we can use [[]] and $ like this too. Take our tibble.
test$val
[1] 1 2 3
`$`(test, val)
[1] 1 2 3
or
test[[2]]
[1] 1 2 3
`[[`(test, 2)
[1] 1 2 3
This means we can add an infix function to the end of our pipe.
test %>%
filter(id == "a") %>%
`[[`(2)
[1] 1 3
test %>%
filter(id == "a") %>%
`$`(val)
[1] 1 3
Finally, we could just use getElement().
test %>%
filter(id == "a") %>%
getElement("val")
[1] 1 3
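To convince ourselves that these approaches really are interchangeable, here is a quick check that runs each one on the filtered tibble and compares the results. It is only a sketch: the names filtered, results and all_same are made up for this example.

```r
library(tibble)
library(dplyr)

test <- tibble(id = c("a", "b", "a"), val = c(1, 2, 3))

filtered <- test %>% filter(id == "a")

# extract the val column with each approach from this post
results <- list(
  pull           = filtered %>% pull(val),
  double_bracket = filtered %>% `[[`("val"),
  dollar         = filtered %>% `$`(val),
  get_element    = filtered %>% getElement("val")
)

# TRUE only if every approach returned exactly the same vector
all_same <- all(vapply(results, identical, logical(1), results$pull))
all_same
```

Since they all return the same vector, which one to use mostly comes down to readability, and pull() tends to read best at the end of a dplyr pipe.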