1.3. Writing Code

1.3.2. Pipe Operators in R

Pipe operators allow you to write cleaner, more readable code by chaining operations together. Instead of nesting functions or storing intermediate results, you can express a sequence of actions as a straightforward flow. Pipes are especially useful for data transformation and analysis workflows.


1. %>% — magrittr Pipe (from {magrittr} / {dplyr})

  • Passes the result on the left as the first argument to the function on the right.
  • Enables readable, step-by-step pipelines.
  • Supports the use of the dot (.) as a placeholder for the left-hand side value, useful for functions where the first argument is not the main input.
  • Widely used in the tidyverse ecosystem.

Example:

library(dplyr)
labs <- tibble::tibble(
  patient_id = 1:5,
  status = c("active", "inactive", "active", "active", "inactive"),
  weight = c(60, 72, 85, 90, 78)
)
labs2 <- labs %>%
  filter(status == "active") %>%
  mutate(weight_kg = weight / 1000) %>%
  arrange(desc(weight_kg))

Input Table (labs):

patient_id status weight
1 active 60
2 inactive 72
3 active 85
4 active 90
5 inactive 78

Output Table (labs2):

patient_id status weight weight_kg
4 active 90 0.09
3 active 85 0.085
1 active 60 0.06
  • Here, labs is filtered for active status, a new column is created, and the result is sorted—all in one readable chain.

Using the dot placeholder:

labs <- tibble::tibble(
  patient_id = 1:5,
  weight = c(60, 72, NA, 85, 90)
)

labs %>%
  select(weight) %>%
  sum(., na.rm = TRUE)

# Output:
# [1] 307

# This is 60 + 72 + 85 + 90; the missing value (NA) is dropped because na.rm = TRUE.

Input Table (labs):

patient_id weight
1 60
2 72
3 NA
4 85
5 90

Output:
307

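  • sum(., na.rm = TRUE): select(weight) returns a one-column tibble rather than a bare vector, and sum() adds up every value in it. The dot (.) marks explicitly where the piped result goes; because it sits in the first argument position here, the call is equivalent to sum(na.rm = TRUE). The placeholder only becomes necessary when the piped value must go somewhere other than the first argument.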
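The placeholder matters most when the piped value is not the first argument. A minimal sketch, assuming magrittr is installed; the ids vector is made up for illustration and is not part of the labs example:

```r
library(magrittr)

# Hypothetical input vector, used only for illustration.
ids <- c("x", "y")

# Without a placeholder the piped value lands in the first argument.
first_arg <- ids %>% paste("id", sep = "_")

# With the dot, the piped value is placed second, after the "id" prefix.
second_arg <- ids %>% paste("id", ., sep = "_")

print(first_arg)   # "x_id" "y_id"
print(second_arg)  # "id_x" "id_y"
```

The two results differ only in where the piped vector ends up, which is exactly what the dot controls.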

Without %>%:

labs2 <- arrange(
           mutate(
             filter(labs, status == "active"),
             weight_kg = weight / 1000
           ),
           desc(weight_kg)
         )

Input Table (labs):

patient_id status weight
1 active 60
2 inactive 72
3 active 85
4 active 90
5 inactive 78

Output Table (labs2):

patient_id status weight weight_kg
4 active 90 0.09
3 active 85 0.085
1 active 60 0.06

With intermediate steps:

step1 <- filter(labs, status == "active")
step2 <- mutate(step1, weight_kg = weight / 1000)
labs2 <- arrange(step2, desc(weight_kg))

Input Table (labs):

patient_id status weight
1 active 60
2 inactive 72
3 active 85
4 active 90
5 inactive 78

Intermediate Table (step1):

patient_id status weight
1 active 60
3 active 85
4 active 90

Intermediate Table (step2):

patient_id status weight weight_kg
1 active 60 0.06
3 active 85 0.085
4 active 90 0.09

Output Table (labs2):

patient_id status weight weight_kg
4 active 90 0.09
3 active 85 0.085
1 active 60 0.06

2. |> — Base R Pipe (from R 4.1.0 onward)

  • Passes the result on the left to the first argument of the function on the right.
  • No need for external packages.
  • Does not support the magrittr dot placeholder (.). From R 4.2.0 onward, an underscore (_) can serve as a placeholder for a named argument; anonymous functions cover more complex cases.
  • Slightly faster than %>% and integrates natively with base R.

Basic usage:

labs <- data.frame(
  patient_id = 1:5,
  status = c("active", "inactive", "active", "active", "inactive"),
  weight = c(60, 72, 85, 90, 78)
)
labs2 <- labs |>
  subset(status == "active") |>
  head(3)

Input Table (labs):

patient_id status weight
1 active 60
2 inactive 72
3 active 85
4 active 90
5 inactive 78

Output Table (labs2):

patient_id status weight
1 active 60
3 active 85
4 active 90

With anonymous function for flexible argument placement:

to_upper_phrase <- function(x) {
  paste("Programming with", toupper(x))
}

"clinical" |> (function(x) to_upper_phrase(x))()

# Or, because the value goes into the first argument anyway, call the function directly:

"clinical" |> to_upper_phrase()

Input:
"clinical"
Output:
"Programming with CLINICAL"
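One way to see why |> needs no package and adds no function-call overhead is to inspect what the parser produces. A small sketch for R 4.1.0 or later; f, x and y are placeholder names that are never evaluated:

```r
# quote() captures the expression without evaluating it, so f, x and y
# do not need to exist. The parser has already rewritten the pipe as a
# plain nested call by the time quote() sees it.
piped <- quote(x |> f(y))
plain <- quote(f(x, y))

print(piped)                    # f(x, y)
print(identical(piped, plain))  # TRUE
```

Because the rewrite happens at parse time, the right-hand side of |> must be written as a call, which is why the anonymous function above is wrapped in parentheses and followed by ().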


3. %T>% — Tee Pipe (from magrittr)

  • Executes a side effect (e.g., plotting, printing) and returns the original input to the next step in the pipeline.
  • Useful for debugging, logging, or visualization within a pipeline.

Example:

library(magrittr)
iris %T>%
  { plot(.$Sepal.Length, .$Sepal.Width) } %>%
  summary()

Input Table (iris):
(First 3 rows shown)

Sepal.Length Sepal.Width Petal.Length Petal.Width Species
5.1 3.5 1.4 0.2 setosa
4.9 3.0 1.4 0.2 setosa
4.7 3.2 1.3 0.2 setosa

Output:

  • A scatter plot of Sepal.Length vs Sepal.Width is displayed (side effect).
  • The summary of the iris data frame is returned.
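The position of %T>% matters: it must sit directly before the side-effect step. A minimal sketch, assuming magrittr is installed, that uses str() as the side effect because str() returns NULL; the variable names are illustrative:

```r
library(magrittr)

# %T>% runs str() for its printed output, then passes the ORIGINAL
# left-hand side (the square roots) on to sum().
with_tee <- c(1, 4, 9) %>% sqrt() %T>% str() %>% sum()

# With a regular pipe, sum() receives str()'s return value, which is NULL.
without_tee <- c(1, 4, 9) %>% sqrt() %>% str() %>% sum()

print(with_tee)     # 6
print(without_tee)  # 0
```

The second pipeline still runs without an error, which is what makes a misplaced tee easy to overlook.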

4. %<>% — Compound Assignment Pipe (from magrittr)

  • Pipes and reassigns the result back into the original object.
  • Equivalent to x <- x %>% ...
  • Useful for updating objects in place.

Example:

library(magrittr)
x <- 1:5
x %<>% sqrt()
print(x)
# [1] 1.000000 1.414214 1.732051 2.000000 2.236068

Input:
x = 1:5
Output:
x = 1.000000, 1.414214, 1.732051, 2.000000, 2.236068
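The equivalence with explicit reassignment can be checked directly. A short sketch, assuming magrittr is installed; the object names a, b and scores are made up for illustration:

```r
library(magrittr)

a <- c(1, 4, 9)
b <- c(1, 4, 9)

a %<>% sqrt()      # compound assignment pipe
b <- b %>% sqrt()  # explicit reassignment

print(identical(a, b))  # TRUE

# The left-hand side can also be a single column of a data frame.
scores <- data.frame(raw = c(4, 16))
scores$raw %<>% sqrt()
print(scores$raw)  # 2 4
```

Since %<>% overwrites its input, the explicit form is often easier to follow in shared code; which one to use is largely a style choice.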


5. %$% — Exposition Pipe (from magrittr)

  • Exposes the variables of a data frame to the right-hand side expression, so you can refer to columns directly by name.
  • Useful for concise code in modeling or plotting.

Example:

library(magrittr)
iris %$%
  cor(Sepal.Length, Sepal.Width)
# [1] -0.1175698

Input Table (iris):
(First 3 rows shown)

Sepal.Length Sepal.Width Petal.Length Petal.Width Species
5.1 3.5 1.4 0.2 setosa
4.9 3.0 1.4 0.2 setosa
4.7 3.2 1.3 0.2 setosa

Output:
[1] -0.1175698
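For readers who know base R, %$% behaves much like with(). A minimal sketch comparing the two, assuming magrittr is installed; via_pipe and via_with are illustrative names:

```r
library(magrittr)

# Both forms evaluate cor() with the columns of iris in scope.
via_pipe <- iris %$% cor(Sepal.Length, Sepal.Width)
via_with <- with(iris, cor(Sepal.Length, Sepal.Width))

print(isTRUE(all.equal(via_pipe, via_with)))  # TRUE
```

A regular %>% would not work here, because it would pass the whole data frame as the first argument of cor() rather than exposing its columns.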


6. Error Handling Scenarios and Warnings with Pipes

  • Null or Missing Data:
    If a pipe step returns NULL or an unexpected value, subsequent steps may fail or produce misleading results. Always check for missing or unexpected data before piping.
library(dplyr)
df <- data.frame(a = c(1, 2, NA), b = c(4, NA, 6))
df %>%
  filter(a > 1) %>%
  summarise(mean_b = mean(b, na.rm = TRUE))
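# filter() drops rows where the condition is NA, so only the row with a = 2 is kept.
# Its b value is NA, so mean(b, na.rm = TRUE) averages nothing and returns NaN.
# The same happens when filter() returns 0 rows: summarise() does not fail, it returns NaN.

Input Table (df):

a b
1 4
2 NA
NA 6

Output Table after filter(a > 1):

a b
2 NA

Output Table after summarise(mean_b = mean(b, na.rm = TRUE)):

mean_b
NaN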
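One way to act on the advice to check data before piping further is a small guard step. The sketch below assumes dplyr is installed; require_rows() is a hypothetical helper written for this example, not a dplyr function:

```r
library(dplyr)

# Hypothetical helper: stop with a clear message when a step yields no rows,
# otherwise pass the data through unchanged.
require_rows <- function(data, step = "previous step") {
  if (nrow(data) == 0) {
    stop("No rows left after ", step, call. = FALSE)
  }
  data
}

df <- data.frame(a = c(1, 2, NA), b = c(4, NA, 6))

# a > 5 matches nothing, so the guard stops the pipeline before summarise().
result <- tryCatch(
  df %>%
    filter(a > 5) %>%
    require_rows("filter(a > 5)") %>%
    summarise(mean_b = mean(b, na.rm = TRUE)),
  error = function(e) conditionMessage(e)
)

print(result)  # "No rows left after filter(a > 5)"
```

Failing early with a named step is usually easier to diagnose than a silent NaN several steps later.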
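  • Non-Standard Evaluation and Argument Position Pitfalls:
    A pipe sends the left-hand side to the first argument unless told otherwise. Functions that use non-standard evaluation or expect their main input elsewhere (common in base R, for example the data argument of lm()) need the dot placeholder (.) with %>% or an anonymous function with |>.
# Works without a placeholder, because the vector belongs in the first argument:
c(1, 2, 3) %>% sum(na.rm = TRUE)
# Equivalent, with the position made explicit by the dot placeholder:
c(1, 2, 3) %>% sum(., na.rm = TRUE)
# With the base pipe, an anonymous function gives the same control:
c(1, 2, 3) |> (\(x) sum(x, na.rm = TRUE))()

Input:
c(1, 2, 3)
Output (identical for all three calls):
6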
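A related pitfall with the magrittr placeholder: when the dot appears only inside nested calls, the left-hand side is still inserted as the first argument. A minimal sketch, assuming magrittr is installed; nested and braced are illustrative names:

```r
library(magrittr)

# The dot appears only inside min() and max(), so the vector is ALSO
# inserted as the first argument: c(1:3, min(1:3), max(1:3)).
nested <- 1:3 %>% c(min(.), max(.))

# Braces suppress the first-argument insertion: c(min(1:3), max(1:3)).
braced <- 1:3 %>% { c(min(.), max(.)) }

print(nested)  # 1 2 3 1 3
print(braced)  # 1 3
```

Wrapping the right-hand side in braces is the usual way to take full control of where the piped value goes.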

  • Side Effects in Pipes:
    Avoid relying on side effects (like printing or plotting) inside pipes unless using %T>%. Side effects can make debugging harder.
library(magrittr)
iris %T>%
  { plot(.$Sepal.Length, .$Sepal.Width) } %>%
  summary()
# Plot is created as a side effect, but summary is still returned.
  • Error Propagation:
    If an error occurs in any step of the pipe, the entire pipeline fails. Use tryCatch() or purrr::possibly() for safer error handling in pipelines.
library(dplyr)
library(purrr)
safe_log <- possibly(log, otherwise = NA_real_)
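list(1, "a", 4) %>%
  map_dbl(safe_log)
# log("a") throws an error; possibly() turns it into NA instead of stopping the pipeline.
# Note: log(-1) only raises a warning and returns NaN, so possibly() has nothing to catch there.

Input:
list(1, "a", 4)
Output:
0, NA, 1.386294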
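The tryCatch() route mentioned above needs no extra packages. A minimal sketch using the base pipe (R 4.1.0 or later); safe_root_sum() is a hypothetical wrapper written for this example:

```r
# Hypothetical wrapper: run a short pipeline and fall back to NA on error.
safe_root_sum <- function(x) {
  tryCatch(
    x |> sqrt() |> sum(),
    error = function(e) {
      message("Pipeline failed: ", conditionMessage(e))
      NA_real_
    }
  )
}

ok  <- safe_root_sum(c(1, 4, 9))
bad <- safe_root_sum("a")  # sqrt("a") throws an error, so the handler runs

print(ok)   # 6
print(bad)  # NA
```

Wrapping the whole pipeline catches an error from any step; wrapping a single function with possibly() is the finer-grained alternative when only one step is expected to fail.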

  • Overly Long Chains:
    Very long or complex pipes can be hard to debug. Break them into smaller steps or assign intermediate results for clarity.
# Hard to debug:
result <- df %>% filter(a > 1) %>% mutate(c = a + b) %>% group_by(c) %>% summarise(mean_b = mean(b))
# Easier to debug:
step1 <- filter(df, a > 1)
step2 <- mutate(step1, c = a + b)
result <- step2 %>% group_by(c) %>% summarise(mean_b = mean(b))
  • Assignment Confusion:
    Remember that %>% does not reassign by default; use %<>% or explicit assignment if you want to update the original object.
library(magrittr)
x <- 1:5
x %>% sqrt()    # x is unchanged
x %<>% sqrt()   # x is updated in place

Input:
x = 1:5
Output after x %>% sqrt():
[1] 1.000000 1.414214 1.732051 2.000000 2.236068 (but x is unchanged)

Output after x %<>% sqrt():
x = 1.000000 1.414214 1.732051 2.000000 2.236068

  • Base Pipe (|>) Limitations:
    The base R pipe does not support the magrittr dot placeholder (.). From R 4.2.0 onward it accepts an underscore (_) placeholder, but only for a named argument; for more complex argument placement, use anonymous functions.
# This will not work:
c(1, 2, 3) |> sum(., na.rm = TRUE) # Error: object '.' not found
# Correct:
c(1, 2, 3) |> (\(x) sum(x, na.rm = TRUE))()
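On R 4.2.0 or later, the underscore placeholder covers the common case of a non-first argument. A short sketch, assuming that R version; the string and variable names are made up for illustration:

```r
# Requires R >= 4.2.0. The underscore must be passed as a NAMED argument.
replaced <- "hello world" |> gsub("world", "R", x = _)
print(replaced)  # "hello R"

# The anonymous-function form works from R 4.1.0 and has no naming restriction.
same <- "hello world" |> (\(s) gsub("world", "R", s))()
print(identical(replaced, same))  # TRUE
```

If the code must run on R 4.1.x, the anonymous-function form is the safer choice.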
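  • Data Masking:
    dplyr verbs such as mutate() and filter() look up names in the data frame first and only then in the calling environment (data masking). This behaviour comes from dplyr rather than from the pipe itself, but it surfaces often in pipelines: when a column and a global variable share a name, the column wins. Be explicit with variable references when needed.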
x <- 10
df <- data.frame(x = 1:3)
df %>% mutate(y = x + 1) # Uses df$x, not global x

Input Table (df):

x
1
2
3

Output Table:

x y
1 2
2 3
3 4
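When both the column and the outer variable are needed, dplyr's .data and .env pronouns make the choice explicit. A minimal sketch, assuming dplyr is installed; the column names from_column and from_global are made up for illustration:

```r
library(dplyr)

x <- 10
df <- data.frame(x = 1:3)

out <- df %>%
  mutate(
    from_column = .data$x + 1,  # always the column x: 2, 3, 4
    from_global = .env$x + 1    # always the outer variable x: 11, 11, 11
  )

print(out)
```

Using the pronouns removes any dependence on which names happen to exist in the data frame.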
  • Debugging:
    Debugging inside pipes can be challenging. Use intermediate assignments or insert print()/str() calls with %T>% to inspect data at each stage.
library(magrittr)
df %>%
  filter(a > 1) %T>%
  { print(.) } %>%
  summarise(mean_b = mean(b, na.rm = TRUE))
# Prints intermediate result after filtering.

Input Table (df):

a b
1 4
2 NA
NA 6

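Output Table after filter(a > 1) (printed by the tee step; the row where a is NA is dropped):

a b
2 NA

Output Table after summarise(mean_b = mean(b, na.rm = TRUE)) (no non-missing b values remain, so the mean is NaN):

mean_b
NaN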

Comparison: Pipe Operator Features

Operator  Package   Supports Placeholder  Assignment  Side Effects  Exposes Columns  Base R  Example
%>%       magrittr  Yes (.)               No          No            No               No      df %>% mutate(...)
|>        base R    Yes (_, R >= 4.2.0)   No          No            No               Yes     df |> head()
%T>%      magrittr  Yes (.)               No          Yes           No               No      df %T>% plot()
%<>%      magrittr  Yes (.)               Yes         No            No               No      x %<>% sqrt()
%$%       magrittr  N/A                   No          No            Yes              No      df %$% cor(x, y)
%>>%      pipeR     Yes (.)               No          No            No               No      x %>>% f
