1.3.2. Pipe Operators in R

Pipe operators allow you to write cleaner, more readable code by chaining operations together. Instead of nesting functions or storing intermediate results, you can express a sequence of actions as a straightforward flow. Pipes are especially useful for data transformation and analysis workflows.

1. `%>%` — magrittr Pipe (from `{magrittr}` / `{dplyr}`)

Passes the result on the left as the first argument to the function on the right.
Enables readable, step-by-step pipelines.
Supports the use of the dot (.) as a placeholder for the left-hand side value, useful for functions where the first argument is not the main input.
Widely used in the tidyverse ecosystem.

Example:

library(dplyr)
labs <- tibble::tibble(
  patient_id = 1:5,
  status = c("active", "inactive", "active", "active", "inactive"),
  weight = c(60, 72, 85, 90, 78)
)
labs2 <- labs %>%
  filter(status == "active") %>%
  mutate(weight_kg = weight / 1000) %>%
  arrange(desc(weight_kg))

Input Table (labs):

patient_id	status	weight
1	active	60
2	inactive	72
3	active	85
4	active	90
5	inactive	78

Output Table (labs2):

patient_id	status	weight	weight_kg
4	active	90	0.09
3	active	85	0.085
1	active	60	0.06

Here, labs is filtered for active status, a new column is created, and the result is sorted—all in one readable chain.

Using the dot placeholder:

labs <- tibble::tibble(
  patient_id = 1:5,
  weight = c(60, 72, NA, 85, 90)
)

labs %>%
  select(weight) %>%
  sum(., na.rm = TRUE)

# Output:
# [1] 307

# That's the sum 60 + 72 + 85 + 90. The missing value (NA) is excluded because na.rm = TRUE.

Input Table (labs):

patient_id	weight
1	60
2	72
3	NA
4	85
5	90

Output:
307

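sum(., na.rm = TRUE): Sums the values from the previous result (a one-column tibble holding weight, not a bare vector). The dot marks explicitly where the piped value goes. Because it sits in the first position here, sum(na.rm = TRUE) would behave identically; the dot becomes necessary only when the value belongs in a later argument.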
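The dot placeholder matters most when the piped value belongs in an argument other than the first. A minimal sketch, assuming the built-in mtcars data set and lm() as an illustrative function (neither is part of the example above):

```r
library(magrittr)

# lm() expects the formula first and the data second,
# so the piped data frame is routed to `data` with the dot.
fit <- mtcars %>% lm(mpg ~ wt, data = .)

# The same model fitted without a pipe, for comparison.
fit_direct <- lm(mpg ~ wt, data = mtcars)

coef(fit)
```

Without the dot, the data frame would be passed as the first argument (the formula), and the call would fail.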

Without %>%:

labs2 <- arrange(
           mutate(
             filter(labs, status == "active"),
             weight_kg = weight / 1000
           ),
           desc(weight_kg)
         )

Input Table (labs):

patient_id	status	weight
1	active	60
2	inactive	72
3	active	85
4	active	90
5	inactive	78

Output Table (labs2):

patient_id	status	weight	weight_kg
4	active	90	0.09
3	active	85	0.085
1	active	60	0.06

With intermediate steps:

step1 <- filter(labs, status == "active")
step2 <- mutate(step1, weight_kg = weight / 1000)
labs2 <- arrange(step2, desc(weight_kg))

Input Table (labs):

patient_id	status	weight
1	active	60
2	inactive	72
3	active	85
4	active	90
5	inactive	78

Intermediate Table (step1):

patient_id	status	weight
1	active	60
3	active	85
4	active	90

Intermediate Table (step2):

patient_id	status	weight	weight_kg
1	active	60	0.06
3	active	85	0.085
4	active	90	0.09

Output Table (labs2):

patient_id	status	weight	weight_kg
4	active	90	0.09
3	active	85	0.085
1	active	60	0.06

2. `|>` — Base R Pipe (from R 4.1.0 onward)

Passes the result on the left to the first argument of the function on the right.
No need for external packages.
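Does not support the dot placeholder (.). Since R 4.2.0 an underscore placeholder (_) can be passed to a named argument, and anonymous functions cover more complex cases.
Adds no function-call overhead, because the parser rewrites x |> f() as f(x); this makes it slightly faster than %>% and integrates natively with base R.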

Basic usage:

labs <- data.frame(
  patient_id = 1:5,
  status = c("active", "inactive", "active", "active", "inactive"),
  weight = c(60, 72, 85, 90, 78)
)
labs2 <- labs |>
  subset(status == "active") |>
  head(3)

Input Table (labs):

patient_id	status	weight
1	active	60
2	inactive	72
3	active	85
4	active	90
5	inactive	78

Output Table (labs2):

patient_id	status	weight
1	active	60
3	active	85
4	active	90

With anonymous function for flexible argument placement:

to_upper_phrase <- function(x) {
  paste("Programming with", toupper(x))
}

"clinical" |> (function(x) to_upper_phrase(x))()

# OR, because the piped value is already the first argument, call the function directly:

"clinical" |> to_upper_phrase()

Input:
"clinical"
Output:
"Programming with CLINICAL"

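The base pipe is handled by the parser rather than by a function, which can be checked by capturing a piped expression without evaluating it. A small sketch, assuming R 4.1.0 or later; the vector values are made up for illustration:

```r
# quote() captures the parsed expression without evaluating it,
# which shows what the parser produced for a piped call.
piped <- quote(c(1, 4, 9) |> sqrt() |> sum())
print(piped)

# The hand-written nested form of the same computation.
nested <- quote(sum(sqrt(c(1, 4, 9))))

identical(piped, nested)
eval(piped)
```

Because the pipe disappears at parse time, the piped and nested forms are the same call when R evaluates them.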
3. `%T>%` — Tee Pipe (from `magrittr`)

Executes a side effect (e.g., plotting, printing) and returns the original input to the next step in the pipeline.
Useful for debugging, logging, or visualization within a pipeline.

Example:

library(magrittr)
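# %T>% must sit directly before the side-effect step so that iris,
# not the NULL returned by plot(), is forwarded to summary().
iris %T>%
  { plot(.$Sepal.Length, .$Sepal.Width) } %>%
  summary()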

Input Table (iris):
(First 3 rows shown)

Sepal.Length	Sepal.Width	Petal.Length	Petal.Width	Species
5.1	3.5	1.4	0.2	setosa
4.9	3.0	1.4	0.2	setosa
4.7	3.2	1.3	0.2	setosa

Output:

A scatter plot of Sepal.Length vs Sepal.Width is displayed (side effect).
The summary of the iris data frame is returned.
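The pass-through behavior can also be checked on a small vector. A minimal sketch, assuming magrittr is installed; the values and the message text are illustrative only:

```r
library(magrittr)

values <- c(4, 9, 16)

# message() returns NULL, so a plain %>% here would pass NULL to sqrt().
# %T>% discards that return value and forwards `values` instead.
roots <- values %T>%
  { message("n = ", length(.)) } %>%
  sqrt()

roots
```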

4. `%<>%` — Compound Assignment Pipe (from `magrittr`)

Pipes and reassigns the result back into the original object.
Equivalent to x <- x %>% ...
Useful for updating objects in place.

Example:

library(magrittr)
x <- 1:5
x %<>% sqrt()
print(x)
# [1] 1.000000 1.414214 1.732051 2.000000 2.236068

Input:
x = 1:5
Output:
x = 1.000000, 1.414214, 1.732051, 2.000000, 2.236068
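When %<>% starts a longer chain, the result of the whole chain is assigned back to the left-hand object. A short sketch with made-up values, assuming magrittr is installed:

```r
library(magrittr)

y <- c(9, 1, 4)

# The whole chain is evaluated and its final result is assigned back to y,
# equivalent to: y <- y %>% sqrt() %>% sort()
y %<>% sqrt() %>% sort()

y
```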

5. `%$%` — Exposition Pipe (from `magrittr`)

Exposes the variables of a data frame to the right-hand side expression, so you can refer to columns directly by name.
Useful for concise code in modeling or plotting.

Example:

library(magrittr)
iris %$%
  cor(Sepal.Length, Sepal.Width)
#[1] -0.1176

Input Table (iris):
(First 3 rows shown)

Sepal.Length	Sepal.Width	Petal.Length	Petal.Width	Species
5.1	3.5	1.4	0.2	setosa
4.9	3.0	1.4	0.2	setosa
4.7	3.2	1.3	0.2	setosa

Output:
[1] -0.1176
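For comparison, base R's with() evaluates an expression inside a data frame in much the same way. A brief sketch that checks the two forms agree on iris:

```r
library(magrittr)

# Exposition pipe: columns of iris are visible by name on the right-hand side.
via_pipe <- iris %$% cor(Sepal.Length, Sepal.Width)

# Base R equivalent using with().
via_with <- with(iris, cor(Sepal.Length, Sepal.Width))

all.equal(via_pipe, via_with)
```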

6. Error Handling Scenarios and Warnings with Pipes

Null or Missing Data:
If a pipe step returns NULL or an unexpected value, subsequent steps may fail or produce misleading results. Always check for missing or unexpected data before piping.

library(dplyr)
df <- data.frame(a = c(1, 2, NA), b = c(4, NA, 6))
df %>%
  filter(a > 1) %>%
  summarise(mean_b = mean(b, na.rm = TRUE))
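# filter() drops the row where a is NA, leaving only a = 2 with b = NA, so mean_b is NaN.
# summarise() does not fail on empty input; it returns one row with NaN.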

Input Table (df):

a	b
1	4
2	NA
NA	6

Output Table after filter(a > 1):

a	b
2	NA

The row with a = NA is dropped because filter() keeps only rows where the condition is TRUE.

Output Table after summarise(mean_b = mean(b, na.rm = TRUE)):

mean_b
NaN
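One way to act on the advice above is to inspect the intermediate result before summarising. A hedged sketch, assuming dplyr is installed; the guard condition and its message are hypothetical choices, not a prescribed pattern:

```r
library(dplyr)

df <- data.frame(a = c(1, 2, NA), b = c(4, NA, 6))

# filter() keeps only rows where the condition is TRUE,
# so rows where `a` is NA are dropped along with rows where a <= 1.
kept <- df %>% filter(a > 1)

# Hypothetical guard: report the problem when no usable values of `b` remain.
if (nrow(kept) == 0 || all(is.na(kept$b))) {
  message("No non-missing values of b left after filtering")
} else {
  kept %>% summarise(mean_b = mean(b, na.rm = TRUE))
}
```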

Non-Standard Evaluation Pitfalls:
Some functions (especially in base R) do not work well with pipes because of non-standard evaluation or because the main input is not the first argument. Use anonymous functions or the dot placeholder (.) where needed.

# The piped value lands in the first argument, which is not what is intended here:
c(1, 2, 3) %>% paste("Value:")
# [1] "1 Value:" "2 Value:" "3 Value:"
# Correct way using the dot placeholder:
c(1, 2, 3) %>% paste("Value:", .)
# With base pipe, use anonymous function:
c(1, 2, 3) |> (\(x) paste("Value:", x))()

Input:
c(1, 2, 3)
Output:
"Value: 1" "Value: 2" "Value: 3"

Side Effects in Pipes:
Avoid relying on side effects (like printing or plotting) inside pipes unless using %T>%. Side effects can make debugging harder.

library(magrittr)
iris %T>%
  { plot(.$Sepal.Length, .$Sepal.Width) } %>%
  summary()
# Plot is created as a side effect; %T>% forwards iris, so summary(iris) is still returned.

Error Propagation:
If an error occurs in any step of the pipe, the entire pipeline fails. Use tryCatch() or purrr::possibly() for safer error handling in pipelines.

library(dplyr)
library(purrr)
safe_log <- possibly(log, otherwise = NA_real_)
list(1, 10, "a") %>%
  map_dbl(safe_log)
# log("a") throws an error, so possibly() returns NA instead of stopping the pipeline.
# log(-1) only warns and returns NaN, so possibly() would not intercept it.

Input:
list(1, 10, "a")
Output:
0, 2.302585, NA
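The tryCatch() route mentioned above can be sketched in base R as well. This assumes R 4.1.0 or later for the base pipe; safe_total_log is a hypothetical helper name introduced only for illustration:

```r
# Hypothetical wrapper: run a small pipeline and fall back to NA on error.
safe_total_log <- function(x) {
  tryCatch(
    x |> log() |> sum(),
    error = function(e) {
      # Report the failure, then return a missing value instead of stopping.
      message("Pipeline failed: ", conditionMessage(e))
      NA_real_
    }
  )
}

safe_total_log(c(1, 10, 100))  # numeric input: the pipeline runs normally
safe_total_log("a")            # log("a") errors, so NA is returned
```

Wrapping the whole pipeline gives one fallback for any failing step, whereas possibly() handles failures element by element.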

Overly Long Chains:
Very long or complex pipes can be hard to debug. Break them into smaller steps or assign intermediate results for clarity.

# Hard to debug:
result <- df %>% filter(a > 1) %>% mutate(c = a + b) %>% group_by(c) %>% summarise(mean_b = mean(b))
# Easier to debug:
step1 <- filter(df, a > 1)
step2 <- mutate(step1, c = a + b)
result <- step2 %>% group_by(c) %>% summarise(mean_b = mean(b))

Assignment Confusion:
Remember that %>% does not reassign by default; use %<>% or explicit assignment if you want to update the original object.

library(magrittr)
x <- 1:5
x %>% sqrt()    # x is unchanged
x %<>% sqrt()   # x is updated in place

Input:
x = 1:5
Output after x %>% sqrt():
[1] 1.000000 1.414214 1.732051 2.000000 2.236068 (but x is unchanged)

Output after x %<>% sqrt():
x = 1.000000 1.414214 1.732051 2.000000 2.236068

Base Pipe (|>) Limitations:
The base R pipe does not support the dot placeholder (.). Since R 4.2.0 it accepts an underscore placeholder (_), which must be supplied to a named argument; for other argument placement, use anonymous functions.

# This will not work:
c(1, 2, 3) |> sum(., na.rm = TRUE) # Error
# Correct:
c(1, 2, 3) |> (\(x) sum(x, na.rm = TRUE))()
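From R 4.2.0 onward, the base pipe also has its own placeholder, the underscore, which must be supplied to a named argument. A minimal sketch, assuming R 4.2.0 or later and using mtcars and lm() purely as illustration:

```r
# Requires R >= 4.2.0. The underscore must be passed to a named argument.
fit <- mtcars |> lm(mpg ~ wt, data = _)

# Equivalent anonymous-function form, which also works in R 4.1.x.
fit_lambda <- mtcars |> (\(d) lm(mpg ~ wt, data = d))()

coef(fit)
```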

Data Masking:
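Data-masking functions such as dplyr's mutate() and filter() look up names in the data frame first, so a column hides a global variable with the same name. The pipe itself does not cause this. Be explicit with variable references when names overlap, for example with the .data and .env pronouns.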

x <- 10
df <- data.frame(x = 1:3)
df %>% mutate(y = x + 1) # Uses df$x, not global x

Input Table (df):

x
1
2
3

Output Table:

x	y
1	2
2	3
3	4
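When a column and a global variable share a name, the .data and .env pronouns make the intended source explicit. A short sketch, assuming dplyr is installed; the column names from_column and from_global are made up for illustration:

```r
library(dplyr)

x <- 10
df <- data.frame(x = 1:3)

result <- df %>%
  mutate(
    from_column = .data$x + 1,  # always the column df$x
    from_global = .env$x + 1    # always the global variable x
  )

result
```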

Debugging:
Debugging inside pipes can be challenging. Use intermediate assignments or insert print()/str() calls with %T>% to inspect data at each stage.

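library(dplyr)
library(magrittr)
# Recreate df, since the previous example overwrote it.
df <- data.frame(a = c(1, 2, NA), b = c(4, NA, 6))
df %>%
  filter(a > 1) %T>%
  { print(.) } %>%
  summarise(mean_b = mean(b, na.rm = TRUE))
# Prints intermediate result after filtering.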

Input Table (df):

a	b
1	4
2	NA
NA	6

Output Table after filter(a > 1):

a	b
2	NA

Output Table after summarise(mean_b = mean(b, na.rm = TRUE)):

mean_b
NaN

Comparison: Pipe Operator Features

Operator	Package	Supports Placeholder	Assignment	Side Effects	Exposes Columns	Base R	Example
`%>%`	magrittr	Yes (`.`)	No	No	No	No	`df %>% mutate(...)`
`|>`	base R	No (`_` since R 4.2.0)	No	No	No	Yes	`df |> head()`
`%T>%`	magrittr	Yes (`.`)	No	Yes	No	No	`df %T>% plot()`
`%<>%`	magrittr	Yes (`.`)	Yes	No	No	No	`x %<>% sqrt()`
`%$%`	magrittr	N/A	No	No	Yes	No	`df %$% cor(x, y)`
`%>>%`	pipeR	Yes (`.`)	No	No	No	No	`x %>>% f`
