2.3.2. Reordering Columns and Rows
1. Introduction
Reordering columns and rows is a common data wrangling task. It helps present your data in a logical, readable, and analysis-friendly format. In R, you can easily reorder columns using select() and reorder rows using arrange() from the dplyr package.
2. Why Reorder Columns and Rows?
- Improve data readability and presentation.
- Place important variables at the front or group related columns together.
- Sort data for reporting, visualization, or further analysis.
- Prepare data for merging or exporting.
3. Reordering Columns with select()
- The
select()function can reorder columns by specifying the desired order. - You can also rename columns and select only the columns you need.
Input Table:
| USUBJID | AGE | SEX | RACE | ARM | COUNTRY |
|---|---|---|---|---|---|
| 01-701-101 | 34 | M | WHITE | Placebo | USA |
| 01-701-102 | 29 | F | ASIAN | Drug X | USA |
| 01-701-103 | 67 | F | BLACK | Placebo | CAN |
| 01-701-104 | 54 | M | WHITE | Drug X | USA |
R Code:
# Dummy data for dm dataframe
dm <- data.frame(
USUBJID = c("01-701-101", "01-701-102", "01-701-103", "01-701-104"),
AGE = c(34, 29, 67, 54),
SEX = c("M", "F", "F", "M"),
RACE = c("WHITE", "ASIAN", "BLACK", "WHITE"),
ARM = c("Placebo", "Drug X", "Placebo", "Drug X"),
COUNTRY = c("USA", "USA", "CAN", "USA")
)
# Dummy data for ae dataframe
ae <- data.frame(
USUBJID = c("01-701-101", "01-701-102"),
AESEQ = c(1, 1),
AETERM = c("Headache", "Nausea"),
AESER = c("N", "Y"),
AEOUT = c("RECOV", "RECOV"),
AGE = c(34, 29)
)
dm %>%
select(USUBJID, ARM, SEX, AGE, COUNTRY, RACE)
Output Table:
| USUBJID | ARM | SEX | AGE | COUNTRY | RACE |
|---|---|---|---|---|---|
| 01-701-101 | Placebo | M | 34 | USA | WHITE |
| 01-701-102 | Drug X | F | 29 | USA | ASIAN |
| 01-701-103 | Placebo | F | 67 | CAN | BLACK |
| 01-701-104 | Drug X | M | 54 | USA | WHITE |
- Explanation:
- The columns are reordered as specified in
select(). - You can place any column in any position, and omit columns you don't want.
- The columns are reordered as specified in
4. Reordering Rows with arrange()
- The
arrange()function sorts rows by the values of one or more columns. - By default, sorting is ascending (smallest to largest).
Input Table:
| USUBJID | ARM | SEX | AGE | COUNTRY | RACE |
|---|---|---|---|---|---|
| 01-701-101 | Placebo | M | 34 | USA | WHITE |
| 01-701-102 | Drug X | F | 29 | USA | ASIAN |
| 01-701-103 | Placebo | F | 67 | CAN | BLACK |
| 01-701-104 | Drug X | M | 54 | USA | WHITE |
R Code:
dm %>%
arrange(AGE)
Output Table:
| USUBJID | ARM | SEX | AGE | COUNTRY | RACE |
|---|---|---|---|---|---|
| 01-701-102 | Drug X | F | 29 | USA | ASIAN |
| 01-701-101 | Placebo | M | 34 | USA | WHITE |
| 01-701-104 | Drug X | M | 54 | USA | WHITE |
| 01-701-103 | Placebo | F | 67 | CAN | BLACK |
- Explanation:
- Rows are sorted by
AGEin ascending order.
- Rows are sorted by
5. Sorting in Descending Order
- Use
desc()insidearrange()to sort from largest to smallest.
R Code:
dm %>%
arrange(desc(AGE))
Output Table:
| USUBJID | ARM | SEX | AGE | COUNTRY | RACE |
|---|---|---|---|---|---|
| 01-701-103 | Placebo | F | 67 | CAN | BLACK |
| 01-701-104 | Drug X | M | 54 | USA | WHITE |
| 01-701-101 | Placebo | M | 34 | USA | WHITE |
| 01-701-102 | Drug X | F | 29 | USA | ASIAN |
- Explanation:
desc(AGE)sorts rows byAGEin descending order.
6. Sorting by Multiple Columns
- You can sort by more than one column by listing them in
arrange(). - Useful for breaking ties or sorting within groups.
R Code:
dm %>%
arrange(ARM, desc(AGE))
Output Table:
| USUBJID | ARM | SEX | AGE | COUNTRY | RACE |
|---|---|---|---|---|---|
| 01-701-104 | Drug X | M | 54 | USA | WHITE |
| 01-701-102 | Drug X | F | 29 | USA | ASIAN |
| 01-701-103 | Placebo | F | 67 | CAN | BLACK |
| 01-701-101 | Placebo | M | 34 | USA | WHITE |
- Explanation:
- First sorts by
ARMalphabetically, then within each ARM byAGEdescending.
- First sorts by
7. Sorting Character Columns
arrange()can also sort character columns alphabetically.
R Code:
dm %>%
arrange(USUBJID)
Output Table:
| USUBJID | ARM | SEX | AGE | COUNTRY | RACE |
|---|---|---|---|---|---|
| 01-701-101 | Placebo | M | 34 | USA | WHITE |
| 01-701-102 | Drug X | F | 29 | USA | ASIAN |
| 01-701-103 | Placebo | F | 67 | CAN | BLACK |
| 01-701-104 | Drug X | M | 54 | USA | WHITE |
- Explanation:
- Rows are sorted alphabetically by
USUBJID.
- Rows are sorted alphabetically by
8. Beyond the Basics
Reordering columns and rows can be even more powerful and flexible using advanced dplyr features. Here are some techniques with detailed explanations and SDTM-style examples:
- Reorder columns with helper functions
Use
everything(),starts_with(),ends_with(),contains(), etc., insideselect()to programmatically order columns.Input Table:
USUBJID AESEQ AETERM AESER AGE ARM 01-701-101 1 Headache N 34 Placebo 01-701-102 1 Nausea Y 29 Drug X R Code:
ae %>% select(USUBJID, starts_with("AE"), everything())Output Table:
USUBJID AESEQ AETERM AESER AGE ARM 01-701-101 1 Headache N 34 Placebo 01-701-102 1 Nausea Y 29 Drug X Explanation:
- All columns starting with "AE" are moved right after USUBJID, followed by the rest.
- Sort within groups
Use
group_by()witharrange()to sort rows within each group/category.Input Table:
USUBJID ARM VISIT AGE 01-701-101 Placebo 2 34 01-701-101 Placebo 1 34 01-701-102 Drug X 2 29 01-701-102 Drug X 1 29 R Code:
dm %>% group_by(USUBJID) %>% arrange(USUBJID, VISIT, .by_group = TRUE)Output Table:
USUBJID ARM VISIT AGE 01-701-101 Placebo 1 34 01-701-101 Placebo 2 34 01-701-102 Drug X 1 29 01-701-102 Drug X 2 29 Explanation:
- Rows are sorted by VISIT within each subject.
- Custom sorting with factor()
Use
factor()to define a custom order for categorical variables before arranging.Input Table:
USUBJID ARM AGE 01-701-101 Placebo 34 01-701-102 Drug X 29 01-701-103 Placebo 67 01-701-104 Drug X 54 R Code:
dm %>% mutate(ARM = factor(ARM, levels = c("Drug X", "Placebo"))) %>% arrange(ARM)Output Table:
USUBJID ARM AGE 01-701-102 Drug X 29 01-701-104 Drug X 54 01-701-101 Placebo 34 01-701-103 Placebo 67 Explanation:
- Rows are sorted so "Drug X" comes before "Placebo", regardless of alphabetical order.
- Reorder columns programmatically with relocate()
Use
relocate()to move columns to specific positions without rewriting the full column list.Input Table:
USUBJID AGE SEX ARM COUNTRY 01-701-101 34 M Placebo USA 01-701-102 29 F Drug X USA R Code:
dm %>% relocate(AGE, .after = USUBJID)Output Table:
USUBJID AGE SEX ARM COUNTRY 01-701-101 34 M Placebo USA 01-701-102 29 F Drug X USA Explanation:
- Moves the AGE column immediately after USUBJID.
- Reorder columns using select helpers for partial matches
Use
contains()orends_with()to move columns matching a pattern.Input Table:
USUBJID AESEQ AETERM AESER AEOUT AGE 01-701-101 1 Headache N RECOV 34 R Code:
ae %>% select(USUBJID, contains("AE"), everything())Output Table:
USUBJID AESEQ AETERM AESER AEOUT AGE 01-701-101 1 Headache N RECOV 34 Explanation:
- All columns containing "AE" are grouped together after USUBJID.
Summary of Advanced Reordering:
- Use select helpers (
starts_with,ends_with,contains,everything) for flexible column selection and ordering. - Use
relocate()for moving columns without rewriting the full column list. - Use
group_by()andarrange()for sorting within groups. - Use
factor()for custom sort orders on categorical variables.
**Resource download links**
2.3.2.-Reordering-Columns-and-Rows.zip