contact@a2zlearners.com

2.3.2. Reordering Columns and Rows

1. Introduction

Reordering columns and rows is a common data wrangling task. It helps present your data in a logical, readable, and analysis-friendly format. In R, you can easily reorder columns using select() and reorder rows using arrange() from the dplyr package.


2. Why Reorder Columns and Rows?

  • Improve data readability and presentation.
  • Place important variables at the front or group related columns together.
  • Sort data for reporting, visualization, or further analysis.
  • Prepare data for merging or exporting.

3. Reordering Columns with select()

  • The select() function can reorder columns by specifying the desired order.
  • You can also rename columns and select only the columns you need.

Input Table:

USUBJID AGE SEX RACE ARM COUNTRY
01-701-101 34 M WHITE Placebo USA
01-701-102 29 F ASIAN Drug X USA
01-701-103 67 F BLACK Placebo CAN
01-701-104 54 M WHITE Drug X USA

R Code:

# Dummy data for dm dataframe
dm <- data.frame(
  USUBJID = c("01-701-101", "01-701-102", "01-701-103", "01-701-104"),
  AGE = c(34, 29, 67, 54),
  SEX = c("M", "F", "F", "M"),
  RACE = c("WHITE", "ASIAN", "BLACK", "WHITE"),
  ARM = c("Placebo", "Drug X", "Placebo", "Drug X"),
  COUNTRY = c("USA", "USA", "CAN", "USA")
)

# Dummy data for ae dataframe
ae <- data.frame(
  USUBJID = c("01-701-101", "01-701-102"),
  AESEQ = c(1, 1),
  AETERM = c("Headache", "Nausea"),
  AESER = c("N", "Y"),
  AEOUT = c("RECOV", "RECOV"),
  AGE = c(34, 29)
)

dm %>%
  select(USUBJID, ARM, SEX, AGE, COUNTRY, RACE)

Output Table:

USUBJID ARM SEX AGE COUNTRY RACE
01-701-101 Placebo M 34 USA WHITE
01-701-102 Drug X F 29 USA ASIAN
01-701-103 Placebo F 67 CAN BLACK
01-701-104 Drug X M 54 USA WHITE
  • Explanation:
    • The columns are reordered as specified in select().
    • You can place any column in any position, and omit columns you don't want.

4. Reordering Rows with arrange()

  • The arrange() function sorts rows by the values of one or more columns.
  • By default, sorting is ascending (smallest to largest).

Input Table:

USUBJID ARM SEX AGE COUNTRY RACE
01-701-101 Placebo M 34 USA WHITE
01-701-102 Drug X F 29 USA ASIAN
01-701-103 Placebo F 67 CAN BLACK
01-701-104 Drug X M 54 USA WHITE

R Code:

dm %>%
  arrange(AGE)

Output Table:

USUBJID ARM SEX AGE COUNTRY RACE
01-701-102 Drug X F 29 USA ASIAN
01-701-101 Placebo M 34 USA WHITE
01-701-104 Drug X M 54 USA WHITE
01-701-103 Placebo F 67 CAN BLACK
  • Explanation:
    • Rows are sorted by AGE in ascending order.

5. Sorting in Descending Order

  • Use desc() inside arrange() to sort from largest to smallest.

R Code:

dm %>%
  arrange(desc(AGE))

Output Table:

USUBJID ARM SEX AGE COUNTRY RACE
01-701-103 Placebo F 67 CAN BLACK
01-701-104 Drug X M 54 USA WHITE
01-701-101 Placebo M 34 USA WHITE
01-701-102 Drug X F 29 USA ASIAN
  • Explanation:
    • desc(AGE) sorts rows by AGE in descending order.

6. Sorting by Multiple Columns

  • You can sort by more than one column by listing them in arrange().
  • Useful for breaking ties or sorting within groups.

R Code:

dm %>%
  arrange(ARM, desc(AGE))

Output Table:

USUBJID ARM SEX AGE COUNTRY RACE
01-701-104 Drug X M 54 USA WHITE
01-701-102 Drug X F 29 USA ASIAN
01-701-103 Placebo F 67 CAN BLACK
01-701-101 Placebo M 34 USA WHITE
  • Explanation:
    • First sorts by ARM alphabetically, then within each ARM by AGE descending.

7. Sorting Character Columns

  • arrange() can also sort character columns alphabetically.

R Code:

dm %>%
  arrange(USUBJID)

Output Table:

USUBJID ARM SEX AGE COUNTRY RACE
01-701-101 Placebo M 34 USA WHITE
01-701-102 Drug X F 29 USA ASIAN
01-701-103 Placebo F 67 CAN BLACK
01-701-104 Drug X M 54 USA WHITE
  • Explanation:
    • Rows are sorted alphabetically by USUBJID.

8. Beyond the Basics

Reordering columns and rows can be even more powerful and flexible using advanced dplyr features. Here are some techniques with detailed explanations and SDTM-style examples:

  • Reorder columns with helper functions
    • Use everything(), starts_with(), ends_with(), contains(), etc., inside select() to programmatically order columns.

    • Input Table:

      USUBJID AESEQ AETERM AESER AGE ARM
      01-701-101 1 Headache N 34 Placebo
      01-701-102 1 Nausea Y 29 Drug X
    • R Code:

      ae %>% select(USUBJID, starts_with("AE"), everything())
      
    • Output Table:

      USUBJID AESEQ AETERM AESER AGE ARM
      01-701-101 1 Headache N 34 Placebo
      01-701-102 1 Nausea Y 29 Drug X
    • Explanation:

      • All columns starting with "AE" are moved right after USUBJID, followed by the rest.

  • Sort within groups
    • Use group_by() with arrange() to sort rows within each group/category.

    • Input Table:

      USUBJID ARM VISIT AGE
      01-701-101 Placebo 2 34
      01-701-101 Placebo 1 34
      01-701-102 Drug X 2 29
      01-701-102 Drug X 1 29
    • R Code:

      dm %>%
        group_by(USUBJID) %>%
        arrange(USUBJID, VISIT, .by_group = TRUE)
      
    • Output Table:

      USUBJID ARM VISIT AGE
      01-701-101 Placebo 1 34
      01-701-101 Placebo 2 34
      01-701-102 Drug X 1 29
      01-701-102 Drug X 2 29
    • Explanation:

      • Rows are sorted by VISIT within each subject.

  • Custom sorting with factor()
    • Use factor() to define a custom order for categorical variables before arranging.

    • Input Table:

      USUBJID ARM AGE
      01-701-101 Placebo 34
      01-701-102 Drug X 29
      01-701-103 Placebo 67
      01-701-104 Drug X 54
    • R Code:

      dm %>%
        mutate(ARM = factor(ARM, levels = c("Drug X", "Placebo"))) %>%
        arrange(ARM)
      
    • Output Table:

      USUBJID ARM AGE
      01-701-102 Drug X 29
      01-701-104 Drug X 54
      01-701-101 Placebo 34
      01-701-103 Placebo 67
    • Explanation:

      • Rows are sorted so "Drug X" comes before "Placebo", regardless of alphabetical order.

  • Reorder columns programmatically with relocate()
    • Use relocate() to move columns to specific positions without rewriting the full column list.

    • Input Table:

      USUBJID AGE SEX ARM COUNTRY
      01-701-101 34 M Placebo USA
      01-701-102 29 F Drug X USA
    • R Code:

      dm %>% relocate(AGE, .after = USUBJID)
      
    • Output Table:

      USUBJID AGE SEX ARM COUNTRY
      01-701-101 34 M Placebo USA
      01-701-102 29 F Drug X USA
    • Explanation:

      • Moves the AGE column immediately after USUBJID.

  • Reorder columns using select helpers for partial matches
    • Use contains() or ends_with() to move columns matching a pattern.

    • Input Table:

      USUBJID AESEQ AETERM AESER AEOUT AGE
      01-701-101 1 Headache N RECOV 34
    • R Code:

      ae %>% select(USUBJID, contains("AE"), everything())
      
    • Output Table:

      USUBJID AESEQ AETERM AESER AEOUT AGE
      01-701-101 1 Headache N RECOV 34
    • Explanation:

      • All columns containing "AE" are grouped together after USUBJID.

Summary of Advanced Reordering:

  • Use select helpers (starts_with, ends_with, contains, everything) for flexible column selection and ordering.
  • Use relocate() for moving columns without rewriting the full column list.
  • Use group_by() and arrange() for sorting within groups.
  • Use factor() for custom sort orders on categorical variables.

**Resource download links**

2.3.2.-Reordering-Columns-and-Rows.zip