1.5. SAS PROCEDURES -> Equivalent in R
1.5.12. PROC EXPORT CSV file in SAS vs R equivalent
1. Basic CSV Export: Simple Data Output
| Feature | SAS (PROC EXPORT) |
R (write.csv) |
|---|---|---|
| Basic export | PROC EXPORT DATA=... OUTFILE=... DBMS=CSV |
write.csv(data, "file.csv") |
| Column separators | DELIMITER=',' |
sep = "," (default for write.csv) |
| Missing value | Empty for character, . for numeric |
na = "NA" (customizable) |
SAS Example
proc export data=sashelp.class
outfile="C:/output/class.csv"
dbms=csv replace;
run;
Explanation (SAS):
- Uses
PROC EXPORTwithdbms=csvto export data in CSV format outfile=specifies the destination file pathreplaceoverwrites any existing file with the same name- By default, uses comma as separator and outputs column names as header
- Missing values for character variables appear as empty, numeric as
.
R Example
# Dummy data for export
data <- data.frame(
Name = c("John", "Alice", "Bob"),
Age = c(25, 30, NA),
Height = c(180, 165, 175)
)
# Base R approach
write.csv(mtcars, "output/mtcars.csv", row.names = FALSE)
# readr approach
library(readr)
write_csv(mtcars, "output/mtcars.csv")
Explanation (R):
- Base R's
write.csv()is a wrapper forwrite.table()with CSV-specific defaults row.names = FALSEprevents writing row names as a separate column- Missing values are written as "NA" by default
- The
readrpackage'swrite_csv()offers better performance and more consistent behavior - Both functions use comma as the default separator and include column names as headers
2. Controlling Field Delimiters and Separators
| Feature | SAS | R |
|---|---|---|
| Field delimiter | DELIMITER='delimiter' |
sep = "delimiter" |
| Tab delimiter | DELIMITER='09'x |
write.table(sep="\t") or write_tsv() |
| Date format | PUTNAMES=YES |
col.names = TRUE (default) |
SAS Example
proc export data=sashelp.class
outfile="C:/output/class_tab.txt"
dbms=tab replace;
run;
/* Alternative with DBMS=DLM */
proc export data=sashelp.class
outfile="C:/output/class_pipe.txt"
dbms=dlm replace;
delimiter='|';
run;
Explanation (SAS):
dbms=tabcreates a tab-delimited file (alternative to CSV)dbms=dlmwithdelimiter='|'creates a pipe-delimited file- SAS recognizes several built-in formats (CSV, TAB, DLM)
- For tab delimiter, can use
DELIMITER='09'x(hexadecimal ASCII for tab)
R Example
# Dummy data for tab/pipe export
df <- data.frame(
Name = c("John", "Alice", "Bob"),
Age = c(25, NA, NA),
Height = c(180, 165, 175)
)
# Tab-delimited file
write.table(mtcars, "output/mtcars_tab.txt",
sep = "\t", row.names = FALSE)
# Using readr for pipe-delimited
library(readr)
write_delim(mtcars, "output/mtcars_pipe.txt",
delim = "|")
Explanation (R):
write.table()is the general-purpose function behindwrite.csv()sep = "\t"specifies tab as the delimiter- For pipe delimiter, use
sep = "|"or with readr:write_delim()withdelim = "|" readrpackage offers specialized functions likewrite_tsv()for tab-delimited files
Input Table Example
| Name | Age | Height |
|---|---|---|
| John | 25 | 180 |
| Alice | 30 | 165 |
| Bob | NA | 175 |
Expected Output (CSV format)
Name,Age,Height
John,25,180
Alice,30,165
Bob,NA,175
Expected Output (Pipe-delimited)
Name|Age|Height
John|25|180
Alice|30|165
Bob||175
3. Handling Missing Values and Quoting
| Feature | SAS | R |
|---|---|---|
| Missing value strings | Not directly configurable | na = "STRING" |
| Quoting | QUOTE option |
quote = TRUE and qmethod |
SAS Example
proc export data=sashelp.class
outfile="C:/output/class_quoted.csv"
dbms=csv replace;
putnames=yes;
quote=yes;
run;
Explanation (SAS):
quote=yestells SAS to enclose character variables in quotes- Missing values for numeric fields appear as
. - Missing values for character fields appear as empty strings
- No direct way to customize missing value markers in CSV output
R Example
# Dummy data with missing values
df_missing <- data.frame(
Name = c("John", "Alice", "\"Bob\""),
Age = c(25, NA, 28),
Height = c(180, 165, NA)
)
# Custom NA strings
write.csv(mtcars, "output/mtcars_missing.csv",
row.names = FALSE, na = "MISSING")
# Control quoting
write.csv(mtcars, "output/mtcars_quoted.csv",
row.names = FALSE, quote = TRUE)
Explanation (R):
na = "MISSING"replaces all NA values with the string "MISSING"quote = TRUE(default) places quotes around strings and factor levelsquote = FALSEsuppresses quoting- Can control which columns are quoted with
qmethodparameter
Input Table with Missing Values
| Name | Age | Height |
|---|---|---|
| John | 25 | 180 |
| Alice | NA | 165 |
| "Bob" | 28 | NA |
Expected Output (R with na="MISSING")
Name,Age,Height
John,25,180
Alice,MISSING,165
"Bob",28,MISSING
Expected Output (SAS with quote=yes)
"Name","Age","Height"
"John",25,180
"Alice",.,"165"
""Bob"",28,.
4. Controlling Headers and Row Names
| Feature | SAS | R |
|---|---|---|
| Header output | PUTNAMES=YES/NO |
col.names = TRUE/FALSE |
| Row names output | Not applicable | row.names = TRUE/FALSE |
SAS Example
proc export data=sashelp.class
outfile="C:/output/class_noheader.csv"
dbms=csv replace;
putnames=no;
run;
Explanation (SAS):
putnames=nosuppresses column headers in the output- SAS datasets do not have "row names" like R, so no equivalent option
- Default is
putnames=yeswhich includes headers
R Example
# Dummy data for header/row names
df_header <- data.frame(
mpg = c(21.0, 21.0, 22.8),
cyl = c(6, 6, 4),
disp = c(160, 160, 108)
)
rownames(df_header) <- c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710")
# No column names
write.csv(mtcars, "output/mtcars_noheader.csv",
row.names = FALSE, col.names = FALSE)
# Include row names as first column
write.csv(mtcars, "output/mtcars_rownames.csv",
row.names = TRUE)
Explanation (R):
col.names = FALSEsuppresses the header rowrow.names = TRUE(default) includes row names as the first columnrow.names = FALSEomits row names from the output- R data frames have row names that can be exported as an unnamed first column
Input Data Frame in R
mtcars[1:3, 1:3] # First 3 rows, 3 columns with row names
| Row Name | mpg | cyl | disp |
|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160 |
| Mazda RX4 Wag | 21.0 | 6 | 160 |
| Datsun 710 | 22.8 | 4 | 108 |
Expected Output (with row.names=TRUE)
"","mpg","cyl","disp"
"Mazda RX4",21,6,160
"Mazda RX4 Wag",21,6,160
"Datsun 710",22.8,4,108
Expected Output (with row.names=FALSE and col.names=FALSE)
21,6,160
21,6,160
22.8,4,108
5. Beyond Basics: Character Encoding and Locale Settings
| Feature | SAS | R |
|---|---|---|
| Character encoding | ENCODING="encoding" |
fileEncoding = "encoding" |
| Line endings | Platform-dependent | eol = "\n" or "\r\n" or "\r" |
SAS Example
proc export data=sashelp.class
outfile="C:/output/class_utf8.csv"
dbms=csv replace;
encoding="utf-8";
run;
Explanation (SAS):
encoding=option specifies the output file's character encoding- Common encodings: "utf-8", "latin1", "wlatin1"
- Line endings are platform-dependent by default
R Example
# Dummy data for encoding/line endings
df_enc <- data.frame(
Name = c("Jörg", "Anaïs", "李"),
Score = c(90, 85, 88)
)
# Set encoding and line endings
write.csv(mtcars, "output/mtcars_utf8.csv",
row.names = FALSE, fileEncoding = "UTF-8",
eol = "\r\n")
# readr with encoding
library(readr)
write_csv(mtcars, "output/mtcars_utf8_readr.csv",
eol = "\r\n") # UTF-8 is default in readr
Explanation (R):
fileEncodingcontrols the character encoding of the output fileeolparameter controls line endings: "\r\n" (Windows), "\n" (Unix), or "\r" (Mac)- readr's
write_csv()uses UTF-8 by default and can handle special characters better - Setting proper encoding is crucial for international character support
6. Beyond Basics: Advanced Output Formatting
| Feature | SAS | R |
|---|---|---|
| Date formatting | DATA step with formats before export |
format() before writing or format within write.csv() |
| Numeric precision | Not directly controllable in PROC EXPORT |
digits = n in write.table() |
| Column selection | Use DATA step view or KEEP= |
Select columns before export |
SAS Example
/* Format dates and numbers before export */
data export_ready;
set sashelp.class;
format height 5.1 weight 6.2 birth_date mmddyy10.;
run;
proc export data=export_ready
outfile="C:/output/class_formatted.csv"
dbms=csv replace;
run;
Explanation (SAS):
- SAS requires pre-formatting data before export
- Use
FORMATstatement in aDATAstep to apply formats - No direct control over output formatting in
PROC EXPORT - Need to create intermediate dataset with desired formats
R Example
# Dummy data for formatting
df_fmt <- data.frame(
Name = c("John", "Alice"),
Height = c(180.25, 165.75),
Weight = c(75.125, 62.375),
Birth_Date = as.Date(c("2000-01-15", "1995-06-22"))
)
# Pre-format data
library(dplyr)
formatted_mtcars <- mtcars %>%
mutate(mpg = sprintf("%.1f", mpg),
wt = sprintf("%.3f", wt))
# Write pre-formatted data
write.csv(formatted_mtcars, "output/mtcars_formatted.csv",
row.names = FALSE)
# Alternative with readr for decimal precision
library(readr)
write_csv(mtcars, "output/mtcars_readr.csv",
num_threads = 4)
Explanation (R):
- R offers multiple approaches to control formatting
- Pre-format columns using
sprintf(),format(), orround() write.csv()doesn't offer direct formatting optionsreadr::write_csv()maintains precision but doesn't allow format customizationnum_threadsparameter in readr enables parallel processing for large files
Input Data
| Name | Height | Weight | Birth_Date |
|---|---|---|---|
| John | 180.25 | 75.125 | 2000-01-15 |
| Alice | 165.75 | 62.375 | 1995-06-22 |
Expected Output (Formatted)
Name,Height,Weight,Birth_Date
John,180.3,75.13,01/15/2000
Alice,165.8,62.38,06/22/1995
7. Beyond Basics: Metadata-Driven Export and Automation
| Feature | SAS | R |
|---|---|---|
| Batch export | Macro loops | lapply() or purrr::map() |
| Config-driven export | Macro with parameters | Functions with parameters |
| Multiple file output | Loop through datasets | Loop through data frames |
SAS Example
/* Macro for flexible CSV export */
%macro export_csv(ds, outpath, delim=',', headers=YES, quotes=YES);
proc export data=&ds
outfile="&outpath"
dbms=csv replace;
delimiter="&delim";
putnames=&headers;
%if %upcase("es) = YES %then %do;
quote=yes;
%end;
run;
%mend;
/* Use macro for batch export */
%export_csv(sashelp.class, C:/output/class.csv);
%export_csv(sashelp.cars, C:/output/cars.csv, delim='|', quotes=NO);
Explanation (SAS):
- Create a parameterized macro for reusable export logic
- Parameters control delimiter, headers, and quoting
- Use macro variables for flexible file paths
- Can be expanded for batch processing multiple datasets
R Example
# Dummy data for batch export
df1 <- data.frame(A = 1:3, B = c("x", "y", "z"))
df2 <- data.frame(X = c(10, 20), Y = c("a", "b"))
df3 <- data.frame(M = c(TRUE, FALSE), N = c(2.5, 3.5))
# Create a function for consistent exports
export_csv <- function(data, file_path, delimiter = ",",
include_header = TRUE, na_str = "NA") {
write.table(data, file = file_path,
sep = delimiter,
col.names = include_header,
row.names = FALSE,
na = na_str,
quote = TRUE)
}
# Batch export from a list of data frames
datasets <- list(mtcars = mtcars, iris = iris, airquality = airquality)
paths <- paste0 "output/", names(datasets), ".csv")
# Export all with purrr
library(purrr)
walk2(datasets, paths, export_csv)
Explanation (R):
- Create a custom function for standardized exports
- Function parameters allow customization of delimiter, headers, NA values
- Use
listto organize multiple datasets purrr::walk2()applies the export function to each dataset-path pair- Can be expanded to read configuration from external files
Config Table Example (for R)
| dataset | output_path | delimiter | headers | na_string |
|---|---|---|---|---|
| mtcars | output/mtcars.csv | , | TRUE | MISSING |
| iris | output/iris.csv | ; | TRUE | NA |
| airquality | output/air.csv | , | FALSE | . |
Resulting Code (reading from config)
# Dummy config and data
export_csv <- function(data, file_path, delimiter = ",",
include_header = TRUE, na_str = "NA") {
write.table(data, file = file_path,
sep = delimiter,
col.names = include_header,
row.names = FALSE,
na = na_str,
quote = TRUE)
}
mtcars <- head(mtcars)
iris <- head(iris)
airquality <- head(airquality)
config <- data.frame(
dataset = c("mtcars", "iris", "airquality"),
output_path = c("output/mtcars.csv", "output/iris.csv", "output/air.csv"),
delimiter = c(",", ";", ","),
headers = c(TRUE, TRUE, FALSE),
na_string = c("MISSING", "NA", ".")
)
# Apply export function to each row of config
for (i in 1:nrow(config)) {
cfg <- config[i, ]
data <- get(as.character(cfg$dataset))
export_csv(data,
as.character(cfg$output_path),
as.character(cfg$delimiter),
as.logical(cfg$headers),
as.character(cfg$na_string))
}
8. Summary: SAS vs R CSV Export Capabilities
| Capability | SAS (PROC EXPORT) |
R (write.csv/write_csv) |
|---|---|---|
| Basic export | ✓ PROC EXPORT |
✓ write.csv() or write_csv() |
| Custom delimiters | ✓ DELIMITER= |
✓ sep = parameter |
| Header control | ✓ PUTNAMES=YES/NO |
✓ col.names = TRUE/FALSE |
| Row names export | ✗ Not applicable | ✓ row.names = TRUE/FALSE |
| Missing value customization | ✗ Limited control | ✓ na = parameter |
| Character encoding | ✓ ENCODING= |
✓ fileEncoding = parameter |
| Line ending control | ✗ Platform-dependent | ✓ eol = parameter |
| Format control | ✗ Requires pre-formatting | ✗ Requires pre-formatting |
| Batch processing | ✓ Via macros | ✓ Via functions and iteration |
| Performance on large files | ⚠ May be slower | ✓ Fast with data.table::fwrite() or readr |
Summary:
- Both SAS and R provide robust capabilities for CSV exports
- SAS's
PROC EXPORToffers straightforward syntax for basic exports - R provides more granular control over export parameters
- R has multiple packages (
readr,data.table) with optimized CSV writing functions - Both systems require pre-formatting for precise control of numeric formats
- R's ecosystem offers better tools for batch processing and automation
- For very large files, specialized packages in R offer significant performance advantages
**Resource download links**
1.5.12.-PROC-EXPORT-CSV-file-in-SAS-vs-R-equivalent.zip