1.1.3. Installing and Managing R

1. R's Modular Architecture

R differs fundamentally from commercial statistical software through its:

  • Component-Based Design: Core system plus specialized add-on packages
  • Open Ecosystem: Over 18,000 packages on CRAN alone, extending core functionality
  • Resource Efficiency: Install only what you need, minimizing system footprint
  • Customization: Tailor your environment to specific analytical requirements
  • Continuous Evolution: Regular updates to both R and its package ecosystem

Example: Checking Your R Foundation

# See what R version you're running
R.version.string
#> [1] "R version 4.2.1 (2022-06-23)"

# View installed packages
head(installed.packages()[, c("Package", "Version")])
#>       Package    Version
#> base  "base"     "4.2.1"
#> boot  "boot"     "1.3-28"
#> class "class"    "7.3-20"
#> ...

2. Installation and Version Management

2.1. Initial Installation

Getting R

  • Source: CRAN (https://cran.r-project.org/) with global mirror network
  • Platforms: Windows, macOS, Linux (pre-compiled binaries for most systems)
  • Advanced Options: Source code available for custom compilation
  • Installation Process: Installer wizard on Windows and macOS (typically 3-5 minutes); package manager on Linux

Example: Verifying Installation

# After installation, check capabilities
capabilities()
#> jpeg      png     tiff    tcltk      X11    aqua http/ftp  sockets 
#>  TRUE     TRUE     TRUE     TRUE    FALSE     TRUE     TRUE     TRUE 
#> libxml     fifo   cledit    iconv      NLS  profmem    cairo    ICU 
#>  TRUE     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE    TRUE

Version Considerations

  • Numbering System: major.minor.patch format (e.g., 4.2.1)
  • Release Cycle: Major (every several years), minor (annually, usually in spring), patch (a few times a year, as needed)
  • Version Selection: Stable releases recommended for most users
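
Example: Comparing Versions in Code

Because a version number has three numeric components, it is safer to compare versions as version objects than as strings. The following is a minimal sketch using base R's getRversion() and package_version(); the "4.0.0" threshold is an arbitrary value chosen for illustration.

```r
# Version of the running R session, as a comparable version object
current <- getRversion()

# Compare against a minimum version (the "4.0.0" threshold is illustrative)
if (current >= "4.0.0") {
  message("Running R ", current, ", which is 4.0.0 or newer")
}

# Split a version into its major, minor, and patch components
parts <- unlist(package_version("4.2.1"))
names(parts) <- c("major", "minor", "patch")
parts

# Version objects compare numerically, component by component
package_version("4.10.0") > package_version("4.9.0")
#> [1] TRUE

# Plain strings compare character by character, which gives the wrong answer
"4.10.0" > "4.9.0"
#> [1] FALSE
```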

2.2. Managing Multiple Versions

  • Isolated Installation: Each R version installs to its separate directory
  • Side-by-Side Usage: Multiple versions can coexist and even run simultaneously
  • Testing: Evaluate new versions while maintaining stable production environments
  • Version-Specific Libraries: Each installation has its own system library; user libraries are keyed to the major.minor version (e.g., 4.2), so patch releases share them

Example: Windows Directory Structure

C:\Program Files\R\
   ├── R-4.1.3\          # Older R version
   │   ├── bin\
   │   └── library\      # Packages for 4.1.3
   │ 
   └── R-4.2.1\          # Current R version
       ├── bin\
       └── library\      # Packages for 4.2.1
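
Example: Confirming the Active Installation

When several versions sit side by side, it helps to confirm from inside a session which installation and which libraries are actually in use. A short sketch using base R only:

```r
# Root directory of the R installation running this session
R.home()

# Directory holding the executables for this installation
R.home("bin")

# Library directories searched for packages, in order of priority
.libPaths()

# The system library that ships with this installation
.Library
```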

2.3. Updating R

  • New Version Process: Install alongside existing version rather than upgrading in-place
  • Package Migration: Reinstall packages after each new major or minor release (e.g., 4.1 to 4.2); patch releases can usually reuse the existing user library
  • Bulk Installation: Create package lists for efficient reinstallation
    # Define and install your common packages
    myPackages <- c("dplyr", "ggplot2", "tidyr", "Hmisc")
    install.packages(myPackages, dependencies = TRUE)
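
When migrating, it is often enough to install only what is missing. The sketch below assumes a helper named missing_packages(), which is hypothetical and not part of base R, and reuses the package list shown above. The install and save calls are commented out so that running the sketch has no side effects.

```r
# Hypothetical helper: which of the wanted packages are not yet installed?
missing_packages <- function(wanted, installed = rownames(installed.packages())) {
  setdiff(wanted, installed)
}

myPackages <- c("dplyr", "ggplot2", "tidyr", "Hmisc")
to_install <- missing_packages(myPackages)
to_install

# Install only the missing ones
# if (length(to_install) > 0) install.packages(to_install)

# In the old R version, the full list can be saved for later reuse
# (the file name is a placeholder):
# saveRDS(rownames(installed.packages()), "my-packages.rds")
```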
    

Example: Checking for Updates (Windows)

# Using the installr package (Windows)
if (!requireNamespace("installr", quietly = TRUE)) install.packages("installr")
installr::check.for.updates.R()
# Shows dialog with update information if available

2.4. Uninstallation

R Removal By Platform

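  • Windows: Use Settings > Apps (or Control Panel > Programs and Features), or run the uninstaller (unins000.exe) in the R installation directory
  • macOS: Drag R.app to the Trash and remove the framework at /Library/Frameworks/R.framework
  • Linux: Use the package manager; for a source build, run make uninstall from the build directory or delete the installation directory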

Example: Linux Uninstallation Commands

# Ubuntu/Debian
sudo apt remove r-base r-base-core r-base-dev

# Fedora/RHEL
sudo dnf remove R

3. Package Management

3.1. Package Repositories

  • Primary Source: CRAN (Comprehensive R Archive Network)
  • Specialized Repositories:
    • Bioconductor: Genomics and bioinformatics
    • R-Forge: Development platform for cutting-edge packages
    • GitHub/GitLab: Direct developer releases
  • Repository Selection: Use setRepositories() or menu options to configure sources
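
Example: Inspecting and Setting Repositories

Repositories can also be configured in code through the repos option. This is a minimal sketch; the mirror URL is CRAN's cloud mirror, and "packagename" in the commented lines is a placeholder.

```r
# Repositories currently configured for this session
getOption("repos")

# Point CRAN at the cloud mirror while keeping any other entries
repos <- getOption("repos")
repos["CRAN"] <- "https://cloud.r-project.org"
options(repos = repos)

# Bioconductor packages are usually installed through the BiocManager package:
# install.packages("BiocManager")
# BiocManager::install("packagename")
```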

3.2. Installing Packages

Basic Installation

  • Single Package: install.packages("packagename")
  • Multiple Packages: install.packages(c("package1", "package2"))
  • Dependencies: Required dependencies (Depends, Imports, LinkingTo) are installed by default; add dependencies = TRUE to also include suggested packages
  • Special Scenarios: Non-admin installation (pass lib = to target a user-writable library), version-specific installation (e.g., remotes::install_version())
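
Example: Installing Without Administrator Rights

Installing without administrator rights comes down to choosing a library directory you can write to. The sketch below assumes a helper named use_library(), which is hypothetical, and uses a temporary directory purely for illustration. A real setup would more likely use the path in Sys.getenv("R_LIBS_USER"). The install calls are commented out, and "packagename" and "1.0.0" are placeholders.

```r
# Hypothetical helper: create a library directory if needed and
# put it first on the library search path
use_library <- function(path) {
  if (!dir.exists(path)) dir.create(path, recursive = TRUE)
  .libPaths(c(path, .libPaths()))
  invisible(normalizePath(path))
}

# Illustrative location only
demo_lib <- use_library(file.path(tempdir(), "demo-library"))
.libPaths()

# Packages can then be installed there without administrator rights:
# install.packages("packagename", lib = demo_lib)

# A specific version can be requested with the remotes package:
# remotes::install_version("packagename", version = "1.0.0")
```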

Installation Methods

  • GUI: Menu-driven via Packages > Install package(s)
  • Console: Direct function calls with additional options
  • Project-Based: Tools like renv for project-specific dependencies

Example: Installing a Package from GitHub

# Using devtools to install from GitHub
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("username/repo")

3.3. Loading and Using Packages

Package Loading

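  • Basic Loading: library(packagename) stops with an error if the package is missing; require(packagename) returns FALSE with a warning instead
  • Verification: search() lists attached packages; loadedNamespaces() also includes packages loaded but not attached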
  • Selective Usage: Access functions without loading via packagename::function()
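
Example: Using a Package Without Attaching It

The difference between attaching a package and calling into it directly is easiest to see in a short session. This sketch uses only packages that ship with R; "notARealPackage" is a made-up name used to show the failure case.

```r
# Call a function without attaching its package
stats::median(c(1, 3, 5))
#> [1] 3

# Check for an optional package without attaching it or raising an error
requireNamespace("stats", quietly = TRUE)
#> [1] TRUE
requireNamespace("notARealPackage", quietly = TRUE)
#> [1] FALSE

# Attached packages appear on the search path with a "package:" prefix
"package:base" %in% search()
#> [1] TRUE
```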

Managing Conflicts

  • Masking: When functions in different packages share names
  • Resolution: Use explicit references (stats::filter() vs dplyr::filter())
  • Temporary Usage: Load-use-detach pattern for conflict avoidance
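
Example: Masking and the Load-Use-Detach Pattern

Masking can be reproduced without installing anything by defining a function whose name collides with one in stats. The sketch then shows load-use-detach with splines, a package that ships with R but is not attached by default. The function body and the bs() arguments are illustrative.

```r
# Defining a function named filter() masks stats::filter()
filter <- function(x) "my own filter"
filter(1:3)
#> [1] "my own filter"

# The explicit reference still reaches the original (a 3-point moving average here)
stats::filter(1:5, rep(1/3, 3))

# find() shows every place on the search path where the name is defined
find("filter")

# Remove the masking definition again
rm(filter)

# Load, use, then detach
library(splines)
basis <- bs(1:10, df = 4)   # B-spline basis matrix with 4 columns
detach("package:splines")
"package:splines" %in% search()
#> [1] FALSE
```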

3.4. Package Maintenance

Updating

  • Check and Update: update.packages() for interactive updates
  • Non-Interactive: update.packages(ask = FALSE) for automatic updates

Removing Packages

  • Package Removal: remove.packages("packagename") after detaching
  • Cleanup: pacman::p_delete() is a convenience wrapper for removing packages; dependencies that are no longer needed must be removed separately

4. Working with Package Data

Finding Datasets

  • View All Available: data() lists datasets in attached packages
  • Package-Specific: data(package = "packagename") for single package
  • Search by Topic: data() has no keyword argument; filter the listing in data()$results by title, or use help.search("keyword")
  • Comprehensive Search: data(package = .packages(all.available = TRUE))
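
Example: Searching Dataset Titles by Keyword

One way to search datasets by keyword is to filter the listing that data() returns. Its $results component is a character matrix with Package, LibPath, Item, and Title columns. The keyword "car" below is an arbitrary example.

```r
# data() returns an object whose $results matrix has one row per dataset
available <- as.data.frame(data(package = "datasets")$results,
                           stringsAsFactors = FALSE)

# Keep datasets whose title mentions the keyword
hits <- available[grepl("car", available$Title, ignore.case = TRUE),
                  c("Item", "Title")]
hits
```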

Using Package Datasets

  • Direct Access: Available after loading the containing package
  • Explicit Loading: data(datasetname) when needed
  • Exploration Tools: head(), str(), summary() for quick examination
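
Example: Exploring a Built-In Dataset

A quick look at two datasets from the datasets package, which is attached by default:

```r
# mtcars is available immediately because datasets is attached by default
head(mtcars, 3)      # first three rows
str(mtcars)          # column types and a preview of the values
summary(mtcars$mpg)  # minimum, quartiles, mean, and maximum

dim(mtcars)
#> [1] 32 11

# Explicit loading copies the dataset into the workspace
data("iris")
dim(iris)
#> [1] 150   5
```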

Common Learning Datasets

  • General Statistics: mtcars, iris, ToothGrowth
  • Documentation: Use ?datasetname for structure and source information
  • Specialized Domains: Field-specific packages include relevant datasets

5. Automation and Customization

Startup Customization

  • Configuration File: Create .Rprofile in working or home directory
  • Automatic Package Loading: Set commonly used packages to load at startup
  • Custom Settings: Define personal defaults for R options
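
Example: A Minimal .Rprofile

The points above can be sketched as a small startup file. The option values and the package list are illustrative placeholders, not recommendations.

```r
# Sketch of a .Rprofile; R runs this file at the start of each session

# Personal defaults (values are illustrative)
options(
  digits = 4,   # significant digits shown when printing numbers
  scipen = 10   # prefer fixed notation over scientific notation
)

# Packages to attach in interactive sessions (placeholder list; substitute your own)
startup_packages <- c("stats", "utils")

if (interactive()) {
  for (pkg in startup_packages) {
    # Skip packages that are not installed rather than failing at startup
    if (requireNamespace(pkg, quietly = TRUE)) {
      suppressPackageStartupMessages(library(pkg, character.only = TRUE))
    }
  }
}
```

Attaching packages at startup makes scripts depend on your personal setup, so for shared work it is usually safer to keep library() calls in the scripts themselves.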

Project Management

  • Project-Specific Settings: Maintain separate configurations by project
  • Package Lists: Document required packages for reproducibility
  • Version Recording: Note R and package versions for key analyses
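
Example: Recording Versions for an Analysis

Version recording can be as simple as writing the R version and a few package versions to a text file. The sketch below assumes a helper named record_versions(), which is hypothetical; the package list and file location are placeholders.

```r
# Hypothetical helper: write the R version and selected package versions to a file
record_versions <- function(pkgs, file) {
  versions <- vapply(pkgs, function(p) as.character(packageVersion(p)),
                     character(1))
  lines <- c(R.version.string, paste(pkgs, versions))
  writeLines(lines, file)
  invisible(lines)
}

# Illustrative call
out_file <- file.path(tempdir(), "versions.txt")
record_versions(c("stats", "utils"), out_file)
readLines(out_file)

# sessionInfo() gives a fuller snapshot, including the platform and attached packages
sessionInfo()
```

For stricter reproducibility, renv records exact package versions in a lockfile via renv::snapshot().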

Together, this modular design and its management tools let you build an analytical environment tailored to your needs, a degree of flexibility that monolithic statistical packages rarely offer.