Reading and Writing Data

Examples

How do I read a csv file called grades.csv into a data.frame?

. means the current working directory.

So, if we were in /home/john/projects, that is our current working directory.

./grades.csv is the same as /home/john/projects/grades.csv. In this case, ./grades.csv is the relative path.

dat <- read.csv("./grades.csv")
head(dat)
   grade      year
 1   100    junior
 2    99 sophomore
 3    75 sophomore
 4    74 sophomore
 5    44    senior
 6    69    junior

How do I read a csv file called grades.csv into a data.frame using the function fread?

Note: The fread function is part of the data.table package. It reads in datasets faster than read.csv. You are strongly encouraged to use fread to read large datasets in R.

library(data.table)
dat <- data.frame(fread("./grades.csv"))
head(dat)
   grade      year
 1   100    junior
 2    99 sophomore
 3    75 sophomore
 4    74 sophomore
 5    44    senior
 6    69    junior

How do I read a csv file called grades2.csv where the data is separated by semi-colons (;) in a data.frame?

CSV stands for comma-separated values. So, read.csv has comma (,) as the default value for the sep parameter.
dat <- read.csv("./grades_semi.csv", sep=";")
head(dat)
   grade      year
 1   100    junior
 2    99 sophomore
 3    75 sophomore
 4    74 sophomore
 5    44    senior
 6    69    junior

How do I prevent R from reading in strings as factors when using a function like read.csv?

In R 4.0+, strings are not read in as factors, so you don’t need to worry about this. For R < 4.0 (any older R version than 4.0), use stringsAsFactors.

dat <- read.csv("./grades.csv", stringsAsFactors=F)
head(dat)
   grade      year
 1   100    junior
 2    99 sophomore
 3    75 sophomore
 4    74 sophomore
 5    44    senior
 6    69    junior

How do I specify the type of 1 or more columns when reading in a csv file?

dat <- read.csv("./grades.csv", colClasses=c("grade"="character", "year"="factor"))
str(dat)
'data.frame':    10 obs. of  2 variables:
 $ grade: chr  "100" "99" "75" "74" ...
 $ year : Factor w/ 4 levels "freshman","junior",..: 2 4 4 4 3 2 2 3 1 2

Given a list of csv files with the same columns, how can I read in and combine them into a one dataframe?

# We want to read in grades.csv, grades2.csv, and grades3.csv
# into a single dataframe.

list_of_files <- c("grades.csv", "grades2.csv", "grades3.csv")

results <- data.frame()
for (file in list_of_files) {
  dat <- read.csv(file)
  results <- rbind(results, dat)
}
dim(results)
[1] 32  2

How do I create a data.frame with comma-separated data that I’ve copied onto my clipboard?

# For mac
dat <- read.delim(pipe("pbpaste"),header=F,sep=",")

# For windows
dat <- read.table("clipboard",header=F,sep=",")