Some notes for R.
Filter rows according to values of multiple columns
Use filter_all, filter_at or filter_if from dplyr [1].
    # You can take the intersection of the replicated expressions:
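A minimal sketch of the scoped filter verbs on the built-in mtcars data. The thresholds (150 and 20) and the chosen columns are arbitrary picks for illustration, not part of the original note. Recent dplyr releases mark these scoped verbs as superseded in favour of if_all()/if_any(), though they still work.

```r
library(dplyr)

# Intersection: keep rows where EVERY column is greater than 150
all_big <- filter_all(mtcars, all_vars(. > 150))

# Union: keep rows where AT LEAST ONE column is greater than 150
any_big <- filter_all(mtcars, any_vars(. > 150))

# filter_at() applies the test only to the columns named in vars()
fast_and_frugal <- filter_at(mtcars, vars(mpg, qsec), all_vars(. > 20))

nrow(all_big)
nrow(any_big)
nrow(fast_and_frugal)
```

all_vars() combines the per-column tests with AND, any_vars() with OR.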
Join path and/or filename
Use file.path() [2].
    > file.path("usr", "local", "lib")
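A short sketch of how file.path() behaves. The file names a.txt and b.txt are hypothetical.

```r
# Components are joined with "/" regardless of platform
lib_dir <- file.path("usr", "local", "lib")
print(lib_dir)

# Arguments are vectorised, so several file names can be joined in one call
files <- file.path(lib_dir, c("a.txt", "b.txt"))
print(files)
```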
List files in a dir
Use list.files() [3].
    list.files(path = ".", pattern = NULL, all.files = FALSE,
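A small usage sketch. It creates a scratch directory under tempdir() so it can run anywhere; the directory and file names are made up for the example.

```r
# Set up a scratch directory with three empty files (hypothetical names)
demo_dir <- file.path(tempdir(), "list_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("a.tsv", "b.tsv", "notes.txt")))

# pattern is a regular expression, not a shell glob;
# full.names = TRUE returns paths that include the directory
tsv_files <- list.files(path = demo_dir, pattern = "\\.tsv$", full.names = TRUE)
print(basename(tsv_files))
```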
Read multiple files into one dataframe
Use do.call with lapply [4].
    dataset <- do.call("rbind", lapply(file_list, FUN = function(files) {
      read.table(files, header = TRUE, sep = "\t")
    }))
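An end-to-end sketch of the same idea, assuming tab-separated files that all share the same columns. The scratch directory, the file names and the helper read_one are hypothetical.

```r
# Write two small tab-separated files to a scratch directory
demo_dir <- file.path(tempdir(), "merge_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(id = 1:2, value = c(0.1, 0.2)),
            file.path(demo_dir, "part1.tsv"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(id = 3:4, value = c(0.3, 0.4)),
            file.path(demo_dir, "part2.tsv"),
            sep = "\t", row.names = FALSE, quote = FALSE)

# Collect the paths, read each file, then stack the data frames row-wise
file_list <- list.files(demo_dir, pattern = "\\.tsv$", full.names = TRUE)
read_one <- function(path) read.table(path, header = TRUE, sep = "\t")
dataset <- do.call(rbind, lapply(file_list, read_one))
print(dataset)
```

rbind() requires every file to have identical column names, so this only suits files with a common layout.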
Select helpers in dplyr
Select helpers [5]:
starts_with(): starts with a prefix
ends_with(): ends with a suffix
contains(): contains a literal string
matches(): matches a regular expression
num_range(): a numerical range like x01, x02, x03
one_of(): variables in a character vector
everything(): all variables
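The helpers above can be tried on a toy data frame. The column names below are invented for the example.

```r
library(dplyr)

# Toy data frame with hypothetical column names
df <- data.frame(x01 = 1, x02 = 2, x03 = 3,
                 name_first = "a", name_last = "b", id = 10)

by_prefix <- names(select(df, starts_with("name")))
by_suffix <- names(select(df, ends_with("last")))
by_substr <- names(select(df, contains("ame")))
by_regex  <- names(select(df, matches("^x[0-9]+$")))
# width = 2 pads the numbers with zeros, so 1:2 matches x01 and x02
by_range  <- names(select(df, num_range("x", 1:2, width = 2)))
by_vector <- names(select(df, one_of(c("id", "x03"))))
# everything() picks up the remaining columns, which moves id to the front
reordered <- names(select(df, id, everything()))
```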
Keep strings matching a pattern
Use stringr::str_subset [6].
    fruit <- c("apple", "banana", "pear", "pinapple")
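A brief sketch of str_subset() on the same vector. The patterns are arbitrary choices for illustration.

```r
library(stringr)

fruit <- c("apple", "banana", "pear", "pinapple")

# Keep only the strings that match the regular expression
starts_a <- str_subset(fruit, "^a")
ends_a   <- str_subset(fruit, "a$")
starts_p <- str_subset(fruit, "^p")

print(starts_a)
print(ends_a)
print(starts_p)
```

str_subset(x, pattern) returns the matching strings themselves, much like base R's grep(pattern, x, value = TRUE).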
References
1. https://dplyr.tidyverse.org/reference/filter_all.html
2. https://stackoverflow.com/questions/13110076/function-to-concatenate-paths
3. https://stat.ethz.ch/R-manual/R-devel/library/base/html/list.files.html
4. https://psychwire.wordpress.com/2011/06/03/merge-all-files-in-a-directory-using-r-into-a-single-dataframe/#comment-24
5. https://dplyr.tidyverse.org/reference/select_helpers.html
6. https://stringr.tidyverse.org/reference/str_subset.html