String manipulation
See also Factors in the “Data structures” chapter.
- Concatenate
- Split
- Regular expressions (grepl, grep, sub)
Concatenate
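paste() and paste0() concatenate character vectors element by element. paste() separates the pieces with sep (a single space by default), while paste0() is equivalent to paste(..., sep = ""). Shorter arguments are recycled to the length of the longest one, which is how a single prefix can be repeated in front of every element of a vector.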
paste("Chr", c(1:22, "X", "Y"), sep = "")
## [1] "Chr1" "Chr2" "Chr3" "Chr4" "Chr5" "Chr6" "Chr7" "Chr8"
## [9] "Chr9" "Chr10" "Chr11" "Chr12" "Chr13" "Chr14" "Chr15" "Chr16"
## [17] "Chr17" "Chr18" "Chr19" "Chr20" "Chr21" "Chr22" "ChrX" "ChrY"
paste0("Chr", c(1:22, "X", "Y"))
## [1] "Chr1" "Chr2" "Chr3" "Chr4" "Chr5" "Chr6" "Chr7" "Chr8"
## [9] "Chr9" "Chr10" "Chr11" "Chr12" "Chr13" "Chr14" "Chr15" "Chr16"
## [17] "Chr17" "Chr18" "Chr19" "Chr20" "Chr21" "Chr22" "ChrX" "ChrY"
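As a complement to the chromosome example above, here is a minimal sketch of recycling, of the collapse argument, and of strrep() for repeating a string. The variable names (exon_labels, dna, poly_a) are chosen for illustration only.

```r
# A length-one argument ("exon") is recycled against the longer vector 1:3
exon_labels <- paste("exon", 1:3, sep = "_")
print(exon_labels)
## [1] "exon_1" "exon_2" "exon_3"

# collapse joins all resulting elements into a single string
dna <- paste(c("A", "T", "G"), collapse = "")
print(dna)
## [1] "ATG"

# strrep() (base R, version 3.3.0 or later) repeats a string n times
poly_a <- strrep("A", 5)
print(poly_a)
## [1] "AAAAA"
```

Note that sep is placed between the arguments of one call, whereas collapse is placed between the elements of the result.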
Split
Splitting character strings is done with the strsplit() function, which returns a list with one element per input string.
Using the empty string as the separator splits a string into its individual characters.
strsplit("ATTGCCTGGATT", "")
## [[1]]
## [1] "A" "T" "T" "G" "C" "C" "T" "G" "G" "A" "T" "T"
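strsplit() can also split on a non-empty separator. The sketch below uses made-up genomic coordinates as input to show how to reach into the returned list. By default the separator is interpreted as a regular expression, so fixed = TRUE is used when a character such as "." should be taken literally.

```r
# Hypothetical input: two region strings of the form "chromosome:start-end"
regions <- c("chr1:100-200", "chrX:5-50")

# One list element per input string
fields <- strsplit(regions, ":")
print(fields[[1]])
## [1] "chr1"    "100-200"

# Extract the first piece of every element with sapply() and `[`
chroms <- sapply(fields, `[`, 1)
print(chroms)
## [1] "chr1" "chrX"

# fixed = TRUE treats "." as a literal dot, not as the regex "any character"
parts <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]
print(parts)
## [1] "a" "b" "c"
```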
Regular expressions
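Regular expressions can be applied to character strings.
grepl() tests whether each string contains a given pattern and returns a logical vector. The pattern is interpreted with Perl-compatible syntax when perl = TRUE, and as a POSIX extended regular expression otherwise (see also regexpr()).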
multi_strings <- c("Giraf", "Cow", "Frog", "Panda")
grepl(pattern = "^F", x = multi_strings, perl = TRUE)
## [1] FALSE FALSE TRUE FALSE
grep() returns the indices of the matching strings, or an empty integer vector if nothing matches. sub() finds the first occurrence of a pattern in each string and replaces it. If a string contains no match, it is returned unchanged.
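A short sketch of grep() and sub() on the same multi_strings vector follows. gsub(), which replaces every occurrence rather than only the first, is included for comparison even though it is not covered above.

```r
multi_strings <- c("Giraf", "Cow", "Frog", "Panda")

# Indices of the strings starting with "F"
idx <- grep(pattern = "^F", x = multi_strings)
print(idx)
## [1] 3

# value = TRUE returns the matching strings instead of their indices
with_o <- grep("o", multi_strings, value = TRUE)
print(with_o)
## [1] "Cow"  "Frog"

# No match gives an empty integer vector
no_match <- grep("z", multi_strings)
print(length(no_match))
## [1] 0

# sub() replaces only the first "a" in each string;
# strings without a match ("Cow", "Frog") are returned unchanged
first_only <- sub("a", "A", multi_strings)
print(first_only)
## [1] "GirAf" "Cow"   "Frog"  "PAnda"

# gsub() replaces every occurrence
every <- gsub("a", "A", multi_strings)
print(every)
## [1] "GirAf" "Cow"   "Frog"  "PAndA"
```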

This work by Celine Hernandez is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.