I have a data frame that looks like this:

    USequence
# 1 GATCAGATC
# 2 ATCAGAC

I'm trying to create a function that would replace all the G's with C's, A's with T's, C's with G's, and T's with A's:

    USequence
# 1 CTAGTCTAG
# 2 TAGTCTG

This is what I have right now. The function accepts k, a data frame with a column named USequence:

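library(stringr)

conjugator <- function(k) {
  # Each call rewrites the whole column, so later steps undo earlier ones
  k$USequence <- str_replace_all(k$USequence, "A", "T")
  k$USequence <- str_replace_all(k$USequence, "T", "A")
  k$USequence <- str_replace_all(k$USequence, "G", "C")
  k$USequence <- str_replace_all(k$USequence, "C", "G")
  k  # return the modified data frame
}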

However, the obvious problem is that this doesn't replace the characters all at once but in successive steps, so each step overwrites part of the previous one and the result is wrong. Any suggestions? Thanks
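
To make the failure concrete, here is a small trace of the four steps on the first sequence. It is only a sketch: it uses base R's gsub() in place of str_replace_all(), and the variable names step1 to step4 are made up for illustration.

```r
x <- "GATCAGATC"

# Apply the four replacements one after another, as in the function above.
step1 <- gsub("A", "T", x, fixed = TRUE)      # every A becomes T
step2 <- gsub("T", "A", step1, fixed = TRUE)  # every T (old and new) becomes A
step3 <- gsub("G", "C", step2, fixed = TRUE)  # every G becomes C
step4 <- gsub("C", "G", step3, fixed = TRUE)  # every C (old and new) becomes G

step4
# [1] "GAAGAGAAG"
```

The desired result is "CTAGTCTAG", but after step 2 the original A's and T's can no longer be told apart, and the same happens to G and C after step 4.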

  • Sounds like a variant of the infamous interview question: "Swap A and B without creating a temp variable" :-) Commented Jul 18, 2015 at 11:33

1 Answer

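You could use chartr, which translates every character in a single pass, so no replacement can overwrite an earlier one: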

df1$USequence <- chartr('GATC', 'CTAG', df1$USequence)
df1$USequence 
#[1] "CTAGTCTAG" "TAGTCTG"  
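
If you want to keep the original function shape, the chartr call can be wrapped as below. This is a minimal sketch, assuming the data frame has a USequence column as in the question; the sample df1 is rebuilt here only so the example runs on its own.

```r
# Possible rewrite of the asker's conjugator() using base R's chartr().
conjugator <- function(k) {
  # chartr() maps G->C, A->T, T->A, C->G position by position in one pass
  k$USequence <- chartr("GATC", "CTAG", k$USequence)
  k  # return the modified data frame
}

# Sample data matching the question (assumed construction).
df1 <- data.frame(USequence = c("GATCAGATC", "ATCAGAC"),
                  stringsAsFactors = FALSE)

result <- conjugator(df1)
result$USequence
# [1] "CTAGTCTAG" "TAGTCTG"
```

Because the function returns k, the caller has to assign the result, e.g. df1 <- conjugator(df1).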

Or, with gsubfn, which looks up each matched character by name in a replacement list:

library(gsubfn)
gsubfn('[GATC]', list(G='C', A='T', T='A', C='G'), df1$USequence)
#[1] "CTAGTCTAG" "TAGTCTG"  
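
The same lookup idea can also be sketched in base R without extra packages, by splitting each string into characters and indexing a named vector. The names complement and swap_bases are made up for this example.

```r
# Lookup table: names are the original bases, values are their replacements.
complement <- c(G = "C", A = "T", T = "A", C = "G")

# Hypothetical helper: split each string into single characters,
# look every character up in the table, then paste them back together.
swap_bases <- function(x) {
  vapply(
    strsplit(x, "", fixed = TRUE),
    function(chars) paste(complement[chars], collapse = ""),
    character(1)
  )
}

swapped <- swap_bases(c("GATCAGATC", "ATCAGAC"))
swapped
# [1] "CTAGTCTAG" "TAGTCTG"
```

Note that a character missing from the table (for example N) would come out as "NA" here, whereas chartr leaves unlisted characters unchanged, so chartr is probably the safer choice for real sequence data.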