2

Is there an elegant way to remove a sub-string within a string based on the index of the characters?

Here is how I do it now:

# My data
mystring <- "Hello, how are {you} doing?"
index_of_substring <- c(16,20)

# Pasting two substrings
mystring_no_substring <- paste0(substr(mystring, 1, index_of_substring[1]-1), substr(mystring, index_of_substring[2]+1, nchar(mystring)))

# Cleaning extra spaces
mystring_no_substring <- gsub("  ", " ", mystring_no_substring)

I could of course wrap this in a general function, but I was wondering whether there is a more elegant solution out there, e.g. a way to replace the characters at given indices with nothing or with another word.

Note: This is not a regex question.

  • stringi::stri_sub(mystring, 16, 20) <- "" for example? Commented Dec 15, 2017 at 19:08
  • Yeah, that is already much more elegant! Thanks. Commented Dec 15, 2017 at 19:35
  • Another option is: stringr::str_sub(mystring, 16, 20) <- "" Commented Dec 15, 2017 at 19:54

3 Answers

2

1) strsplit/paste Break up the input into characters, omit the ones between 16 and 20 inclusive, collapse it back together and replace runs of spaces with single spaces. Uses base functions only.

gsub(" +", " ", paste(strsplit(s, "")[[1]][-seq(ix[1], ix[2])], collapse = ""))
## [1] "Hello, how are doing?"

2) substr<- Replace the indicated characters with spaces and then reduce runs of spaces to a single space. Only base functions are used.

gsub(" +", " ", "substr<-"(s, ix[1], ix[2], gsub(".", " ", s)))
## [1] "Hello, how are doing?"

Note that this is non-destructive, i.e. it outputs the result without modifying the input.

Note: We used test input:

s <- "Hello, how are {you} doing?"
ix <- c(16, 20)
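
If you want to reuse approach (1), it could be wrapped up roughly as follows. This is only a sketch: the name remove_by_index and the squeeze argument are made up for illustration, and it assumes a single input string with 1 <= start <= stop.

```r
# Hypothetical helper (name and arguments are illustrative, not from any package):
# drop the characters at positions start..stop (inclusive) from one string.
remove_by_index <- function(x, start, stop, squeeze = TRUE) {
  chars <- strsplit(x, "")[[1]]                        # split into single characters
  out <- paste(chars[-seq(start, stop)], collapse = "")  # omit the indexed range
  if (squeeze) out <- gsub(" +", " ", out)             # collapse runs of spaces
  out
}

s <- "Hello, how are {you} doing?"
ix <- c(16, 20)
remove_by_index(s, ix[1], ix[2])
## [1] "Hello, how are doing?"
s  # the input itself is left unchanged
## [1] "Hello, how are {you} doing?"
```

Setting squeeze = FALSE keeps the double space left behind by the removal, which may be preferable when the original spacing matters.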

1

You can also use paste0 and substr with the positions hard-coded:

paste0(substr(mystring, 1, 14), substr(mystring, 21, 27))
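
The same idea can probably be generalised so the positions are not hard-coded and the removed range can be swapped for another word, which the question also asks about. A minimal sketch, where replace_by_index is a made-up name and start/stop are assumed to be valid 1-based positions:

```r
# Hypothetical helper (illustrative name): replace characters start..stop
# (inclusive) with `replacement`; the default "" simply removes them.
replace_by_index <- function(x, start, stop, replacement = "") {
  paste0(substr(x, 1, start - 1),        # everything before the range
         replacement,
         substr(x, stop + 1, nchar(x)))  # everything after the range
}

mystring <- "Hello, how are {you} doing?"
ix <- c(16, 20)

replace_by_index(mystring, ix[1], ix[2], "they")
## [1] "Hello, how are they doing?"

# Removing the range leaves two spaces behind, so squeeze them afterwards
gsub(" +", " ", replace_by_index(mystring, ix[1], ix[2]))
## [1] "Hello, how are doing?"
```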

0

I believe my solution is pretty much what you'd get if you coded your method as a general function, but here you go. I first use a custom function called "strpos_fixed" to locate the substring I'd like to remove. I'm not quite as comfortable with regex as I'd like to be, so I restrict this function to fixed matching for simplicity's sake.

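# Position of the first fixed (non-regex) match of y in x; -1 if there is none
strpos_fixed <- function(x, y) {
  regexpr(y, x, fixed = TRUE)[1]
}

# Remove everything from the first rm_start through the first rm_end, inclusive
rm_substr <- function(string, rm_start, rm_end) {
  # Text before the opening delimiter
  sub1 <- substr(string, 1, strpos_fixed(string, rm_start) - 1)

  # Text after the closing delimiter
  sub2 <- substr(string, strpos_fixed(string, rm_end) + nchar(rm_end),
                 nchar(string))

  # Join the pieces (paste0 adds no extra space) and collapse runs of whitespace
  gsub("\\s{2,}", " ", paste0(sub1, sub2))
}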

mystring <- "Hello, how are {you} doing?"
rm_substr(mystring, "{", "}")
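
One thing to watch for: regexpr returns -1 when a delimiter is not found, and the function above does not check for that. A possible guarded variant is sketched below. The name rm_between is made up, and looking for the closing delimiter only after the opening one is my assumption about the intended behaviour.

```r
# Hypothetical variant (illustrative name): remove the text from the first
# `open` through the next `close`, returning the input unchanged if either
# delimiter is missing.
rm_between <- function(string, open, close) {
  start <- regexpr(open, string, fixed = TRUE)[1]
  if (start == -1) return(string)              # no opening delimiter

  # Search for the closing delimiter only in the text after the opening one
  rest <- substr(string, start + nchar(open), nchar(string))
  end_rel <- regexpr(close, rest, fixed = TRUE)[1]
  if (end_rel == -1) return(string)            # no closing delimiter

  # Index of the last character of the closing delimiter in `string`
  stop <- start + nchar(open) + end_rel - 1 + nchar(close) - 1

  out <- paste0(substr(string, 1, start - 1),
                substr(string, stop + 1, nchar(string)))
  gsub(" {2,}", " ", out)                      # collapse runs of spaces
}

rm_between("Hello, how are {you} doing?", "{", "}")
## [1] "Hello, how are doing?"
rm_between("no braces here", "{", "}")
## [1] "no braces here"
```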
