I have a .csv file that I’m having trouble processing in R because the numbers use commas as decimal separators. Excel writes the decimals with commas, while R expects dots by default. I want to replace the commas with dots as I load the data into R. Here are the approaches I’ve tried:

Approach 1:

data <- read.csv2("cleaned_data.csv", stringsAsFactors = FALSE)
data$Daily_Loss <- as.numeric(gsub(",", ".", data$Daily_Loss))
data$Total_Loss <- as.numeric(gsub(",", ".", data$Total_Loss))
data$loss_2023 <- as.numeric(gsub(",", ".", data$loss_2023))
data$loss_2024 <- as.numeric(gsub(",", ".", data$loss_2024))

Approach 2:

data <- read.csv("cleaned_data.csv", sep = ";", dec = ",", stringsAsFactors = FALSE)
data$Daily_Loss <- as.numeric(gsub(",", ".", trimws(data$Daily_Loss)))
data$Total_Loss <- as.numeric(gsub(",", ".", trimws(data$Total_Loss)))
data$loss_2023 <- as.numeric(gsub(",", ".", trimws(data$loss_2023)))
data$loss_2024 <- as.numeric(gsub(",", ".", trimws(data$loss_2024)))

Any suggestions on how I can fix this more efficiently? What I want to do afterwards is:

total_loss_2023 <- sum(data$loss_2023, na.rm = TRUE)
total_loss_2024 <- sum(data$loss_2024, na.rm = TRUE)
cat("Total Loss in 2023: ", total_loss_2023, "\n")
cat("Total Loss in 2024: ", total_loss_2024, "\n")

Any help is welcome!

[Image: part of the table I am trying to convert to a different format]

  • Without sep=";" it does not work. Your proposal gives me: Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names Commented Dec 25, 2024 at 19:25
  • Hi, in my question I specify that my file is a .csv, and I wrote that I already tried both read.csv2("cleaned_data.csv", stringsAsFactors = FALSE) and read.csv("cleaned_data.csv", sep = ";", dec = ",", stringsAsFactors = FALSE). Commented Dec 25, 2024 at 19:36
  • Posts to SO need to show the input, which in this case is the csv file, so that others can run the code. If the file is large you can post a cut-down version. Commented Dec 25, 2024 at 19:39
  • Just use read.csv("file.csv", dec = ","), like @G.Grothendieck suggested, instead of read.csv2(). It works for me! Commented Dec 25, 2024 at 19:47

1 Answer

To handle comma decimals efficiently, try:

data <- read.csv2("cleaned_data.csv", dec = ",", stringsAsFactors = FALSE)

read.csv2() is designed for European-style CSV files: it defaults to sep = ";" and dec = ",", so passing dec = "," explicitly is redundant but harmless. If that doesn't work, you can read the file first and replace the commas yourself:

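# The comments indicate the file is semicolon-separated, so set sep = ";";
# with the default dec = ".", comma-decimal columns are read as character
data <- read.csv("cleaned_data.csv", sep = ";", stringsAsFactors = FALSE)
# Convert all numeric columns at once
numeric_cols <- c("Daily_Loss", "Total_Loss", "loss_2023", "loss_2024")
data[numeric_cols] <- lapply(data[numeric_cols], function(x) as.numeric(gsub(",", ".", x)))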
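
As a quick way to check the first suggestion without the real file, here is a minimal, self-contained sketch. The inline sample data is invented for illustration (the question does not show the file's contents); only the column names and the summing code come from the question. It assumes the file is semicolon-separated with comma decimals, as the comments suggest.

```r
# Hypothetical sample mimicking a semicolon-separated file with comma decimals.
# Column names are taken from the question; the values are made up.
csv_text <- "Daily_Loss;Total_Loss;loss_2023;loss_2024
1,5;10,25;3,5;4,5
2,5;20,75;1,5;2,5"

# read.csv2() defaults to sep = ";" and dec = ",", so no gsub() step is needed.
data <- read.csv2(text = csv_text, stringsAsFactors = FALSE)

# Inspect the column types; all four should be numeric.
str(data)

total_loss_2023 <- sum(data$loss_2023, na.rm = TRUE)
total_loss_2024 <- sum(data$loss_2024, na.rm = TRUE)
cat("Total Loss in 2023: ", total_loss_2023, "\n")
cat("Total Loss in 2024: ", total_loss_2024, "\n")
```

If a column in the real file still comes back as character after this, it probably contains something other than digits and a comma decimal (stray spaces, a currency symbol, or thousands separators, for example), and looking at unique() values of that column is a reasonable next step.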