I am trying to plot spectroscopic data using ggplot2. I get my data in the following form:

data structure

My code so far is:

library(ggplot2)
library(reshape2)
melt_data <- melt(spectroscopic_data, id.vars = "sample_name", variable.name = "wavenumber", value.name = "intensity")
melt_data$sample_name = factor(melt_data$sample_name)
melt_data$wavenumber = as.numeric(levels(melt_data$wavenumber))[melt_data$wavenumber]
ggplot(melt_data, aes(x=wavenumber, y=intensity, group=sample_name, color=sample_name)) + geom_line() +
scale_x_reverse(breaks=c(10000, 9500, 9000, 8500, 8000, 7500, 7000, 6500, 6000, 5500, 5000, 4500, 4000)) +
scale_color_manual(values=c("#FF0000", "#0000CC", "#00CC00", "#FF00FF", "#FF9900", "#000000", "#999900", "#33FFFF", "#FFCCFF", "#FFFF00", "#999999", "#9933FF", "#993300", "#99FF33")) + 
theme_bw() + 
theme(legend.position = "bottom") +
labs(x=expression(wavenumbers), y="intensity", colour = "") + 
theme(legend.text=element_text(size=10), axis.text=element_text(size=12), axis.title=element_text(size=14)) + 
guides(colour = guide_legend(ncol = 2, keywidth=1.5, keyheight=1, override.aes = list(size=1.8)))

I need the same color for aaa-samples, bbb-samples and so on (multiple measurements of one sample) but the plot does not work. I get a plot that looks like this when you zoom in:

zoom of current plot

It looks like ggplot2 connects the two measurements of the same sample into one line instead of plotting them separately. Does anyone have an idea? I have been trying to fix this for hours...

Thank you!

    Welcome to Stack Overflow! This is a solid first question (I wish all the new questions were this quality). I'll just give you links, in case you want to take the tour. If you have more questions, you can head over to the help center as well. Commented Apr 17, 2017 at 21:27

2 Answers

1

Here is my result after Luke C's awesome support:

library(ggplot2)
library(reshape2)

melted_data <- melt(newtestdata, id.vars = c("sample_name", 
"sample_id"), variable.name = "wavenumber", value.name = "intensity")

melted_data$wavenumber=as.numeric(levels(melted_data$wavenumber))[melted_data$wavenumber]

ggplot(melted_data, aes(x=wavenumber, y=intensity, group = sample_id, color = sample_name)) + geom_line() +
scale_x_reverse(breaks=c(1005, 1200, 1400), expand = c(0.01, 0.01)) +  
scale_y_continuous(breaks=c(0, 0.5, 1.0, 1.5, 2.0), expand = c(0.01, 0.01)) +

scale_color_manual(values=c("#FF0000", "#0000CC", "#00CC00", "#FF00FF", "#FF9900", "#000000")) +

theme_bw() + 
theme(legend.position = "bottom") + 
theme(plot.margin=unit(c(1,1,0.5,1),"cm")) +

labs(x=expression(wavenumbers~"in"~cm^{"-1"}), y="absorbance in a.u.", colour = "") + 
theme(legend.text=element_text(size=10), axis.text=element_text(size=12), axis.title=element_text(size=14)) + 
guides(colour = guide_legend(ncol = 3, keywidth=1.5, keyheight=1, override.aes = list(size=1.2)))

ggsave("buechi-all.pdf", width = 11.69, height = 8.27)
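The code above assumes that newtestdata already contains a sample_id column. A minimal sketch of how such a column could be added before melting, following the approach from the other answer. The toy data frame and its wavenumber columns are hypothetical stand-ins for the real data:

```r
# Hypothetical stand-in for newtestdata: one row per measurement, wide format
newtestdata <- data.frame(
  sample_name = c("aaa", "aaa", "bbb", "bbb"),
  "1005" = c(0.426, 0.405, 0.409, 0.395),
  "1009" = c(0.430, 0.408, 0.411, 0.399),
  check.names = FALSE  # keep the numeric-looking column names unchanged
)

# Give every row (measurement) its own id before melting
newtestdata$sample_id <- seq_len(nrow(newtestdata))

# Each id occurs exactly once, while the sample names repeat
stopifnot(!anyDuplicated(newtestdata$sample_id))
```

The id only needs to be unique per measurement; its actual values do not matter for the plot.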

Data Structure

Result

0

One way is to add a sample id to your data frame before you reshape it. That lets you keep names like "aaa" and "bbb" for the color, while a unique identifier per measurement acts as the grouping variable (otherwise ggplot2 cannot differentiate between two observations of the same sample at the same x value). Here is an example where I tried to mimic your input data:

library(ggplot2)
library(reshape2)

# data.frame() keeps the intensity columns numeric; cbind() would coerce them to character
ex <- data.frame(c("aaa","aaa","bbb","bbb"), c(0.426,0.405,0.409,0.395), c(0.430,0.408,0.411,0.399), c(0.432,0.411,0.413,0.401))

colnames(ex) <- c("sample_name", "4000", "4004", "4008")

# One unique id per measurement (row), used as the grouping variable
ex$sample_id <- 1:nrow(ex)

melted <- melt(ex, id.vars = c("sample_name", "sample_id"), variable.name = "wavenumber", value.name = "intensity")

ggplot(melted, aes(x = wavenumber, y = intensity, group = sample_id, color = sample_name)) +
  geom_line() +
  theme_classic()

This outputs separate lines for different measurements of samples grouped by sample id, but keeping the color according to the sample name:

enter image description here

Is that sort of what you're after?
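To see why the grouping variable matters, here is a small base-R sketch that counts how many points each group has at every x value. The toy data and the names long, by_name and by_id are made up for illustration:

```r
# Long-format toy data: two measurements each of "aaa" and "bbb" at three wavenumbers
long <- data.frame(
  sample_name = rep(c("aaa", "aaa", "bbb", "bbb"), times = 3),
  sample_id   = rep(1:4, times = 3),
  wavenumber  = rep(c(4000, 4004, 4008), each = 4)
)

# Grouping by name: every group has 2 points at each wavenumber,
# so one line has to pass through both measurements
by_name <- table(long$sample_name, long$wavenumber)

# Grouping by id: every group has exactly 1 point at each wavenumber,
# so each measurement can be drawn as its own line
by_id <- table(long$sample_id, long$wavenumber)

print(by_name)
print(by_id)
```

Whenever a group has more than one point at the same x value, geom_line() joins them, which would explain the zig-zag pattern in the zoomed plot.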


Edits below

To show the same approach with a larger dataset:

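set.seed(1)  # added only to make the random example reproducible

# Ten sample names ("aaa" to "jjj"), each measured twice
alpha <-rep(sapply(letters[1:10], function(x) {paste(x,x,x, sep = "")}), each = 2)
adf <- data.frame(alpha)
adf$sample_id <- seq(1, (length(alpha)))
adf$t <- rnorm(20, 0.4, 0.1)  # intensity at the first wavenumber

# Columns 4 to 1503: each intensity is the previous one plus a small random step
wavenum <- seq(4, 1503)
for(i in wavenum){
  for(j in 1:length(alpha)){
    adf[j,i] <- adf[j,i-1] + (rnorm(1, 0.01, 0.01))
  }
}
adf[1:10, 1:10]

# 1501 wavenumber columns, labelled 1400, 1404, ..., 7400
anames <- c("sample_name", "sample_id", (1400 + 4 * seq(0, 1500)))
names(adf)<-anames

melted <- melt(adf, id.vars = c("sample_name", "sample_id"), variable.name = "wavenumber", value.name = "intensity")

head(melted)

# Plot only the first 1500 rows (75 wavenumbers x 20 measurements) to keep the example quick
ggplot(melted[1:1500,], aes(x = wavenumber, y = intensity, group = sample_id, color = sample_name)) +
  geom_line(lwd = 1.5) +
  theme_classic()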

This will give a plot similar to the one above: each sample has one line per measurement, and the lines belonging to the same sample share a color.

enter image description here
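One more note on the x axis: after melt(), wavenumber is a factor, so it is plotted as a discrete axis. If you want a continuous axis (for example to use scale_x_reverse() as in your question), it has to be converted to numeric first. A small sketch of that conversion, using made-up values:

```r
# After melt(), the former column names end up in a factor like this one
wavenumber <- factor(c("1400", "1404", "1408", "1400", "1404", "1408"))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(wavenumber)

# Going through the levels (as in the question) recovers the real values
values <- as.numeric(levels(wavenumber))[wavenumber]

# An equivalent alternative that some find easier to read
values2 <- as.numeric(as.character(wavenumber))

print(codes)
print(values)
```

So the conversion line from your question should be fine as it is; it is the grouping that causes the connected lines.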

If I'm still missing what you're actually after, I apologize!

4 Comments

Thank you for your reply! Since I am new to R and ggplot2, I am not sure - but I think your approach might be impractical because those datasets are huge (1500 wavenumbers for example). Maybe the problem is in the line melt_data$wavenumber = as.numeric(levels(melt_data$wavenumber))[melt_data$wavenumber] ?? Any idea?
I don't think there is an actual problem with your code, unless you're getting an error- I think ggplot is just constrained by not knowing how to separate two samples with the same name. I'll edit my answer to show my approach working with a larger dataset, since as far as I know the size shouldn't matter in this case (beyond processing time).
Thanks again, I am going to try your approach with my data as soon as possible and I will let you know! Thank you so far!
Luke C, you are awesome!!! I tried it today - worked perfectly! Thank you very much!!! I will add an answer with my data set and result, it might help someone with less r-experience (like me)
