I don't like using Excel to produce my plots, so I want to use R instead, with ggplot2 if possible. This is my dataset; it is probably not in the right format for what I want to do:

    C>A         C>G         C>T         T>A         T>C         T>G
ACA 0.049915398 0.008460237 0.018612521 0.015228426 0.036379019 0.005922166
ACC 0.015228426 0.003384095 0.010152284 0.005922166 0.005076142 0
ACG 0.014382403 0.005076142 0.010998308 0.007614213 0.013536379 0.001692047
ACT 0.031302876 0.007614213 0.01607445  0.010998308 0.013536379 0.002538071
CCA 0.021150592 0.005076142 0.011844332 0.007614213 0.011844332 0.001692047
CCC 0.027072758 0.002538071 0.009306261 0.005076142 0.004230118 0
CCG 0.014382403 0.001692047 0.009306261 0.005076142 0.008460237 0.000846024
CCT 0.0321489   0.00676819  0.016920474 0.00676819  0.008460237 0.000846024
GCA 0.011844332 0.003384095 0.015228426 0.003384095 0.013536379 0.002538071
GCC 0.008460237 0.004230118 0.010152284 0.007614213 0.011844332 0.003384095
GCG 0.002538071 0.004230118 0.010998308 0.009306261 0.010998308 0.003384095
GCT 0.012690355 0.005076142 0.010998308 0.003384095 0.005076142 0.000846024
TCA 0.030456853 0.011844332 0.013536379 0.011844332 0.017766497 0.001692047
TCC 0.026226734 0.00676819  0.017766497 0.002538071 0.004230118 0.002538071
TCG 0.011844332 0.000846024 0.009306261 0.003384095 0.011844332 0.000846024
TCT 0.03891709  0.016920474 0.020304569 0.008460237 0.019458545 0.00676819

From this dataset I would like to produce a plot in which the values on the X axis (the row names) are repeated for each column.

Can you help me? Everything I produce is far from what I want.

2 Answers

You are probably looking for something like this. I use the reshape2 (or tidyr) library to reshape the data to long format first:

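library(ggplot2)
library(reshape2)

# Reshape wide -> long: one row per (Gene, Class) pair
df1 = melt(df, id.vars='Gene', variable.name='Class', value.name='Value')
#df1 = gather(df, Class, Value, -Gene) using tidyr

# Running index so that the Gene labels repeat once per Class along the x axis
df1 = transform(df1, x=1:nrow(df1))

ggplot(df1, aes(x=x, y=Value, fill=Class)) +
    geom_bar(stat="identity") +
    # x is numeric, so label it through a continuous scale
    scale_x_continuous(breaks=df1$x, labels=df1$Gene) +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))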

Data:

df = structure(list(Gene = c("ACA", "ACC", "ACG", "ACT", "CCA", "CCC", 
"CCG", "CCT", "GCA", "GCC", "GCG", "GCT", "TCA", "TCC", "TCG", 
"TCT"), C.A = c(0.049915398, 0.015228426, 0.014382403, 0.031302876, 
0.021150592, 0.027072758, 0.014382403, 0.0321489, 0.011844332, 
0.008460237, 0.002538071, 0.012690355, 0.030456853, 0.026226734, 
0.011844332, 0.03891709), C.G = c(0.008460237, 0.003384095, 0.005076142, 
0.007614213, 0.005076142, 0.002538071, 0.001692047, 0.00676819, 
0.003384095, 0.004230118, 0.004230118, 0.005076142, 0.011844332, 
0.00676819, 0.000846024, 0.016920474), C.T = c(0.018612521, 0.010152284, 
0.010998308, 0.01607445, 0.011844332, 0.009306261, 0.009306261, 
0.016920474, 0.015228426, 0.010152284, 0.010998308, 0.010998308, 
0.013536379, 0.017766497, 0.009306261, 0.020304569), T.A = c(0.015228426, 
0.005922166, 0.007614213, 0.010998308, 0.007614213, 0.005076142, 
0.005076142, 0.00676819, 0.003384095, 0.007614213, 0.009306261, 
0.003384095, 0.011844332, 0.002538071, 0.003384095, 0.008460237
), T.C = c(0.036379019, 0.005076142, 0.013536379, 0.013536379, 
0.011844332, 0.004230118, 0.008460237, 0.008460237, 0.013536379, 
0.011844332, 0.010998308, 0.005076142, 0.017766497, 0.004230118, 
0.011844332, 0.019458545), T.G = c(0.005922166, 0, 0.001692047, 
0.002538071, 0.001692047, 0, 0.000846024, 0.000846024, 0.002538071, 
0.003384095, 0.003384095, 0.000846024, 0.001692047, 0.002538071, 
0.000846024, 0.00676819)), .Names = c("Gene", "C.A", "C.G", "C.T", 
"T.A", "T.C", "T.G"), class = "data.frame", row.names = c(NA, 
-16L))
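
An alternative way to get the row labels repeated once per column is to facet by class instead of building a running index. This is only a sketch, not part of the original answer: it assumes ggplot2 and reshape2 are installed, and it uses a small hypothetical data frame `mut` (four rows and three columns of the data above) so that it runs on its own. In practice you would substitute the full `df`.

```r
library(ggplot2)
library(reshape2)

# Hypothetical four-row subset of the data above, for illustration only
mut <- data.frame(
  Gene = c("ACA", "ACC", "ACG", "ACT"),
  C.A  = c(0.049915398, 0.015228426, 0.014382403, 0.031302876),
  C.G  = c(0.008460237, 0.003384095, 0.005076142, 0.007614213),
  C.T  = c(0.018612521, 0.010152284, 0.010998308, 0.01607445),
  stringsAsFactors = FALSE
)

# Wide -> long: one row per (Gene, Class) pair
mut_long <- melt(mut, id.vars = "Gene", variable.name = "Class", value.name = "Value")

# One panel per Class, so every panel repeats the same Gene labels on its x axis
p <- ggplot(mut_long, aes(x = Gene, y = Value, fill = Class)) +
  geom_bar(stat = "identity") +
  facet_grid(. ~ Class) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
print(p)
```

Faceting keeps `Gene` as a discrete variable, so no manual axis labels are needed; the trade-off is that the classes appear as separate panels rather than as one continuous axis.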

4 Comments

This is exactly what I'm looking for, but I can't use tidyr. Is there a way to do the same thing using reshape2?
Yes, I edited my code so that you can use reshape2. Just keep in mind: to go from wide to long data, use melt; to go from long to wide, use dcast.
Did you create the data frame like that, or is it the result of the melt? I ask because I don't get the same plot as you do. I got a stacked chart instead.
Did you have trouble reading something? ;) I use df as defined under Data, since you did not give a dput of your data. I refactored the code to make this clearer.
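
The wide-to-long and long-to-wide rule from the comments above can be sketched as a round trip on a few rows of the data. This is a minimal illustration, assuming reshape2 is installed; the data frame `wide` and the names `long` and `wide_again` are made up for this example.

```r
library(reshape2)

# Small wide table: one row per trinucleotide, one column per substitution class
# (a three-row subset of the data above, used only for illustration)
wide <- data.frame(
  Gene = c("ACA", "ACC", "ACG"),
  C.A  = c(0.049915398, 0.015228426, 0.014382403),
  C.G  = c(0.008460237, 0.003384095, 0.005076142),
  stringsAsFactors = FALSE
)

# Wide -> long: melt() stacks the value columns into a single Value column
long <- melt(wide, id.vars = "Gene", variable.name = "Class", value.name = "Value")

# Long -> wide: dcast() spreads the Class levels back out into columns
wide_again <- dcast(long, Gene ~ Class, value.var = "Value")
```

The long form (`long`) is what ggplot2 expects when a column such as `Class` is mapped to `fill`.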

You can try something like this, where df is your data frame:

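library(ggplot2)
library(tidyr)
library(magrittr)

# Keep the row names as a column so they can go on the x axis
df$rowvalues <- rownames(df)
# Gather every column except rowvalues into long format, then plot
gather(df, Variables, Values, -rowvalues) %>%
    ggplot(aes(x = rowvalues, y = Values, fill = Variables)) +
    geom_bar(stat = "identity")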

Version 2, without magrittr's pipe operator and using reshape2 instead of tidyr:

library(ggplot2)
library(reshape2)

df$rowvalues <- rownames(df)
df.long <- melt(df, id.vars = "rowvalues",
                variable.name = "variable_name",
                value.name = "value_name")
ggplot(df.long, aes(x = rowvalues, y = value_name, fill = variable_name)) +
    geom_bar(stat = "identity")

6 Comments

They are rowids, repeated each time for each column.
Then the above answer should get you what you want. Please update your original post to make this clear. Have you tried the solution yet?
I am trying, but the tidyr package is not available for my version of R (3.0.2).
I changed my solution to use reshape2 instead; try that.
Does it work? If so, please accept the solution by clicking the tick.