Plotting Graph in R

Question

I have two data tables, each with three columns: a pair of diseases (d1, d2) and a measure for that pair. Below is a sample of the first one, disease_table1:

d1        d2        Value
Disease1  Disease2  3.5
Disease3  Disease4  5
Disease5  Disease6  1.1
Disease1  Disease3  2.4
Disease6  Disease2  6.7

The real dataset 1 (disease_table1) looks like this:

 Bladder cancer                         X-linked ichthyosis (XLI)        3.5
 Leukocyte adhesion deficiency (LAD)    Aldosterone synthase Deficiency  1.8
 Leukocyte adhesion deficiency (LAD)    Brain Cancer                     1.5
 Tangier disease                        Pancreatic cancer                0.66

I want to show the difference between these two data tables by plotting the disease pairs and their values for both tables. I used the plot and lines functions, but the result is too simple and does not differentiate much. I would also like the plot to show the names of the disease pairs.

   plot(density(disease_table1$value))
   lines(density(disease_table1$value))

Thanks

With 400,000+ disease pairs you probably need a clustering approach. Can you post a link to your data, or a more representative subset, say a few thousand records?
– jlhoward, Commented Jan 28, 2014 at 21:09

Jaap · Accepted Answer · 2014-01-28 19:41:31Z

Some sample code:

# creating dataframes (I made up the second one)
df1 <- read.table(text = "d1   d2 x
Disease1 Disease2  3.5
Disease3 Disease4  5
Disease5 Disease6  1.1
Disease1 Disease3  2.4
Disease6 Disease2  6.7", header = TRUE, strip.white = TRUE)

df2 <- read.table(text = "d1   d2 y
Disease1 Disease2  4.5
Disease3 Disease4  2
Disease5 Disease6  3.1
Disease1 Disease3  1.4
Disease6 Disease2  5.7", header = TRUE, strip.white = TRUE)

# needed libraries
library(reshape2)
library(ggplot2)

# merging dataframes & creating unique identifier variable
data <- merge(df1, df2, by = c("d1","d2"))
data$diseasepair <- paste0(data$d1,"-",data$d2)

# reshape to long format: one row per disease pair and group (x or y)
data.long <- melt(data, id.vars = "diseasepair", measure.vars = c("x","y"),
                  variable.name = "group")

# make the plot
ggplot(data.long) +
  geom_bar(aes(x = diseasepair, y = value, fill = group), 
           stat="identity", position = "dodge", width = 0.7) +
  scale_fill_manual("Group\n", values = c("red","blue"), 
                    labels = c(" X", " Y")) +
  labs(x="\nDisease pair",y="Value\n") +
  theme_bw()

The result is a grouped bar chart with one red (X) and one blue (Y) bar per disease pair.

Is this what you're looking for?
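
The real disease names in the question are much longer than Disease1, Disease2, so the labels on the x axis are likely to overlap. A minimal sketch of one way around that, flipping the axes with coord_flip() so the pair names read horizontally. The first table uses the four real rows from the question; the values in the second table are made up for illustration only, and the names merged and long are my own.

```r
library(ggplot2)

# First table: the four real rows shown in the question
df1 <- data.frame(
  d1 = c("Bladder cancer", "Leukocyte adhesion deficiency (LAD)",
         "Leukocyte adhesion deficiency (LAD)", "Tangier disease"),
  d2 = c("X-linked ichthyosis (XLI)", "Aldosterone synthase Deficiency",
         "Brain Cancer", "Pancreatic cancer"),
  x = c(3.5, 1.8, 1.5, 0.66),
  stringsAsFactors = FALSE
)

# Second table: same pairs, values invented for this example
df2 <- data.frame(d1 = df1$d1, d2 = df1$d2,
                  y = c(2.9, 2.4, 0.7, 1.2),
                  stringsAsFactors = FALSE)

merged <- merge(df1, df2, by = c("d1", "d2"))
merged$diseasepair <- paste(merged$d1, merged$d2, sep = " - ")

# Long format built by hand: stack the x values on top of the y values
long <- data.frame(
  diseasepair = rep(merged$diseasepair, times = 2),
  group = rep(c("x", "y"), each = nrow(merged)),
  value = c(merged$x, merged$y)
)

p <- ggplot(long, aes(x = diseasepair, y = value, fill = group)) +
  geom_col(position = "dodge", width = 0.7) +
  coord_flip() +   # pair names end up on the vertical axis
  labs(x = "Disease pair", y = "Value") +
  theme_bw()
print(p)
```

Note that merge() keeps only the pairs that occur in both tables, so a pair missing from one of them will not show up in the plot.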


4 Comments

Rgeek Over a year ago

I have 400k pairs of this kind, so I don't think this would work. It would have worked great for a smaller dataset though. I believe a curve or heat map could work?

Jaap Over a year ago

For 400k pairs a heat map won't work either IMHO. Do you want to compare the values for each pair? Or just for specific pairs?

Rgeek Over a year ago

Basically I want to show enrichment of disease pairs using the values in one dataset vs the other. So, I want to compare the values for each pair.
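
If every pair has to be compared, one possible sketch (not something proposed in this thread) is a binned scatter plot of one table's value against the other's: each tile counts the pairs that fall into it, so the plot stays readable at 400k pairs, and pairs far from the diagonal are the ones that differ most. The data below is simulated as a stand-in for the two merged tables; the column delta and the cut-off of 10 rows are assumptions for illustration.

```r
library(ggplot2)

set.seed(1)  # simulated values, stand-ins for the two real tables
n <- 5000
merged <- data.frame(
  diseasepair = paste0("pair", seq_len(n)),
  x = rexp(n, rate = 0.5)
)
merged$y <- merged$x * exp(rnorm(n, sd = 0.3))

# Per-pair difference; positive means the value is higher in the second table
merged$delta <- merged$y - merged$x

# Binned scatter: the fill shows how many pairs land in each tile
p <- ggplot(merged, aes(x = x, y = y)) +
  geom_bin2d(bins = 60) +
  geom_abline(slope = 1, intercept = 0, colour = "red") +  # y = x reference line
  labs(x = "Value in table 1", y = "Value in table 2", fill = "Pairs") +
  theme_bw()
print(p)

# The pairs that differ most can then be listed by name
top <- head(merged[order(-abs(merged$delta)), ], 10)
print(top)
```

The named pairs in top are small enough in number to label or to plot as bars, which the full set is not.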

Jaap Over a year ago

It's possibly a better solution to make subsets of your dataset for groups or for specific combinations. All those 400k pairs in one plot won't produce a plot of any value (at least that's what I think). First decide what you're looking for, then create subsets & create some plots.
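
A minimal sketch of the subsetting idea, using the sample data from the answer. The choice of Disease1 as the disease of interest and the names target and sub are hypothetical. Each selected pair is drawn as two points joined by a line, so the gap between X and Y is visible next to the pair name.

```r
library(ggplot2)

# Sample data from the answer (the second table is made up there too)
df1 <- read.table(text = "d1 d2 x
Disease1 Disease2 3.5
Disease3 Disease4 5
Disease5 Disease6 1.1
Disease1 Disease3 2.4
Disease6 Disease2 6.7", header = TRUE)

df2 <- read.table(text = "d1 d2 y
Disease1 Disease2 4.5
Disease3 Disease4 2
Disease5 Disease6 3.1
Disease1 Disease3 1.4
Disease6 Disease2 5.7", header = TRUE)

merged <- merge(df1, df2, by = c("d1", "d2"))
merged$diseasepair <- paste0(merged$d1, "-", merged$d2)

# Keep only the pairs that involve one disease of interest (hypothetical choice)
target <- "Disease1"
sub <- merged[merged$d1 == target | merged$d2 == target, ]

# One row per pair: a segment from the X value to the Y value, with a point at each end
p <- ggplot(sub, aes(y = diseasepair)) +
  geom_segment(aes(x = x, xend = y, yend = diseasepair), colour = "grey50") +
  geom_point(aes(x = x, colour = "X"), size = 3) +
  geom_point(aes(x = y, colour = "Y"), size = 3) +
  scale_colour_manual("Group", values = c(X = "red", Y = "blue")) +
  labs(x = "Value", y = "Disease pair") +
  theme_bw()
print(p)
```

The same filter works for any other selection rule, for example a list of specific pairs or a threshold on the difference between the two values.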
