
I am trying to plot four lines in a single ggplot2 graph. For no apparent reason, the lines seem to get mixed up when I plot several of them together.

When I plot a single line (line 4), it looks like this:

[image: line 4 plotted on its own]

But the same line when plotted along with a group of lines looks totally different:

[image: all four lines plotted together]

We can see that linetype 3 now represents line 4; that is, the line labelled 4 has a completely different shape than before.

As this issue seems really confusing to me, I am including the relevant parts of the code:

# a single line (number 4) plot
p <- ggplot(data = mean_df, aes(x = foo)) +
  geom_line(aes(y = .data[["4"]]), size=1.2) +
  ylab("y") +
  xlab("x")

print(p)

# multiple lines plot
p <- ggplot(data = mean_df, aes(x = foo)) +
  geom_line(aes(y = .data[["1"]], linetype="dotted"), size=1.1) +
  geom_line(aes(y = .data[["2"]], linetype="twodash"), size=1.1) +
  geom_line(aes(y = .data[["3"]], linetype="longdash"), size=1.1) +
  geom_line(aes(y = .data[["4"]], linetype="solid"), size=1.1) +
  labs(
    title="",
    linetype="Z"
  ) +
  xlab("x") +
  ylab("y") +
  scale_linetype_manual(name="Z", values=c("dotted", "twodash", "longdash", "solid"), labels=c("1", "2", "3", "4")) +
  guides(linetype = guide_legend(override.aes = list(size = 2)))

print(p)

The data frame (mean_df) has 5 columns, and all cells are numeric.

x <- list(0.00000000, 0.06666667, 0.13333333, 0.20000000, 0.26666667, 0.33333333, 0.40000000, 0.46666667, 0.53333333, 0.60000000, 0.66666667, 0.73333333, 0.80000000, 0.86666667, 0.93333333, 1.00000000)
col1 <- list(0.07158121, 0.09441034, 0.11920243, 0.14119030, 0.17993894, 0.20329103, 0.27479900, 0.44655523, 0.58079973, 0.62923797, 0.65742297, 0.68274665, 0.73551633, 0.91081992, 0.88468318, 0.91770913)
col2 <- list(0.01226280, 0.09927955, 0.07809336, 0.09356798, 0.13873392, 0.21159535, 0.34069621, 0.47930396, 0.59753322, 0.64535698, 0.54105539, 0.53885464, 0.74917172, 0.91578496, 0.92687179, 0.93211675)
col3 <- list(0.05849679, 0.12701451, 0.10779754, 0.12351629, 0.14365027, 0.15020727, 0.33345780, 0.48881116, 0.66081110, 0.70052420, 0.70143050, 0.65706529, 0.81447223, 0.91351115, 0.95472268, 0.94854747)
col4 <- list(0.04979115, 0.08789403, 0.06288537, 0.13375946, 0.18554486, 0.19794996, 0.25361769, 0.30654542, 0.51325469, 0.50892014, 0.52454547, 0.55476019, 0.62278916, 0.84428246, 0.88896150, 0.84863063)

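# unlist() flattens each list into a numeric vector; backticks and
# check.names = FALSE keep the numeric column names as they are
mean_df <- data.frame(foo = unlist(x), `1` = unlist(col1), `2` = unlist(col2), `3` = unlist(col3), `4` = unlist(col4), check.names = FALSE)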

print(mean_df)

    foo   `1`    `2`    `3`    `4`
 1  0      0.0716 0.0123 0.0585 0.0498
 2  0.0667 0.0944 0.0993 0.127  0.0879
 3  0.133  0.119  0.0781 0.108  0.0629
 4  0.2    0.141  0.0936 0.124  0.134 
 5  0.267  0.180  0.139  0.144  0.186 
 6  0.333  0.203  0.212  0.150  0.198 
 7  0.4    0.275  0.341  0.333  0.254 
 8  0.467  0.447  0.479  0.489  0.307 
 9  0.533  0.581  0.598  0.661  0.513 
10  0.6    0.629  0.645  0.701  0.509 
11  0.667  0.657  0.541  0.701  0.525 
12  0.733  0.683  0.539  0.657  0.555 
13  0.8    0.736  0.749  0.814  0.623 
14  0.867  0.911  0.916  0.914  0.844 
15  0.933  0.885  0.927  0.955  0.889 
16  1      0.918  0.932  0.949  0.849 

This seemed like a trivial thing to ask but after spending four hours on this issue without any progress I had to post this question. Sorry if there is some self-evident solution for this...

  • If you could provide some reproducible input data, it would help a lot. In general, in ggplot, referring to columns by name in aes(x, y) is wiser than indexing. Commented May 4, 2020 at 9:21
  • I thought that .data[["1"]] referred to the column named "1" and not the index in this case? Anyhow, I did change the column names to more descriptive but with no difference... Commented May 4, 2020 at 9:46
  • My mistake, I did not notice the double quotes but in any case, yes, please use descriptive column names. How about the reproducible input data? Commented May 4, 2020 at 10:04
  • I added some input data, the mean_df <- ... should have the exact data what is used to plot the graphs with. I did get some results when I used linetype="1", linetype="2", linetype="3" and linetype="4" instead of the named linetypes in the geom_line declarations but I am kinda confused why this works. The real column names are not numeric. Commented May 4, 2020 at 10:11
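
One likely explanation for why the numeric strings behave, sketched in base R. It assumes that ggplot2 treats a string constant inside aes() as a level of a discrete variable and orders those levels alphabetically, so the unnamed values and labels given to scale_linetype_manual() are assigned to the sorted levels rather than to the layers in the order they were added. The variable names below are illustrative only.

```r
# Constants used inside aes() in the question, keyed by the column they were paired with.
aes_constants <- c("1" = "dotted", "2" = "twodash", "3" = "longdash", "4" = "solid")

# Assumption: a discrete scale orders character levels alphabetically.
levels_sorted <- sort(unname(aes_constants))

# Unnamed values and labels from scale_linetype_manual(), applied position by position.
manual_values <- c("dotted", "twodash", "longdash", "solid")
manual_labels <- c("1", "2", "3", "4")

mapping <- data.frame(
  column       = names(aes_constants)[match(levels_sorted, aes_constants)],
  level        = levels_sorted,
  drawn_as     = manual_values,
  legend_label = manual_labels,
  stringsAsFactors = FALSE
)
print(mapping)
```

Under this reading, column 4 (level "solid") is drawn as longdash and labelled "3", which matches the symptom in the question. With "1" to "4" as the constants, the sorted order happens to coincide with the intended one.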

1 Answer


Did you intend for your columns to initially be created as lists? I assumed otherwise and used plain vectors as the column values of a data frame instead. Whether I plot col4 on its own or together with the others, I see no difference.

Also, ideally you would convert your data to the "long" format with pivot_longer() before plotting all the lines, rather than calling geom_line() four times.

library(tidyverse)
library(ggplot2)

mean_df <- tibble(foo = c(0.00000000, 0.06666667, 0.13333333, 0.20000000, 0.26666667, 0.33333333, 0.40000000, 0.46666667, 0.53333333, 0.60000000, 0.66666667, 0.73333333, 0.80000000, 0.86666667, 0.93333333, 1.00000000),
                  col1 = c(0.07158121, 0.09441034, 0.11920243, 0.14119030, 0.17993894, 0.20329103, 0.27479900, 0.44655523, 0.58079973, 0.62923797, 0.65742297, 0.68274665, 0.73551633, 0.91081992, 0.88468318, 0.91770913),
                  col2 = c(0.01226280, 0.09927955, 0.07809336, 0.09356798, 0.13873392, 0.21159535, 0.34069621, 0.47930396, 0.59753322, 0.64535698, 0.54105539, 0.53885464, 0.74917172, 0.91578496, 0.92687179, 0.93211675),
                  col3 = c(0.05849679, 0.12701451, 0.10779754, 0.12351629, 0.14365027, 0.15020727, 0.33345780, 0.48881116, 0.66081110, 0.70052420, 0.70143050, 0.65706529, 0.81447223, 0.91351115, 0.95472268, 0.94854747),
                  col4 = c(0.04979115, 0.08789403, 0.06288537, 0.13375946, 0.18554486, 0.19794996, 0.25361769, 0.30654542, 0.51325469, 0.50892014, 0.52454547, 0.55476019, 0.62278916, 0.84428246, 0.88896150, 0.84863063))
mean_df

# plotting column 4
ggplot(data = mean_df, aes(x = foo)) +
  geom_line(aes(y = col4), size=1.2) +
  ylab("y") +
  xlab("x")

[image: col4 plotted on its own]

# plotting all of them

mean_df %>%
  pivot_longer(cols = starts_with("col"),
               names_to = "Column Name",
               values_to = "Column Values") %>%
  ggplot(aes(x = foo)) +
  geom_line(aes(y = `Column Values`, 
                col = `Column Name`), size=1.2) +
  scale_color_manual(values = c("Grey","Grey","Grey","Black"))

[image: all four columns plotted together, col4 in black and the others in grey]
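
If linetypes are wanted instead of colours, as in the question, the same long-format approach can map linetype to the series name. This is a minimal sketch, not a definitive implementation: it uses only the first four rows of the data (rounded as printed in the question) to stay short, the column names series and value are my own choice, and linewidth assumes ggplot2 3.4.0 or later (older versions use size).

```r
library(ggplot2)
library(tidyr)

# First four rows of the question's data, rounded as printed, for brevity.
mean_df <- data.frame(
  foo  = c(0, 0.0667, 0.133, 0.2),
  col1 = c(0.0716, 0.0944, 0.119, 0.141),
  col2 = c(0.0123, 0.0993, 0.0781, 0.0936),
  col3 = c(0.0585, 0.127, 0.108, 0.124),
  col4 = c(0.0498, 0.0879, 0.0629, 0.134)
)

# Reshape to one row per (foo, series) pair.
long_df <- pivot_longer(mean_df, cols = -foo,
                        names_to = "series", values_to = "value")

# Naming the values ties each linetype to a series explicitly,
# so the result does not depend on how the levels are ordered.
p <- ggplot(long_df, aes(x = foo, y = value, linetype = series)) +
  geom_line(linewidth = 1.1) +
  scale_linetype_manual(
    name = "Z",
    values = c(col1 = "dotted", col2 = "twodash", col3 = "longdash", col4 = "solid")
  )

print(p)
```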

1 Comment

I got the same result when I declared the geom_line linetypes with values "1", "2", "3", and "4". The reason I did not convert it to long format was that I had done some analysis earlier and left it in wide format. I guess I should have reconsidered this decision. Your implementation verified my code and a long solution is superior to mine so thank you for that.
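
For anyone who needs to stay in wide format, here is a hedged sketch of the approach described in this comment, with one change: the values passed to scale_linetype_manual() are named, so each label is tied to its linetype explicitly instead of relying on the order of the levels. The data are the first four rows from the question (rounded as printed), line_types is a name I made up, and linewidth assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)

# First four rows of the question's data, with the original numeric column names.
mean_df <- data.frame(
  foo = c(0, 0.0667, 0.133, 0.2),
  `1` = c(0.0716, 0.0944, 0.119, 0.141),
  `2` = c(0.0123, 0.0993, 0.0781, 0.0936),
  `3` = c(0.0585, 0.127, 0.108, 0.124),
  `4` = c(0.0498, 0.0879, 0.0629, 0.134),
  check.names = FALSE
)

# Hypothetical lookup: legend label -> linetype.
line_types <- c("1" = "dotted", "2" = "twodash", "3" = "longdash", "4" = "solid")

# The string inside aes() is only a label; the scale decides how it is drawn.
p <- ggplot(mean_df, aes(x = foo)) +
  geom_line(aes(y = .data[["1"]], linetype = "1"), linewidth = 1.1) +
  geom_line(aes(y = .data[["2"]], linetype = "2"), linewidth = 1.1) +
  geom_line(aes(y = .data[["3"]], linetype = "3"), linewidth = 1.1) +
  geom_line(aes(y = .data[["4"]], linetype = "4"), linewidth = 1.1) +
  scale_linetype_manual(name = "Z", values = line_types) +
  labs(x = "x", y = "y")

print(p)
```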
