I am trying to create a linear correlation graph between two variables using ggplot2:

dput(sum)
structure(list(Date = structure(c(15218, 15248, 15279, 15309, 
15340, 15371, 15400, 15431, 15461, 15492, 15522, 15553), class = "Date"), 
    Teams = c(87, 142, 173, 85, 76, 76, 93, 49, 169, 139, 60, 
    120), Scores = c("67101651", "62214988", "63183320", "66750198", 
    "61483322", "67546775", "75290893", "60713372", "77879142", 
    "70290302", "83201853", "83837301")), .Names = c("Date", 
"Teams", "Scores"), row.names = c(NA, 12L), class = "data.frame")

This is my command:

ggplot(sum, aes(x = Scores, y = Teams, group=1)) + 
    geom_smooth(method = "lm", se=TRUE, size=2, formula = lm(Teams ~ Scores))

I get this error:

Error in eval(expr, envir, enclos) : object 'Teams' not found

Any ideas?

  • For any future reader of this thread: naming a data frame after one of the most used base R functions is not a good idea. On this occasion, may I remind you that df is also a base R function (although less often used than sum). Commented Jan 24, 2021 at 11:54
  • Does this answer your question? Adding a regression line on a ggplot Commented Jan 24, 2021 at 12:02

2 Answers

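In geom_smooth(), the formula argument takes a formula written in terms of the aesthetics x and y, not a call to lm(). In your command, lm(Teams ~ Scores) is evaluated on its own, outside the data frame, which is why Teams is not found. If you want to specify the formula for, e.g., a linear model, use y ~ poly(x, 1). You don't need to set the formula parameter at all if you want a simple linear regression (y ~ x is the default for method = "lm"):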

ggplot(sum, aes(x = Scores, y = Teams, group = 1)) +
  geom_smooth(method = "lm", formula = y ~ poly(x, 1), se = TRUE, size = 2)
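To check that the explicit formula and the default describe the same line, here is a minimal sketch in base R. It assumes the question's data with Scores already converted to numbers; x and y are just illustrative stand-ins for the aesthetics that geom_smooth() passes to the formula.

```r
# Question's data; x and y stand in for the x and y aesthetics (illustrative names)
x <- c(67101651, 62214988, 63183320, 66750198, 61483322, 67546775,
       75290893, 60713372, 77879142, 70290302, 83201853, 83837301)
y <- c(87, 142, 173, 85, 76, 76, 93, 49, 169, 139, 60, 120)

fit_default <- lm(y ~ x)           # the default formula for method = "lm"
fit_poly    <- lm(y ~ poly(x, 1))  # degree-1 orthogonal polynomial

# The coefficients are parameterised differently, but the fitted line is the same
max_diff <- max(abs(fitted(fit_default) - fitted(fit_poly)))
print(max_diff)
```

The difference should be zero up to floating-point noise, so poly(x, 1) only becomes useful once you raise the degree.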

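I would also recommend converting Scores to numeric values (as.numeric(Scores)) if you don't want this variable to be treated as categorical. Scores is stored as character strings in your data, so ggplot2 treats it as discrete; converting it changes the regression line.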

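Scores as a categorical variable:

[Figure: regression line with Scores treated as categorical]

Scores as a numeric variable:

[Figure: regression line with Scores treated as numeric]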
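A rough base R sketch of why the two plots differ. The names scores_chr, scores_num and pos_discrete are made up for illustration, and using rank() to mimic the evenly spaced positions of a discrete axis is an assumption about how the categorical plot is laid out, not a reproduction of ggplot2's internals.

```r
# Scores exactly as stored in the question's data frame: character strings
scores_chr <- c("67101651", "62214988", "63183320", "66750198",
                "61483322", "67546775", "75290893", "60713372",
                "77879142", "70290302", "83201853", "83837301")
teams <- c(87, 142, 173, 85, 76, 76, 93, 49, 169, 139, 60, 120)

class(scores_chr)                  # "character"
scores_num <- as.numeric(scores_chr)
class(scores_num)                  # "numeric"

# Assumption: a discrete axis spaces the sorted values evenly at 1, 2, 3, ...
pos_discrete <- rank(scores_chr)

fit_discrete <- lm(teams ~ pos_discrete)  # slope per axis position
fit_numeric  <- lm(teams ~ scores_num)    # slope per unit of Scores

print(coef(fit_discrete))
print(coef(fit_numeric))
```

With evenly spaced positions the actual gaps between scores are ignored, so the two fits are not the same model.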

2 Comments

is there an easy way to print the r square value as a legend?
An easy way would be to print it in the title. Just add labs(title = bquote(R^2 ~ ":" ~ .(summary(lm(Teams ~ as.numeric(Scores), sum))$r.squared))) to the plot. In older ggplot2 versions this was written with opts(title = ...), which has since been removed.
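A minimal sketch of the title approach, assuming a current ggplot2 where the title is set with labs(). The name scores_df is hypothetical (chosen to avoid reusing sum), and rounding to three digits is just a presentation choice.

```r
# Question's data under a hypothetical name, with Scores as numbers
scores_df <- data.frame(
  Teams  = c(87, 142, 173, 85, 76, 76, 93, 49, 169, 139, 60, 120),
  Scores = c(67101651, 62214988, 63183320, 66750198, 61483322, 67546775,
             75290893, 60713372, 77879142, 70290302, 83201853, 83837301)
)

# Fit the same model geom_smooth(method = "lm") draws and extract R squared
fit <- lm(Teams ~ Scores, data = scores_df)
r2  <- summary(fit)$r.squared

# Plotmath expression for the title; .() inserts the computed value
title_expr <- bquote(R^2 ~ ":" ~ .(round(r2, 3)))

# Build the plot only if ggplot2 is installed; call print(p) to draw it
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(scores_df, aes(x = Scores, y = Teams)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
    labs(title = title_expr)
}
```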

Here's another option using stat_cor from the ggpubr package. This code will plot your points and display the correlation and p value. You can change "pearson" to "spearman" if you have non-normal data.

library(ggplot2)
library(ggpubr)  # provides stat_cor()

ggplot(sum, aes(x = Scores, y = Teams, group = 1)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE, size = 2) +
  stat_cor(method = "pearson", cor.coef.name = "r", vjust = 1, size = 4)
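If you want the numbers without the plot, a rough cross-check in base R is cor.test(). This sketch assumes Scores has been converted to numeric; the vector names are illustrative, and the label stat_cor() prints may be rounded or formatted differently.

```r
# Question's data as plain numeric vectors (illustrative names)
scores <- c(67101651, 62214988, 63183320, 66750198, 61483322, 67546775,
            75290893, 60713372, 77879142, 70290302, 83201853, 83837301)
teams  <- c(87, 142, 173, 85, 76, 76, 93, 49, 169, 139, 60, 120)

# Pearson correlation and its p-value
ct_pearson <- cor.test(scores, teams, method = "pearson")
print(ct_pearson$estimate)
print(ct_pearson$p.value)

# Rank-based alternative; exact = FALSE because Teams contains a tie (76 twice)
ct_spearman <- cor.test(scores, teams, method = "spearman", exact = FALSE)
print(ct_spearman$estimate)
```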
