I have 2 different lines of code:

ggplot(mpg,aes(displ,hwy,colour = factor(cyl))) + geom_point() + geom_smooth(method = 'lm')

ggplot(mpg,aes(displ,hwy)) + geom_point(aes(colour = factor(cyl))) + geom_smooth(method = 'lm')

The first produces 3 regression lines for 3 different groups (the factor variable). The second produces only one line for the whole dataset.

My question is: what is the logic behind this difference? I can see that the output depends on where colour = factor(cyl) is placed, but can you explain the logic of ggplot2 in this case?

Comment: I'd think of it this way: the first one uses a global aesthetic mapping, which by default is inherited by every geom. The second one uses a local mapping for the colour aesthetic. Commented Dec 17, 2016 at 17:41

1 Answer

Consider the following two pieces of code, which are equivalent. In the first, both geom_point and geom_smooth are grouped because the colour variable is supplied globally in ggplot(). In the second, the same mapping is supplied locally to both geom_point and geom_smooth:

ggplot(mpg,aes(displ,hwy,colour = factor(cyl))) + 
geom_point() + 
geom_smooth(method = 'lm')

ggplot(mpg,aes(displ,hwy)) + 
geom_point(aes(colour = factor(cyl))) + 
geom_smooth(aes(colour = factor(cyl)), method = 'lm')
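
One way to check this equivalence without looking at the plots is to compare the groups that geom_smooth actually fits in each case. The following is only a sketch: the object names (p_global, p_local, groups_global, groups_local) are made up for illustration, and it assumes ggplot2's layer_data() helper, where layer 2 is the geom_smooth layer.

```r
library(ggplot2)

# Global mapping: colour is set in ggplot(), so both layers inherit it.
p_global <- ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
  geom_point() +
  geom_smooth(method = "lm")

# Local mapping: colour is repeated inside each layer.
p_local <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = factor(cyl))) +
  geom_smooth(aes(colour = factor(cyl)), method = "lm")

# layer_data() returns the computed data for one layer; layer 2 is geom_smooth.
# Each distinct value in the group column corresponds to one fitted line.
groups_global <- sort(as.integer(unique(layer_data(p_global, 2)$group)))
groups_local <- sort(as.integer(unique(layer_data(p_local, 2)$group)))

identical(groups_global, groups_local)
```

If the two mappings really are equivalent, both plots should report the same set of groups for the smoothing layer.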


In the other example you provided, you colour the points by grouping them on the cyl variable, but you don't group geom_smooth. That is why it fits on the entire dataset instead of fitting on the 3 groups separately, as in the earlier case.

ggplot(mpg,aes(displ,hwy)) + 
geom_point(aes(colour = factor(cyl))) + 
geom_smooth(method = 'lm')
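
The same single line can also be obtained when colour is mapped globally, by telling geom_smooth not to inherit the global aesthetics. This is a minimal sketch rather than the only way to do it: p_single and smooth_groups are illustrative names, and it relies on the inherit.aes argument that ggplot2 layers accept.

```r
library(ggplot2)

# Colour is mapped globally, so geom_point() inherits it.
# inherit.aes = FALSE makes geom_smooth() ignore the global mapping,
# so x and y have to be supplied again locally.
p_single <- ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
  geom_point() +
  geom_smooth(aes(displ, hwy), method = "lm", inherit.aes = FALSE)

# Count the distinct groups that the smoothing layer was fitted on.
smooth_groups <- unique(layer_data(p_single, 2)$group)
length(smooth_groups)
```

Because the smoothing layer no longer sees colour = factor(cyl), it has nothing to split the data on and fits the whole dataset as one group.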

