
I have the following data:

 dati <- read.table(text="
        class     num
    1     0.0   63530
    2     2.5   27061
    3     3.5   29938
    4     4.5   33076
    5     5.6   45759
    6     6.5   72794
    7     8.0  153177
    8    10.8  362124
    9    13.5  551051
    10   15.5  198634
  ")

I want to produce a histogram with variable-width bins, so that the area of each bar reflects the total numerosity (num) of its bin. I tried:

library(ggplot2)

bins <- c(0, 4, 8, 11, 16)
p <- ggplot(dati) +
  geom_histogram(aes(x = class, weight = num), breaks = bins)

However, this produces a histogram in which the height of each bar equals the total numerosity of its bin. Because the bin widths vary, the areas are not proportional to numerosity. I could not solve this apparently easy problem within ggplot2. Can anyone help me?

2 Answers


I think you are looking for a density plot; this closely related question has most of the answer. Set y = ..density.. in the aesthetics of geom_histogram().

This works because geom_histogram() is geom_bar() + stat_bin(), and stat_bin() constructs a data frame with columns count and density. Setting y = ..density.. therefore pulls the density column, whereas the default is equivalent to y = ..count.. (the counts).

##OP's code
ggplot(dati) +  geom_histogram(aes(x=class, weight=num),
 breaks = bins)

Count Histogram

##new code (density plot)
ggplot(dati) +  geom_histogram(aes(x=class, y = ..density.., weight=num),
 breaks = bins, position = "identity")

Density Histogram

You can find some further examples in the online ggplot2 help page for geom_histogram().
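To see why the density histogram has areas proportional to numerosity, the density column can be reproduced by hand in base R. This is only a sketch: it assumes the bins are right-closed with the lowest break included, and that density is the weighted count divided by the total count and by the bin width. The names bin, count, width, density and res are illustrative.

```r
# Data from the question: class value and its numerosity
dati <- data.frame(
  class = c(0.0, 2.5, 3.5, 4.5, 5.6, 6.5, 8.0, 10.8, 13.5, 15.5),
  num   = c(63530, 27061, 29938, 33076, 45759, 72794, 153177,
            362124, 551051, 198634)
)
bins <- c(0, 4, 8, 11, 16)

# Assign each class to a bin (assumption: right-closed, lowest break included)
bin <- cut(dati$class, breaks = bins, include.lowest = TRUE)

count   <- as.vector(tapply(dati$num, bin, sum))  # weighted count per bin
width   <- diff(bins)                             # bin widths: 4, 4, 3, 5
density <- count / (sum(count) * width)           # bar height in the density plot

# area = density * width is each bin's share of the total numerosity
res <- data.frame(bin = levels(bin), count, width, density,
                  area = density * width)
print(res)
```

The area column sums to one, so each bar's area is its bin's share of the total, regardless of the bin width.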


3 Comments

Thanks a lot. This is more or less what I was looking for. However, because I need to compare different populations, I would like the total area to equal the total numerosity of the population rather than one. Is it possible to scale the histogram?
Possibly by setting y = ..density.. * sum(..count..), but I hesitate to give that answer because it sounds like there may be a better way to do what you're looking for. A new question illustrating your desired output might give you a better method.
Thanks for your help. I posted a new question, in which I explain in detail what I need to do.
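The scaling suggested in the comments might look like the sketch below. It is an illustration of that tentative suggestion, not a confirmed solution, and it assumes ggplot2 3.3.0 or later, where after_stat() replaces the ..density.. notation. Since density is the count divided by the total count and the bin width, multiplying by sum(count) leaves count / width, so each bar's area equals its bin's numerosity. The name ld is illustrative.

```r
library(ggplot2)

# Data and breaks from the question
dati <- data.frame(
  class = c(0.0, 2.5, 3.5, 4.5, 5.6, 6.5, 8.0, 10.8, 13.5, 15.5),
  num   = c(63530, 27061, 29938, 33076, 45759, 72794, 153177,
            362124, 551051, 198634)
)
bins <- c(0, 4, 8, 11, 16)

# Bar height = density * total count = count / bin width
p <- ggplot(dati) +
  geom_histogram(
    aes(x = class, weight = num, y = after_stat(density * sum(count))),
    breaks = bins
  )

# Inspect the computed bars: y times the bin width should give the count
ld <- layer_data(p)
print(ld[, c("xmin", "xmax", "count", "y")])
```

Printing p draws the plot. With several populations in one plot, sum(count) would be taken over everything the layer computes, so per-population scaling would need extra care.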

It sounds to me like you are asking how to produce bars of variable width. If so, you just need to set the width aesthetic in your ggplot call, like this:

ggplot(data, aes(x = x, y = y, width = num))

This method is discussed further in the following question: Variable width bars in ggplot2 barplot in R
