I created a bar plot/histogram of categorical data, looping over the column names to plot the number of records in each category. The problem is that the x-ticks are not centered under the bars, and a zero label appears even though there is no zero anywhere in my dataset. Could someone please explain how I can center the x-ticks, and where that zero comes from?

I'd appreciate your help. Thanks!

Here's my code:

import matplotlib.pyplot as plt

def distribution(data, colName, title, plot_name):

    fig = plt.figure(figsize = (50,10))

    for i, feature in enumerate(colName):
        ax = fig.add_subplot(1, 2, i+1)
        ax.hist(data[feature], bins = 25, color = 'maroon')
        ax.tick_params(axis='x', rotation=90, labelsize=20)
        ax.set_ylim((0, 2000))
        ax.set_yticks([0, 500, 1000, 1500, 2000])
        ax.set_yticklabels([0, 500, 1000, 1500, ">2000"], fontsize=20)
        ax.bar_label(ax.containers[0], size=20)

    ax.set_title(title, fontsize = 30)
    plt.savefig(plot_name, format='png')
    plt.show()
        
distribution(df, ['Subtype 1'], "Distribution of Subtype 1", "Distribution of Subtype 1.png" )


Comments
  • Your code sample does not use xticks. Commented Oct 18, 2023 at 14:15
  • Not an answer, but in my experience such layout details are very hard to get exactly right with pure matplotlib, whereas the higher-level library seaborn does not have such problems. Commented Oct 18, 2023 at 14:16

1 Answer

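I removed the for loop entirely, because aggregating with the pandas groupby function is much easier to work with. My mistake was using the histogram function instead of a bar plot. A histogram splits the x-axis into equal-width bins (25 in my case) rather than drawing one bar per category, so the bars do not line up with the tick labels and every empty bin gets labeled 0. In the new code I used barplot from the seaborn library, and the result represents the dataset correctly.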
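To make the stray zero easier to see, here is a minimal sketch of the binning step on its own. It assumes made-up category labels ("A", "B", "C") in place of the real column, and it relies on my understanding that matplotlib places string categories at integer positions 0, 1, 2, ... in order of first appearance before binning them.

```python
import numpy as np

# Hypothetical categorical column; the labels are made up for illustration.
values = ["A"] * 5 + ["B"] * 3 + ["C"] * 2

# Assumption: string categories are plotted at integer positions 0, 1, 2, ...
categories = list(dict.fromkeys(values))  # unique labels, first-seen order
positions = [categories.index(v) for v in values]

# The same equal-width binning that bins=25 requests from a histogram.
counts, edges = np.histogram(positions, bins=25)

# Only the bins containing a category position are filled; the rest are empty.
empty_bins = int((counts == 0).sum())
print(counts)
print("empty bins:", empty_bins)
```

With three categories spread over 25 bins, most bins hold nothing, and labeling every bar then prints a 0 on each empty one. The tick for the first category also sits at the left edge of its bin rather than under the middle of the bar.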

Here's my new code:

import matplotlib.pyplot as plt
import seaborn as sns

# Count the rows in each category and store the result in a "Count" column
df = df.groupby("Subtype 1").size().reset_index(name="Count")

def distribution(data, colName_x, colName_y, title, plot_name):
    fig = plt.figure(figsize = (50,10))
    ax = sns.barplot(x=colName_x, y=colName_y, data=data, color = 'maroon')
    ax.tick_params(axis='x', rotation=90, labelsize=20)
    ax.set_ylim((0, 2000))
    ax.set_yticks([0, 500, 1000, 1500, 2000])
    ax.set_yticklabels([0, 500, 1000, 1500, ">2000"], fontsize=20)
    ax.bar_label(ax.containers[0], size=20)

    ax.set_title(title, fontsize = 30)
    plt.savefig(plot_name, format='png', bbox_inches="tight")
    plt.show()

distribution(df, 'Subtype 1', 'Count', "Distribution of Subtype 1", "Distribution of Subtype 1.png" )
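For anyone who would rather stay with plain matplotlib, a rough sketch of the same idea is below: count the categories first, then draw one bar per category with ax.bar. The labels and counts are invented stand-ins for the "Subtype 1" column, and the Agg backend line is only there so the sketch runs without a display.

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for an interactive window
import matplotlib.pyplot as plt

# Hypothetical category values standing in for df["Subtype 1"].
values = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
counts = Counter(values)  # one entry per category, in first-seen order

fig, ax = plt.subplots(figsize=(8, 4))

# One bar per category; bars are centered on their tick by default.
bars = ax.bar(list(counts.keys()), list(counts.values()), color="maroon")

ax.bar_label(bars)  # label each bar with its count
ax.tick_params(axis="x", rotation=90)
ax.set_title("Distribution of Subtype 1")
```

Because there is exactly one bar per category, no empty bars exist to pick up a 0 label, and each tick sits under the middle of its bar.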
