I want to compare two datasets by drawing the first as a violin plot and the second as points on the same axes.

Unfortunately, the two plots do not line up, as can be seen in the figure attached to this post.

I am unable to overlay them.

Here is the relevant part of my script:

import matplotlib.pyplot as plt
import seaborn as sns

fig, ax = plt.subplots(1, 1, figsize=(10, 10))
sns.violinplot(data=df2.rolling(window=3).mean(), split=False, color="pink", ax=ax)
sns.stripplot(data=df.rolling(window=3).mean(), dodge=True, linewidth=1, jitter=False, ax=ax)

Here is the output of df2.describe() and df.describe():

df2.describe()

         Bafoussam
count  1395.000000
mean      7.030180
std      10.730945
min       0.000000
25%       0.000000
50%       2.900000
75%       9.800000
max     102.300003

df.describe()

       Bafoussam    Dschang
count  31.000000  31.000000
mean   16.980645  26.138710
std    25.344683  31.060786
min     0.000000   0.000000
25%     0.000000   0.000000
50%     8.200000   8.100000
75%    24.500000  48.800000
max    96.200000  90.600000

In the end, I got the plot I wanted with the following:

fig = plt.figure(figsize=(10, 14))
gs = fig.add_gridspec(2, 2)
ax1 = fig.add_subplot(gs[0, 0])
ax2 = fig.add_subplot(gs[0, 1])
ax3 = fig.add_subplot(gs[1, 0])
ax4 = fig.add_subplot(gs[1, 1])

# Station 1: raw values, violin for the long record with the single year's points on top
sns.violinplot(data=df2["Station1"], split=False, color="pink", inner=None, ax=ax1)
sns.stripplot(data=df["Station1"], dodge=True, linewidth=1, jitter=False, ax=ax1, color="red", size=16)

# Station 2: same overlay on rolling means; rday (the window length) is set elsewhere in the script
sns.violinplot(data=df2["Station2"].rolling(window=rday).mean(), inner=None, split=False, color="pink", ax=ax2)
sns.stripplot(data=df["Station2"].rolling(window=rday).mean(), dodge=True, linewidth=1, jitter=False, ax=ax2, color="red", size=16)

  • I am using a daily rainfall dataset covering a long period (for the violin) and a specific year (for the stripplot). I want to show how extreme this particular year was. Thanks Commented Sep 4, 2024 at 13:10
  • I added the result of df.describe() Commented Sep 4, 2024 at 13:14
  • Please add more information, and please never add text as an image, always as text. Please explain how you expect the plot to look when df2 has only one column ("0") and df has two columns. Please provide some reproducible test data. Commented Sep 4, 2024 at 13:59
  • These are the requested outputs. I am selecting one column from df. df2.describe() Out[865]: Bafoussam count 1395.000000 mean 7.030180 std 10.730945 min 0.000000 25% 0.000000 50% 2.900000 75% 9.800000 max 102.300003 df.describe() Out[866]: Bafoussam Dschang count 31.000000 31.000000 mean 16.980645 26.138710 std 25.344683 31.060786 min 0.000000 0.000000 25% 0.000000 0.000000 50% 8.200000 8.100000 75% 24.500000 48.800000 max 96.200000 90.600000 Commented Sep 4, 2024 at 14:06
  • Please add this to the post, not the comments. Commented Sep 4, 2024 at 14:08
