I hope someone can help me with the following problem. My dataframe is organized with insect species in the columns and locations in the rows like:

                Species A  Species B  Species C
    Location A          0          1
    Location B          2         12          0
    Location C          0          5          0

What I need is something like this:

    Number  Species    Location
         0  Species A  Location A
         2  Species A  Location B
         0  Species A  Location C
         1  Species B  Location A
        12  Species B  Location B

and so on.

So far I have no idea how to do this or which command would produce the desired result.

Thank you so much for your help and kindest regards, Julia

2 Answers

In an Excel sheet you can do this with the following formula:

=TEXTSPLIT(TEXTJOIN("ß",FALSE,TRANSPOSE(MAP(B2:D4,LAMBDA(x,
TEXTJOIN("|",FALSE,x,INDEX(A1:D1,0,COLUMN(x)),INDEX(A1:A4,ROW(x),0)))))),"|","ß",FALSE)

11 Comments

Thank you so much for your answer! But something is not quite working. If I enter the line: =TEXTSPLIT(TEXTJOIN("ß",FALSE,TRANSPOSE(MAP(B2:D241,LAMBDA(x, TEXTJOIN("|",FALSE,x,INDEX(A1:DQ1,0,COLUMN(x)),INDEX(A1:A241,ROW(x),0)))))),"|","ß",FALSE) which should be correct for my datasheet, I get the following error: support.microsoft.com/en-us/office/… Do you know what mistake I made?
You have to paste the formula directly into the formula bar, not into the cell. If you paste it into the cell, the text will be split across two cells (one line per cell).
I placed it in the formula bar. But there is always a marked cell or area where the text appears. How can I solve this problem?
This does not need to be solved. Excel is just showing the active cell and the spill range of the formula. Click another cell and it disappears.
I do not know where I made the mistake, but it still does not work and only produces the error message I linked above.

One solution with Pandas:

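    # Assumes df has the locations as its index and the species as its columns.
    # "Stack" the dataframe into long format: one row per (location, species) pair
    df = df.stack().reset_index()

    # Rename the columns
    df.columns = ['Location', 'Species', 'Number']

    # Reorder the columns to match the requested output
    df = df[['Number', 'Species', 'Location']]

    # Order the data by species; a stable sort keeps the location order within each species
    df = df.sort_values('Species', kind='stable')

    # Reset the index to 0..n-1
    df = df.reset_index(drop=True)

    print(df)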
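
For reference, here is a self-contained sketch of the same reshaping that can be run as is. It swaps in `melt()` instead of `stack()`, which is an alternative and not the method used above. The sample values are copied from the question. The names `wide` and `long` are made up for this example, and the missing Species C count for Location A is assumed to be a missing value (NaN).

```python
import pandas as pd

# Sample data copied from the question. The Species C count for Location A
# is missing in the question, so it is entered as NaN here (an assumption).
wide = pd.DataFrame(
    {
        "Species A": [0, 2, 0],
        "Species B": [1, 12, 5],
        "Species C": [float("nan"), 0, 0],
    },
    index=["Location A", "Location B", "Location C"],
)

# melt() needs the locations as a regular column, so move the index out first.
long = (
    wide.rename_axis("Location")
    .reset_index()
    .melt(id_vars="Location", var_name="Species", value_name="Number")
)

# Put the columns in the order requested in the question.
long = long[["Number", "Species", "Location"]]

print(long)
```

`melt()` walks through the species columns one after another, so the rows already come out grouped by species and no extra sort is needed. It also keeps the row with the missing count, whereas `stack()` may drop missing values depending on the pandas version. Because of the NaN, the counts are stored as floats in this sketch.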



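EDIT: The same in R instead of Python:

    library(reshape2)  # provides melt()

    # Sample data from the question (the missing count is entered as NA)
    df <- data.frame(
      ind = c("Location_A", "Location_B", "Location_C"),
      Species_A = c(0, 2, 0),
      Species_B = c(1, 12, 5),
      Species_C = c(NA, 0, 0)
    )

    # Reshape from wide to long: one row per (location, species) pair
    df <- melt(df, id.vars = "ind")
    colnames(df) <- c("Location", "Species", "Number")

    # Reverse the column order to Number, Species, Location
    df <- df[, rev(colnames(df))]
    print(df)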

4 Comments

Thank you very much for your answer. Do you know if there is an alternative to pandas? I do not have Python installed, so I cannot use this package.
Oops, I thought you were using Python/Pandas! There is no language tag on your question. Where does your dataframe come from? R?
Thank you for your answer. Yes, I use R (RStudio) for all my statistics and graphics.
I am not an expert in R, but the code I added to my answer seems to work.
