
I'm creating a Pandas DataFrame in which each (index, column) location can hold a numpy ndarray of arbitrary shape, or even a simple number.

This works:

import numpy as np
import pandas as pd
x = pd.DataFrame([[np.random.rand(100, 100, 20, 2), 3], [2, 2], [3, 3], [4, 4]],
                 index=['A1', 'B2', 'C3', 'D4'], columns=['data', 'data2'])
print(x)

but takes 50 seconds to create on my computer! Why?

np.random.rand(100, 100, 20, 2) alone is super fast (< 1 second to create)

How can I speed up the creation of Pandas DataFrames containing ndarrays of various shapes?

  • When a pandas DataFrame is a homogeneous type, the whole thing can be a single numpy array. When you create a list like this where the columns are heterogeneous, pandas has to do a bunch of bookkeeping and reformatting to keep track of the different datatypes. Commented Jun 23, 2022 at 23:25
  • Yes probably @TimRoberts but here I only have ~400 000 coefficients to store in the dataframe. 50 seconds for this is really problematic! Is there an easy fix here? Commented Jun 23, 2022 at 23:27
  • It's not the creation taking time, it's the print. The creation is pretty much instantaneous on my computer, as is print(x['data2']). But print(x['data']) takes about 15 seconds Commented Jun 23, 2022 at 23:41
  • Oh you're right @Nick, solved! You can post as an answer! Commented Jun 23, 2022 at 23:42
  • In fact print(x['data']['A1']) and print(x['data']['B2']) are likewise super fast. So I guess print is just having trouble putting together elements of vastly different size. A bug perhaps? Commented Jun 23, 2022 at 23:43

1 Answer


It's not actually the creation that is the issue, it's the print statement. 1000 loops of the creation take 2.8 seconds on my computer. But one iteration of the print takes about 26 seconds.
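A minimal sketch of how the two steps can be timed separately, so you can check this on your own machine. The helper names `time_call` and `build` are made up for this illustration, and I've shrunk the array to (20, 20, 5, 2) so the script finishes quickly; swap the original shape back in to reproduce the slow case. `repr(x)` builds the same text that `print(x)` would display.

```python
import time

import numpy as np
import pandas as pd


def time_call(fn):
    """Run fn() once and return (result, elapsed seconds). Hypothetical helper."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def build():
    # Same layout as in the question, but with a smaller array (an assumption
    # made here only to keep the demo fast).
    return pd.DataFrame(
        [[np.random.rand(20, 20, 5, 2), 3], [2, 2], [3, 3], [4, 4]],
        index=['A1', 'B2', 'C3', 'D4'], columns=['data', 'data2'])


# Time the construction on its own.
x, t_create = time_call(build)

# Time only the text rendering that print(x) would trigger.
text, t_render = time_call(lambda: repr(x))

print(f"create: {t_create:.4f} s, render: {t_render:.4f} s")
```

The absolute numbers will vary by machine and pandas version; the point is only that the two timings are measured independently.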

Interestingly, print(x['data2']), print(x['data']['A1']) and print(x['data']['B2']) are all basically instantaneous. So it seems print is having an issue figuring out how to display items of vastly different size. Perhaps a bug?
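If you only need a quick look at what the DataFrame holds, one possible workaround is to print a summary instead of the frame itself. This is just a sketch, not something built into pandas: `summarize` is a hypothetical helper that replaces each ndarray with a short string describing its shape and leaves everything else alone.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame(
    [[np.random.rand(100, 100, 20, 2), 3], [2, 2], [3, 3], [4, 4]],
    index=['A1', 'B2', 'C3', 'D4'], columns=['data', 'data2'])


def summarize(value):
    """Return a short label for ndarrays; pass other values through unchanged."""
    if isinstance(value, np.ndarray):
        return f"ndarray{value.shape}"
    return value


# Build a lightweight copy that holds labels instead of the large arrays,
# then print that copy rather than x itself.
summary = pd.DataFrame({col: x[col].map(summarize) for col in x.columns})
print(summary)
```

The original `x` is untouched, so `x['data']['A1']` still returns the full array when you need it.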
