I'm creating a Pandas DataFrame in which each (index, column) cell can hold a numpy ndarray of arbitrary shape, or just a plain number.
This works:
import numpy as np, pandas as pd
x = pd.DataFrame([[np.random.rand(100, 100, 20, 2), 3], [2, 2], [3, 3], [4, 4]],
                 index=['A1', 'B2', 'C3', 'D4'], columns=['data', 'data2'])
print(x)
but takes 50 seconds to create on my computer! Why?
np.random.rand(100, 100, 20, 2) alone is super fast (< 1 second to create).
How can I speed up the creation of Pandas DataFrames containing ndarrays of various shapes?
The slow step is print. The creation is pretty much instantaneous on my computer, as is print(x['data2']). But print(x['data']) takes about 15 seconds. print(x['data']['A1']) and print(x['data']['B2']) are likewise super fast. So I guess print is just having trouble putting together elements of vastly different size. A bug perhaps?
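One way to check where the time goes is to time the construction and the formatting separately. The following is only a minimal sketch, not a diagnosis of the cause: it assumes a much smaller array than the one in the question so that it runs quickly, and the names big, text, and shapes are illustrative. It also shows one possible workaround, which is to summarize each cell by its shape rather than printing the full contents.

```python
import time

import numpy as np
import pandas as pd

# Assumption: a smaller array than in the question, so the sketch runs quickly.
big = np.random.rand(10, 10, 5, 2)

# Time the DataFrame construction on its own.
t0 = time.perf_counter()
x = pd.DataFrame([[big, 3], [2, 2], [3, 3], [4, 4]],
                 index=['A1', 'B2', 'C3', 'D4'], columns=['data', 'data2'])
t_create = time.perf_counter() - t0

# Time the string formatting on its own; repr(x) builds the text print(x) shows.
t0 = time.perf_counter()
text = repr(x)
t_repr = time.perf_counter() - t0

print(f"create: {t_create:.4f} s, repr: {t_repr:.4f} s")

# Possible workaround: look at the shape of each cell instead of its contents.
# np.shape returns () for plain numbers.
shapes = {label: np.shape(value) for label, value in x['data'].items()}
print(shapes)
```

If the repr time dominates once the array is scaled back up to (100, 100, 20, 2), that would be consistent with the observation above that the display step, not the creation, is slow.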