I have a column in a pandas df of type object that I want to parse to get the first number in the string, and create a new column containing that number as an int.
For example:
Existing df
col
'foo 12 bar 8'
'bar 3 foo'
'bar 32bar 98'
Desired df
col col1
'foo 12 bar 8' 12
'bar 3 foo' 3
'bar 32bar 98' 32
I have code that works on any individual cell in the column:
int(re.search(r'\d+', df.iloc[0]['col']).group())
The above code works fine and returns 12 as it should. But when I try to create a new column using the whole series:
df['col1'] = int(re.search(r'\d+', df['col']).group())
I get the following error:
TypeError: expected string or bytes-like object
I tried wrapping str() around df['col'], which got rid of the error but yielded all 0s in col1.
I've also tried converting col to a list of strings and iterating through the list, which only yields the same error. Does anyone know what I'm doing wrong? Help would be much appreciated.
Use the DataFrame.apply() method. Your computation is probably too complex for a simple assignment.
Alternatively, use df['col'].str.extract(r'(\d+)') with expand=False...
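Here is a minimal sketch of both suggestions, using the sample data from the question. The original call fails because re.search expects a single string, not a whole Series. The helper name first_int and the column name col2 are made up for illustration, and the row-wise version calls apply on the column (Series.apply) rather than on the whole DataFrame.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"col": ["foo 12 bar 8", "bar 3 foo", "bar 32bar 98"]})

# Vectorized approach: str.extract returns the first match of the capture
# group for each row. expand=False gives back a Series rather than a
# one-column DataFrame.
df["col1"] = df["col"].str.extract(r"(\d+)", expand=False).astype(int)


def first_int(text):
    """Hypothetical helper: first run of digits in text as an int, else None."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


# Row-wise approach: apply the single-cell logic from the question to each value.
df["col2"] = df["col"].apply(first_int)

print(df)
```

Note that astype(int) assumes every row contains at least one digit. If some rows might not, str.extract yields NaN for them and the cast raises an error, so in that case it may be safer to skip the cast or use a nullable integer dtype.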