I have a large pandas DataFrame like the one below.
import pandas as pd
import numpy as np
df = pd.DataFrame(
    [
        ("1", "Dixon Street", "Auckland"),
        ("2", "Deep Creek Road", "Wellington"),
        ("3", "Lyon St", "Melbourne"),
        ("4", "Hongmian Street", "Quinxin"),
        ("5", "Kadawatha Road", "Ganemulla"),
    ],
    columns=("ad_no", "street", "city"),
)
And I have a second large pandas DataFrame as below.
dfa = pd.DataFrame(
    [
        ("1 Dixon Street", "Auckland"),
        ("2 Deep Creek Road", "Wellington"),
        ("3 Lyon St", "Melbourne"),
        ("4 Hongmian Street", "Quinxin"),
        ("5 Federal Street", "Porac City"),
    ],
    columns=("address", "city"),
)
I want to check whether each street string in df appears in dfa, using the str.contains function. I am particularly interested in the ones that do not match (e.g., Kadawatha Road). Can someone please let me know how to do that? Thanks.
I tried the following code, but it doesn't produce any results.
for a in df['street']:
    dfa[dfa['address'].str.contains(a, case=False)]
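In a script (and, as far as I can tell, in a notebook cell with default settings) an expression inside a `for` loop is evaluated and then discarded, which is probably why the loop above shows nothing. Below is a minimal sketch that keeps the result for each street, using cut-down copies of the two frames above. The name `unmatched` is only illustrative, and `regex=False` reflects my assumption that street names should be treated as literal text and not as regular expressions.

```python
import pandas as pd

# Cut-down copies of the frames from the question
df = pd.DataFrame(
    [
        ("1", "Dixon Street", "Auckland"),
        ("5", "Kadawatha Road", "Ganemulla"),
    ],
    columns=("ad_no", "street", "city"),
)
dfa = pd.DataFrame(
    [
        ("1 Dixon Street", "Auckland"),
        ("5 Federal Street", "Porac City"),
    ],
    columns=("address", "city"),
)

unmatched = []  # illustrative name: streets found in no address
for street in df["street"]:
    # Boolean mask: True where the address contains the street as a substring
    mask = dfa["address"].str.contains(street, case=False, regex=False)
    if not mask.any():
        unmatched.append(street)

print(unmatched)  # ['Kadawatha Road']
```

This still calls `str.contains` once per street, so it does not address the speed concern; it only makes the non-matching streets visible.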
Does str.contains do a partial string match? Also, str.contains is slow, BTW.
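As far as I understand, `str.contains` tests whether the pattern occurs anywhere in each string, so it is a partial (substring) match, and by default the pattern is interpreted as a regular expression. If an exact match on the whole address were acceptable, one possible alternative to calling `str.contains` per street is a left merge with `indicator=True`. The sketch below assumes every address in dfa has the form "<ad_no> <street>", which is my assumption and is not guaranteed by the data; `full`, `merged`, and `not_found` are illustrative names.

```python
import pandas as pd

# Cut-down copies of the frames from the question
df = pd.DataFrame(
    [
        ("1", "Dixon Street", "Auckland"),
        ("5", "Kadawatha Road", "Ganemulla"),
    ],
    columns=("ad_no", "street", "city"),
)
dfa = pd.DataFrame(
    [
        ("1 Dixon Street", "Auckland"),
        ("5 Federal Street", "Porac City"),
    ],
    columns=("address", "city"),
)

# str.contains is a substring test: "Dixon" is found inside "1 Dixon Street"
partial = dfa["address"].str.contains("Dixon", regex=False)

# Assumption: an address is always "<ad_no> <street>", so rebuild it in df
full = df.assign(address=df["ad_no"] + " " + df["street"])

# A left merge keeps every row of df; the _merge column says whether
# a matching (address, city) pair was found in dfa
merged = full.merge(dfa, on=["address", "city"], how="left", indicator=True)
not_found = merged.loc[merged["_merge"] == "left_only", "street"].tolist()

print(not_found)  # ['Kadawatha Road']
```

Note that this changes the semantics from substring matching to exact matching on address and city, so it would miss addresses with extra text such as a unit number.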