I have two date fields that I want to get the monthly difference on a 30-year loan term. The dates are actually in yyyy-mm-dd format in my dataset, however, when I check the dtype in Python it says both are objects. The sample data is below:
|End Date |StartDate|
|----------|---------|
|2/1/2018 |9/30/2016|
|1/1/2024 |9/30/2016|
|10/1/2023 |9/30/2016|
|4/1/2021 |9/30/2016|
|8/1/2030 |9/30/2016|
|10/1/2018 |9/30/2016|
|11/1/2016 |9/30/2016|
|7/1/2018 |9/30/2016|
|8/1/2020 |9/30/2016|
|2/1/2018 |9/30/2016|
|4/1/2018 |9/30/2016|
My code below works:
diff = (pd.to_datetime(df['End Date'], format = '%Y-%m-%d') - pd.to_datetime(df['Start Date'], format = '%Y-%m-%d'))/30
Huge BUT--the output looks like this:
16 days 07:12:00
88 days 07:12:00
85 days 05:36:00
54 days 19:12:00
168 days 10:24:00
24 days 08:48:00
1 days 01:36:00
21 days 07:12:00
46 days 16:48:00
16 days 07:12:00
18 days 06:24:00
How do I remove 'days' and everything after so only the value shows? How do I convert the value to an integer?