I have a sample data that looks like this.
Column DateDuration was calculated in Excel, following below logic:
- DateDuration between SecondDate and FirstDate >= 28, then DateDuration = SecondDate - FirstDate.
- DateDuration between SecondDate and FirstDate <28, if ThirdDate = nan, then DateDuration = SecondDate - FirstDate.
- DateDuration between SecondDate and FirstDate <28, if ThirdDate = not nan, then consider (ThirdDate - FirstDate):
- ThirdDate - FirstDate >= 28, then DateDuration = ThirdDate - FirstDate.
- ThirdDate - FirstDate < 28, if FourthDate = nan, then DateDuration = ThirdDate - FirstDate.
- ThirdDate - FirstDate < 28, if FourthDate = not nan, then DateDuration = FourthDate - FirstDate.
I would like to calculate DateDuration in Python but do not know how to do about this.
Types of data in Python:
- ID int64
- FirstDate object
- SecondDate object
- ThirdDate object
- FourthDate object
I am new to Python. Any help would be greatly appreciated!!
import pandas as pd
import numpy as np
df['FirstDate'] = pd.to_datetime(df['FirstDate'])
df['SecondDate'] = pd.to_datetime(df['SecondDate'])
df['DayDifference2'] = (df['SecondDate']) -(df['FirstDate'])
df['DayDifference3'] = (df['ThirdDate']) -(df['FirstDate'])
df['DayDifference4'] = (df['FourthDate']) -(df['FirstDate'])
x = df['DayDifference2'].dt.days
y = df['DayDifference3'].dt.days
z = df['DayDifference4'].dt.days
condlist = [x<28, x>=28]
choicelist = [(df['ThirdDate']) -(df['FirstDate']), (df['SecondDate']) -(df['FirstDate'])]
np.select(condlist, choicelist)
My data:
| ID | FirstDate | SecondDate | ThirdDate | FourthDate | DateDuration |
|---|---|---|---|---|---|
| 2914300 | 2021-09-23 | 2021-10-07 | 2021-11-29 | 2021-12-20 | 67 |
| 3893461 | 2021-09-08 | 2021-10-06 | 2022-04-07 | 211 | |
| 4343075 | 2021-06-23 | 2021-09-27 | 96 | ||
| 4347772 | 2021-06-23 | 2021-09-27 | 96 | ||
| 4551963 | 2021-08-02 | 2021-10-14 | 2022-03-11 | 73 | |
| 4893324 | 2021-09-30 | 2021-10-01 | 2022-03-03 | 2022-03-10 | 154 |
| 5239991 | 2021-06-24 | 2021-08-26 | 2021-09-25 | 2022-02-03 | 63 |
| 8454947 | 2021-09-28 | 2021-10-05 | 7 | ||
| 8581390 | 2021-09-27 | 2022-03-21 | 2022-03-25 | 175 | |
| 8763766 | 2021-09-20 | 2021-10-04 | 2021-12-09 | 80 | |
| 9144185 | 2021-06-18 | 2021-06-23 | 5 | ||
| 9967685 | 2021-09-13 | 2021-10-29 | 2022-02-07 | 2022-03-23 | 46 |
| 11367560 | 2021-08-31 | 2021-09-28 | 2021-10-21 | 2022-02-11 | 51 |


print(df)and copy/paste the output.