Python iterate through list with pointer

Question

I am trying to create a session template of dates for a dataframe in pandas based on the start and end day of the week of my given dataframe. I have the start and end day abbreviations (Mo, Tu, We, etc.) and the start/end time (8:30 AM, 5:30 PM, etc.).

What I want to create is a template that gives the start day abbreviation, the times over the days that it spans, and the end day. For example, my dataframe currently looks like the following:

Start Time   End Time     Start/End               Namestart  Nameend  Days   Session Template
Mo 8:30 AM   Th 5:30 PM   Mo 8:30 AM-Th 5:30 PM   Mo         Th       4 Day  4 Day Mo 8:30 AM-Th 5:30 PM
We 8:30 AM   Fr 12:30 PM  We 8:30 AM-Fr 12:30 PM  We         Fr       3 Day  3 Day We 8:30 AM-Fr 12:30 PM

The current session template gives me the day count, start time, end time, and the day of week it begins/ends on. However, I would like for it to give each individual day that the item spans. For the examples above it should yield:

4 Day Mo 8:30 AM-5:30 PM, Tu 8:30 AM-5:30 PM, We 8:30 AM-5:30 PM, Th 8:30 AM-5:30 PM
3 Day We 8:30 AM-5:30 PM, Th 8:30 AM-5:30 PM, Fr 8:30 AM-12:30 PM

If I understand your problem correctly, I would suggest removing the day abbreviations from the 'Start Time' and 'End Time' columns so that they hold only times; you already have the days in Namestart and Nameend. You can then loop over the days from Namestart to Nameend and, for each day, print the start and end time.
– Rachayita Giri, Commented Nov 18, 2019 at 15:57
If you do not want to modify your data frame, you can instead extract the time substrings (such as 8:30 AM and 5:30 PM) from those two columns. You reach the same point as above without changing the data frame, and it can be done neatly with a regular expression.
– Rachayita Giri, Commented Nov 18, 2019 at 16:01
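
To illustrate the substring extraction suggested above, here is a minimal sketch that splits a value such as 'Mo 8:30 AM' into its day and time parts. The names ENTRY and split_entry are hypothetical, and the pattern assumes every value is a two-letter day abbreviation followed by a 12-hour time.

```python
import re

# Two-letter day abbreviation, whitespace, then a 12-hour clock time.
ENTRY = re.compile(
    r"(?P<day>[A-Z][a-z])\s+(?P<time>\d{1,2}:\d{2}\s?(?:AM|PM))",
    re.IGNORECASE,
)


def split_entry(text):
    """Return (day, time) for a value such as 'Mo 8:30 AM'."""
    match = ENTRY.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised entry: {text!r}")
    return match.group("day"), match.group("time")


print(split_entry("Mo 8:30 AM"))
print(split_entry("Fr 12:30 PM"))
```

Raising on a non-matching value is one possible choice; returning None instead may suit data with missing entries better.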
Yes, this is what I am trying to solve. How would you loop over the days based on the day count? I will have different values in the 'Days' field (ranging from 1 day to 5 days) and do not want to hard-code each individual value.
– deeth128, Commented Nov 18, 2019 at 16:02
One way is to store a list like days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su'] and increment a counter after you print each day. Another way is to store a dictionary of (day, index) pairs, like days = {'Mo':1, 'Tu':2, 'We':3, 'Th':4, 'Fr':5, 'Sa':6, 'Su':7}, and use the difference between the corresponding indices. So between Tu and Fr, you have 5-2+1=4 days. You will need to take care of the edge cases.
– Rachayita Giri, Commented Nov 18, 2019 at 16:08
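
The index-difference idea from the comment above might look like the following sketch. DAY_INDEX and day_count are illustrative names, and the function assumes the end day never comes before the start day in the week.

```python
# Day abbreviation -> position in the week, as in the comment above.
DAY_INDEX = {'Mo': 1, 'Tu': 2, 'We': 3, 'Th': 4, 'Fr': 5, 'Sa': 6, 'Su': 7}


def day_count(start_day, end_day):
    """Number of days from start_day to end_day, inclusive (no wrap-around)."""
    count = DAY_INDEX[end_day] - DAY_INDEX[start_day] + 1
    if count < 1:
        # e.g. Sa to Mo: this simple version does not handle wrapping.
        raise ValueError(f"{end_day} comes before {start_day}")
    return count


print(day_count('Tu', 'Fr'))  # 5 - 2 + 1 = 4
print(day_count('We', 'Fr'))  # 5 - 3 + 1 = 3
```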

Rachayita Giri · Accepted Answer · 2019-11-18 21:00:52Z

Here is how you can do it:

import pandas as pd
import re
import itertools
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 1000)

df = pd.read_csv("data.csv")
print(df, "\n")

days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su']
# end time for every day except the last one (hard-coded, as requested in the comments)
default_end_time = "5:30 PM"
# a time such as "8:30 AM" preceded by whitespace
time_pattern = re.compile(r'\s(\d+:\d{2}\s?(?:AM|PM|am|pm))')

for index, row in df.iterrows():
    # get the start and end days
    start_day = row['Namestart']
    end_day = row['Nameend']
    # get the start and end times
    start_time = time_pattern.findall(row['Start Time'])[0]
    end_time = time_pattern.findall(row['End Time'])[0]
    # get the indices corresponding to the start and end days
    start_index = days.index(start_day)
    end_index = days.index(end_day) + 1
    # count the number of days
    cnt = end_index - start_index
    print(cnt, "days\t", end='')
    # slice the days list from start_index to end_index
    for day in itertools.islice(days, start_index, end_index):
        if day != end_day:
            # intermediate days end at the default time
            print(day, start_time, "-", default_end_time + "\t", end='')
        else:
            # the last day ends at the row's own end time
            print(day, start_time, "-", end_time, end='')
    print()  # start a new line before printing the next row

Output:

   Start Time     End Time               Start/End Namestart Nameend   Days              Session Template
0  Mo 8:30 AM   Th 5:30 PM   Mo 8:30 AM-Th 5:30 PM        Mo      Th  4 Day   4 Day Mo 8:30 AM-Th 5:30 PM
1  We 8:30 AM  Fr 12:30 PM  We 8:30 AM-Fr 12:30 PM        We      Fr  3 Day  3 Day We 8:30 AM-Fr 12:30 PM 

4 days  Mo 8:30 AM - 5:30 PM    Tu 8:30 AM - 5:30 PM    We 8:30 AM - 5:30 PM    Th 8:30 AM - 5:30 PM
3 days  We 8:30 AM - 5:30 PM    Th 8:30 AM - 5:30 PM    Fr 8:30 AM - 12:30 PM

The comments in the code explain each step. An explanation of the regular expression I used can be found in this answer: https://stackoverflow.com/a/49217300/6590393.

Also, please note that the above code assumes you only move forward in the list. For example, Sa-Mo would not yield the expected result. I leave it to you to handle such boundary cases if you need them.
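
One possible way to handle a span that wraps past Sunday, such as Sa-Mo, is modular arithmetic on the list indices. This is only a sketch: days_between is a hypothetical helper, and it assumes a span never exceeds one week and that equal start and end days mean a single day.

```python
DAYS = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su']


def days_between(start_day, end_day):
    """Days from start_day to end_day inclusive, wrapping past Sunday."""
    start = DAYS.index(start_day)
    # The modulo keeps the difference in 0..6 even when end < start.
    count = (DAYS.index(end_day) - start) % len(DAYS) + 1
    return [DAYS[(start + offset) % len(DAYS)] for offset in range(count)]


print(days_between('Mo', 'Th'))
print(days_between('Sa', 'Mo'))
```

For Sa-Mo the difference is (0 - 5) % 7 = 2, so the span covers three days: Sa, Su, Mo.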

edited Nov 18, 2019 at 21:00

answered Nov 18, 2019 at 17:00


4 Comments

deeth128 Over a year ago

This is great, thank you so much for your help. There is only one issue still - the end time is only applicable on the end date, so on the days in between the item would not end at 12:30 PM but instead 5:30 PM. For example, the second output should be: 3 Day We 8:30 AM-5:30 PM, Th 8:30 AM-5:30 PM, Fr 8:30 AM-12:30 PM. Is there a way to just hard code the end time for the days in between to be 5:30 PM?

Rachayita Giri Over a year ago

Edited the code. Please accept the answer if it solved your problem.

deeth128 Over a year ago

How could I return the results as a new column, say df['session template'], rather than printing the new lines? I tried: for day in itertools.islice(days, start_index, end_index): if (day!=end_day): print(day, start_time, "- 5:30 PM,\t", end='') else: print(day, start_time, "-", end_time, ",", end='') x['session temp'] = day + start_time + "-" + end_time + ","

Rachayita Giri Over a year ago

You can add new columns to a Pandas data frame. Add a column called 'session temp' and then update the values in that column instead of/apart from printing. You'll need to change your x['session temp'] to df['session temp'].
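
As a rough sketch of the new-column approach: note that assigning a single string to df['session temp'] inside the loop would overwrite the whole column on every iteration, so one option is to build each row's string in a function and assign the result of df.apply once. The function name session_template and the constant DEFAULT_END are hypothetical; the sample rows are taken from the question, and the 5:30 PM default for intermediate days follows the request in the comments.

```python
import re

import pandas as pd

DAYS = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su']
DEFAULT_END = "5:30 PM"  # assumed end time for every day except the last
TIME = re.compile(r"\d{1,2}:\d{2}\s?(?:AM|PM)", re.IGNORECASE)


def session_template(row):
    """Build e.g. '3 Day We 8:30 AM-5:30 PM, ..., Fr 8:30 AM-12:30 PM'."""
    start_time = TIME.search(row["Start Time"]).group()
    end_time = TIME.search(row["End Time"]).group()
    # Forward-only slice, like the answer above (no wrap past Sunday).
    span = DAYS[DAYS.index(row["Namestart"]):DAYS.index(row["Nameend"]) + 1]
    parts = [
        f"{day} {start_time}-{end_time if day == row['Nameend'] else DEFAULT_END}"
        for day in span
    ]
    return f"{len(span)} Day " + ", ".join(parts)


df = pd.DataFrame({
    "Start Time": ["Mo 8:30 AM", "We 8:30 AM"],
    "End Time": ["Th 5:30 PM", "Fr 12:30 PM"],
    "Namestart": ["Mo", "We"],
    "Nameend": ["Th", "Fr"],
})

# axis=1 passes each row to the function; the result is one string per row.
df["session temp"] = df.apply(session_template, axis=1)
print(df["session temp"].tolist())
```

If you prefer to keep the iterrows loop, collecting the strings in a list and assigning the list to the column after the loop should work as well.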
