learn the expected output of a pandas dataframe

Question

I have the following dataframe, which in reality consists of more data points and days:

import pandas as pd
df = pd.DataFrame({'day_1': [0,1,1,0,1,1,0], 'day_2': [0,0,1,1,1,1,0], 'day_3': [0,1,1,1,0,0,0], 'day_4': [0,1,0,1,0,1,0], 'day_5': [1,0,1,1,1,0,0]})

    day_1    day_2    day_3    day_4    day_5
0       0        0        0        0        1
1       1        0        1        1        0
2       1        1        1        0        1
3       0        1        1        1        1
4       1        1        0        0        1
5       1        1        0        1        0
6       0        0        0        0        0

The zeros and ones should occur at the same indexes on consecutive days. However, due to measurement errors, an expected one will sometimes be recorded as a zero. Edit: an expected zero can also be recorded as a one. I would like to build a simple model that "learns" the desired behaviour and gives the expected output for day 6. The desired output (not known beforehand, but to be learned by the model) is a single column for day 6, with ones from index 1 to 5.

I know this can be done by various machine learning options. However, I'd like to implement the code in a small microcontroller, so I was wondering if there is a way to do this without using a lot of computational power.

@jezrael I had a typo in the expected output. The expected output will be one column based on the data of all days. E.g. on day one there is a zero at index 3 which should have been a one. On day two there is a zero at index 1 that should have been a one. I would like to detect a pattern and say before day six: there will probably be ones from index 1 to 5. I hope this makes it clear.
– Hoekieee, Commented Oct 30, 2019 at 9:43
It is unclear how you get from the input to the output. For example, for the row indexed 5, why do you think the output should be 1 and not 0? Please explain in an edit to the question, not as a comment here.
– Aryerez, Commented Oct 30, 2019 at 9:47
Hmm, is it possible to use print(df.max(axis=1))? If not, can you change the data sample to explain it better?
– jezrael, Commented Oct 30, 2019 at 9:49

Aryerez · Accepted Answer · 2019-10-30 10:31:38Z

The simplest thing that you can do is:

test_val = 0.5  # Threshold on the row mean over the previous days; at or above it the output is 1
df['day_6'] = (df.mean(axis=1) >= test_val).astype(int)

This will give you an output of 1 in every row in which at least 50% (the test_val value) of the previous days' values are 1, and 0 otherwise.
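Since the question targets a microcontroller, the same threshold rule can also be written without pandas. The following is only a minimal sketch and not part of the original answer: the function name predict_next_day and the integer parameters num and den (the threshold written as a fraction, so no floating point is needed) are illustrative assumptions.

```python
def predict_next_day(rows, num=1, den=2):
    """Return 1 for each row whose share of ones is at least num/den, else 0.

    rows: list of per-index lists, one 0/1 reading per recorded day.
    The test count/len >= num/den is rewritten as count*den >= num*len,
    so only integer arithmetic is used.
    """
    return [1 if sum(r) * den >= num * len(r) else 0 for r in rows]


# Sample data from the question, one inner list per index (day_1..day_5).
rows = [
    [0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1],
    [0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

day_6 = predict_next_day(rows)  # default threshold 1/2, i.e. test_val = 0.5
print(day_6)  # [0, 1, 1, 1, 1, 1, 0]
```

On the sample data, a threshold of 0.5 yields ones at indexes 1 to 5, which is the pattern the asker describes in the comments.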

1 Comment

Hoekieee Over a year ago

Thanks! I have been thinking about something similar. It can be a struggle to pick a proper value for test_val though.
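One way to get a feel for how sensitive the result is to test_val is to list the prediction for every possible cutoff. This is a rough sketch under assumptions, not something proposed in the thread: the helper name predictions_for and the idea of expressing the threshold as a minimum count of ones (min_ones) are illustrative.

```python
# Sample data from the question, one inner list per index (day_1..day_5).
rows = [
    [0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1],
    [0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

n_days = len(rows[0])
counts = [sum(r) for r in rows]  # number of ones seen per index


def predictions_for(min_ones):
    """Predict 1 where at least min_ones of the recorded days were 1."""
    return [1 if c >= min_ones else 0 for c in counts]


# A cutoff of min_ones out of n_days corresponds to test_val = min_ones / n_days.
for min_ones in range(1, n_days + 1):
    print(min_ones, predictions_for(min_ones))
```

On this small sample, cutoffs of 2 and 3 out of 5 give the same prediction (ones at indexes 1 to 5), so any test_val above 0.2 and up to 0.6 behaves identically here. With more days and noisier data the stable range may look different.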
