I have the following dataframe, which in reality consists of more data points and days:

import pandas as pd

df = pd.DataFrame({'day_1': [0,1,1,0,1,1,0], 'day_2': [0,0,1,1,1,1,0], 'day_3': [0,1,1,1,0,0,0], 'day_4': [0,1,0,1,0,1,0], 'day_5': [1,0,1,1,1,0,0]})

    day_1    day_2    day_3    day_4    day_5
0       0        0        0        0        1
1       1        0        1        1        0
2       1        1        1        0        1
3       0        1        1        1        1
4       1        1        0        0        1
5       1        1        0        1        0
6       0        0        0        0        0    

The zeros and ones should occur at the same indexes on consecutive days. However, due to measurement errors, an expected one will sometimes be a zero. Edit: an expected zero can also be a one. I would like to build a simple model that "learns" the desired behaviour and gives the expected output for day 6. The desired output (not known beforehand, but to be learned by the model) is:

    day_6   
0       0  
1       1 
2       1
3       1 
4       1 
5       1      
6       0

I know this can be done by various machine learning options. However, I'd like to implement the code in a small microcontroller, so I was wondering if there is a way to do this without using a lot of computational power.

  • Can you add the expected output of all columns? Commented Oct 30, 2019 at 9:37
  • @jezrael I had a typo in the expected output. The expected output will be one column based on the data of all days. E.g. on day one there is a zero at index 3 that should have been a one. On day two there is a zero at index 1 that should have been a one. I would like to detect a pattern and say before day six: there will probably be ones from index 1 to 5. Hope this makes it clear for you. Commented Oct 30, 2019 at 9:43
  • It is unclear how you get from the input to the output. For example, for the row indexed 5, why do you think the output should be 1 and not 0? Please explain in an edit to the question, not as a comment here. Commented Oct 30, 2019 at 9:47
  • Hmm, is it possible to use print(df.max(axis=1))? If not, can you change the data sample to explain it better? Commented Oct 30, 2019 at 9:49
  • @jezrael you mean df.max(axis=1) Commented Oct 30, 2019 at 9:50

1 Answer

The simplest thing that you can do is:

test_val = 0.5  # Minimum fraction of ones over the previous days for the output to be 1
# Row-wise mean is the fraction of ones; compare to the threshold and convert True/False to 1/0
df['day_6'] = (df.mean(axis=1) >= test_val).astype(int)

This gives an output of 1 in every row where at least 50% (the test_val value) of the columns are 1, and 0 otherwise. On the sample data in the question, this reproduces the desired day_6 column.
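Since pandas is unlikely to be available on a small microcontroller, the same threshold rule can be sketched with plain lists and integer arithmetic only. This is a minimal sketch, not the answer's original code: the function name predict_next_day and the parameters num and den (the threshold written as a fraction num/den) are assumptions made for illustration.

```python
# Rows are indexes 0..6, columns are day_1..day_5 (same data as the question).
history = [
    [0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1],
    [0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

def predict_next_day(rows, num=1, den=2):
    """Return 1 for each row whose share of ones is at least num/den.

    Hypothetical helper: the comparison sum/len >= num/den is rewritten
    as sum * den >= len * num so that no floating point is needed.
    """
    return [1 if sum(row) * den >= len(row) * num else 0 for row in rows]

day_6 = predict_next_day(history)
print(day_6)
```

With the default num=1, den=2 this is the test_val = 0.5 rule from above. Keeping only a running count of ones per index, rather than the full history, would be enough to apply the same rule.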

1 Comment

Thanks! I have been thinking about something similar. It can be a struggle to pick a proper value for test_val though.
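One way to get a feel for how sensitive the result is to test_val is to sweep the threshold over the sample data and see where the prediction changes. The following is only a sketch under that assumption; the names counts, predict and min_ones (the threshold expressed as a minimum number of ones out of 5 days) are hypothetical and not part of the answer.

```python
# Same data as the question: rows are indexes, columns are day_1..day_5.
history = [
    [0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1],
    [0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

n_days = len(history[0])
counts = [sum(row) for row in history]  # number of ones seen per index

def predict(counts, min_ones):
    """Predict 1 where an index showed at least min_ones ones so far."""
    return [1 if c >= min_ones else 0 for c in counts]

# Try every possible threshold from 1 one up to n_days ones.
sweep = {k: predict(counts, k) for k in range(1, n_days + 1)}
for k, pred in sweep.items():
    print(k, pred)
```

On this sample, requiring 2 or 3 ones out of 5 gives the same prediction, which matches the desired day_6, while 1 and 4 do not. That suggests a fairly wide band of thresholds behaves identically here, although real data with more days and more noise may narrow it.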
