I have the following mock-up table:
#n a b group
1 1 1 1
2 1 2 1
3 2 2 1
4 2 3 1
5 3 4 2
6 3 5 2
7 4 5 2
I am using SAS for this problem. In the group column, rows that are interconnected through a and b are placed in the same group. Here is why these rows end up in the same group:
- rows 1 and 2 are in group 1 since they both have a = 1
- row 3 is in group 1 since b = 2 in both row 2 and row 3, and row 2 is in group 1
- rows 3 and 4 are in group 1 since a = 2 in both rows and row 3 is in group 1
The overall logic is that if row x has the same value of a or of b as row y, then row x belongs to the same group as row y. Following the same logic, rows 5, 6 and 7 are in group 2.
Is there any way to make an algorithm to find these groups?
Are a and b always increasing for each successive row? If yes, then Richard's answer will work, but if not then this is a much trickier problem that will involve making multiple passes through your data to identify connected components.
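To make the connected-components idea concrete, here is a minimal sketch of the grouping logic using a union-find (disjoint-set) structure. It is written in Python purely to illustrate the algorithm, not as SAS code, and the function name `assign_groups` is made up for this example. It assumes the data fits in memory and that groups are numbered in order of first appearance, which matches the mock-up table. Each row links its a value to its b value, so rows sharing either value end up in the same component, whether or not a and b are increasing.

```python
def assign_groups(rows):
    """rows: list of (a, b) pairs. Returns one group number per row."""
    parent = {}  # maps each node to its parent in the disjoint-set forest

    def find(x):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x

    # Tag values with their column so a = 1 and b = 1 are distinct nodes.
    for a, b in rows:
        union(("a", a), ("b", b))

    # Number the components in order of first appearance.
    labels = {}
    groups = []
    for a, _ in rows:
        root = find(("a", a))
        if root not in labels:
            labels[root] = len(labels) + 1
        groups.append(labels[root])
    return groups


# The mock-up table from the question, as (a, b) pairs.
rows = [(1, 1), (1, 2), (2, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
print(assign_groups(rows))  # [1, 1, 1, 1, 2, 2, 2]
```

Because the union step handles links in any order, this does not depend on the rows being sorted. A SAS version would need to reproduce the same find/union bookkeeping, for example with repeated passes over the data as described above.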