(The last) follow up question: Adjusting values in a table based on another table, Splitting multiple values among multiple values
I have these tables in R:
myt_with_counts <- data.frame(
name = c("a", "a", "a", "b", "b", "c", "c", "c", "d", "e", "e"),
var1 = c(120, 80, 100, 20, 30, 15, 25, 35, 40, 50, 60),
var2 = c(90, 60, 70, 25, 35, 20, 30, 40, 45, 55, 65),
count = c(10, 5, 15, 8, 12, 6, 9, 3, 7, 11, 4)
)
rel_table <- data.frame(
name = c("a", "b", "c"),
proportion = c(0.5, 0.4, 0.1)
)
My Question: I want to redistribute var1 and var2 values from group "a" to other groups based on specified proportions (50% stays, 40% to "b", 10% to "C"), where the amount removed from each "a" row and added to each receiving row is weighted by a count variable. The total sum of each value column remains unchanged after redistribution.
I really struggled to do this:
redistribute <- function(x, val.cols, name.col, count.col, prop, from) {
val.col <- match(val.cols, names(x))
x.names <- x[[name.col]]
prop.names <- prop[[1]]
rows <- match(prop.names, x.names)
from_totals <- colSums(x[x.names == from, val.col, drop=FALSE])
from_counts <- x[x.names == from, count.col]
total_from_counts <- sum(from_counts)
for(i in seq_along(prop.names)) {
group_name <- prop.names[i]
group_prop <- prop[[2]][i]
group_rows <- x.names == group_name
if(group_name == from) {
proportion_to_remove <- 1 - group_prop
amount_to_remove <- from_totals * proportion_to_remove
for(j in seq_along(val.col)) {
removal_per_count <- amount_to_remove[j] / total_from_counts
x[group_rows, val.col[j]] <- x[group_rows, val.col[j]] - from_counts * removal_per_count
}
} else {
group_counts <- x[group_rows, count.col]
total_group_counts <- sum(group_counts)
amount_to_add <- from_totals * group_prop
for(j in seq_along(val.col)) {
addition_per_count <- amount_to_add[j] / total_group_counts
x[group_rows, val.col[j]] <- x[group_rows, val.col[j]] + group_counts * addition_per_count
}
}
}
x
}
result <- redistribute(myt_with_counts, val.cols=c("var1", "var2"), name.col="name", count.col="count", prop=rel_table, from="a")