If I understand the problem correctly, it can be solved by a method called linear programming, using the R library "lpSolve":
library(lpSolve)

regression_1 <- function( data )
{
  n <- nrow(data)
  # Decision variables: u[1],...,u[n] (relative residuals), u[n+1] = a, u[n+2] = b.
  # Objective: minimize u[1]+...+u[n].
  L.obj <- c( rep(1,n), 0, 0 )
  # Constraints: u[i]*y[i] + a*x[i] + b >= y[i]  and  u[i]*y[i] - a*x[i] - b >= -y[i],
  # i.e. u[i]*y[i] >= abs(a*x[i]+b-y[i]) for each i.
  L.con <- rbind( cbind( diag(data$y),  data$x,  matrix(1,n,1) ),
                  cbind( diag(data$y), -data$x, -matrix(1,n,1) ) )
  L.rhs <- matrix( cbind( data$y, -data$y ), 2*n, 1 )
  L.dir <- rep(">=",2*n)
  # Note: lp() constrains all decision variables to be >= 0, so a and b are assumed non-negative here.
  M <- lp("min", L.obj, L.con, L.dir, L.rhs )
  a <- M$solution[n+1]
  b <- M$solution[n+2]
  return ( c(a,b) )
}
#--------------------------------------------------------------------
Error <- function( data, ab )
{
  # Relative absolute error: sum over i of abs((a*x[i]+b-y[i])/y[i]).
  ab <- unlist(ab)
  sum( abs((ab[1]*data$x+ab[2]-data$y)/data$y) )
}
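# Quick sanity check of Error on made-up data (illustration only):
# points lying exactly on y = 0.5*x + 2 give a relative error of 0 for (a,b) = (0.5, 2).
Error( data.frame( x = 1:5, y = 0.5*(1:5) + 2 ), c(0.5, 2) )   # 0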
#====================================================================
# Example:
# Simulated data: a line with slope 0.3 and intercept 3.0, perturbed by relative noise.
data.x <- 0:12
data.y <- (3.0+0.3*data.x) * (1+sample(-150:150,length(data.x),TRUE)/1000)
data   <- data.frame( x = data.x,
                      y = data.y )
ab <- regression_1(data)
# Check the result: evaluate Error on a grid of (a,b)-values around the LP solution
# and verify that no nearby pair beats it.
N   <- 30
eps <- (-N:N)/1000
neighborhood <- array( unlist(expand.grid(ab[1]+eps,ab[2]+eps)), c(2*N+1,2*N+1,2) )
E <- apply( neighborhood, c(1,2), function(ab_plus_eps){ Error(data,ab_plus_eps) } )
t(data)
min(E)
Error(data,ab)
ab
Let "n" be the number of rows in the data frame "data" and assume that
y[i] is the value measured at x[i] and that
y[i] is positive for each i. (If both positive and negative values were allowed,
the error function below would run into problems near 0.)
(So "x" and "y" correspond to "X1" and "X0" in the formulation of the question, respectively.)
The goal is to approximate "y" by a linear function with slope "a" and y-intercept "b".
More precisely, we want to minimize the error function
- Error(a,b) = sum(abs((a*x+b-y)/y)).
Our approach is to use linear programming. Define auxiliary variables "u[1],...,u[n+2]".
At the optimum, "u[i]" will be equal to "abs((a*x[i]+b-y[i])/y[i])" for each i<=n,
and "u[n+1]", "u[n+2]" will be equal to the optimal values of "a" and "b", respectively.
To achieve this,
- minimize the function "u[1]+...+u[n]" subject to the constraints
- u[i]*y[i] >= u[n+1]*x[i]+u[n+2]-y[i] and
- u[i]*y[i] >= -u[n+1]*x[i]-u[n+2]+y[i] for each i<=n.
Since y[i] > 0, the two constraints together say that u[i] >= abs(u[n+1]*x[i]+u[n+2]-y[i])/y[i].
Upon minimization of "u[1]+...+u[n]", "u[i]" is equal to "abs((u[n+1]*x[i]+u[n+2]-y[i])/y[i])"
for each i<=n; otherwise the value of "u[i]" could be decreased while keeping all other "u[j]"'s
fixed. Taking this into account, under the above constraints the function "u[1]+...+u[n]"
is minimal if "u[n+1]" and "u[n+2]" take the optimal values of "a" and "b", respectively.
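To see how these constraints turn into the matrices built in "regression_1", here is a small,
self-contained illustration with three made-up data points lying exactly on the line y = x+3
(so the expected solution is u[1]=u[2]=u[3]=0, a=1, b=3):
library(lpSolve)
toy <- data.frame( x = c(0,1,2), y = c(3,4,5) )
n <- nrow(toy)
L.obj <- c( rep(1,n), 0, 0 )                                    # minimize u[1]+u[2]+u[3]
L.con <- rbind( cbind( diag(toy$y),  toy$x,  matrix(1,n,1) ),   # u[i]*y[i] + a*x[i] + b >= y[i]
                cbind( diag(toy$y), -toy$x, -matrix(1,n,1) ) )  # u[i]*y[i] - a*x[i] - b >= -y[i]
L.rhs <- matrix( cbind( toy$y, -toy$y ), 2*n, 1 )
L.dir <- rep(">=",2*n)
L.con                                                           # one row per constraint; columns u[1..n], a, b
lp("min", L.obj, L.con, L.dir, L.rhs)$solution                  # expected: 0 0 0 1 3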
Here is the output of the example:
> t(data)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
x 0.000 1.0000 2.0000 3.0000 4.0000 5.0000 6.00 7.0000 8.0000 9.0000 10.000 11.0000 12.000
y 3.081 3.4353 3.2472 4.4772 3.7758 4.4055 5.04 5.5131 5.4378 5.5119 5.784 6.0102 5.907
> min(E)
[1] 0.6575712
> Error(data,ab)
[1] 0.6575712
> ab
[1] 0.2701 3.0810
For comparison:
> lm(data$y~data$x)
Call:
lm(formula = data$y ~ data$x)
Coefficients:
(Intercept) data$x
3.1741 0.2611
> Error(data,c(0.2611,3.1741))
[1] 0.67915
The results differ for two reasons:
- "lm" minimizes the squared distance between the regression line and the sampled data, not the absolute value of the distance.
- In the error terms used by "lm" there is no division by the "y"-values. (In particular, one doesn't have the problem near 0, mentioned above.)
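As an additional cross-check (just a sketch, not part of the derivation above; it assumes the
objects "data", "Error" and "ab" from the example are still available), one can minimize the
same relative absolute error with a general-purpose optimizer and compare with the LP solution:
# Nelder-Mead needs no derivatives, which is convenient since the objective
# is not differentiable everywhere; start from the lm() coefficients.
fit <- optim( par = c(0.2611, 3.1741),
              fn  = function(ab) Error(data, ab) )
fit$par     # should be close to the LP estimate "ab"
fit$value   # should be close to Error(data, ab)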