BACCO EMULATOR -- COMMENTS ## The following just describes how to do interpolation. ## You should look at the help information on emulator ## to see what other things it is possible to do. ## For example, the function estimator allows one to estimate ## each known data point using the other data points. This is ## useful for cross-validation. ## To begin, make sure you have loaded adapt, calibrator, emulator, and ## mvtnorm using the package manager. ## In BACCO emulator, the explanatory variables (training data) must be ## in a matrix. ## Each row of the matrix corresponds to a different observation. ## In this example I had to add the dim command to make sure ## x.training was in the correct format. If your data comes ## from a data frame it may already be in the correct format. ## Be sure to check this. x.training<-seq(0,1,0.1) dim(x.training) <- c(11,1) ## In BACCO, you must specify the linear model regression function. ## In computer experiments, it is customary to make this just a constant. ## This is done by defining the regression function to be 1. f<-function(x) { 1 } ## For this simple example, we generate the data according to a known function datagenerator <-function(x) { exp(-1.4*x)*cos((3.5*pi*x)) } ## In this simple example, it may be informative to plot this ## known function. In higher dimensional examples, this may not ## be useful. u<-seq(0,1,0.01) plot(u,datagenerator(u),type="l", ylab="y(x)",lwd=1.5) ## In BACCO, the response (training values) must be a column vector (or ## matrix with 1 column). Double check to see that this is the case. y.training <- datagenerator(x.training) ## BACCO will estimate the correlation parameters using the maximum of the ## posterior likelihood. ## BACCO refers to the correlation parameters as "scales". ## In particular, in estimating the correlation parameters ## BACCO uses a Gaussian correlation function, ## corr(x1 - x2) = exp(-(x1 - x2)'B(x1 - x2)), where ## B is a diagonal matrix whose diagonal entries are the scales. ## The function optimal.scales() computes the estimates. ## Optimal.scales works with the logarithm of the scales you input. ## The choice of starting values (scales.start) is important. ## A poor choice may produce a ## correlation matrix that is nearly singular. If this ## happens you will get an error message to this effect. Try ## changing the starting value for scales.start. ## The authors suggest using 1 for starting values. ## Another choice is to use values that are approximately equal ## to the reciprocal of the minimum distance between two points ## in x.training or to the reciprocal of the square of this distance. ## Note that when there is only a single explanatory variable, you may ## get a warning message. Optimal.scales uses the optim() function and ## this is intended for multi-dimensional optimization. It will do ## one-dimensional optimization, but R recommends using the function ## optimize() for one-dimensional problem. scales.optim <- optimal.scales(val = x.training, scales.start = 10, d = y.training, give=FALSE) ## Once you have estimated the scales, the estimated values can be use in ## interpolation using the function interpolant(). This function does ## prediction at a SINGLE new value (x.unknown) of the explanatory ## variable. It also estimates the following. ## 1. The MLE of the intercept (assuming you have used the intercept ## only approach). ## If you use a more general regression function, you get the estimates ## of all the regression coefficients). ## Denoted $betahat ## 2. The estimate for the prior value of the intercept. ## Denoted $prior ## 3. Posterior estimate for the variance of the est. of the intercept. ## Denoted $beta.var ## 4. Posterior estimate of the std. dev. of the est. of the intercept. ## Denoted $beta.marginal.sd ## 5. Posterior estimate of the process variance. ## Denoted $sigmahat.square ## 6. Posterior estimate of the predicted response at a new value for the ## explanatory variables. ## Denoted $mstar.star ## 7. The prior and posterior correlations of the point with itself. Denoted $cstar and $cstar.star ## 8. The standard deviation of the predicted value. Denoted $Z ## For example, suppose we want to predict at the following value of the explanatory ## (input) variables. x.unknown <- 0.05 newpred <- interpolant(x.unknown, y.training, x.training, scales = scales.optim, func = f, g = TRUE) ## Note that you could just use newpred <- interpolant(0.05, y.training, x.training, scales = scales.optim, func = f, g = TRUE) ## To get predictions and their std. errors for several values of the ## explanatory variables simultaneously, one must use the function ## interpolant.quick(). To use this, you must calculate the inverse of ## the estimated covariance matrix ## for the process, and give this as an argument to ## interpolant.quick. This can be done by ## 1. Using the scales computed from optimal.scales(). ## 2. The function corr.matrix ## 3. Using the function solve() to compute the inverse. ## For example, here is how this would work in our example. A <- corr.matrix(xold = x.training, scales = scales.optim) Ainv <- solve(A) ## The points at which new predictions are desired must be in a ## matrix whose rows are the points. x.unknown <- seq(0,1,0.01) dim(x.unknown) <- c(101,1) ## In the latest version, see the end of this document for ## a possible change in interpolant.quick that may be needed. newpreds <- interpolant.quick(x.unknown, y.training, x.training, Ainv, scales = scales.optim, func = f, give.Z = TRUE, power = 2) ## To put the predicted values in a nice format you can use inputs <- x.unknown predictions <- newpreds$mstar.star std.errs <- newpreds$Z results <- data.frame(inputs, predictions, std.errs) ## Now display them by typing results ## In this example we plotted the known function and it is ## interesting to overlay the predictions over the plot of the ## known function. lines(inputs, predictions, lty=2, lwd=1.5) ## One can also make a plot of the predicted values and overlay ## plots of +/- 2 standard errors on the plot. ## Note that the predictor given the data and the scales is t with ## n + r degrees of freedom, where n is the size of the training sample ## and r is a parameter in the conjugate prior. In the BACCO code one ## finds that the value of r used there is the number of explanatory ## variables used in the regression part of the model. For the ## intercept only model this means r = 0, so the number of ## degrees of freedom in the intercept only model is n. ## In this example the critical value for a t with 11 degrees of freedom ## for 95% confidence is 2.2, so this could be used instead of 2 below ## to give 95% bounds. dev.set(1) plot(inputs, predictions, type="l", lty = 2, lwd=1.5) lines(inputs, predictions + 2*std.errs, lty = 4, lwd=1.5) lines(inputs, predictions - 2*std.errs, lty = 4, lwd=1.5) A SECOND EXAMPLE data(toys) x.training <- D1.toy f<-function(x) 1 y.training <- y.toy scales.optim <- optimal.scales(val = x.training, scales.start = c(1,1,1,1,1), d = y.training, give=FALSE) x.unknown <- c(0.5,0.5,0.5,0.5,0.5) newpred <- interpolant(x.unknown, y.training, x.training, scales = scales.optim, func = f, g = TRUE) A <- corr.matrix(xold = x.training, scales = scales.optim) Ainv <- solve(A) x.unknown <- latin.hypercube(10, 5) newpreds <- interpolant.quick(x.unknown, y.training, x.training, Ainv, scales = scales.optim, func = f, give.Z = TRUE, power = 2) inputs <- x.unknown predictions <- newpreds$mstar.star std.errs <- newpreds$Z results <- data.frame(inputs, predictions, std.errs) results old interpolant.quick > interpolant.quick <- function (x, d, xold, Ainv, scales = NULL, pos.def.matrix = NULL, func = regressor.basis, give.Z = FALSE, power = 2) { if (is.null(scales) & is.null(pos.def.matrix)) { stop("need either scales or a pos.definite.matrix") } if (!is.null(scales) & !is.null(pos.def.matrix)) { stop("scales *and* pos.def.matrix supplied. corr() needs one only.") } if (is.null(pos.def.matrix)) { pos.def.matrix <- diag(scales, nrow = length(scales)) } betahat <- betahat.fun(xold, Ainv, d, func = func) H <- regressor.multi(xold, func = func) mstar.star <- rep(NA, nrow(x)) prior <- rep(NA, nrow(x)) if (give.Z) { Z <- rep(NA, nrow(x)) sigmahat.square <- sigmahatsquared(H, Ainv, d) } for (i in 1:nrow(x)) { hx <- func(x[i, ]) ## The next line seems to be the difference between the old and new ## versions tx <- apply(xold, 1, corr, x2 = x[i, ], pos.def.matrix = pos.def.matrix, power = power) prior[i] <- crossprod(hx, betahat) mstar.star[i] <- prior[i] + crossprod(crossprod(Ainv, tx), (d - H %*% betahat)) if (give.Z) { cstar.x.x <- 1 - quad.form(Ainv, tx) cstar.star <- cstar.x.x + quad.form.inv(quad.form(Ainv, H), hx - crossprod(H, crossprod(Ainv, tx))) Z[i] <- sqrt(abs(sigmahat.square * cstar.star)) } } if (give.Z) { return(list(mstar.star = mstar.star, Z = Z, prior = prior)) } else { return(mstar.star) } } Some output from the old code. Everything seems correct. > betahat <- betahat.fun(x.training, Ainv, y.training, func = f) > H <- regressor.multi(x.training, func = f) > mstar.star <- rep(NA, nrow(x.unknown)) > prior <- rep(NA, nrow(x.unknown)) > hx <- f(x.unknown[1, ]) > hx [1] 1 > pos.def.matrix <- diag(scales.optim, nrow = length(scales.optim)) > tx <- apply(x.training, 1, corr, x2 = x.unknown[1, ], pos.def.matrix = pos.def.matrix, + power = 2) > tx [1] 1.000000000 0.941380784 0.785346507 0.580614613 0.380404168 [6] 0.220868219 0.113645397 0.051820444 0.020940212 0.007498798 [11] 0.002379759 > betahat constant -0.8705626 > pos.def.matrix [,1] [1,] 6.040756 > > prior[1] <- crossprod(hx, betahat) > prior[1] [1] -0.8705626 > mstar.star[1] <- prior[1] + crossprod(crossprod(Ainv, + tx), (y.training - H %*% betahat)) > mstar.star[1] [1] 1 > crossprod(crossprod(Ainv, + + tx), (y.training - H %*% betahat)) [,1] [1,] 1.870563 > crossprod(Ainv,tx) [,1] [1,] 1.000000e+00 [2,] 4.975224e-11 [3,] -4.119515e-11 [4,] 2.613512e-10 [5,] 1.422429e-10 [6,] -1.561705e-10 [7,] 3.897401e-10 [8,] 2.026417e-10 [9,] 7.517307e-11 [10,] 6.853846e-11 [11,] -1.890064e-13 >