This implies that in simplifying the regression model we should never drop all individually insignificant variables at once, but should drop them one at a time.

In sequential F tests the variables are added sequentially and $\mathrm{SSR}(X_1)$, $\mathrm{SSR}(X_2 \mid X_1)$, $\mathrm{SSR}(X_3 \mid X_1, X_2)$, ... are calculated. A particular order for including the variables has to be specified. The contribution of the $i$th variable is then assessed by

$$F_i = \frac{\mathrm{SSR}(X_i \mid X_1, \ldots, X_{i-1})}{\mathrm{MSE}(X_1, \ldots, X_i)},$$

where $\mathrm{MSE}(X_1, \ldots, X_i)$ is the mean square error from the regression model that includes the variables $X_1, \ldots, X_i$.

In ... 6.2 we discussed the prediction from regression models when the parameters are known.
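The sequential sums of squares can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `sequential_f` is hypothetical, the model is assumed to contain an intercept, the extra sums of squares are obtained by Gram-Schmidt orthogonalization (one of several equivalent ways to compute them), and each F ratio is assumed to divide the extra sum of squares by the mean square error of the model fitted so far.

```python
def sequential_f(columns, y):
    """Return (extra SSR, F ratio) for each regressor, added in the given order.

    Assumes an intercept, no exactly collinear columns, and a non-perfect fit.
    """
    n = len(y)

    def dot(a, b):
        return sum(u * v for u, v in zip(a, b))

    basis = [[1.0] * n]                        # intercept column
    y_mean = sum(y) / n
    sse = sum((v - y_mean) ** 2 for v in y)    # SSE of the intercept-only model
    results = []
    for i, col in enumerate(columns, start=1):
        # Remove from the new column everything explained by earlier columns.
        q = [float(v) for v in col]
        for b in basis:
            coef = dot(q, b) / dot(b, b)
            q = [u - coef * v for u, v in zip(q, b)]
        extra_ssr = dot(q, y) ** 2 / dot(q, q)  # SSR(X_i | X_1, ..., X_{i-1})
        sse -= extra_ssr
        mse = sse / (n - i - 1)                 # MSE(X_1, ..., X_i)
        results.append((extra_ssr, extra_ssr / mse))
        basis.append(q)
    return results


# Illustrative data only (not from the text).
x1 = [-2, -1, 0, 1, 2]
x2 = [2, -1, -2, -1, 2]
y = [1, 2, 4, 3, 6]
for ssr, f in sequential_f([x1, x2], y):
    print(round(ssr, 4), round(f, 4))
```

Because each extra sum of squares is conditional on the variables entered before it, changing the order of `columns` generally changes the individual results, which is why an order has to be specified.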

The variances of the estimates are $V(\hat{\beta}_1) = V(\hat{\beta}_2) = \sigma^2 \dfrac{1}{1 - r_{12}^2}$, and $\mathrm{Corr}(\hat{\beta}_1, \hat{\beta}_2) = -r_{12}$. Thus, as claimed earlier, if $r_{12}$ approaches 1, the variances of the estimates will become very large and the estimates will be highly correlated. The estimates will become very imprecise. If multicollinearity is present, there is usually not enough information in the data to determine all parameters in the model. Only certain combinations of the parameters can be estimated. This can also be seen from the joint distribution of the parameter estimates.
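A small numerical sketch makes the effect concrete. It assumes two standardized regressors, so that the cross-product matrix is [[1, r12], [r12, 1]]; the function name `two_regressor_precision` and the default `sigma2=1.0` are illustrative choices, not taken from the text.

```python
def two_regressor_precision(r12, sigma2=1.0):
    """Variance of each estimate and their correlation, for |r12| < 1."""
    det = 1.0 - r12 ** 2                 # determinant of [[1, r12], [r12, 1]]
    inv = [[1.0 / det, -r12 / det],
           [-r12 / det, 1.0 / det]]      # inverse of the cross-product matrix
    variance = sigma2 * inv[0][0]        # sigma^2 / (1 - r12^2)
    correlation = inv[0][1] / (inv[0][0] * inv[1][1]) ** 0.5   # equals -r12
    return variance, correlation


# The variance grows without bound as r12 approaches 1.
for r in (0.0, 0.5, 0.9, 0.99):
    print(r, two_regressor_precision(r))
```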

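For the regression through the origin, the estimated variance of the forecast error is

$$\hat{V}(y_k - \hat{y}_k) = s^2 \left[ 1 + \frac{x_k^2}{\sum x_t^2} \right],$$

where $s^2 = \sum (y_t - \hat{\beta} x_t)^2 / (n - 1)$. A $100(1 - \alpha)$ percent prediction interval for the future $y_k$ is given by

$$\hat{\beta} x_k \pm t_{\alpha/2}(n - 1)\, s \left[ 1 + \frac{x_k^2}{\sum x_t^2} \right]^{1/2}.$$

The variance of the forecast error is smallest if $x_k = 0$. This is to be expected, since the model forces the regression line through the origin; at this point there is no uncertainty due to parameter estimation. For all other $x$'s, however, the prediction depends on the estimated slope $\hat{\beta}$.

For the model with an intercept, the estimated variance of the prediction error is

$$\hat{V}(y_k - \hat{y}_k) = s^2 \left[ 1 + \frac{1}{n} + \frac{(x_k - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right],$$

with $s^2$ the mean square error of this model. A $100(1 - \alpha)$ percent prediction interval can be calculated from

$$\hat{y}_k \pm t_{\alpha/2}(n - 2)\, s \left[ 1 + \frac{1}{n} + \frac{(x_k - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right]^{1/2}.$$

The variance of the prediction error is smallest if $x_k = \bar{x}$.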
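The dependence of the forecast-error variance on $x_k$ can be checked with a brief Python sketch. It assumes the standard results for a single regressor, with and without an intercept; the function names `forecast_error_factor` and `prediction_interval` are hypothetical. The t quantile is passed in by the caller because the Python standard library has no t distribution (it could come from a table, or from `scipy.stats.t.ppf` if SciPy is available).

```python
def forecast_error_factor(x, x_k, through_origin=False):
    """Multiplier of s^2 in the estimated forecast-error variance at x_k."""
    n = len(x)
    if through_origin:
        # Model y = beta * x + error: only the slope is estimated.
        return 1.0 + x_k ** 2 / sum(v ** 2 for v in x)
    # Model y = beta0 + beta1 * x + error.
    x_bar = sum(x) / n
    sxx = sum((v - x_bar) ** 2 for v in x)
    return 1.0 + 1.0 / n + (x_k - x_bar) ** 2 / sxx


def prediction_interval(point_forecast, s2, factor, t_quantile):
    """Interval: point forecast +/- t_quantile * sqrt(s2 * factor)."""
    half_width = t_quantile * (s2 * factor) ** 0.5
    return point_forecast - half_width, point_forecast + half_width


# Illustrative values only (not from the text).
x = [1, 2, 3, 4, 5]
print(forecast_error_factor(x, 0.0, through_origin=True))  # smallest at x_k = 0
print(forecast_error_factor(x, 3.0))                       # smallest at x_k = mean of x
print(forecast_error_factor(x, 5.0))                       # larger away from the mean
```

The factor never drops below 1, since the error of the future observation itself remains even where parameter uncertainty is at its minimum.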