1. Simple Linear Regression


{b, bse, bstan, bpval} = linreg (x, y {,opt {,om}})
estimates the coefficients b for a linear regression of y on x and computes the ANOVA table
gl = grlinreg (x {,col})
creates a plot object of the estimated regression line
b = gls (x, y {,om})
computes the generalized least squares estimator from the data x and y

In this section we consider the linear model

\begin{displaymath}Y=\beta_0+\beta_1 X + \varepsilon\,.\end{displaymath}

As an example, we use the Westwood data stored in westwood.dat. All XploRe code for this example can be found in regr1.xpl. First we read the data and take a look at them.

  z=read("westwood.dat")      ; reads the data

  z                           ; shows the data

  x=z[,2]                     ; puts the x-data into x

  y=z[,3]                     ; puts the y-data into y

gives as output

  Contents of z

  [ 1,]        1       30       73 

  [ 2,]        2       20       50 

  [ 3,]        3       60      128 

  [ 4,]        4       80      170 

  [ 5,]        5       40       87 

  [ 6,]        6       50      108 

  [ 7,]        7       60      135 

  [ 8,]        8       30       69 

  [ 9,]        9       70      148 

  [10,]       10       60      132

We use the quantlet linreg for simple linear regression. Since this quantlet returns four values, we should put them into four variables. We store them in {beta, bse, bstan, bpval}. Their meaning will be explained below.

  {beta,bse,bstan,bpval}=linreg(x,y)

           ; computes the linear regression and returns the
           ;    variables beta, bse, bstan and bpval

  beta     ; shows the value of beta

gives the output

  A  N  O  V  A            SS    df     MSS     F-test   P-value

  ______________________________________________________________

  Regression          13600.000   1 13600.000  1813.333   0.0000

  Residuals              60.000   8     7.500 

  Total Variation     13660.000   9  1517.778 

                                                     

  Multiple R      = 0.99780                          

  R^2             = 0.99561                          

  Adjusted R^2    = 0.99506                          

  Standard Error  = 2.73861                          

                                                     

                                                     

  PARAMETERS        Beta      SE     StandB     t-test   P-value

  ______________________________________________________________

  b[ 0,]=        10.0000    2.5029   0.0000      3.995   0.0040

  b[ 1,]=         2.0000    0.0470   0.9978     42.583   0.0000

and

  Contents of beta

  [1,]       10 

  [2,]        2

As a result, we obtain the ANOVA (ANalysis Of VAriance) table and the parameter estimates. The estimates $(\widehat\beta_0,
\widehat\beta_1)$ of the parameters $(\beta_0,\beta_1)$ are stored in beta[1] and beta[2]; in this example we obtain

\begin{displaymath}\widehat Y(x)=10+2x\,.\end{displaymath}
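The estimates can be checked outside XploRe with the closed-form simple OLS formulas $\widehat\beta_1=S_{xy}/S_{xx}$ and $\widehat\beta_0=\bar y-\widehat\beta_1\bar x$. A minimal sketch in Python/NumPy (not part of XploRe), using the Westwood data listed above:

```python
import numpy as np

# Westwood data (columns 2 and 3 of westwood.dat as shown above)
x = np.array([30, 20, 60, 80, 40, 50, 60, 30, 70, 60], dtype=float)
y = np.array([73, 50, 128, 170, 87, 108, 135, 69, 148, 132], dtype=float)

# closed-form simple OLS estimates
sxx = np.sum((x - x.mean()) ** 2)                # sum of squares of x
sxy = np.sum((x - x.mean()) * (y - y.mean()))    # cross products
beta1 = sxy / sxx                                # slope
beta0 = y.mean() - beta1 * x.mean()              # intercept

print(beta0, beta1)   # -> 10.0 2.0
```

This reproduces the values beta[1] = 10 and beta[2] = 2 returned by linreg.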

It is not necessary to display the values of beta, bse, bstan and bpval separately, because they already appear as Beta, SE, StandB and P-value in the parameter table created by linreg. Before turning to the graphics, we give a short overview of the values returned in the ANOVA and parameter tables. The meaning of the t- and p-values is more apparent in the multiple case; hence they are explained in the next section.
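All entries of the ANOVA and parameter tables follow from a few sums of squares. As a hedged sketch (standard textbook formulas, not the XploRe source), the table above can be reproduced in Python/NumPy:

```python
import numpy as np

x = np.array([30, 20, 60, 80, 40, 50, 60, 30, 70, 60], dtype=float)
y = np.array([73, 50, 128, 170, 87, 108, 135, 69, 148, 132], dtype=float)
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
beta0 = y.mean() - beta1 * x.mean()
yhat = beta0 + beta1 * x                 # fitted values

ssr = np.sum((yhat - y.mean()) ** 2)     # Regression SS:      13600
sse = np.sum((y - yhat) ** 2)            # Residuals SS:          60
sst = np.sum((y - y.mean()) ** 2)        # Total Variation SS: 13660
mse = sse / (n - 2)                      # residual MSS:         7.5
f = (ssr / 1) / mse                      # F-test:           1813.333
r2 = ssr / sst                           # R^2:              0.99561
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)   # Adjusted R^2:  0.99506
se = np.sqrt(mse)                        # Standard Error:   2.73861

se_b0 = np.sqrt(mse * (1 / n + x.mean() ** 2 / sxx))  # SE of b[0]: 2.5029
se_b1 = np.sqrt(mse / sxx)                            # SE of b[1]: 0.0470
t0, t1 = beta0 / se_b0, beta1 / se_b1                 # t-tests: 3.995, 42.583
```

Dividing each Beta by its SE gives the t-statistics printed in the parameter table.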

Let us now describe how to visualize these results. In the left window we show the regression result computed by linreg. In the right window we use the quantlet grlinreg to obtain the graphical object directly from the data set.


  yq=(beta[1]+beta[2]*x[1:10]) ; creates a vector with the 

                               ;    estimated values of y

  data=sort(x~y)               ; creates object with the data set

  setmaskp(data,1,11,4)        ; creates a graphical object for 

                               ;    the data points

  rdata=sort(x~yq)             ; creates an object with yq

  rdata=setmask(rdata,"reset","line","red","thin")

                               ; sets the options for the 

                               ;    regression function by linreg

  regrdata=grlinreg(data,4)    ; creates the same graphical 

                               ;    object directly from the data 

  regrdata=setmask(regrdata,"reset","line","red","thin")

                               ; sets options for the regression 

                               ;    function by grlinreg

  linregplot=createdisplay(1,2); creates display with 2 windows

  show(linregplot,1,1,data,rdata)

                               ; shows rdata in the 1st window

  show(linregplot,1,2,data,regrdata)

                               ; shows regrdata in the 2nd window

  setgopt(linregplot,1,1,"title","linreg")

                               ; sets the title of the 1st window

  setgopt(linregplot,1,2,"title","grlinreg")

                               ; sets the title of the 2nd window

Figure: Linear regression of the Westwood data: plot using the regression function computed by linreg (left) and plot using grlinreg (right). regr1.xpl
\includegraphics[scale=0.6]{linreg1.ps}

This produces the results shown in Figure 1. If we are only interested in a graphical exploration of the regression line, it suffices to create a plot of the regression function with grlinreg.

A second tool for our simple linear regression problem is the generalized least squares (GLS) method provided by the quantlet gls. Here we only consider the model

\begin{displaymath}Y=bX+\varepsilon\,.\end{displaymath}

We take the Westwood data again and assume that it has already been stored in x and y. Since the weight matrix om is omitted, the unit matrix is used as the weight matrix. This example is stored in regr2.xpl.


  b=gls(x,y)        ; computes the GLS fit and stores the

                    ;     coefficients in the variable b

  b                 ; shows b

shows

  Contents of b

  [1,]   2.1761

As a result, we get the parameter b. In our case we find that

\begin{displaymath}\widehat Y(x)=2.1761\,x\;.\end{displaymath}
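With the unit matrix as weight matrix, the GLS estimator $\widehat b=(X^\top \Omega^{-1} X)^{-1}X^\top \Omega^{-1} Y$ reduces to ordinary least squares through the origin, i.e. $\widehat b=\sum x_i y_i/\sum x_i^2$. A minimal Python/NumPy sketch (the helper gls_fit is hypothetical, not the XploRe quantlet):

```python
import numpy as np

x = np.array([30, 20, 60, 80, 40, 50, 60, 30, 70, 60], dtype=float)
y = np.array([73, 50, 128, 170, 87, 108, 135, 69, 148, 132], dtype=float)

def gls_fit(x, y, om=None):
    """GLS estimator b = (X' Om^-1 X)^-1 X' Om^-1 y for the model Y = bX + eps.

    om is the weight matrix; None means the unit matrix, in which case
    the fit is ordinary least squares through the origin."""
    X = x.reshape(-1, 1)
    if om is None:
        om = np.eye(len(x))
    om_inv = np.linalg.inv(om)
    b = np.linalg.solve(X.T @ om_inv @ X, X.T @ om_inv @ y)
    return b[0]

b = gls_fit(x, y)
print(round(b, 4))   # -> 2.1761
```

This reproduces the coefficient 2.1761 returned by gls for the Westwood data.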

Figure: Linear regression of the Westwood data: regression line using gls. regr2.xpl
\includegraphics[scale=0.6]{linreg2.ps}

Note that we obtain different results depending on the choice of method. This is not surprising, as gls fits the model without the intercept $\beta_0$. Now we also want to visualize this result.


  yq=b*x[1:10]                 ; creates a vector with the 

                               ;    estimated values

  data=sort(x~y)               ; creates object with the data set

  setmaskp(data,1,11,8)        ; creates graphical object 

                               ;    for the data

  rdata=sort(x~yq)             ; creates object with yq

  rdata=setmask(rdata,"reset","line","red","medium")

                               ; creates graphical object for yq

  glsplot=createdisplay(1,1)   ; creates display

  show(glsplot,1,1,data,rdata) ; shows the graphical objects

  setgopt(glsplot,1,1,"title","gls")

                               ; sets the window title



MD*TECH Method and Data Technologies
http://www.mdtech.de, mdtech@mdtech.de