3. Computing GPLM Estimates

Currently six types of distributions are supported by the gplm quantlib: Binomial, Normal (Gaussian), Poisson, Gamma (includes Exponential), Inverse Gaussian and Negative Binomial (includes Geometric). Table 2 summarizes the models which are available.

Table 2: Supported models.

  Distribution    Model Code   Link Function
  Gaussian        "noid"       identity link (canonical)
                  "nopow"      power link
  Binomial        "bilo"       logistic link (logit, canonical)
                  "bipro"      Gaussian link (probit)
                  "bicll"      complementary log-log link
  Poisson         "polog"      logarithmic link (canonical)
                  "popow"      power link
  Gamma           "gacl"       reciprocal link (canonical)
                  "gapow"      power link
  Inv. Gaussian   "igcl"       squared reciprocal link (canonical)
                  "igpow"      power link
  Neg. Binomial   "nbcl"       canonical link
                  "nbpow"      power link

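To make the link names in Table 2 concrete, the following minimal Python sketch writes out the fixed-form links as functions of the mean mu. It is an illustration only and not part of XploRe: the dictionary LINKS and the helper linear_predictor are hypothetical names, and only the keys reuse the model codes from the table. The power links and the negative binomial canonical link are left out because they depend on an additional parameter.

```python
import math
from statistics import NormalDist

# Link functions g(mu) for the fixed-form links named in Table 2.
# Keys reuse the model codes; the functions are illustrative Python,
# not XploRe code.
LINKS = {
    "noid":  lambda mu: mu,                             # identity
    "bilo":  lambda mu: math.log(mu / (1.0 - mu)),      # logit
    "bipro": lambda mu: NormalDist().inv_cdf(mu),       # probit
    "bicll": lambda mu: math.log(-math.log(1.0 - mu)),  # complementary log-log
    "polog": lambda mu: math.log(mu),                   # logarithm
    "gacl":  lambda mu: 1.0 / mu,                       # reciprocal
    "igcl":  lambda mu: 1.0 / mu ** 2,                  # squared reciprocal
}

def linear_predictor(code, mu):
    """Map a mean mu to the scale of the index via the link of model `code`."""
    return LINKS[code](mu)
```

For instance, linear_predictor("bilo", 0.5) evaluates the logit at a probability of one half.
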
The quantlet in the gplm quantlib which is mainly responsible for GPLM estimation is gplmest.


3.1 Estimation


g = gplmest(code, x, t, y, h {, opt})
estimates a GPLM

The quantlet gplmest provides a convenient way to estimate a GPLM. The standard call is quite simple, for example


  g=gplmest("bipro",x,t,y,h)

estimates a probit model (binomial with Gaussian cdf link). For gplmest, the short code of the model (here "bipro") must be given; it is the same short code that is used in the glm quantlib. In addition to the data, a bandwidth parameter h must be supplied, either as a vector whose length matches the dimension of t or as a scalar.

The result of the estimation is assigned to the variable g which is a list containing the following output:

g.b
the estimated parameter vector
g.bv
the estimated covariance of g.b
g.m
the estimated nonparametric function
g.stat
contains the statistics (see Section 5).

Recalling our credit scoring example from Subsection 2.2, the estimation with a logit link would be done as follows:


  t=log(t)               ; logs of amount and age
  trange=max(t)-min(t)
  t=(t-min(t))./trange   ; transformation to [0,1]
  library("gplm")

  h=0.4
  g=gplmest("bilo",x,t,y,h)
  g.b

gplm02.xpl

Now we can inspect the estimated coefficients in g.b:

  Contents of b
  [1,]  0.96516
  [2,]  0.74628
  [3,] -0.049835
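
The preprocessing of t in this example (logarithms followed by rescaling to [0,1]) can be sketched in plain Python as follows. The sketch assumes that the rescaling is applied separately to each column of t; the function name log_and_rescale and the data values are made up for illustration and are not the credit scoring data.

```python
import math

def log_and_rescale(rows):
    """Take logs of every entry, then rescale each column to [0, 1]."""
    logged = [[math.log(v) for v in row] for row in rows]
    cols = list(zip(*logged))
    lo = [min(c) for c in cols]               # column minima
    span = [max(c) - min(c) for c in cols]    # column ranges
    return [[(v - l) / s for v, l, s in zip(row, lo, span)]
            for row in logged]

# Made-up values standing in for (amount, age).
t = [[1000.0, 20.0], [4000.0, 40.0], [16000.0, 80.0]]
scaled = log_and_rescale(t)
```

After this step every column has minimum 0 and maximum 1, so a single bandwidth such as h=0.4 has a comparable meaning in both dimensions.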

A graphical output can be created by calling


  gplmout("bilo",x,t,y,h,g.b,g.bv,g.m,g.stat)

gplm02.xpl

for the current example (cf. Figure 1). For more features of gplmout see Subsections 4.7 and 5.2.

Figure 1: GPLM output display.

Optional parameters must be passed to gplmest as a list. A detailed description of the available options can be found in Section 4, which deals with the quantlet gplmopt. Set:


  opt=gplmopt("meth",1,"shf",1)
  opt=gplmopt("xvars",xvars,opt)
  opt=gplmopt("tg",grid(0|0,0.05|0.05,21|21),opt)

gplm03.xpl

This will create a list opt of optional parameters. In the first call, opt is created with the first component meth (estimation method) set to 1 (profile likelihood algorithm) and the second component shf (show iteration) set to 1 ("true"). In the second call, the variable names for the linear part of the model are appended to opt. Finally, a grid component tg (for the estimation of the nonparametric part) is defined.
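
As a rough illustration of the grid component, the Python sketch below builds a regular two-dimensional grid from an origin, a step width and a number of points per dimension. It assumes that these are the meanings of the three arguments in the grid call above; the helper make_grid is a hypothetical name, and the ordering of the points may differ from what XploRe produces.

```python
from itertools import product

def make_grid(origin, step, count):
    """All points origin[j] + i*step[j], i = 0..count[j]-1, per dimension j."""
    axes = [[o + i * s for i in range(n)]
            for o, s, n in zip(origin, step, count)]
    return [list(p) for p in product(*axes)]

# Origin (0,0), step 0.05 in both directions, 21 points per dimension.
tg = make_grid([0.0, 0.0], [0.05, 0.05], [21, 21])
```

Under this assumption the grid has 21 x 21 = 441 points and covers the unit square, which matches the range of t after the transformation to [0,1].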

We repeat the estimation with these settings:


  g=gplmest("bilo",x,t,y,h,opt)

gplm03.xpl

This call now uses the profile likelihood algorithm (in contrast to the default Speckman algorithm used in example gplm02.xpl), shows the iterations in the output window and estimates the function $m(\bullet)$ on the grid tg. The output g now contains one more element:
g.mg
the estimated nonparametric function on the grid

Since the nonparametric function $m(\bullet)$ is estimated on two-dimensional data, we can display a surface plot using the estimated function on the grid:


  library("plot")
  mg=setmask(sort(tg~g.mg),"surface")

gplm03.xpl

Figure 2 shows this surface together with a scatterplot of amount and age. The scatterplot shows that the big peak of $\widehat{m}(\bullet)$ is caused by only a few observations. For the complete XploRe code of this example, see the file gplm03.xpl.

Figure 2: Scatterplot for amount and age (left). Estimate $\widehat{m}$ (right).

The estimated coefficients are slightly different here, since we used the profile likelihood instead of the Speckman algorithm in this case. Figure 3 shows the output window for the second estimation.


3.2 Estimation in Expert Mode


g = gplmcore(code, x, t, y, h, wx, wt, wc, b0, m0, off, ctrl{, upb, tg, m0g})
estimates a GPLM in expert mode

The gplmcore quantlet is the innermost kernel of the GPLM estimation. It does not accept optional parameters in the usual form of an option list as described in Section 4, and it does not check for erroneous input. Hence, this routine is intended for use in expert mode. It speeds up computations and might be useful in simulations, in pilot estimation for other procedures or in Monte Carlo methods.

The following lines show how gplmcore could be used in our running example. Note that all data needs to be sorted by the first column of t.


  n=rows(x)
  p=cols(x)
  q=cols(t)

  tmp=sort(t~y~x)      ; sort data by first column of t
  t=tmp[,(1:q)]
  y=tmp[,(q+1)]
  x=tmp[,(q+2):cols(tmp)]

  shf   =  1      ; show iteration (1="true")
  miter = 10      ; maximal number of iterations
  cnv   =  0.0001 ; convergence criterion
  fscor =  0      ; Fisher scoring (1="true")
  pow   =  0      ; power for power link (if useful)
  nbk   =  1      ; k for neg. binomial (if useful)
  meth  =  0      ; algorithm ( -1 = backfitting,
                  ;              0 = Speckman
                  ;              1 = profile likelihood )
  ctrl=shf|miter|cnv|fscor|pow|nbk|meth

  wx    = 1       ; prior or frequency weights
  wt    = 1       ; trimming weights for estimation of b
  wc    = 1       ; weights for the convergence criterion
  off   = 0       ; offset

  l=glmcore("bilo",x~t~matrix(n),y,wx,off,ctrl[1:6])
  b0=l.b[1:p]
  m0=l.b[p+q+1]+t*l.b[(p+1):(p+q)]

  h=0.4|0.4
  g=gplmcore("bilo",x,t,y,h,wx,wt,wc,b0,m0,off,ctrl)

gplm04.xpl
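
Two data-handling steps in the listing above, the joint sort by the first column of t and the split of the parametric GLM coefficients into starting values b0 and m0, can be sketched in plain Python. The sketch is only an illustration: the function names and the toy data are made up, and the coefficient order (x-part, t-part, constant) is taken from the column order x~t~matrix(n) in the glmcore call.

```python
def sort_jointly(t, y, x):
    """Sort the rows of t, y and x together by the first column of t."""
    order = sorted(range(len(t)), key=lambda i: t[i][0])
    return ([t[i] for i in order],
            [y[i] for i in order],
            [x[i] for i in order])

def split_start_values(coef, t, p, q):
    """Split GLM coefficients ordered as (x-part, t-part, constant).

    Returns b0 (the first p coefficients) and m0 (the constant plus the
    linear function of t), with one m0 value per observation.
    """
    b0 = coef[:p]
    gamma = coef[p:p + q]
    const = coef[p + q]
    m0 = [const + sum(tj * gj for tj, gj in zip(row, gamma)) for row in t]
    return b0, m0

# Made-up toy data with p = 1 and q = 2.
t = [[0.9, 0.1], [0.2, 0.5], [0.5, 0.3]]
y = [1, 0, 1]
x = [[3.0], [1.0], [2.0]]
t, y, x = sort_jointly(t, y, x)
b0, m0 = split_start_values([0.5, 1.0, 2.0, -1.0], t, p=1, q=2)
```

Sorting through a shared index keeps each response and each row of x attached to its own row of t, which is what the single sort on the combined matrix t~y~x achieves in the XploRe code.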

Optionally, gplmcore can estimate the function $m(\bullet)$ on a grid, if tg and m0g are given. In addition, gplmcore can also be used to compute the biased parametric estimate which is needed for the specification test in Subsection 5.3. In this case the optional parameter upb should be set to 0 (default is 1).


