Quantlets are constructed via the XploRe editor. One opens the editor via the New item from the Programs menu. A blank window appears with the name noname.xpl.
Let us construct an example which lets us study the effect of outliers on linear least squares regression. First we generate a regression line of n points by the commands
    n = 10                          ; number of observations
    randomize(17654321)             ; sets random seed
    beta = #(1, 2)                  ; defines intercept and slope
    x = matrix(n)~sort(uniform(n))  ; creates design matrix
    m = x*beta                      ; defines regression line
It is now about time to save this file into a user owned file. Let us call this file myquant.xpl. We save the file by the command <Ctrl A> or by selecting the option Save as under Programs. We enter the text myquant (the ending ".xpl" is added automatically). The editor window now changes its name. Typically, it is called C:\XploRe\myquant.xpl.
We execute the quantlet by clicking the Execute item or by entering <Alt E>. Entering m on the action line in the input window yields the following 10 points:
    Contents of m
    [ 1,]   1.0858
    [ 2,]   1.693
    [ 3,]   1.8313
    [ 4,]   2.1987
    [ 5,]   2.2617
    [ 6,]   2.3049
    [ 7,]   2.3392
    [ 8,]   2.6054
    [ 9,]   2.6278
    [10,]   2.8314
The vector m contains the values of the regression line 1 + 2x. Let's now add some noise to the regression line and produce a plot of the data. We add the following lines and obtain a picture of the 10 data points that are scattered around the line m(x) = 1 + 2x. For the plot we extract only the second column of the design matrix x, since the first column of x is a column of ones that models the constant intercept term of the regression line:
    eps = 0.05*normal(n)    ; create obs error
    y = m + eps             ; noisy line
    d = createdisplay(1,1)
    dat = x[,2]~y
    show(d, 1, 1, dat)
We obtain the following plot:
Let's now add the true regression line and the least squares estimated regression line to this plot. We use the command setmaskl to define the line mask and the command setmaskp to define the point mask. Using the command
    tdat = x[,2]~m
we define the matrix tdat containing the true regression line. The command
    setmaskl(tdat, (1:rows(tdat))', 1, 1, 1)
connects all points (2nd parameter) of tdat and defines a blue (3rd parameter: color code = 1) solid (4th parameter: type code = 1) line with a certain thickness (5th parameter: thickness code = 1). The command
    setmaskp(tdat, 0, 0, 0)
sets the data points to their minimum size 0, i.e. makes them invisible.
    tdat = x[,2]~m
    setmaskl(tdat, (1:rows(tdat))', 1, 1, 1)  ; thin blue line
    setmaskp(tdat, 0, 0, 0)                   ; reduces point size to min
    beta1 = inv(x'*x)*x'*y                    ; computes LS estimate
    yhat = x*beta1
    hdat = x[,2]~yhat
    setmaskl(hdat, (1:rows(hdat))', 4, 1, 3)  ; thick red line
    setmaskp(hdat, 0, 0, 0)
    show(d, 1, 1, dat, tdat, hdat)
The true regression line is displayed as a thin blue line and the
estimated regression line is shown as a thick red line.
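The line beta1 = inv(x'*x)*x'*y in the listing is the usual least squares estimate written in matrix form. As a reading aid, it can be written out as follows; the symbols X, \hat{\beta} and \hat{y} are notation introduced here for x, beta1 and yhat and do not appear in the original listing:

```latex
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y,
\qquad
X = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix},
\qquad
\hat{y} = X \hat{\beta}
```

The first component of \hat{\beta} estimates the intercept and the second the slope, which is why the fitted values are plotted against the second column of the design matrix.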
In order to create a quantlet we encapsulate these commands into a proc-endp bracket. This way we can call the quantlet from the action line once it is loaded into XploRe. We add the line
    proc() = myquant()
as the first line, indent all following commands by the Tab key or by the Format source command in the Tools menu and add as a last line the word
    endp
Altogether we should have the following in the editor window. We save this quantlet by <Ctrl S> or by passing through the Programs menu item:
    proc() = myquant()
      n = 10                          ; number of observations
      randomize(17654321)             ; sets random seed
      beta = #(1, 2)                  ; defines intercept and slope
      x = matrix(n)~sort(uniform(n))  ; creates design matrix
      m = x*beta                      ; defines regression line
      eps = 0.05*normal(n)            ; creates obs error
      y = m + eps                     ; noisy line
      d = createdisplay(1,1)
      dat = x[,2]~y
      tdat = x[,2]~m
      setmaskl(tdat, (1:rows(tdat))', 1, 1, 1)  ; thin blue line
      setmaskp(tdat, 0, 0, 0)                   ; reduces point size to min
      beta1 = inv(x'*x)*x'*y
      yhat = x*beta1
      hdat = x[,2]~yhat
      setmaskl(hdat, (1:rows(hdat))', 4, 1, 3)  ; thick red line
      setmaskp(hdat, 0, 0, 0)
      show(d, 1, 1, dat, tdat, hdat)
    endp
If we execute this program code via the Execute item, nothing will happen since the code contains only the definition of a quantlet. The quantlet performs the desired action only if it is called. By entering the command
    myquant()
on the action line in the input window, we obtain the same picture as before. The quantlet myquant is now loaded in XploRe, and we can repeat this plot as many times as we want.
Let's now modify the quantlet so that the user of this quantlet may add another observation to the existing 10 observations. This additional observation will be the outlier whose influence on least squares regression we wish to study. We do this by allowing myquant to process an input parameter obs1 containing the coordinates (x, y) of an additional observation. We change the first line of myquant to proc() = myquant(obs1) and add the lines
    // new x-observation is
    x = x|(1~obs1[1])
after the creation of the original design matrix. The first line is a comment, and the second line adds the x-coordinate of the new observation obs1 to the design matrix. Note that we also added a 1 to the first column of the design matrix in order to correctly reflect the intercept term in the least squares regression also for this 11th observation.
The second modification is given by the following two lines:
    // new y-observation
    y = m[1:n] + eps        ; noisy line
    y = y|obs1[2]
The first line is again a comment, and the second line adds the normal errors eps to the first n values of m. The third line appends the y-value of the new observation obs1 to the response values.
We also display the outlying observation in a different way than the other observations:
    outl = obs1[1]~obs1[2]
    setmaskp(outl,4,12,8)
    show(d, 1, 1, dat[1:n], outl, tdat, hdat)
The setmaskp command gives the outlier its own point mask, so that it is drawn differently from the other data points. Finally, the command
    setgopt(d,1,1,"title","Least squares regression with outlier")
sets the title of the display.
If we now save and execute the quantlet, we have it ready in XploRe to be called from the action line:
    proc() = myquant(obs1)
      n = 10                          ; number of observations
      randomize(17654321)             ; sets random seed
      beta = #(1, 2)                  ; defines intercept and slope
      x = matrix(n)~sort(uniform(n))  ; creates design matrix
      // new x-observation is
      x = x|(1~obs1[1])
      m = x*beta                      ; defines regression line
      eps = 0.05*normal(n)            ; creates obs error
      // new y-observation
      y = m[1:n] + eps                ; noisy line
      y = y|obs1[2]
      d = createdisplay(1,1)
      dat = x[,2]~y
      outl = obs1[1]~obs1[2]
      setmaskp(outl,0,12,8)           ; outlier is black star
      tdat = x[,2]~m
      setmaskl(tdat, (1:rows(tdat))', 1, 1, 1)  ; thin blue line
      setmaskp(tdat, 0, 0, 0)                   ; reduces point size to min
      beta1 = inv(x'*x)*x'*y
      yhat = x*beta1
      hdat = x[,2]~yhat
      setmaskp(hdat, 0, 0, 0)
      setmaskl(hdat, (1:rows(hdat))', 4, 1, 3)  ; thick red line
      show(d, 1, 1, dat[1:n], outl, tdat, hdat)
      title = "Least squares regression with outlier"
      setgopt(d,1,1,"title",title)    ; sets title
    endp
By entering
    myquant(#(0.9,4.5))
from the action line, we obtain the following graphic which shows the effects of this outlier on the least squares regression:
One clearly sees the nonrobustness of the least squares estimator. The additional observation (0.9, 4.5) influences the estimated regression line. The thick red line is different from the true regression line indicated as the thin blue line.
The situation becomes even more extreme when we move the x-value of the new observation into the leverage zone outside the interval [0,1]. Suppose that we call the quantlet with the new observation (2.3, 45). The x-value of this new observation is clearly outside the range [0,1] of the 10 uniformly generated design values. The y-value 45 of the new observation is enormous relative to the range of the other 10 values.
    myquant(#(2.3,45))
The effect is that the thick red line moves even farther away from the blue line. This becomes clear from the following graphic:
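As a rough numerical cross-check of the two outlier experiments above, the following Python sketch (not XploRe code) refits a least squares line after adding each extra observation. It makes two simplifying assumptions: the x-values are recovered from the printed values of m via x = (m - 1)/2, and the noise term eps is left out because the XploRe random numbers cannot be reproduced here. The helper name fit_line is hypothetical.

```python
# Hypothetical Python sketch (not XploRe): effect of one extra observation
# on a least squares line fitted to points lying exactly on 1 + 2x.

def fit_line(points):
    """Return (intercept, slope) of the least squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Values of m printed earlier in the text; since m = 1 + 2x, x = (m - 1) / 2.
m = [1.0858, 1.693, 1.8313, 2.1987, 2.2617,
     2.3049, 2.3392, 2.6054, 2.6278, 2.8314]
clean = [((mi - 1) / 2, mi) for mi in m]  # assumption: noise eps omitted

base = fit_line(clean)                    # no outlier
mild = fit_line(clean + [(0.9, 4.5)])     # outlier with x inside [0, 1]
lever = fit_line(clean + [(2.3, 45.0)])   # outlier in the leverage zone

for label, (b0, b1) in [("no outlier", base),
                        ("(0.9, 4.5)", mild),
                        ("(2.3, 45)", lever)]:
    print(f"{label:>12}: intercept = {b0:.3f}, slope = {b1:.3f}")
```

Without the extra observation the fit returns the true line 1 + 2x up to rounding. Each added point pulls the slope upward, and the point outside [0,1] does so far more strongly, which matches the behaviour of the thick red line described in the text.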
We may now leave XploRe and recall the quantlet when we restart XploRe. Suppose we have done this. How do we call our quantlet again? Assuming that XploRe is installed in the C:\XploRe directory of our computer, we use the command
    func("C:\XploRe\myquant.xpl")
Let us now do this loading of the quantlet automatically by defining a new quantlet myquant2.xpl. This quantlet contains the func command and a call to the quantlet myquant by e.g.
    myquant(#(0.9, 4.5))
If we encapsulate the code into a proc-endp bracket, we have the following quantlet
    proc()=myquant2()
      myquant(#(0.9, 4.5))
    endp
    func("C:\XploRe\myquant.xpl")
    myquant2()
Executing it will reproduce the same picture as before, but note that this time the call to myquant is done from another quantlet. Let us modify this procedure further so that the user may add outlying observations interactively. Suppose we want to see the effect of adding a new observation three times. We realize this by a while-endo loop:
    proc()=myquant2()
      ValueNames = "x=" | "y="
      defaults = 0.9 | 4.5
      i = 1
      while (i<=3)
        v = readvalue(ValueNames, defaults)
        myquant(v)
        i = i+1
      endo
    endp
    func("C:\XploRe\myquant.xpl")
    myquant2()
The new quantlet myquant2.xpl first loads the existing quantlet myquant.xpl by the command func("C:\XploRe\myquant.xpl"). The names of the values to be read are defined by ValueNames = "x=" | "y=", the default values are set to (0.9,4.5) by defaults = 0.9 | 4.5. Then the loop construction with initial value i = 1 and end value 3 guarantees that the commands
    v = readvalue(ValueNames, defaults)
    myquant(v)
are executed exactly three times. If we enter for example (10,45) as the outlier value then we obtain the following graphic:
The while-endo construction is explained in more detail in Subsection 2.4. For more information on readvalue see Section 3.
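For readers who think more easily in a general-purpose language, the control flow of myquant2 can be mimicked in Python. This is only an analogy under stated assumptions: read_value and run_three_times are hypothetical names, the rule that an empty reply keeps the default is an assumption of this sketch rather than documented behaviour of readvalue, and canned replies replace the interactive input so that the example runs unattended.

```python
# Hypothetical Python analogy of the while-endo loop in myquant2 (not XploRe).

def read_value(names, defaults, reply=input):
    """Ask for each named value; an empty reply keeps the default (assumption)."""
    values = []
    for name, default in zip(names, defaults):
        text = reply(f"{name} [{default}] ").strip()
        values.append(float(text) if text else default)
    return values

def run_three_times(action, reply=input):
    """Call action with a freshly read (x, y) pair exactly three times."""
    i = 1
    while i <= 3:                 # same counter pattern as while (i<=3) ... endo
        action(read_value(["x=", "y="], [0.9, 4.5], reply))
        i += 1

# Non-interactive demonstration: canned replies stand in for user input.
canned = iter(["", "", "10", "45", "2.3", ""])
seen = []                         # records the pairs passed to the action
run_three_times(seen.append, reply=lambda prompt: next(canned))
print(seen)
```

In the XploRe quantlet the action is the call myquant(v); here seen.append merely records each pair so that the three passes through the loop can be inspected.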
MD*TECH Method and Data Technologies
http://www.mdtech.de mdtech@mdtech.de