Library: xplore
See also: sort cumsum paf diff

Quantlet: discrete
Description: Reduces a matrix to its distinct rows and gives the number of replications of each row in the original dataset. If an optional second matrix y is given, the rows of y that belong to the same distinct row of x are summed.

Usage: {xr,yr} = discrete(x{,y})
Input:
x n x p matrix, the data matrix to reduce; in regression, usually the design matrix. The matrix may be numeric or string. If x is a string matrix, y cannot be given.
y optional, n x q matrix; in regression, usually the observations of the dependent variable. Not possible if x is a string matrix.
Output:
xr m x p matrix, reduced data matrix (sorted).
yr m x 1 vector or m x (q+1) matrix. The first column contains the number of replications of each row of xr. If y was given, the other q columns of yr contain the sums of the y-rows that share the same x-row.
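
The reduction described above can be sketched in Python (standard library only). This is an illustration of the documented behaviour, not the XploRe implementation: the function below merely reuses the quantlet's name, and the lexicographic sort order of xr is an assumption, since the documentation only says that xr is "sorted".

```python
def discrete(x, y=None):
    """Illustrative sketch: reduce x to its distinct rows.

    x : list of n rows (each a list of p values)
    y : optional list of n rows (each a list of q numbers)
    Returns (xr, yr): xr holds the distinct rows of x in sorted order
    (lexicographic order is an assumption); each row of yr starts with
    the replication count, followed by the q column sums of y if given.
    """
    q = len(y[0]) if y is not None else 0
    acc = {}
    for i, row in enumerate(x):
        # One accumulator per distinct row: [count, sum_1, ..., sum_q]
        entry = acc.setdefault(tuple(row), [0] + [0.0] * q)
        entry[0] += 1
        for j in range(q):
            entry[j + 1] += y[i][j]
    keys = sorted(acc)
    xr = [list(k) for k in keys]
    yr = [acc[k] for k in keys]
    return xr, yr


# Small usage example with made-up data.
x_demo = [[1, 2], [0, 1], [1, 2], [0, 1], [1, 2]]
y_demo = [[1.0], [2.0], [3.0], [4.0], [5.0]]
xr_demo, yr_demo = discrete(x_demo, y_demo)
print(xr_demo)  # [[0, 1], [1, 2]]
print(yr_demo)  # [[2, 6.0], [3, 9.0]]
```

Without y, each row of yr contains only the replication count, which corresponds to the m x 1 case.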

Example:

library("xplore")
n=100
b=1|2
x=ceil(normal(n,rows(b)))
y=x*b + normal(n)

; --------------------------------------
;  data reduction
; --------------------------------------
{xr,yr}=discrete(x,y)
r =yr[,1]
yr=yr[,2]
rows(r)

; --------------------------------------
;  descriptive statistics of x
; --------------------------------------
meanxr = sum(r.*xr)/sum(r)
varxr  = sum(r.*(xr-meanxr)^2)/(sum(r)-1)
mean(x)'~meanxr'
var(x)'~varxr'

; --------------------------------------
;  linear regression
; --------------------------------------
b=inv(x'*x)*x'*y
br=inv(xr'*diag(r)*xr)*xr'*yr
b~br

Result:

Matrices x and y with 100 rows are reduced to a matrix xr (containing the distinct rows of x) and yr (the sums of y over identical rows of x). r gives the number of replications. The mean and variance of x coincide with the mean and variance of xr weighted by r. The linear regression of y on x coincides with the weighted regression on the reduced data, in which xr'*diag(r)*xr replaces x'*x and the summed responses yr replace y.
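
The equivalences stated in the result can be checked numerically. The following Python sketch (standard library only) does not reproduce the XploRe example exactly: the data generator, the seed, and the helper names `solve2` and `normal_eq` are assumptions made for illustration. It compares the full-data statistics with their counterparts computed from the reduced data.

```python
import random
from collections import defaultdict

random.seed(0)
n = 200
# Made-up data: two integer-valued regressors, so that rows repeat often.
x = [[random.randint(0, 2), random.randint(0, 2)] for _ in range(n)]
y = [1.0 * a + 2.0 * b + random.gauss(0, 1) for a, b in x]

# Reduction: replication count r and sum of y for each distinct row of x.
count = defaultdict(int)
ysum = defaultdict(float)
for row, yi in zip(x, y):
    count[tuple(row)] += 1
    ysum[tuple(row)] += yi
xr = sorted(count)
r = [count[k] for k in xr]
yr = [ysum[k] for k in xr]

# Mean and variance of x versus weighted mean and variance of xr.
mean_x = [sum(row[j] for row in x) / n for j in range(2)]
mean_xr = [sum(w * k[j] for k, w in zip(xr, r)) / sum(r) for j in range(2)]
var_x = [sum((row[j] - mean_x[j]) ** 2 for row in x) / (n - 1)
         for j in range(2)]
var_xr = [sum(w * (k[j] - mean_xr[j]) ** 2 for k, w in zip(xr, r))
          / (sum(r) - 1) for j in range(2)]


def solve2(a, c):
    """Solve the 2x2 linear system a * beta = c by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(c[0] * a[1][1] - a[0][1] * c[1]) / det,
            (a[0][0] * c[1] - a[1][0] * c[0]) / det]


def normal_eq(rows, weights, resp):
    """Return X'WX (weights on the diagonal of W) and X'resp."""
    xtx = [[sum(w * row[i] * row[j] for row, w in zip(rows, weights))
            for j in range(2)] for i in range(2)]
    xty = [sum(row[i] * v for row, v in zip(rows, resp)) for i in range(2)]
    return xtx, xty


# Full regression: inv(x'x) x'y.  Reduced: inv(xr' diag(r) xr) xr'yr.
b_full = solve2(*normal_eq(x, [1] * n, y))
b_red = solve2(*normal_eq(xr, r, yr))

print(mean_x, mean_xr)
print(var_x, var_xr)
print(b_full, b_red)
```

The weights enter only the cross-product matrix. Because yr already contains sums over the replicated rows, xr'*yr equals x'*y without further weighting.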


Author: Thomas Koetter, Marlene Mueller, 970325
(C) MD*TECH Method and Data Technologies, 21.9.2000