4. Example: Media


4.1 Description of the Data Set

The data set comes from a survey in which 12,388 contacts with various media were identified (Lebart, L., Morineau, A., and Piron, M., 1995). The contacts are cross-tabulated by activity; the statistical units are the media contacts. They are also cross-tabulated with three supplementary variables: sex, age, and education level.

The active data are stored in the file m.dat, which contains eight activities (rows) and six media (columns):


    96      118        2       71       50       17

   122      136       11       76       49       41

   193      184       74       63      103       79

   360      365       63      145      141      184

   511      593       57      217      172      306

   385      457       42      174      104      220

   156      185        8       69       42       85

  1474     1931      181      852      642      782
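As a quick consistency check on the table above, the following Python sketch parses the whitespace-separated layout of m.dat and computes the grand total and the marginal relative frequencies. Python is used only for illustration and is not part of XploRe; the data are embedded as a string so that the block is self-contained, and all variable names are hypothetical.

```python
# Contents of m.dat as listed above: 8 activities (rows) x 6 media (columns).
M_DAT = """
  96  118    2   71   50   17
 122  136   11   76   49   41
 193  184   74   63  103   79
 360  365   63  145  141  184
 511  593   57  217  172  306
 385  457   42  174  104  220
 156  185    8   69   42   85
1474 1931  181  852  642  782
"""

# Parse the whitespace-separated rows into a list of integer lists.
table = [[int(x) for x in line.split()] for line in M_DAT.strip().splitlines()]

n = sum(sum(row) for row in table)                     # grand total of contacts
row_weights = [sum(row) / n for row in table]          # one weight per activity
col_weights = [sum(col) / n for col in zip(*table)]    # one weight per medium

print(len(table), len(table[0]), n)
print([round(w, 4) for w in row_weights])
print([round(w, 4) for w in col_weights])
```

The grand total should agree with the number of contacts quoted in the description, and the two weight vectors are the row and column relative weights that correspondence analysis uses as masses.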

The column labels are stored in the file mctxt.dat:

  RADIO

  TV

  N_NEWS

  R_NEWS

  MAGAZ

  TVMAG

The row labels are stored in the file mltxt.dat:

  la_Farmer

  s_busin

  h_manag

  i_manag

  empl

  skil

  unsk

  Nowork

Supplementary row data are stored in the file msl.dat:

  1630     1900      285      854      621      776

  1667     2069      152      815      683      938

   660      713       69      216      234      360

   640      719       84      230      212      380

   888     1000      130      429      345      466

   617      774       84      391      262      263

   491      761       70      402      251      245

   908     1307       73      642      360      435

   869     1008      107      408      336      494

   901     1035       80      140      311      504

   619      612      177      209      298      281

The eleven supplementary row labels are stored in the file msltxt.dat:

  MALE

  FEMALE

  A14-24

  A25-34

  A35-49

  A50-64

  A65+

  PRIMARY

  SECOND

  H_TECH

  UNIVER


4.2 Calling the Quantlet

The following code calls the quantlet corresp to analyze the data set m.dat.

  library("stats")

  corresp("m.dat","msl.dat","null","MEDIA","mltxt.dat",

                               "mctxt.dat","msltxt.dat","null")

corre02.xpl


4.3 Brief Interpretation

We obtain the following output.


  [1,] EIGENVALUES AND PERCENTAGES

 

  Contents of seig 

 

  [1,]   0.0139   62.1982   62.1982

  [2,]   0.0072   32.3650   94.5632

  [3,]   0.0008    3.7018   98.2650

  [4,]   0.0003    1.3638   99.6288

  [5,]   0.0001    0.3712  100.0000

The first two axes together account for 94.6% of the total inertia and clearly dominate the analysis. This cumulative percentage measures the share of information captured by the first two principal axes.
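The eigenvalues and percentages can be cross-checked directly from m.dat. The sketch below, in Python and outside XploRe, computes the total inertia (the chi-square statistic divided by the grand total) and the largest eigenvalue of the standardized residual matrix. Power iteration is used only to keep the sketch free of dependencies; it is an assumption of this example and not a description of how corresp works internally.

```python
# Active table (m.dat): 8 activities x 6 media.
N = [
    [96, 118, 2, 71, 50, 17],
    [122, 136, 11, 76, 49, 41],
    [193, 184, 74, 63, 103, 79],
    [360, 365, 63, 145, 141, 184],
    [511, 593, 57, 217, 172, 306],
    [385, 457, 42, 174, 104, 220],
    [156, 185, 8, 69, 42, 85],
    [1474, 1931, 181, 852, 642, 782],
]
n = sum(map(sum, N))
r = [sum(row) / n for row in N]          # row masses
c = [sum(col) / n for col in zip(*N)]    # column masses
I, J = len(r), len(c)

# Standardized residuals s_ij = (p_ij - r_i c_j) / sqrt(r_i c_j).
S = [[(N[i][j] / n - r[i] * c[j]) / (r[i] * c[j]) ** 0.5 for j in range(J)]
     for i in range(I)]

# Total inertia = sum of squared residuals = chi-square statistic / n.
total_inertia = sum(s * s for row in S for s in row)

# A = S'S; its eigenvalues are the principal inertias.
A = [[sum(S[i][j] * S[i][k] for i in range(I)) for k in range(J)]
     for j in range(J)]


def matvec(mat, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


# Power iteration for the largest eigenvalue of A.
v = [1.0] * J
for _ in range(200):
    w = matvec(A, v)
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]
lambda1 = sum(x * y for x, y in zip(v, matvec(A, v)))  # Rayleigh quotient

print(round(total_inertia, 4), round(lambda1, 4),
      round(100 * lambda1 / total_inertia, 1))
```

The ratio of the first eigenvalue to the total inertia is the percentage reported in the second column of seig for the first axis.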

The coordinates on the different axes, together with other indices that help to interpret the results, are shown in the following output, which also includes the coordinates and squared correlations of the supplementary items.


  [1,] Row relative weights and distances to the origin 

 

  Contents of spdai 

 

  [1,]   0.0286    0.0032

  [2,]   0.0351    0.0016

  [3,]   0.0562    0.0039

  [4,]   0.1015    0.0011

  [5,]   0.1498    0.0009

  [6,]   0.1116    0.0011

  [7,]   0.0440    0.0014

  [8,]   0.4732    0.0005



  [1,] Coordinates of the rows



  Contents of scoordi



  [1,]   -0.0015   -0.0028    0.0006    0.0001   -0.0002

  [2,]   -0.0006   -0.0013    0.0006   -0.0002    0.0002

  [3,]    0.0039   -0.0005    0.0000   -0.0002   -0.0001

  [4,]    0.0010    0.0003    0.0003    0.0002    0.0001

  [5,]   -0.0001    0.0009    0.0000    0.0002    0.0000

  [6,]   -0.0004    0.0009    0.0002   -0.0003    0.0000

  [7,]   -0.0011    0.0009    0.0004    0.0000   -0.0002

  [8,]   -0.0003   -0.0003   -0.0002    0.0000    0.0000

In the following output we note, for instance, that the relative frequency of national newspapers (N_NEWS, the third active column item) is very small (3.54%).

  [1,] Column relative weights and distances to the origin

 

  Contents of spdaj

 

  [1,]   0.2661    0.0005

  [2,]   0.3204    0.0005

  [3,]   0.0354    0.0049

  [4,]   0.1346    0.0014

  [5,]   0.1052    0.0015

  [6,]   0.1384    0.0015

 

  [1,] Coordinates of the columns



  Contents of scoordj



  [1,]    0.0001    0.0002    0.0004    0.0000    0.0000

  [2,]   -0.0005    0.0000   -0.0001   -0.0001   -0.0001

  [3,]    0.0049   -0.0001   -0.0002   -0.0004    0.0001

  [4,]   -0.0010   -0.0010    0.0000   -0.0001    0.0001

  [5,]    0.0009   -0.0012   -0.0002    0.0003    0.0000

  [6,]   -0.0001    0.0015   -0.0002    0.0001    0.0001

However, the distance of national newspapers (N_NEWS) to the origin is by far the largest among the columns (0.0049), which indicates that its profile over the activities is very specific. As a result, it contributes 74.6% to the construction of the first axis, as the following output shows. Geometrically, it lies very close to this axis (squared correlation 0.99).

  [1,] Contributions of the columns

 

  Contents of scontrj

 

  [1,]   0.4287    1.8037   70.3836    0.6207    0.1489

  [2,]   6.5641    0.0192   10.5160   13.2700   37.5915

  [3,]  74.5877    0.0189    1.8090   18.1763    1.8723

  [4,]  11.5011   22.4356    0.4460    7.5324   44.6282

  [5,]   6.8233   25.6080    4.4877   50.8035    1.7592

  [6,]   0.0950   50.1145   12.3576    9.5970   13.9999

 

  [1,] Squared correlations of the columns 

 

  Contents of scorrj 



  [1,]   0.0770    0.1685    0.7520    0.0024    0.0002

  [2,]   0.8508    0.0013    0.0811    0.0377    0.0291

  [3,]   0.9930    0.0001    0.0014    0.0053    0.0001

  [4,]   0.4866    0.4940    0.0011    0.0070    0.0113

  [5,]   0.3168    0.6186    0.0124    0.0517    0.0005

  [6,]   0.0035    0.9587    0.0270    0.0077    0.0031
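The distance of each column to the origin can be recomputed from m.dat as the chi-square distance between the column profile and the average profile. The following Python sketch does this for all six media. The last step, dividing by the square root of the grand total, is only an empirical observation: it happens to reproduce the value printed for N_NEWS in spdaj, but it is not a documented property of corresp.

```python
import math

LABELS = ["RADIO", "TV", "N_NEWS", "R_NEWS", "MAGAZ", "TVMAG"]

# Active table (m.dat): 8 activities x 6 media.
N = [
    [96, 118, 2, 71, 50, 17],
    [122, 136, 11, 76, 49, 41],
    [193, 184, 74, 63, 103, 79],
    [360, 365, 63, 145, 141, 184],
    [511, 593, 57, 217, 172, 306],
    [385, 457, 42, 174, 104, 220],
    [156, 185, 8, 69, 42, 85],
    [1474, 1931, 181, 852, 642, 782],
]
n = sum(map(sum, N))
r = [sum(row) / n for row in N]   # row masses = average column profile


def chi2_distance_sq(j):
    """Squared chi-square distance from column j's profile to the average profile."""
    total_j = sum(row[j] for row in N)
    return sum((row[j] / total_j - r_i) ** 2 / r_i for row, r_i in zip(N, r))


dist_sq = {lab: chi2_distance_sq(j) for j, lab in enumerate(LABELS)}
farthest = max(dist_sq, key=dist_sq.get)

# Assumed rescaling (an observation, not documented behaviour): distance / sqrt(n).
scaled = {lab: math.sqrt(d) / math.sqrt(n) for lab, d in dist_sq.items()}

for lab in LABELS:
    print(lab, round(dist_sq[lab], 4), round(scaled[lab], 4))
print("farthest from the origin:", farthest)
```

A medium with a small weight but a large distance, such as N_NEWS here, can still carry a large share of the inertia, since each column contributes its weight times its squared distance.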

The first axis is largely determined by the third active row item, high manager (h_manag), which contributes 75.0% to it, as the following output shows:

  [1,] Contributions of the rows

 

  Contents of scontri

 

  [1,]   5.6928   37.9892   17.8813    1.9590   15.8850

  [2,]   1.1848    9.9793   17.6701    4.7954   28.0180

  [3,]  74.9579    2.8872    0.0622    5.2257    8.5732

  [4,]   8.3279    1.4964   11.7552   21.4483   17.5522

  [5,]   0.2675   18.9376    0.4701   20.3081    2.1711

  [6,]   1.5383   15.9009    5.0508   46.0393    0.4038

  [7,]   4.4054    5.4906    8.4193    0.1767   26.8961

  [8,]   3.6255    7.3188   38.6910    0.0476    0.5005



  [1,] Squared correlations of the rows

 

  Contents of scorri

 

  [1,]   0.2135    0.7414    0.0399    0.0016    0.0036

  [2,]   0.1538    0.6742    0.1366    0.0137    0.0217

  [3,]   0.9782    0.0196    0.0000    0.0015    0.0007

  [4,]   0.8022    0.0750    0.0674    0.0453    0.0101

  [5,]   0.0252    0.9289    0.0026    0.0420    0.0012

  [6,]   0.1383    0.7437    0.0270    0.0907    0.0002

  [7,]   0.5557    0.3604    0.0632    0.0005    0.0202

  [8,]   0.3722    0.3910    0.2364    0.0001    0.0003
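The row contributions to the first axis can be reproduced from m.dat as the squared entries of the leading left singular vector of the standardized residual matrix. The Python sketch below obtains that vector by power iteration, which is an illustrative choice made to avoid external libraries and not a statement about the algorithm used by corresp.

```python
ROWS = ["la_Farmer", "s_busin", "h_manag", "i_manag",
        "empl", "skil", "unsk", "Nowork"]

# Active table (m.dat): 8 activities x 6 media.
N = [
    [96, 118, 2, 71, 50, 17],
    [122, 136, 11, 76, 49, 41],
    [193, 184, 74, 63, 103, 79],
    [360, 365, 63, 145, 141, 184],
    [511, 593, 57, 217, 172, 306],
    [385, 457, 42, 174, 104, 220],
    [156, 185, 8, 69, 42, 85],
    [1474, 1931, 181, 852, 642, 782],
]
n = sum(map(sum, N))
r = [sum(row) / n for row in N]          # row masses
c = [sum(col) / n for col in zip(*N)]    # column masses
I, J = len(r), len(c)

# Standardized residuals s_ij = (p_ij - r_i c_j) / sqrt(r_i c_j).
S = [[(N[i][j] / n - r[i] * c[j]) / (r[i] * c[j]) ** 0.5 for j in range(J)]
     for i in range(I)]

# B = S S' (8 x 8); its leading eigenvector is the first left singular vector.
B = [[sum(S[i][j] * S[k][j] for j in range(J)) for k in range(I)]
     for i in range(I)]

u = [1.0] * I
for _ in range(200):
    w = [sum(b * x for b, x in zip(row, u)) for row in B]
    norm = sum(x * x for x in w) ** 0.5
    u = [x / norm for x in w]

# Contribution of row i to axis 1, in percent: 100 * u_i^2.
contrib = {lab: 100 * x * x for lab, x in zip(ROWS, u)}
top = max(contrib, key=contrib.get)

for lab in ROWS:
    print(lab, round(contrib[lab], 2))
```

The contributions sum to 100 by construction, so a single row with a value near 75 dominates the axis.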


  [1,] SUPPLEMENTARY ITEMS

 

  [1,] Row relative weights and distances to the origin

 

  Contents of spdsl

 

  [ 1,]   0.1644    0.0006

  [ 2,]   0.1714    0.0006

  [ 3,]   0.0610    0.0012

  [ 4,]   0.0614    0.0012

  [ 5,]   0.0883    0.0004

  [ 6,]   0.0648    0.0010

  [ 7,]   0.0602    0.0016

  [ 8,]   0.1010    0.0015

  [ 9,]   0.0873    0.0004

  [10,]   0.0805    0.0024

  [11,]   0.0595    0.0026
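The relative weights printed for the supplementary rows appear to be computed relative to the grand total of the supplementary table msl.dat, not relative to the 12,388 active contacts. This is an inference from the numbers, not documented behaviour; the following Python sketch shows the arithmetic.

```python
SUP_LABELS = ["MALE", "FEMALE", "A14-24", "A25-34", "A35-49", "A50-64",
              "A65+", "PRIMARY", "SECOND", "H_TECH", "UNIVER"]

# Supplementary rows (msl.dat): 11 categories x 6 media.
SUP = [
    [1630, 1900, 285, 854, 621, 776],
    [1667, 2069, 152, 815, 683, 938],
    [660, 713, 69, 216, 234, 360],
    [640, 719, 84, 230, 212, 380],
    [888, 1000, 130, 429, 345, 466],
    [617, 774, 84, 391, 262, 263],
    [491, 761, 70, 402, 251, 245],
    [908, 1307, 73, 642, 360, 435],
    [869, 1008, 107, 408, 336, 494],
    [901, 1035, 80, 140, 311, 504],
    [619, 612, 177, 209, 298, 281],
]

sup_total = sum(map(sum, SUP))   # total over all three supplementary variables
weights = {lab: sum(row) / sup_total for lab, row in zip(SUP_LABELS, SUP)}

for lab in SUP_LABELS:
    print(lab, round(weights[lab], 4))
```

Because sex, age, and education each partition roughly the same set of contacts, the supplementary total is about three times the active total, which is why MALE has a weight near 0.16 and not near 0.49.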

The eleventh supplementary row item, university education (UNIVER), is closely linked to factor 1 (squared correlation 0.99); see the following output:

  [1,] Squared correlations of the rows

 

  Contents of scontrsi

 

  [ 1,]   0.4813    0.1104    0.0215    0.3239    0.0629

  [ 2,]   0.4910    0.1025    0.0213    0.3261    0.0591

  [ 3,]   0.0150    0.5609    0.0762    0.2102    0.1377

  [ 4,]   0.0542    0.8704    0.0100    0.0350    0.0304

  [ 5,]   0.6140    0.1026    0.0726    0.0316    0.1791

  [ 6,]   0.0478    0.8030    0.0011    0.1184    0.0296

  [ 7,]   0.1438    0.5840    0.1552    0.0894    0.0275

  [ 8,]   0.6289    0.2446    0.0209    0.1034    0.0023

  [ 9,]   0.0002    0.6872    0.0001    0.2908    0.0218

  [10,]   0.0132    0.4614    0.0187    0.1283    0.3783

  [11,]   0.9882    0.0033    0.0024    0.0025    0.0037



  [1,] Coordinates of the rows

 

  Contents of scodsi

 

  [ 1,]   0.0004   -0.0002    0.0001   -0.0004    0.0002

  [ 2,]  -0.0004    0.0002   -0.0001    0.0004   -0.0002

  [ 3,]   0.0001    0.0009    0.0003    0.0006   -0.0004

  [ 4,]   0.0003    0.0011    0.0001    0.0002   -0.0002

  [ 5,]   0.0003    0.0001    0.0001    0.0001    0.0001

  [ 6,]  -0.0002   -0.0009    0.0000   -0.0003    0.0002

  [ 7,]  -0.0006   -0.0012   -0.0006   -0.0005    0.0003

  [ 8,]  -0.0012   -0.0007   -0.0002   -0.0005    0.0001

  [ 9,]   0.0000    0.0004    0.0000    0.0002    0.0001

  [10,]   0.0003    0.0017    0.0003    0.0009   -0.0015

  [11,]   0.0026   -0.0002    0.0001    0.0001   -0.0002
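A supplementary row does not take part in the construction of the axes; it is only projected onto them. The Python sketch below illustrates this for UNIVER on the first axis: the axis is derived from the active table alone, the UNIVER profile is then projected with the usual transition formula, and the squared correlation is the squared coordinate divided by the squared chi-square distance to the origin. Power iteration and all variable names are assumptions of this sketch, not part of corresp.

```python
# Active table (m.dat) and the supplementary row UNIVER from msl.dat.
N = [
    [96, 118, 2, 71, 50, 17],
    [122, 136, 11, 76, 49, 41],
    [193, 184, 74, 63, 103, 79],
    [360, 365, 63, 145, 141, 184],
    [511, 593, 57, 217, 172, 306],
    [385, 457, 42, 174, 104, 220],
    [156, 185, 8, 69, 42, 85],
    [1474, 1931, 181, 852, 642, 782],
]
UNIVER = [619, 612, 177, 209, 298, 281]

n = sum(map(sum, N))
r = [sum(row) / n for row in N]          # row masses
c = [sum(col) / n for col in zip(*N)]    # column masses
I, J = len(r), len(c)

# Standardized residuals of the active table only.
S = [[(N[i][j] / n - r[i] * c[j]) / (r[i] * c[j]) ** 0.5 for j in range(J)]
     for i in range(I)]
A = [[sum(S[i][j] * S[i][k] for i in range(I)) for k in range(J)]
     for j in range(J)]

# Leading eigenvector of S'S (first right singular vector) by power iteration.
v = [1.0] * J
for _ in range(200):
    w = [sum(a * x for a, x in zip(row, v)) for row in A]
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]

# Standard column coordinates on axis 1.
gamma = [v[j] / c[j] ** 0.5 for j in range(J)]

# Transition formula: coordinate = sum_j profile_j * gamma_j.
profile = [x / sum(UNIVER) for x in UNIVER]
coord = sum(p * g for p, g in zip(profile, gamma))   # sign of the axis is arbitrary

# Squared chi-square distance of the profile to the origin, and squared correlation.
dist_sq = sum((p - cj) ** 2 / cj for p, cj in zip(profile, c))
cos2 = coord ** 2 / dist_sq

print(round(abs(coord), 4), round(dist_sq, 4), round(cos2, 4))
```

A squared correlation close to 1 means that the profile lies almost entirely along the first axis, which is what the output reports for UNIVER.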

Figure 2: Biplot for media data set.

The main trait of this analysis (first axis) is that contact with national newspapers is strongly associated with high managers and (or) people with a university education.

The second axis mainly opposes TV magazines (TVMAG), associated with the empl, skil and unsk categories (employees and workers) and with younger people, to magazines (MAGAZ) and regional newspapers (R_NEWS), associated with farmers (la_Farmer), small business (s_busin) and older people (A50-64, A65+). Figure 2 summarizes this set of associations.

The positions of the items in Figure 2 allow a more nuanced interpretation of the second axis. Employees and workers, together with people of middle-level education (SECOND) and the young (A14-24, A25-34), who are in contact with media such as TV magazines, are opposed to small business and farmers, who are primarily older (A50-64, A65+), less educated (PRIMARY), and in contact with media such as magazines (MAGAZ) and regional newspapers (R_NEWS).



MD*TECH Method and Data Technologies
  http://www.mdtech.de  mdtech@mdtech.de