FAQ
R version 2.10.1 (2009-12-14)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
'citation()' on how to cite R or R packages in publications.
Below are two cases where I don't seem to be getting contr.sum
contrasts even though they were specified. Are these bugs?

The first case is an interaction between continuous and factor variables.

The second case contrasts= was specified as an arg to lm. The second
works ok if we set the contrasts through options but not if we set it
through an lm argument.
# 1. In this case I don't seem to be getting contr.sum contrasts:

options(contrasts = c("contr.sum", "contr.poly"))
getOption("contrasts")
[1] "contr.sum" "contr.poly"
scores <- rep(seq(-2, 2), 3); scores
[1] -2 -1 0 1 2 -2 -1 0 1 2 -2 -1 0 1 2
fac <- gl(3, 5); fac
[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
Levels: 1 2 3
# I get this:
model.matrix(~ scores:fac)
(Intercept) scores:fac1 scores:fac2 scores:fac3
1 1 -2 0 0
2 1 -1 0 0
3 1 0 0 0
4 1 1 0 0
5 1 2 0 0
6 1 0 -2 0
7 1 0 -1 0
8 1 0 0 0
9 1 0 1 0
10 1 0 2 0
11 1 0 0 -2
12 1 0 0 -1
13 1 0 0 0
14 1 0 0 1
15 1 0 0 2
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.sum"
# But I was expecting this since I am using contr.sum
cbind(1, model.matrix(~ fac)[,2:3] * scores)
fac1 fac2
1 1 -2 0
2 1 -1 0
3 1 0 0
4 1 1 0
5 1 2 0
6 1 0 -2
7 1 0 -1
8 1 0 0
9 1 0 1
10 1 0 2
11 1 2 2
12 1 1 1
13 1 0 0
14 1 -1 -1
15 1 -2 -2

# 2.
# here I don't get contr.sum but rather get contr.treatment
options(contrasts = c("contr.treatment", "contr.poly"))
getOption("contrasts")
[1] "contr.treatment" "contr.poly"
model.matrix(lm(seq(15) ~ fac, contrasts = c("contr.sum", "contr.poly")))
(Intercept) fac2 fac3
1 1 0 0
2 1 0 0
3 1 0 0
4 1 0 0
5 1 0 0
6 1 1 0
7 1 1 0
8 1 1 0
9 1 1 0
10 1 1 0
11 1 0 1
12 1 0 1
13 1 0 1
14 1 0 1
15 1 0 1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.treatment"
model.matrix(lm(seq(15) ~ fac)) # same
(Intercept) fac2 fac3
1 1 0 0
2 1 0 0
3 1 0 0
4 1 0 0
5 1 0 0
6 1 1 0
7 1 1 0
8 1 1 0
9 1 1 0
10 1 1 0
11 1 0 1
12 1 0 1
13 1 0 1
14 1 0 1
15 1 0 1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.treatment"
# I was expecting the first one to give me this
options(contrasts = c("contr.sum", "contr.poly"))
model.matrix(lm(seq(15) ~ fac))
(Intercept) fac1 fac2
1 1 1 0
2 1 1 0
3 1 1 0
4 1 1 0
5 1 1 0
6 1 0 1
7 1 0 1
8 1 0 1
9 1 0 1
10 1 0 1
11 1 -1 -1
12 1 -1 -1
13 1 -1 -1
14 1 -1 -1
15 1 -1 -1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.sum"
R.version.string
[1] "R version 2.10.1 (2009-12-14)"
win.version()
[1] "Windows Vista (build 6002) Service Pack 2"

## Search Discussions

•  at Apr 21, 2010 at 8:26 pm ⇧

Gabor Grothendieck wrote:
R version 2.10.1 (2009-12-14)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
'citation()' on how to cite R or R packages in publications.
Below are two cases where I don't seem to be getting contr.sum
contrasts even though they were specified. Are these bugs?

The first case is an interaction between continuous and factor variables.

The second case contrasts= was specified as an arg to lm. The second
works ok if we set the contrasts through options but not if we set it
through an lm argument.
In case #2, I think you just failed to read the help page.
model.matrix(~fac, contrasts=list(fac="contr.sum"))
(Intercept) fac1 fac2
1 1 1 0
2 1 1 0
3 1 1 0
4 1 1 0
5 1 1 0
6 1 0 1
7 1 0 1
8 1 0 1
9 1 0 1
10 1 0 1
11 1 -1 -1
12 1 -1 -1
13 1 -1 -1
14 1 -1 -1
15 1 -1 -1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.sum"

As for case #1, the rules are tricky in cases where interactions are
present without main effects, but AFAICS, what you observe is
essentially the same effect as
model.matrix(~fac-1, contrasts=list(fac="contr.sum"))
fac1 fac2 fac3
1 1 0 0
2 1 0 0
3 1 0 0
4 1 0 0
5 1 0 0
6 0 1 0
7 0 1 0
8 0 1 0
9 0 1 0
10 0 1 0
11 0 0 1
12 0 0 1
13 0 0 1
14 0 0 1
15 0 0 1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.sum"

I.e., that R reverts to using indicator variables when the intercept is
absent.
# 1. In this case I don't seem to be getting contr.sum contrasts:

options(contrasts = c("contr.sum", "contr.poly"))
getOption("contrasts")
[1] "contr.sum" "contr.poly"
scores <- rep(seq(-2, 2), 3); scores
[1] -2 -1 0 1 2 -2 -1 0 1 2 -2 -1 0 1 2
fac <- gl(3, 5); fac
[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
Levels: 1 2 3
# I get this:
model.matrix(~ scores:fac)
(Intercept) scores:fac1 scores:fac2 scores:fac3
1 1 -2 0 0
2 1 -1 0 0
3 1 0 0 0
4 1 1 0 0
5 1 2 0 0
6 1 0 -2 0
7 1 0 -1 0
8 1 0 0 0
9 1 0 1 0
10 1 0 2 0
11 1 0 0 -2
12 1 0 0 -1
13 1 0 0 0
14 1 0 0 1
15 1 0 0 2
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.sum"
# But I was expecting this since I am using contr.sum
cbind(1, model.matrix(~ fac)[,2:3] * scores)
fac1 fac2
1 1 -2 0
2 1 -1 0
3 1 0 0
4 1 1 0
5 1 2 0
6 1 0 -2
7 1 0 -1
8 1 0 0
9 1 0 1
10 1 0 2
11 1 2 2
12 1 1 1
13 1 0 0
14 1 -1 -1
15 1 -2 -2
# 2.
# here I don't get contr.sum but rather get contr.treatment
options(contrasts = c("contr.treatment", "contr.poly"))
getOption("contrasts")
[1] "contr.treatment" "contr.poly"
model.matrix(lm(seq(15) ~ fac, contrasts = c("contr.sum", "contr.poly")))
(Intercept) fac2 fac3
1 1 0 0
2 1 0 0
3 1 0 0
4 1 0 0
5 1 0 0
6 1 1 0
7 1 1 0
8 1 1 0
9 1 1 0
10 1 1 0
11 1 0 1
12 1 0 1
13 1 0 1
14 1 0 1
15 1 0 1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.treatment"
model.matrix(lm(seq(15) ~ fac)) # same
(Intercept) fac2 fac3
1 1 0 0
2 1 0 0
3 1 0 0
4 1 0 0
5 1 0 0
6 1 1 0
7 1 1 0
8 1 1 0
9 1 1 0
10 1 1 0
11 1 0 1
12 1 0 1
13 1 0 1
14 1 0 1
15 1 0 1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.treatment"
# I was expecting the first one to give me this
options(contrasts = c("contr.sum", "contr.poly"))
model.matrix(lm(seq(15) ~ fac))
(Intercept) fac1 fac2
1 1 1 0
2 1 1 0
3 1 1 0
4 1 1 0
5 1 1 0
6 1 0 1
7 1 0 1
8 1 0 1
9 1 0 1
10 1 0 1
11 1 -1 -1
12 1 -1 -1
13 1 -1 -1
14 1 -1 -1
15 1 -1 -1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.sum"
R.version.string
[1] "R version 2.10.1 (2009-12-14)"
win.version()
[1] "Windows Vista (build 6002) Service Pack 2"

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
•  at Apr 21, 2010 at 8:43 pm ⇧

On Wed, Apr 21, 2010 at 4:26 PM, Peter Dalgaard wrote:
As for case #1, the rules are tricky in cases where interactions are
present without main effects, but AFAICS, what you observe is
essentially the same effect as
model.matrix(~fac-1, contrasts=list(fac="contr.sum"))
? fac1 fac2 fac3
1 ? ? 1 ? ?0 ? ?0
2 ? ? 1 ? ?0 ? ?0
3 ? ? 1 ? ?0 ? ?0
4 ? ? 1 ? ?0 ? ?0
5 ? ? 1 ? ?0 ? ?0
6 ? ? 0 ? ?1 ? ?0
7 ? ? 0 ? ?1 ? ?0
8 ? ? 0 ? ?1 ? ?0
9 ? ? 0 ? ?1 ? ?0
10 ? ?0 ? ?1 ? ?0
11 ? ?0 ? ?0 ? ?1
12 ? ?0 ? ?0 ? ?1
13 ? ?0 ? ?0 ? ?1
14 ? ?0 ? ?0 ? ?1
15 ? ?0 ? ?0 ? ?1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.sum"

I.e., that R reverts to using indicator variables when the intercept is
absent.
Is there any nice way of getting contr.sum coding for the interaction
as opposed to the ugly code in my post that I used to force it? i.e.
cbind(1, model.matrix(~ fac)[,2:3] * scores)
•  at Apr 22, 2010 at 6:32 am ⇧

Gabor Grothendieck wrote:
On Wed, Apr 21, 2010 at 4:26 PM, Peter Dalgaard wrote:
...
I.e., that R reverts to using indicator variables when the intercept is
absent.
Is there any nice way of getting contr.sum coding for the interaction
as opposed to the ugly code in my post that I used to force it? i.e.
cbind(1, model.matrix(~ fac)[,2:3] * scores)
I think not. In general, an interaction like ~fac:scores indicates three
lines with a common intercept and three different slopes, and changing
the parametrization is not supposed to change the model, whereas your
model inserts a restriction that the slopes sum to zero (if I understand
correctly). So if you want to fit "ugly" models, you get to do a little
ugly footwork.

(A similar, simpler, issue arises if you want to have a 2x2 design with
no effect in one column and/or one row (think clinical trial, placebo
vs. active, baseline vs. treated. You can only do this us explicit dummy
variables, not with the two classifications represented as factors.)

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
•  at Apr 22, 2010 at 11:30 am ⇧
On Thu, Apr 22, 2010 at 2:32 AM, Peter Dalgaard wrote:
Gabor Grothendieck wrote:
On Wed, Apr 21, 2010 at 4:26 PM, Peter Dalgaard wrote:
...
I.e., that R reverts to using indicator variables when the intercept is
absent.
Is there any nice way of getting contr.sum coding for the interaction
as opposed to the ugly code in my post that I used to force it? i.e.
cbind(1, model.matrix(~ fac)[,2:3] * scores)
I think not. In general, an interaction like ~fac:scores indicates three
lines with a common intercept and three different slopes, and changing
the parametrization is not supposed to change the model, whereas your
model inserts a restriction that the slopes sum to zero (if I understand
correctly). So if you want to fit "ugly" models, you get to do a little
ugly footwork.
OK. Thanks. I guess that's fair.
(A similar, simpler, issue arises if you want to have a 2x2 design with
no effect in one column and/or one row (think clinical trial, placebo
vs. active, baseline vs. treated. You can only do this us explicit dummy
variables, not with the two classifications represented as factors.)

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk ?Priv: PDalgd at gmail.com
•  at Apr 22, 2010 at 3:38 am ⇧
G'day Gabor,

On Wed, 21 Apr 2010 14:00:33 -0400
Gabor Grothendieck wrote:
Below are two cases where I don't seem to be getting contr.sum
contrasts even though they were specified. Are these bugs?
The first case is an interaction between continuous and factor
variables.

The second case contrasts= was specified as an arg to lm. The second
works ok if we set the contrasts through options but not if we set it
through an lm argument.
# 1. In this case I don't seem to be getting contr.sum contrasts:

options(contrasts = c("contr.sum", "contr.poly"))
getOption("contrasts")
[1] "contr.sum" "contr.poly"
scores <- rep(seq(-2, 2), 3); scores
[1] -2 -1 0 1 2 -2 -1 0 1 2 -2 -1 0 1 2
fac <- gl(3, 5); fac
[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
Levels: 1 2 3
# I get this:
model.matrix(~ scores:fac)
(Intercept) scores:fac1 scores:fac2 scores:fac3
1 1 -2 0 0
2 1 -1 0 0
3 1 0 0 0
4 1 1 0 0
5 1 2 0 0
6 1 0 -2 0
7 1 0 -1 0
8 1 0 0 0
9 1 0 1 0
10 1 0 2 0
11 1 0 0 -2
12 1 0 0 -1
13 1 0 0 0
14 1 0 0 1
15 1 0 0 2
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.sum"
When creating the model matrix, the various levels of a factor are
first coded as indicator variables (for creation of the main effects
and for creation of interactions). The contrast matrix is then
applied to remove redundant columns from the design matrix; that is,
columns that are known to be redundant due to the *design* (i.e. model
formula as a whole), depending on whether data is available for all
levels of a factor (or combinations of levels if several factors are in
the model) you can still end up with a design matrix that does not have
full column rank.

In this case, since there is no main effect fac, the three columns that
are put into the design matrix due to the score:fac term do not create
a singularity, hence the contrast matrix is not applied to these three
columns.

The above description is probably a bit rough and lacking in detail (if
not even wrong in some details). IMHO, the best explanation of how R
goes from the model formula to the actual design matrix (for linear
models) can still be found in MASS; though, if I am not mistaken,
current versions of R seem to proceed slightly different to that
description in more complicated models (i.e. several factors and
continuous explanatory variables).
# But I was expecting this since I am using contr.sum
cbind(1, model.matrix(~ fac)[,2:3] * scores)
fac1 fac2
1 1 -2 0
2 1 -1 0
3 1 0 0
4 1 1 0
5 1 2 0
6 1 0 -2
7 1 0 -1
8 1 0 0
9 1 0 1
10 1 0 2
11 1 2 2
12 1 1 1
13 1 0 0
14 1 -1 -1
15 1 -2 -2
If you wish to have the design matrix that contains the interaction
terms as if the model had the main effects too but without the columns
corresponding to the main effects, then just instruct R that you want
to have such a matrix:

R> model.matrix(~ scores*fac)[,-(2:4)]
(Intercept) scores:fac1 scores:fac2
1 1 -2 0
2 1 -1 0
3 1 0 0
4 1 1 0
5 1 2 0
6 1 0 -2
7 1 0 -1
8 1 0 0
9 1 0 1
10 1 0 2
11 1 2 2
12 1 1 1
13 1 0 0
14 1 -1 -1
15 1 -2 -2

Though, I am not sure why one wants to fit such a model.
# 2.
# here I don't get contr.sum but rather get contr.treatment
options(contrasts = c("contr.treatment", "contr.poly"))
getOption("contrasts")
[1] "contr.treatment" "contr.poly"
model.matrix(lm(seq(15) ~ fac, contrasts = c("contr.sum",
"contr.poly")))
(Intercept) fac2 fac3
1 1 0 0
2 1 0 0
3 1 0 0
4 1 0 0
5 1 0 0
6 1 1 0
7 1 1 0
8 1 1 0
9 1 1 0
10 1 1 0
11 1 0 1
12 1 0 1
13 1 0 1
14 1 0 1
15 1 0 1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.treatment"
No bug either, but wrong use of the optional argument "contrasts" for
"lm". Please read the help page of "lm", which points you to the help
page of "model.matrix.default", which contains pertinent examples; in

R> model.matrix(lm(seq(15) ~ fac, contrasts = list(fac="contr.sum")))
(Intercept) fac1 fac2
1 1 1 0
2 1 1 0
3 1 1 0
4 1 1 0
5 1 1 0
6 1 0 1
7 1 0 1
8 1 0 1
9 1 0 1
10 1 0 1
11 1 -1 -1
12 1 -1 -1
13 1 -1 -1
14 1 -1 -1
15 1 -1 -1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")\$fac
[1] "contr.sum"

HTH.

Cheers,

Berwin

Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019) +61 (8) 6488 3383 (self)
The University of Western Australia FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009 e-mail: berwin at maths.uwa.edu.au
Australia http://www.maths.uwa.edu.au/~berwin
•  at Apr 22, 2010 at 4:26 am ⇧

On Wed, Apr 21, 2010 at 11:38 PM, Berwin A Turlach wrote:
# But I was expecting this since I am using contr.sum
cbind(1, model.matrix(~ fac)[,2:3] * scores)
? ? ?fac1 fac2
1 ?1 ? -2 ? ?0
2 ?1 ? -1 ? ?0
3 ?1 ? ?0 ? ?0
4 ?1 ? ?1 ? ?0
5 ?1 ? ?2 ? ?0
6 ?1 ? ?0 ? -2
7 ?1 ? ?0 ? -1
8 ?1 ? ?0 ? ?0
9 ?1 ? ?0 ? ?1
10 1 ? ?0 ? ?2
11 1 ? ?2 ? ?2
12 1 ? ?1 ? ?1
13 1 ? ?0 ? ?0
14 1 ? -1 ? -1
15 1 ? -2 ? -2
If you wish to have the design matrix that contains the interaction
terms as if the model had the main effects too but without the columns
corresponding to the main effects, then just instruct R that you want
to have such a matrix:

R> model.matrix(~ scores*fac)[,-(2:4)]
? (Intercept) scores:fac1 scores:fac2
1 ? ? ? ? ? ?1 ? ? ? ? ?-2 ? ? ? ? ? 0
2 ? ? ? ? ? ?1 ? ? ? ? ?-1 ? ? ? ? ? 0
3 ? ? ? ? ? ?1 ? ? ? ? ? 0 ? ? ? ? ? 0
4 ? ? ? ? ? ?1 ? ? ? ? ? 1 ? ? ? ? ? 0
5 ? ? ? ? ? ?1 ? ? ? ? ? 2 ? ? ? ? ? 0
6 ? ? ? ? ? ?1 ? ? ? ? ? 0 ? ? ? ? ?-2
7 ? ? ? ? ? ?1 ? ? ? ? ? 0 ? ? ? ? ?-1
8 ? ? ? ? ? ?1 ? ? ? ? ? 0 ? ? ? ? ? 0
9 ? ? ? ? ? ?1 ? ? ? ? ? 0 ? ? ? ? ? 1
10 ? ? ? ? ? 1 ? ? ? ? ? 0 ? ? ? ? ? 2
11 ? ? ? ? ? 1 ? ? ? ? ? 2 ? ? ? ? ? 2
12 ? ? ? ? ? 1 ? ? ? ? ? 1 ? ? ? ? ? 1
13 ? ? ? ? ? 1 ? ? ? ? ? 0 ? ? ? ? ? 0
14 ? ? ? ? ? 1 ? ? ? ? ?-1 ? ? ? ? ?-1
15 ? ? ? ? ? 1 ? ? ? ? ?-2 ? ? ? ? ?-2

Though, I am not sure why one wants to fit such a model.
To save on degrees of freedom.

I had reduced it to the minimum needed to illustrate but in fact its
closer to this (except the coding is not the one needed):

options(contrasts = c("contr.sum", "contr.poly"))
tab <- as.table(matrix(1:21, 7))
dimnames(tab) = list(X = letters[1:7], Y = LETTERS[1:3])
rr <- factor(row(tab))
cc <- factor(col(tab))
scores <- rep(seq(-3,3), 3)
model.matrix( ~ rr + cc + scores:cc)

so the main effects are rr and cc but scores takes the place of rr in
the interaction.

Your description of the process seems right since it would predict
that the following gives the required coding and it does:

model.matrix(~ scores*cc + rr)[,-2]

Thanks.

## Related Discussions

Discussion Overview
 group r-devel categories r posted Apr 21, '10 at 6:00p active Apr 22, '10 at 11:30a posts 7 users 3 website r-project.org irc #r

### 3 users in discussion

Content

People

Support

Translate

site design / logo © 2022 Grokbase