[BioC] Limma topTable; fold changes look completely different to the normalized data and Limma fold change

John herbert
Jul 6, 2012 at 7:12 pm
Dear all,
I have a problem with the log fold changes calculated in limma. I am
using protein abundance index (PAI) values from proteomic data.
The log2 of these data is normally distributed and, after taking log2, I
apply quantile normalization.

This is then the data matrix I use as input to limma:
class(norm_ctw)
[1] "matrix"
dim(norm_ctw)
[1] 683 9
design <- model.matrix(~ 0+factor(c(1,1,1,2,2,2,3,3,3)))
colnames(design) <- c("cam", "tumour", "wound")
fit <- lmFit(norm_ctw, design)

contrast.matrix <- makeContrasts(tumour-wound, tumour-cam, levels=design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)

topTable(fit2, coef=1, adjust="BH")

Taking one gene as an example: NAMPT in tumour versus wound, calculating
the fold change by hand from the normalized data:
norm_ctw["NAMPT",]
    cam1     cam2     cam3  tumour1  tumour2  tumour3   wound1   wound2   wound3
19.80164 19.46355 19.26075 22.75347 22.62651 22.39521 16.17398 16.60262 16.72368

In Excel, I calculated the log2 fold change as
log2(average of tumour / average of wound):
T1 22.75347   T2 22.62651   T3 22.39521
W1 16.17398   W2 16.60262   W3 16.72368
Tumour average = 22.59173
Wound average = 16.50009333
Log2 fold change = 0.453320567

However, from topTable:
topTable(fit2,coef=1)
       ID    logFC  AveExpr        t      P.Value    adj.P.Val         B
431 NAMPT 6.091632 19.53349 20.16810 2.688444e-09 1.750946e-06 11.409857
From topTable, NAMPT has an apparent log2 FC of about 6, i.e. roughly a
64-fold change, but that is impossible, right?

Please can someone explain whether I am using limma wrongly, or how the
fold change can be so different between "by hand" and limma?

Thank you very much for any advice.

John.
4 responses

  • James W. MacDonald at Jul 6, 2012 at 7:36 pm
    Hi John,
    Wait a minute... Are these data logged or not? You say above that you
    take logs and then normalize, and then you present some data that would
    be really big if they were log2 variates (but then I have no idea of the
    scale for protein abundance data).

    Anyway, you are acting like these data are not logged, whereas limma
    assumes they are. So you either have to take logs before feeding into
    limma, or you need to compute the fold change by subtraction (if the
    data above are already logged).

    Best,

    Jim



    _______________________________________________
    Bioconductor mailing list
    Bioconductor at r-project.org
    https://stat.ethz.ch/mailman/listinfo/bioconductor
    Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
    --
    James W. MacDonald, M.S.
    Biostatistician
    University of Washington
    Environmental and Occupational Health Sciences
    4225 Roosevelt Way NE, # 100
    Seattle WA 98105-6099
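
The two calculations Jim contrasts can be checked side by side. This is a minimal base-R sketch, not code from the thread: it uses only the NAMPT values quoted in the original post, and the variable names (`tumour`, `wound`, `log_fc`, `excel_value`, `fold_change`) are chosen here for illustration.

```r
# NAMPT values copied from the post; these are already log2-transformed.
tumour <- c(22.75347, 22.62651, 22.39521)
wound  <- c(16.17398, 16.60262, 16.72368)

# On the log2 scale, a log fold change is a difference of group means.
log_fc <- mean(tumour) - mean(wound)

# The Excel calculation instead divided the two log2 means and took log2 again.
excel_value <- log2(mean(tumour) / mean(wound))

# Back-transform the log2 fold change to a fold change on the unlogged scale.
fold_change <- 2^log_fc

print(round(c(log_fc = log_fc, excel_value = excel_value,
              fold_change = fold_change), 4))
```

Here `log_fc` comes out at about 6.09, in line with the logFC that topTable reports, while `excel_value` is about 0.45, the number from the spreadsheet. The back-transformed `fold_change` is a little above 64 because the log2 difference is 6.09 rather than exactly 6.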
  • John herbert at Jul 6, 2012 at 8:00 pm
    Thanks a lot James,
    Yes, the raw PAI values are very big and I am feeding Limma log2 and
    normalized values.

    So if I have a log2 value of 22.59173 for tumour and a log2 value of
    16.50009333 for wound, subtracting tumour - wound gives 6.09 (the same
    number topTable comes up with).

    What is this 6.09 value, is that fold change or log2 fold change?

    I would guess fold change as 2 to the power of 6 = 64 fold change but
    topTable labels it as logFC; please explain why?

    Thank you,

    John.
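
One way to see where the 6.09 comes from, without limma, is to fit the same kind of group-means model to the single NAMPT row with `lm()` from base R. This is only a sketch under stated assumptions: with a design of the form `~ 0 + factor(...)`, no weights and no missing values, the fitted coefficients are ordinary least-squares group means, so the tumour - wound contrast is their difference. The names `y`, `group`, `fit_one` and `group_means` are introduced here and do not appear in the thread.

```r
# NAMPT row copied from the post (already on the log2 scale).
y <- c(19.80164, 19.46355, 19.26075,   # cam
       22.75347, 22.62651, 22.39521,   # tumour
       16.17398, 16.60262, 16.72368)   # wound
group <- factor(rep(c("cam", "tumour", "wound"), each = 3))

# No-intercept model: one coefficient per group, each equal to that group's mean.
fit_one <- lm(y ~ 0 + group)
group_means <- coef(fit_one)   # named groupcam, grouptumour, groupwound

# The contrast tumour - wound is the difference of two coefficients.
log_fc <- unname(group_means["grouptumour"] - group_means["groupwound"])
print(log_fc)
```

Because the input values are log2 abundances, the difference of coefficients is itself on the log2 scale, which is consistent with topTable labelling the column logFC.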




  • John herbert at Jul 6, 2012 at 8:19 pm
    OK, I solved it using raw values and see 6.09 is log2 FC.
    Thanks.

    John.
  • James W. MacDonald at Jul 6, 2012 at 8:39 pm
    Yep. You have to remember that log2(this/that) = log2(this) -
    log2(that), so if you are in the log space you have to subtract to
    compute what would be division on the natural scale.

    Best,

    Jim
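
The identity Jim quotes can be verified numerically. The sketch below uses two hypothetical unlogged abundances (`this` and `that` are made-up values, not data from the thread) to show that dividing on the raw scale and subtracting on the log2 scale give the same log fold change, and that raising 2 to that power recovers the raw ratio.

```r
# Hypothetical raw (unlogged) abundances, chosen only for illustration.
this <- 6.4e6
that <- 1.0e5

# Division on the raw scale, then log2 ...
ratio_then_log <- log2(this / that)

# ... matches subtraction of the individual log2 values.
log_then_subtract <- log2(this) - log2(that)

# Back-transforming the log2 fold change recovers the raw ratio.
raw_ratio <- 2^log_then_subtract

print(c(ratio_then_log = ratio_then_log,
        log_then_subtract = log_then_subtract,
        raw_ratio = raw_ratio))
```

One related point worth keeping in mind: averaging log2 values and then back-transforming gives a geometric mean on the raw scale, so a fold change derived from log2 group means is a ratio of geometric means rather than of arithmetic means.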

