Grokbase Groups Pig user May 2008
I have updated to trunk and Hadoop 0.17.0. The memory limit per task is
400 MB. An OutOfMemoryError is thrown in the first reduce. I have
noticed that this Pig script worked with 1 GB of memory per task. What are
the memory requirements for Pig?

Thanks!
Iván de Prado
www.ivanprado.es

2008-05-30 11:21:29,863 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 166885368(162973K) committed = 246087680(240320K) max = 279642112(273088K)
2008-05-30 11:21:33,069 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 225822592(220529K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:36,047 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 169349352(165380K) committed = 267780096(261504K) max = 279642112(273088K)
2008-05-30 11:21:39,369 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 267780072(261503K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:44,505 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 255668880(249676K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:51,019 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 265970168(259736K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:58,115 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 266914224(260658K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:22:01,423 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 223674352(218431K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:22:05,163 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 258252264(252199K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:22:41,457 ERROR org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce: java.lang.OutOfMemoryError: Java heap space
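
To read the numbers in these log lines more easily, here is a minimal Python sketch that pulls the byte counts out of a "low memory handler called" line and reports the maximum heap in MiB and the used/max ratio. The function names (parse_usage, summarize) are made up for illustration and are not part of Pig. It assumes each log entry is on a single line. For the last handler call before the error, it reports a maximum heap of roughly 267 MiB with about 92% in use, which is in the neighborhood of the 256M figure mentioned in a reply in this thread.

```python
import re

# Matches the usage fields printed in the "low memory handler called" lines.
USAGE_RE = re.compile(
    r"init = (\d+)\(\d+K\) used = (\d+)\(\d+K\) "
    r"committed = (\d+)\(\d+K\) max = (\d+)\(\d+K\)"
)


def parse_usage(line):
    """Return a dict of byte counts, or None if the line has no usage fields."""
    m = USAGE_RE.search(line)
    if m is None:
        return None
    init, used, committed, maximum = (int(g) for g in m.groups())
    return {"init": init, "used": used, "committed": committed, "max": maximum}


def summarize(line):
    """Return (max heap in MiB, used/max ratio) for one log line, or None."""
    usage = parse_usage(line)
    if usage is None:
        return None
    return usage["max"] / 2**20, usage["used"] / usage["max"]


# Last handler call before the OutOfMemoryError in the log above.
sample = (
    "2008-05-30 11:22:05,163 INFO org.apache.pig.impl.util.SpillableMemoryManager: "
    "low memory handler called init = 5439488(5312K) used = 258252264(252199K) "
    "committed = 279642112(273088K) max = 279642112(273088K)"
)

max_mib, ratio = summarize(sample)
print(f"max heap = {max_mib:.1f} MiB, used = {ratio:.0%} of max")
```

The heap maximum a JVM reports can differ somewhat from the configured limit, so treat the figure as an indication of the effective setting rather than the exact configured value.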

________________________________________________________________________
Explain:

Logical Plan:
---LOSort ( BY GENERATE {[FLATTEN PROJECT $1]} )
---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2],[FLATTEN PROJECT $3],[FLATTEN PROJECT $4],[FLATTEN PROJECT $5]} )
---LOCogroup ( GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT $0],[*]} )
---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT $1]})]} )
---LOCogroup ( GENERATE {[PROJECT $2],[*]} )
---LOEval ( [FILTER BY ([PROJECT $6] == ['1'])] )
---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
---LOCogroup ( GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
---LOEval ( [FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] != ['0']) AND ([PROJECT $6] != ['2']))] )
---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
---LOLoad ( file = /user/properazzi/flm/quotas.txt AS wqid )
---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT $1]})]} )
---LOCogroup ( GENERATE {[PROJECT $0],[*]} )
---LOEval ( GENERATE {[FLATTEN PROJECT $0]} )
---LOCogroup ( GENERATE {[*],[*]} )
---LOEval ( GENERATE {[PROJECT $2],[PROJECT $1]} )
---LOEval ( [FILTER BY ([PROJECT $6] == ['1'])] )
---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
---LOCogroup ( GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
---LOEval ( [FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] != ['0']) AND ([PROJECT $6] != ['2']))] )
---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
---LOLoad ( file = /user/properazzi/flm/quotas.txt AS wqid )
---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT $1]})]} )
---LOCogroup ( GENERATE {[PROJECT $2],[*]} )
---LOEval ( [FILTER BY (([PROJECT $6] == ['3']) OR ([PROJECT $6] == ['4']) OR ([PROJECT $6] == ['5']))] )
---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
---LOCogroup ( GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
---LOEval ( [FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] != ['0']) AND ([PROJECT $6] != ['2']))] )
---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
---LOLoad ( file = /user/properazzi/flm/quotas.txt AS wqid )
---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT $1]})]} )
---LOCogroup ( GENERATE {[PROJECT $0],[*]} )
---LOEval ( GENERATE {[FLATTEN PROJECT $0]} )
---LOCogroup ( GENERATE {[*],[*]} )
---LOEval ( GENERATE {[PROJECT $2],[PROJECT $1]} )
---LOEval ( [FILTER BY (([PROJECT $6] == ['3']) OR ([PROJECT $6] == ['4']) OR ([PROJECT $6] == ['5']))] )
---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
---LOCogroup ( GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
---LOEval ( [FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] != ['0']) AND ([PROJECT $6] != ['2']))] )
---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
---LOLoad ( file = /user/properazzi/flm/quotas.txt AS wqid )
---LOEval ( GENERATE {[FLATTEN PROJECT $0]} )
---LOCogroup ( GENERATE {[*],[*]} )
---LOEval ( GENERATE {[PROJECT $2],[PROJECT $5]} )
---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
-----------------------------------------------
Physical Plan:
---POMapreduce
Partition Function: org.apache.pig.backend.hadoop.executionengine.mapreduceExec.SortPartitioner

Map : *
Reduce : Generate(Project(1))
Grouping : Generate(Generate(Project(1)),*)
Input File(s) : /tmp/temp1398936874/tmp-1538794351
Properties :
---POMapreduce
Map : Composite(*,Generate(Project(1)))
Reduce : Generate(FuncEval(org.apache.pig.impl.builtin.FindQuantiles(Generate(Const(1),Composite(Project(1),Sort(*))))))
Grouping : Generate(Const(all),*)
Input File(s) : /tmp/temp1398936874/tmp-1538794351
Properties :
---POMapreduce
Map : *****
Reduce : Generate(Project(1),Project(2),Project(3),Project(4),Project(5))
Grouping : Generate(Project(0),*)Generate(Project(0),*)Generate(Project(0),*)Generate(Project(0),*)Generate(Project(0),*)
Input File(s) : /tmp/temp1398936874/tmp-585863913, /tmp/temp1398936874/tmp-536934015, /tmp/temp1398936874/tmp23578316, /tmp/temp1398936874/tmp662497645, /tmp/temp1398936874/tmp582570364
Properties : pig.input.splittable:true
---POMapreduce
Map : *
Combine : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(2),*)
Input File(s) : /tmp/temp1398936874/tmp-1880872512
Properties : pig.input.splittable:true
---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce : Composite(Generate(Project(1),Project(2)),Filter: COMP )
Grouping : Generate(Project(1),*)Generate(Project(0),*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump, /user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
---POMapreduce
Map : *
Combine : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(0),*)
Input File(s) : /tmp/temp1398936874/tmp-1242543041
Properties : pig.input.splittable:true
---POMapreduce
Map : *
Reduce : Generate(Project(0))
Grouping : Generate(*,*)
Input File(s) : /tmp/temp1398936874/tmp2015750396
Properties : pig.input.splittable:true
---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce : Composite(Generate(Project(1),Project(2)),Filter: COMP ,Generate(Project(2),Project(1)))
Grouping : Generate(Project(1),*)Generate(Project(0),*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump, /user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
---POMapreduce
Map : *
Combine : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(2),*)
Input File(s) : /tmp/temp1398936874/tmp-1934972255
Properties : pig.input.splittable:true
---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce : Composite(Generate(Project(1),Project(2)),Filter: OR )
Grouping : Generate(Project(1),*)Generate(Project(0),*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump, /user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
---POMapreduce
Map : *
Combine : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(0),*)
Input File(s) : /tmp/temp1398936874/tmp799024189
Properties : pig.input.splittable:true
---POMapreduce
Map : *
Reduce : Generate(Project(0))
Grouping : Generate(*,*)
Input File(s) : /tmp/temp1398936874/tmp1055965366
Properties : pig.input.splittable:true
---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce : Composite(Generate(Project(1),Project(2)),Filter: OR ,Generate(Project(2),Project(1)))
Grouping : Generate(Project(1),*)Generate(Project(0),*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump, /user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
---POMapreduce
Map : Composite(*,Generate(Project(2),Project(5)))
Reduce : Generate(Project(0))
Grouping : Generate(*,*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump
Properties : pig.input.splittable:true


On Fri, 30-05-2008 at 20:01 +1000, pi song wrote:
We've already fixed the memory issue introduced in PIG-85. Could you please
update to the latest version and try again?

Pi
On Wed, May 28, 2008 at 9:18 AM, pi song wrote:

This might have nothing to do with Hadoop 0.17 but with something else that we
fixed right after it. I'm investigating. Sorry for the inconvenience.

FYI,
Pi

On 5/28/08, Tanton Gibbs wrote:

I think you need to increase the amount of memory you give to Java.

It looks like it is currently set to 256M. I upped mine to 2G. Of
course, it depends on how much RAM you have available.

The parameter is mapred.child.java.opts; mine is currently set to
2048M in my hadoop-site.xml file.

For performance reasons, I also upped the io.sort.mb parameter. However,
if it is too close to 50% of the total memory, you will get the
Spillable messages.

HTH,
Tanton
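
The rule of thumb in Tanton's reply (keep io.sort.mb well below 50% of the task's memory) can be sketched as a small Python check. This is only an illustration under stated assumptions: it assumes the heap size is given to mapred.child.java.opts as an -Xmx flag with a k, m, or g suffix; the helper names and the 0.1 margin used to decide what counts as "too close" are hypothetical; and the io.sort.mb values in the usage lines are made-up examples, not settings taken from this thread.

```python
import re

# Conversion factors from -Xmx suffixes to megabytes.
_UNIT_TO_MB = {"k": 1 / 1024, "m": 1, "g": 1024}


def heap_mb_from_opts(java_opts):
    """Extract the -Xmx value from a JVM options string, in MB.

    Only values with a k/m/g suffix are handled; returns None otherwise.
    """
    m = re.search(r"-Xmx(\d+)([kKmMgG])", java_opts)
    if m is None:
        return None
    return int(m.group(1)) * _UNIT_TO_MB[m.group(2).lower()]


def sort_buffer_fraction(java_opts, io_sort_mb):
    """Return io.sort.mb as a fraction of the configured task heap."""
    heap_mb = heap_mb_from_opts(java_opts)
    if heap_mb is None:
        raise ValueError("no -Xmx setting with a k/m/g suffix found")
    return io_sort_mb / heap_mb


def too_close_to_half(java_opts, io_sort_mb, margin=0.1):
    """True if io.sort.mb is within `margin` of 50% of the heap, or above it.

    The margin is an arbitrary choice for this sketch, not a Hadoop or Pig value.
    """
    return sort_buffer_fraction(java_opts, io_sort_mb) >= 0.5 - margin


# Hypothetical settings: a 2 GB heap with a 200 MB sort buffer,
# and a 256 MB heap with a 120 MB sort buffer.
print(too_close_to_half("-Xmx2048M", 200))
print(too_close_to_half("-Xmx256M", 120))
```

With a 2 GB heap, a 200 MB sort buffer is under 10% of the heap, so the check passes; with a 256 MB heap, a 120 MB buffer is about 47% and gets flagged.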

Discussion Overview
group: user @
categories: pig, hadoop
posted: May 23, '08 at 5:52a
active: May 30, '08 at 4:32p
posts: 12
users: 5
website: pig.apache.org