This is interesting. Does this mean Hadoop can use S3 as the
distributed file system and EC2 machines as the compute nodes?
If so, how does the NameNode communicate with the DataNodes (S3)?




-----Original Message-----
From: Irfan Mohammed
Sent: September 9, 2009 15:03
To: pig-user@hadoop.apache.org
Subject: s3n intermediate storage problem

Hi,
I have a pig script reading/writing to S3.

$ export PIG_OPTS="-Dfs.default.name=s3n://bucket_1/";
$ pig
>>>>
r0 = LOAD 'input2/transaction_ar20090909_14*' using PigStorage('\u0002');
r1 = FILTER r0 by client_id == 'xxxx';
store r1 into 'output2/' using PigStorage(',');
>>>>>

I get the following error. It looks like Pig is trying to read or write the
intermediate data in some temporary storage location in S3, but I cannot
find this folder in S3.

1. How do I know where it is writing or reading the intermediate files?
2. Can I use S3 URLs only for load/store and keep the intermediate files
in a separate HDFS? If so, how do I specify the URL paths?

Thanks,
Irfan

java.lang.Exception: org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 GET failed for '/tmp%2Ftemp666717117%2Ftmp-2105109046%2Fpart-00000' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRange</Code><Message>The requested range is not satisfiable</Message><ActualObjectSize>0</ActualObjectSize><RequestId>1CF939F219CF8549</RequestId><HostId>w05Yp+WVCk2k/N9iVnqYmbFZzEiqszGYV3++yZjj+J/oaOJAifjUW4b5ZxIFDH2C(Jets3tNativeFileSystemStore.java:154)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at org.apache.hadoop.fs.s3native.$Proxy1.retrieve(Unknown Source)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:111)
at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:76)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:37)
at org.apache.pig.backend.hadoop.datastorage.HSeekableInputStream.seek(HSeekableInputStream.java:64)
at org.apache.pig.backend.executionengine.PigSlice.init(PigSlice.java:85)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper.makeReader(SliceWrapper.java:127)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getRecordReader(PigInputFormat.java:253)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:336)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.jets3t.service.S3ServiceException: S3 GET failed for '/tmp%2Ftemp666717117%2Ftmp-2105109046%2Fpart-00000' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRange</Code><Message>The requested range is not satisfiable</Message><ActualObjectSize>0</ActualObjectSize><RequestId>1CF939F219CF8549</RequestId><HostId>w05Yp+WVCk2k/N9iVnqYmbFZzEiqszGYV3++yZjj+J/oaOJAifjUW4b5ZxIFDH2C(RestS3Service.java:424)
at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestGet(RestS3Service.java:686)
at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1558)
at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1501)
at org.jets3t.service.S3Service.getObject(S3Service.java:1876)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:144)
... 17 more

at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:230)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
at org.apache.pig.PigServer.execute(PigServer.java:760)
at org.apache.pig.PigServer.access$100(PigServer.java:89)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
at org.apache.pig.Main.main(Main.java:307)
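
Regarding question 1 above, the error message itself hints at the location: the object key in the S3 GET is URL-encoded, and the XML body reports the object's size. The following is a minimal Python sketch, not part of the original post, that decodes those fields from an excerpt of the message. The function name summarize_s3_error is hypothetical, the HostId is left out because it is truncated in the post, and the assumption that the key is relative to the bucket given in fs.default.name (s3n://bucket_1/) is not confirmed by the thread.

```python
import re
from urllib.parse import unquote

# Excerpt of the error text from the trace above.
# The HostId element is omitted because it is truncated in the post.
error_text = (
    "S3 GET failed for '/tmp%2Ftemp666717117%2Ftmp-2105109046%2Fpart-00000' "
    "XML Error Message: <Error><Code>InvalidRange</Code>"
    "<Message>The requested range is not satisfiable</Message>"
    "<ActualObjectSize>0</ActualObjectSize>"
)


def summarize_s3_error(text):
    """Return (decoded object key, S3 error code, object size in bytes)."""
    raw_key = re.search(r"S3 GET failed for '([^']+)'", text).group(1)
    key = unquote(raw_key)  # turns %2F back into '/'
    code = re.search(r"<Code>([^<]+)</Code>", text).group(1)
    size = int(
        re.search(r"<ActualObjectSize>(\d+)</ActualObjectSize>", text).group(1)
    )
    return key, code, size


key, code, size = summarize_s3_error(error_text)
print(key)
print(code, size)
```

The decoded key is /tmp/temp666717117/tmp-2105109046/part-00000, which suggests the intermediate data goes to a temporary directory under /tmp in the bucket. An ActualObjectSize of 0 would be consistent with the part file existing but being empty, so the ranged GET issued by the seek in NativeS3FsInputStream is rejected as InvalidRange.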

Discussion Overview
group: common-user @
categories: hadoop
posted: Sep 11, '09 at 4:40p
active: Sep 11, '09 at 4:40p
posts: 1
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion

Zjffdu: 1 post
