Grokbase Groups Pig user July 2011
Hello,

I have a custom loader function that reads a parsed schema from some log files. Some of the log files seem to have a problem, so while loading I need to detect when a line in the log does not end with '\n', or ends at EOF instead. I'm currently running Pig 0.8.0 with Hadoop 0.20.2, and my loader function uses the RecordReader class to read from a text file as follows:

RecordReader in = null;

public Tuple getNext() throws IOException {
    try {
        boolean notDone = in.nextKeyValue();
        if (!notDone) {
            return null;
        }
        Text tval = (Text) in.getCurrentValue();
        String val = tval.toString();

However, with this approach there is no way to check for '\n' or EOF in the String val. Is it possible to use another type of RecordReader, or some other method, to check for these? Any suggestions on how to do this in a custom Pig loader function?


  • Dmitriy Ryaboy at Jul 26, 2011 at 6:21 pm
    It sounds like you need to write your own RecordReader (and an associated
    InputFormat).

    D
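
For anyone landing on this thread later, here is a minimal sketch of the core logic such a custom RecordReader would need: read the raw bytes yourself and remember how each line ended. It uses only the JDK, so it can be tried outside Hadoop. The class name TerminatorAwareLineReader and its methods are hypothetical, not part of Hadoop or Pig. In a real loader this logic would presumably sit behind nextKeyValue() of a RecordReader returned by your own InputFormat, and it would also have to deal with input split boundaries and '\r\n' endings, which this sketch ignores.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: reads lines and records whether each ended with '\n' or at EOF. */
class TerminatorAwareLineReader {
    private final InputStream in;
    private String line;
    private boolean endedWithNewline;

    TerminatorAwareLineReader(InputStream in) {
        // Reads one byte at a time; wrap the stream in a BufferedInputStream for real use.
        this.in = in;
    }

    /** Advances to the next line; returns false once the stream is exhausted. */
    boolean next() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                // Normal case: the line was properly terminated.
                line = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                endedWithNewline = true;
                return true;
            }
            buf.write(b);
        }
        if (buf.size() == 0) {
            // Clean EOF: nothing left after the last terminator.
            return false;
        }
        // EOF reached in the middle of a line: return it, flagged as unterminated.
        line = new String(buf.toByteArray(), StandardCharsets.UTF_8);
        endedWithNewline = false;
        return true;
    }

    String getLine() {
        return line;
    }

    boolean endedWithNewline() {
        return endedWithNewline;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "first\nsecond".getBytes(StandardCharsets.UTF_8);
        TerminatorAwareLineReader reader =
                new TerminatorAwareLineReader(new ByteArrayInputStream(data));
        while (reader.next()) {
            System.out.println(reader.getLine() + " -> "
                    + (reader.endedWithNewline() ? "ended with newline" : "ended at EOF"));
        }
    }
}
```

With something like this in place, getNext() in the loader could consult the flag after each record and decide whether to skip, log, or repair a line that was cut off at EOF.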

Discussion Overview
group: user @
categories: pig, hadoop
posted: Jul 26, '11 at 6:12p
active: Jul 26, '11 at 6:21p
posts: 2
users: 2
website: pig.apache.org

2 users in discussion

Dmitriy Ryaboy: 1 post
Brett_meyer: 1 post
