Try this rather small C++ program...it will more than likely be a LOT faster than anything you could do in Hadoop. Hadoop is not the hammer for every nail. Too many people think that any "cluster" solution will automagically scale their problem...tain't true.

I'd appreciate hearing your results with this.

#include <cstdio>   // for perror()
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main(int argc, char *argv[])
{
    if (argc < 2) {
        cerr << "Usage: " << argv[0] << " <filename>" << endl;
        return -1;
    }
    ifstream in(argv[1]);
    if (!in) {
        perror(argv[1]);
        return -1;
    }
    // Count whitespace-delimited words. Testing the extraction result
    // directly (rather than eof()) also counts the final word when the
    // file has no trailing newline.
    string str;
    int n = 0;
    while (in >> str) {
        ++n;
        //cout << str << endl;
    }
    cout << n << " words" << endl;
    return 0;
}
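
For reference, a typical build and run would look like this (the file names are just placeholders):

g++ -O2 wordcount.cpp -o wordcount
./wordcount input.txt

On a 1.5MB file this should complete in well under a second on any modern machine.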

Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate



________________________________________
From: Igor Bubkin [[email protected]]
Sent: Tuesday, February 01, 2011 2:19 AM
To: [email protected]
Cc: [email protected]
Subject: EXTERNAL:How to speed up of Map/Reduce job?

Hello everybody

I have a problem. I installed Hadoop on a 2-node cluster and ran the
WordCount example. It takes about 20 seconds to process a 1.5MB text file.
We want to use Map/Reduce in real time (interactively, per user request).
A user can't wait 20 seconds for a response; that is too long. Is it
possible to reduce the run time of a Map/Reduce job? Or maybe I
misunderstand something?

BR,
Igor Babkin, Mifors.com


  • Madhu phatak at Feb 3, 2011 at 10:58 am
    Most Hadoop use cases involve processing large data sets. In real-time
    applications the data provided by the user is relatively small, and each
    Map/Reduce job pays a fixed startup cost (job scheduling and task JVM
    launch) that dwarfs the actual work on tiny inputs, so Hadoop is not
    advised there.

Discussion Overview
group: common-user
categories: hadoop
posted: Feb 1, '11 at 4:32p
active: Feb 3, '11 at 10:58a
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop
