Dear all,

I am going to work on a project that involves working with CUDA in a
Hadoop environment.

I have been working on Hadoop platforms (Hadoop, Hive, HBase, MapReduce)
for the past 8 months.

If anyone has working experience with this, or pointers to the basic
steps (a basic introduction, configuring and running CUDA programs in a
Hadoop cluster), any white papers, or any other helpful information,
please let me know through links or materials.

I would be grateful for any help.



Thanks & Best Regards

Adarsh Sharma


  • Michael Segel at Feb 9, 2011 at 1:55 pm
    First, CUDA means C/C++ on the GPU, so if you're going to do M/R (which is Java) you will need to bone up on your JNI.
    Second, I'd make sure you have written some CUDA-enabled code first and tested it outside of the M/R framework.

    Beyond that, I'd say it is still leading edge.
    Date: Wed, 9 Feb 2011 18:38:41 +0530
    From: adarsh.sharma@orkash.com
    To: common-user@hadoop.apache.org
    Subject: CUDA on Hadoop

  • Harsh J at Feb 9, 2011 at 1:59 pm
    You can check out this project, which did some work on Hama+CUDA:
    http://code.google.com/p/mrcl/

    --
    Harsh J
    www.harshj.com
  • Adarsh Sharma at Feb 9, 2011 at 2:36 pm
    Thanks Harsh. I found the link below, which is a good starting point for
    some practical knowledge.

    http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_2.2_--_Running_C%2B%2B_Programs_on_Hadoop

    But would the Hama project be useful for building a sort of analysis
    engine that analyzes terabytes of data in Hadoop HDFS?



    Best Regards

    Adarsh Sharma


  • Steve Loughran at Feb 9, 2011 at 2:46 pm

    Amazon lets you bring up a Hadoop cluster on machines with GPUs you can
    code against, but I haven't heard of anyone using it. The big issue is
    bandwidth; it just doesn't make sense for a classic "scan through the
    logs" kind of problem as the disk:GPU bandwidth ratio is even worse than
    disk:CPU.

    That said, if you were doing something that involved a lot of compute on
    a block of data (e.g. rendering tiles in a map), this could work.
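
A rough way to see the bandwidth argument is to add up read time and compute time for each kind of job. The sketch below is only illustrative: the function names and every throughput figure are made-up assumptions, not measurements, and it ignores host-to-GPU transfer costs and any overlap of I/O with compute.

```python
def job_seconds(data_gb, disk_gb_per_s, compute_gb_per_s):
    """Rough job time when reading and computing do not overlap."""
    read_time = data_gb / disk_gb_per_s
    compute_time = data_gb / compute_gb_per_s
    return read_time + compute_time


def gpu_speedup(data_gb, disk, cpu_rate, gpu_rate):
    """Ratio of CPU job time to GPU job time for the same data and disk."""
    return job_seconds(data_gb, disk, cpu_rate) / job_seconds(data_gb, disk, gpu_rate)


# All rates are hypothetical, in GB of input processed per second.
# Log scan: the CPU already consumes data faster than the disk delivers it.
scan_speedup = gpu_speedup(100, disk=0.1, cpu_rate=1.0, gpu_rate=20.0)

# Compute-heavy job: the CPU is far slower than the disk, so the GPU helps.
heavy_speedup = gpu_speedup(100, disk=0.1, cpu_rate=0.005, gpu_rate=0.1)

print(f"log scan speedup:      {scan_speedup:.2f}x")
print(f"compute-heavy speedup: {heavy_speedup:.2f}x")
```

With these assumed numbers the log scan barely speeds up because disk time dominates, while the compute-heavy job gains roughly an order of magnitude.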
  • He Chen at Feb 9, 2011 at 5:13 pm
    Hi Sharma

    I have some experience working with hybrid Hadoop and GPU setups. Our
    group has tested CUDA performance on Hadoop clusters. We obtained a 20x
    speedup and saved up to 95% of power consumption in some
    computation-intensive test cases.

    You can parallelize your Java code using JCUDA, an API that lets you
    call CUDA from your Java code.

    Chen
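
One possible reading of how the two figures relate, offered purely as an assumption since the post does not say how power was measured: energy is power multiplied by time, so if a GPU-equipped node drew about the same power as a CPU-only node, a run that is 20 times shorter would use 1/20 of the energy. The function and parameter names below are hypothetical.

```python
def energy_saving(speedup, power_ratio=1.0):
    """Fraction of energy saved when a job runs `speedup` times faster.

    power_ratio is (accelerated power draw) / (baseline power draw).
    The default of 1.0 is an assumption, not a measured value.
    """
    # energy = power * time, so relative energy = power_ratio / speedup
    return 1.0 - power_ratio / speedup


saving = energy_saving(20)
print(f"energy saved at equal power draw: {saving:.0%}")
```

If the accelerated node drew more power than the baseline, the saving would be smaller; for example, `energy_saving(20, power_ratio=2.0)` gives a 90% saving.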
  • He Chen at Feb 9, 2011 at 5:32 pm
    Hi sharma

    I have shared our slides about CUDA performance on Hadoop clusters. Feel
    free to modify them, but please mention the copyright!

    Chen
  • Adarsh Sharma at Feb 10, 2011 at 7:29 am

    Thanks Chen. I am looking for white papers on this or related topics.
    I think no one has written a white paper on this topic, or am I wrong?

    However, your PPT is very nice.
    Thanks once again.

    Adarsh
  • Lance Norskog at Feb 10, 2011 at 9:19 am
    If you want to use Python, one of the Py+CUDA projects generates CUDA
    C from the Python bytecode. You don't have to write any C. I don't
    remember which project it is.

    This lets you debug the CUDA code in isolation, then run it under
    Hadoop streaming.
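
A minimal sketch of that workflow, assuming a hypothetical record format of `key<TAB>v1 v2 ...`: the compute step lives in its own function so it can be checked locally, and the same function is then driven from standard input, which is how Hadoop Streaming feeds a mapper. Here `heavy_compute` is a plain-Python placeholder standing in for whatever GPU kernel a Py+CUDA project would provide; all names are illustrative.

```python
import sys


def heavy_compute(values):
    """Placeholder for the GPU kernel: sum of squares, computed on the CPU."""
    return sum(v * v for v in values)


def map_lines(lines):
    """Turn 'key<TAB>v1 v2 ...' records into 'key<TAB>result' records."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank input lines
        key, _, payload = line.partition("\t")
        values = [float(token) for token in payload.split()]
        yield f"{key}\t{heavy_compute(values)}"


def main():
    """Mapper entry point: read records from stdin, write results to stdout."""
    for record in map_lines(sys.stdin):
        print(record)


# Local check in isolation, with no Hadoop involved (sample data is made up).
sample = ["tile-1\t1 2 3", "", "tile-2\t0.5"]
results = list(map_lines(sample))
print(results)
```

When the script is deployed as a streaming mapper, the local check at the bottom would be replaced by a call to `main()`.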

    --
    Lance Norskog
    goksron@gmail.com
  • Steve Loughran at Feb 10, 2011 at 11:39 am

    On 09/02/11 17:31, He Chen wrote:
    I shared our slides about CUDA performance on Hadoop clusters. Feel free to
    modify them, but please mention the copyright!

    This is nice. If you put it online, you should link to it from the
    Hadoop wiki pages; maybe start a hadoop+cuda page and refer to it.
  • Adarsh Sharma at Feb 10, 2011 at 11:43 am

    Yes, this will be very helpful for others too. But this much
    information is not sufficient; I need more.



    Best Regards

    Adarsh Sharma
  • He Chen at Feb 10, 2011 at 5:39 pm
    Thank you, Steve Loughran. I just created a new page on the Hadoop wiki.
    However, how can I create a new document page on the Hadoop wiki?

    Best wishes

    Chen
  • Milind Bhandarkar at Feb 9, 2011 at 6:07 pm
    My ex-colleague, Sanjiv Satoor (currently at NVIDIA in Pune, India), and I have had some discussions about it. He is (obviously) very interested. Please contact him for more info.

    - milind
    ---
    Milind Bhandarkar
    mbhandarkar@linkedin.com

Discussion Overview
group: common-user @
categories: hadoop
posted: Feb 9, '11 at 1:04p
active: Feb 10, '11 at 5:39p
posts: 13
users: 7
website: hadoop.apache.org...
irc: #hadoop
