Grokbase Groups Pig user March 2011
Hi,
I have a big dataset that mainly contains URLs and their HTML
contents. Given a regular expression, I want to get 'x' URLs
matching the pattern. I have written a UDF to filter
URLs based on the regular expression. Is there a way in a Pig script to
limit the number of results to 'x'? ('x' is a configurable value.)

Thanks,
Souri


  • Eric Lubow at Mar 9, 2011 at 7:08 pm
    Are you looking for LIMIT? Something like:
    -- "urls" stands in for your loaded relation (name assumed)
    udf_regex_results = FILTER urls BY my_UDF(...);
    limited_regex_results = LIMIT udf_regex_results 10; -- 10 is configurable

    -e
    Eric Lubow e: eric.lubow@gmail.com w: eric.lubow.org
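
To make the limit configurable from outside the script, one option is Pig's parameter substitution. The following is only a minimal sketch under stated assumptions: the input path `pages.tsv`, the two-column schema, the relation names, and the sample pattern are all hypothetical, and the built-in `MATCHES` operator stands in for the custom filter UDF mentioned in the thread.

```pig
-- Hypothetical input: tab-separated records of (url, html).
pages = LOAD 'pages.tsv' AS (url:chararray, html:chararray);

-- MATCHES is Pig's built-in regex operator and must match the whole string.
-- A custom filter UDF (my_UDF in the thread) could be used here instead.
matching = FILTER pages BY url MATCHES '.*/news/.*';

-- Keep only the URL column.
urls = FOREACH matching GENERATE url;

-- $x is replaced by parameter substitution before the script runs.
limited = LIMIT urls $x;

DUMP limited;
```

Assuming the script is saved as `limit_urls.pig` (a made-up name), the value of `x` can be supplied on the command line with `pig -param x=10 limit_urls.pig`. Note that LIMIT does not guarantee which tuples come back unless it directly follows an ORDER BY.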

Discussion Overview
group: user @
categories: pig, hadoop
posted: Mar 9, '11 at 6:59p
active: Mar 9, '11 at 7:08p
posts: 2
users: 2
website: pig.apache.org