For example, the long string is 'abcabc' and the given string is
'abc', then 'abc' appears 2 times in 'abcabc'. Currently, I am calling
'find()' multiple times to figure out how many times a given string
appears in a long string. I'm wondering if there is a function in
python which can directly return this information.

  • Stephen Hansen at Oct 24, 2009 at 3:21 am

    On Fri, Oct 23, 2009 at 7:31 PM, Peng Yu wrote:

    > For example, the long string is 'abcabc' and the given string is
    > 'abc', then 'abc' appears 2 times in 'abcabc'. Currently, I am calling
    > 'find()' multiple times to figure out how many times a given string
    > appears in a long string. I'm wondering if there is a function in
    > python which can directly return this information.
    >>> 'abcabc'.count('abc')
    2
    >>> print(''.count.__doc__)
    S.count(sub[, start[, end]]) -> int

    Return the number of non-overlapping occurrences of substring sub in
    string S[start:end]. Optional arguments start and end are interpreted
    as in slice notation.

    HTH,

    --S
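
As the docstring above notes, str.count() counts non-overlapping occurrences and accepts optional start and end offsets. The following is a minimal sketch of both points. The helper count_overlapping is a hypothetical name, not a standard library function; it uses the repeated find() approach from the original question, which is still the simple way to count overlapping matches.

```python
def count_overlapping(text, sub):
    """Count occurrences of sub in text, allowing overlaps (hypothetical helper)."""
    if not sub:
        raise ValueError("sub must be non-empty")
    total = 0
    pos = text.find(sub)
    while pos != -1:
        total += 1
        # Advance by one character so overlapping matches are also found.
        pos = text.find(sub, pos + 1)
    return total


# str.count() resumes searching after the end of each match.
non_overlapping = "aaaa".count("aa")

# The find() loop sees matches at offsets 0, 1 and 2.
overlapping = count_overlapping("aaaa", "aa")

# With start=1 only 'bcabc' is searched.
sliced = "abcabc".count("abc", 1)
```

Here non_overlapping is 2, overlapping is 3, and sliced is 1.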
  • Erik Max Francis at Oct 24, 2009 at 3:22 am

    Peng Yu wrote:
    > For example, the long string is 'abcabc' and the given string is
    > 'abc', then 'abc' appears 2 times in 'abcabc'. Currently, I am calling
    > 'find()' multiple times to figure out how many times a given string
    > appears in a long string. I'm wondering if there is a function in
    > python which can directly return this information.
    The .count string method.

    --
    Erik Max Francis && max at alcyone.com && http://www.alcyone.com/max/
    San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis
    Diplomacy and defense are not substitutes for one another. Either
    alone would fail. -- John F. Kennedy, 1917-1963
  • Gerard Flanagan at Oct 26, 2009 at 10:46 am

    Peng Yu wrote:
    > For example, the long string is 'abcabc' and the given string is
    > 'abc', then 'abc' appears 2 times in 'abcabc'. Currently, I am calling
    > 'find()' multiple times to figure out how many times a given string
    > appears in a long string. I'm wondering if there is a function in
    > python which can directly return this information.
    re.findall?
    >>> import re
    >>> patt = re.compile('abc')
    >>> len(patt.findall('abcabc'))
    2

    For groups of non-overlapping substrings, tested only as far as you see:

    8<----------------------------------------------------------------------

    import re
    from collections import defaultdict

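    def count(text, *args):
        """
        Count non-overlapping occurrences of each substring in text.

        >>> ret = count('abcabc', 'abc')
        >>> ret['abc']
        2
        >>> ret = count('xabcxabcx', 'abc', 'x')
        >>> ret['abc']
        2
        >>> ret['x']
        3
        >>> ret = count('abcabc', 'abc', 'cab')
        >>> ret['abc']
        2
        >>> ret['cab']
        0
        >>> ret = count('abcabc', 'abc', 'ab')
        >>> ret['abc']
        2
        >>> ret['ab']
        0
        """
        # Escape each substring, then reverse-sort so that a longer string
        # is tried before any of its prefixes ('abc' before 'ab').
        args = sorted(map(re.escape, args), reverse=True)
        pattern = re.compile('|'.join(args))
        result = defaultdict(int)

        def callback(match):
            # Tally the matched text and leave it unchanged.
            matched = match.group(0)
            result[matched] += 1
            return matched

        pattern.sub(callback, text)
        return result


    if __name__ == '__main__':
        import doctest
        doctest.testmod()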
    8<----------------------------------------------------------------------
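
The reverse sort in count() matters because Python's re alternation tries the alternatives from left to right and takes the first one that matches, not the longest. A minimal sketch of that behaviour follows; the variable names are illustrative only.

```python
import re

text = "abcabc"

# 'ab' is tried first and wins, so 'abc' is never reported.
short_first = re.findall("ab|abc", text)

# Reverse-sorting puts 'abc' ahead of its prefix 'ab'.
alternatives = sorted(["ab", "abc"], reverse=True)
long_first = re.findall("|".join(alternatives), text)
```

With the alternatives in their original order the result is ['ab', 'ab']; after the reverse sort it is ['abc', 'abc'].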
  • Alex23 at Oct 27, 2009 at 1:10 am

    Gerard Flanagan wrote:
    > def count(text, *args):
    Other than the ability to handle multiple substrings, you do realise
    you've effectively duplicated str.count()?
  • Gerard Flanagan at Oct 27, 2009 at 7:31 am

    alex23 wrote:
    > Gerard Flanagan wrote:
    >> def count(text, *args):
    > Other than the ability to handle multiple substrings, you do realise
    > you've effectively duplicated str.count()?
    I realise that calling this count function with a single argument would
    be functionally identical to calling str.count(), yes. But I can imagine
    the situation of wanting to find multiple (disjoint) substrings. Is
    there a reason for preferring multiple calls to str.count() in such a
    case? Or is there a more obvious approach?
    Gerard Flanagan wrote:
    > re.findall?
    Forget that, that was stupid.
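
One practical difference between the two approaches discussed here: separate str.count() calls count each substring independently, so a match of one substring may overlap a match of another, whereas a single regex pass consumes the text as it goes. The sketch below uses the 'abc' / 'cab' case from the earlier post. Both function names are hypothetical, and the single-pass version is a compact variant of the regex idea, not the code posted above.

```python
import re
from collections import Counter


def count_each(text, *subs):
    """Call str.count() once per substring (hypothetical helper)."""
    return {sub: text.count(sub) for sub in subs}


def count_single_pass(text, *subs):
    """Count matches in one regex scan; matched text is consumed as it goes."""
    escaped = sorted(map(re.escape, subs), reverse=True)
    pattern = re.compile("|".join(escaped))
    return Counter(match.group(0) for match in pattern.finditer(text))


independent = count_each("abcabc", "abc", "cab")
disjoint = count_single_pass("abcabc", "abc", "cab")
```

Here independent reports one 'cab' (at offset 2, overlapping both 'abc' matches), while disjoint reports none, because both 'abc' matches have already consumed those characters. Which result is wanted depends on the problem.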
  • Gabriel Genellina at Oct 27, 2009 at 8:09 am
    On Tue, 27 Oct 2009 04:31:22 -0300, Gerard Flanagan <grflanagan at gmail.com>
    wrote:
    > alex23 wrote:
    >> Gerard Flanagan wrote:
    >>> def count(text, *args):
    >> Other than the ability to handle multiple substrings, you do realise
    >> you've effectively duplicated str.count()?
    > I realise that calling this count function with a single argument would
    > be functionally identical to calling str.count(), yes. But I can imagine
    > the situation of wanting to find multiple (disjoint) substrings. Is
    > there a reason for preferring multiple calls to str.count() in such a
    > case? Or is there a more obvious approach?
    There is a more efficient algorithm (Aho-Corasick) which computes all
    occurrences of a set of substrings inside a given string at once. It
    *should* be faster than repeatedly calling str.find for every substring,
    and faster than using regular expressions too. (Note that you get not
    only the count of occurrences, but their positions too.)
    I've seen a couple of implementations for Python; whether they're
    *actually* faster remains to be determined...

    --
    Gabriel Genellina
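
For readers unfamiliar with Aho-Corasick, here is a compact pure-Python sketch of the idea: build a trie of the substrings, add failure links, then scan the text once. It is not one of the implementations mentioned above, and it makes no claim about speed relative to str.count() or re. The function name aho_corasick_find is hypothetical. Unlike the regex approach, it reports every match, including overlapping ones.

```python
from collections import defaultdict, deque


def aho_corasick_find(text, patterns):
    """Return {pattern: [start offsets]} for every match, overlaps included."""
    # Trie stored as parallel lists, indexed by node number (0 is the root).
    children = [{}]   # outgoing edges per node
    fail = [0]        # failure link per node
    outputs = [[]]    # patterns that end at each node

    for pattern in set(patterns):
        if not pattern:
            continue  # empty patterns are ignored in this sketch
        node = 0
        for ch in pattern:
            if ch not in children[node]:
                children.append({})
                fail.append(0)
                outputs.append([])
                children[node][ch] = len(children) - 1
            node = children[node][ch]
        outputs[node].append(pattern)

    # Breadth-first pass: a node's failure link points to the longest proper
    # suffix of its string that is also a path from the root.
    queue = deque(children[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in children[node].items():
            link = fail[node]
            while link and ch not in children[link]:
                link = fail[link]
            fail[child] = children[link].get(ch, 0)
            # Inherit the patterns that end at the suffix node.
            outputs[child] = outputs[child] + outputs[fail[child]]
            queue.append(child)

    # Single scan of the text, following failure links on mismatches.
    found = defaultdict(list)
    node = 0
    for index, ch in enumerate(text):
        while node and ch not in children[node]:
            node = fail[node]
        node = children[node].get(ch, 0)
        for pattern in outputs[node]:
            found[pattern].append(index - len(pattern) + 1)
    return dict(found)


positions = aho_corasick_find("abcabc", ["abc", "cab", "ab"])
counts = {pattern: len(starts) for pattern, starts in positions.items()}
```

For this input, positions maps 'abc' to [0, 3], 'ab' to [0, 3] and 'cab' to [2], so the counts are 2, 2 and 1.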
  • Alex23 at Oct 27, 2009 at 8:37 am

    Gerard Flanagan wrote:
    > I realise that calling this count function with a single argument would
    > be functionally identical to calling str.count(), yes. But I can imagine
    > the situation of wanting to find multiple (disjoint) substrings. Is
    > there a reason for preferring multiple calls to str.count() in such a
    > case? Or is there a more obvious approach?
    No, I totally missed that it was counting disjoint substrings :)

Discussion Overview
group: python-list
category: python
posted: Oct 24, '09 at 2:31a
active: Oct 27, '09 at 8:37a
posts: 8
users: 6
website: python.org