Hey,

is there an easy way to copy the content between 2 unique keywords in a .txt file?

example.txt

1, 2, 3, 4
#keyword1
3, 4, 5, 6
2, 3, 4, 5
#keyword2
4, 5, 6 ,7

Thank you very much


  • Steven D'Aprano at Sep 10, 2015 at 12:10 pm

    On Thu, 10 Sep 2015 09:18 pm, Gerald wrote:


    > Hey,
    >
    > is there an easy way to copy the content between 2 unique keywords in a
    > .txt file?
    >
    > example.txt
    >
    > 1, 2, 3, 4
    > #keyword1
    > 3, 4, 5, 6
    > 2, 3, 4, 5
    > #keyword2
    > 4, 5, 6 ,7
    >
    > Thank you very much



    Copy in what sense? Write to another file, or just copy to memory?


    Either way, your solution will look something like this:


    * read each line from the input file until you reach the first keyword;
    * as soon as you see the first keyword, switch to "copy mode" and start
      copying lines in whatever way you feel is best;
    * when you see the second keyword, stop.




    E.g.


    with open("input.txt") as f:
        # Skip lines until the first keyword.
        for line in f:
            if line == "START\n":
                break
        # Instead of copying, I'll just print the lines. That's sort of a copy.
        for line in f:  # This picks up where the previous for loop ended.
            if line == "STOP\n":
                break
            print(line, end="")  # line already ends with a newline
        # If you like, you can just finish now.
        # Or, we can read the rest of the lines.
        for line in f:  # Continue from just after the STOP keyword.
            pass  # This is a waste of time...
    print("Done!")






    --
    Steven
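
If "copy" means writing to another file, the same skip-then-copy loop can write the lines instead of printing them. The following is only a minimal sketch under some assumptions: each keyword sits alone on its own line, and the function name copy_between and the file names in the usage note are hypothetical, not taken from the thread.

```python
def copy_between(src_path, dst_path, start, stop):
    """Copy the lines strictly between the start and stop marker lines.

    Returns the number of lines written. Assumes each keyword occupies a
    whole line; trailing whitespace on the marker lines is ignored.
    """
    copied = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        # Skip everything up to and including the start keyword.
        for line in src:
            if line.rstrip() == start:
                break
        # Copy until the stop keyword (or end of file if it never appears).
        for line in src:
            if line.rstrip() == stop:
                break
            dst.write(line)
            copied += 1
    return copied
```

A call might look like copy_between("example.txt", "copy.txt", "#keyword1", "#keyword2"). If the second keyword is missing, this sketch silently copies everything after the first keyword.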
  • Jussi Piitulainen at Sep 10, 2015 at 1:47 pm

    Gerald writes:


    > Hey,
    >
    > is there an easy way to copy the content between 2 unique keywords in a
    > .txt file?
    >
    > example.txt
    >
    > 1, 2, 3, 4
    > #keyword1
    > 3, 4, 5, 6
    > 2, 3, 4, 5
    > #keyword2
    > 4, 5, 6 ,7

    Depending on your notion of easy, you may or may not like itertools.
    The following code gets you the first keyword and the lines between but
    consumes the second keyword. If I needed more control, I'd probably
    write what Steven D'Aprano wrote but as a generator function, to get the
    flexibility of deciding separately what kind of copy I want in the end.


    And I'd be anxious about the possibility that the second keyword is not
    there in the input at all. Steven's code and mine simply take every line
    after the first keyword in that case. Worth a comment in the code, if
    not an exception. Depends.


    Code:


    from itertools import dropwhile, takewhile
    from sys import stdin


    def notbeg(line): return line != '#keyword1\n'
    def notend(line): return line != '#keyword2 \n' # sic!


    if __name__ == '__main__':
        print(list(takewhile(notend, dropwhile(notbeg, stdin))))


    Output with your original mail as input in stdin:


    ['#keyword1\n', '3, 4, 5, 6\n', '2, 3, 4, 5\n']
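
The generator-function idea mentioned in this post, with a missing keyword reported as an exception, could look roughly like the sketch below. The name lines_between and the choice of ValueError are assumptions made for illustration; the thread does not prescribe either. Marker lines are compared after stripping trailing whitespace, so a stray space after a keyword does not matter.

```python
def lines_between(lines, start, stop):
    """Yield the lines strictly between the start and stop marker lines.

    Raises ValueError if either marker is missing. Because this is a
    generator, a missing stop marker is only detected after all the
    remaining lines have already been yielded.
    """
    it = iter(lines)
    # Skip up to and including the start marker.
    for line in it:
        if line.rstrip() == start:
            break
    else:
        # The loop ran out of lines without hitting the start marker.
        raise ValueError("start keyword not found: %r" % start)
    # Yield lines until the stop marker, which is consumed but not yielded.
    for line in it:
        if line.rstrip() == stop:
            return
        yield line
    raise ValueError("stop keyword not found: %r" % stop)


sample = [
    "1, 2, 3, 4\n",
    "#keyword1\n",
    "3, 4, 5, 6\n",
    "2, 3, 4, 5\n",
    "#keyword2 \n",
    "4, 5, 6 ,7\n",
]
print(list(lines_between(sample, "#keyword1", "#keyword2")))
# prints ['3, 4, 5, 6\n', '2, 3, 4, 5\n']
```

The caller decides what kind of copy to make: collect the lines in a list, write them to a file, or process them one by one. Any iterable of lines works as input, including an open file or sys.stdin.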
  • Vlastimil Brom at Sep 10, 2015 at 2:33 pm

    2015-09-10 13:18 GMT+02:00 Gerald <schweiger.gerald@gmail.com>:
    > Hey,
    >
    > is there an easy way to copy the content between 2 unique keywords in a .txt file?
    >
    > example.txt
    >
    > 1, 2, 3, 4
    > #keyword1
    > 3, 4, 5, 6
    > 2, 3, 4, 5
    > #keyword2
    > 4, 5, 6 ,7
    >
    > Thank you very much

    Hi,
    just to add another possible approach: you can use a regular expression
    search for this task, e.g.
    (after you have read the text content into an input string):

    >>> import re
    >>> input_txt = """1, 2, 3, 4
    ... #keyword1
    ... 3, 4, 5, 6
    ... 2, 3, 4, 5
    ... #keyword2
    ... 4, 5, 6 ,7"""
    >>> re.findall(r"(?s)(#keyword1)(.*?)(#keyword2)", input_txt)
    [('#keyword1', '\n3, 4, 5, 6\n2, 3, 4, 5\n', '#keyword2')]


    As with the other approaches, you might need to specify the details
    for specific cases (no keywords, only one of them, repeated keywords,
    possibly in a different order, overlapping or "crossed"), the handling
    of newlines, etc.


    hth,
        vbr
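
Two of the details mentioned in this post, keywords that contain regex metacharacters and input where the keywords are absent, can be handled with re.escape and re.search. This is a hedged sketch only: the function name text_between is hypothetical, and returning None for "not found" is one possible choice among several.

```python
import re


def text_between(text, start, stop):
    """Return the text between the first start keyword and the next stop
    keyword, or None if the pair does not occur in that order."""
    # re.escape makes the keywords match literally, even if they contain
    # characters such as '.', '[' or '*'.
    pattern = re.escape(start) + r"(.*?)" + re.escape(stop)
    # re.DOTALL lets '.' match newlines, like the (?s) flag used above.
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1) if match else None


input_txt = """1, 2, 3, 4
#keyword1
3, 4, 5, 6
2, 3, 4, 5
#keyword2
4, 5, 6 ,7"""
print(repr(text_between(input_txt, "#keyword1", "#keyword2")))
# prints '\n3, 4, 5, 6\n2, 3, 4, 5\n'
```

The non-greedy (.*?) stops at the first stop keyword after the start keyword; repeated or crossed keywords would still need a policy of their own.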
  • Jussi Piitulainen at Sep 10, 2015 at 3:48 pm

    Vlastimil Brom writes:


    > just to add another possible approach, you can use regular expression

    Now you have three problems: whatever the two problems are that you are
    alleged to have whenever you decide to use regular expressions for
    anything at all, plus all the people piling on you to tell you that a
    Jamie Zawinski once said that whenever you decide to use regular
    expressions to solve a problem, you end up with two problems.


    :)
  • Christian Gollwitzer at Sep 10, 2015 at 5:29 pm

    On 10.09.15 at 13:18, Gerald wrote:
    > Hey,
    >
    > is there an easy way to copy the content between 2 unique keywords in a .txt file?
    >
    > example.txt
    >
    > 1, 2, 3, 4
    > #keyword1
    > 3, 4, 5, 6
    > 2, 3, 4, 5
    > #keyword2
    > 4, 5, 6 ,7

    If "copying" means copying it to another file, and you are not obliged
    to use Python, awk is hard to beat for this:


    Apfelkiste:Tests chris$ cat kw.txt
    1, 2, 3, 4
    #keyword1
    3, 4, 5, 6
    2, 3, 4, 5
    #keyword2
    4, 5, 6 ,7
    Apfelkiste:Tests chris$ awk '/keyword1/,/keyword2/' kw.txt
    #keyword1
    3, 4, 5, 6
    2, 3, 4, 5
    #keyword2


    Consequently, awk '/keyword1/,/keyword2/' kw.txt > kw_copy.txt


    would write it out to kw_copy.txt


    Beware that the text between each pair of slashes is a regexp, so if
    your keywords contain metacharacters, you need to escape them.


      Christian
  • Alister at Sep 10, 2015 at 7:41 pm

    On Thu, 10 Sep 2015 12:11:55 -0700, wxjmfauth wrote:


    > >>> s = """1, 2, 3, 4
    > ... #keyword1
    > ... 3, 4, 5, 6
    > ... 2, 3, 4, 5
    > ... #keyword2
    > ... 4, 5, 6 ,7"""
    > >>> s[s.find('keyword1') + len('keyword1'):s.find('keyword2') - 1]
    > '\n3, 4, 5, 6\n2, 3, 4, 5\n'
    > >>> # or
    > >>> s[s.find('keyword1') + len('keyword1') + 1:s.find('keyword2') - 2]
    > '3, 4, 5, 6\n2, 3, 4, 5'

    split works well as a simple one-liner (well, two lines if you include
    the string setup):

    >>> a = "crap word1 more crap word1 again word2 still more crap"
    >>> a.split('word1', 1)[1].split('word2')[0]
    ' more crap word1 again '






    --
    All bad precedents began as justifiable measures.
       -- Gaius Julius Caesar, quoted in "The Conspiracy of
          Catiline", by Sallust
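
The split one-liner above raises IndexError when the first keyword is missing and returns everything after it when the second keyword is missing. A variant using str.partition makes both cases explicit. This is only a sketch: the function name between is hypothetical, and returning None for a missing keyword is an assumption, not something proposed in the thread.

```python
def between(text, start, stop):
    """Return the text between the first `start` and the next `stop`,
    or None if either keyword is missing."""
    # partition returns (before, separator, after); the separator slot is
    # an empty string when the keyword was not found.
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    middle, found_stop, _ = rest.partition(stop)
    if not found_stop:
        return None
    return middle


a = "crap word1 more crap word1 again word2 still more crap"
print(repr(between(a, "word1", "word2")))
# prints ' more crap word1 again '
```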

Discussion Overview
group: python-list
category: python
posted: Sep 10, '15 at 11:18a
active: Sep 10, '15 at 7:41p
posts: 7
users: 6
website: python.org