I have a huge mysqld.log file full of errors. I'd like to sort it by
the most common line and work from there. I went through the
manpage for sort and googled a bit, but found nothing relevant.

Here is an example of the output:
[root@ log]# tail mysqld.log
110925 11:05:35 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_summary_ad_hourly.MYI'; try to repair it
110925 11:05:35 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_summary_ad_hourly.MYI'; try to repair it
110925 12:05:28 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_intermediate_ad.MYI'; try to repair it
110925 12:05:28 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_intermediate_ad.MYI'; try to repair it
110925 12:05:28 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_intermediate_ad.MYI'; try to repair it
110925 12:05:28 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_summary_ad_hourly.MYI'; try to repair it
110925 13:09:43 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_intermediate_ad.MYI'; try to repair it
110925 13:09:43 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_intermediate_ad.MYI'; try to repair it
110925 13:09:43 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_intermediate_ad.MYI'; try to repair it
110925 13:09:43 [ERROR] /usr/libexec/mysqld: Incorrect key file for
table './ox_data_summary_ad_hourly.MYI'; try to repair it
[root@ log]# wc -l mysqld.log
20686 mysqld.log
[root@ log]# cat mysqld.log | grep ERROR | wc -l
20332
[root@ log]#


Is there a way to get the most common (unique) lines of the file?


By the way, I'm not sure if this is RHEL or CentOS, or which version:
[root@ log]# uname -a
Linux example.com 2.6.18-194.32.1.el5xen #1 SMP Wed Jan 5 18:44:24 EST
2011 x86_64 x86_64 x86_64 GNU/Linux
[root@ log]# uname -o
GNU/Linux
[root@ log]#

I assume that it is one of these, as Yum is installed. How would I find out?

Thanks!

  • John R Pierce at Sep 25, 2011 at 3:06 pm

    On 09/25/11 11:51 AM, Dotan Cohen wrote:
    ...
    110925 13:09:43 [ERROR] /usr/libexec/mysqld: Incorrect key file for
    table './ox_data_summary_ad_hourly.MYI'; try to repair it
    [root@ log]# wc -l mysqld.log
    20686 mysqld.log
    [root@ log]# cat mysqld.log | grep ERROR | wc -l
    20332
    [root@ log]#


    Is there a way to get the most common (unique) lines of the file?
    sort -k 3 | uniq -f 2


    which will sort starting at field 3, and then print lines that are
    unique, skipping the first 2 fields, where fields by default are blank
    separated.
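
A minimal sketch of what that pipeline does, run on a made-up three-entry sample rather than the real log. The message text is hypothetical, each entry is assumed to sit on a single line, and LC_ALL=C is added only to make the ordering predictable:

```shell
# Hypothetical sample: two entries share a message, one differs.
sample='110925 11:05:35 [ERROR] key file a
110925 12:05:28 [ERROR] key file b
110925 13:09:43 [ERROR] key file a'

# Sort from field 3 onward, then drop repeats while ignoring fields 1-2
# (the date and time), so entries differing only by timestamp collapse.
deduped=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 3 | LC_ALL=C uniq -f 2)
printf '%s\n' "$deduped"
```

Each distinct message is printed once, in sorted order. Nothing here counts how often a message occurred.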



    --
    john r pierce N 37, W 122
    santa cruz ca mid-left coast
  • Dotan Cohen at Sep 25, 2011 at 3:18 pm

    On Sun, Sep 25, 2011 at 22:06, John R Pierce wrote:
    Is there a way to get the most common (unique) lines of the file?
    sort -k 3 | uniq -f 2


    which will sort starting at field 3, and then print lines that are
    unique, skipping the first 2 fields, where fields by default are blank
    separated.
    Thanks, John. It looks to me like this will sort alphabetically, not
    by commonness. For instance:
    ERROR b
    ERROR a
    ERROR b

    Since "ERROR b" was reported more often than "ERROR a", I would prefer
    that the output be:
    ERROR b
    ERROR a

    I'm sorry for not making that clear! Is there a good word for "most
    common" or "used most often" that would be concise in this context?

    Thanks!
  • John R Pierce at Sep 25, 2011 at 3:43 pm

    On 09/25/11 12:18 PM, Dotan Cohen wrote:
    On Sun, Sep 25, 2011 at 22:06, John R Pierce wrote:
    Is there a way to get the most common (unique) lines of the file?
    sort -k 3 | uniq -f 2


    which will sort starting at field 3, and then print lines that are
    unique, skipping the first 2 fields, where fields by default are blank
    separated.
    Thanks, John. It looks to me like this will sort alphabetically, not
    by commonness. For instance:
    ERROR b
    ERROR a
    ERROR b

    Since "ERROR b" was reported more often than "ERROR a", I would prefer
    that the output be:
    ERROR b
    ERROR a

    I'm sorry for not making that clear! Is there a good word for "most
    common" or "used most often" that would be concise in this context?

    uniq can count occurrences. will require two sorts. one to get all
    similar errors adjacent, the other to sort by count order. instead of
    using field selects, let's just clip the timestamps off up front...

    cut -c 17- | sort | uniq -c | sort -rn

    (17- means from char 17 on... I may have miscounted)
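
A small sketch of the count-and-rank pipeline on the "ERROR b / ERROR a / ERROR b" case from earlier in the thread. The sample lines are hypothetical and assume each log entry occupies one line; a timestamp such as "110925 11:05:35 " is 16 characters wide, so character 17 is where the message starts:

```shell
# Hypothetical sample: "b" is reported twice, "a" once.
sample='110925 11:05:35 [ERROR] b
110925 12:05:28 [ERROR] a
110925 13:09:43 [ERROR] b'

# 1. cut     : drop the 16-character timestamp prefix
# 2. sort    : bring identical messages together (uniq only merges adjacent lines)
# 3. uniq -c : collapse repeats and prefix each line with its count
# 4. sort -rn: order by that count, largest first
ranked=$(printf '%s\n' "$sample" | cut -c 17- | sort | uniq -c | sort -rn)
printf '%s\n' "$ranked"
```

The "[ERROR] b" line comes out first with a count of 2, followed by "[ERROR] a" with a count of 1. Against the real file, the same pipeline would presumably be started as `cut -c 17- mysqld.log | ...`, optionally ending in `| head` to show only the top few lines (the `head` step is an addition, not part of the suggestion above).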



    --
    john r pierce N 37, W 122
    santa cruz ca mid-left coast
  • Dotan Cohen at Sep 26, 2011 at 7:19 pm

    On Sun, Sep 25, 2011 at 22:43, John R Pierce wrote:
    uniq can count occurrences. will require two sorts. one to get all
    similar errors adjacent, the other to sort by count order. instead of
    using field selects, let's just clip the timestamps off up front...

    cut -c 17- | sort | uniq -c | sort -rn

    (17- means from char 17 on... I may have miscounted)
    Thank you John! That is perfect! I'm going through the uniq manpage
    now. Have a great night!
  • Frank Cox at Sep 25, 2011 at 3:10 pm

    On Sun, 25 Sep 2011 21:51:51 +0300 Dotan Cohen wrote:

    Is there a way to get the most common (unique) lines of the file?
    If you want what I think you want, a combination of cut and sort will do it.
    By the way, I'm not sure if this is RHEL or CentOS, or which version:
    I assume that it is one of these, as Yum is installed. How would I find out?
    cat /etc/redhat-release

    --
    MELVILLE THEATRE ~ Real D 3D Digital Cinema ~ www.melvilletheatre.com
    www.creekfm.com - FIFTY THOUSAND WATTS of POW WOW POWER!
  • Dotan Cohen at Sep 25, 2011 at 3:21 pm

    On Sun, Sep 25, 2011 at 22:10, Frank Cox wrote:
    Is there a way to get the most common (unique) lines of the file?
    If you want what I think you want, a combination of cut and sort will do it.
    Neither seems to have the "most common line" ability built in. I might
    have to resort to either Perl, or just attacking the logfile errors at
    random!

    cat /etc/redhat-release
    Thanks! It is more up to date than I thought!

    [root@gastricsleeve html]# cat /etc/redhat-release
    CentOS release 5.5 (Final)
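
For scripting the same check, a hedged sketch follows. The function name `show_release`, its optional root-directory argument, and the fallback to /etc/os-release (a file found on newer distributions, not on CentOS 5) are all assumptions added for illustration and were not suggested in the thread:

```shell
# Print the release string from whichever known release file is readable.
# $1 is an optional root directory prefix (hypothetical, added so the
# function can be exercised against a scratch directory).
show_release() {
  root=${1:-}
  if [ -r "$root/etc/redhat-release" ]; then
    # Present on RHEL, CentOS and related distributions
    cat "$root/etc/redhat-release"
  elif [ -r "$root/etc/os-release" ]; then
    # Fallback: PRETTY_NAME carries a human-readable name
    sed -n 's/^PRETTY_NAME=//p' "$root/etc/os-release" | tr -d '"'
  else
    echo "unknown"
    return 1
  fi
}

# Report on the running system; never abort if neither file exists.
show_release || true
```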
  • John R. Dennison at Sep 25, 2011 at 4:34 pm

    On Sun, Sep 25, 2011 at 10:21:11PM +0300, Dotan Cohen wrote:
    Thanks! It is more up to date than I thought!

    [root@gastricsleeve html]# cat /etc/redhat-release
    CentOS release 5.5 (Final)
    Actually you are 2 full point releases behind; current is 5.7. I would
    strongly suggest you update.




    John
    --
    Politics is just show business for ugly people.

    -- Jay Leno
  • Dotan Cohen at Sep 26, 2011 at 7:41 pm

    On Sun, Sep 25, 2011 at 23:34, John R. Dennison wrote:
    Actually you are 2 full point releases behind; current is 5.7. I would
    strongly suggest you update.
    Thanks. I will mention that to the sysadmin.
