The buildfarm is now going on six years old (time flies when you're
having fun!) and the database is now rather large - around 76GB on disk.
We'd like to reduce that quite a lot, especially by purging out the logs
of old builds. And while the old data isn't publicly accessible, it has
occasionally been used to run specialised queries to research particular
issues. It's also arguably a useful historical resource that shouldn't
be lightly abandoned.

I'd like to get an idea of what the community regards as a reasonable
amount of data to keep online and readily handy. Six months' worth? A
year? Two years? Is it worth keeping logs of error stages longer than
successful stages? If so, what should the periods be?

One of the things that I'd like to be able to do is full text search
(FTS) on the logs. Part of our plan is to move to a much more modern
version of Postgres. Keeping the logs to a reasonable size may allow us
to provide FTS, although I haven't discussed that part with Josh Drake
yet, and as it's hosted at CMD he does get a say :-)

cheers

andrew

  • Tom Lane at Jul 16, 2010 at 12:13 am

    Andrew Dunstan writes:
    > The buildfarm is now going on six years old (time flies when you're
    > having fun!) and the database is now rather large - around 76GB on disk.
    > We'd like to reduce that quite a lot, especially by purging out the logs
    > of old builds. And while the old data isn't publicly accessible, it has
    > occasionally been used to run specialised queries to research particular
    > issues. It's also arguably a useful historical resource that shouldn't
    > be lightly abandoned.

    As long as the historical data is kept somewhere, I agree that it
    doesn't need to be readily available on-line. 10GB a year is not a lot
    of data these days, so it seems like we ought to be able to archive it
    indefinitely; but I can see that keeping it available on the web might
    run into some money. (You could also argue that there's no need to
    archive more than say five years back, but I think that's a different
    discussion.)

    > I'd like to get an idea of what the community regards as a reasonable
    > amount of data to keep online and readily handy. Six months' worth? A
    > year? Two years? Is it worth keeping logs of error stages longer than
    > successful stages? If so, what should the periods be?

    Six months is probably plenty, really, especially if that means we can
    make the data more available than it is now. I'm not convinced that
    "successful" builds should be purged more quickly, as there's often
    reason to look for warnings, funny events in the postmaster log, etc.

    > One of the things that I'd like to be able to do is FTS on the logs.

    +1. +10 even. I think this'd be a quantum jump in the usefulness of
    the log archives. I frequently wonder things like "what other machines
    are showing this warning", and right now it's impractical to research
    that.

    regards, tom lane
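
The retention question raised in this thread (how long to keep logs online, and whether failed stages deserve a longer period than successful ones) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `StageLog` fields are hypothetical and do not reflect the actual buildfarm schema, and the two periods are placeholders, since the thread does not settle on values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical retention periods; the thread proposes options but decides nothing.
KEEP_SUCCESS = timedelta(days=183)  # roughly six months
KEEP_FAILURE = timedelta(days=365)  # roughly one year


@dataclass
class StageLog:
    # Hypothetical field names, not the real buildfarm tables.
    machine: str
    stage: str
    finished: datetime
    failed: bool


def should_purge(log: StageLog, now: datetime) -> bool:
    """Return True when the log is older than the period for its outcome."""
    keep = KEEP_FAILURE if log.failed else KEEP_SUCCESS
    return now - log.finished > keep
```

Setting both constants to the same value gives the uniform policy Tom Lane argues for, where logs of successful stages are not purged any sooner than logs of failed ones.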

Discussion Overview
group: pgsql-hackers @
categories: postgresql
posted: Jul 15, 2010 at 11:57 pm
active: Jul 16, 2010 at 12:13 am
posts: 2
users: 2
website: postgresql.org...
irc: #postgresql