I have a difficult race condition scenario that I'm trying to work out.

I do a lot of work with log files producing a lot of data. I have a Go program reading lines indefinitely from stdin and writing lines out to many separate files based on some business rules.

In order to minimise the effect of blocking I/O* and take advantage of the many CPU cores available, my current implementation has many worker goroutines that receive bytes on a channel and write them to a file.

Each worker goroutine runs a single function with two arguments: a channel to read []byte values from and an *os.File to write them to. When the channel closes, the goroutine closes its file and ends.

The main goroutine reads lines from input, peeks into the lines to figure out where they should go, and looks up a "Router" structure (a map[string]chan []byte) to find the channel to send them to. If there's no result from the map, it opens or creates the relevant file, spawns a goroutine to write to it and records the channel in the Router.

The channels may or may not be buffered in production - I haven't yet done any benchmarking.

Sounds a bit convoluted, but the implementation is simple and reliable.

Periodically I need to rotate these logs. My preferred way to do this - and one which matches other systems we run - is to move all the files to an archive directory and send SIGHUP to the process, telling it to reopen them. Due to the above implementation, I can basically just do a range over the "Router", closing the channels and deleting all the keys from the map. This means that as the "main" goroutine reads in new lines from input, the goroutines for handling data will come back organically.

Here's the race condition: the case where the main goroutine has identified the channel for some data by looking up the map, but before it can do its send, a SIGHUP handler comes along and closes all the channels in the router. The main goroutine's attempted send will cause a panic.

The only way I can think of to get around this is to take a lock on a mutex every time I read from or write to the Router map. This seems like a huge performance hit for such a small case - so much so that it'd be better to run the whole program in a single goroutine, or maybe two.

Is there another way around this while maintaining the general approach?

Hope I didn't make too much of a mess of this typing on my phone.

Thanks,

Daniel

* this is an environment with _very_ limited storage throughput / IOPS
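
For reference, a minimal sketch of the failure mode described above: once the lookup has returned a channel and that channel is then closed, the send panics. The helper name `trySend` is hypothetical and exists only to make the panic observable; it is not part of the program described in the post.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it attempts a send and reports
// whether the send panicked (which happens on a closed channel).
func trySend(ch chan []byte, line []byte) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	ch <- line
	return false
}

func main() {
	// Pretend the router lookup has already returned this channel.
	ch := make(chan []byte, 1)

	// The SIGHUP handler runs between the lookup and the send.
	close(ch)

	fmt.Println("send panicked:", trySend(ch, []byte("line\n")))
}
```

Recovering from the panic is shown only to demonstrate the problem; the replies below discuss ways to avoid closing a channel while a send may still be pending.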

--


  • David DENG at Dec 12, 2012 at 8:57 am
    What if you don't close the channels in the SIGHUP routine, but just send
    a command to the worker to replace the output file?

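    // Requires: import "os"
    func worker(in chan []byte, f *os.File, cf chan *os.File) {
        for b := range in {
            f.Write(b) // output b to f (error handling omitted)

            select {
            case nf := <-cf:
                f.Close() // done with the old file
                f = nf    // change to another file
            default:
                // do nothing
            }
        } // for b
        f.Close()
    }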


    // You need to store both channels.
    routers := make(map[string][2]interface{})

    David

    --
  • David DENG at Dec 12, 2012 at 9:10 am
    Actually, this method is only more Go-style (by its rule of sharing by
    communicating). The channel itself may perform locking or something
    similar internally, so I don't think this would be more efficient if your
    original proposal needs optimization. Choose based only on readability
    and maintainability.

    David
    --
  • Sanjay at Dec 12, 2012 at 9:27 am
    It sounds like there will be no contention on that Mutex. Only the
    goroutine that receives the SIGHUP and the goroutine that is routing
    requests ever try to lock that Mutex. This will tickle the fast path of
    the Mutex and perform a single CAS. This is a bit of a performance hit,
    but certainly not a huge one.
    Sanjay
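
A minimal sketch of the mutex approach discussed here, assuming the lookup and the send happen under the same lock so that a rotation cannot close a channel in between. The `Router` type, its methods, and the `spawn` callback are hypothetical names for illustration; the in-memory worker stands in for the file-writing goroutine from the original post.

```go
package main

import (
	"fmt"
	"sync"
)

// Router is a hypothetical wrapper around the map from the original post.
type Router struct {
	mu    sync.Mutex
	chans map[string]chan []byte
}

// Send looks up (or creates) the channel for key and sends line while
// holding the lock, so CloseAll cannot run between lookup and send.
func (r *Router) Send(key string, line []byte, spawn func(key string) chan []byte) {
	r.mu.Lock()
	defer r.mu.Unlock()
	ch, ok := r.chans[key]
	if !ok {
		ch = spawn(key) // start a worker lazily
		r.chans[key] = ch
	}
	ch <- line
}

// CloseAll closes every channel and empties the map (the SIGHUP action).
func (r *Router) CloseAll() {
	r.mu.Lock()
	defer r.mu.Unlock()
	for k, ch := range r.chans {
		close(ch)
		delete(r.chans, k)
	}
}

func main() {
	done := make(chan int)
	// Stand-in worker: counts lines instead of writing them to a file.
	spawn := func(key string) chan []byte {
		ch := make(chan []byte)
		go func() {
			n := 0
			for range ch {
				n++
			}
			done <- n
		}()
		return ch
	}

	r := &Router{chans: make(map[string]chan []byte)}
	r.Send("a", []byte("1\n"), spawn)
	r.Send("a", []byte("2\n"), spawn)
	r.CloseAll() // rotation
	fmt.Println("lines before rotation:", <-done)

	r.Send("a", []byte("3\n"), spawn) // worker comes back on demand
	r.CloseAll()
	fmt.Println("lines after rotation:", <-done)
}
```

One trade-off of this sketch: the lock is held for the duration of the send, so a rotation waits for any send that is blocked on a slow worker.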

    --
  • Jesse McNelis at Dec 12, 2012 at 9:32 am

    On Wed, Dec 12, 2012 at 7:30 PM, Daniel Bryan wrote:

    looking up the map, but before it can do its send, a SIGHUP handler comes
    along and closes all the channels in the router. The main goroutine's
    attempted send will cause a panic.

    Why not have the main goroutine receive the SIGHUP and close the channels
    itself?
    The main goroutine could select between receiving from the signal channel
    and sending to one of the output channels. If it gets a SIGHUP, it can
    complete the send to the output and then close all the channels.


    --
    =====================
    http://jessta.id.au

    --
  • Kyle Lemons at Dec 12, 2012 at 4:59 pm
    In a case like this, I would keep the Router map as a local variable
    in the distributor goroutine and wouldn't even have a lock. The
    distributor goroutine, in addition to receiving lines from stdin on a
    channel, would receive the SIGHUP signal in a select statement that would
    close all of the channels in the router.


    --
  • Daniel Bryan at Dec 12, 2012 at 10:43 pm

    Kyle Lemons wrote:

    In a case like this, I would keep the Router map as a local variable
    in the distributor goroutine and wouldn't even have a lock. The
    distributor goroutine, in addition to receiving lines from stdin on a
    channel, would receive the SIGHUP signal in a select statement that would
    close all of the channels in the router.

    Jesse McNelis wrote:

    Why not have the main goroutine receive the SIGHUP and close the channels
    itself?
    The main goroutine could select between receiving from the signal channel
    and sending to one of the output channels. If it gets a SIGHUP, it can
    complete the send to the output and then close all the channels.

    Yes, these are exactly what I wanted. Nice and idiomatic, and a select is
    conceptually what I want to do.

    Sanjay wrote:

    It sounds like there will be no contention on that Mutex. Only the
    goroutine that receives the SIGHUP and the goroutine that is routing
    requests ever try to lock that Mutex. This will tickle the fast path of
    the Mutex and perform a single CAS. This is a bit of a performance hit,
    but certainly not a huge one.

    Good to know. I should read up on mutex implementation.

    --

Discussion Overview
group: golang-nuts
categories: go
posted: Dec 12, '12 at 8:30a
active: Dec 12, '12 at 10:43p
posts: 7
users: 5
website: golang.org
