I am writing this post because it seems that the Go JSON encoder is not a
truly streaming encoder. That is, the memory overhead of encoding JSON is
not constant but rather proportional to the size of the data structure
being encoded.

I am writing a server to store health metrics of ~10,000 machines in a
cluster. This server always stores the latest health metrics of all 10,000
machines in its local memory. When other processes request these health
metrics, this server must send them over the wire via JSON or Go RPC.

The health metrics for 10,000 machines are quite large. Encoding a slice of
10,000 health metrics produces 313 MB of JSON and 196 MB of GOB.

Both the Go JSON encoder and GOB encoder wrap an io.Writer, so I would
expect that using them does not incur much additional memory overhead even
when encoding very large data structures. However, when I run experiments
I see the contrary.

In my experiment, I stream my 10,000-element slice of health metrics to
ioutil.Discard rather than a bytes.Buffer to ensure that the act of writing
is not consuming additional memory. Moreover, the slice merely contains
pointers to the health metrics already in local memory, so the slice itself
consumes minimal additional memory. Yet my working set size doubles after
encoding to JSON or GOB.

After encoding my 10,000 health metrics as JSON, my working set increases
from 780 MiB to 1.62 GiB. I am measuring the working set size with
syscall.Getrusage.
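
Roughly, the measurement looks like this (HealthMetric here is a
simplified stand-in for my real struct; on Linux, Rusage.Maxrss is the
peak resident set size in KiB):

package main

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"syscall"
)

// HealthMetric is a simplified stand-in for the real metric struct.
type HealthMetric struct {
	Host string
	CPU  float64
	Mem  float64
}

// maxRSSKiB returns the peak resident set size in KiB (Linux).
func maxRSSKiB() int64 {
	var ru syscall.Rusage
	syscall.Getrusage(syscall.RUSAGE_SELF, &ru)
	return ru.Maxrss
}

func main() {
	metrics := make([]*HealthMetric, 10000)
	for i := range metrics {
		metrics[i] = &HealthMetric{Host: fmt.Sprintf("host%05d", i), CPU: 0.5, Mem: 0.7}
	}

	before := maxRSSKiB()
	// Encode the whole slice in one call; output goes to ioutil.Discard,
	// so only the encoder's own buffering can grow the working set.
	if err := json.NewEncoder(ioutil.Discard).Encode(metrics); err != nil {
		panic(err)
	}
	after := maxRSSKiB()
	fmt.Printf("max RSS grew by %d KiB\n", after-before)
}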

Suspecting that the JSON encoder buffers an amount of memory proportional
to the size of the value being encoded, I broke my 10,000-element slice
into 100 smaller slices of 100 elements each and JSON-encoded each one to
ioutil.Discard. I still encoded the same amount of data, but in 100 smaller
chunks instead of 1 big chunk. When I did this, my working set increased by
only 7 MiB instead of by over 700 MiB.
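
The chunked version looks roughly like this (same hypothetical
HealthMetric type and setup as in the sketch above, plus an "io" import):

// encodeInChunks encodes metrics 100 elements at a time, each chunk
// with its own Encode call, so the encoder only ever buffers one chunk.
// The output is a sequence of small JSON arrays rather than one big
// array, which is fine here since I only care about memory use.
func encodeInChunks(w io.Writer, metrics []*HealthMetric) error {
	enc := json.NewEncoder(w)
	for i := 0; i < len(metrics); i += 100 {
		end := i + 100
		if end > len(metrics) {
			end = len(metrics)
		}
		if err := enc.Encode(metrics[i:end]); err != nil {
			return err
		}
	}
	return nil
}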

The results of my experiment suggest that the JSON encoder is not a truly
streaming encoder: the memory overhead is not constant but rather
proportional to the size of the data structure being encoded. I saw similar
results with the GOB encoder, except that under the same experiment the GOB
encoder's memory overhead only dropped by about half rather than by a
factor of roughly 100.

Has anyone else in the Go community run into this? Are there any third
party encoders that can stream JSON with nearly constant memory overhead?
Is there anything I can do to make the encoders that come with Go use
memory more economically?

  • Caleb Spare at Dec 2, 2015 at 7:39 pm
    I am writing this post because it seems that the Go JSON encoder is not a truly streaming encoder.
    No it isn't, nor does it claim to be (that I can find).

    The encoder marshals (and buffers) an entire object at a time. If you
    have a slice of 10k items you want to send as a JSON array, you can do
    the array encoding yourself: send a "[", then encode each item with a
    "," in between, then a final "]".

    http://play.golang.org/p/NTLgY2CRXp
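
    Inline, the same idea looks roughly like this (Item is a placeholder
    for whatever element type you have; the trailing newline that Encode
    emits is legal whitespace inside a JSON array):

    // streamArray writes items as a single JSON array without ever
    // marshaling the whole slice at once: "[", then each element with
    // "," separators, then "]".
    func streamArray(w io.Writer, items []*Item) error {
    	enc := json.NewEncoder(w)
    	if _, err := io.WriteString(w, "["); err != nil {
    		return err
    	}
    	for i, item := range items {
    		if i > 0 {
    			if _, err := io.WriteString(w, ","); err != nil {
    				return err
    			}
    		}
    		if err := enc.Encode(item); err != nil {
    			return err
    		}
    	}
    	_, err := io.WriteString(w, "]")
    	return err
    }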

    -Caleb

  • Travis Keep at Dec 2, 2015 at 7:48 pm
    Just curious, is there an architectural reason that the JSON encoder is not
    streaming?
  • Alb Donizetti at Dec 2, 2015 at 8:20 pm
    It's a known problem.
    See the discussion at https://github.com/golang/go/issues/7872


  • Travis Keep at Dec 2, 2015 at 9:56 pm
    Many thanks.

  • Matt Harden at Dec 4, 2015 at 3:17 am
    Actually they added the ability to stream-decode arrays and objects (in
    1.5, I think): https://godoc.org/encoding/json#example-Decoder-Decode-Stream
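
    A rough sketch of element-by-element decoding with Decoder.Token and
    Decoder.More (HealthMetric stands in for whatever the array elements
    are):

    // decodeStream reads a JSON array one element at a time instead of
    // unmarshaling the whole array into memory at once.
    func decodeStream(r io.Reader, handle func(*HealthMetric)) error {
    	dec := json.NewDecoder(r)
    	if _, err := dec.Token(); err != nil { // consume the opening "["
    		return err
    	}
    	for dec.More() {
    		var m HealthMetric
    		if err := dec.Decode(&m); err != nil {
    			return err
    		}
    		handle(&m)
    	}
    	_, err := dec.Token() // consume the closing "]"
    	return err
    }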
  • Travis Keep at Dec 2, 2015 at 7:46 pm
    Just curious, is there an architectural reason that the JSON encoder is not
    streaming?
