It appears that the Go JSON encoder is not a truly streaming encoder. That
is, the memory overhead of encoding JSON is not constant but rather
proportional to the size of the data structure being encoded.
I am writing a server to store health metrics of ~10,000 machines in a
cluster. This server always stores the latest health metrics of all 10,000
machines in its local memory. When other processes request these health
metrics, this server must send them over the wire via JSON or Go RPC.
The health metrics for 10,000 machines are quite large. Encoding a slice of
10,000 health metrics produces 313 MB of output as JSON and 196 MB as GOB.
Both the Go JSON encoder and GOB encoder wrap an io.Writer, so I would
expect that using them does not incur much additional memory overhead even
when encoding very large data structures. However, when I run experiments
I see the contrary.
In my experiment, I stream my 10,000-element slice of health metrics to
ioutil.Discard rather than a bytes.Buffer to ensure that the act of writing
is not consuming additional memory. Moreover, the slice merely contains
pointers to the health metrics already in local memory, so the slice itself
consumes minimal additional memory. Yet my working set size doubles after
encoding to JSON or GOB.
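Roughly, the experiment looks like this; HealthMetric here is just a
stand-in for my real metric struct, which has many more fields:

package main

import (
	"encoding/json"
	"io/ioutil"
	"log"
)

// HealthMetric is a stand-in for the real metric struct.
type HealthMetric struct {
	Hostname string
	CPULoad  float64
	MemUsed  uint64
}

func main() {
	// The slice only holds pointers to metrics that are already resident
	// in memory, so the slice itself is cheap.
	metrics := make([]*HealthMetric, 10000)
	for i := range metrics {
		metrics[i] = &HealthMetric{Hostname: "host", CPULoad: 0.5, MemUsed: 1 << 30}
	}

	// Encode the whole slice in a single call, writing to ioutil.Discard
	// so the destination cannot account for any memory growth.
	if err := json.NewEncoder(ioutil.Discard).Encode(metrics); err != nil {
		log.Fatal(err)
	}
}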
After encoding my 10,000 health metrics as JSON, my working set increases
from 780 MiB to 1.62 GiB. I am measuring my working set size with
syscall.Getrusage.
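The measurement itself is just the peak RSS reported by getrusage, along
these lines (printMaxRSS is only a helper name for illustration, added to
the program above):

import (
	"fmt"
	"log"
	"syscall"
)

// printMaxRSS reports the peak resident set size of the current process.
func printMaxRSS() {
	var ru syscall.Rusage
	if err := syscall.Getrusage(syscall.RUSAGE_SELF, &ru); err != nil {
		log.Fatal(err)
	}
	// On Linux, Maxrss is reported in kilobytes.
	fmt.Printf("max RSS: %d MiB\n", ru.Maxrss/1024)
}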
Suspecting that the JSON encoder was responsible for the growth in resident
memory, I broke my 10,000-element slice into 100 smaller slices and
JSON-encoded each of the 100 slices of length 100 to ioutil.Discard. I
still encoded the same amount of data, but in 100 smaller chunks instead of
one big chunk. When I did this, my working set increased by only 7 MiB
instead of by over 700 MiB.
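The chunked variant is roughly the following, reusing the metrics slice and
imports from the sketch above (10,000 divides evenly by 100, so no bounds
guard is needed):

	// Encode the same 10,000 metrics as 100 sub-slices of 100 elements
	// each, instead of one Encode call over the whole slice.
	enc := json.NewEncoder(ioutil.Discard)
	for i := 0; i < len(metrics); i += 100 {
		if err := enc.Encode(metrics[i : i+100]); err != nil {
			log.Fatal(err)
		}
	}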
The results of my experiment suggest that the JSON encoder may not be a
truly streaming encoder. That is, the memory overhead is not constant but
rather proportional to the size of the data structure being encoded. I saw
similar results with the GOB encoder, except that with GOB the memory
overhead decreased only by about half, rather than by a factor of roughly
100, when I ran the same chunking experiment.
Has anyone else in the Go community run into this? Are there any
third-party encoders that can stream JSON with nearly constant memory
overhead? Is there anything I can do to make the encoders that come with Go
use memory more economically?