On Wed, Jun 8, 2016 at 4:45 AM, Nigel Tao wrote:

publicsuffix.org maintains a list of public suffixes, used to e.g.
scope HTTP cookies.

The golang.org/x/net/publicsuffix package contains a 'compiled'
version of the plain text list [0]. That compiled version is more
efficient at runtime, but not human-readable.

The canonical upstream list gets ad hoc updates roughly once a week in
recent times [1]. We (usually me, sometimes Brad or Volker)
occasionally update the Go package, maybe once every few months [2]. I
hesitate to do so more frequently because even a one line change to
the upstream list pretty much changes the entire generated table ([3]
is a typical diff), and I don't want to bloat the golang.org/x/net git
repo with lots of essentially binary changes. I'm really only hand
waving and guessing here, as I don't really know how git works under
the hood, but for example, the generated table.go file currently
weighs 528K. At [4], Brad (who's away for some weeks) said that each
publicsuffix commit grows the x/net repo by 0.1MB, which isn't huge,
but it would add up over time.
Internally, git stores a zlib-compressed copy of every version of every
tracked file as a file under .git/objects, named after the SHA-1 hash of
the file contents (plus a short type-and-size header). Because the name is
a content hash, identical content is stored only once, even when different
files or commits share it.

Occasionally, either when you run "git gc" or when git decides to on its
own, it takes the seldom-used objects and compresses them into a pack
file, which is the same format used to clone or fetch a repo over the
wire. A pack file contains a bunch of objects, each represented either as
the same zlib-compressed entire contents, or as a (zlib-compressed) binary
delta against another object, which is identified either by its hash or by
its offset in the pack file.

So if your "essentially binary" changes are localized to one part of the
file (or are repetitive within the file itself in a way that compresses
well), then they will eventually get packed into a form that is efficient
on disk. Regardless, git uses that packed form when cloning, so it is
efficient over the network: the client and server negotiate beforehand
which objects they have in common, to minimize bandwidth. But you don't
have any control over when packing will happen in other people's local
repos. If the changes are not localized or compressible, the repo will
keep growing at the rate you're seeing, forever.

(At least, that's my understanding from having tried to write a pure Go
git client and giving up before getting it to a usable state, due to the
complexity of git's command line.)

Automatic or not, it might then make sense to move the package to its
own dedicated git repo, instead of filling golang.org/x/net with noisy
churn. If so, any bikeshedding opinions between
golang.org/x/publicsuffix (gerrit + codereview) or
github.com/golang/publicsuffix (vanilla git) or something else?

Any other thoughts, golang-dev?
Speaking as a Go user, the fact that some things are golang.org/x/ and
some things are github.com/golang/x has always been weird and inconsistent
to me. golang.org/x looks more "official", while I actually initially
thought that github.com/golang was just a mirror.

- Dave

