On Tuesday, March 3, 2015 at 1:23:47 PM UTC-5, Eric Myhre wrote:
Whatever comes of this discussion: Distancing any proposed model from how
Google works internally is going to be difficult, but necessary.
The open source community as a whole has been burned before by the
behaviors that emerge from a company culture that attempts open source but
maintains things only in their personal vendorized ecosystem: remember the
release of thrift.... followed by the release of thrift? This is -- and
there's no shame in this, but I remember a number of central gophers
admitting their workplace basically means they never use 'go get' -- a
large part of the reason 'go get' is in such a dilapidated state compared
to the rest of the otherwise charming and excellent go tool.
I don't want to see go getting hung up on root misunderstandings of how the
needs of a global collaboration culture may differ from those of a single
company with a single dictatorial source tree with a single snapshot view
available for a single point in time. The latter is easier, to be sure.
Let's focus on moving beyond that. Planet Earth does not have a single
shared source tree.
Solutions need to account for usage *without prior coordination* between
... And that's where a whole bunch of the proposed stuff fails. Badly.
This whole discussion started with vendoring presumed. (Why is it even in
the thread title?) Import rewriting is then proposed as a workaround.
There's still huge trouble with this when extended to more than *literally
one* project, so it's proposed (with *colossal* handwaving) that libraries
and programs should be different.
These are both ridiculous duct tape workarounds. No: import rewriting is
a ridiculous duct tape workaround, and claiming that libraries and programs
are clearly distinct is just lying back and thinking of England.
I propose we take a moment and clearly define the goalposts again:
everyone's concerns are oriented around isolation, repeatability, and
composability. Are there better ways we can serve these goals, especially
with scaling across many repositories and many uncoordinated authors?
Notice how I didn't say "import rewriting". What everyone wants is
isolation. We should discuss how to implement that concept. "import
rewriting" is one possible way to implement isolation; let's phrase the
conversation around the concept first.
I second every single person (there have been many) who has suggested either
using per-project GOPATH settings, or -- if this hasn't been suggested
out loud yet, I'd like to make the proposal -- creating a new system with
similar semantics. Having wholly separate directory trees for each project
is, in fact, absolutely the semantic that I want. Global variables are
bad; global library heaps are equally bad, particularly when shared between
many different programs of wildly unrelated authorship. (Anyone from the
maven world -- remember how it feels to `rm -rf ~/.m2`? Tension just flows
out of the body. Tension that never should have existed.)
Much like we gophers value our statically linked single file binaries for
their ease of shipping, so too do I value a project folder that is clearly
that project, only that project, and all of that project. This is what
happens when I set `GOPATH=$project/.gopath/`. Much like the whole truth
and nothing but the truth, it's great! This means I can sync severable
parts of my work to various computers easily, etc, and is super empowering.
I regret to say I cannot buy suggestions that this would be an
overwhelmingly disruptive change to the go ecosystem. I already do this
with every single go library and program I've ever touched, which seems
like rather concrete proof that this is doable, since it is, in fact, done.
Import rewriting makes sense in the situation that I have a diamond
dependency (A -> B -> D.v1 & A -> C -> D.v2) and I'm willing to duplicate
the two different versions of that transitively required library in order
to isolate them while retaining both. This is by far the exception case,
not the norm. Imagine if every package that used something as common as
"fmt" from the standard library re-wrote and re-vendored it!
Note that I have little care for the size of my working tree in an active
development environment -- duplicated libraries on disk there per project
is fine; disk is cheap. Duplication that ends up in version control is
where concern lies, because this duplication has troublesome impacts over
deep time, and the trouble can be pushed to other people who are then
powerless to address it.
Notice how I didn't say "vendoring". What everyone wants is
repeatability. "vendoring" is one possible way to implement repeatability;
let's phrase the conversation around the concept first.
As pointed out earlier in the thread, repeatability is also maintained by
systems such as Nix: hashes do wonders. Similarly, I've been using git
submodules to get perfectly reproducible builds. Bower can pin
dependencies by hashes. There's a huge list of concrete examples of
repeatability without vendoring. Hash-based resource references solve
Vendoring may score points in the immediate offline mode. And I'm deeply
pleased with the importance placed on offline operation -- I share this
priority to the extent that I label my business cards "sneakernet advocate" --
but offline operation is orthogonal to vendoring and also available through
other choices. It's good to have a command after which the build is
guaranteed not to need network; that command need not be `git clone`, and
could just as easily be some other subcommand of the go tool. I think folk
coming from other languages would be perfectly unsurprised by such a
feature in the go tool, and we could have very satisfying outcomes from a
command that does all resource acquisition and then stops.
Vendoring has gained traction largely because it works without additional
tooling, and that makes it the most contributor-friendly thing in the
current landscape where the go tool has no chosen stance and thus the
community as a whole is uncommitted. In discussing additions to the go
tool, this impetus is made irrelevant.
Vendoring has a variety of serious drawbacks. It gets repeatability right
and *everything else* wrong. It's not transparent to libraries; it results
in globally multiple histories for files; and it results in permanent bloat
to a project source repo. (This was discussed before at length in a
particular golang project forum, and this table of comparisons was made back
in 2013, which remains fully accurate today: https://gist.github.com/heavenlyhash/6343783)
The most major threat to repeatability when using hash-based resource
references is network availability in the deep time sense: what happens when
the organization behind mycompany.net goes out of business and drops
their domain, or library bananapancakes is renamed to something more
user-centric, etc. In tracking recent changes, this is not frequently an
issue, but doing a bisect across a well-aged repo can become troublesome.
Fortunately, this is solvable: The system must have a way to replace old
network addresses with updated aliases. We can do this.
I'm not sure this is in the discussion as such yet, but it's been
mentioned in passing, and so I figured I might as well give it a name and a
As a user of go, when examining a new library for potential use, I
regularly want to see if it builds. Thereafter, I want to see if its tests
pass. This requires that the library specify its dependencies;
specifically, it means the library must have repeatability just like a
Bundler and rubygems actually got this fairly right: specifying a semver
dependency range in one file, and specifying the "locked", pre-resolved
versions of everything depended on (including transitively) is a great
example of unifying both composability and repeatability goals. Bundler
fell short in that it still only resolves down to a precise semver string,
did similar things, and also carried the resolve results all the way to a
hash: this truly satisfies repeatability. We should take a page from these
While it is possible to ship executables at the end of the day without
tackling this (and that is the most important thing), as Matthew Sackman
has pointed out, it's a severe cramp to developer usability if we ignore
On the plus side, this is also possible to do later, since isolation and
repeatability are solvable without a solution to composability -- but it
may be worth thinking about it now, when discussing data formats.
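As a starting point for the data format question, here is a sketch of the two-file split in Go: a requirement that states an accepted version range, and a lock entry that records the exact resolved version plus a content hash. All type and field names are hypothetical, the half-open `[Min, Max)` range is a simplification and not Bundler's actual syntax, and only plain `major.minor.patch` versions are handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a plain major.minor.patch triple (no pre-release tags).
type version [3]int

func parseVersion(s string) (version, error) {
	var v version
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("want major.minor.patch, got %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, fmt.Errorf("bad version component %q in %q", p, s)
		}
		v[i] = n
	}
	return v, nil
}

// less compares two versions component by component.
func (a version) less(b version) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// requirement is what a library author writes: an accepted range [Min, Max).
type requirement struct{ Path, Min, Max string }

// lockEntry is what a resolver records: one exact version and its hash.
type lockEntry struct{ Path, Version, SHA256 string }

// satisfies reports whether the locked version falls inside the range.
// The hash is carried for repeatability and is not consulted here.
func satisfies(l lockEntry, r requirement) (bool, error) {
	if l.Path != r.Path {
		return false, nil
	}
	v, err := parseVersion(l.Version)
	if err != nil {
		return false, err
	}
	lo, err := parseVersion(r.Min)
	if err != nil {
		return false, err
	}
	hi, err := parseVersion(r.Max)
	if err != nil {
		return false, err
	}
	return !v.less(lo) && v.less(hi), nil
}

func main() {
	req := requirement{Path: "example.com/d", Min: "1.2.0", Max: "2.0.0"}
	lock := lockEntry{Path: "example.com/d", Version: "1.4.2", SHA256: "<hex digest recorded at resolve time>"}
	ok, err := satisfies(lock, req)
	fmt.Println(ok, err)
}
```

The range serves composability (other projects can pick any version inside it), while the lock entry with its hash serves repeatability for this project's own builds and tests.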
### aside: this "libraries and programs are different" thing *doesn't work* in the field
Treating libraries and programs differently has a hidden presumption: that
I have any control over other authors.
Regrettably, I lack total control over all programmers. (Imagine! We
could forgo this entire discussion! World conquest by Friday! Etc.) And
even if everyone in the world reoriented their views to ask all present and
future users whether a package should be a library or not, there's no
guarantee we'd agree. And a time dimension is indeed a part of that
discussion, unless we break free of VCS repository layouts entirely, which
has not yet come to pass.
As one example, Docker has some code that I'd like to use as a library.
But they don't consider themselves a library, and as a result, they've
already vendored about 5 megs of other sources (and if including history, a
much larger amount even after compression). This situation is
understandable -- they consider themselves a "program", and rightly so --
but nonetheless, it's become very difficult for me to reuse their code
without forking (and I mean manually, by copy-paste, a highly undesirable
outcome). And then AFTER copy-paste forking the sections of code I want, I
have to rewrite all their imports again to fit in my project, since I'm
already using several of the same libraries at the same versions. This
entire process would be nonsensical make-work that actively distracts from
real development, and with all due respect to all the authors in this
scenario who are all understandably acting in their best interests, I want
no part of this to be in my future for other golang proje