goimports is very fast with small GOPATH trees, but its performance scales
linearly with the number of files and directories in that tree. My GOPATH
has 15077 directories, 13250 of which contain "node_modules" in the name
and are part of a single two-day prototype of a web application using
gulp/npm for the frontend. In those two days, the execution time of
goimports (in non-trivial cases) rose from ~0.1 seconds to ~3-4 seconds.
If I were working on multiple apps using that kind of technology, goimports
would become unusable.

The culprit is loadPkgIndex() in fix.go, which scans the whole GOPATH to
build an index of it (a map of package names -> full path). It is invoked
whenever there is an unresolved package
name in the source code which doesn't immediately map to the standard
library (there is a short-circuit for the standard library, so that
loadPkgIndex isn't called in that common case).
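
To make the cost concrete, here is a minimal sketch of this kind of index
build. It is not the actual loadPkgIndex code: the names buildIndex and
makeTree are hypothetical, and the package name is approximated by the
directory's base name. The point is that every directory is visited, even
the ones that can never contain a Go package.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// makeTree creates the given slash-separated files under a fresh temporary
// directory and returns that directory. It only exists to feed the demo.
func makeTree(files []string) (string, error) {
	root, err := os.MkdirTemp("", "gopath-src-")
	if err != nil {
		return "", err
	}
	for _, f := range files {
		full := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return "", err
		}
		if err := os.WriteFile(full, nil, 0o644); err != nil {
			return "", err
		}
	}
	return root, nil
}

// buildIndex walks every directory under srcRoot and maps a short package
// name to the import paths that could provide it. Assumption: the short name
// is the directory's base name, not the parsed package clause. It also
// reports how many directories the walk had to visit.
func buildIndex(srcRoot string) (map[string][]string, int, error) {
	index := make(map[string][]string)
	seen := make(map[string]bool) // directories already recorded
	visited := 0
	err := filepath.WalkDir(srcRoot, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() {
			visited++
			return nil
		}
		dir := filepath.Dir(p)
		if !strings.HasSuffix(d.Name(), ".go") || seen[dir] || dir == srcRoot {
			return nil
		}
		seen[dir] = true
		rel, err := filepath.Rel(srcRoot, dir)
		if err != nil {
			return err
		}
		name := filepath.Base(dir)
		index[name] = append(index[name], filepath.ToSlash(rel))
		return nil
	})
	return index, visited, err
}

func main() {
	root, err := makeTree([]string{
		"github.com/foo/bar/bar.go",
		"app/main.go",
		"app/node_modules/left-pad/index.js",
	})
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(root)

	index, visited, err := buildIndex(root)
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println("directories visited:", visited)
	fmt.Println("bar ->", index["bar"])
}
```

The node_modules subtree contributes nothing to the index, yet it is still
counted among the visited directories.
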

I think the main goal of goimports is to be bound to the save button of an
editor. If goimports takes several seconds to run, it's very annoying even
if it ends up producing the correct result. Given that it's only a
programming aid, I would rather have goimports always take 0.1 seconds, and
not be able to complete imports in some cases. IOW, a very fast goimports
beats a very correct goimports, IMO.

I brainstormed a few ideas on how to improve the speed of goimports, and I
would like to discuss them on the list before submitting a CL that goes in
a direction that is deemed completely wrong:

1) loadPkgIndex ignores all directories (and subtrees) whose names start
with a dot or an underscore, or whose name is "testdata". It might make
sense to grow this ignore list to cover common directory names pointing to
trees that are unlikely to be used as part of an import path.
"node_modules" is one such example, but "bower_components" and "Godeps"
also come to mind. I understand that growing such a hardcoded list is
suboptimal, but it's a quick stopgap with an enormous effect on
performance. Obviously it's just a heuristic, and it might be wrong in some
cases, but it would probably work most of the time, and I think it would be
an improvement for many people using Go to program single-page web
applications.

2) Similarly, we could avoid recursing more than N levels deep. This is
just a rule of thumb: it's highly unlikely that somebody is using an import
line with a path 10-15 components long. Most import paths are between 3 and
5 components long.

3) The index built by loadPkgIndex could be serialized to disk and reused
for a certain period of time. I attempted a quick patch where the index is
serialized to a dot file in the home directory (through gob) and reused for
60 minutes, and it works very well for me. If I improved the patch to be smart
enough to always rebuild the part of the index that handles the current
project (that is, the current directory and its siblings), it would be
close to perfect.

4) goimports doesn't realize that an expression like "a.b" could refer to a
global unexported symbol "a" in the current package, so it always rebuilds
the index just in case there is a package named "a" that exports a symbol
called "b". A shortcut could be added so that if goimports realizes that
"a" is not a package name but just a global symbol in the current package,
it could remove it from the list of packages to be searched for; at that
point, the list would go down to zero in most normal cases, and thus the
frequency of calls to loadPkgIndex in a normal development loop would drop.
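
Ideas 1 and 2 can be combined in a single pruning rule. The following is a
rough sketch under stated assumptions, not a patch against fix.go: the
names skipNames, maxDepth, shouldSkip and scanDirs are hypothetical, the
ignore list is just the three names mentioned above plus "testdata", and
the depth limit of 8 is an arbitrary placeholder for N.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// skipNames extends the existing "testdata" rule with the directory names
// suggested in idea 1. The exact contents of this list are an assumption.
var skipNames = map[string]bool{
	"testdata":         true,
	"node_modules":     true,
	"bower_components": true,
	"Godeps":           true,
}

// maxDepth is the N from idea 2; the value 8 is an arbitrary placeholder.
const maxDepth = 8

// shouldSkip reports whether a directory with the given base name, found
// depth levels below the source root, should be pruned from the walk.
func shouldSkip(name string, depth int) bool {
	if depth > maxDepth {
		return true
	}
	if strings.HasPrefix(name, ".") || strings.HasPrefix(name, "_") {
		return true
	}
	return skipNames[name]
}

// scanDirs returns the slash-separated relative paths of all directories
// under srcRoot that survive pruning. Pruned directories are not descended.
func scanDirs(srcRoot string) ([]string, error) {
	var dirs []string
	err := filepath.WalkDir(srcRoot, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if !d.IsDir() || p == srcRoot {
			return nil
		}
		rel, err := filepath.Rel(srcRoot, p)
		if err != nil {
			return err
		}
		rel = filepath.ToSlash(rel)
		depth := strings.Count(rel, "/") + 1
		if shouldSkip(d.Name(), depth) {
			return filepath.SkipDir
		}
		dirs = append(dirs, rel)
		return nil
	})
	return dirs, err
}

// makeDirs creates the given slash-separated directories under a fresh
// temporary directory and returns it. It only exists to feed the demo.
func makeDirs(dirs []string) (string, error) {
	root, err := os.MkdirTemp("", "gopath-src-")
	if err != nil {
		return "", err
	}
	for _, d := range dirs {
		if err := os.MkdirAll(filepath.Join(root, filepath.FromSlash(d)), 0o755); err != nil {
			return "", err
		}
	}
	return root, nil
}

// has reports whether list contains s.
func has(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

func main() {
	root, err := makeDirs([]string{"github.com/foo/bar", "app/web", "app/node_modules/left-pad"})
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(root)
	dirs, err := scanDirs(root)
	fmt.Println(dirs, err)
}
```

Returning filepath.SkipDir is what makes this cheap: the walk never opens
the pruned subtree, so its size no longer matters.
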
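
For idea 3, a time-limited gob cache could look roughly like this. It is
an illustrative sketch and not the patch mentioned above: the cachedIndex
type, the function names, and the ".goimports-index" file name are all
assumptions, and the demo writes to a temporary directory instead of the
home directory.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cachedIndex is the on-disk form: the index plus the time it was built.
type cachedIndex struct {
	Built time.Time
	Pkgs  map[string][]string // package name -> import paths
}

// saveIndex writes the index to path with gob, stamped with builtAt.
func saveIndex(path string, pkgs map[string][]string, builtAt time.Time) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	if err := gob.NewEncoder(f).Encode(cachedIndex{Built: builtAt, Pkgs: pkgs}); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// loadIndex returns the cached index and true if the file exists, decodes
// cleanly, and is no older than maxAge at time now. Otherwise it returns
// false and the caller is expected to rescan and save again.
func loadIndex(path string, maxAge time.Duration, now time.Time) (map[string][]string, bool) {
	f, err := os.Open(path)
	if err != nil {
		return nil, false
	}
	defer f.Close()
	var c cachedIndex
	if err := gob.NewDecoder(f).Decode(&c); err != nil {
		return nil, false
	}
	if now.Sub(c.Built) > maxAge {
		return nil, false
	}
	return c.Pkgs, true
}

// reusedAfter saves a small index that claims to be ageMinutes old into a
// temporary file and reports whether a 60-minute cache would still reuse it.
func reusedAfter(ageMinutes int) (bool, error) {
	dir, err := os.MkdirTemp("", "goimports-cache-")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	cachePath := filepath.Join(dir, ".goimports-index") // hypothetical file name
	now := time.Now()
	builtAt := now.Add(-time.Duration(ageMinutes) * time.Minute)
	pkgs := map[string][]string{"bar": {"github.com/foo/bar"}}
	if err := saveIndex(cachePath, pkgs, builtAt); err != nil {
		return false, err
	}
	got, ok := loadIndex(cachePath, 60*time.Minute, now)
	return ok && len(got["bar"]) == 1, nil
}

func main() {
	for _, age := range []int{5, 90} {
		ok, err := reusedAfter(age)
		if err != nil {
			fmt.Println("cache demo failed:", err)
			return
		}
		fmt.Printf("index built %d minutes ago reused: %v\n", age, ok)
	}
}
```

Any read or decode failure is treated the same as an expired cache, so a
corrupt file only costs one full rescan.
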
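
The shortcut in idea 4 can be sketched with go/parser and go/ast. This is
only an illustration of the check, not how goimports is structured: the
function names and the sample sources are hypothetical, the local name of
an import is approximated by the last element of its path, and the sketch
relies on the parser's (deprecated but still populated) Ident.Obj field to
tell which identifiers were resolved inside the file itself.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"path"
	"sort"
	"strconv"
)

const targetSrc = `package demo

import "strings"

func Run() string {
	return strings.ToUpper(config.Name)
}
`

const siblingSrc = `package demo

var config = struct{ Name string }{Name: "x"}
`

// packageLevelNames adds every name declared at package level in src
// (functions, variables, constants, types) to names.
func packageLevelNames(fset *token.FileSet, src string, names map[string]bool) error {
	f, err := parser.ParseFile(fset, "sibling.go", src, 0)
	if err != nil {
		return err
	}
	for _, decl := range f.Decls {
		switch d := decl.(type) {
		case *ast.FuncDecl:
			if d.Recv == nil {
				names[d.Name.Name] = true
			}
		case *ast.GenDecl:
			for _, spec := range d.Specs {
				switch s := spec.(type) {
				case *ast.ValueSpec:
					for _, n := range s.Names {
						names[n.Name] = true
					}
				case *ast.TypeSpec:
					names[s.Name.Name] = true
				}
			}
		}
	}
	return nil
}

// candidatePackages returns the sorted selector bases in target (the "a" in
// "a.b") that still need a package lookup: not resolved inside the file, not
// already imported, and not declared in any sibling file of the package.
func candidatePackages(target string, siblings ...string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "target.go", target, 0)
	if err != nil {
		return nil, err
	}
	known := make(map[string]bool)
	for _, imp := range f.Imports {
		p, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		name := path.Base(p) // assumption: package name equals last path element
		if imp.Name != nil {
			name = imp.Name.Name
		}
		known[name] = true
	}
	for _, src := range siblings {
		if err := packageLevelNames(fset, src, known); err != nil {
			return nil, err
		}
	}
	var out []string
	ast.Inspect(f, func(n ast.Node) bool {
		sel, ok := n.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		id, ok := sel.X.(*ast.Ident)
		if !ok || id.Obj != nil || known[id.Name] {
			return true
		}
		known[id.Name] = true // report each name once
		out = append(out, id.Name)
		return true
	})
	sort.Strings(out)
	return out, nil
}

func main() {
	without, err := candidatePackages(targetSrc)
	fmt.Println("looking at one file only:", without, err)
	with, err := candidatePackages(targetSrc, siblingSrc)
	fmt.Println("with sibling files:", with, err)
}
```

Parsing the sibling files is much cheaper than walking the whole GOPATH,
and when the candidate list comes back empty the index is not needed at
all.
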
Thanks in advance for your comments.