On Thursday, March 31, 2016 at 20:08:56 UTC+3, Keith Randall
wrote:


On Thu, Mar 31, 2016 at 8:37 AM, Philip Hofer <pho...@umich.edu>
wrote:
On Thursday, March 31, 2016 at 6:52:02 AM UTC-7, ilya....@intel.com
wrote:
I'm not sure that tuple types will really help with 2.
We don't really know whether an instruction will set flags until after
regalloc.
E.g. for addition we may generate either LEA (which doesn't set any
flags) if src != dst, or
ADD (which sets flags) if src == dst.
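
A minimal Go sketch of the point above, assuming a simplified model in which registers are plain strings. The helper `pickAdd` is hypothetical and is not the compiler's code. It only illustrates why flag behavior is unknown before register allocation: the choice of instruction depends on whether the destination register coincides with a source.

```go
package main

import "fmt"

// pickAdd is a hypothetical helper (not part of cmd/compile). Given the
// registers chosen by the register allocator, it picks the instruction for
// dst = src1 + src2 and reports whether that instruction writes the flags.
func pickAdd(dst, src1, src2 string) (op string, setsFlags bool) {
	if dst == src1 || dst == src2 {
		// Two-operand ADD overwrites one of its inputs and sets flags.
		return "ADD", true
	}
	// Three-operand LEA leaves both inputs intact and sets no flags.
	return "LEA", false
}

func main() {
	op, flags := pickAdd("AX", "AX", "BX")
	fmt.Println(op, flags)
	op, flags = pickAdd("CX", "AX", "BX")
	fmt.Println(op, flags)
}
```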
IIRC AMD64Ops.go records flag info for each op.
We would introduce a new op, ADDQplusflags or some such, that generates
the pair (add result, flags).
ADDQplusflags would not be eligible for rewriting to LEA.
I see.
IMHO a late (after regalloc) peephole pass, or simply a check during codegen
(in cmd/compile/internal/amd64/ssa.go), is needed.
FYI my approach removes ~1700 bytes from the go tool.
It works during TEST codegen and simply skips generating anything
when the second-to-last instruction sets flags.
Presumably you test that the second-to-last instruction uses the
argument of TEST as its destination register?

e.g.
sub a, b
test a, a <-- legal to excise
jnz <label>

but
sub a, b
test b, b <-- not legal to excise
jnz <label>
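
The legality condition in the two examples above can be sketched in Go as follows. The `Instr` type and `canElideTest` are hypothetical and exist only for illustration; they are not the compiler's representation. The sketch also assumes that the consumer of the flags reads only ZF or SF (as `jnz` does), because TEST clears CF and OF while SUB sets them from its result.

```go
package main

import "fmt"

// Instr is a hypothetical, simplified machine instruction.
type Instr struct {
	Op        string // e.g. "SUB", "MOV", "TEST"
	Dst       string // destination register (first operand in the examples above)
	Src       string // source register
	SetsFlags bool   // true if the instruction sets flags from its result
}

// canElideTest reports whether "TEST r, r" directly after prev is redundant:
// prev must set flags from its result, and that result must live in r.
func canElideTest(prev, test Instr) bool {
	if test.Op != "TEST" || test.Dst != test.Src {
		return false
	}
	return prev.SetsFlags && prev.Dst == test.Dst
}

func main() {
	sub := Instr{Op: "SUB", Dst: "a", Src: "b", SetsFlags: true}
	fmt.Println(canElideTest(sub, Instr{Op: "TEST", Dst: "a", Src: "a"})) // legal to excise
	fmt.Println(canElideTest(sub, Instr{Op: "TEST", Dst: "b", Src: "b"})) // not legal to excise
}
```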

Do you have a sense of how many TESTs could have been eliminated but
weren't? (Or, how many false negatives do you get when performing this pass
after regalloc?)
Right, that's my question also. How often is the thing you're TESTing the
result of the previous instruction?
For TEST reg,reg,
reg is generated by the previous instruction ~3x as often (160k during all.bash)
as by some other, earlier instruction (~50k).
Out of those 160k, only 5k TESTs are eliminated.
This number can be somewhat improved by, e.g., avoiding generation of ADDL
for ADDB.
Most (140k) of the uneliminated TESTs have some kind of move as their source.
I suppose more sophisticated analysis could eliminate additional TESTs by
considering more instructions or changing code generation.
We already have ssaMarkMoves which does something similar.


And how often is it a flags-generating instruction? I know our scheduler
tries to put flag generators right before flag users. Maybe a similar
scheduler heuristic, putting the value being TESTed right before the TEST,
would help make the "peephole" technique work well enough. Hard to know
whether any of this would be worth it without some numbers.

On Wednesday, March 30, 2016 at 18:36:14 UTC+3, Keith Randall
wrote:
Tuple types would have other uses as well, for example doing x/y and
x%y with a single instruction.
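
As a source-level illustration of the x/y and x%y case, here is a small Go sketch. The function `divMod` is a hypothetical name. Whether the compiler actually emits a single divide instruction for it depends on the backend and is not guaranteed by this code; the sketch only shows the pair of results that a tuple-valued op would carry.

```go
package main

import "fmt"

// divMod returns the quotient and remainder of x / y together. On amd64 one
// hardware divide produces both values, which is the pair a tuple-typed SSA
// value could represent. y must be non-zero.
func divMod(x, y int64) (q, r int64) {
	return x / y, x % y
}

func main() {
	q, r := divMod(17, 5)
	fmt.Println(q, r)
}
```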

On Wed, Mar 30, 2016 at 8:31 AM, Philip Hofer wrote:

Keith suggested that we'd need tuple types in order to make the flags
output explicit in 2, since the rulegen stuff happens before instruction
scheduling. It'd be a much more modest change if we had late-stage
peephole-ing, but I'd rather the tuple stuff go in, since it will also make
a tremendous difference on arm in terms of generating conditional
instructions.

On Wednesday, March 30, 2016 at 5:20:10 AM UTC-7, ilya....@intel.com
wrote:
Hi,

Have you started converting 2?
I'm working on a similar patch (going to send it today/tomorrow).

As for 3, I think we already do a similar optimization.
E.g. in ./cmd/compile/internal/ssa/gen/AMD64Ops.go,
for NEGB we have asm: "NEGL".

On Tuesday, March 29, 2016 at 21:04:43 UTC+3, Philip Hofer
wrote:
The changes for amd64 were pretty simple:

1) Turn "cmp $0, %reg" into "test %reg, %reg", since it is one byte
shorter
2) Eliminate "test %reg, %reg" where %reg was produced with an
arithmetic instruction that would have set the flags
3) Use 32- instead of 64-bit instructions in elimshortmov.

I'm gonna take a shot at converting the patches to the new SSA
rulegen format. Does this look like an accurate translation of (1)?

diff --git a/src/cmd/compile/internal/ssa/gen/AMD64.rules b/src/cmd/compile/internal/ssa/gen/AMD64.rules
index bc932c9..061d716 100644
--- a/src/cmd/compile/internal/ssa/gen/AMD64.rules
+++ b/src/cmd/compile/internal/ssa/gen/AMD64.rules
@@ -1239,6 +1239,12 @@
(CMPWconst (ANDWconst [c] x) [0]) -> (TESTWconst [c] x)
(CMPBconst (ANDBconst [c] x) [0]) -> (TESTBconst [c] x)

+// TEST %reg,%reg is shorter than CMP
+(CMPQconst x [0]) -> (TESTQ x x)
+(CMPLconst x [0]) -> (TESTL x x)
+(CMPWconst x [0]) -> (TESTW x x)
+(CMPBconst x [0]) -> (TESTB x x)
+
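
To see why rewriting a compare against zero into a self-TEST preserves the branch conditions, the following Go sketch models the flags each instruction produces. The `flags` type and both functions are hypothetical, and the model covers only ZF, SF, CF and OF (PF and AF are ignored). It assumes 64-bit operands.

```go
package main

import "fmt"

// flags is a hypothetical model of the four status flags considered here.
type flags struct{ ZF, SF, CF, OF bool }

// cmpZero models "CMP $0, x", which computes x - 0. Subtracting zero can
// neither borrow nor overflow, so CF and OF are always clear.
func cmpZero(x int64) flags {
	r := x - 0
	return flags{ZF: r == 0, SF: r < 0, CF: false, OF: false}
}

// testSelf models "TEST x, x", which computes x & x and clears CF and OF.
func testSelf(x int64) flags {
	r := x & x
	return flags{ZF: r == 0, SF: r < 0, CF: false, OF: false}
}

func main() {
	for _, x := range []int64{0, 1, -1, 1 << 62, -1 << 63} {
		fmt.Println(x, cmpZero(x) == testSelf(x))
	}
}
```

In this model the two instructions agree on every flag for every input, so any conditional jump that follows behaves the same after the rewrite.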


On Tue, Mar 29, 2016 at 10:39 AM, Keith Randall <k...@google.com>
wrote:
The plan is to retire peep.go (and cgen.go, ...) for each
architecture as the SSA backend becomes the default.
So I don't think it is worth upstreaming them. Maybe the arm stuff
for 1.7 if they are significant wins.

I'd be interested to see your changes to amd64 so I can make sure
the SSA backend performs an equivalent optimization.

On Tue, Mar 29, 2016 at 10:21 AM, Philip Hofer <pho...@umich.edu>
wrote:
I have a couple small patches to peep.go for arm and amd64 sitting
on my box locally. I'm happy to send them upstream if the plan is to keep
the existing peephole optimizer around for a while, but IIRC there were
some bigger plans for that code. Is that still (or was it ever) the case?

--
You received this message because you are subscribed to the Google
Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Discussion Overview
group: golang-dev
categories: go
posted: Mar 29, '16 at 5:21p
active: Apr 1, '16 at 3:14p
posts: 14
users: 3
website: golang.org