It seems that C_PBIT in 5a means "P bit in instruction is set:
pre-indexing mode":

".U", LS, C_UBIT,
".S", LS, C_SBIT,
".W", LS, C_WBIT,
".P", LS, C_PBIT,
".PW", LS, C_WBIT|C_PBIT,
".WP", LS, C_WBIT|C_PBIT,

".F", LS, C_FBIT,

".IBW", LS, C_WBIT|C_PBIT|C_UBIT,
".IAW", LS, C_WBIT|C_UBIT,
".DBW", LS, C_WBIT|C_PBIT,
".DAW", LS, C_WBIT,
".IB", LS, C_PBIT|C_UBIT,
".IA", LS, C_UBIT,
".DB", LS, C_PBIT,
".DA", LS, 0,

But in 5l, 5c and 5g, C_PBIT has the opposite meaning:

int32
olr(int32 v, int b, int r, int sc)
{
	int32 o;

	if(sc & C_SBIT)
		diag(".S on LDR/STR instruction");
	o = (sc & C_SCOND) << 28;
	if(!(sc & C_PBIT))
		o |= 1 << 24;
	if(!(sc & C_UBIT))
		o |= 1 << 23;
	if(sc & C_WBIT)
		o |= 1 << 21;
	o |= (1<<26) | (1<<20);
	if(v < 0) {
		if(sc & C_UBIT) diag(".U on neg offset");
		v = -v;
		o ^= 1 << 23;
	}
	if(v >= (1<<12) || v < 0)
		diag("literal span too large: %d (R%d)\n%P", v, b, PP);
	o |= v;
	o |= b << 16;
	o |= r << 12;
	return o;
}
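
To make the inversion concrete, here is a minimal standalone sketch of the olr logic quoted above. The flag values, the condition code 14 for "always", and the name olr_sketch are assumptions chosen for illustration, not copied from the toolchain headers, and diag is replaced by a plain stderr print.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed flag layout, for illustration only. */
enum {
	C_SCOND = (1<<4) - 1,	/* low four bits: condition code */
	C_SBIT  = 1<<4,
	C_PBIT  = 1<<5,
	C_WBIT  = 1<<6,
	C_UBIT  = 1<<7,
	COND_AL = 14		/* assumed "always" condition */
};

/* Same bit logic as olr, with unsigned arithmetic so that the shift
 * into bit 31 is well defined. v: offset, b: base register,
 * r: destination register, sc: condition code plus suffix flags. */
uint32_t
olr_sketch(int32_t v, int b, int r, int sc)
{
	uint32_t o;

	if(sc & C_SBIT)
		fprintf(stderr, ".S on LDR/STR instruction\n");
	o = (uint32_t)(sc & C_SCOND) << 28;
	if(!(sc & C_PBIT))
		o |= 1u << 24;	/* P is set unless C_PBIT is given */
	if(!(sc & C_UBIT))
		o |= 1u << 23;	/* U is set unless C_UBIT is given */
	if(sc & C_WBIT)
		o |= 1u << 21;	/* W follows C_WBIT directly */
	o |= (1u<<26) | (1u<<20);
	if(v < 0) {
		if(sc & C_UBIT)
			fprintf(stderr, ".U on neg offset\n");
		v = -v;
		o ^= 1u << 23;	/* negative offset flips U */
	}
	if(v >= (1<<12) || v < 0)
		fprintf(stderr, "literal span too large: %d (R%d)\n", (int)v, b);
	o |= (uint32_t)v;
	o |= (uint32_t)b << 16;
	o |= (uint32_t)r << 12;
	return o;
}
```

Under these assumptions, olr_sketch(8, 0, 1, COND_AL) yields 0xE5901008 (P=1, U=1), adding C_PBIT gives 0xE4901008 (P=0), and adding C_UBIT gives 0xE5101008 (U=0). In other words, setting either flag clears the corresponding instruction bit.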

  • Russ Cox at Nov 6, 2012 at 7:29 pm
    I believe that in the actual ARM machine instruction encoding, .P is a
    0 in position 24 while not .P is a 1.

    Russ
  • Ziutek at Nov 7, 2012 at 6:58 pm

    On 6 Nov, 20:29, Russ Cox wrote:
    > I believe that in the actual ARM machine instruction encoding, .P is a
    > 0 in position 24 while not .P is a 1.
    >
    > Russ

    This is my test code:

    TEXT ·PrePost (SB), 7, $0
    MOVW 8(R0), R1
    MOVW.IB 8(R0), R1
    MOVW.IBW 8(R0), R1
    MOVW.IA 8(R0), R1
    MOVW.IAW 8(R0), R1

    MOVD 8(R0), F0
    MOVD.IBW 8(R0), F0
    MOVD.IA 8(R0), F0

    RET

    Disassembled after compilation:

    Dump of assembler code for function github.com/ziutek/matrix.PrePost:
    => 0x0002f178 <+0>: ldr r1, [r0, #8]
    0x0002f17c <+4>: ldr r1, [r0], #-8
    0x0002f180 <+8>: ldrt r1, [r0], #-8
    0x0002f184 <+12>: ldr r1, [r0, #-8]
    0x0002f188 <+16>: ldr r1, [r0, #-8]!
    0x0002f18c <+20>: vldr d0, [r0, #8]
    0x0002f190 <+24>: ldc 11, cr0, [r0, #8]!
    0x0002f194 <+28>: vldr d0, [r0, #8]
    0x0002f198 <+32>: add pc, lr, #0
    End of assembler dump.

    I think that it should look like this:

    ldr r1, [r0, #8] // MOVW 8(R0), R1 [P=1,W=0,U=1]
    ldr r1, [r0, #8] // MOVW.IB 8(R0), R1 [P=1,W=0,U=1]
    ldr r1, [r0, #8]! // MOVW.IBW 8(R0), R1 [P=1,W=1,U=1]
    ldr r1, [r0], #8 // MOVW.IA 8(R0), R1 [P=0,W=0,U=1]
    ldrt r1, [r0, #8] // MOVW.IAW 8(R0), R1 [P=0,W=1,U=1]

    vldr d0, [r0, #8] // MOVD 8(R0), F0 [P=1,W=0,U=1]
    vldr d0, [r0, #8]! // MOVD.IBW 8(R0), F0 [P=1,W=1,U=1]
    vldr d0, [r0],#8 // MOVD.IA 8(R0), F0 [P=0,W=0,U=1]

    I need MOVD.IA to speed up some numerical calculations, and I found that
    MOVD.IA doesn't set the P bit correctly.

    After some research I found that 5l, 5c and 5g treat the C_PBIT (and
    probably C_UBIT) flag in the opposite way to 5a. So I am confused:
    should 5l, 5c and 5g be fixed, or 5a?
  • Russ Cox at Nov 7, 2012 at 7:18 pm
    I think the P bit is fine. Instructions set the P bit by default,
    unless C_PBIT is set. This corresponds to (for example) MOV having the
    P bit=1 but MOV.P having the P bit=0. I do not want to change 5c, 5g,
    or 5l.

    I don't know what these other suffixes do, but assuming you are
    correct, I think the fix would be to swap A and B in these lines:

    lex.c:273: ".IBW", LS, C_WBIT|C_PBIT|C_UBIT,
    lex.c:274: ".IAW", LS, C_WBIT|C_UBIT,
    lex.c:275: ".DBW", LS, C_WBIT|C_PBIT,
    lex.c:276: ".DAW", LS, C_WBIT,
    lex.c:277: ".IB", LS, C_PBIT|C_UBIT,
    lex.c:278: ".IA", LS, C_UBIT,
    lex.c:279: ".DB", LS, C_PBIT,
    lex.c:280: ".DA", LS, 0,

    It sounds like those definitions were written with an incorrect
    understanding of what C_PBIT means.

    Russ
  • Ziutek at Nov 7, 2012 at 7:43 pm

    On 7 Nov, 20:18, Russ Cox wrote:

    > It sounds like those definitions were written with an incorrect
    > understanding of what C_PBIT means.
    >
    > Russ

    So I will try to fix 5a.

    By the way, I think that "PBIT" (literally "P bit") in the C_PBIT flag
    name is misleading, especially when C_WBIT is treated in the opposite way.
    This probably caused the bug in 5a.

    It would be better to change the C_PBIT name to something less confusing
    (e.g. C_PREI, for "preindex"). Likewise, the C_UBIT name should be changed
    to something like C_SUBO, for "subtract offset".
  • Ziutek at Nov 7, 2012 at 8:16 pm
    I have a problem with the ".P" and ".U" suffixes. To work correctly with
    5l they should clear the C_PBIT/C_UBIT flags. Maybe we need to remove
    support for "bit suffixes" if the C_YYY flags no longer mean real
    bits?


    "Bit suffixes" supported by 5a:

    ".U", LS, C_UBIT,
    ".S", LS, C_SBIT,
    ".W", LS, C_WBIT,
    ".P", LS, C_PBIT,
    ".PW", LS, C_WBIT|C_PBIT,
    ".WP", LS, C_WBIT|C_PBIT,
    ".F", LS, C_FBIT,
  • Russ Cox at Nov 7, 2012 at 8:50 pm
    I have to admit I don't understand the problem. The .P suffix seems to
    be working fine for MOVW.P.

    You seem to be assuming that .P has to mean "P=1" in the instruction
    encoding. It doesn't have to. The meaning of the assembly language
    phrases is up to us.

    Russ
  • Ziutek at Nov 7, 2012 at 9:30 pm

    On 7 Nov, 21:50, Russ Cox wrote:
    > I have to admit I don't understand the problem. The .P suffix seems to
    > be working fine for MOVW.P.
    >
    > You seem to be assuming that .P has to mean "P=1" in the instruction
    > encoding. It doesn't have to. The meaning of the assembly language
    > phrases is up to us.
    >
    > Russ

    If I understood correctly:

    MOVW means P=1,U=1
    MOVW.P means P=0,U=1
    MOVW.U means P=1,U=0

    So to fix this issue I only need to swap DA <=> IB and IA <=> DB.
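
The effect of the proposed swap can be checked mechanically. The sketch below is illustrative only: the flag values are assumed rather than taken from the toolchain headers, and puw_for, old_table and swapped_table are hypothetical names. It derives the P, U and W instruction bits for a suffix using olr's convention, where C_PBIT and C_UBIT clear their bits.

```c
#include <string.h>

enum { C_PBIT = 1<<5, C_WBIT = 1<<6, C_UBIT = 1<<7 };	/* assumed values */

struct suffix { const char *name; int flags; };

/* Index-addressing suffixes as listed for 5a's lex.c in this thread. */
static const struct suffix old_table[] = {
	{".IBW", C_WBIT|C_PBIT|C_UBIT}, {".IAW", C_WBIT|C_UBIT},
	{".DBW", C_WBIT|C_PBIT},        {".DAW", C_WBIT},
	{".IB",  C_PBIT|C_UBIT},        {".IA",  C_UBIT},
	{".DB",  C_PBIT},               {".DA",  0},
};

/* The same flag sets with DA <=> IB and IA <=> DB swapped. */
static const struct suffix swapped_table[] = {
	{".DAW", C_WBIT|C_PBIT|C_UBIT}, {".DBW", C_WBIT|C_UBIT},
	{".IAW", C_WBIT|C_PBIT},        {".IBW", C_WBIT},
	{".DA",  C_PBIT|C_UBIT},        {".DB",  C_UBIT},
	{".IA",  C_PBIT},               {".IB",  0},
};

/* Returns the P, U, W instruction bits packed as bits 2, 1, 0 of the
 * result, following olr's convention, or -1 for an unknown suffix. */
int
puw_for(const struct suffix *t, int n, const char *name)
{
	int i, f;

	for(i = 0; i < n; i++) {
		if(strcmp(t[i].name, name) != 0)
			continue;
		f = t[i].flags;
		return (!(f & C_PBIT) << 2) | (!(f & C_UBIT) << 1) | !!(f & C_WBIT);
	}
	return -1;
}
```

With the table as listed, .IA yields P=1, U=0, which matches the ldr r1, [r0, #-8] in the first disassembly. With the swapped table it yields P=0, U=1, the post-indexed form with an added offset.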
  • Ziutek at Nov 7, 2012 at 10:52 pm
    > So to fix this issue I need to only swap DA <=> IB and IA <=> DB

    After this fix, my test code:

    TEXT ·PrePost (SB), 7, $0
    MOVW 8(R0), R1
    MOVW.IB 8(R0), R1
    MOVW.IBW 8(R0), R1
    MOVW.IA 8(R0), R1
    MOVW.IAW 8(R0), R1

    MOVD 8(R0), F0
    MOVD.IBW 8(R0), F0
    MOVD.IA 8(R0), F0

    RET

    disassembles to:

    Dump of assembler code for function github.com/ziutek/matrix.PrePost:
    => 0x0002f178 <+0>: 08 10 90 e5 ldr r1, [r0, #8]
    0x0002f17c <+4>: 08 10 90 e5 ldr r1, [r0, #8]
    0x0002f180 <+8>: 08 10 b0 e5 ldr r1, [r0, #8]!
    0x0002f184 <+12>: 08 10 90 e4 ldr r1, [r0], #8
    0x0002f188 <+16>: 08 10 b0 e4 ldrt r1, [r0], #8

    0x0002f18c <+20>: 02 0b 90 ed vldr d0, [r0, #8]
    0x0002f190 <+24>: 02 0b b0 ed ldc 11, cr0, [r0, #8]!
    0x0002f194 <+28>: 02 0b 90 ed vldr d0, [r0, #8]

    0x0002f198 <+32>: 00 f0 8e e2 add pc, lr, #0
    End of assembler dump.

    Fixed-point instructions work fine after the fix.

    The VFP code looks like it has a problem, but it isn't real. After a
    closer study of the "ARM Architecture Reference Manual" it became clear
    that there are no real pre/post-indexed vldr instructions.

    This code:

    vldr d0, [r0, #8]!
    vldr d0, [r0],#8

    can't be generated by 5a because it contains pseudo-instructions. And
    as they aren't real pre/post-indexed instructions, they can't speed up
    my code :(
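
The hex bytes in the dump above can be checked by hand. A small sketch, assuming the bytes are printed in memory order (little-endian) and using bit positions 24, 23 and 21 for P, U and W, as in the olr code quoted at the start of the thread. The function names are hypothetical.

```c
#include <stdint.h>

/* Assemble an instruction word from four bytes in the order the dump
 * prints them, e.g. 08 10 90 e5. */
uint32_t
word_le(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
	return (uint32_t)b0 | (uint32_t)b1 << 8 |
		(uint32_t)b2 << 16 | (uint32_t)b3 << 24;
}

/* Extract single addressing bits from an LDR/STR instruction word. */
int pbit(uint32_t w) { return (w >> 24) & 1; }
int ubit(uint32_t w) { return (w >> 23) & 1; }
int wbit(uint32_t w) { return (w >> 21) & 1; }
```

For example, 08 10 90 e4 gives the word 0xE4901008 with P=0, U=1, W=0, which agrees with the post-indexed ldr r1, [r0], #8 shown for MOVW.IA.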
  • Ziutek at Nov 7, 2012 at 11:09 pm
    http://codereview.appspot.com/6822093
  • Dave Cheney at Nov 7, 2012 at 11:25 pm
    Hi,

    I've been trying to follow this thread as I am concerned with any 5x codegen bugs, but it is still not clear to me what problem you are fixing.

    I believe you are saying the 5a syntax differs from the description in the ARM literature. If that is the case I am not surprised, but as of now it is the dialect we have.

    Do the affected instruction suffixes appear anywhere in the .s files in the stdlib? If not, then I guess it is OK to fix their interpretation, but if they are in use then we need to tread very carefully.

    Does the tree build and pass all tests with this patch applied? Have you reviewed the runtime .s files?

    In summary, thank you for fixing this issue, but I'm concerned this is going to have subtle knock-on effects.

    Cheers

    Dave

    On 08/11/2012, at 10:09, ziutek wrote:

    > http://codereview.appspot.com/6822093
  • Ziutek at Nov 8, 2012 at 10:04 am
    There is definitely a bug in MOVW.IA (increment after) and MOVW.IBW
    (increment before) handling. The generated code has a negated offset and
    the wrong P bit. I couldn't run all.bash before because I tried this fix
    on a slow qemu emulator. Now I can test on a real Cortex-A9 machine, and
    compilation fails:

    pkg/go/build
    cmd/go
    ./make.bash: line 119: 12656 Segmentation fault "$GOTOOLDIR"/go_bootstrap clean -i std

    I checked the ARM assembler code in the tree. There are no instructions
    with the .IB suffix, but there are MOVM.IA instructions:

    $ find . -name "*_arm.s" -exec grep "\.IA" {} \; -print
    MOVM.IA.W [R0-R7], (R(TO))
    ./pkg/runtime/memclr_arm.s
    MOVM.IA.W [R1-R12], (R0)
    MOVM.IA.W (R0), [R1-R12]
    ./pkg/runtime/vlop_arm.s
    MOVM.IA.W (R(FROM)), [R1-R8]
    MOVM.IA.W [R1-R8], (R(TS))
    MOVM.IA.W (R(FROM)), [R(FR0),R(FR1),R(FR2),R(FR3)]
    MOVM.IA.W [R(FW0),R(FW1),R(FW2),R(FW3)], (R(TS))
    ./pkg/runtime/memmove_arm.s

    It seems that 5l treats C_PBIT and C_UBIT differently in MOVM than
    in MOVW:

    movm:
    	if(instoffset != 0)
    		diag("offset must be zero in MOVM");
    	o1 |= (p->scond & C_SCOND) << 28;
    	if(p->scond & C_PBIT)
    		o1 |= 1 << 24;
    	if(p->scond & C_UBIT)
    		o1 |= 1 << 23;
    	if(p->scond & C_SBIT)
    		o1 |= 1 << 22;
    	if(p->scond & C_WBIT)
    		o1 |= 1 << 21;
    	break;

    As you can see, the C_PBIT and C_UBIT flags are used without any negation.
    Because of this it will be hard to fix 5a without changing 5l (and
    probably 5c, 5g).
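
The difference between the two conventions can be shown side by side. This is a sketch under assumed flag values (not taken from the toolchain headers), with hypothetical function names, that reproduces only the P, U and W handling of the movm case and of olr.

```c
enum { C_PBIT = 1<<5, C_WBIT = 1<<6, C_UBIT = 1<<7 };	/* assumed values */

/* P/U/W bits as the movm case sets them: flags are used directly. */
unsigned
movm_puw(int sc)
{
	unsigned o = 0;

	if(sc & C_PBIT)
		o |= 1u << 24;
	if(sc & C_UBIT)
		o |= 1u << 23;
	if(sc & C_WBIT)
		o |= 1u << 21;
	return o;
}

/* P/U/W bits as olr sets them: C_PBIT and C_UBIT are negated. */
unsigned
olr_puw(int sc)
{
	unsigned o = 0;

	if(!(sc & C_PBIT))
		o |= 1u << 24;
	if(!(sc & C_UBIT))
		o |= 1u << 23;
	if(sc & C_WBIT)
		o |= 1u << 21;
	return o;
}
```

With the existing table, MOVM.IA.W carries C_UBIT|C_WBIT, which movm_puw turns into U=1, W=1, P=0 (increment after with writeback), while olr_puw gives P=1, U=0 for the same flags. If only the 5a table were swapped, MOVM.IA.W would carry C_PBIT|C_WBIT and the unmodified movm case would emit P=1, U=0. That may be related to the make.bash crash, though the thread does not confirm it.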
  • Dave Cheney at Nov 8, 2012 at 3:54 pm
    What does objdump -dS say?
  • Russ Cox at Nov 8, 2012 at 6:50 pm
    What if we make movm handle the C_PBIT the same way as movw and then
    update the meaning of .IA and .IB in the assembler?
  • Ziutek at Nov 8, 2012 at 9:03 pm

    On 8 Nov, 19:50, Russ Cox wrote:
    > What if we make movm handle the C_PBIT the same way as movw and then
    > update the meaning of .IA and .IB in the assembler?

    I don't understand the 5c/5g code well enough to fix this issue. I tried
    something like this:

    diff -r b4a8d6b52a2f src/cmd/5a/lex.c
    --- a/src/cmd/5a/lex.c Fri Nov 09 02:09:09 2012 +0900
    +++ b/src/cmd/5a/lex.c Thu Nov 08 21:56:00 2012 +0100
    @@ -270,14 +270,14 @@

    ".F", LS, C_FBIT,

    - ".IBW", LS, C_WBIT|C_PBIT|C_UBIT,
    - ".IAW", LS, C_WBIT|C_UBIT,
    - ".DBW", LS, C_WBIT|C_PBIT,
    - ".DAW", LS, C_WBIT,
    - ".IB", LS, C_PBIT|C_UBIT,
    - ".IA", LS, C_UBIT,
    - ".DB", LS, C_PBIT,
    - ".DA", LS, 0,
    + ".DAW", LS, C_WBIT|C_PBIT|C_UBIT,
    + ".DBW", LS, C_WBIT|C_UBIT,
    + ".IAW", LS, C_WBIT|C_PBIT,
    + ".IBW", LS, C_WBIT,
    + ".DA", LS, C_PBIT|C_UBIT,
    + ".DB", LS, C_UBIT,
    + ".IA", LS, C_PBIT,
    + ".IB", LS, 0,

    "@", LAT, 0,

    diff -r b4a8d6b52a2f src/cmd/5c/txt.c
    --- a/src/cmd/5c/txt.c Fri Nov 09 02:09:09 2012 +0900
    +++ b/src/cmd/5c/txt.c Thu Nov 08 21:56:00 2012 +0100
    @@ -559,7 +559,6 @@
    gmovm(Node *f, Node *t, int w)
    {
    gins(AMOVM, f, t);
    - p->scond |= C_UBIT;
    if(w)
    p->scond |= C_WBIT;
    }
    diff -r b4a8d6b52a2f src/cmd/5l/asm.c
    --- a/src/cmd/5l/asm.c Fri Nov 09 02:09:09 2012 +0900
    +++ b/src/cmd/5l/asm.c Thu Nov 08 21:56:00 2012 +0100
    @@ -1501,9 +1501,9 @@
    if(instoffset != 0)
    diag("offset must be zero in MOVM");
    o1 |= (p->scond & C_SCOND) << 28;
    - if(p->scond & C_PBIT)
    + if(!(p->scond & C_PBIT))
    o1 |= 1 << 24;
    - if(p->scond & C_UBIT)
    + if(!(p->scond & C_UBIT))
    o1 |= 1 << 23;
    if(p->scond & C_SBIT)
    o1 |= 1 << 22;

    but all.bash still fails (with a different error):

    pkg/text/template
    pkg/go/doc
    pkg/go/build
    cmd/go
    throw: cas64 failed
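
One way to sanity-check the patch above is to compare the MOVM bits it produces for each source of flags. The sketch below assumes illustrative flag values (not taken from the toolchain headers) and uses hypothetical helper names. It models only the P, U and W handling in the asm.c hunk and the scond that gmovm builds in the txt.c hunk.

```c
enum { C_PBIT = 1<<5, C_WBIT = 1<<6, C_UBIT = 1<<7 };	/* assumed values */

/* MOVM P/U/W bits before the asm.c change: flags used directly. */
unsigned
movm_orig_puw(int sc)
{
	unsigned o = 0;

	if(sc & C_PBIT)
		o |= 1u << 24;
	if(sc & C_UBIT)
		o |= 1u << 23;
	if(sc & C_WBIT)
		o |= 1u << 21;
	return o;
}

/* MOVM P/U/W bits after the asm.c change: P and U are negated. */
unsigned
movm_patched_puw(int sc)
{
	unsigned o = 0;

	if(!(sc & C_PBIT))
		o |= 1u << 24;
	if(!(sc & C_UBIT))
		o |= 1u << 23;
	if(sc & C_WBIT)
		o |= 1u << 21;
	return o;
}

/* Suffix flags gmovm sets before and after the txt.c change. */
int gmovm_scond_before(int w) { return C_UBIT | (w ? C_WBIT : 0); }
int gmovm_scond_after(int w)  { return w ? C_WBIT : 0; }
```

Under these assumptions, hand-written MOVM.IA.W (now C_PBIT|C_WBIT) keeps its old encoding of P=0, U=1, W=1. The compiler-generated case does not: before the patch gmovm gave P=0, U=1, and after it gives P=1, U=1 (increment before), because C_PBIT is never set. This suggests gmovm would need to set C_PBIT to keep its earlier encoding. Whether that is the cause of the cas64 failure is not established in the thread.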
  • Minux at Nov 6, 2012 at 7:33 pm
    P == 0: post indexed addressing (I've ignored LDRT here)
    P == 1: if W == 0, offset addressing, otherwise pre-indexed addressing
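
The rule above fits in a few lines of code. A minimal sketch, with a hypothetical function name, that folds the P == 0, W == 1 case (LDRT) into post-indexed, as the rule does:

```c
/* Addressing mode of an ARM LDR/STR from its P and W bits. */
const char *
addr_mode(int p, int w)
{
	if(p == 0)
		return "post-indexed";
	return w == 0 ? "offset" : "pre-indexed";
}
```

So P=1, W=0 is plain offset addressing such as ldr r1, [r0, #8], P=1, W=1 is the pre-indexed [r0, #8]! form, and P=0 is the post-indexed [r0], #8 form.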

Discussion Overview
group: golang-dev
categories: go
posted: Nov 2, '12 at 8:36p
active: Nov 8, '12 at 9:03p
posts: 16
users: 4
website: golang.org
