Re: [cc65] Cracknut...

Date view Thread view Subject view

From: Ullrich von Bassewitz (uz_at_musoftware.de)
Date: 2003-05-18 04:44:00


On Sun, May 18, 2003 at 01:25:33AM +0200, Groepaz wrote:
> now the question is what that actually proves.... a) cc65 doesnt have a
> peephole optimizer or b) the peephole rules are not sufficient.

It proves that

  c) you haven't read the docs:-)

The coding rules state explicitly that you shouldn't use post increment (and I
do also remember telling you that several times when talking about your
programs). When writing the code the proper way

        x = 0;
        do {
            ++x;
            buf[x] = 0;
        } while (x);

you will get the following assembler code:

                lda     #$00
                sta     _x
        L000A:	inc     _x
                lda     #<(_buf)
                ldx     #>(_buf)
                clc
                adc     _x
                bcc     L000F
                inx
        L000F:  sta     sreg
                stx     sreg+1
                lda     #$00
                tay
                sta     (sreg),y
                lda     _x
                bne     L000A
                rts

This is almost identical to what you propose the compiler should generate.

> and whatever
> it is, i am very tempted to help with improving it :=)

I would say that you overestimate the potentialities of a peephole optimizer.
The problems with a peephole optimizer are:

    1. It can only handle patterns it knows.

    2. A small change in code generation may break lots of patterns.

    3. There are lots of order dependencies. Some patterns do only work after
       others have been replaced, some more patterns must be replaced before
       others.

    4. Checking for patterns, especially larger ones (like in your example)
       gets a mess rather quickly, because small variances in several places
       add up to dozens of different resulting patterns. It is easy for a
       human to see the common base in all these patterns, but not for a
       computer.

Problem #1 means that you can *always* come up with a sequence that runs
through a peephole optimizer without major changes - just use a pattern it
doesn't know. This is also the answer to the following question:

> the second question goes one (or more :=)) step further... is there a way to
> make the compiler access arrays <=256 elements via indexed addressing mode
> rather than indirect?

The replacement you're mentioning here is indeed done for loads (your plasma
demo shows this [or should - seems my current development version of the
compiler breaks it]). So yes, it is possible. But because of problems 1-4,
adding more patterns doesn't always make sense, I'm doing this usually for
heavily used patterns.

The cc65 optimizer has about 60 different rules. Not all, but most of them are
what you would call "peephole optimizing", and many do handle more than just
one pattern. If you define an environment variable named "CC65_OPTSTATS" with
the name of a file, the compiler will generate a text file with this name that
contains single and cumulated statistics about the steps applied, and those
that were successful. The file is then updated with every compiler run.

> (btw could you tell in
> a few words what type of optimizations cc65 actually does?... and if there is
> peephole optimization, could you point me to the file with the rules in it?
> :))

The cc65 optimizer is much more complex, so there are no static rules. Code
optimization is done on the 6502 level only (this is the reason why sometimes
the generated code is really clever, especially for simple code, and sometimes
absolutely ugly). The following files are concerned with optimization:

        codeinfo.c codeopt.c coptadd.c coptcmp.c coptneg.c coptsize.c
        coptstore.c copttest.c coptc02.c coptind.c coptpush.c coptstop.c
        coptsub.c

You will also want to have a look at the following modules, they contain data
structures and functions for the abstract representation of the 6502 code, the
labels and so on:

        codeent.c codelab.c codeseg.c opcodes.c reginfo.c

> however...the first thing sounds actually very doable to me (peephole
> optimizing is nothing more than pattern matching anyway) while the second
> appears to be the real cracknut here :o)

See my comments above. Maybe I'm wrong, but it would say that you're
underestimating the problems. Look at the generated code above, then assume we
have some simple changes:

    1. The programmer may write

            buf[++x] = 0;

       or

            buf[x++] = 0;

       instead of

            ++x;
            buf[x] = 0;

    2. We may have a pointer that points to the buffer. Something like

        char x;
        char* buf;

        void main (void)
        {
            x = 0;
            do {
                ++x;
                buf[x] = 0;
            } while (x);
        }

    3. Either the buffer or the index may be an auto variable on the stack.

    4. The index may have some arithmetics applied like in

            buf[x+1] = 0;

    5. The compiler/code generator may use some other zero page register
       instead of sreg.

Just take these (rather simple) variations and calculate how many resulting
patterns you get. If you don't find a better solution instead of "nothing more
than pattern matching", you will end up with a dozens of patterns just for
this case. And all of these patterns may have errors and may need changes when
one single line in the code generator gets changed.

> Nevertheless improving the peephole
> stuff looks very promising to me

For me it seems to be pretty maxed out. Just have a look at the list of files
concerned with this sort of stuff...

> (i've spotted rts following immediatly on
> jsr a couple of times in compiled code aswell...) ... tjam whatever :)

"jsr xxx/rts" should always get replaced by "jmp xxx". If you do *really* find
it in optimized code, you can send me a bug report.

Regards


        Uz


-- 
Ullrich von Bassewitz                                  uz_at_musoftware.de
----------------------------------------------------------------------
To unsubscribe from the list send mail to majordomo_at_musoftware.de with
the string "unsubscribe cc65" in the body(!) of the mail.


Date view Thread view Subject view

This archive was generated by hypermail 2.1.3 : 2003-05-18 04:44:21 CEST