From: Groepaz (groepaz_at_gmx.net)
Date: 2003-05-18 01:25:33
this one kind of results from a little discussion we had on the go64
mailing list... it may be interesting to hear some comments :)

ok.. the keywords are "efficiency of generated code" and "peephole optimizing"
(we were discussing compilers in general).

i came up with a simple code snippet that demonstrates quite drastically how
bloated compiled code can get... at least when the compiler is small-c based
:=P
first in handcoded asm:
buf:    .res $100
main:
        ldx #0
        txa
lp:
        sta buf,x
        inx
        bne lp
        rts
i think what it does is obvious: it fills all 256 bytes of buf with zero.
now here is a snippet of C code doing the same thing:
char x;
char buf[0x100];
void main(void)
{
    x=0;
    do
    {
        buf[x++]=0;
    } while(x);
}
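cc65 generates the following code (-Osir)... please note the comments marking
the remaining peepholes: the commented-out lines are what cc65 emitted, the
lines right after them are my hand-made replacements (more on that later...)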
_x:     .res 1,$00
_buf:   .res 256,$00
_main:
        lda #$00
        sta _x
L0006:
;-peephole start
;       lda _x
;       pha
;       clc
;       adc #$01
;       sta _x
;       pla
;-peephole end
        inc _x
        lda _x
; (note: this leaves the incremented x in A, while the original left the
; old value there; harmless here, since all 256 bytes get written anyway)
;-peephole optimization end
;-peephole start
;       clc
;       adc #<(_buf)
;       tay
;       lda #$00
;       adc #>(_buf)
;       tax
;       tya
;-peephole end
        clc
        ldx #>(_buf)
        adc #<(_buf)
        bcc @s
        inx
@s:
;-peephole optimization end
        sta sreg
        stx sreg+1
        lda #$00
        tay
        sta (sreg),y
        lda _x
        bne L0006
        rts
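a quick way to convince yourself that the second hand-made replacement is safe is to model both sequences in C and compare them exhaustively. this is only a sketch: the struct and function names are made up for illustration, and it looks at A and X only (flags and Y are ignored, because the code that follows reloads Y and does not depend on the flags).

```c
#include <stdint.h>

/* Registers that matter after the address computation:
   A = low byte of (_buf + index), X = high byte. */
typedef struct { uint8_t a, x; } regs;

/* Model of the sequence cc65 emitted:
   clc / adc #<buf / tay / lda #$00 / adc #>buf / tax / tya */
static regs add_index_generated(uint8_t a, uint16_t buf)
{
    unsigned sum   = a + (buf & 0xFFu);              /* clc, adc #<buf      */
    uint8_t  low   = (uint8_t)sum;                   /* tay (low byte kept) */
    unsigned carry = sum >> 8;                       /* carry out of the add */
    uint8_t  high  = (uint8_t)((buf >> 8) + carry);  /* lda #$00, adc #>buf */
    regs r = { low, high };                          /* tax, tya            */
    return r;
}

/* Model of the hand-made replacement:
   clc / ldx #>buf / adc #<buf / bcc @s / inx */
static regs add_index_peephole(uint8_t a, uint16_t buf)
{
    regs r;
    unsigned sum;
    r.x = (uint8_t)(buf >> 8);       /* ldx #>buf                 */
    sum = a + (buf & 0xFFu);         /* clc, adc #<buf            */
    r.a = (uint8_t)sum;
    if (sum > 0xFFu)                 /* bcc @s skips inx if C = 0 */
        r.x++;                       /* inx                       */
    return r;
}

/* Returns 1 if both models agree for every index and every base address. */
int peephole_is_equivalent(void)
{
    unsigned a, buf;
    for (buf = 0; buf <= 0xFFFFu; buf++) {
        for (a = 0; a <= 0xFFu; a++) {
            regs g = add_index_generated((uint8_t)a, (uint16_t)buf);
            regs p = add_index_peephole((uint8_t)a, (uint16_t)buf);
            if (g.a != p.a || g.x != p.x)
                return 0;
        }
    }
    return 1;
}
```

a real peephole rule would also have to check that nothing after the pattern relies on Y or on the carry flag, which this model deliberately leaves out.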
after removing those 2 peepholes we get:
_main:
        lda #$00
        sta _x
L0006:
        inc _x
        lda _x
        clc
        ldx #>(_buf)
        adc #<(_buf)
        bcc @s
        inx
@s:
        sta sreg
        stx sreg+1
        lda #$00
        tay
        sta (sreg),y
        lda _x
        bne L0006
        rts
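to put rough numbers on "bloated", here is a small C sketch that adds up the size and cycle count of one loop iteration for both versions. the figures are the usual NMOS 6502 timings, and they rest on a few assumptions that are not stated in the listing: _x and _buf live in normal (absolute) memory, sreg is in the zero page, branches stay within a page, and the low-byte add does not carry (so the bcc is taken and the inx is skipped). the type and function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* One instruction of a loop body: source text, size in bytes, cycles. */
typedef struct { const char *text; int bytes; int cycles; } insn;

#define COUNT_OF(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hand-written loop body, from "lp:" to "bne lp". */
static const insn hand_loop[] = {
    { "sta buf,x",    3, 5 },  /* absolute,X store */
    { "inx",          1, 2 },
    { "bne lp",       2, 3 },  /* branch taken */
};

/* Loop body of the cc65 output with both hand-made replacements applied. */
static const insn cc65_loop[] = {
    { "inc _x",       3, 6 },
    { "lda _x",       3, 4 },
    { "clc",          1, 2 },
    { "ldx #>(_buf)", 2, 2 },
    { "adc #<(_buf)", 2, 2 },
    { "bcc @s",       2, 3 },  /* taken on the assumed no-carry path */
    { "inx",          1, 0 },  /* skipped on that path: size only */
    { "sta sreg",     2, 3 },
    { "stx sreg+1",   2, 3 },
    { "lda #$00",     2, 2 },
    { "tay",          1, 2 },
    { "sta (sreg),y", 2, 6 },
    { "lda _x",       3, 4 },
    { "bne L0006",    2, 3 },  /* branch taken */
};

/* Sum of the cycle counts of n instructions. */
int total_cycles(const insn *code, size_t n)
{
    int sum = 0;
    size_t i;
    assert(code != NULL);
    for (i = 0; i < n; i++)
        sum += code[i].cycles;
    return sum;
}

/* Sum of the encoded sizes of n instructions. */
int total_bytes(const insn *code, size_t n)
{
    int sum = 0;
    size_t i;
    assert(code != NULL);
    for (i = 0; i < n; i++)
        sum += code[i].bytes;
    return sum;
}
```

under those assumptions the hand-written loop body comes to 6 bytes and 10 cycles per iteration, the cc65 one to 28 bytes and 42 cycles, so roughly a factor of four in both size and speed.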
now the question is what that actually proves... either a) cc65 doesn't have a
peephole optimizer, or b) its peephole rules are not sufficient. whichever
it is, i am very tempted to help with improving it :=) (btw, could you tell me
in a few words what kinds of optimizations cc65 actually does? and if there is
peephole optimization, could you point me to the file with the rules in it?
:))
the second question goes one (or more :=)) step further... is there a way to
make the compiler access arrays of <= 256 bytes via indexed addressing
(sta buf,x) rather than indirect addressing (sta (sreg),y)? that could reduce
the above loop even further, probably to something quite close to the
hand-written code shown first. (i know it will probably involve major changes
to the compiler/code generator/whatever, but i'd like to hear your comments
anyway... maybe there's a small chance or something :=P)
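just to make the idea concrete, the decision could look something like the following C sketch. everything in it is hypothetical: the struct, the enum and the function name are invented for this example and have nothing to do with how cc65's code generator is actually organized. the conditions are my guess at what would have to hold for `sta buf,x` to be safe: the array address is a link-time constant, the whole array fits in 256 bytes, elements are single bytes, and the index is an 8-bit value.

```c
#include <assert.h>
#include <stddef.h>

/* Addressing mode a code generator could pick for an array store. */
typedef enum {
    ADDR_INDEXED,   /* sta buf,x      : constant base, index in X          */
    ADDR_INDIRECT   /* sta (sreg),y   : computed pointer in the zero page  */
} addr_mode;

/* Hypothetical description of one array access as the compiler sees it. */
typedef struct {
    int    address_is_constant;  /* base known at link time (global/static) */
    size_t size_bytes;           /* total size of the array                 */
    size_t element_size;         /* size of one element in bytes            */
    int    index_bits;           /* width of the index expression           */
} array_access;

/* Pick indexed addressing only when an 8-bit index register can reach
   every element; otherwise fall back to the general indirect access. */
addr_mode choose_addr_mode(const array_access *acc)
{
    assert(acc != NULL);
    if (acc->address_is_constant &&
        acc->size_bytes <= 256 &&
        acc->element_size == 1 &&
        acc->index_bits <= 8)
        return ADDR_INDEXED;
    return ADDR_INDIRECT;
}
```

for the example above (a global char buf[0x100] indexed by a char) all four conditions hold, so this rule would select the indexed form.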
however... the first thing actually sounds very doable to me (peephole
optimizing is nothing more than pattern matching anyway), while the second
appears to be the really tough nut to crack here :o) nevertheless, improving
the peephole stuff looks very promising to me (i've also spotted an rts
immediately following a jsr a couple of times in compiled code...) ... well,
whatever :)
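to show what "nothing more than pattern matching" could mean in practice, here is a minimal C sketch of one such rule, the jsr/rts case: a `jsr X` directly followed by `rts` becomes `jmp X`. this is not how cc65 does it (i don't know that yet), just an illustration. the function name and the fixed-size line buffers are assumptions, and the sketch expects one lowercase instruction or label per line with no leading whitespace. a label sitting between the jsr and the rts blocks the match, since the rts may then be a jump target.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_LEN 32   /* maximum length of one source line, including NUL */

/* Replace every "jsr X" that is directly followed by "rts" with "jmp X".
   Works in place on an array of lines and returns the new line count. */
size_t peephole_jsr_rts(char lines[][LINE_LEN], size_t n)
{
    size_t in = 0, out = 0;

    assert(lines != NULL || n == 0);
    while (in < n) {
        if (in + 1 < n &&
            strncmp(lines[in], "jsr ", 4) == 0 &&
            strcmp(lines[in + 1], "rts") == 0) {
            char target[LINE_LEN];
            strcpy(target, lines[in] + 4);   /* save the call target */
            snprintf(lines[out], LINE_LEN, "jmp %s", target);
            in += 2;                         /* consume jsr and rts  */
        } else {
            if (out != in)
                strcpy(lines[out], lines[in]);  /* close the gap */
            in++;
        }
        out++;
    }
    return out;
}
```

usage would be along these lines: fill a `char code[][LINE_LEN]` array with the compiler output, call `peephole_jsr_rts(code, count)`, and write out the first `n` lines it returns. each match saves one byte and a handful of cycles.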
gpz