[cc65] code generation

From: Groepaz (groepaz_at_gmx.net)
Date: 2003-09-10 05:26:50


i'm currently playing with optimizing that raycaster as much as i can without 
writing too much external assembly (which unfortunately cannot be completely 
avoided) and stumbled upon some things...

-
an expression like 

register unsigned short x;
register unsigned char xx;
x+=(xx>>2);

translates to

	lda     regbank+1
	lsr     a
	lsr     a
	sec
	eor     #$FF
	adc     regbank+4
	sta     regbank+4
	lda     #$00
	eor     #$FF
	adc     regbank+4+1
	sta     regbank+4+1

two things here... 1) i think the immediate eor after the immediate lda should 
really be caught by the optimizer (lda #$00 / eor #$FF is just lda #$FF) :) 
2) it may be a nice possible optimisation to use branch/increment type code 
for the high byte when adding an unsigned char to an unsigned short.
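
the branch/increment idea from point 2 can be modelled in portable C by 
treating the 16-bit value as two separate bytes. this is only a sketch of the 
arithmetic, not cc65 output; the function name add_u8_to_u16 and the byte-pair 
layout are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: add an 8-bit value to a 16-bit value that is
   handled as separate low/high bytes, the way a 6502 sees it. */
static uint16_t add_u8_to_u16(uint16_t x, uint8_t b)
{
    uint8_t lo = (uint8_t)(x & 0xFF);
    uint8_t hi = (uint8_t)(x >> 8);

    unsigned sum = (unsigned)lo + b;   /* lda lo / clc / adc b */
    lo = (uint8_t)sum;                 /* sta lo */
    if (sum > 0xFF) {                  /* bcc would skip this block */
        hi = (uint8_t)(hi + 1);        /* inc hi */
    }
    return (uint16_t)((hi << 8) | lo);
}
```

the high byte is only touched when the low-byte addition carries, which is 
the whole point compared to a second full adc on the high byte.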

-

to circumvent the above i tried the following macro

#define uaddsc(_a,_b) \
	( \
	__asm__ ("lda %v", _a), \
	__asm__ ("clc"), \
	__asm__ ("adc %v", _b), \
	__asm__ ("sta %v", _a), \
	__asm__ ("bcc @_l"), \
	__asm__ ("inc %v+1", _a), \
	__asm__ ("@_l:") \
	)

mmmh... two problems here

1) for the heck of it, i can't find out how to make the branch work :=P 
neither a local label, nor a "*+3", nor anything else i could think of would 
do the trick... any ideas? :) (something to generate a unique local label 
could be a solution)

2) if any of the arguments passed to the macro are register variables, the 
resulting code will be all wrong - it generates references to bogus memory 
locations instead of the register variables :=P
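
one way a unique label per use could be generated is by pasting __LINE__ into 
the label name. the sketch below shows only the preprocessor part, using plain 
C goto labels so it can be compiled anywhere; whether cc65's __asm__ accepts a 
label built this way is untested (its argument is a string, so the name would 
additionally have to be stringified). UNIQUE_LABEL, UADD8 and add_two_bytes 
are hypothetical names.

```c
#include <stdint.h>

/* Two-level paste so that __LINE__ is expanded before ## is applied. */
#define PASTE_(a, b) a##b
#define PASTE(a, b)  PASTE_(a, b)
#define UNIQUE_LABEL(prefix) PASTE(prefix, __LINE__)

/* Hypothetical macro: add byte b to the lo/hi pair and skip the
   high-byte increment via a label that differs per source line. */
#define UADD8(lo, hi, b)                                   \
    do {                                                   \
        unsigned sum_ = (unsigned)(lo) + (unsigned)(b);    \
        (lo) = (uint8_t)sum_;                              \
        if (sum_ <= 0xFF) goto UNIQUE_LABEL(skip_);        \
        (hi) = (uint8_t)((hi) + 1);                        \
        UNIQUE_LABEL(skip_): ;                             \
    } while (0)

static uint16_t add_two_bytes(uint16_t x, uint8_t a, uint8_t b)
{
    uint8_t lo = (uint8_t)(x & 0xFF);
    uint8_t hi = (uint8_t)(x >> 8);

    UADD8(lo, hi, a);   /* expands to a label named after this line */
    UADD8(lo, hi, b);   /* different line, so a different label */

    return (uint16_t)((hi << 8) | lo);
}
```

the obvious limitation: two uses of the macro on the same source line would 
produce the same label and collide.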

-

this macro

#define SINUS(_x) (unsigned char) \
	(__AX__= (_x), \
	__asm__ ("tay"), \
	__asm__ ("lda %v,y", sinustablelow), \
	__AX__)

unsigned char i;
register unsigned char xx;
	xx=SINUS(i);


gives this:

	lda     L04C6
	tay
	lda     _sinustablelow,y
	sta     regbank+1
 
two things that could come in handy here...

1) additional pseudo-variables for the X and Y registers, so that this kind of 
macro can be written more efficiently
2) i am wondering why the optimizer doesn't change the lda/tay into ldy (it 
_does_ remove the ldx anyway!) ... i've actually seen much more unnecessary 
use of the y register than of the x register. (in fact it's pretty smart about 
removing unneeded x-register loads if you arrange the code right)
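
to make point 2 concrete, here is a toy peephole rule in portable C that 
rewrites "lda M / tay / lda ..." into "ldy M / lda ...". it is not how cc65's 
optimizer is implemented; struct insn and fold_lda_tay are invented for this 
sketch. the rule only fires when the instruction after the tay reloads A (so 
the first value in A is dead) and when the operand is a plain one without 
indexing or indirection, since ldy has fewer addressing modes than lda.

```c
#include <string.h>

/* One instruction: mnemonic plus operand text (hypothetical layout). */
struct insn {
    char op[4];
    char arg[24];
};

/* Toy rule: "lda M / tay / lda ..." becomes "ldy M / lda ...".
   Rewrites the array in place and returns the new instruction count. */
static int fold_lda_tay(struct insn *code, int n)
{
    int out = 0;
    for (int i = 0; i < n; ++i) {
        if (i + 2 < n &&
            strcmp(code[i].op, "lda") == 0 &&
            strcmp(code[i + 1].op, "tay") == 0 &&
            strcmp(code[i + 2].op, "lda") == 0 &&   /* A is reloaded, so dead */
            strchr(code[i].arg, ',') == NULL &&     /* no indexed operand */
            strchr(code[i].arg, '(') == NULL) {     /* no indirect operand */
            struct insn ldy = code[i];
            memcpy(ldy.op, "ldy", 4);               /* same operand, new mnemonic */
            code[out++] = ldy;
            ++i;                                    /* skip the tay */
            continue;
        }
        code[out++] = code[i];
    }
    return out;
}
```

fed with the four instructions generated for SINUS above, this leaves three: 
ldy L04C6, lda _sinustablelow,y, sta regbank+1.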

-

additional questions...

- what impact does it have if i write inline macros and don't load parameters 
using the __AX__ pseudo-variable, but directly as in the uaddsc macro above? 
i think it shouldn't make a difference, except that using __AX__ will help the 
optimizer to e.g. remove the lda/ldx altogether if the value is currently in 
a/x anyway... but maybe there is more :)

- i'd like to page-align some global arrays (since that saves some cycles for 
free on 0x100-byte tables by avoiding page-boundary crossings) ... is that 
possible in C somehow?

- how much do the c64 libraries depend on the system interrupt, i.e. $ea31 
being called frequently? will it for example affect the clock() call if i use 
my own interrupt handler? (i'd like to skip the kernel stuff altogether and 
just do some simple joystick stuff)... i need a working clock/time for 
syncing :)
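
regarding the page-alignment question above, the saving can be put into 
numbers: on a 6502 an "lda abs,y" read takes 4 cycles, plus 1 when base+index 
crosses into the next page. the sketch below just models that rule in portable 
C; the function names and the example addresses are made up, and it says 
nothing about how cc65 or the linker would actually place a table.

```c
#include <stdint.h>

/* Cycles for a 6502 "lda abs,y" read: 4, plus 1 if adding the index
   to the base address crosses a page boundary. */
static unsigned lda_abs_y_cycles(uint16_t base, uint8_t index)
{
    unsigned crossed = ((base & 0xFFu) + index) > 0xFFu;
    return 4u + crossed;
}

/* A table is page-aligned when the low byte of its address is zero. */
static int is_page_aligned(uint16_t addr)
{
    return (addr & 0xFFu) == 0;
}
```

with a page-aligned 0x100-byte table no index can ever cross a page, so every 
lookup stays at 4 cycles.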

mmmmh... that's all for now i guess... back to coding :)

gpz


