RE: [cc65] Using the hardware stack

From: Christian Krüger <Christian.Krueger1pace.de>
Date: 2007-06-28 14:52:04
 
Hi,

> It wouldn't be faster than an all-static approach.

Yes, of course.

> It would probably be a bit slower, because there's no STX absolute,Y 
> instruction, plus you lose the use of the Y register.

Which is not a big problem, since you can work around it with little
effort (tya, txa, pha and vice versa...).
 
> It might be more compact than all-static due to not having to 
> allocate static memory for locals/parameters. But you would 
> have to be very careful to avoid stack overflow - 256 bytes 
> is abnormally small for a C stack.

Correct. But this stack comes in addition to the 256-byte CPU
stack, where you would store return addresses, register variables,
etc. ;-)

> In fact, you would probably want to make the stack word-aligned and use
> 512 bytes. Or longword-aligned and 1024, or...

Why? Since there is no misalignment in the 6502 world this should not
be an issue. 
 
> It has some merit, but overall I prefer the static approach 
> for speed and simplicity.

The proposed solution is a compromise between your 'radical' approach
and the existing world.

BTW: Stack overflows could be easily handled in 'DEBUG'-mode
(to check if the tiny stack is sufficient):

; -------------------------------

; Classic (existing) implementation

.proc	pushax

	pha          ; (3)
	lda	sp     ; (6)
	sec          ; (8)
	sbc	#2     ; (10)
	sta	sp     ; (13)
	bcs	@L1    ; (16) branch taken, no borrow
	dec	sp+1   ; (+4 net when the low byte borrows)
@L1:	ldy	#1     ; (18)
	txa          ; (20)
	sta	(sp),y ; (26)
	pla          ; (30)
	dey          ; (32)
	sta	(sp),y ; (38)
	rts          ; (44)

.endproc


; -------------------------------
; Quick stack implementation

.if		STACK_CHECKING

	.macro		psa
				.local OK			; unique label per expansion (psa is used twice in pushax)
				dey
				bne OK
				jmp STACKOVERFLOW		; ok - only 255 bytes... ;-)
	OK:			sta STACK,y
	.endmacro

.else
	.macro		psa
				dey
				sta STACK,y
	.endmacro
.endif

	
.proc	pushax
	pha			; (3)
	txa			; (5)
	psa			; (dey)	   (7)
				; (sta STACK,y)(12)
	pla			; (16)
	psa			; (dey)	    (18)	
				; (sta STACK,y) (23)
	rts			; (29)
.endproc

---------------------------
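
To make the "only 255 bytes" remark concrete, here is a small Python model
of the psa macro. It is only a sketch: the class and function names are
hypothetical, and it assumes Y starts at 0 for an empty stack, which the
code above does not state.

```python
# Hypothetical model of the 'psa' macro; all names are illustrative only.
STACK_SIZE = 256


class QuickStackOverflow(Exception):
    """Stands in for the jump to STACKOVERFLOW."""


class QuickStack:
    def __init__(self, checking=True):
        self.mem = [0] * STACK_SIZE   # the STACK page
        self.y = 0                    # ASSUMPTION: Y = 0 means "empty"
        self.checking = checking      # mirrors STACK_CHECKING

    def psa(self, a):
        self.y = (self.y - 1) & 0xFF            # dey (8-bit wraparound)
        if self.checking and self.y == 0:       # bne falls through at 0
            raise QuickStackOverflow
        self.mem[self.y] = a & 0xFF             # sta STACK,y

    def pushax(self, a, x):
        self.psa(x)   # X (high byte) goes to the higher address
        self.psa(a)   # A (low byte) goes to the lower address


def usable_bytes():
    """Count how many pushes succeed before the overflow check fires."""
    stack = QuickStack(checking=True)
    pushed = 0
    try:
        while True:
            stack.psa(0xAA)
            pushed += 1
    except QuickStackOverflow:
        return pushed
```

Under that assumption the checked variant never writes to STACK+0, so one
byte of the page is given up for the cheap dey/bne test.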

As you can see, the QS implementation is quicker (29 versus 44 cycles)
and smaller, so inlining becomes a further option (for an additional
speed improvement).
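
The totals can be rechecked by summing the standard NMOS 6502 cycle counts
per instruction. The Python sketch below does that under a few assumptions
of mine: sp lives in the zero page, STACK is an absolute address, branches
are taken without crossing a page, and the jsr of the caller is not counted.
The list names are hypothetical.

```python
# Cycle counts from the standard NMOS 6502 instruction tables.
CYCLES = {
    "pha": 3, "pla": 4, "txa": 2, "sec": 2, "dey": 2, "rts": 6,
    "lda zp": 3, "sta zp": 3, "dec zp": 5,
    "sbc #imm": 2, "ldy #imm": 2,
    "sta (zp),y": 6, "sta abs,y": 5,
    "branch taken": 3, "branch not taken": 2,
}

# Classic pushax, common path: bcs is taken, sp+1 is left alone.
CLASSIC = ["pha", "lda zp", "sec", "sbc #imm", "sta zp", "branch taken",
           "ldy #imm", "txa", "sta (zp),y", "pla", "dey", "sta (zp),y",
           "rts"]

# Classic pushax when the low byte borrows: bcs falls through to dec sp+1.
CLASSIC_BORROW = ["pha", "lda zp", "sec", "sbc #imm", "sta zp",
                  "branch not taken", "dec zp",
                  "ldy #imm", "txa", "sta (zp),y", "pla", "dey",
                  "sta (zp),y", "rts"]

# Quick stack pushax with STACK_CHECKING off.
QUICK = ["pha", "txa", "dey", "sta abs,y",
         "pla", "dey", "sta abs,y", "rts"]

# Quick stack pushax with STACK_CHECKING on (bne taken, no overflow).
QUICK_CHECKED = ["pha", "txa", "dey", "branch taken", "sta abs,y",
                 "pla", "dey", "branch taken", "sta abs,y", "rts"]


def total(sequence):
    """Sum the cycle counts of a list of instruction keys."""
    return sum(CYCLES[instruction] for instruction in sequence)


if __name__ == "__main__":
    for name, seq in [("classic", CLASSIC),
                      ("classic, borrow", CLASSIC_BORROW),
                      ("quick", QUICK),
                      ("quick, checked", QUICK_CHECKED)]:
        print(f"{name}: {total(seq)} cycles")
```

With these assumptions the sums come out at 44 and 29 cycles, matching the
figures above; the borrow path of the classic routine and the checked
variant of the quick one each cost a few cycles more.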

Anyhow: Your 'leaf' and 'register' optimizations are
highly appreciated in conjunction!

chrisker
Received on Thu Jun 28 14:52:16 2007