cc65 internals

Brad Smith


Internal details of cc65 code generation, such as the expected linker configuration, and calling assembly functions from C.

1. Linker configuration

2. Calling assembly functions from C


1. Linker configuration

The C libraries and code generation depend directly on a suitable linker configuration. There are premade configuration files in the cfg/ directory, and the linker normally picks one according to the selected target. These can be used as templates for customization.

The C libraries depend on several special segments to be defined in your linker configuration. Generated code will also use some of them by default. Some platform libraries have additional special segments.

Memory areas may be defined in whatever way suits each platform. The segments they contain act as a layer of semantics and abstraction, so that much of the reorganization can be done in the linker config rather than through platform-specific source code changes.

1.1 ZEROPAGE segment

Used by the C library and generated code for efficient internal and temporary state storage, also called "pseudo-registers".

1.2 STARTUP segment

Used by each platform instance of the C library in crt0.s to contain the entry point of the program.

The startup module will export __STARTUP__ : absolute = 1 to force the linker to always include crt0.s from the library.

1.3 CODE segment

The default segment for generated code. Most C library code is also located here.

Use #pragma code-name to redirect generated code to another segment.

1.4 BSS segment

Used for uninitialized variables. Originally an acronym for "Block Started by Symbol", but the meaning of this is now obscure.

Use #pragma bss-name to redirect uninitialized variables to another segment.

1.5 DATA segment

Used for initialized variables.

On some platforms, this segment may be initialized as part of the program loading process. On others, it may have separate LOAD and RUN addresses, allowing copydata to copy the initial values from their load location to their run location in RAM.

Use #pragma data-name to redirect initialized variables to another segment.

1.6 RODATA segment

Used for read-only (constant) data.

Use #pragma rodata-name to redirect constant data to another segment.
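
As a minimal sketch of the four segment pragmas described in sections 1.3 to 1.6, the fragment below places one function and three objects into non-default segments. The segment names BANKCODE, EXTBSS, EXTDATA and BANKRODATA are hypothetical and would each need to be defined in your linker configuration. The push/pop form is used so that the previous segment name is restored afterwards.

```c
/* Hypothetical segment names: each must exist in the linker configuration. */

#pragma code-name (push, "BANKCODE")
int add_one(int v)
{
    /* Generated code for this function goes to BANKCODE instead of CODE */
    return v + 1;
}
#pragma code-name (pop)

#pragma bss-name (push, "EXTBSS")
unsigned char scratch[16];          /* uninitialized: EXTBSS instead of BSS */
#pragma bss-name (pop)

#pragma data-name (push, "EXTDATA")
int counter = 5;                    /* initialized: EXTDATA instead of DATA */
#pragma data-name (pop)

#pragma rodata-name (push, "BANKRODATA")
const char greeting[] = "hello";    /* constant: BANKRODATA instead of RODATA */
#pragma rodata-name (pop)
```

A compiler other than cc65 will normally ignore these pragmas, so the same source can still be built elsewhere for testing.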

1.7 FEATURES table

This currently defines table locations for the CONDES constructor, destructor, and interruptor features. Some platform libraries use these.

The constructors are called by initlib at startup, and the destructors by donelib at program exit. Interruptors are called by callirq.

2. Calling assembly functions from C

2.1 Calling conventions

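There are two calling conventions used in cc65:

- cdecl: passes all parameters on the C-stack.
- fastcall: passes the rightmost parameter in the A/X/sreg registers and all others on the C-stack.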

The default convention is fastcall, but this can be changed with the --all-cdecl command line option. If a convention is specified in the function's declaration, that convention will be used instead. Variadic functions will always use cdecl convention.

If the --standard command line option is used, the cdecl and fastcall keywords will not be available. The standard compliant variations __cdecl__ and __fastcall__ are always available.

If a function has a prototype, parameters are pushed to the C-stack as their respective types (i.e. a char parameter will push 1 byte), but if a function has no prototype, default promotions will apply. This means that with no prototype, char will be promoted to int and be pushed as 2 bytes. "K & R"-style forward declarations may be used, but they will function the same as if no prototype was used.
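
The following is a small sketch of how an assembly function might be declared on the C side with an explicit convention. The function names low_byte and add_bytes are hypothetical, and the C bodies are only stand-ins so that the fragment is complete. In a real project the bodies would be written in assembly and exported as _low_byte and _add_bytes. The #ifndef block is an assumption added for illustration, so that the file also builds with a compiler other than cc65.

```c
/* Illustration only: let the file build with compilers other than cc65 */
#ifndef __CC65__
#define __fastcall__
#define __cdecl__
#endif

/* Prototypes as the C caller would see them.
 * fastcall: "value" is the rightmost parameter, so it arrives in A/X.
 * cdecl:    both parameters arrive on the C-stack, 1 byte each. */
unsigned char __fastcall__ low_byte(unsigned value);
unsigned __cdecl__ add_bytes(unsigned char a, unsigned char b);

/* C stand-ins for the (hypothetical) assembly implementations */
unsigned char __fastcall__ low_byte(unsigned value)
{
    return (unsigned char)(value & 0xFF);
}

unsigned __cdecl__ add_bytes(unsigned char a, unsigned char b)
{
    return (unsigned)a + b;
}
```

Because both functions have prototypes, the char parameters of add_bytes are pushed as single bytes. Without the prototypes they would be promoted to int and pushed as 2 bytes each.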

2.2 Prologue, before the function call

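If the function is declared as fastcall, the rightmost argument will be loaded into the A/X/sreg registers:

- A: 8-bit parameter, or low byte of larger types
- X: high byte of a 16-bit parameter, or second byte of a 32-bit parameter
- sreg: zeropage pseudo-register holding the high 2 bytes of a 32-bit parameter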

All other parameters will be pushed to the C-stack from left to right. The rightmost parameter will have the lowest address on the stack, and multi-byte parameters will have their least significant byte at the lower address.

The sp pseudo-register is a zeropage pointer to the base of the C-stack. If the function is variadic, the Y register will contain the number of bytes pushed to the stack for this function.

Example:

// C prototype
void cdecl foo(unsigned bar, unsigned char baz);

; C-stack layout within the function:
;
;            +------------------+
;            | High byte of bar |
; Offset 2 ->+------------------+
;            | Low byte of bar  |
; Offset 1 ->+------------------+
;            | baz              |
; Offset 0 ->+------------------+

; Example code for accessing bar. The variable is in A/X after this code snippet:
;
    ldy     #2      ; Offset of high byte of bar
    lda     (sp),y  ; High byte now in A
    tax             ; High byte now in X
    dey             ; Offset of low byte of bar
    lda     (sp),y  ; Low byte now in A
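
The layout above can also be checked with a small C model. This is only a sketch: stack_mem, sp_index and the helper functions are hypothetical names that imitate a downward-growing C-stack in a byte array, and they are not part of the cc65 runtime.

```c
#include <stddef.h>

/* Model only: a byte array standing in for the C-stack, growing downward */
static unsigned char stack_mem[16];
static size_t sp_index = sizeof stack_mem;   /* plays the role of sp */

static void push_byte(unsigned char v)
{
    stack_mem[--sp_index] = v;
}

static void push_word(unsigned v)
{
    push_byte((unsigned char)((v >> 8) & 0xFF)); /* high byte: higher address */
    push_byte((unsigned char)(v & 0xFF));        /* low byte: lower address */
}

/* Caller side of foo(bar, baz) under cdecl: push from left to right */
static void call_foo_model(unsigned bar, unsigned char baz)
{
    push_word(bar);
    push_byte(baz);
}

/* Callee side: same offsets as the ldy #2 / lda (sp),y sequence */
static unsigned read_bar(void)
{
    unsigned hi = stack_mem[sp_index + 2];
    unsigned lo = stack_mem[sp_index + 1];
    return (hi << 8) | lo;
}

static unsigned char read_baz(void)
{
    return stack_mem[sp_index + 0];
}
```

After call_foo_model(0x1234, 0x56), read_baz() finds baz at offset 0 and read_bar() reassembles bar from offsets 1 and 2, matching the diagram.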

2.3 Epilogue, after the function call

Return requirements

If the function has a return value, it will appear in the A/X/sreg registers.

Functions with an 8-bit return value (char or unsigned char) are expected to promote this value to a 16-bit integer on return, and store the high byte in X. The compiler will depend on the promoted value in some cases (e.g. implicit conversion to int), and failure to return the high byte in X will cause unexpected errors. This problem does not apply to the sreg pseudo-register, which is only used if the return type is 32-bit.
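
To illustrate why the high byte matters, here is a rough C model of how a caller might interpret the A/X pair when an 8-bit return value is converted to int. The function ax_to_int is a hypothetical helper for this illustration and is not part of cc65.

```c
/* Model only: the 16-bit value a caller sees in A/X (A = low, X = high) */
static unsigned ax_to_int(unsigned char a, unsigned char x)
{
    return ((unsigned)x << 8) | a;
}

/* An assembly function that returns 0x2A in A and clears X */
static unsigned good_return(void)
{
    return ax_to_int(0x2A, 0x00);
}

/* The same function, but leaving a stale 0x7F behind in X */
static unsigned bad_return(void)
{
    return ax_to_int(0x2A, 0x7F);
}
```

In this model good_return() yields 0x2A as intended, while bad_return() yields 0x7F2A, which is the kind of unexpected result a stale X register can produce.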

If the function has a void return type, the compiler will not depend on the result of A/X/sreg, so these may be clobbered by the function.

The C-stack pointer sp must be restored by the function to its value before the function call prologue. It may pop all of its parameters from the C-stack (e.g. using the runtime function popa), or it could adjust sp directly. If the function is variadic, the Y register contains the number of bytes pushed to the stack on entry, which may be added to sp to restore its original state.

The internal pseudo-register regbank must not be changed by the function.

Clobbered state

The Y register may be clobbered by the function. The compiler will not depend on its state after a function call.

The A/X/sreg registers may be clobbered if any of them are not used by the return value (see above).

Many of the internal pseudo-registers used by cc65 are available for free use by any function called by C, and do not need to be preserved. Note that if another C function is called from your assembly function, it may clobber any of these itself: