Re: [cc65] Charset handling question...

From: Greg King <greg.king41verizon.net>
Date: 2007-07-29 15:34:32
From: "Andreas Koch"; on Sunday, July 29, 2007; at 04:24 AM -0400
>
> I'm in the process of writing an app. that is going to be ported to
> various platforms and compilers/assemblers.
> It internally works using ASCII, and uses various access methods
> (including, on the C64, direct screen-memory access) for in/output.
>
> I already defined some ASCII-Code <-> C64-Screencode translation tables
> (Like a=01 (C64) vs. a=97 (ASCII)).
>
> Now, while
>   poke(SCREEN+0,Ascii2Screen['a']);
> works quite nicely,
>   poke(SCREEN+1,Ascii2Screen['T']);
> unfortunately doesn't -- 'T' seems to be 212 (0xD4) in CC65, not 84 as
> in ASCII.
>
> So:
> 1) In what encoding does CC65, by default, store string/char literals?

CC65's compiler and assembler store those literals in each platform's native
code-set (e.g., PETSCII on CBM machines, ATASCII on Atari machines).

> 2) Is this a CC65 specialty, or can I expect to find non-ASCII storage in
> other compilers, as well?

It's a CC65 specialty.  Most other compilers store literals in whatever
code-set the source files themselves use.  (CC65 defines the __CC65__
preprocessor macro, so portable code can detect it.)

> 3) Can I switch CC65 to store as ASCII?

Yes, but usually by compiling a source file for an ASCII platform.  However,
that method changes some of the values that some of CC65's header files
define!  So, you must be careful about which files you compile for which
platform.

> 4) Is there some more sensible way to handle this [I can't just use
> printf(), etc. because I really need various in/output methods, including
> rendering text to graphics, processing files in ASCII or PETSCII, and
> processing interpreted code working on that ASCII text].

It is easy to handle the literal-text problem:
Put the code that uses PETSCII text in one file, and put the code that uses
ASCII text in a second file.  Compile the PETSCII file for one of the CBM
machines; compile the ASCII file for the "none" platform.  Then, link the
object files together.

But, you would have problems with library functions.  You would want to take
character functions (the <ctype.h> functions, for example) from the library
of an ASCII machine.  But, you would want to link, at the same time, to
functions that exist in only a non-ASCII library (the CBM file functions,
for example).

It probably would be easier to be portable in the opposite direction; that
is, your program would use a native code-set internally, and would convert
text from/to the non-native files.

Preprocessor directives could choose the appropriate source code that knows
which code-set is native and which set is not.

Or, you could isolate, into their own functions, the program codes that need
to know the difference.  Then, you could build different libraries that
handle the different character-code sets.  Link to the appropriate library
when you build the final program.
