From: Groepaz (groepaz_at_gmx.net)
Date: 2003-09-15 01:05:01
On Sunday 14 September 2003 13:27, Ullrich von Bassewitz wrote: > I've had a look at the preprocessor: What you're suggesting is rather > difficult to add to the current implementation[1]. But because of some > other problems with the current implementation, the preprocessor may get a > rewrite at some time, and then I may consider your proposal. i see .... so remind me to remind you when its time to do it :=P > > that should allow labels in inline asm in both macros and functions with > > the following restrictions: > > [...] > > Here is one more: The concept does not allow generation of local labels > outside of macros, because it relies on a counter incremented when a macro > is expanded. oh...it should only be a problem with generating local labels outside a macro if these are generated directly after locals which in turn are inside a macro... just let the counter increment everytime a function starts aswell (or each time a .proc directive is emitted maybe) a found another interisting problem though (and a real clean generic solution could be really tricky i guess :=P) .... think of a macro that a) has an AX assignment in between a jump and the jump target label and b) gets another macro (which also has locals) as paramter ... urgs :=) (somekind of real nested lexical levels and all that crap is needed here to get it 1001% right i guess :=P) (to lazy to write correct syntax now...) #define test1(a,b) \ AX = a \ inc a \ bcc @l1 \ AX = b \ @l1: \ AX #define test2(c) \ AX = c \ inx \ bcc @l1 \ inx \ @l1: \ AX test1(2,test2(4)); .... the @l1 from test1() will be in the wrong "namespace" since the expansion of test2() had already incremented the pointer further.... a real true recursive macro expansion can probably do the job though :) > This and the load/transfer stuff is in the current snapshot, but I do hope to be able to test this soon (couldnt check the register variable stuff yet either :/) > somehow regret adding it, because it saves just a few bytes, at least when > compiling the samples and libraries. For new peephole optimization > proposals I would suggest a few statistics highlighting the > advantages/disadvantages. mmmh well... IMHO every optimization that does both decrease codesize and increase execution speed at the same time (as the immidiate lda/ora etc merging does) is worth the trouble. yes it probably does only save a few bytes, admittedly it even saves only a few bytes in code where i exploit certain special cases. however, adding this and maybe a handful of additional things really has potential - its amazing to see how smart some of the compiler generated code actually looks after opt65 removed a few hickups. (let alone the bunch of useless tax/txa instructions it removes) also, the difference between opt65 and the cc65 internal optimizer doesnt seem to be that large either - ie, theres not much left that an improved opt65 could possibly do, any further stuff needs knowledge about the code generator (which opt65 does not have by design) or the program flow or whatever else is pretty hard to implement without a parse tree and that kinda things :) tjam, and i find it kinda hard to make any figures about how "worthy" a certain optimization is :=P it really depends on your testprogram what kind of possibly optimizations you will even spot - and even then the value of a removed instruction may raise and fall with that particular instruction beeing in the most inner loop of your program or not :) for short, i have to agree that optimizations that trade speed for size or vice versa can be problematic for general use, but all the rest are worth beeing used - personally i can happily live with (much) increased compile time then too. > I do really, really doubt this number, at least for compiler generated > code. It may be true for inline asm or use of __AX__/__EAX__, but in this > case it's a problem if the asm code. Within the samples and library > sources, there is exactly ONE module that profits from the optimization - > and the reason is that this modules makes heavy use of the is... functions > from ctype.h that are inline assembly. i'm only testing with what the compiler makes from my raycaster code.... a handful inline-asm macrose there, but the rest is pretty much plain C (with a few additional routines in asm, but they arent touched by opt65 or anything at all) some of the things that happen here may be really specific, i dont know....but actually i dont really think so - i am just copying some techniques that i am used to from assembler coding :) (and amazingly lot of the compiler generated code looks like what i would probably write when doing plain asm....after a peephole run with opt65 that is :=P) however, i will release the thing as an example for such kind of hacking (getting most of the speed out of C with no respect to final code size :=))) .... i want to add one or two more variants of the core algorithm (mostly to have a look myself what works best in cc65) and by then i think it should contain quite a bunch of different variations of a handful of sub-problems which can be compared then. mmmh....maybe if i find the time i'll try building contiki with all the stuff piped through opt65.... that could be interisting :) > BTW, your macro is a good example of code that will not work in certain > cases > > as pointed out by me in my last mail within this thread: > > #define getblock(_x,_y) (unsigned char) \ > > ( \ > > __AX__= (_y), \ > > __asm__ ("txa"), \ > > __asm__ ("tay"), \ > > __AX__= (_x), \ > > __asm__ ("lda %v,x", worldptr_lo ), \ > > __asm__ ("sta ptr1"), \ > > __asm__ ("lda %v,x", worldptr_hi ), \ > > __asm__ ("sta ptr1+1"), \ > > __asm__ ("lda (ptr1),y"), \ > > __asm__ ("ldx #%b", 0), \ > > __AX__) > > The second assignment to __AX__ will destroy the Y register just loaded in > all but the most simplest cases (where _x is a global variable). ouch! i wouldnt have expected that here really.... uhmz :=P the assignment somehow suggests that it doesnt touch Y at all (even if in reality it cant really avoid using it) gpz ---------------------------------------------------------------------- To unsubscribe from the list send mail to majordomo_at_musoftware.de with the string "unsubscribe cc65" in the body(!) of the mail.
This archive was generated by hypermail 2.1.3 : 2003-09-15 01:16:32 CEST