This is the mail archive of the
ecos-discuss@sources.redhat.com
mailing list for the eCos project.
Re: code optimizations
- To: <ecos-discuss at sourceware dot cygnus dot com>
- Subject: Re: [ECOS] code optimizations
- From: Hugo Tyson <hmt at redhat dot com>
- Date: 28 Aug 2001 14:34:46 +0100
- References: <002001c12beb$47c11670$090110ac@TRENT>
"Trenton D. Adams" <tadams@theone.dnsalias.com> writes:
> Does it matter whether volatile or unsigned goes first? Because I'm
> still getting the jumping around with (unsigned volatile)
(Other msgs have covered that volatile unsigned int * is best, but...)
You can still appear to get "jumping around."
Going back to Bart's example:
> For example, given two lines:
>
> *(unsigned *)PMPCON |= 0x0002;
> *(unsigned *)SYSCON2 |= SYSCON2_PCMCIA1;
On a random made-up CPU, that might turn into 10 instructions:
Naively compiled:
LOAD r1, =PMPCON
LOAD r3, [r1]
MOVE r2, #2
OR r3, r2
STORE r3, [r1]
LOAD r4, =SYSCON2
LOAD r6, [r4]
MOVE r5, #SYSCON2_PCMCIA1
OR r6, r5
STORE r6, [r4]
Note that we happen to have plenty of registers here.
Often, there is a delay after a load instruction before you can use the
data you loaded. The CPU hardware looks after this for you by waiting
until the load has completed. So the 2nd and 4th loads above (the register
reads) might be delayed waiting for the 1st and 3rd loads (of the
addresses) respectively to complete.
But optimized code can use that delay to "do other stuff" in time that
would otherwise be wasted. So it's perfectly OK to intermingle those
loads and stores, as far as we're allowed to, to make better use of the
load path from memory or the cache or whatever. So you might actually get:
# these two are independent so no mutual delay:
LOAD r1, =PMPCON # first line address
LOAD r4, =SYSCON2 # 2nd line address
MOVE r2, #2 # 1st line setup
LOAD r3, [r1] # 1st line read
MOVE r5, #SYSCON2_PCMCIA1 # 2nd line setup
OR r3, r2 # 1st line "arithmetic"
STORE r3, [r1] # 1st line write
LOAD r6, [r4] # 2nd line as required
OR r6, r5 # by "volatile", all after
STORE r6, [r4] # 1st line completes
So if you single-step through, you might still see the "setup" instructions
(getting addresses and constants) intermingled, even though the actual
manipulation of the hardware (LOADs and STOREs of volatile memory
locations) must be ordered and non-overlapping.
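For reference, here is a minimal C sketch of the volatile-qualified version
of Bart's two lines. It is only an illustration under stated assumptions: the
fake_pmpcon and fake_syscon2 variables are hypothetical stand-ins so the
fragment runs on a host machine (on the target, PMPCON and SYSCON2 would be
the real register addresses), the value given for SYSCON2_PCMCIA1 is made up,
and the REG_A/REG_B macros and enable_pcmcia() are names invented here. It
also shows that "volatile unsigned" and "unsigned volatile" name the same
type, so the order of the two words does not affect the generated code.

```c
/* Hypothetical stand-ins for the hardware registers, so that this
 * sketch can run on a host. On real hardware PMPCON and SYSCON2
 * would be fixed register addresses from the platform headers. */
static unsigned fake_pmpcon;
static unsigned fake_syscon2;

#define PMPCON          (&fake_pmpcon)
#define SYSCON2         (&fake_syscon2)
#define SYSCON2_PCMCIA1 0x0020u   /* assumed value, for illustration only */

/* Both spellings produce the same type: a pointer to volatile
 * unsigned int. Qualifier order within the specifier list is free. */
#define REG_A(addr) (*(volatile unsigned *)(addr))
#define REG_B(addr) (*(unsigned volatile *)(addr))

void enable_pcmcia(void)
{
    /* Each line is a volatile read followed by a volatile write.
     * The compiler must keep these accesses in program order, but
     * it may still schedule the address and constant setup early. */
    REG_A(PMPCON)  |= 0x0002;
    REG_B(SYSCON2) |= SYSCON2_PCMCIA1;
}
```

Compiling something like this with optimization on and looking at the
assembler output (for example with gcc -O2 -S) is one way to check what moved:
only the non-volatile setup instructions should be reordered, not the
register reads and writes themselves.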
HTH,
- Huge