
W.r.t. functions, the following are handles to optimise RAM:

Make sure that the number of parameters passed to a function is deeply analysed. On ARM architectures, as per the AAPCS (ARM Architecture Procedure Call Standard), a maximum of 4 parameters can be passed using registers; the rest of the parameters are pushed onto the stack. Also consider using a global rather than repeatedly passing the same data to a function that is most frequently called with that parameter.

When function A calls function B, B calls C, which in turn calls D, which in turn calls E, and so on ever deeper. The deeper the function calls, the heavier the use of the stack. Use a static analysis tool to find the worst-case function call path and look for avenues to reduce it.
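As an illustration of the AAPCS point, here is a hypothetical sketch (the function and type names are invented): a call with six word-sized arguments spills two of them to the caller's stack on every call, while packing the arguments into a struct and passing a pointer keeps the call down to a single register.

```c
#include <stdint.h>

/* Hypothetical example: with six parameters, only the first four
 * travel in registers (r0-r3) under AAPCS; p5 and p6 are pushed onto
 * the stack for every single call. */
int32_t sum6(int32_t p1, int32_t p2, int32_t p3, int32_t p4,
             int32_t p5, int32_t p6)
{
    return p1 + p2 + p3 + p4 + p5 + p6;
}

/* Packing the arguments into one struct passes a single pointer in
 * r0; if the values rarely change, the struct can live in static
 * storage and be reused across calls. */
typedef struct {
    int32_t p[6];
} params_t;

int32_t sum6_packed(const params_t *a)
{
    int32_t s = 0;
    for (int i = 0; i < 6; i++)
        s += a->p[i];
    return s;
}
```

Whether the struct version is actually cheaper depends on the call frequency and on whether the struct itself has to be built per call, so check the generated code.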
If you've got a hard realtime deadline, you know exactly how fast your code needs to run, so if you have any spare performance you can make speed/size trade-offs until you hit that point. Roll loops, pull common code into functions, etc. In some cases you may actually get a space improvement by inlining some functions, if the prolog/epilog overhead is larger than the function body. The last one is only relevant on architectures that store code in RAM, I guess.
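Loop rolling can be sketched with a hypothetical pair of functions: the unrolled version repeats the store four times in code, while the rolled version trades a little speed for a smaller text segment (which only saves RAM on parts that execute code from RAM).

```c
#include <stdint.h>

/* Hypothetical "before": four explicit stores, larger code. */
void clear4_unrolled(uint8_t *buf)
{
    buf[0] = 0;
    buf[1] = 0;
    buf[2] = 0;
    buf[3] = 0;
}

/* "After": the rolled loop does the same work with less code, at the
 * cost of the loop counter and branch. */
void clear4_rolled(uint8_t *buf)
{
    for (int i = 0; i < 4; i++)
        buf[i] = 0;
}
```

Note that at high optimisation levels the compiler may unroll the loop again; size-oriented flags such as -Os usually keep it rolled.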
Moving data from global/static variables into locals might require an increase in the stack size, but will save more memory overall through the reduced global/static variables. (As a side benefit, the functions are more likely to be re-entrant and thread-safe.)

Smaller variables
You may be able to use int16_t (short) or int8_t (char) instead of int32_t (int).

Enum variable size
Enum variable size may be bigger than necessary. I can't remember what ARM compilers typically do, but some compilers I've used in the past made enum variables 2 bytes by default, even though the enum definition really only required 1 byte to store its range.

Some algorithms have a range of possible implementations with a speed/memory trade-off. For example, AES encryption can use an on-the-fly key calculation, which means you don't have to keep the entire expanded key in memory. Here are the tricks I've used on the Cell: Start with the obvious: squeeze 32-bit words into 16s where possible, rearrange structures to eliminate padding, and cut down on slack in any arrays. If you've got any arrays of more than eight structures, it's worth using bitfields to pack them down tighter. Do away with dynamic memory allocation and use static pools; a constant memory footprint is much easier to optimise, and you'll be sure of having no leaks. Scope local allocations tightly so that they don't stay on the stack longer than they have to. Some compilers are very bad at recognizing when you're done with a variable, and will leave it on the stack until the function returns; this can be bad with large objects in outer functions that then eat up persistent memory they don't have to as the outer function calls deeper into the tree. alloca() doesn't clean up until a function returns, so it can waste stack longer than you expect. Enable function body and constant merging in the compiler, so that if it sees eight different consts with the same value, it'll put just one in the text segment and alias them with the linker.
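The packing advice above can be sketched with a hypothetical record type (the field names and value ranges here are invented for illustration):

```c
#include <stdint.h>

/* Hypothetical record. The "loose" layout spends a full 32-bit word
 * on each field even though the value ranges are small. */
typedef struct {
    int32_t id;       /* only ever 0..1023      */
    int32_t flags;    /* only 3 flag bits used  */
    int32_t percent;  /* 0..100                 */
} record_loose_t;     /* 12 bytes               */

/* The same ranges packed into bitfields (10 + 3 + 7 = 20 bits) fit in
 * one 32-bit storage unit, so an array of many of these is roughly a
 * third of the size. */
typedef struct {
    unsigned int id      : 10;  /* 0..1023           */
    unsigned int flags   : 3;
    unsigned int percent : 7;   /* 0..100 fits in 7 bits */
} record_packed_t;    /* typically 4 bytes      */
```

Bitfield access costs a few extra instructions per read/write, so this is only worth it for large arrays, not for single variables.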
Perhaps your linker config reserves large amounts of RAM for heap and stack, larger than necessary for your application. If you don't use the heap, you can possibly eliminate that allocation. If you measure your stack usage and it's well under the allocation, you may be able to reduce the allocation. For ARM processors, there can be several stacks, for several of the operating modes, and you may find that the stacks allocated for the exception or interrupt operating modes are larger than needed. If you've checked for the easy savings and still need more, you might need to go through your code and save "here a little, there a little". You can check things like:

Global vs local variables
Check for unnecessary use of static or global variables, where a local variable (on the stack) can be used instead. I've seen code that needed a small temporary array in a function, which was declared static, evidently because "it would take too much stack space". If this happens enough times in the code, it would actually save total memory usage overall to make such variables local again.
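As a hypothetical before/after for the static-temporary-array case above (the function names and computation are invented for illustration):

```c
#include <stddef.h>

/* Hypothetical "before": a small temporary array declared static so
 * it doesn't use stack -- but now it occupies RAM (.bss) for the
 * entire program lifetime, and the function is not re-entrant. */
unsigned char parity_static(const unsigned char *data, size_t len)
{
    static unsigned char tmp[32];           /* permanent RAM cost */
    unsigned char p = 0;
    for (size_t i = 0; i < len && i < sizeof tmp; i++)
        tmp[i] = data[i] & 1;
    for (size_t i = 0; i < len && i < sizeof tmp; i++)
        p ^= tmp[i];
    return p;
}

/* "After": the same array as a local. It consumes stack only while
 * the function runs, and the function becomes re-entrant. */
unsigned char parity_local(const unsigned char *data, size_t len)
{
    unsigned char tmp[32];                  /* stack, freed on return */
    unsigned char p = 0;
    for (size_t i = 0; i < len && i < sizeof tmp; i++)
        tmp[i] = data[i] & 1;
    for (size_t i = 0; i < len && i < sizeof tmp; i++)
        p ^= tmp[i];
    return p;
}
```

The saving is real only if the function's worst-case stack depth still fits your stack allocation, so re-measure stack usage after the change.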
Arrays and Lookup Tables
Arrays and look-up tables can be good "low-hanging fruit". If you can get a memory map from the linker, check that for large items in RAM. Check for look-up tables that haven't used the const declaration properly, which puts them in RAM instead of ROM. Especially look out for look-up tables of pointers, which need the const on the correct side of the *, or may need two const declarations. E.g.:

const my_struct_t * const param_lookup[] = { /* ... */ };  /* two consts may be needed so the table itself is also in ROM */
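A minimal sketch of the two-const point, using hypothetical colour-name tables:

```c
/* Only the string literals are in ROM here; the array of pointers
 * itself is writable, so it is copied into RAM (.data) at startup. */
const char *colour_ram[] = { "red", "green", "blue" };

/* Two consts, one on each side of the '*': both the pointed-to chars
 * and the pointer array are const, so the whole table can be placed
 * in ROM (.rodata). */
const char * const colour_rom[] = { "red", "green", "blue" };

#define COLOUR_COUNT (sizeof colour_rom / sizeof colour_rom[0])
```

On a hosted build both variants behave identically; the difference shows up in the linker map, where the single-const table appears in a RAM data section and the double-const table in a read-only section.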
Unlike speed optimisation, RAM optimisation might be something that requires "a little bit here, a little bit there" all through the code. On the other hand, there may turn out to be some "low-hanging fruit".