
These module docs are in beta and may be incomplete.

modm:platform:cortex-m: ARM Cortex-M Core

This module generates the startup code, the vector table, the linkerscript and provides runtime support for dynamic memory allocations on the heap and assertion handling.


The linkerscript is generated for the device's memory map.

Placement of .fastcode section

From the Cortex-M3 Technical Reference Manual:

14.5 System Interface:

The system interface is a 32-bit AHB-Lite bus. Instruction and vector fetches, and data and debug accesses to the System memory space, 0x20000000 - 0xDFFFFFFF, 0xE0100000 - 0xFFFFFFFF, are performed over this bus.

14.5.6 Pipelined instruction fetches:

To provide a clean timing interface on the System bus, instruction and vector fetch requests to this bus are registered. This results in an additional cycle of latency because instructions fetched from the System bus take two cycles. This also means that back-to-back instruction fetches from the System bus are not possible.

Note: Instruction fetch requests to the ICode bus are not registered. Performance critical code must run from the ICode interface.

So for STM32s where the CCM is not connected to the I-Bus, we execute .fastcode from Flash.



Fill the stack with relative jumps to fault handler to prevent accidental execution.: False{ True, False }


Dynamic memory allocation strategy: newlib{ block, newlib, tlsf }

By default, the arm-none-eabi toolchain ships with the newlib libc, which uses dlmalloc as the underlying allocator algorithm and only requires the implementation of the void * sbrk(ptrdiff_t size) hook. However, this limits the allocator to a single memory region, which must also be contiguous, since sbrk can only grow and shrink the heap, not jump between regions. Therefore, when using the newlib strategy, only the largest memory region is used as heap! Depending on the device memory architecture, this can leave large memory regions unused.
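The single-region limitation follows from how an sbrk-style hook works: it only moves one "break" pointer up or down. The following is a minimal host-side sketch of that idea, not newlib's or modm's actual implementation. The names sketch_sbrk, heap_region and heap_break, as well as the 1 kB region size, are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical heap: one contiguous block standing in for the largest
// RAM region that the newlib strategy would select.
alignas(8) std::uint8_t heap_region[1024];
std::uint8_t* heap_break = heap_region;

// sbrk-style hook (illustrative): moves the break by `increment` bytes and
// returns the previous break, or (void*)-1 if the request leaves the region.
void* sketch_sbrk(std::ptrdiff_t increment)
{
    const std::ptrdiff_t used = heap_break - heap_region;
    const std::ptrdiff_t capacity = static_cast<std::ptrdiff_t>(sizeof(heap_region));

    // The break can only slide within this one region. There is no way to
    // continue in a second, separate region once this one is exhausted.
    if (increment > capacity - used || increment < -used) {
        return reinterpret_cast<void*>(static_cast<std::intptr_t>(-1));
    }
    std::uint8_t* const previous = heap_break;
    heap_break += increment;
    return previous;
}
```

Because the allocator only ever sees the value returned by this hook, any memory outside heap_region stays invisible to it.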

For devices with very small memories, we recommend the block allocator strategy, which uses a very lightweight and simple algorithm. It also operates on only one contiguous memory region as heap.


Memories can have different traits, such as DMA-ability or access time. The default memory allocator functions (malloc, new, etc.) only return DMA-able memory, ordered by fastest access time. Similarly, the search for the largest memory region only considers DMA-able memory.
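The selection rule described above can be sketched as a filter followed by a sort. This is only an illustration under stated assumptions: the Region struct, its fields, and both function names are hypothetical and do not reflect modm's actual data model or API.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one RAM region (not modm's data model).
struct Region
{
    std::string name;
    std::size_t size;        // size in bytes
    bool dma_able;           // reachable by DMA controllers
    unsigned access_cycles;  // illustrative access cost, lower is faster
};

// Regions a default allocation would consider: DMA-able only, fastest first.
std::vector<Region> default_heap_order(std::vector<Region> regions)
{
    regions.erase(std::remove_if(regions.begin(), regions.end(),
                                 [](const Region& r) { return !r.dma_able; }),
                  regions.end());
    std::stable_sort(regions.begin(), regions.end(),
                     [](const Region& a, const Region& b) {
                         return a.access_cycles < b.access_cycles;
                     });
    return regions;
}

// Largest DMA-able region, or nullptr if there is none.
const Region* largest_dma_region(const std::vector<Region>& regions)
{
    const Region* best = nullptr;
    for (const Region& r : regions) {
        if (r.dma_able && (best == nullptr || r.size > best->size)) {
            best = &r;
        }
    }
    return best;
}
```

With this rule, a fast but non-DMA-able memory is skipped by both functions, even if it is the largest or fastest region on the device.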


For devices which contain separate memories laid out contiguously (often called SRAM1, SRAM2, etc.), the newlib and block strategies choose the largest contiguous memory range, even though unaligned accesses across memory regions may not be supported in hardware and can lead to a bus fault! Consider using the TLSF implementation, which does not suffer from this issue.

To use all non-statically allocated memory for heap, use the TLSF strategy, which natively supports multiple memory regions. Our implementation treats all internal memories as separate regions, so unaligned accesses across memory boundaries are not an issue. To request heap memory of different traits, see modm::MemoryTraits.


The TLSF implementation has a static overhead of about 1 kB per memory trait group; however, each group can then contain multiple non-contiguous memory regions. The upside of this large static allocation is very fast O(1) allocation, but we recommend using TLSF only for devices with multiple memory regions.


Minimum size of the application main stack: 3040{ 256 .. 3040 .. 65536 }

The ARM Cortex-M uses a descending stack mechanism which is placed so that it grows towards the beginning of RAM. In case of a stack overflow the hardware then attempts to stack into invalid memory which triggers a HardFault. A stack overflow will therefore never overwrite any static or heap memory and this protection works without the MPU and therefore also on ARM Cortex-M0 devices.

If you enable either the LED or the logging HardFault option, a smaller stack is added above the main stack. This stack is only used by the HardFault handler when not enough memory remains in the main stack to preserve GDB backtrace behavior. This memory also acts as a small safety buffer against main stack underflow, which is not detected however.

If the vector table is relocated into RAM, its start address needs to be aligned to the table size in words rounded up to the next power of two, which depends on the total number of device interrupts. On devices where the table is relocated into the same memory as the main stack, an alignment buffer of up to 1 kB is added to the main stack.
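As a worked example of the alignment rule, the following sketch computes the required alignment from the interrupt count. It assumes 4-byte vectors and 16 system vectors, and it ignores any architecture-specific minimum alignment. The function names are hypothetical and not part of modm.

```cpp
#include <cstdint>

// Smallest power of two that is greater than or equal to `value`.
constexpr std::uint32_t next_power_of_two(std::uint32_t value)
{
    std::uint32_t result = 1;
    while (result < value) {
        result <<= 1;
    }
    return result;
}

// Alignment in bytes for a RAM vector table (illustrative helper).
// `highest_irq` is the highest IRQ number + 1; the 16 system vectors
// are added to get the total number of table entries.
constexpr std::uint32_t vector_table_alignment(std::uint32_t highest_irq)
{
    constexpr std::uint32_t system_vectors = 16;
    constexpr std::uint32_t bytes_per_vector = 4;
    return next_power_of_two((highest_irq + system_vectors) * bytes_per_vector);
}

// 82 interrupts: (82 + 16) * 4 = 392 bytes, rounded up to 512 bytes.
static_assert(vector_table_alignment(82) == 512, "392 bytes round up to 512");
// 240 interrupts: (240 + 16) * 4 = 1024 bytes, already a power of two.
static_assert(vector_table_alignment(240) == 1024, "1024 bytes stay 1024");
```

The second case shows where a 1 kB upper bound on the alignment buffer comes from: a table with 256 entries occupies exactly 1024 bytes.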

|              ...                |
|    Interrupt Vectors (in RAM)   |
|        (if re-mapped)           | <-- vector table origin
|---------------------------------| <-- HardFault stack top
|        HardFault Stack          |
|       (grows downwards)         |
|               |                 |
|               v                 |
|---------------------------------| <-- main stack top
|           Main Stack            |
|       (grows downwards)         |
|               |                 |
|               v                 |
|  Alignment buffer for vectors   |
|   (overwritten by main stack!)  |
'---------------------------------' <-- RAM origin


The main stack size you provide is a minimum and may be enlarged to satisfy alignment requirements. Be aware that these requirements operate on the sum of the HardFault and main stack sizes. Disabling HardFault options may therefore decrease the alignment buffer added to the main stack, which may make your application overflow its stack. In that case, you need to increase your minimum main stack size.


The main stack is watermarked, and you can query the maximum stack usage with the uint32_t modm::cortex::getMaximumStackUsage() function.
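Watermarking in general works by filling the stack with a known pattern at startup and later scanning for the first overwritten word. The sketch below simulates this on a plain array and is not modm's implementation. The pattern value and the function names are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fill pattern; the value actually used by modm may differ.
constexpr std::uint32_t kWatermark = 0xDEADBEEF;

// Fill the whole stack region with the pattern (done once at startup).
void watermark_stack(std::uint32_t* bottom, std::size_t words)
{
    for (std::size_t i = 0; i < words; ++i) {
        bottom[i] = kWatermark;
    }
}

// A descending stack consumes memory from the top of the region downwards,
// so untouched words sit at the bottom. Scan upwards from the bottom until
// the first word that no longer holds the pattern.
std::size_t maximum_stack_usage(const std::uint32_t* bottom, std::size_t words)
{
    std::size_t untouched = 0;
    while (untouched < words && bottom[untouched] == kWatermark) {
        ++untouched;
    }
    return (words - untouched) * sizeof(std::uint32_t);
}
```

One limitation of this technique is that it underestimates usage if the application happens to write the pattern value itself at its deepest stack position.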


Vector table location in ROM or RAM: rom{ ram, rom }

The vector table is always stored in ROM and, if required, copied to RAM by the startup code. You can modify the RAM vector table using the CMSIS NVIC functions:

  • void NVIC_SetVector(IRQn_Type IRQn, uint32_t vector)
  • uint32_t NVIC_GetVector(IRQn_Type IRQn)
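The relocation and modification steps can be simulated on a host machine with plain arrays. This sketch does not call CMSIS and is not modm's startup code: the table size, the handler names, and the functions relocate_vector_table, set_vector and get_vector are hypothetical. The only detail taken from CMSIS is that IRQ numbers are offset by 16, so that system exceptions with negative numbers map to the first 16 entries.

```cpp
#include <array>
#include <cstddef>

using Handler = void (*)();

constexpr int kSystemVectors = 16;  // system exceptions use IRQ numbers -16..-1
constexpr int kIrqCount = 4;        // hypothetical tiny device
using VectorTable = std::array<Handler, kSystemVectors + kIrqCount>;

void default_handler() {}
void custom_handler() {}

VectorTable make_rom_table()
{
    VectorTable table{};
    table.fill(&default_handler);
    return table;
}

// Stand-ins for the constant table in ROM and its writable copy in RAM.
const VectorTable rom_table = make_rom_table();
VectorTable ram_table{};

// Startup step: copy the table. Real hardware would additionally have to be
// pointed at the RAM copy, which this simulation leaves out.
void relocate_vector_table()
{
    ram_table = rom_table;
}

// Same indexing idea as the CMSIS functions: entry index = IRQ number + 16.
void set_vector(int irq, Handler handler)
{
    ram_table[static_cast<std::size_t>(irq + kSystemVectors)] = handler;
}

Handler get_vector(int irq)
{
    return ram_table[static_cast<std::size_t>(irq + kSystemVectors)];
}
```

Only the RAM copy changes when a vector is set; the ROM table keeps its original entries, which is why a table left in ROM cannot be modified at runtime.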

For applications that do not modify the vector table at runtime, relocation to RAM is not necessary and can save a few hundred bytes of static memory.

By default, the fastest option is chosen depending on the target memory architecture. This does not always mean the table is copied into RAM, and therefore may not be modifiable with this option!

From the ARM Cortex-M4 Technical Reference Manual on exception handling:

  • Processor state is automatically stored to the stack on an exception, and automatically restored from the stack at the end of the Interrupt Service Routine.
  • The vector is fetched in parallel to the state saving, enabling efficient interrupt entry.


Placing the main stack and the vector table into the same memory can significantly increase interrupt latency, since both the I-Code and D-Code memory interfaces need to fetch from the same access port.


Add an offset to the default start address of the flash memory. This might be required for bootloaders located there. WARNING: Not all offsets are compatible with the vector table relocation.: 0{ 0 ... 2097152 }


// Function
void _delay_ms(uint16_t ms);
void _delay_ns(uint16_t ns);
void _delay_us(uint16_t us);



Additions to the linkerscript's 'MEMORY' ∈ String


Maximum required size of the process stack ∈ -Inf ... +Inf


Additions to the linkerscript's 'SECTIONS' ∈ String


Additions to the linkerscript's '.table.copy.extern' section ∈ String


Additions to the linkerscript's '.table.heap' section ∈ String

Additions to the linkerscript's '' section ∈ String



Computes linkerscript properties post-build:

  • process_stack_size: largest requested process stack size by any module
  • vector_table_location: ram or rom

Stripped and newline-joined collector values of:

  • linkerscript_memory
  • linkerscript_sections
  • linkerscript_extern_zero
  • linkerscript_extern_copy
  • linkerscript_extern_heap

Additional memory properties:

  • memories: unfiltered memory regions
  • regions: memory region names
  • ram_origin: Lowest SRAM origin address
  • ram_origin: Total size of all SRAM regions

:returns: dictionary of linkerscript properties


Computes vector table properties:

  • vector_table: [position] = Full vector name (i.e. with _Handler or _IRQHandler suffix)
  • vector_table_location: rom or ram
  • highest_irq: highest IRQ number + 1
  • core: cortex-m{0,3,4,7}{,+,f,fd}

The system vectors start at -16, so you must add 16 to highest_irq to get the total number of vectors in the table!

:returns: a dictionary of vector table properties


Dependency graph of modm:platform:cortex-m

modm:platform:cortex-m depends on:

  • modm:architecture:accessor
  • modm:architecture:assert
  • modm:architecture:atomic
  • modm:architecture:build_id
  • modm:architecture:clock
  • modm:architecture:delay
  • modm:architecture:heap
  • modm:architecture:interrupt
  • modm:architecture:memory
  • modm:architecture:unaligned
  • modm:cmsis:device
  • modm:platform
  • modm:platform:clock
  • modm:tlsf

modm:platform:core depends on modm:platform:cortex-m.