
ARM Cortex-M Core

lbuild module: modm:platform:cortex-m

This module generates the startup code, vector table and linkerscript. It also initializes the heap, handles assertions, and provides blocking delay functions, atomic and unaligned access, and the GNU build ID.

Since this module only initializes the generic ARM Cortex-M parts, it delegates device-specific initialization to the modm:platform:core module. Please depend on that module directly instead of this one.


After reset, the ARM Cortex-M hardware jumps to the Reset_Handler(), which is implemented as follows:

  1. The main stack pointer (MSP) is initialized by hardware.
  2. Call __modm_initialize_platform() to initialize the device hardware.
  3. Copy data from internal flash to internal RAM.
  4. Zero sections in internal RAM.
  5. Initialize ARM Cortex-M core: enable FPU and relocate vector table.
  6. Execute shared hardware initialization functions.
  7. Copy data from internal flash to external RAM.
  8. Zero sections in external RAM.
  9. Initialize heap via __modm_initialize_memory() (implemented by the modm:platform:heap module).
  10. Call static constructors.
  11. Call main() application entry point.
  12. If main() returns, assert on main.exit (only in debug profile).
  13. Reboot if assertion returns.
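The copy and zero steps (3, 4, 7 and 8) can be pictured as loops over tables of address ranges. The following is a minimal hosted sketch, not modm's actual startup code: the CopyEntry and ZeroEntry structs and all function names are hypothetical, and ordinary arrays stand in for the linker-provided flash and RAM symbols.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical table entry layouts, for illustration only.
struct CopyEntry { const std::uint32_t* load; std::uint32_t* start; std::uint32_t* end; };
struct ZeroEntry { std::uint32_t* start; std::uint32_t* end; };

// Copy every [start, end) range from its load address (steps 3 and 7).
constexpr void copy_sections(const CopyEntry* table, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint32_t* src = table[i].load;
        for (std::uint32_t* dst = table[i].start; dst < table[i].end; ++dst)
            *dst = *src++;
    }
}

// Fill every [start, end) range with zeros (steps 4 and 8).
constexpr void zero_sections(const ZeroEntry* table, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        for (std::uint32_t* dst = table[i].start; dst < table[i].end; ++dst)
            *dst = 0;
}

// Runs both loops on small arrays that play the role of flash and RAM.
constexpr bool startup_demo()
{
    const std::uint32_t flash[3] = {1, 2, 3};          // initial values of ".data"
    std::uint32_t data[3] = {0xDEADu, 0xDEADu, 0xDEADu}; // RAM content after reset
    std::uint32_t bss[2] = {0xDEADu, 0xDEADu};
    const CopyEntry copy_table[] = {{flash, data, data + 3}};
    const ZeroEntry zero_table[] = {{bss, bss + 2}};
    copy_sections(copy_table, 1);
    zero_sections(zero_table, 1);
    return data[0] == 1 && data[1] == 2 && data[2] == 3
        && bss[0] == 0 && bss[1] == 0;
}
```

Until both loops have run, initialized and zero-initialized variables hold arbitrary values, which is why code executed before these steps must not rely on them.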

Device Initialization

The __modm_initialize_platform() function is called directly after reset. Its purpose is to initialize the device-specific hardware, for example by enabling internal memories or disabling the hardware watchdog timer.

It's important to understand that because the .data section has not yet been copied and the .bss section has not yet been zeroed, there exists no valid C environment yet in this function context! This means you cannot use any global variables, not even "local" static ones defined in your function, and depending on your hardware you may not even access read-only data (const variables, global OR local). In addition, if your linkerscript places the main stack pointer into a memory that is disabled on reset, you cannot even access the stack until you've enabled its backing memory. The Reset_Handler therefore calls this function in Assembly without accessing the stack.

It is strongly recommended to only read/write registers in this function, and perhaps even write this function in Assembly if deemed necessary. Do not initialize the device clock, leave the default clock undisturbed!

Additional Initialization

A few modules need to initialize additional hardware during booting. For example, your device may have external memories connected that you want to use for the heap. You can write a function that configures the peripherals for these external memories and place a pointer to this function into a special linker section. The startup script then calls this function before heap initialization.

Since the hardware init functions are called after internal data initialization, you have a valid C environment and thus can access the device normally, but since the calls happen before external data and heap initialization you cannot use the heap in these functions!

You can give a relative global order to your init functions. Ordered init functions are called first, then unordered init functions are called in any order. Please note that order numbers 0 - 999 are reserved for use by modm or other libraries!

Unique init function names

Init function names need to be globally unique for linking. Unfortunately there is no simple way of stringifying C++ functions, so you have to provide a name manually for now.

void init_external_sdram()
{
    // configure the hardware here
}
// Startup script calls this function in any order, *after* prioritized functions!
MODM_HARDWARE_INIT(init_external_sdram);
// If you need to pass a C++ function, you need to declare a unique name for it
MODM_HARDWARE_INIT_NAME(init_function_name, namespace::init_function);

// If you need to initialize in a certain order use numbers >= 1000
MODM_HARDWARE_INIT_ORDER(init_before_sdram1, 1000);
// called after init_before_sdram1, since it has a higher order number
MODM_HARDWARE_INIT_NAME_ORDER(init_before_sdram2, namespace::function, 1001);
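
The resulting call order can be modeled on a host machine as follows. This is only a sketch of the rule "ordered functions first, by ascending number, then unordered functions": the InitEntry struct and the helper functions are hypothetical, and how the ordering is actually realized by the startup script and linker is not shown here.

```cpp
#include <cstddef>

// Hypothetical record of one registered init function (illustration only).
// A negative order stands for "registered without an order number".
struct InitEntry { int order; int id; };

// True if `a` must run before `b`: ordered entries come first, sorted by
// ascending order number; unordered entries have no defined relative order.
constexpr bool runs_before(const InitEntry& a, const InitEntry& b)
{
    if (a.order >= 0 && b.order >= 0) return a.order < b.order;
    return a.order >= 0 && b.order < 0;
}

// Stable insertion sort of the table into call order.
constexpr void sort_into_call_order(InitEntry* entries, std::size_t count)
{
    for (std::size_t i = 1; i < count; ++i) {
        InitEntry key = entries[i];
        std::size_t j = i;
        while (j > 0 && runs_before(key, entries[j - 1])) {
            entries[j] = entries[j - 1];
            --j;
        }
        entries[j] = key;
    }
}

constexpr bool order_demo()
{
    // Registered as: unordered (id 0), order 1001 (id 1), order 1000 (id 2).
    InitEntry table[] = {{-1, 0}, {1001, 1}, {1000, 2}};
    sort_into_call_order(table, 3);
    // Expected call order: id 2 (1000), id 1 (1001), then the unordered id 0.
    return table[0].id == 2 && table[1].id == 1 && table[2].id == 0;
}
```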

Interrupt Vector Table

The Cortex-M vector table (VTOR) is target-specific and generated using data from modm-devices. The main stack pointer is allocated according to the linkerscript and the Reset_Handler is defined by the startup script.

All handlers are weakly aliased to Undefined_Handler, which is called if an IRQ is enabled but no handler is defined for it. This default handler determines the currently active IRQ, sets its priority to the lowest level, disables the IRQ so it cannot fire again, and then asserts on nvic.undef with the (signed) IRQ number as context.

The lowering of the priority is necessary, since the assertion handlers (see modm:architecture:assert) are called from within this active IRQ and its priority should not prevent logging functionality (which might require a UART interrupt to flush data out) from working correctly.
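
The signed IRQ number mentioned above follows from the exception numbering of the architecture: the IPSR register holds the number of the active exception, where 1 to 15 are system exceptions and 16 onwards are device interrupts. A minimal sketch of the conversion is shown below. The function name irq_from_ipsr is hypothetical, and whether modm's handler uses exactly this expression is an assumption; on a target the input would come from the CMSIS intrinsic __get_IPSR().

```cpp
#include <cstdint>

// The active exception number lives in the low 9 bits of IPSR.
// Subtracting 16 yields the CMSIS-style signed IRQ number:
// negative values are system exceptions, 0 and up are device IRQs.
constexpr int irq_from_ipsr(std::uint32_t ipsr)
{
    return static_cast<int>(ipsr & 0x1FFu) - 16;
}
```

For example, exception number 3 (HardFault) maps to -13, exception number 15 (SysTick) maps to -1, and exception number 16 is the first device interrupt, IRQ 0.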


This module provides building blocks for GNU ld linkerscripts in the form of Jinja macros that the modm:platform:core module assembles into a linkerscript, depending on the memory architecture of the target chosen.

The following macros are available:

  • copyright(): Copyright notice.
  • prefix(): contains the MEMORY sections, output format, entry symbol and stack size definitions.

  • section_vector_rom(memory): place the read-only vector table at the beginning of ROM memory.

  • section_vector_ram(memory): place the volatile vector table into RAM memory. You must satisfy alignment requirements externally.

  • section(memory, name): place section .{name} into memory.

  • section_stack(memory, start=None): place the main stack into memory after moving the location counter to start.

  • section_heap(memory, name, section=None): Fill up remaining space in memory with heap section .{name} and add to section.

  • section_rom(memory): place all read-only sections (.text, .rodata etc) into memory.

  • section_ram(memory, rom): place all volatile sections (.data, .bss etc) into memory and load from rom.

  • section_table_zero(memory, sections=[]): place the zeroing table (.bss plus sections) into memory.

  • section_table_copy(memory, sections=[]): place the copying table (.data, .fastdata, .vector_ram plus sections) into memory.
  • section_table_extern(memory): place the zeroing and copying tables for external memories into memory.
  • section_table_heap(memory, sections): place heap tables containing sections into memory.

  • section_rom_start(memory): place at ROM start.

  • section_rom_end(memory): place at ROM end.
  • section_debug(): place debug sections at the very end.

Please consult the modm:platform:core documentation for the target-specific arrangement of these section macros and for potential limitations that the target's memory architecture poses.

Section .fastdata

For devices without data cache, place the .fastdata section into the fastest RAM. Please note that the .fastdata section may need to be placed into RAM that is only accessible to the Cortex-M core, which can cause issues with DMA access. However, the .fastdata section is not required to be DMA-able, and in such a case the developer needs to place the data into the generic .data section or choose a device with a DMA-able fast RAM.

Section .fastcode

For devices without an instruction cache or without a fast RAM connected to the I-bus, place .fastcode into ROM, which usually has a device-specific ROM cache. Please note that using a device with a dedicated instruction cache RAM yields much more predictable performance than executing from ROM, even with a ROM cache.

From the Cortex-M3 Technical Reference Manual:

14.5 System Interface:

The system interface is a 32-bit AHB-Lite bus. Instruction and vector fetches, and data and debug accesses to the System memory space, 0x20000000 - 0xDFFFFFFF, 0xE0100000 - 0xFFFFFFFF, are performed over this bus.

14.5.6 Pipelined instruction fetches:

To provide a clean timing interface on the System bus, instruction and vector fetch requests to this bus are registered. This results in an additional cycle of latency because instructions fetched from the System bus take two cycles. This also means that back-to-back instruction fetches from the System bus are not possible.

Note: Instruction fetch requests to the ICode bus are not registered. Performance critical code must run from the ICode interface.

Adding Sections

The default linkerscripts only describe the internal memory, however, they can be extended for external memories using the linkerscript.* collectors of this module. For example, to add an external 16MB SDRAM to your device and place a static data section there that is copied from flash and use the remainder for heap access, these steps need to be performed:

Add the external SDRAM to the linkerscript's MEMORY statements in the project.xml configuration:

    <collect name="modm:platform:cortex-m:linkerscript.memory">
       SDRAM (rwx) : ORIGIN = 0xC0000000, LENGTH = 16M
    </collect>

You can also declare this as Python code in an lbuild file (useful for board support package modules, see modm:board):

env.collect(":platform:cortex-m:linkerscript.memory",
            "SDRAM (rwx) : ORIGIN = 0xC0000000, LENGTH = 16M")

Add a partition of the new memory to the linkerscript's SECTIONS statements. Since collector order is only preserved locally, make sure to add the sections that depend on this order in one value. Here the previous value of the SDRAM location counter is required to "fill up" the remaining memory with the external heap section:

linkerscript_sections = """
.sdramdata :
{
    __sdramdata_load = LOADADDR (.sdramdata);   /* address in FLASH */
    __sdramdata_start = .;                      /* address in RAM */

    *(.sdramdata)

    . = ALIGN(4);
    __sdramdata_end = .;
} >SDRAM AT >FLASH

.heap_extern (NOLOAD) : ALIGN(4)
{
    __heap_extern_start = .;
    . = ORIGIN(SDRAM) + LENGTH(SDRAM);          /* fill up the remaining SDRAM */
    __heap_extern_end = .;
} >SDRAM
"""
env.collect(":platform:cortex-m:linkerscript.sections", linkerscript_sections)

Next, add the sections that need to be copied from ROM to RAM. Here the contents of the .sdramdata section are stored in the internal FLASH memory and need to be copied into SDRAM during startup:

linkerscript_copy = """
LONG(__sdramdata_load)
LONG(__sdramdata_start)
LONG(__sdramdata_end)
"""
env.collect(":platform:cortex-m:linkerscript.table_extern.copy", linkerscript_copy)

And finally, to register the remaining memory in SDRAM with the allocator, add the memory range to the heap table. Remember to use the correct memory traits for this memory, see modm:architecture:memory for the trait definitions:

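linkerscript_heap = """
LONG(<memory traits>)           /* traits value, see modm:architecture:memory */
LONG(__heap_extern_start)
LONG(__heap_extern_end)
"""
env.collect(":platform:cortex-m:linkerscript.table_extern.heap", linkerscript_heap)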

Linkerscript collectors are plain text

The collectors here only strip the leading/trailing whitespace and newlines and paste the result as is into the linkerscripts. No input validation is performed, so if you receive linker errors with your additions, please check the GNU LD documentation first.

Blocking Delay

The delay functions as defined by modm:architecture:delay are implemented via software loop or hardware cycle counter (via DWT->CYCCNT, not available on ARMv6-M devices) and have the following limitations:

  • nanosecond delay is implemented as a tight loop with better than 100ns resolution and accuracy at any CPU frequency.
  • microsecond delay has a maximum delay of 10 seconds.
  • millisecond delay is implemented via modm::delay_us(ms * 1000), thus also has a maximum delay of 10 seconds.
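
The arithmetic behind these limits can be sketched as follows. This is an illustrative model only: the function names and the max_delay_us constant are hypothetical, and how modm scales the values internally (for the software loop or for DWT->CYCCNT) is not specified here.

```cpp
#include <cstdint>

// Documented upper bound of the microsecond delay: 10 seconds.
constexpr std::uint32_t max_delay_us = 10000000u;

// Number of core cycles that elapse in `us` microseconds at `fcpu_hz`.
// The product is computed in 64 bits so it cannot overflow.
constexpr std::uint64_t us_to_cycles(std::uint32_t us, std::uint32_t fcpu_hz)
{
    return static_cast<std::uint64_t>(us) * fcpu_hz / 1000000u;
}

// A millisecond delay expressed in microseconds, as in delay_us(ms * 1000).
constexpr std::uint64_t ms_to_us(std::uint32_t ms)
{
    return static_cast<std::uint64_t>(ms) * 1000u;
}

// True if the requested millisecond delay stays within the documented limit.
constexpr bool delay_ms_in_range(std::uint32_t ms)
{
    return ms_to_us(ms) <= max_delay_us;
}
```

For instance, one microsecond at 48 MHz corresponds to 48 cycles, and a request of 10001 ms exceeds the documented maximum, so an application would have to split it into several shorter delays.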

Compiler Options

This module adds these architecture specific compiler options:

  • -mcpu=cortex-m{type}: the target to compile for.
  • -mthumb: only Thumb2 instruction set is supported.
  • -mfloat-abi=hard: if FPU available use the fastest ABI available.
  • -mfpu=fpv{4, 5}-{sp}-d16: single or double precision FPU.
  • -fsingle-precision-constant: if SP-FPU, treat all FP constants as SP.
  • -Wdouble-promotion: if SP-FPU, warn if FPs are promoted to doubles.
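
The last two options matter because an unsuffixed floating-point constant is a double in C++, which silently moves a calculation off a single-precision FPU. The sketch below contrasts the two spellings; the function names are hypothetical and chosen only for this illustration.

```cpp
// Unsuffixed constant: 0.5 is a double, so x is promoted to double for the
// multiplication and the result is converted back to float on return.
// -Wdouble-promotion reports this promotion, unless
// -fsingle-precision-constant already turns 0.5 into a float constant.
constexpr float half_promoted(float x) { return x * 0.5; }

// Suffixed constant: the whole expression stays in single precision,
// independent of any compiler option.
constexpr float half_single(float x) { return x * 0.5f; }
```

Both functions return the same value here. The difference is that the first one may require double-precision arithmetic, which a single-precision FPU cannot execute in hardware.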

In addition, these linker options are added:

  • -nostartfiles: modm implements its own startup script.
  • -wrap,_{calloc, malloc, realloc, free}_r: reimplemented Newlib with our own allocator.

This module is only available for sam, stm32.



Offset of FLASH Section Origin

Add an offset to the default start address of the flash memory. This might be required for bootloaders located there.

Vector Table Relocation

Not all offsets are compatible with the vector table relocation.

Default: 0
Inputs: [0 ... 0x10000] sam, stm32{f0,f1,f3,f4,f7,g0,g4,l1,l4}
Inputs: [0 ... 0x100000] stm32{f1,f2,f4,f7,l4}
Inputs: [0 ... 0x180000] stm32f4
Inputs: [0 ... 0x20000] sam, stm32{f0,f1,f2,f3,f4,g0,g4,l1,l4}
Inputs: [0 ... 0x200000] stm32{f4,f7,l4}
Inputs: [0 ... 0x4000] stm32{f0,f1,f3,g0}
Inputs: [0 ... 0x40000] sam, stm32{f0,f1,f2,f3,f4,f7,g4,l1,l4}
Inputs: [0 ... 0x60000] stm32{f1,f3,f4,l1}
Inputs: [0 ... 0x8000] sam, stm32{f0,f1,f3,g0,g4,l1}
Inputs: [0 ... 0x80000] stm32{f1,f2,f3,f4,f7,g4,l1,l4}
Inputs: [0 ... 0xc0000] stm32{f1,f2}


Path to project provided linkerscript

Default: []
Inputs: [Path]


Minimum size of the application main stack

The ARM Cortex-M uses a descending stack mechanism which is placed so that it grows towards the beginning of RAM. In case of a stack overflow the hardware then attempts to stack into invalid memory which triggers a HardFault. A stack overflow will therefore never overwrite any static or heap memory and this protection works without the MPU and therefore also on ARM Cortex-M0 devices.

If the vector table is relocated into RAM, the start address needs to be aligned to the next highest power-of-two word depending on the total number of device interrupts. On devices where the table is relocated into the same memory as the main stack, an alignment buffer up to 1kB is added to the main stack.

|              ...                |
|    Interrupt Vectors (in RAM)   |
|        (if re-mapped)           | <-- vector table origin
|---------------------------------| <-- main stack top
|           Main Stack            |
|       (grows downwards)         |
|               |                 |
|               v                 |
|  Alignment buffer for vectors   |
|   (overwritten by main stack!)  |
'---------------------------------' <-- RAM origin
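
The size of the alignment buffer follows from the alignment rule above. As a rough sketch, assuming the table holds 16 system vectors plus the device interrupts with 4 bytes per vector, the required alignment can be computed like this. The function names are hypothetical, and any architectural minimum alignment is not modeled.

```cpp
#include <cstdint>

// Round up to the next power of two (returns n itself if it already is one).
constexpr std::uint32_t next_pow2(std::uint32_t n)
{
    std::uint32_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Alignment in bytes of a RAM vector table for a device whose highest
// IRQ number + 1 is `highest_irq`: 16 system vectors are added, the count
// is rounded up to a power of two, and each vector occupies 4 bytes.
constexpr std::uint32_t vector_table_alignment(std::uint32_t highest_irq)
{
    return next_pow2(highest_irq + 16u) * 4u;
}
```

A device with 82 interrupts has 98 vectors, which rounds up to 128 words or 512 bytes. A device with 240 interrupts has 256 vectors and therefore needs 1024 bytes, which matches the upper bound of the alignment buffer.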

Default: 3*1024 (3072)
Inputs: [256 .. 3*1024 .. 65536]


Vector table location in ROM or RAM

The vector table is always stored in ROM and copied to RAM by the startup script if required. You can modify the RAM vector table using the CMSIS NVIC functions:

  • void NVIC_SetVector(IRQn_Type IRQn, uint32_t vector)
  • uint32_t NVIC_GetVector(IRQn_Type IRQn)
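
Conceptually, these two functions index the RAM table with the IRQ number shifted by the 16 system vectors. The hosted sketch below models that indexing with a plain array. The VectorTable struct, the table size of 8 device interrupts and the helper names are hypothetical; on a target you would call the CMSIS functions listed above instead.

```cpp
#include <cstdint>

// Hypothetical stand-in for a RAM vector table of a device with 8 IRQs.
struct VectorTable {
    std::uint32_t entries[16 + 8];
};

// Store a handler address for the given signed IRQ number.
constexpr void set_vector(VectorTable& table, int irq, std::uint32_t vector)
{
    table.entries[irq + 16] = vector;
}

// Read back the handler address for the given signed IRQ number.
constexpr std::uint32_t get_vector(const VectorTable& table, int irq)
{
    return table.entries[irq + 16];
}

constexpr bool vector_demo()
{
    VectorTable table = {};
    set_vector(table, 3, 0x08001235u);    // device interrupt 3
    set_vector(table, -1, 0x08000101u);   // system exception SysTick (IRQ -1)
    return get_vector(table, 3) == 0x08001235u
        && table.entries[15] == 0x08000101u
        && get_vector(table, 0) == 0u;
}
```

Such writes are only possible when the table is located in RAM, which is why the table location matters for applications that change handlers at runtime.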

For applications that do not modify the vector table at runtime, relocation to RAM is not necessary and can save a few hundred bytes of static memory.

By default, the fastest option is chosen depending on the target memory architecture. This does not always mean the table is copied into RAM, and therefore may not be modifiable with this option!

From the ARM Cortex-M4 Technical Reference Manual on exception handling:

  • Processor state is automatically stored to the stack on an exception, and automatically restored from the stack at the end of the Interrupt Service Routine.
  • The vector is fetched in parallel to the state saving, enabling efficient interrupt entry.

On Interrupt Latency

Placing the main stack and the vector table into the same memory can significantly increase interrupt latency, since both the I-Code and D-Code memory interfaces need to fetch from the same access port.

This option is only available for stm32{f1,f2,f3,f4,f7,g4,l1,l4}.

Default: ram stm32{f3,f7,g4}
Default: rom stm32{f1,f2,f3,f4,l1,l4}
Inputs: [ram, rom]



Additions to the linkerscript's 'MEMORY'

Inputs: [String]


Maximum required size of the process stack

Inputs: [0 ... +Inf]


Additions to the linkerscript's 'SECTIONS'

Inputs: [String]


Additions to the linkerscript's '.table.copy.extern' section

Inputs: [String]


Additions to the linkerscript's '.table.heap' section

Inputs: [String]

Additions to the linkerscript's '.table.zero.extern' section

Inputs: [String]



Computes linkerscript properties (* post-build only):

  • process_stack_size: largest requested process stack size by any module
  • vector_table_location: ram or rom

Stripped and newline-joined collector values of:

  • linkerscript_memory
  • linkerscript_sections
  • linkerscript_extern_zero
  • linkerscript_extern_copy
  • linkerscript_extern_heap

Additional memory properties:

  • memories: unfiltered memory regions
  • regions: memory region names
  • ram_origin: Lowest SRAM origin address
  • ram_size: Total size of all SRAM regions

:returns: dictionary of linkerscript properties


Computes vector table properties:

  • vector_table: [position] = Full vector name (i.e. with _Handler or _IRQHandler suffix)
  • vector_table_location: rom or ram
  • highest_irq: highest IRQ number + 1
  • core: cortex-m{0,3,4,7}{,+,f,fd}

The system vectors start at -16, so you must add 16 to highest_irq to get the total number of vectors in the table!

:returns: a dictionary of vector table properties


Dependencies of modm:platform:cortex-m: modm:architecture:interrupt, modm:cmsis:device, modm:stdc++

Limited availability: Check with 'lbuild discover' if this module is available for your target!