ARM Cortex-M Core¶

lbuild module: modm:platform:cortex-m

This module generates the startup code, vector table and linkerscript. It also initializes the heap, deals with assertions, and provides blocking delay functions, atomic and unaligned access and the GNU build ID.

Since this module only initializes the generic ARM Cortex-M parts, it delegates device-specific initialization to the modm:platform:core module. Please depend on that module directly instead of this one.

Startup¶

After reset, the ARM Cortex-M hardware jumps to the Reset_Handler(), which is implemented as follows:

1. The main stack pointer (MSP) is initialized by software.
2. Call __modm_initialize_platform() to initialize the device hardware.
3. Call modm_initialize_platform() to initialize the custom device hardware.
4. Copy data to internal RAM.
5. Zero sections in internal RAM.
6. Initialize ARM Cortex-M core: enable FPU, caches and relocate vector table.
7. Execute shared hardware initialization functions.
8. Copy data to external RAM.
9. Zero sections in external RAM.
10. Initialize heap via __modm_initialize_memory() (implemented by the modm:platform:heap module).
11. Call static constructors.
12. Call main() application entry point.
13. If main() returns, assert on main.exit (only in debug profile).
14. Reboot if the assertion returns.
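The copy and zero steps in this sequence can be pictured as table-driven loops. The following is a minimal host-side sketch, not modm's actual startup code: memory is modelled as a word array, addresses are word indices, and the names CopyEntry, ZeroEntry, run_copy_table and run_zero_table are hypothetical. The copy entry mirrors the load/start/end triples used for the copy table later in this document; the start/end layout of the zero entry is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: addresses are indices into a word array.
struct CopyEntry { std::size_t load, start, end; };  // load/start/end triple
struct ZeroEntry { std::size_t start, end; };        // assumed start/end pair

// Copy each [start, end) range from its load location, word by word.
inline void run_copy_table(std::vector<std::uint32_t>& mem,
                           const std::vector<CopyEntry>& table)
{
    for (const CopyEntry& e : table) {
        assert(e.start <= e.end && e.end <= mem.size());
        std::size_t src = e.load;
        for (std::size_t dst = e.start; dst < e.end; ++dst, ++src)
            mem[dst] = mem[src];
    }
}

// Fill each [start, end) range with zeros.
inline void run_zero_table(std::vector<std::uint32_t>& mem,
                           const std::vector<ZeroEntry>& table)
{
    for (const ZeroEntry& e : table) {
        assert(e.start <= e.end && e.end <= mem.size());
        for (std::size_t i = e.start; i < e.end; ++i)
            mem[i] = 0;
    }
}
```

On a real target the entries would hold absolute addresses provided by the linkerscript, and the internal tables would be processed before the external ones, as listed in the sequence.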

Device Initialization¶

The __modm_initialize_platform() function is called directly after reset. Its purpose is to initialize the device-specific hardware, for example to enable internal memories or disable the hardware watchdog timer. You can provide additional application-specific initialization by overriding the weakly linked modm_initialize_platform() function:

extern "C" void modm_initialize_platform()
{
    // Configure power settings before accessing SRAM
}

It's important to understand that because the .data section has not yet been copied and the .bss section has not yet been zeroed, there exists no valid C environment yet in this function context! This means you cannot use any global variables, not even "local" static ones defined in your function, and depending on your hardware you may not even access read-only data (const variables, global OR local). In addition, if your linkerscript places the main stack pointer into a memory that is disabled on reset, you cannot even access the stack until you've enabled its backing memory. The Reset_Handler therefore calls this function in Assembly without accessing the stack.

It is strongly recommended to only read/write registers in this function, and perhaps even write this function in Assembly if deemed necessary.

Cache Initialization¶

For Cortex-M7 devices, the I-Cache is enabled by default. The D-Cache with a write-back write-allocate policy is only enabled if the modm:platform:dma module is NOT selected. modm currently does not support allocating DMA buffers in non-cacheable regions or granular cache invalidation. See the CMSIS-Core Cache API for more information on cache management.

Additional Initialization¶

A few modules need to initialize additional hardware during booting, for example when your device has external memories connected that you want to use for the heap. You can create a function that configures the peripherals for these external memories and place a pointer to this function into a special linker section. The startup script will then call this function before heap initialization.

Since the hardware init functions are called after internal data initialization, you have a valid C environment and thus can access the device normally, but since the calls happen before external data and heap initialization you cannot use the heap in these functions!

You can give a relative global order to your init functions. Ordered init functions are called first, then unordered init functions are called in any order. Please note that order numbers 0 - 999 are reserved for use by modm or other libraries!

Unique init function names

Init function names need to be globally unique for linking. Unfortunately there is no simple way of stringifying C++ functions, so you have to provide a name manually for now.

void init_external_sdram()
{
    // configure the hardware here
}
// Startup script calls this function in any order, *after* prioritized functions!
MODM_HARDWARE_INIT(init_external_sdram);
// If you need to pass a C++ function, you need to declare a unique name for it
MODM_HARDWARE_INIT_NAME(init_function_name, namespace::init_function);

// If you need to initialize in a certain order use numbers >= 1000
MODM_HARDWARE_INIT_ORDER(init_before_sdram1, 1000);
// called after init_before_sdram1, since it has a higher order number
MODM_HARDWARE_INIT_NAME_ORDER(init_before_sdram2, namespace::function, 1001);
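The resulting call order can be modelled on a host machine. The sketch below only illustrates the ordering rule described above (ordered functions first, in ascending order, then unordered ones); it is not how the startup script is implemented. InitEntry and call_order are hypothetical names, and a negative order number is this model's own way of marking an unordered function.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical registration record for this model only.
struct InitEntry {
    std::string name;
    int order;  // negative means "unordered" in this model
};

// Returns the names in call order: ordered entries ascending, then unordered.
inline std::vector<std::string> call_order(std::vector<InitEntry> entries)
{
    std::stable_sort(entries.begin(), entries.end(),
        [](const InitEntry& a, const InitEntry& b) {
            const bool a_ordered = a.order >= 0;
            const bool b_ordered = b.order >= 0;
            if (a_ordered != b_ordered) return a_ordered;  // ordered first
            return a_ordered && a.order < b.order;         // ascending numbers
        });
    std::vector<std::string> names;
    for (const InitEntry& e : entries) names.push_back(e.name);
    return names;
}
```

The model keeps unordered functions in registration order because of the stable sort. The real startup gives no such guarantee, so unordered functions must not rely on each other.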

Interrupt Vector Table¶

The Cortex-M vector table (VTOR) is target-specific and generated using data from modm-devices. The main stack pointer is allocated according to the linkerscript and the Reset_Handler is defined by the startup script.

All handlers are weakly aliased to Undefined_Handler, which is called if an IRQ is enabled but no handler is defined for it. This default handler determines the currently active IRQ, sets its priority to the lowest level, disables the IRQ from firing again, and then asserts on nvic.undef with the (signed) IRQ number as context.

The lowering of the priority is necessary, since the assertion handlers (see modm:architecture:assert) are called from within this active IRQ and its priority should not prevent logging functionality (which might require a UART interrupt to flush data out) from working correctly.
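To illustrate why the IRQ number passed as context is signed: on Cortex-M, the low nine bits of the IPSR register hold the active exception number, and device interrupts start at exception number 16. The sketch below shows that conversion. The helper name irq_from_ipsr is hypothetical, and whether Undefined_Handler derives the number exactly this way is an assumption.

```cpp
#include <cstdint>

// IPSR[8:0] holds the active exception number; device IRQs start at 16,
// so system exceptions map to negative IRQ numbers.
constexpr std::int32_t irq_from_ipsr(std::uint32_t ipsr)
{
    return static_cast<std::int32_t>(ipsr & 0x1FFu) - 16;
}

static_assert(irq_from_ipsr(16) == 0,  "first device interrupt");
static_assert(irq_from_ipsr(3) == -13, "HardFault");
static_assert(irq_from_ipsr(15) == -1, "SysTick");
```

A negative context value therefore points at a system exception, while zero or a positive value identifies a device interrupt.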

Linkerscript¶

This module provides building blocks for GNU ld linkerscripts in the form of Jinja macros that the modm:platform:core module assembles into a linkerscript, depending on the memory architecture of the target chosen.

The following macros are available:

copyright(): Copyright notice.
prefix(): Contains MEMORY sections, output format and entry symbol and stack size definitions.
section_vector_rom(memory): places the read-only vector table into ROM memory.
section_vector_ram(memory, table_copy): places the volatile vector table into RAM memory and adds it to the copy table. You must satisfy alignment requirements externally.
section_load(memory, table_copy, sections): places each .{section} in sections into memory and adds them to the copy table.
section_stack(memory, start=None, suffix=""): place the main stack into memory after moving the location counter to start. suffix can be used to add multiple .stack{suffix} sections.
section_heap(memory, name, placement=None, sections=[]): Add the noload sections to memory and fill up remaining space in memory with heap section .{name}. Argument placement can be used to place the section into a larger continuous section of which memory is just a subsection. The __{name}_end will be the maximum of the location counter and the memory section end address, so that previous sections will push this section back.
all_heap_sections(table_copy, table_zero, table_heap, props={}): places the heap sections as described by cont_ram_regions of the linkerscript query. This also adds bss and noinit sections into each region. The props key can be used to override the default 0x001f memory properties.
section_rom(memory): place all read-only sections (.text, .rodata etc) into memory.
section_ram(memory, rom, table_copy, table_zero, sections_data=[], sections_bss=[], sections_noinit=[]): place all volatile sections (.data, .bss etc) into memory and load from rom. Additional sections can be added.
section_tables(memory, copy, zero, heap): place the zero, copy and heap table into memory.
section_rom_start(memory): place at ROM start.
section_rom_end(memory): place at ROM end.
section_debug(): place debug sections at the very end.

Please consult the modm:platform:core documentation for the target-specific arrangement of these section macros and for potential limitations that the target's memory architecture poses.

Section `.fastdata`¶

The .fastdata section is placed into a device-specific data cache or into the fastest RAM. Please note that the .fastdata section may be placed into RAM that is only accessible to the Cortex-M core (via the Data-Bus), which can cause issues with DMA access. The .fastdata section is not required to be DMA-able, so in such a case the developer needs to place the data into the generic .data section or choose a device with a DMA-able fast RAM.

Section `.fastcode`¶

The .fastcode section is placed into a device specific instruction cache (via I-Code bus) or into the fastest executable RAM (via S-Bus).

From the Cortex-M3 Technical Reference Manual:

14.5 System Interface:

The system interface is a 32-bit AHB-Lite bus. Instruction and vector fetches, and data and debug accesses to the System memory space, 0x20000000 - 0xDFFFFFFF, 0xE0100000 - 0xFFFFFFFF, are performed over this bus.

14.5.6 Pipelined instruction fetches:

To provide a clean timing interface on the System bus, instruction and vector fetch requests to this bus are registered. This results in an additional cycle of latency because instructions fetched from the System bus take two cycles. This also means that back-to-back instruction fetches from the System bus are not possible.

Note: Instruction fetch requests to the ICode bus are not registered. Performance critical code must run from the ICode interface.

Adding Sections¶

The default linkerscripts only describe the internal memory, however, they can be extended for external memories using the linkerscript.* collectors of this module. For example, to add an external 16MB SDRAM to your device and place a static data section there that is copied from flash and use the remainder for heap access, these steps need to be performed:

Add the external SDRAM to the linkerscript's MEMORY statements in the project.xml configuration:

<library>
  <collectors>
    <collect name="modm:platform:cortex-m:linkerscript.memory">
       SDRAM (rwx) : ORIGIN = 0xC0000000, LENGTH = 16M
    </collect>
  </collectors>
</library>

You can also declare this as Python code in a lbuild module.lb file (useful for board support packages modules, see modm:board):

env.collect(":platform:cortex-m:linkerscript.memory",
            "SDRAM (rwx) : ORIGIN = 0xC0000000, LENGTH = 16M")

Add a partition of the new memory to the linkerscript's SECTIONS statements. Since collector order is only preserved locally, make sure to add the sections that depend on this order in one value. Here the previous value of the SDRAM location counter is required to "fill up" the remaining memory with the external heap section:

linkerscript_sections = """
.data_sdram :
{
    __data_sdram_load = LOADADDR(.data_sdram);
    __data_sdram_start = .;

    *(.data_sdram)

    . = ALIGN(4);
    __data_sdram_end = .;
} >SDRAM AT >FLASH

.heap_sdram (NOLOAD) :
{
    __heap_sdram_start = .;
    . = ORIGIN(SDRAM) + LENGTH(SDRAM);
    __heap_sdram_end = .;
} >SDRAM
"""
env.collect(":platform:cortex-m:linkerscript.sections", linkerscript_sections)

Next, add the sections that need to be copied from ROM to RAM. Here the contents of the .data_sdram section are stored in the internal FLASH memory and need to be copied into SDRAM during startup:

linkerscript_copy = """
LONG(__data_sdram_load)
LONG(__data_sdram_start)
LONG(__data_sdram_end)
"""
env.collect(":platform:cortex-m:linkerscript.table_extern.copy", linkerscript_copy)

And finally, to register the remaining memory in SDRAM with the allocator, add the memory range to the heap table. Remember to use the correct memory traits for this memory, see modm:architecture:memory for the trait definitions:

linkerscript_heap = """
LONG(0x801f)
LONG(__heap_sdram_start)
LONG(__heap_sdram_end)
"""
env.collect(":platform:cortex-m:linkerscript.table_extern.heap", linkerscript_heap)

Linkerscript collectors are plain text

The collectors here only strip the leading/trailing whitespace and newlines and paste the result as is into the linkerscripts. No input validation is performed, so if you receive linker errors with your additions, please check the GNU LD documentation first.
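To make the heap table layout more concrete, here is a small host-side sketch that walks entries in the same word order as the LONG() statements above (traits, start, end) and sums up the described memory. HeapEntry and total_heap_bytes are hypothetical names. The trait word is treated as opaque, and the addresses used in the usage note are made-up examples.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One heap table entry, in the order the three LONG() words are emitted.
struct HeapEntry {
    std::uint32_t traits;  // opaque memory trait bits, e.g. 0x801f
    std::uint32_t start;   // first address of the region
    std::uint32_t end;     // one past the last address of the region
};

// Total number of bytes described by a heap table.
inline std::uint64_t total_heap_bytes(const HeapEntry* table, std::size_t count)
{
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        assert(table[i].start <= table[i].end);
        total += table[i].end - table[i].start;
    }
    return total;
}
```

For the 16MB SDRAM at 0xC0000000 with, say, 0x100 bytes taken by .data_sdram, the entry would span 0xC0000100 to 0xC1000000 and contribute 0xFFFF00 bytes.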

Blocking Delay¶

The delay functions as defined by modm:architecture:delay are implemented via a software loop (ARMv6-M devices) or the hardware cycle counter (via DWT->CYCCNT on ARMv7-M devices) and have the following limitations expressed in cycles, which depend on the configured CPU frequency:

nanosecond delay is implemented as a tight loop with a minimum delay of <20 cycles, a resolution of 1-4 cycles and a maximum delay of 32-bit cycles.
microsecond delay has a maximum delay of 32-bit cycles.
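Because these limits are stated in cycles, converting them to time requires the CPU frequency. The following sketch shows the arithmetic only. It assumes that "32-bit cycles" means the cycle count must fit into an unsigned 32-bit value, and the helper names cycles_for_ns and max_delay_us are hypothetical, not part of the delay API.

```cpp
#include <cstdint>

// Cycles needed for a delay of `ns` nanoseconds at `cpu_hz`, rounded up.
constexpr std::uint64_t cycles_for_ns(std::uint64_t ns, std::uint64_t cpu_hz)
{
    return (ns * cpu_hz + 999'999'999ull) / 1'000'000'000ull;
}

// Longest delay in microseconds whose cycle count still fits into 32 bits.
constexpr std::uint64_t max_delay_us(std::uint64_t cpu_hz)
{
    return (0xFFFF'FFFFull * 1'000'000ull) / cpu_hz;
}

static_assert(cycles_for_ns(1000, 48'000'000) == 48, "1us at 48MHz");
static_assert(max_delay_us(64'000'000) == 67'108'863, "about 67s at 64MHz");
```

At 48MHz a minimum delay of 20 cycles corresponds to roughly 417ns, so requests for shorter nanosecond delays cannot be met exactly.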

Compiler Options¶

This module adds these architecture specific compiler options:

-mcpu=cortex-m{type}: the target to compile for.
-mthumb: only Thumb2 instruction set is supported.
-mfloat-abi={soft, softfp, hard}: the FPU ABI: hard is fastest.
-mfpu=fpv{4, 5}-{sp}-d16: single or double precision FPU.
-Wdouble-promotion: if SP-FPU, warn if FPs are promoted to doubles. Note that unless you use the .f suffix or explicitly cast floating point operations to float, floating point constants are of double type, whose storage can result in an increased binary size. While you can add the -fsingle-precision-constant compiler flag to implicitly cast all doubles to floats, this also impacts compile time computations and may reduce accuracy. Therefore it is not enabled by default and you should carefully watch for any unwanted numeric side effects if you use this compiler option. See Semantics of Floating Point Math in GCC.
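The promotion that -Wdouble-promotion warns about can be checked at compile time. This short sketch uses only standard C++; the function names scale_promoted and scale_single are hypothetical examples.

```cpp
#include <type_traits>

// An unsuffixed constant is a double, so the whole expression is promoted.
static_assert(std::is_same<decltype(1.0f * 2.5), double>::value,
              "float * double constant yields double");
// With the f suffix the arithmetic stays in single precision.
static_assert(std::is_same<decltype(1.0f * 2.5f), float>::value,
              "float * float constant yields float");

// Multiplies in double precision, then converts back to float.
constexpr float scale_promoted(float x) { return static_cast<float>(x * 2.5); }

// Multiplies in single precision only.
constexpr float scale_single(float x) { return x * 2.5f; }
```

On a device with a single precision FPU, the first function typically falls back to software routines for the double multiplication, while the second can stay on the FPU.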

In addition, these linker options are added:

-nostartfiles: modm implements its own startup script.
-wrap,_{calloc, malloc, realloc, free}_r: reimplements the Newlib allocation functions with our own allocator.

This module is only available for rp, sam, stm32.

Options¶

enable_dcache¶

Enable Data-Cache

This option is only available for same7x/s7x/v7x, stm32{f7,h7}.

Default: yes
Inputs: [yes, no]

enable_icache¶

Enable Instruction-Cache

This option is only available for same7x/s7x/v7x, stm32{f7,h7}.

Default: yes
Inputs: [yes, no]

float-abi¶

Floating point ABI

This option is only available for sam{d5x/e5x,e7x/s7x/v7x,g5x}, stm32{f3,f4,f7,g4,h7,l4,l5,u5}.

Default: hard
Inputs: [hard, soft, softfp]

linkerscript.flash_offset¶

Offset of FLASH Section Origin

Add an offset to the default start address of the flash memory. This might be required for bootloaders located there.

Vector Table Relocation

Not all offsets are compatible with the vector table relocation.

Default: 0
Inputs: [0 ... 0x10000] samd1x/d2x/dax, stm32{c0,f0,f1,f3,f4,f7,g0,l0,l1,l4}
Inputs: [0 ... 0x100000] sam{d5x/e5x,e7x/s7x/v7x}, stm32{f4,f7,h7,l4,u5}
Inputs: [0 ... 0x1000000] rp2040
Inputs: [0 ... 0x180000] stm32f4
Inputs: [0 ... 0x2000] stm32l0
Inputs: [0 ... 0x20000] samd1x/d2x/dax, stm32{c0,f0,f1,f2,f3,f4,g0,g4,h7,l0,l4,u5}
Inputs: [0 ... 0x200000] same7x/s7x/v7x, stm32{f4,f7,h7,l4,u5}
Inputs: [0 ... 0x30000] stm32l0
Inputs: [0 ... 0x4000] stm32{c0,f0,f1,f3,g0,l0}
Inputs: [0 ... 0x40000] sam{d1x/d2x/dax,d5x/e5x}, stm32{c0,f0,f1,f2,f3,f4,f7,g0,g4,l1,l4,l5,u5}
Inputs: [0 ... 0x400000] stm32u5
Inputs: [0 ... 0x60000] stm32{f3,l1}
Inputs: [0 ... 0x8000] samd1x/d2x/dax, stm32{c0,f0,f3,g0,g4,l0,l1}
Inputs: [0 ... 0x80000] sam{d5x/e5x,e7x/s7x/v7x,g5x}, stm32{f2,f3,f4,f7,g0,g4,h7,l1,l4,l5,u5}

linkerscript.flash_reserved¶

Add a reserved section at the end of the flash.

Default: 0
Inputs: [0 ... 0x10000] samd1x/d2x/dax, stm32{c0,f0,f1,f3,f4,f7,g0,l0,l1,l4}
Inputs: [0 ... 0x100000] sam{d5x/e5x,e7x/s7x/v7x}, stm32{f4,f7,h7,l4,u5}
Inputs: [0 ... 0x1000000] rp2040
Inputs: [0 ... 0x180000] stm32f4
Inputs: [0 ... 0x2000] stm32l0
Inputs: [0 ... 0x20000] samd1x/d2x/dax, stm32{c0,f0,f1,f2,f3,f4,g0,g4,h7,l0,l4,u5}
Inputs: [0 ... 0x200000] same7x/s7x/v7x, stm32{f4,f7,h7,l4,u5}
Inputs: [0 ... 0x30000] stm32l0
Inputs: [0 ... 0x4000] stm32{c0,f0,f1,f3,g0,l0}
Inputs: [0 ... 0x40000] sam{d1x/d2x/dax,d5x/e5x}, stm32{c0,f0,f1,f2,f3,f4,f7,g0,g4,l1,l4,l5,u5}
Inputs: [0 ... 0x400000] stm32u5
Inputs: [0 ... 0x60000] stm32{f3,l1}
Inputs: [0 ... 0x8000] samd1x/d2x/dax, stm32{c0,f0,f3,g0,g4,l0,l1}
Inputs: [0 ... 0x80000] sam{d5x/e5x,e7x/s7x/v7x,g5x}, stm32{f2,f3,f4,f7,g0,g4,h7,l1,l4,l5,u5}

linkerscript.override¶

Path to project provided linkerscript

Default: []
Inputs: [Path]

main_stack_size¶

Minimum size of the application main stack

The ARM Cortex-M uses a descending stack mechanism which is placed so that it grows towards the beginning of RAM. In case of a stack overflow the hardware then attempts to stack into invalid memory which triggers a HardFault. A stack overflow will therefore never overwrite any static or heap memory and this protection works without the MPU and therefore also on ARM Cortex-M0 devices.

If the vector table is relocated into RAM, the start address needs to be aligned to the next highest power-of-two word depending on the total number of device interrupts. On devices where the table is relocated into the same memory as the main stack, an alignment buffer up to 1kB is added to the main stack.

|              ...                |
|---------------------------------|
|    Interrupt Vectors (in RAM)   |
|        (if re-mapped)           | <-- vector table origin
|---------------------------------| <-- main stack top
|           Main Stack            |
|       (grows downwards)         |
|               |                 |
|               v                 |
|---------------------------------|
|  Alignment buffer for vectors   |
|   (overwritten by main stack!)  |
'---------------------------------' <-- RAM origin

Default: 3Ki (3072)
Inputs: [256 .. 3Ki .. 64Ki]
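The power-of-two alignment rule for a vector table in RAM can be estimated with a few lines of arithmetic. This is a sketch under stated assumptions: each vector is one 32-bit word, the table holds 16 system vectors plus the device interrupts, and any architectural minimum alignment is ignored. The function names are hypothetical.

```cpp
#include <cstdint>

// Smallest power of two that is >= value (value must be > 0 and <= 2^31).
constexpr std::uint32_t next_power_of_two(std::uint32_t value)
{
    std::uint32_t result = 1;
    while (result < value) result <<= 1;
    return result;
}

// Bytes occupied by 16 system vectors plus `device_irqs` interrupt vectors.
constexpr std::uint32_t vector_table_bytes(std::uint32_t device_irqs)
{
    return (16u + device_irqs) * 4u;
}

// Alignment required for the relocated table under the rule above.
constexpr std::uint32_t vector_table_alignment(std::uint32_t device_irqs)
{
    return next_power_of_two(vector_table_bytes(device_irqs));
}

static_assert(vector_table_alignment(240) == 1024, "256 vectors need 1kB");
```

A device with 82 interrupts has a 392 byte table that must be aligned to 512 bytes, while 240 interrupts give the 1kB worst case mentioned above.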

vector_table_location¶

Vector table location in ROM or RAM

The vector table is always stored in ROM and copied to RAM by the startup script if required. You can modify the RAM vector table using the CMSIS NVIC functions:

void NVIC_SetVector(IRQn_Type IRQn, uint32_t vector)
uint32_t NVIC_GetVector(IRQn_Type IRQn)

For applications that do not modify the vector table at runtime, relocation to RAM is not necessary and can save a few hundred bytes of static memory.

By default, the fastest option is chosen depending on the target memory architecture. This does not always mean the table is copied into RAM, and therefore may not be modifiable with this option!

From the ARM Cortex-M4 Technical Reference Manual on exception handling:

Processor state is automatically stored to the stack on an exception, and automatically restored from the stack at the end of the Interrupt Service Routine.

The vector is fetched in parallel to the state saving, enabling efficient interrupt entry.

On Interrupt Latency

Placing the main stack and the vector table into the same memory can significantly increase interrupt latency, since both the I-Code and D-Code memory interfaces need to fetch from the same access port.

This option is only available for rp, sam, stm32{c0,f1,f2,f3,f4,f7,g0,g4,h7,l0,l1,l4,l5,u5}.

Default: ram stm32{f3,f7,g4,h7,l4}
Default: rom rp, sam, stm32{c0,f1,f2,f3,f4,g0,l0,l1,l4,l5,u5}
Inputs: [ram, rom]

Collectors¶

linkerscript.memory¶

Additions to the linkerscript's 'MEMORY'

Inputs: [String]

linkerscript.sections¶

Additions to the linkerscript's 'SECTIONS'

Inputs: [String]

linkerscript.table_extern.copy¶

Additions to the linkerscript's '.table.copy.extern' section

Inputs: [String]

linkerscript.table_extern.heap¶

Additions to the linkerscript's '.table.heap' section

Inputs: [String]

linkerscript.table_extern.zero¶

Additions to the linkerscript's '.table.zero.extern' section

Inputs: [String]

Queries¶

linkerscript¶

Computes linkerscript properties (* post-build only):

vector_table_location: ram or rom

Stripped and newline-joined collector values of:

linkerscript_memory
linkerscript_sections
linkerscript_extern_zero
linkerscript_extern_copy
linkerscript_extern_heap

Additional memory properties:

memories: unfiltered memory regions
ram_regions: memory region names with ram in their names
regions: memory region names
cont_ram_regions: all continuous internal SRAM sections
cont_ram: largest continuous internal SRAM section

:returns: dictionary of linkerscript properties

vector_table¶

Computes vector table properties:

vector_table: [position] = Full vector name (i.e. with _Handler or _IRQHandler suffix)
vector_table_location: rom or ram
vector_table_size: in Bytes
highest_irq: highest IRQ number + 1
core: cortex-m{0,3,4,7}{,+,f,fd}
exception_frame_size: in Bytes.

The system vectors start at -16, so you must add 16 to highest_irq to get the total number of vectors in the table!

:returns: a dictionary of vector table properties

Dependencies¶

Limited availability: Check with 'lbuild discover' if this module is available for your target!
