ARM Cortex-M Core¶
lbuild module: modm:platform:cortex-m
This module generates the startup code, vector table and linkerscript. It also initializes the heap, handles assertions, and provides blocking delay functions, atomic and unaligned access, and the GNU build ID.
Since this module only initializes the generic ARM Cortex-M parts, it delegates
device-specific initialization to the modm:platform:core module. Please depend
on that module directly instead of this one.
Startup¶
After reset, the ARM Cortex-M hardware jumps to the Reset_Handler(), which is
implemented as follows:
- The main stack pointer (MSP) is initialized by software.
- Call __modm_initialize_platform() to initialize the device hardware.
- Call modm_initialize_platform() to initialize the custom device hardware.
- Copy data to internal RAM.
- Zero sections in internal RAM.
- Initialize ARM Cortex-M core: enable FPU, caches and relocate vector table.
- Execute shared hardware initialization functions.
- Copy data to external RAM.
- Zero sections in external RAM.
- Initialize heap via __modm_initialize_memory() (implemented by the modm:platform:heap module).
- Call static constructors.
- Call main() application entry point.
- If main() returns, assert on main.exit (only in debug profile).
- Reboot if assertion returns.
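The copy and zero steps above can be pictured as walking two small tables of addresses. The following is a host-runnable sketch, not modm's actual startup code: the names CopyEntry, ZeroEntry, copy_sections and zero_sections are hypothetical, the three-word copy entry (load, start, end) is assumed from the copy table example later in this document, and the two-word zero entry is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical table entry layouts, assumed for illustration only.
struct CopyEntry
{
    const std::uint32_t* load;   // where the initial values are stored (ROM)
    std::uint32_t* start;        // first word of the destination (RAM)
    std::uint32_t* end;          // one past the last destination word
};

struct ZeroEntry
{
    std::uint32_t* start;        // first word to clear
    std::uint32_t* end;          // one past the last word to clear
};

// Copy every section listed in the table, word by word.
inline void copy_sections(const CopyEntry* table, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        const std::uint32_t* src = table[i].load;
        for (std::uint32_t* dst = table[i].start; dst < table[i].end; ++dst)
            *dst = *src++;
    }
}

// Clear every section listed in the table, word by word.
inline void zero_sections(const ZeroEntry* table, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        for (std::uint32_t* dst = table[i].start; dst < table[i].end; ++dst)
            *dst = 0;
}
```

Running the same two loops once for the internal tables and once for the external tables would give the order listed above, with the shared hardware initialization functions in between.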
Device Initialization¶
The __modm_initialize_platform() function is called directly after reset,
and its purpose is to initialize the device-specific hardware, such as enabling
internal memories or disabling the hardware watchdog timer. You can provide
additional application-specific initialization by overriding the weakly linked
modm_initialize_platform() function:
extern "C" void modm_initialize_platform()
{
// Configure power settings before accessing SRAM
}
It's important to understand that because the .data section has not yet been
copied and the .bss section has not yet been zeroed, there exists no valid C
environment yet in this function context! This means you cannot use any global
variables, not even "local" static ones defined in your function, and depending
on your hardware you may not even access read-only data (const variables,
global OR local). In addition, if your linkerscript places the main stack
pointer into a memory that is disabled on reset, you cannot even access the
stack until you've enabled its backing memory. The Reset_Handler therefore
calls this function in Assembly without accessing the stack.
It is strongly recommended to only read/write registers in this function, and perhaps even write this function in Assembly if deemed necessary.
Cache Initialization¶
For Cortex-M7 devices, the I-Cache is enabled by default. The D-Cache with a
write-back write-allocate policy is only enabled if the modm:platform:dma
module is NOT selected. modm currently does not support allocating DMA buffers
in non-cacheable regions or granular cache invalidation. See the
CMSIS-Core Cache API for more information on cache management.
Additional Initialization¶
A few modules need to initialize additional hardware during booting. For example: your device has external memories connected that you want to use for the heap. You can create a function that configures the peripherals for these external memories and place a pointer to this function into a special linker section and the startup script will then call this function before heap initialization.
Since the hardware init functions are called after internal data initialization, you have a valid C environment and thus can access the device normally, but since the calls happen before external data and heap initialization you cannot use the heap in these functions!
You can give a relative global order to your init functions. Ordered init functions are called first, then unordered init functions are called in any order. Please note that order numbers 0 - 999 are reserved for use by modm or other libraries!
Unique init function names
Init function names need to be globally unique for linking. Unfortunately there is no simple way of stringifying C++ functions, so you have to provide a name manually for now.
void init_external_sdram()
{
// configure the hardware here
}
// Startup script calls this function in any order, *after* prioritized functions!
MODM_HARDWARE_INIT(init_external_sdram);
// If you need to pass a C++ function, you need to declare a unique name for it
MODM_HARDWARE_INIT_NAME(init_function_name, namespace::init_function);
// If you need to initialize in a certain order use numbers >= 1000
MODM_HARDWARE_INIT_ORDER(init_before_sdram1, 1000);
// called after init_before_sdram1, since it has a higher order number
MODM_HARDWARE_INIT_NAME_ORDER(init_before_sdram2, namespace::function, 1001);
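The calling order described above (ordered functions first by ascending order number, then unordered functions) can be modelled on a host. This is only a sketch of the ordering rule: modm collects the function pointers through a linker section, whereas the InitEntry record, the Unordered sentinel and run_hardware_init() below are hypothetical stand-ins.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

using InitFunction = void (*)();

// Hypothetical record: one registered init function and its order number.
struct InitEntry
{
    std::uint32_t order;
    InitFunction function;
};

// Sentinel for "no order given"; it sorts after every real order number.
constexpr std::uint32_t Unordered = std::numeric_limits<std::uint32_t>::max();

// Calls ordered functions by ascending order number, then the unordered ones.
inline void run_hardware_init(std::vector<InitEntry> entries)
{
    std::stable_sort(entries.begin(), entries.end(),
                     [](const InitEntry& a, const InitEntry& b)
                     { return a.order < b.order; });
    for (const InitEntry& entry : entries)
        entry.function();
}
```

In this model an entry with order 1000 runs before one with order 1001, and both run before any entry registered without an order.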
Interrupt Vector Table¶
The Cortex-M vector table (VTOR) is target-specific and generated using data
from modm-devices. The main stack pointer is allocated according to the
linkerscript and the Reset_Handler
is defined by the startup script.
All handlers are weakly aliased to Undefined_Handler, which is called if an
IRQ is enabled but no handler is defined for it. This default handler
determines the currently active IRQ, sets its priority to the lowest level,
disables the IRQ from firing again, and then asserts on nvic.undef
with the (signed) IRQ number as context.
The lowering of the priority is necessary, since the assertion handlers (see
modm:architecture:assert) are called from within this active IRQ and its
priority should not prevent logging functionality (which might require a UART
interrupt to flush data out) from working correctly.
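The signed IRQ number passed as context follows the CMSIS numbering, in which system exceptions are negative and device interrupts start at 0. The sketch below shows that conversion from the exception number held in the IPSR register. The function name irq_number_from_ipsr is hypothetical, and this is not necessarily how the default handler is written; on real hardware the input would come from the CMSIS intrinsic __get_IPSR().

```cpp
#include <cstdint>

// The IPSR register holds the active exception number in its low bits.
// Exception numbers 1..15 are system exceptions, 16 and up are device IRQs,
// so subtracting 16 yields the signed CMSIS IRQ number.
constexpr std::int32_t irq_number_from_ipsr(std::uint32_t ipsr)
{
    return static_cast<std::int32_t>(ipsr & 0x1FFu) - 16;
}

static_assert(irq_number_from_ipsr(16) == 0, "first device interrupt");
static_assert(irq_number_from_ipsr(15) == -1, "SysTick");
static_assert(irq_number_from_ipsr(3) == -13, "HardFault");
```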
Linkerscript¶
This module provides building blocks for GNU ld linkerscripts in the form of
Jinja macros that the modm:platform:core
module assembles into a
linkerscript, depending on the memory architecture of the target chosen.
The following macros are available:
- copyright(): Copyright notice.
- prefix(): Contains MEMORY sections, output format and entry symbol and stack size definitions.
- section_vector_rom(memory): places the read-only vector table into ROM memory.
- section_vector_ram(memory, table_copy): places the volatile vector table into RAM memory and adds it to the copy table. You must satisfy alignment requirements externally.
- section_load(memory, table_copy, sections): place each .{section} in sections into memory and add them to the copy table.
- section_stack(memory, start=None, suffix=""): place the main stack into memory after moving the location counter to start. suffix can be used to add multiple .stack{suffix} sections.
- section_heap(memory, name, placement=None, sections=[]): Add the noload sections to memory and fill up remaining space in memory with heap section .{name}. Argument placement can be used to place the section into a larger continuous section of which memory is just a subsection. The __{name}_end will be the maximum of the location counter and the memory section end address, so that previous sections will push this section back.
- all_heap_sections(table_copy, table_zero, table_heap, props={}): places the heap sections as described by cont_ram_regions of the linkerscript query. This also adds bss and noinit sections into each region. The props key can be used to override the default 0x001f memory properties.
- section_rom(memory): place all read-only sections (.text, .rodata etc) into memory.
- section_ram(memory, rom, table_copy, table_zero, sections_data=[], sections_bss=[], sections_noinit=[]): place all volatile sections (.data, .bss etc) into memory and load from rom. Additional sections can be added.
- section_tables(memory, copy, zero, heap): place the zero, copy and heap table into memory.
- section_rom_start(memory): place at ROM start.
- section_rom_end(memory): place at ROM end.
- section_debug(): place debug sections at the very end.
Please consult the modm:platform:core
documentation for the target-specific
arrangement of these section macros and for potential limitations that the
target's memory architecture poses.
Section .fastdata¶
The .fastdata
section is placed into a device specific data cache or into the
fastest RAM. Please note that the .fastdata
section may be placed into RAM
that is only accessible to the Cortex-M core (via the Data-Bus), which can cause
issues with DMA access. However, the .fastdata
section is not required to be
DMA-able and in such a case the developer needs to place the data into the
generic .data
section or choose a device with a DMA-able fast RAM.
Section .fastcode¶
The .fastcode
section is placed into a device specific instruction cache (via
I-Code bus) or into the fastest executable RAM (via S-Bus).
From the Cortex-M3 Technical Reference Manual:
14.5 System Interface:
The system interface is a 32-bit AHB-Lite bus. Instruction and vector fetches, and data and debug accesses to the System memory space, 0x20000000 - 0xDFFFFFFF, 0xE0100000 - 0xFFFFFFFF, are performed over this bus.
14.5.6 Pipelined instruction fetches:
To provide a clean timing interface on the System bus, instruction and vector fetch requests to this bus are registered. This results in an additional cycle of latency because instructions fetched from the System bus take two cycles. This also means that back-to-back instruction fetches from the System bus are not possible.
Note: Instruction fetch requests to the ICode bus are not registered. Performance critical code must run from the ICode interface.
Adding Sections¶
The default linkerscripts only describe the internal memory, however, they can
be extended for external memories using the linkerscript.*
collectors of this
module. For example, to add an external 16MB SDRAM to your device and place a
static data section there that is copied from flash and use the remainder for
heap access, these steps need to be performed:
Add the external SDRAM to the linkerscript's MEMORY
statements in the
project.xml
configuration:
<library>
<collectors>
<collect name="modm:platform:cortex-m:linkerscript.memory">
SDRAM (rwx) : ORIGIN = 0xC0000000, LENGTH = 16M
</collect>
</collectors>
</library>
You can also declare this as Python code in a lbuild module.lb
file (useful
for board support packages modules, see modm:board
):
env.collect(":platform:cortex-m:linkerscript.memory",
"SDRAM (rwx) : ORIGIN = 0xC0000000, LENGTH = 16M")
Add a partition of the new memory to the linkerscript's SECTIONS statement.
Since collector order is only preserved locally, make sure to add the sections
that depend on this order in one value. Here the previous value of the SDRAM
location counter is required to "fill up" the remaining memory with the external
heap section:
linkerscript_sections = """
.data_sdram :
{
__data_sdram_load = LOADADDR(.data_sdram);
__data_sdram_start = .;
*(.data_sdram)
. = ALIGN(4);
__data_sdram_end = .;
} >SDRAM AT >FLASH
.heap_sdram (NOLOAD) :
{
__heap_sdram_start = .;
. = ORIGIN(SDRAM) + LENGTH(SDRAM);
__heap_sdram_end = .;
} >SDRAM
"""
env.collect(":platform:cortex-m:linkerscript.sections", linkerscript_sections)
Next, add the sections that need to be copied from ROM to RAM. Here the contents
of the .data_sdram section are stored in the internal FLASH memory and need
to be copied into SDRAM during startup:
linkerscript_copy = """
LONG(__data_sdram_load)
LONG(__data_sdram_start)
LONG(__data_sdram_end)
"""
env.collect(":platform:cortex-m:linkerscript.table_extern.copy", linkerscript_copy)
And finally, to register the remaining memory in SDRAM with the allocator, add
the memory range to the heap table. Remember to use the correct memory traits
for this memory, see modm:architecture:memory
for the trait definitions:
linkerscript_heap = """
LONG(0x801f)
LONG(__heap_sdram_start)
LONG(__heap_sdram_end)
"""
env.collect(":platform:cortex-m:linkerscript.table_extern.heap", linkerscript_heap)
Linkerscript collectors are plain text
The collectors here only strip the leading/trailing whitespace and newlines and paste the result as is into the linkerscripts. No input validation is performed, so if you receive linker errors with your additions, please check the GNU LD documentation first.
Blocking Delay¶
The delay functions as defined by modm:architecture:delay are implemented via a
software loop (ARMv6-M devices) or the hardware cycle counter (via DWT->CYCCNT on
ARMv7-M devices). They have the following limitations, which are expressed in
cycles and therefore depend on the configured CPU frequency:
- nanosecond delay is implemented as a tight loop with a minimum delay of <20 cycles, a resolution of 1-4 cycles and a maximum delay limited by a 32-bit cycle count.
- microsecond delay has a maximum delay limited by a 32-bit cycle count.
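Because these limits are given in cycles, their meaning in time depends on the clock. The arithmetic below is an illustration only, with hypothetical helper names and an assumed 48 MHz CPU clock in the usage note; it is not part of the modm delay API.

```cpp
#include <cstdint>

// Number of whole CPU cycles that fit into `ns` nanoseconds (rounded down).
constexpr std::uint64_t cycles_for_ns(std::uint64_t ns, std::uint64_t cpu_hz)
{
    return ns * cpu_hz / 1000000000ull;
}

// Longest delay in microseconds whose cycle count still fits into 32 bits.
constexpr std::uint64_t max_delay_us(std::uint64_t cpu_hz)
{
    return 0xFFFFFFFFull * 1000000ull / cpu_hz;
}

static_assert(cycles_for_ns(1000, 48000000ull) == 48, "1us is 48 cycles at 48MHz");
static_assert(cycles_for_ns(100, 48000000ull) == 4, "100ns is only 4 whole cycles");
static_assert(max_delay_us(48000000ull) == 89478485ull, "about 89 seconds at 48MHz");
```

At the assumed 48 MHz, a requested 100 ns corresponds to about 4 cycles, which is below the minimum overhead quoted above, so very short nanosecond delays will take longer than requested.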
Compiler Options¶
This module adds these architecture specific compiler options:
- -mcpu=cortex-m{type}: the target to compile for.
- -mthumb: only Thumb2 instruction set is supported.
- -mfloat-abi={soft, softfp, hard}: the FPU ABI: hard is fastest.
- -mfpu=fpv{4, 5}-{sp}-d16: single or double precision FPU.
- -Wdouble-promotion: if SP-FPU, warn if FPs are promoted to doubles. Note that unless you use the .f suffix or explicitly cast floating point operations to float, floating point constants are of double type, whose storage can result in an increased binary size. While you can add the -fsingle-precision-constant compiler flag to implicitly cast all doubles to floats, this also impacts compile time computations and may reduce accuracy. Therefore it is not enabled by default and you should carefully watch for any unwanted numeric side effects if you use this compiler option. See Semantics of Floating Point Math in GCC.
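The promotion that -Wdouble-promotion reports can be checked at compile time. The sketch below uses only standard C++; the two helper functions are hypothetical examples and not part of modm.

```cpp
#include <type_traits>

// An unsuffixed constant is a double, so the multiplication is done in double.
static_assert(std::is_same<decltype(1.0f * 2.5), double>::value,
              "float * double promotes to double");

// With the f suffix the whole expression stays in single precision.
static_assert(std::is_same<decltype(1.0f * 2.5f), float>::value,
              "float * float stays float");

// Computed in double and converted back; reported by -Wdouble-promotion.
inline float scale_promoted(float x) { return x * 2.5; }

// Computed in single precision only.
inline float scale_single(float x) { return x * 2.5f; }
```

On a device with a single precision FPU, the first function falls back to software double arithmetic, while the second one can use the FPU directly.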
In addition, these linker options are added:
- -nostartfiles: modm implements its own startup script.
- -wrap,_{calloc, malloc, realloc, free}_r: reimplemented Newlib with our own allocator.
This module is only available for rp, sam, stm32.
Options¶
float-abi¶
Floating point ABI
This option is only available for sam{d5x/e5x,e7x/s7x/v7x,g5x}, stm32{f3,f4,f7,g4,h7,l4,l5,u5}.
Default: hard
Inputs: [hard, soft, softfp]
linkerscript.flash_offset¶
Offset of FLASH Section Origin
Add an offset to the default start address of the flash memory. This might be required for bootloaders located there.
Vector Table Relocation
Not all offsets are compatible with the vector table relocation.
Default: 0
Inputs: [0 ... 0x10000]
samd1x/d2x/dax, stm32{c0,f0,f1,f3,f4,f7,g0,l0,l1,l4}
Inputs: [0 ... 0x100000]
sam{d5x/e5x,e7x/s7x/v7x}, stm32{f4,f7,h7,l4,u5}
Inputs: [0 ... 0x1000000]
rp2040
Inputs: [0 ... 0x180000]
stm32f4
Inputs: [0 ... 0x2000]
stm32l0
Inputs: [0 ... 0x20000]
samd1x/d2x/dax, stm32{f0,f1,f2,f3,f4,g0,g4,h7,l0,l4,u5}
Inputs: [0 ... 0x200000]
same7x/s7x/v7x, stm32{f4,f7,h7,l4,u5}
Inputs: [0 ... 0x30000]
stm32l0
Inputs: [0 ... 0x4000]
stm32{c0,f0,f1,f3,g0,l0}
Inputs: [0 ... 0x40000]
sam{d1x/d2x/dax,d5x/e5x}, stm32{f0,f1,f2,f3,f4,f7,g0,g4,l1,l4,l5,u5}
Inputs: [0 ... 0x400000]
stm32u5
Inputs: [0 ... 0x60000]
stm32{f3,l1}
Inputs: [0 ... 0x8000]
samd1x/d2x/dax, stm32{c0,f0,f3,g0,g4,l0,l1}
Inputs: [0 ... 0x80000]
sam{d5x/e5x,e7x/s7x/v7x,g5x}, stm32{f2,f3,f4,f7,g0,g4,h7,l1,l4,l5,u5}
linkerscript.flash_reserved¶
Add a reserved section at the end of the flash.
Default: 0
Inputs: [0 ... 0x10000]
samd1x/d2x/dax, stm32{c0,f0,f1,f3,f4,f7,g0,l0,l1,l4}
Inputs: [0 ... 0x100000]
sam{d5x/e5x,e7x/s7x/v7x}, stm32{f4,f7,h7,l4,u5}
Inputs: [0 ... 0x1000000]
rp2040
Inputs: [0 ... 0x180000]
stm32f4
Inputs: [0 ... 0x2000]
stm32l0
Inputs: [0 ... 0x20000]
samd1x/d2x/dax, stm32{f0,f1,f2,f3,f4,g0,g4,h7,l0,l4,u5}
Inputs: [0 ... 0x200000]
same7x/s7x/v7x, stm32{f4,f7,h7,l4,u5}
Inputs: [0 ... 0x30000]
stm32l0
Inputs: [0 ... 0x4000]
stm32{c0,f0,f1,f3,g0,l0}
Inputs: [0 ... 0x40000]
sam{d1x/d2x/dax,d5x/e5x}, stm32{f0,f1,f2,f3,f4,f7,g0,g4,l1,l4,l5,u5}
Inputs: [0 ... 0x400000]
stm32u5
Inputs: [0 ... 0x60000]
stm32{f3,l1}
Inputs: [0 ... 0x8000]
samd1x/d2x/dax, stm32{c0,f0,f3,g0,g4,l0,l1}
Inputs: [0 ... 0x80000]
sam{d5x/e5x,e7x/s7x/v7x,g5x}, stm32{f2,f3,f4,f7,g0,g4,h7,l1,l4,l5,u5}
linkerscript.override¶
Path to project provided linkerscript
Default: []
Inputs: [Path]
main_stack_size¶
Minimum size of the application main stack
The ARM Cortex-M uses a descending stack mechanism which is placed so that it grows towards the beginning of RAM. In case of a stack overflow the hardware then attempts to stack into invalid memory which triggers a HardFault. A stack overflow will therefore never overwrite any static or heap memory and this protection works without the MPU and therefore also on ARM Cortex-M0 devices.
If the vector table is relocated into RAM, the start address needs to be aligned to the next highest power-of-two word depending on the total number of device interrupts. On devices where the table is relocated into the same memory as the main stack, an alignment buffer up to 1kB is added to the main stack.
| ... |
|---------------------------------|
| Interrupt Vectors (in RAM) |
| (if re-mapped) | <-- vector table origin
|---------------------------------| <-- main stack top
| Main Stack |
| (grows downwards) |
| | |
| v |
|---------------------------------|
| Alignment buffer for vectors |
| (overwritten by main stack!) |
'---------------------------------' <-- RAM origin
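The size of the alignment buffer follows from the power-of-two rule described above. The sketch below assumes that the table holds 16 system vectors plus the device interrupts, each 4 bytes wide, and that the alignment is the table size rounded up to a power of two. The function names are hypothetical, and any architectural minimum alignment is not modelled.

```cpp
#include <cstdint>

// Smallest power of two that is greater than or equal to `value`.
constexpr std::uint32_t next_power_of_two(std::uint32_t value, std::uint32_t power = 1)
{
    return power >= value ? power : next_power_of_two(value, power * 2);
}

// Alignment in bytes of a RAM vector table for a device whose `highest_irq`
// (highest IRQ number + 1) device interrupts follow the 16 system vectors.
constexpr std::uint32_t vector_table_alignment(std::uint32_t highest_irq)
{
    return next_power_of_two((highest_irq + 16u) * 4u);
}

static_assert(vector_table_alignment(32) == 256, "48 vectors use 192 bytes");
static_assert(vector_table_alignment(82) == 512, "98 vectors use 392 bytes");
static_assert(vector_table_alignment(240) == 1024, "256 vectors use 1024 bytes");
```

With 240 device interrupts the alignment reaches 1024 bytes, which matches the upper bound of the alignment buffer mentioned above.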
Default: 3Ki (3072)
Inputs: [256 .. 3Ki .. 64Ki]
vector_table_location¶
Vector table location in ROM or RAM
The vector table is always stored in ROM and copied to RAM by the startup script if required. You can modify the RAM vector table using the CMSIS NVIC functions:
void NVIC_SetVector(IRQn_Type IRQn, uint32_t vector)
uint32_t NVIC_GetVector(IRQn_Type IRQn)
For applications that do not modify the vector table at runtime, relocation to RAM is not necessary and can save a few hundred bytes of static memory.
By default, the fastest option is chosen depending on the target memory architecture. This does not always mean the table is copied into RAM, and therefore may not be modifiable with this option!
From the ARM Cortex-M4 Technical Reference Manual on exception handling:
- Processor state is automatically stored to the stack on an exception, and automatically restored from the stack at the end of the Interrupt Service Routine.
- The vector is fetched in parallel to the state saving, enabling efficient interrupt entry.
On Interrupt Latency
Placing the main stack and vector table into the same memory can significantly increase interrupt latency, since both the I-Code and D-Code memory interfaces need to fetch from the same access port.
This option is only available for rp, sam, stm32{c0,f1,f2,f3,f4,f7,g0,g4,h7,l0,l1,l4,l5,u5}.
Default: ram
stm32{f3,f7,g4,h7,l4}
Default: rom
rp, sam, stm32{c0,f1,f2,f3,f4,g0,l0,l1,l4,l5,u5}
Inputs: [ram, rom]
Collectors¶
linkerscript.memory¶
Additions to the linkerscript's 'MEMORY'
Inputs: [String]
linkerscript.sections¶
Additions to the linkerscript's 'SECTIONS'
Inputs: [String]
linkerscript.table_extern.copy¶
Additions to the linkerscript's '.table.copy.extern' section
Inputs: [String]
linkerscript.table_extern.heap¶
Additions to the linkerscript's '.table.heap' section
Inputs: [String]
linkerscript.table_extern.zero¶
Additions to the linkerscript's '.table.zero.extern' section
Inputs: [String]
Queries¶
linkerscript¶
Computes linkerscript properties (* post-build only):
- vector_table_location: ram or rom
Stripped and newline-joined collector values of:
- linkerscript_memory
- linkerscript_sections
- linkerscript_extern_zero
- linkerscript_extern_copy
- linkerscript_extern_heap
Additional memory properties:
- memories: unfiltered memory regions
- ram_regions: memory region names with ram in their names
- regions: memory region names
- cont_ram_regions: all continuous internal SRAM sections
- cont_ram: largest continuous internal SRAM section
:returns: dictionary of linkerscript properties
vector_table¶
Computes vector table properties:
- vector_table: [position] = Full vector name (ie. with _Handler or _IRQHandler suffix)
- vector_table_location: rom or ram
- vector_table_size: in Bytes
- highest_irq: highest IRQ number + 1
- core: cortex-m{0,3,4,7}{,+,f,fd}
- exception_frame_size: in Bytes.
The system vectors start at -16, so you must add 16 to highest_irq to get
the total number of vectors in the table!
:returns: a dictionary of vector table properties
Dependencies¶
Limited availability: Check with 'lbuild discover' if this module is available for your target!