Saturday, January 1, 2011

Introduction to ARM Cortex-M3 Part 2-Programming

Welcome to the second part of the Introduction to ARM Cortex-M3. In part 1 we went through the core features of the Cortex-M3 and the LPC1768. In this part we will focus on programming the LPC1768, covering the following points:
  • Toolchain overview
  • Library Tweaks
  • Hardware Interfaces
  • Software Stacks
We have a lot to cover, so let's get started...

Toolchain overview
The toolchain of choice is CodeSourcery, a GNU-based ARM toolchain developed in partnership with ARM. It's freely available both as source and pre-compiled, and it uses the embedded C library newlib (maintained by Red Hat) as the standard C library.

We're almost good to go; however, we still need drivers. Fortunately, NXP provides a nice driver library for the LPC1768. The library is based on the Cortex Microcontroller Software Interface Standard (CMSIS), developed by ARM as an abstraction layer for the core, and it comes with startup code, system initialization code, a linker script, drivers for all the peripherals, plus many examples.

Library Tweaks
I tweaked the Makefile a little bit to build the drivers, startup and system code into a single library, then installed this library and its headers in /usr/local/lpc17xx. I think this greatly simplifies my Makefiles, since I only need to link one library.

I also added some common initialization code, mostly to system_LPC17xx.c, to avoid duplicating it in every project. Finally, I tweaked the linker script a little bit. Let's have a detailed look at this code.

First order of business is to enable logging. printf, puts and similar routines found in libc eventually call the low-level function _write. Since _write is platform dependent, newlib cannot provide a useful implementation of it, so you need to provide your own to redirect output somewhere. For example, the following defines _write to redirect output to UART0:
#define stduart ((LPC_UART_TypeDef *)LPC_UART0_BASE)

/*called by libc for all output; fd is ignored, everything goes to UART0*/
int _write(int fd, const void *buf, uint32_t nbyte)
{
    UART_Send(stduart, (uint8_t *)buf, nbyte, BLOCKING);
    return nbyte;
}
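As a variation on the idea, here is a minimal host-side sketch of a write hook that only accepts stdout/stderr and expands newlines to CR+LF, which most serial terminals expect. The names log_write, uart_putc, tx_log and tx_len are hypothetical and exist only for this illustration; on the target, uart_putc would hand the byte to the UART driver instead of a buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capture buffer standing in for the UART transmitter. */
static uint8_t tx_log[64];
static size_t  tx_len;

/* Hypothetical stub: on target this would push one byte to the UART. */
static void uart_putc(uint8_t c)
{
    if (tx_len < sizeof(tx_log))
        tx_log[tx_len++] = c;
}

/* Same shape as _write: returns the number of bytes consumed, or -1. */
int log_write(int fd, const void *buf, uint32_t nbyte)
{
    const uint8_t *p = buf;

    if (fd != 1 && fd != 2)      /* only stdout and stderr are routed */
        return -1;

    for (uint32_t i = 0; i < nbyte; i++) {
        if (p[i] == '\n')
            uart_putc('\r');     /* expand LF to CR+LF */
        uart_putc(p[i]);
    }
    return (int)nbyte;
}
```

Returning the original byte count (not the expanded one) matters, because libc uses the return value to decide how much of its buffer was consumed.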
Almost all the code I see, regardless of the embedded platform, needs some sort of delay/sleep function to wait on some event or just waste time. There are two ways to implement delays: one is to use loops of instructions that have a known execution time, which is highly inaccurate and can be interrupted; the other is to use a hardware counter.

The SysTick timer counts down from the value loaded into its reload register until it reaches 0, at which point it asserts the SysTick IRQ. The value is then reloaded and the timer counts down again, and so on.
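To make the reload arithmetic concrete, here is a small sketch that computes the reload register value for a given tick rate. The function name systick_reload is hypothetical. It assumes the counter is 24 bits wide, as on the Cortex-M3, and returns the raw register value; as far as I know, CMSIS's SysTick_Config takes the number of cycles and subtracts one itself.

```c
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu  /* SysTick is a 24-bit counter */

/* Returns the reload value for the requested tick rate,
 * or 0 if the period does not fit in 24 bits or the inputs are invalid. */
uint32_t systick_reload(uint32_t core_hz, uint32_t tick_hz)
{
    if (tick_hz == 0 || core_hz <= tick_hz)
        return 0;

    uint32_t cycles = core_hz / tick_hz;   /* clock cycles per tick */
    if (cycles - 1 > SYSTICK_MAX_RELOAD)
        return 0;

    return cycles - 1;   /* counting reload, ..., 1, 0 takes reload + 1 cycles */
}
```

With a 100 MHz core clock a 1 ms tick needs a reload value of 99999, while a 1 s tick would not fit in the counter at all.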

The following example demonstrates using the SysTick timer for delays. The system tick count is kept in sys_ticks. When sleep is called, the current value of sys_ticks is saved, and we keep subtracting this saved value from the current tick count until the difference is equal to or greater than the required delay:
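volatile uint32_t sys_ticks;
void SysTick_Handler(void)__attribute__((weak));

void SysTick_Handler(void)
{
    ++sys_ticks; /*one tick per SysTick interrupt*/
}

void sleep(uint32_t ms)
{
    uint32_t curr_ticks = sys_ticks;
    while ((sys_ticks - curr_ticks) < ms);
}

int main(void)
{
    /*Setup SysTick to interrupt every 1 ms*/
    /*SysTick_Config takes the cycle count and subtracts 1 itself*/
    SysTick_Config(SystemCoreClock / 1000);

    while (1); /*application code goes here*/
}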
A few things to note here. First, SysTick_Handler is declared weak: when another, non-weak definition of the same symbol exists, the linker uses that one instead. So basically it can be overridden by the application; later you will see that I redefine it in the RTOS scheduler.

Second, setting the timer to interrupt every 1 ms gives reasonable resolution but not necessarily the best throughput, since every interrupt takes cycles away from the application. You may wish to lower the interrupt rate depending on your application, or disable it altogether, but keep in mind that this affects your timer's resolution.
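One detail of the sleep loop worth spelling out: comparing the difference of two unsigned tick values keeps working when the 32-bit counter wraps around. A minimal sketch (delay_elapsed is a hypothetical helper name, not part of any library):

```c
#include <stdint.h>
#include <stdbool.h>

/* True once at least `ms` ticks have passed since `start`.
 * Unsigned subtraction wraps modulo 2^32, so this stays correct
 * when the tick counter overflows between `start` and `now`. */
bool delay_elapsed(uint32_t now, uint32_t start, uint32_t ms)
{
    return (uint32_t)(now - start) >= ms;
}
```

Comparing absolute values instead (for example now >= start + ms) would break around the wrap, which with a 1 ms tick happens roughly every 49.7 days.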

More RAM
We mentioned in part 1 that the LPC17xx has a second 32K block of SRAM for the Ethernet and USB controllers. However, if you're not using either, you may wish to use this extra memory in your application. You can do so by accessing the memory directly at 0x2007C000 or, more neatly, by defining a new memory region and a new section in the linker script:
/*linker script*/
MEMORY
{
  rom (rx)   : ORIGIN = 0x00000000, LENGTH = 512K
  ram (rwx)  : ORIGIN = 0x10000000, LENGTH = 32K
  ram2 (rwx) : ORIGIN = 0x2007C000, LENGTH = 32K   /*define memory region*/
}

SECTIONS
{
  .ram2 : /*define section*/
  {
    *(.ram2)
  } >ram2
  .text : /*other sections*/
}
And then using the gcc section attribute to place stuff into this section:
/*place buffer in section .ram2*/
uint8_t buffer[BUFFER_SIZE] __attribute__ ((section (".ram2")))={0};
Hardware Interfaces
Next, we will look at initializing and using some of the common hardware interfaces available on the LPC1768, using the NXP driver library.

The serial interface is the most common interface out there. We've seen how to use _write to redirect output to the UART; now we look at initializing it. The following is an excerpt from my SystemInit function:
#define stduart ((LPC_UART_TypeDef *)LPC_UART0_BASE)

void SystemInit()
{
  /*some init code here*/

  PINSEL_CFG_Type pin_cfg={ /*pinsel config*/
      .Funcnum      = 1,
      .Portnum      = 0,
      .Pinmode      = PINSEL_PINMODE_PULLUP,
      .OpenDrain    = PINSEL_PINMODE_NORMAL,
  };

  UART_CFG_Type uart_cfg={ /*UART config*/
      .Baud_rate    = 57600,
      .Databits     = UART_DATABIT_8,
      .Parity       = UART_PARITY_NONE,
      .Stopbits     = UART_STOPBIT_1,
  };

  /*setup tx*/
  pin_cfg.Pinnum  = 2;
  PINSEL_ConfigPin(&pin_cfg);

  /*setup rx*/
  pin_cfg.Pinnum  = 3;
  PINSEL_ConfigPin(&pin_cfg);

  /*Initialize uart*/
  UART_Init(stduart, &uart_cfg);
  UART_TxCmd(stduart, ENABLE);
}
More advanced things can be done with the UART, of course. For example, there are 16-byte RX/TX hardware FIFOs that can be set to trigger an interrupt or a DMA transfer at a certain level; you can find more about this in the datasheet.
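As a side note on the Baud_rate setting: the driver library works out the divisor for you, but the basic arithmetic is easy to check by hand. The sketch below assumes a 16x-oversampling UART and ignores the LPC17xx fractional divider; the function names are hypothetical, and the 25 MHz peripheral clock used in the note after it is an assumption (CCLK/4 at 100 MHz).

```c
#include <stdint.h>

/* Nearest integer divisor for a 16x-oversampling UART
 * (the fractional divider is ignored in this sketch). */
uint32_t uart_divisor(uint32_t pclk_hz, uint32_t baud)
{
    uint32_t step = 16u * baud;
    return (pclk_hz + step / 2) / step;
}

/* Baud rate actually produced by a given divisor. */
uint32_t uart_actual_baud(uint32_t pclk_hz, uint32_t divisor)
{
    return pclk_hz / (16u * divisor);
}
```

For a 25 MHz PCLK and 57600 baud this gives a divisor of 27 and an actual rate of about 57870 baud, an error of roughly 0.5%, which is well within what a UART tolerates.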

Note that when you mount the mbed you may need to set the baud rate of /dev/ttyACMx using the following command:
stty -F /dev/ttyACM0 speed 57600
SPI is a full-duplex serial interface that uses four wires for data transfer: Master In Slave Out (MISO), Master Out Slave In (MOSI), Serial Clock (SCLK) and Slave Select (SSEL).

Slave Select, or Chip Select (CS), is used to select the active slave when multiple slaves are present on the bus. A few things about SPI are worth mentioning. First, both end points have shift registers: when one is loaded with a byte and shifted, the other gets shifted too, i.e. the two bytes are exchanged. So in order to read a byte you must write one too; when writing, however, you may ignore the received byte.
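The exchange can be simulated on a host in a few lines. This is only an illustrative model (spi_exchange is a hypothetical name) that assumes 8-bit frames sent MSB first:

```c
#include <stdint.h>

/* Clock 8 bits, MSB first: each side shifts out its top bit
 * and shifts in the bit presented by the other side. */
void spi_exchange(uint8_t *master, uint8_t *slave)
{
    for (int i = 0; i < 8; i++) {
        uint8_t mosi = (*master >> 7) & 1u;  /* bit leaving the master */
        uint8_t miso = (*slave  >> 7) & 1u;  /* bit leaving the slave  */
        *master = (uint8_t)((*master << 1) | miso);
        *slave  = (uint8_t)((*slave  << 1) | mosi);
    }
}
```

After eight clocks the two registers have swapped contents, which is why a read always involves sending a dummy byte.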

Second, there are the clock polarity (CPOL) and the clock phase (CPHA). The clock polarity determines the idle state of the clock: if CPOL = 0 the clock is low when idle and high when active; if CPOL = 1 the clock is high when idle and low when active. The clock phase determines the edge at which the data is sampled: for CPHA = 0 the data is sampled on the first edge, for CPHA = 1 on the second edge.

The four combinations of CPOL and CPHA are called SPI modes, and you need to select a mode when configuring SPI, depending on your hardware. Note that if CPOL = 0 and CPHA = 0, the clock transitions from low to high when active and the data is sampled on the rising edge of the clock. This is the same as when CPOL = 1 and CPHA = 1, because data will still be sampled on the rising edge of the clock.

A real-life example is the SSD1339 OLED driver, which samples data on the rising edge of the clock; that is, two modes work equally well: CPOL = 0/CPHA = 0 and CPOL = 1/CPHA = 1.
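The relationship between mode and sampling edge can be written down directly. A small sketch with hypothetical helper names, using the conventional mode numbering mode = (CPOL << 1) | CPHA:

```c
#include <stdint.h>
#include <stdbool.h>

/* Conventional mode number: mode = (CPOL << 1) | CPHA. */
uint8_t spi_mode(uint8_t cpol, uint8_t cpha)
{
    return (uint8_t)(((cpol & 1u) << 1) | (cpha & 1u));
}

/* Data is sampled on a rising edge when CPOL and CPHA are equal
 * (modes 0 and 3), and on a falling edge otherwise (modes 1 and 2). */
bool spi_samples_on_rising(uint8_t cpol, uint8_t cpha)
{
    return (cpol & 1u) == (cpha & 1u);
}
```

So a device that samples on the rising edge can be driven in mode 0 or mode 3; the two differ only in the idle level of the clock.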

Finally, note that the maximum frequency you can set SPI to, according to the datasheet, is 1/8 of the peripheral clock (PCLK) selected in PCLKSEL0/1. Setting PCLK_SPI to 01b selects the CPU clock (CCLK) as the peripheral clock, and so f = CCLK/8 = 100 MHz/8, that is 12.5 MHz.
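The clock arithmetic can be checked with a short sketch. The function names are hypothetical, and the PCLKSEL encoding in the table is my reading of the LPC17xx user manual (00b = CCLK/4, 01b = CCLK, 10b = CCLK/2, 11b = CCLK/8), so verify it against your copy before relying on it:

```c
#include <stdint.h>

/* Peripheral clock for a 2-bit PCLKSEL field.
 * Assumed encoding: 00b = CCLK/4, 01b = CCLK, 10b = CCLK/2, 11b = CCLK/8. */
uint32_t pclk_hz(uint32_t cclk_hz, uint8_t pclksel)
{
    static const uint8_t divider[4] = { 4, 1, 2, 8 };
    return cclk_hz / divider[pclksel & 3u];
}

/* Maximum SPI bit rate: 1/8 of the peripheral clock. */
uint32_t spi_max_hz(uint32_t cclk_hz, uint8_t pclksel)
{
    return pclk_hz(cclk_hz, pclksel) / 8u;
}
```

With a 100 MHz CCLK, selecting PCLK = CCLK gives 12.5 MHz, while the reset default of CCLK/4 gives only 3.125 MHz.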

Now for an example of initializing and using SPI. Note that the legacy SPI interface is replaced by SSP, which supports SPI among other protocols:
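void ssp_init()
{
    PINSEL_CFG_Type pin_cfg={
        .Portnum    = SSP_PORT,
        .Pinmode    = PINSEL_PINMODE_PULLUP,
        .OpenDrain  = PINSEL_PINMODE_NORMAL,
    };

    SSP_CFG_Type  ssp_cfg ={
        .CPHA           = SSP_CPHA_FIRST,
        .CPOL           = SSP_CPOL_HI,
        .ClockRate      = SSP_CLCK,
        .Databit        = SSP_DATABIT_8,
        .Mode           = SSP_MASTER_MODE,
        .FrameFormat    = SSP_FRAME_SPI
    };

    /*SSP PINSEL configuration*/
    pin_cfg.Funcnum = 2;
    pin_cfg.Pinnum = SSP_MISO;
    PINSEL_ConfigPin(&pin_cfg);

    pin_cfg.Pinnum = SSP_MOSI;
    PINSEL_ConfigPin(&pin_cfg);

    pin_cfg.Pinnum = SSP_SCLK;
    PINSEL_ConfigPin(&pin_cfg);

    pin_cfg.Pinnum = SSP_SSEL;
    PINSEL_ConfigPin(&pin_cfg);

    /*initialize and enable SSP*/
    SSP_Init(SSP_BASE, &ssp_cfg);
    SSP_Cmd(SSP_BASE, ENABLE);
}

uint8_t ssp_read()
{
    while(SSP_GetStatus(SSP_BASE, SSP_STAT_BUSY));
    SSP_SendData(SSP_BASE, 0xFF); /*send a dummy byte to clock one in*/
    while(SSP_GetStatus(SSP_BASE, SSP_STAT_BUSY)); /*wait for the exchange to finish*/
    return SSP_ReceiveData(SSP_BASE);
}

void ssp_write(uint8_t c)
{
    while(SSP_GetStatus(SSP_BASE, SSP_STAT_BUSY));
    SSP_SendData(SSP_BASE, c);
    while(SSP_GetStatus(SSP_BASE, SSP_STAT_BUSY));
    (void)SSP_ReceiveData(SSP_BASE); /*discard the byte shifted in*/
}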
I2C is a half-duplex, low-speed serial interface that uses two wires: Serial Data Line (SDA) and Serial Clock Line (SCL). Both lines are open-collector. An open-collector pin can only pull the signal line low (sink), so the bus is active low and requires a pull-up resistor to keep the line high in the idle state.
I2C doesn't use a chip select; instead, each slave on the bus has a unique 7-bit address, and it responds only to that address. The following excerpt is from the bma180 accelerometer driver I wrote, demonstrating the use of I2C:
int8_t bma180_read_id(); /*defined below*/

void bma180_init()
{
    PINSEL_CFG_Type pin_cfg={
        .Funcnum    = 3,
        .Portnum    = BMA180_I2C_PORT, /*port0 I2C1*/
        .Pinmode    = PINSEL_PINMODE_PULLUP,
        .OpenDrain  = PINSEL_PINMODE_NORMAL
    };

    pin_cfg.Pinnum = BMA180_I2C_SDA;
    PINSEL_ConfigPin(&pin_cfg);
    pin_cfg.Pinnum = BMA180_I2C_SCL;
    PINSEL_ConfigPin(&pin_cfg);

    I2C_Init(BMA180_I2C_BASE, BMA180_I2C_CLCK);
    I2C_Cmd(BMA180_I2C_BASE, ENABLE);

    printf("bma180 chip id %d\n", bma180_read_id());
}

/*reads chip id*/
int8_t bma180_read_id()
{
    int8_t buf = 0x00; /*register address going out, register value coming back*/

    I2C_M_SETUP_Type tfr_cfg = {
        .tx_data     = (uint8_t *)&buf,
        .tx_length   = sizeof(buf),
        .rx_data     = (uint8_t *)&buf,
        .rx_length   = sizeof(buf),
        .sl_addr7bit = BMA180_I2C_ADDR,
        .retransmissions_max = 3,
    };

    /*write register address/read register value*/
    I2C_MasterTransferData(BMA180_I2C_BASE, &tfr_cfg, I2C_TRANSFER_POLLING);

    return buf;
}
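Regarding the 7-bit slave address mentioned above: on the wire it is sent as the upper seven bits of the first byte after the START condition, with the read/write flag in bit 0. The driver library takes care of this when you fill in sl_addr7bit, but a small sketch makes the layout explicit. The helper name and the example address used in the note below are hypothetical.

```c
#include <stdint.h>

#define I2C_WRITE 0u
#define I2C_READ  1u

/* Build the first byte of an I2C transfer:
 * bits 7..1 carry the 7-bit slave address, bit 0 the R/W flag. */
uint8_t i2c_addr_byte(uint8_t addr7, uint8_t rw)
{
    return (uint8_t)(((addr7 & 0x7Fu) << 1) | (rw & 1u));
}
```

For a slave at 7-bit address 0x40 this yields 0x80 for a write and 0x81 for a read, which is why some datasheets quote "8-bit addresses" that are twice the 7-bit value.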
The LPC1768 has a full-fledged 10/100 Ethernet controller with WOL and other capabilities. Due to its length I did not include the example here; however, it's available in the sources section.

Software Stacks
Nice, so we now have a working toolchain and drivers, and we know a bit about how to initialize and use some common hardware interfaces. Next we look at some essential software stacks.

FatFS is a FAT filesystem implementation that is abstracted from the underlying hardware layer. That is, to use it you'll need to provide your own low-level disk initialization and I/O routines, which will eventually be called by the FatFS library to read/write sectors from a disk, or from whatever medium you choose to store your data.

I personally use an SD card (SDC) for my projects. You can find an SDC driver online and port it, or you can use mine, which you can find below in the sources section. The driver is already integrated with FatFS and tested, so it should save you some time.

If you decide to write your own driver, or just want to know how SDCs work, I highly recommend the tutorial on the FatFS homepage along with the SD Simplified Physical Layer Spec.

Once you have managed the low-level I/O, you'll find that FatFS has a really familiar and easy-to-use interface. Here's a small example:
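To give a feel for what "low-level I/O routines" means here, below is a minimal sketch of sector-based read/write backed by a block of RAM. It only mirrors the general shape of such routines; the names ramdisk_read and ramdisk_write, the 512-byte sector size and the 8-sector capacity are assumptions for illustration, not the FatFS glue API itself.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512u
#define SECTOR_COUNT 8u

/* Backing store for the sketch: 8 sectors of RAM. */
static uint8_t ramdisk[SECTOR_COUNT * SECTOR_SIZE];

/* Copy `count` sectors starting at `sector` into `buf`.
 * Returns 0 on success, -1 if the range is outside the disk. */
int ramdisk_read(uint8_t *buf, uint32_t sector, uint32_t count)
{
    if (sector >= SECTOR_COUNT || count > SECTOR_COUNT - sector)
        return -1;
    memcpy(buf, &ramdisk[sector * SECTOR_SIZE], (size_t)count * SECTOR_SIZE);
    return 0;
}

/* Copy `count` sectors from `buf` to the disk starting at `sector`.
 * Returns 0 on success, -1 if the range is outside the disk. */
int ramdisk_write(const uint8_t *buf, uint32_t sector, uint32_t count)
{
    if (sector >= SECTOR_COUNT || count > SECTOR_COUNT - sector)
        return -1;
    memcpy(&ramdisk[sector * SECTOR_SIZE], buf, (size_t)count * SECTOR_SIZE);
    return 0;
}
```

A real SD card driver replaces the memcpy calls with block read/write commands over SPI, but the contract toward the filesystem stays the same: whole sectors in, whole sectors out, plus an error code.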
#include <stdio.h>
#include <string.h> /*strlen*/
#include "ff.h"
int main()
{
    FIL     fp;
    FATFS   ffs;
    UINT    len;
    const char *path = "0:test.txt";
    const char *text = "SDC Test";

    f_mount(0, &ffs);   /*mount ffs work area*/
    f_open(&fp, path, FA_WRITE|FA_CREATE_ALWAYS);
    f_write(&fp, text, strlen(text), &len);
    f_close(&fp);       /*flush cached data and close the file*/
    return 0;
}
FreeRTOS is an open-source real-time OS. A few things are worth mentioning when using it. First, the choice of heap allocator: FreeRTOS comes with 3 allocators. The first one doesn't free memory; the second allows freeing memory but doesn't coalesce free blocks, so it is prone to fragmentation; the third and last one is just a wrapper around malloc/free. The last is the one you should use, unless you want FreeRTOS and libc both poking holes in the heap.

The second issue is the SysTick_Handler. It's not really an issue: since we declared it as weak, we can override it here. Simply change xPortSysTickHandler to SysTick_Handler.

Finally, we can't use our sleep function anymore because a) sys_ticks is not updated anymore and b) the RTOS needs to know about tasks that yield the processor, i.e. go to sleep, so it can schedule other tasks. So I define the following macro in FreeRTOS.h (note that vTaskDelay counts in ticks, so this assumes a 1 ms tick):
#define sleep(ms) vTaskDelay(ms) 
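Since vTaskDelay takes a number of ticks rather than milliseconds, the macro above is only exact when the scheduler tick is 1 ms. If your tick rate differs, the conversion is a one-liner; the sketch below uses a hypothetical helper name and takes the tick rate as a plain parameter, so that it stays independent of any FreeRTOS headers.

```c
#include <stdint.h>

/* Convert milliseconds to scheduler ticks, rounding up so that a
 * short non-zero delay never collapses to zero ticks. */
uint32_t ms_to_ticks(uint32_t ms, uint32_t tick_rate_hz)
{
    return (uint32_t)(((uint64_t)ms * tick_rate_hz + 999u) / 1000u);
}
```

With a 100 Hz tick, for example, 500 ms is 50 ticks, and a 1 ms request is rounded up to one full tick.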
This is a small FreeRTOS example that schedules two tasks:
#include <stdio.h>
#include "FreeRTOS.h"
#include "task.h"

void task_one(void *args)
{
  while (1) {
    /*task work goes here*/
    sleep(1000);
  }
}

void task_two(void *args)
{
  while (1) {
    /*task work goes here*/
    sleep(1000);
  }
}

void vApplicationStackOverflowHook(xTaskHandle *pxTask, signed char *pcTaskName)
{
    printf("task %s stack overflow\n", (char *)pcTaskName);
}

int main(void)
{
    xTaskCreate(task_one,   /*task function              */
                "task_one", /*task name                  */
                256,        /*task stack size in words 1k*/
                NULL,       /*task parameters            */
                1,          /*task priority              */
                NULL);      /*task handle                */
    xTaskCreate(task_two,   /*task function              */
                "task_two", /*task name                  */
                256,        /*task stack size in words 1k*/
                NULL,       /*task parameters            */
                2,          /*task priority              */
                NULL);      /*task handle                */

    vTaskStartScheduler();  /*hand control to the scheduler*/
    return 0;
}
uIP is an open-source embedded TCP/IP stack. uIP implements ARP, SLIP, IP, TCP, UDP and ICMP, and provides two APIs: a BSD-socket-like API and an event-based API. In the sources section you will find a bare-minimum example of an ARP-enabled echo server using the event-based API.

A very powerful MCU like the LPC1768 opens up a lot of possibilities, and I hope this introduction is enough to help you explore them. Any feedback is more than welcome, and thank you.

FatFs-SDC driver
hg clone
uIP echo server and client
hg clone
Sparkfun OLED driver
hg clone
The LPC1768 datasheet 
The Definitive Guide to the ARM Cortex-M3
ARM System Developer's Guide: Designing and Optimizing System Software
Designing Embedded Hardware


  1. Thank you so much to document it, it is really helpful for me.

    Actually, I am totally new to this area. What I am doing to porting a program to mbed but not using mbed library. We choose codesoucery and lpc1768 library from NXP, the headache issue we are facing is "printf()". I have struggled on the issue for 2 days, but still no result.

    I followed your implementation about _write() function and add initialization of UART to SystemInit(), but it seems that printf() still not work. I totally have no idea, could please suggest me?

  2. Jin,

    It's hard to tell without seeing the code. However, are you sure your toolchain works to begin with? Can you flash one of the LEDs on the mbed first? If not, then check your linker script; I had many issues with my linker script when I started.

  3. Hi Mux,

    Thanks for your quick response.
    I use LPC17xx.ld copied from from( and it is ok to flash a led.

    I get NXP library from, but there is no LD file in it and sth in makefile is for DOS env. I modified the related things and make it build.

    You mentioned about "LPC1678.ld" in the document, but it can be find in my nxp library package. Anything wrong here?

  4. sorry, correct this in my previous posting.

    You mentioned about "LPC1678.ld" in the document, but it can NOT be find in my nxp library package. Anything wrong here?

  5. Jin,

    I used CodeSourcery's linker script, but modified it a little bit, you can find it here:

  6. hi mux,

    Thanks so much!
    "_write()" do works with your lpc1768.ld. It is really great helpful to us. Thanks again.

    I tested printf() also, it seems not work yet. Is "_write()" be called by printf() internally? If _write() works, will printf() works too?

  7. Jin,

    Yes, printf calls _write to do the actual output. If _write is working and printf is not, then it's probably using libc's _write and not your override; make sure you copy _write exactly as in the example.

  8. hi Mux,

    I directly include _write() in the main.c and remove -lcs3unhosted in the GROUP(...) of lpc1768.ld.
    Compilation is ok and it is no problem to call _write(0, "hello", 5), but printf() seems hang or corrupt.

    Is it ok to directly include _write() in the my test program?

  9. Jin,

    yes it's okay I guess, If you have the toolchain hosted somewhere, I could take a look at it.

  10. hi Mux,

    Thanks so much. I found another thing that malloc doesn't work also, it seems hang or corrupt when call it. I guess that print() might related to this.

    I have some basic questions, could you help to explain it if you have time?
    1. Normally, we need implement those low level I/O function(such as "_sbrk/_write/_read/_fstat/...") but not link with libcs3unhosted.a for a specified device. Can we just put these functions all in a file together with application?

    2. Should these "-nodefaultlibs -nostartfiles" flags added to LDFLAGS? I am quite sure about this.

    3. About _init() function, in my testing I found that "-lcs3" is necessary to be linked, and I have to manually add "Sourcery_G++_Lite/lib/gcc/arm-none-eabi/4.5.1/thumb2/*.o" for "_init()" function. Is it correct?

  11. Jin,

    First, _sbrk is not an I/O function; it's called by malloc to grow the heap. You don't need to implement it, as it's already implemented, and this could be the source of your problems. Furthermore, you don't need to implement anything else unless you need it.

    I'm not sure about the flags you mention, man gcc, you could check one of my Makefiles though.

    Finally, if you host your toolchain somewhere send me a link I could debug it for you with jtag.

  12. hi mux,

    It works now, the problem is the static variable in .data and .bss are not initialized correctly. After add some code to do the initialization, it works fine.

    "_startup" function seems not provided by newlib, is it? did you met the same problem?

    For _sbrk(), it is implemented in syscall.c which also includes _write(), the library is libcs3unhosted.a. I reimplemented _write(), so did not link with libcs3unhosted.a, then _sbrk() should be added to my own source file. Some wrong here? Please suggest.

    Thanks so much again for your nice instruction.

  13. Hi Mux,

    I must say you have a very clear and informative blog! I've been enjoying the read and have learned a lot.

    I am also having similar issues getting printf to work.

    Would it be possible to post a .tar.gz of a basic project where you implement _write and use printf?

    I feel that will help us know if we've got an issue with our Toolchain or environment.


  14. Nick,

    There are many working examples in the post, especially implementing _write, you could also use my linker script. If you're still having problems try hosting your code somewhere and send me the link.

  15. Mux,

    I appreciate your reply. I tried your linker script but I got another error when compiling. There is something with my environment thats an issue here (i'm new to GCC). Right now I'm using the Debug Framework that ships with CMSIS and thats working fine enough for my needs at the moment (not having JTAG access on the MBED is a little bit of a pain).
    When I get the time to investigate my setup further I'll let you know if I get it working.
    Otherwise thanks again!

  16. hi Mux,

    Have you been tried to port lwIP stack to mbed? We'd like to try to port it, it seems not that easy..


  17. Hi, I'm trying to use your example code for my lpcxpresso 1769. I'm particularly interested in the SD card part. I used the code you supplied and it compiled. Your example works but when I looked closed at it, the f_close function always returns FR_DISK_ERR. Also when I try to add f_sync - this function also always returns FR_DISK_ERR. Your example however saves the file to an SD card and I can read it on my computer.

    The problem is when I try to run the save function in a loop. I'm working on datalogger project and I'm saving data in a loop at constant frequency. This causes the saving to fail after some time. Last time I tried it took 10 minutes and the program froze at while(ssp_read() != SDC_RESP_ROK); in disk_write function in diskio.c file. When I took the card out and put it to the computer the file name was corrupt (ATA C.SV instead of DATA.CSV which I use) but the data was there. Other times the file system seems corrupted...

    I tried upgrading to fatfs 0.09 (I see you use 0.08a) but that doesn't help.

    Please help me if you have any idea what might be wrong...

  18. Hi,

    I will look into it, but for now, try lowering the SPI frequency in disk_initialize to, say, 1 MHz.

  19. Hi, Thank you for your reply. Changing the frequency from 12500000 to 1500000 helps. At least now after 10 minutes I can open the file normally on my computer - the filesystem does not get corrupt or anything.
    BUT the f_sync returns the FR_DISK_ERR all the time - every time I call it - and I call it once a second.
    Also I will have to test the performance for both frequencies.
    btw, why is 12500000 = 12mhz (not 12000000)?

  20. Ok. Forget about my stupid question regarding 12.5MHz - you explain that above. However I'm using LPC1769, which is 120MHz. Maybe this is the reason why 1 500 000 works (it is 1/80 of 120MHz). However 15 000 000 does not work. It crashes very fast on f_write. Tell me what you think.

  21. Actually, the comment should say 12.5Mhz:
    158 ssp_init(12500000); //12Mhz

    As to why f_sync always fails, I think it calls disk_ioctl which is not implemented, you could change disk_ioctl to return RES_OK.

    The highest frequency is 15 MHz for the LPC1769, but the higher the frequency, the more susceptible it is to noise, especially since I'm not using the CRC.

  22. Hi,

    I really enjoyed your post, and i am particularly interested in the SD card read/write.

    I downloaded the FatFs-SDC driver and compiled it. it returns with:

    file not found error.

    I have CMSISv2_LPC17xx in my workspace. Am i missing some toolchain?

    I am using lpc1768 with lpc_xpresso ide.

    please advise.

  23. gemio, you'll need to install the CodeSourcery toolchain and build the NXP driver library.

  24. Great post Mux! Concise and full of info.

    I don't have any problems for you to solve, just wanted to say thanks :-)

  25. Ur blog got me started with lpc1768, thank you for this helping blog.

    I wanted to interface a web cam to lpc1768, could you help me with some initial direction, thank you

    1. I'm not sure that's even possible, I would start with something more realistic, like a serial camera.

  26. Thanks a lot! I managed to blink a led :) Would it be possible to have your liblpc1768 project with the startup and system code ? That would be extremely useful to me. Thanks again.
