Although the Raspberry-Pi comes with a good Linux distribution, the Pi is about software development, and sometimes we want a real-time system without an operating system. I decided it'd be great to do a tutorial outside of Linux to get to the resources of this great piece of hardware in a similar vein to the Cambridge University Tutorials which are excellently written.
However, they don't create an OS as purported and they stick with assembler rather than migrating to C. I will simply start with nothing but assembler to get us going, but switch to C as soon as possible.
The C compiler simply converts C syntax to assembler and then assembles this into executable code for us anyway.
I highly recommend going through the Cambridge University Raspberry Pi tutorials as they are excellent. If you want to learn a bit of assembler too, then definitely head off to there! These pages provide a similar experience, but with the addition of writing code in C and understanding the process behind that.
- Why is a card of 16MiB necessary when we're not using anywhere near that?
There are quite a few versions of the RPi these days. This part of the tutorial supports the following models:
- RPi Model A
- RPi Model B
- RPi Zero
- RPi Zero W
- RPi Model B+
- RPi 2 Model B
- RPi 3 Model B
- RPi 4 Model B
!!! note It is not an error that the RPI 3 Model B+ is not included in this list. The ACT LED is only available through the mailbox interface (available from part-4 of the tutorial) and so cannot be used directly by the GPIO peripheral which we'll be using in this part of the tutorial.
ARM have now taken over the arm-gcc-embedded project and are providing the releases, so pop over to the ARM gcc downloads section and pick up a toolchain.
I've just grabbed 7.3.1 from the download page and I've locally installed it on my Linux machine to use with this tutorial.
This is what I get when I run this on my command line having decompressed the archive:
~/arm-tutorial-rpi/compiler/gcc-arm-none-eabi-7-2018-q2-update/bin $ ./arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Tools for Arm Embedded Processors 7-2018-q2-update) 7.3.1 20180622 \
(release) [ARM/embedded-7-branch revision 261907]
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Cool.
You can use the compiler/get_compiler.sh script as a short-cut to get the compiler and the tutorial scripts will make use of it if you do.
Previously with a prior stab at this tutorial I always linked and recommended a fixed, known-working version of the compiler because everything then just worked out of the box. However, now I'm saying - just get the latest and then we can fix the tutorial as things break.
NOTE: The disassembly listing you get may vary slightly from those generated by the gcc version being used in this tutorial if you're using a different compiler version or option set.
The eLinux page gives us the optimal GCC settings for compiling code for the original Raspberry-Pi (V1):
-Ofast -mfpu=vfp -mfloat-abi=hard -march=armv6zk -mtune=arm1176jzf-s
It is noted that -Ofast may cause problems with some compilations, so it is probably better that we stick with the more traditional -O2 optimisation setting. The other flags merely tell GCC what type of floating point unit we have, tell it to produce hard floating point code (GCC can create software floating point support instead), and tell it what ARM processor architecture we have so that it can produce optimal and compatible assembly/machine code.
For the Raspberry-Pi 2 we know that the architecture is different. The ARM1176 from the original pi has been replaced by a quad core Cortex A7 processor. Therefore, in order to compile effectively for the Raspberry-Pi 2 we use a different set of compiler options:
-O2 -mfpu=neon-vfpv4 -mfloat-abi=hard -march=armv7-a -mtune=cortex-a7
You can see from the ARM specification of the Cortex A7 that it contains a VFPV4 (See section 1.2.1) floating point processor and a NEON engine. The settings are gleaned from the GCC ARM options page.
Like this:
-O2 -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -march=armv8-a+crc -mcpu=cortex-a53
From the Raspberry Pi Foundation page for the RPi4 we can glean some information from the technical specifications regarding what we need to do in order to compile code for the RPi4.
All four processors are A72. From the ARM documentation we can see that these implement the armv8-a architecture. This is the same as the A53s found in the RPi3 so we can go ahead and use the same crypto-neon-fp-armv8 floating point unit option for the RPI4. This is detailed in the v8 architecture programmers guide.
-O2 -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -march=armv8-a+crc -mcpu=cortex-a72
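If you want to see what these options actually change, one way is to look at the macros the compiler predefines for the selected target. The sketch below assumes the ACLE macros __ARM_ARCH, __ARM_FP and __ARM_NEON, which GCC defines when building for ARM. The function name describe_target and the TARGET_HAS_* macros are made up for this illustration. Built with a non-ARM compiler it simply says so, which makes it easy to try on a PC first.

```c
#include <stddef.h>
#include <stdio.h>

/* __ARM_FP and __ARM_NEON are only defined when the selected options
   provide hardware floating point and NEON respectively */
#if defined(__ARM_FP)
#define TARGET_HAS_FP "yes"
#else
#define TARGET_HAS_FP "no"
#endif

#if defined(__ARM_NEON)
#define TARGET_HAS_NEON "yes"
#else
#define TARGET_HAS_NEON "no"
#endif

/* Hypothetical helper: writes a one-line summary of the target the
   compiler was asked to generate code for into buf, and returns buf */
const char* describe_target(char* buf, size_t len)
{
#if defined(__ARM_ARCH)
    /* __ARM_ARCH holds the architecture version number, e.g. 6, 7 or 8 */
    snprintf(buf, len, "ARM architecture v%d, hardware FP: %s, NEON: %s",
             (int)__ARM_ARCH, TARGET_HAS_FP, TARGET_HAS_NEON);
#else
    snprintf(buf, len, "not an ARM target");
#endif
    return buf;
}
```

Compiling this with each of the option sets above and comparing the summaries is a quick sanity check that the flags are doing what we think they are.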
The schematics you would think would be the place to get the ACT LED GPIO port number, but alas, they're so sparse they may as well not have bothered releasing them. Seriously - what they've released is a joke.
Instead we get it from some of the device tree source code for the RPi4.
Also make sure you've got the latest firmware, fixes are always being introduced!
In order to use a C compiler, we need to understand what the compiler does and what the linker does in order to generate executable code. The compiler converts C statements into assembler and performs optimisation of the assembly instructions. This is in fact all the C compiler does!
The C compiler then implicitly calls the assembler to assemble that file (usually a temporary) into an object file. This will have relocatable machine code in it along with symbol information for the linker to use. These days the C compiler pipes the assembly to the assembler so there is no intermediate file as creating files is a lot slower than passing data from one program to another through a pipe.
The linker's job is to link everything into an executable file. The linker requires a linker script. The linker script tells the linker how to organise the various object files. The linker will resolve symbols to addresses when it has arranged all the objects according to the rules in the linker script.
What we're getting close to here is that a C program isn't just the code we type. There are some fundamental things that must happen for C code to run. For example, some variables need to be initialised to certain values, and some variables need to be initialised to 0. This is all taken care of by an object file which is usually implicitly linked in by the linker because the linker script will include a reference to it. The object file is called crt0.o (C Run-Time zero)
This code uses symbols that the linker can resolve to find where the area holding uninitialised variables starts and ends, in order to zero this memory section. It generally sets up a stack pointer, and it always includes a call to main. Here's an important note: some toolchains prepend an underscore to C symbols when generating the assembler version of the code, so main would become _main. The arm-none-eabi toolchain we're using targets ELF and does not do this, so in assembler we refer to the start of a C program by the same name, main.
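To make the start-up work a bit more concrete, here is a minimal sketch of the two memory jobs described above. It is not the real crt0. The function name sketch_c_startup and all of its parameters are made up for illustration, and the section boundaries are passed in as plain pointers, where a real start-up file would get them from linker script symbols. Copying the initial values is only needed on systems where they are stored somewhere other than where the variables live, which is an assumption here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the memory set-up a C start-up file does.
   Plain pointers are used so the logic can be tried on any machine. */
void sketch_c_startup(unsigned char* data_start,
                      const unsigned char* data_load,
                      size_t data_size,
                      unsigned char* bss_start,
                      unsigned char* bss_end)
{
    /* Variables with initial values: copy those values into place */
    memcpy(data_start, data_load, data_size);

    /* Variables that must start at 0: clear the whole section */
    memset(bss_start, 0, (size_t)(bss_end - bss_start));

    /* A real crt0 would also make sure the stack pointer is set up
       and would then call main() */
}
```

Without this step, a global like int counter = 5; has no guarantee of holding 5 when main starts.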
All of the source in the tutorials is available from the Github repo. So go clone or fork now so you have all the code to compile and modify as you work through the tutorials.
git clone https://github.com/BrianSidebotham/arm-tutorial-rpi
If you're on Windows - these days I'll say, just get with the program and get yourself a Linux install. If you're entering the world of Raspberry Pi and/or embedded devices Linux is going to be your friend and will give you everything you need. This tutorial used to support both Linux and Windows, but I have no Windows installs left and so can't cover off Windows. If someone else takes that on, I'd fully support them in updating the tutorial - it's all on Github.
Let's have a look at compiling one of the simplest programs that we can. Let's compile and link the following program (part-1/armc-00):
int main(void)
{
    while(1)
    {
    }
    return 0;
}
In order to compile the code (I realise there's not much to that code!) we can use the build.sh script in the tutorial directory. Navigate to part-1/armc-00 and run ./build.sh. With no arguments this script will just show you what it expects in order to run.
arm-tutorial-rpi/part-1/armc-00 $ ./build.sh
usage: build.sh <pi-model>
pi-model options: rpi0, rpi1, rpi1bp, rpi2, rpi3, rpi3bp, rpi4
As there are different compiler flags for the various RPI models, it's necessary to tell the script what RPI you have in order to use the correct flags to compile with.
The V1 boards are fitted with the Broadcom BCM2835 (ARM1176) and the V2 board uses the BCM2836 (ARM Cortex A7). The RPI3 uses a Cortex-A53. Because of the processor differences, we use different build commands to build for the various RPI models.
Let's have a look at what the compiler command lines look like for the various RPI models.
Let's just concentrate on the RPI-specific options rather than including all the options here.
arm-none-eabi-gcc \
-mfloat-abi=hard \
-mfpu=vfp \
-march=armv6zk \
-mtune=arm1176jzf-s \
main.c -o main.elf
arm-none-eabi-gcc \
-mfloat-abi=hard \
-mfpu=neon-vfpv4 \
-march=armv7-a \
-mtune=cortex-a7 \
main.c -o main.elf
arm-none-eabi-gcc \
-mfloat-abi=hard \
-mfpu=crypto-neon-fp-armv8 \
-march=armv8-a+crc \
-mcpu=cortex-a53 \
main.c -o main.elf
arm-none-eabi-gcc \
-mfloat-abi=hard \
-mfpu=crypto-neon-fp-armv8 \
-march=armv8-a+crc \
-mcpu=cortex-a72 \
main.c -o main.elf
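The four flag sets above boil down to a small lookup from board to options. The sketch below captures that as a table. It is only an illustration, not how build.sh is implemented, and the names rpi_flags, flag_table and lookup_flags are made up. Only the four flag sets shown above are included. The other pi-model options the script accepts are left out because their flags aren't listed here.

```c
#include <stddef.h>
#include <string.h>

/* One row per flag set shown above */
struct rpi_flags {
    const char* model;  /* pi-model option as passed to build.sh */
    const char* fpu;    /* value for -mfpu= */
    const char* march;  /* value for -march= */
    const char* cpu;    /* value for -mtune= (rpi1, rpi2) or -mcpu= (rpi3, rpi4) */
};

static const struct rpi_flags flag_table[] = {
    { "rpi1", "vfp",                  "armv6zk",     "arm1176jzf-s" },
    { "rpi2", "neon-vfpv4",           "armv7-a",     "cortex-a7"    },
    { "rpi3", "crypto-neon-fp-armv8", "armv8-a+crc", "cortex-a53"   },
    { "rpi4", "crypto-neon-fp-armv8", "armv8-a+crc", "cortex-a72"   },
};

/* Hypothetical helper: returns the row for a model, or NULL if unknown */
const struct rpi_flags* lookup_flags(const char* model)
{
    for (size_t i = 0; i < sizeof flag_table / sizeof flag_table[0]; i++) {
        if (strcmp(flag_table[i].model, model) == 0)
            return &flag_table[i];
    }
    return NULL;
}
```

All four sets also use -mfloat-abi=hard, so that flag doesn't need a column.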
THIS IS EXPECTED TO FAIL: Using the build script, let's compile the basic source code using the RPI specific options. Here, we compile for the RPI3. (I've shortened the output so it's easier to read on-screen).
valvers-new/arm-tutorial-rpi/part-1/armc-00 $ ./build.sh rpi3
arm-none-eabi-gcc -mfloat-abi=hard \
-mfpu=crypto-neon-fp-armv8 \
-march=armv8-a+crc \
-mcpu=cortex-a53 \
armc-00.c \
-o kernel.armc-00.rpi3.elf
GCC does successfully compile the source code (there are no C errors in it), but the linker fails with the following message:
.../hard/libc.a(lib_a-exit.o): In function `exit':
exit.c:(.text.exit+0x1c): undefined reference to '_exit'
collect2: error: ld returned 1 exit status
So with our one-line command above we're invoking the C compiler, the assembler and the linker. The C compiler does most of the menial tasks for us to make life easier, but because we're embedded engineers (aren't we?) we MUST be aware of how the compiler, assembler and linker work at a very low level, as we generally work with custom systems which we must describe intimately to the tool-chain.
So there's a missing _exit symbol. This symbol is referenced by the C library we're using. It is in fact a system call. It's designed to be implemented by the OS. It would be called when a program terminates. In our case, we are our own OS as we're the only thing running, and in fact we will never exit so we do not really need to worry about it. System calls can be blank; they merely need to be provided in order for the linker to resolve the symbol.
So the C library has a requirement of system calls. Sometimes these are already implemented as blank functions, or implemented for fixed functionality. For a list of system calls see the newlib documentation on system calls. Newlib is an open source, and lightweight C library which can be compiled in many different flavours.
The C library is what provides all of the C functionality found in standard C header files such as stdio.h, stdlib.h, string.h, etc.
At this point I want to note that the standard Hello World example won't work here without an OS, and it is exactly unimplemented system calls that prevent it from being our first example. The lowest part of printf(...) includes a write function, write. This function is used by all of the functions in the C library that need to write to a file. In the case of printf, it needs to write to the file stdout. Generally when an OS is running, stdout produces output visible on a screen which can then be piped to another file system file by the OS. Without an OS, stdout generally prints to a UART so that you can see program output on a remote screen such as a PC running a terminal program. We will discuss write implementations later on in the tutorial series, so let's move on...
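Just to give a flavour of what sits underneath printf, here is a rough sketch of a write-style function pushing bytes out one at a time. Everything in it is an assumption for illustration: there is no UART driver in this part of the tutorial, so a made-up struct fake_uart collects the bytes in a buffer instead, and sketch_write is not the signature the C library expects.

```c
#include <stddef.h>

/* Hypothetical stand-in for a UART: "transmitted" bytes land in a buffer */
struct fake_uart {
    char   sent[64];
    size_t count;
};

/* Send one byte. Returns 0 on success, -1 if the fake device is full.
   On real hardware this would wait for the transmitter to be ready and
   then write the byte to the UART data register. */
static int fake_uart_putc(struct fake_uart* uart, char c)
{
    if (uart->count >= sizeof uart->sent)
        return -1;
    uart->sent[uart->count++] = c;
    return 0;
}

/* Shaped like the write call underneath printf: push len bytes out and
   report how many actually went */
int sketch_write(struct fake_uart* uart, const char* buf, int len)
{
    int written = 0;

    while (written < len && fake_uart_putc(uart, buf[written]) == 0)
        written++;

    return written;
}
```

The point to take away is that the C library only needs some function that moves bytes. Where they end up is entirely up to us.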
The easiest way to fix the link problem is to provide a minimal exit function to satisfy the linker. As it is never going to be used, all we need to do is shut the linker up and let it resolve _exit. So now, again with the build.sh script in armc-01 we can compile the next version of the code, part-1/armc-01.c
int main(void)
{
    while(1)
    {
    }
    return 0;
}
void exit(int code)
{
    while(1)
        ;
}
NOTE: In case you're wondering why we define exit rather than _exit: the undefined reference came from the C library's own exit function, which calls _exit. Because we provide exit ourselves, the linker never pulls in the library's version and so nothing references _exit any more. The arm-none-eabi toolchain does not add an underscore prefix to C symbols, so the function is called exit in C and in assembler alike.
As we can see, compilation is successful and we get a kernel*.elf file generated by the compiler. Currently, that elf file is 37k.
part-1/armc-01 $ ./build.sh rpi3
arm-none-eabi-gcc -mfloat-abi=hard \
-mfpu=crypto-neon-fp-armv8 \
-march=armv8-a+crc \
-mcpu=cortex-a53 \
armc-01.c \
-o kernel.armc-01.rpi3.elf
part-1/armc-01 $ ls -lh
total 16K
-rw-r--r-- 1 brian brian 366 Sep 21 00:19 armc-01.c
-rwxr-xr-x 1 brian brian 2.3K Sep 21 00:45 build.sh
-rwxr-xr-x 1 brian brian 36K Sep 21 00:45 kernel.armc-01.rpi3.elf
It's important to have an infinite loop in the exit function. In the C library, which is not intended to be used with an operating system (hence arm-NONE-eabi-*), _exit is marked as being noreturn. We must make sure it doesn't return otherwise we will get a warning about it. The prototype for _exit always includes an exit code int too. Yes, that's a bit oxymoronic!
Now using the same build command above we get a clean build! Yay! But there is really a problem: in order to provide a system underneath the C library we will have to provide linker scripts and our own C startup code. In order to skip that initially and to simply get up and running, we'll just use GCC's option not to include any of the C startup routines, which removes the need for exit too.
The GCC option for that is -nostartfiles
As in the Cambridge tutorials we will copy their initial example of illuminating an LED in order to know that our code is running correctly. This is nearly always the embedded developer's "Hello World!". Usually we'll blink the LED to make sure code is running and to know we're getting clocks at the speed we think we should.
First, let's have a look at how a Raspberry-Pi processor boots. The BCM2835 from Broadcom includes two processors that we should know about, one is a Videocore(tm) GPU which is why the Raspberry-Pi makes such a good media-centre and the other is the ARM core which runs the operating system. Both of these processors share the peripheral bus and also have to share some interrupt resources. Although in this case, share means that some interrupt sources are not available to the ARM processor because they are already taken by the GPU.
The GPU starts running at reset or power on and includes code to read the first FAT partition of the SD Card on the MMC bus. It searches for and loads a file called bootcode.bin into memory and starts execution of that code. The bootcode.bin bootloader in turn searches the SD card for a file called start.elf and a config.txt file to set various kernel settings before searching the SD card again for a kernel.img file which it then loads into memory at a specific address (0x8000) and starts the ARM processor executing at that memory location. The GPU is now up and running and the ARM will start to come up using the code contained in kernel.img. The start.elf file contains the code that runs on the GPU to provide most of the requirements of OpenGL, etc.
Therefore in order to boot your own code, you need to firstly compile your code to an executable and name it kernel.img, and put it onto a FAT formatted SD Card, which has the GPU bootloader (bootcode.bin, and start.elf) on it as well. The latest Raspberry-Pi firmware is available on GitHub. The bootloader is located under the boot sub-directory. The rest of the firmware provided is closed-binary video drivers. They are compiled for use under Linux so that accelerated graphics drivers are available. As we're not using Linux these files are of no use to us, only the bootloader firmware is.
All this means that the processor is already up and running when it starts to run our code. Clock sources and PLL settings are already decided and programmed in the bootloader which alleviates that problem from us. We get to just start messing with the device's registers from an already running core. This is something I'm not that used to; normally the first thing in my code would be setting up correct clock and PLL settings to initialise the processor, but the GPU has set up the basic clocking scheme for us.
The first thing we will need to set up is the GPIO controller. There are no drivers we can rely on as there is no OS running, all the bootloader has done is boot the processor into a working state, ready to start loading the OS.
You'll need to get the Raspberry-Pi BCM2835 peripherals datasheet, and make sure you pay attention to the errata for that too as it's not perfect. This gives us the information we require to control the IO peripherals of the BCM2835. I'll guide us through using the GPIO peripheral - there are as always some gotchas:
The Raspberry-Pi 2B 1.2 uses the BCM2837 and so you'll want to get the Raspberry-Pi BCM2837 peripherals datasheet.
NOTE: The 2837 peripherals document is just a modified version of the original 2835 document with the addresses updated to suit the 2837's base peripheral address. See rpi issue 325 for further details.
We'll be using the GPIO peripheral, and it would therefore be natural to jump straight to that documentation and start writing code, but we need to first read some of the 'basic' information about the processor. The important bit to note is the virtual address information. On page 5 of the BCM2835 peripherals page we see an IO map for the processor. Again, as embedded engineers we must have the IO map to know how to address peripherals on the processor and in some cases how to arrange our linker scripts when there are multiple address spaces.
The VC CPU Bus addresses relate to the Broadcom Video Core CPU. Although the Video Core CPU is what bootloads from the SD Card, execution is handed over to the ARM core by the time our kernel.img code is called. So we're not interested in the VC CPU Bus addresses.
The ARM Physical addresses are the processor's raw IO map when the ARM Memory Management Unit (MMU) is not being used. If the MMU is being used, the virtual address space is what we'd be interested in.
Before an OS kernel is running, the MMU is also not running as it has not been initialised and the core is running in kernel mode. Addresses on the bus are therefore accessed via their ARM Physical Address. We can see from the IO map that the VC CPU Address 0x7E000000 is mapped to ARM Physical Address 0x20000000 for the original Raspberry Pi. This is important!
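Since the datasheet quotes every peripheral register as a VC CPU bus address, we'll be doing this translation in our heads a lot. As a minimal sketch, it is just an offset from one base to another. The helper name bus_to_arm_physical and the macro VC_BUS_PERIPHERAL_BASE are made up for this example.

```c
#include <stdint.h>

/* Start of the peripheral window as seen on the VC CPU bus (datasheet) */
#define VC_BUS_PERIPHERAL_BASE 0x7E000000UL

/* Hypothetical helper: translate a datasheet (bus) address into an ARM
   physical address, given where the ARM sees the peripheral window */
uint32_t bus_to_arm_physical(uint32_t bus_address, uint32_t arm_peripheral_base)
{
    return arm_peripheral_base + (bus_address - VC_BUS_PERIPHERAL_BASE);
}
```

For the original Raspberry Pi, bus_to_arm_physical(0x7E200000, 0x20000000) gives 0x20200000.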
If you read the two peripheral datasheets carefully you'll see a subtle difference in them: the Raspberry-Pi 2 has the ARM IO base set to 0x3F000000 instead of the 0x20000000 of the original Raspberry-Pi. Unfortunately for us software engineers, the Raspberry-Pi Foundation don't appear to be good at securing the documentation we need; in fact, their attitude suggests they think we're magicians and don't actually need any. What a shame! Please, if you're a member of the forum, campaign for more documentation. As engineers, especially in industry, we wouldn't accept this from a manufacturer; we'd go elsewhere! In fact, we did at my work and use the TI Cortex A8 from the Beaglebone Black, a very good and well documented SoC!
Anyway, the base address can be gleaned from searching for uboot patches. The Raspberry Pi 2 uses a BCM2836 so we can search for that and u-boot and we come across a patch for supporting the Raspberry-Pi 2.
Further on in the manual we come across the GPIO peripheral section of the manual (Chapter 6, page 89).
RPi4 has the peripheral base mapped to 0xFE000000. The peripheral address space looks to be laid out the same as the previous pis.
Finally, let's get on and see some of our code running on the Raspberry-Pi. We'll continue with using the first example of the Cambridge tutorials by lighting the OK LED on the Raspberry-Pi board.
The GPIO peripheral has a base address in the BCM2835 manual at 0x7E200000. We know from getting to know our processor that this translates to an ARM Physical Address of 0x20200000 (0x3F200000 for RPI2 and RPI3, and 0xFE200000 for RPI4). This is the first register in the GPIO peripheral register set, the GPIO Function Select 0 register.
In order to use an IO pin, we need to configure the GPIO peripheral. From the Raspberry-Pi schematic diagrams the OK LED is wired to the GPIO16 line (Sheet 2, B5) . The LED is wired active LOW - this is fairly standard practice. It means to turn the LED on we need to output a 0 (the pin is connected to 0V by the processor) and to turn it off we output a 1 (the pin is connected to VDD by the processor).
Unfortunately, again, lack of documentation is rife and we don't have schematics for the Raspberry-Pi 2 or plus models! This is important because the GPIO lines were re-jigged and as Florin has noted in the comments section, the Raspberry Pi Plus configuration has the LED on GPIO47, so I've added the changes in brackets below for the RPI B+ models (Which includes the RPI 2).
Back to the processor manual and we see that the first thing we need to do is set the GPIO pin to an output. This is done by setting the function of GPIO16 (GPIO47 RPI+) to an output.
Bits 18 to 20 in the GPIO Function Select 1 register control the GPIO16 pin.
Bits 21 to 23 in the GPIO Function Select 4 register control the GPIO47 pin. (RPI B+)
Bits 27 to 29 in the GPIO Function Select 2 register control the GPIO29 pin. (RPI3 B+)
Bits 6 to 8 in the GPIO Function Select 4 register control the GPIO42 pin. (RPI4)
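There's a pattern behind those numbers: each 32-bit function select register holds the settings for ten pins, at three bits per pin. A small sketch of the arithmetic follows. The helper names gpfsel_register and gpfsel_shift are made up for this example.

```c
/* Hypothetical helpers: locate a pin's 3-bit function field.
   Each function select register covers ten pins, three bits each. */

/* Which Function Select register (0 to 5) holds this pin */
unsigned int gpfsel_register(unsigned int pin)
{
    return pin / 10;
}

/* Lowest bit number of this pin's field within that register */
unsigned int gpfsel_shift(unsigned int pin)
{
    return (pin % 10) * 3;
}
```

For GPIO16 this gives register 1 and bit 18, which matches the first entry in the list above.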
In C, we will generate a pointer to the register and use the pointer to write a value into the register. We will mark the register as volatile so that the compiler explicitly does what I tell it to. If we do not mark the register as volatile, the compiler is free to see that we do not access this register again and so to all intents and purposes the data we write will not be used by the program and the optimiser is free to throw away the write because it has no effect.
The effect however is definitely required, but is only externally visible (the mode of the GPIO pin changes). We inform the compiler through the volatile keyword to not take anything for granted on this variable and to simply do as I say with it:
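You can try the idea out on any machine by pointing at an ordinary variable instead of a hardware register. This is only a sketch. The names fake_register and write_register_twice are made up, and on the Pi the pointer would hold a fixed peripheral address instead.

```c
#include <stdint.h>

/* Hypothetical stand-in for a hardware register so the sketch runs
   anywhere */
static uint32_t fake_register;

uint32_t write_register_twice(void)
{
    /* volatile tells the compiler that every access through this
       pointer matters, so neither write may be removed or merged */
    volatile uint32_t* reg = &fake_register;

    *reg = 0x1;  /* without volatile this write could be dropped... */
    *reg = 0x2;  /* ...because it is overwritten straight away */

    return *reg; /* a volatile read is always actually performed */
}
```

On real hardware the first write might be the one that changes a pin's mode, so losing it to the optimiser would be a very confusing bug to track down.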
We will use pre-processor definitions to change the base address of the GPIO peripheral depending on what RPI model is being targeted.
#if defined( RPI0 ) || defined( RPI1 )
#define GPIO_BASE 0x20200000UL
#elif defined( RPI2 ) || defined( RPI3 )
#define GPIO_BASE 0x3F200000UL
#elif defined( RPI4 )
#define GPIO_BASE 0xFE200000UL
#else
#error Unknown RPI Model!
#endif
In order to set GPIO16 as an output then we need to write a value of 1 in the relevant bits of the function select register. Here we can rely on the fact that this register is set to 0 after a reset and so all we need to do is set:
/* Assign the address of the GPIO peripheral (Using ARM Physical Address) */
gpio = (unsigned int*)GPIO_BASE;
gpio[LED_GPFSEL] |= (1 << LED_GPFBIT);
This code looks a bit messy, but we will tidy up and optimise later on. For now we just want to get to the point where we can light an LED and understand why it is lit!
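Relying on the reset value is fine for a first example, but it's worth seeing what the more defensive version looks like: clear the pin's three-bit field first, then write the output value into it. This is a sketch under my own naming. gpio_select_output, GPIO_FSEL_MASK and GPIO_FSEL_OUTPUT are not part of the tutorial code, and regs is assumed to point at the first GPIO register.

```c
#include <stdint.h>

#define GPIO_FSEL_MASK   0x7u  /* each pin has a 3-bit function field */
#define GPIO_FSEL_OUTPUT 0x1u  /* 001 selects output */

/* Hypothetical helper: make a pin an output without assuming the
   register still holds its reset value */
void gpio_select_output(volatile uint32_t* regs, unsigned int pin)
{
    unsigned int index = pin / 10;        /* which function select register */
    unsigned int shift = (pin % 10) * 3;  /* where the pin's field starts */
    uint32_t value = regs[index];

    value &= ~((uint32_t)GPIO_FSEL_MASK << shift);  /* clear the field */
    value |= (uint32_t)GPIO_FSEL_OUTPUT << shift;   /* write 001 */

    regs[index] = value;
}
```

The other pins sharing the register are left exactly as they were, which matters as soon as more than one pin is in use.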
The ARM GPIO peripherals have an interesting way of doing IO. It's actually a bit different to most other processor IO implementations. There is a SET register and a CLEAR register. Writing 1 to any bits in the SET register will SET the corresponding GPIO pins to 1 (logic high), and writing 1 to any bits in the CLEAR register will CLEAR the corresponding GPIO pins to 0 (logic low). There are reasons for this implementation over a register where each bit is a pin and the bit value directly relates to the pins output level, but it's beyond the scope of this tutorial.
So in order to light the LED we need to output a 0. We need to write a 1 to bit 16 in the CLEAR register:
gpio[GPIO_GPCLR0] = (1 << 16);
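The same idea works for any pin. Each SET and CLEAR register covers 32 pins, and zero bits in the written value are ignored, so a plain write of a single bit is all that's needed. In the sketch below, gpio_write_pin and the SKETCH_ macros are names I've made up. The offsets 7 and 10 match the GPIO_GPSET0 and GPIO_GPCLR0 values used in this tutorial's code, and regs is assumed to point at the first GPIO register.

```c
#include <stdint.h>

/* Word offsets of the first SET and CLEAR registers from the GPIO base */
#define SKETCH_GPSET0 7
#define SKETCH_GPCLR0 10

/* Hypothetical helper: drive a pin high (non-zero) or low (zero).
   No read-modify-write is needed because writing 0 bits has no effect. */
void gpio_write_pin(volatile uint32_t* regs, unsigned int pin, int high)
{
    unsigned int bank = pin / 32;              /* pins 0-31 or 32 upwards */
    uint32_t bit = (uint32_t)1 << (pin % 32);  /* this pin's bit in the bank */

    if (high)
        regs[SKETCH_GPSET0 + bank] = bit;
    else
        regs[SKETCH_GPCLR0 + bank] = bit;
}
```

So gpio_write_pin(gpio, 16, 0) lights the active LOW LED on the original Pi, and for GPIO47 the helper ends up using bit 15 of the second CLEAR or SET register.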
Putting what we've learnt into the minimal example above gives us a program that compiles and links into an executable which should provide us with a Raspberry-Pi that lights the OK LED when it is powered. Here's the complete code we'll compile, part-1/armc-02:
/* The base address of the GPIO peripheral (ARM Physical Address) */
#if defined( RPI0 ) || defined( RPI1 )
#define GPIO_BASE 0x20200000UL
#elif defined( RPI2 ) || defined( RPI3 )
#define GPIO_BASE 0x3F200000UL
#elif defined( RPI4 )
/* This comes from the linux source code:
https://github.com/raspberrypi/linux/blob/rpi-4.19.y/arch/arm/boot/dts/bcm2838.dtsi */
#define GPIO_BASE 0xFE200000UL
#else
#error Unknown RPI Model!
#endif
/* TODO: Expand this to RPi4 as necessary */
#if defined( RPIBPLUS ) || defined( RPI2 )
#define LED_GPFSEL GPIO_GPFSEL4
#define LED_GPFBIT 21
#define LED_GPSET GPIO_GPSET1
#define LED_GPCLR GPIO_GPCLR1
#define LED_GPIO_BIT 15
#elif defined( RPI4 )
/* The RPi4 model has the ACT LED attached to GPIO 42
https://github.com/raspberrypi/linux/blob/rpi-4.19.y/arch/arm/boot/dts/bcm2838-rpi-4-b.dts */
#define LED_GPFSEL GPIO_GPFSEL4
#define LED_GPFBIT 6
#define LED_GPSET GPIO_GPSET1
#define LED_GPCLR GPIO_GPCLR1
#define LED_GPIO_BIT 10
#else
#define LED_GPFSEL GPIO_GPFSEL1
#define LED_GPFBIT 18
#define LED_GPSET GPIO_GPSET0
#define LED_GPCLR GPIO_GPCLR0
#define LED_GPIO_BIT 16
#endif
#define GPIO_GPFSEL0 0
#define GPIO_GPFSEL1 1
#define GPIO_GPFSEL2 2
#define GPIO_GPFSEL3 3
#define GPIO_GPFSEL4 4
#define GPIO_GPFSEL5 5
#define GPIO_GPSET0 7
#define GPIO_GPSET1 8
#define GPIO_GPCLR0 10
#define GPIO_GPCLR1 11
#define GPIO_GPLEV0 13
#define GPIO_GPLEV1 14
#define GPIO_GPEDS0 16
#define GPIO_GPEDS1 17
#define GPIO_GPREN0 19
#define GPIO_GPREN1 20
#define GPIO_GPFEN0 22
#define GPIO_GPFEN1 23
#define GPIO_GPHEN0 25
#define GPIO_GPHEN1 26
#define GPIO_GPLEN0 28
#define GPIO_GPLEN1 29
#define GPIO_GPAREN0 31
#define GPIO_GPAREN1 32
#define GPIO_GPAFEN0 34
#define GPIO_GPAFEN1 35
#define GPIO_GPPUD 37
#define GPIO_GPPUDCLK0 38
#define GPIO_GPPUDCLK1 39
/** GPIO Register set */
volatile unsigned int* gpio;
/** Simple loop variable */
volatile unsigned int tim;
/** Main function - we'll never return from here */
int main(void) __attribute__((naked));
int main(void)
{
    /* Assign the address of the GPIO peripheral (Using ARM Physical Address) */
    gpio = (unsigned int*)GPIO_BASE;
    /* Write 1 to the GPIO16 init nibble in the Function Select 1 GPIO
       peripheral register to enable GPIO16 as an output */
    gpio[LED_GPFSEL] |= (1 << LED_GPFBIT);
    /* Never exit as there is no OS to exit to! */
    while(1)
    {
        for(tim = 0; tim < 500000; tim++)
            ;
        /* Set the LED GPIO pin low ( Turn OK LED on for original Pi, and off
           for plus models )*/
        gpio[LED_GPCLR] = (1 << LED_GPIO_BIT);
        for(tim = 0; tim < 500000; tim++)
            ;
        /* Set the LED GPIO pin high ( Turn OK LED off for original Pi, and on
           for plus models )*/
        gpio[LED_GPSET] = (1 << LED_GPIO_BIT);
    }
}
We now compile with the no start files option too:
part-1/armc-02 $ ./build.sh rpi3
arm-none-eabi-gcc -nostartfiles \
-mfloat-abi=hard \
-mfpu=crypto-neon-fp-armv8 \
-march=armv8-a+crc \
-mcpu=cortex-a53 \
armc-02.c \
-o kernel.armc-02.rpi3.elf
The linker gives us a warning, which we'll sort out later, but importantly the linker has resolved the problem for us. This is the warning we'll see and ignore:
.../arm-none-eabi/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000008000
As we can see from the compilation, the standard output is ELF format which is essentially an executable wrapped with information that an OS binary loader may need to know. We need a binary ARM executable that only includes machine code. We can extract this using the objcopy utility:
arm-none-eabi-objcopy kernel.elf -O binary kernel.img
ELF is a file format used by some OSes, including Linux, which wraps the machine code with meta-data. The meta-data can be useful. In Linux, and in fact most OSes these days, running an executable doesn't mean the file simply gets loaded into memory and the processor starts running from the address at which it was loaded. There is usually an executable loader which uses formats like ELF to know more about the executable. For example, the function call interface might be different between executables, meaning code can use different calling conventions which use different registers for different purposes when calling functions within a program. This can determine whether the executable loader will even allow the program to be loaded into memory or not. The ELF meta-data can also include a list of all of the shared objects (SO, or DLL under Windows) that the executable needs to have loaded. If any of the required libraries are not available, again the executable loader will not allow the file to be loaded and run.
This is all intended to (and does) increase system stability and compatibility.
We however, do not have an OS and the bootloader does not have any loader other than a disk read, directly copying the kernel.img file into memory at 0x8000 which is then where the ARM processor starts execution of machine code. Therefore we need to strip off the ELF meta-data and simply leave just the compiled machine code in the kernel.img file ready for execution.
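A quick way to tell the two kinds of file apart is to look at the first four bytes. Every ELF file begins with the magic number 0x7F followed by the letters E, L and F, whereas a raw image starts straight in with machine code. The check below is a minimal sketch. The function name looks_like_elf is made up, and it only looks at the magic number rather than validating the rest of the header.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the buffer begins with the ELF magic number
   (0x7F 'E' 'L' 'F'), 0 otherwise */
int looks_like_elf(const unsigned char* image, size_t length)
{
    static const unsigned char magic[4] = { 0x7F, 'E', 'L', 'F' };

    if (length < sizeof magic)
        return 0;

    return memcmp(image, magic, sizeof magic) == 0;
}
```

If a file that passes this check gets copied to the SD card as kernel.img, the ARM will try to execute the ELF header as if it were instructions, which is not what we want.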
This gives us the kernel.img binary file which should only contain ARM machine code. It should be tens of bytes long. You'll notice that kernel.elf on the other hand is ~34KB. Rename the kernel.img on your SD Card to something like old.kernel.img and save your new kernel.img to the SD Card.
Booting from this SD Card should now leave the OK LED on permanently. The normal startup is for the OK LED to be on, then extinguish. If it remains extinguished something went wrong with building or linking your program. Otherwise if the LED remains lit, your program has executed successfully.
A blinking LED is probably more appropriate to make sure that our code is definitely running. Let's quickly change the code to crudely blink an LED and then we'll look at sorting out the C library issues we had earlier as the C library is far too useful to not have access to it.
Compile the code in part-1/armc-03. The code listing is identical to part-1/armc-02 but the build scripts use objcopy to convert the ELF formatted binary to a raw binary ready to deploy on the SD card.
You can see that a binary image is now in the folder and is a much more sane size for some code that does so little:
part-1/armc-03 $ ./build.sh rpi3bp
arm-none-eabi-gcc -g \
-nostartfiles \
-mfloat-abi=hard \
-O0 \
-DIOBPLUS \
-DRPI3 \
-mfpu=crypto-neon-fp-armv8 \
-march=armv8-a+crc \
-mcpu=cortex-a53 \
armc-03.c \
-o kernel.armc-03.rpi3bp.elf
.../arm-none-eabi/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000008000
arm-none-eabi-objcopy kernel.armc-03.rpi3bp.elf -O binary kernel.img
The kernel.img file contains just the ARM machine code and so is just a few hundred bytes.
part-1/armc-03 $ ll
total 28
drwxr-xr-x 2 brian brian 4096 Sep 21 01:05 ./
drwxr-xr-x 6 brian brian 4096 Sep 21 00:19 ../
-rw-r--r-- 1 brian brian 3894 Sep 21 00:19 armc-03.c
-rwxr-xr-x 1 brian brian 2805 Sep 21 01:05 build.sh*
-rwxr-xr-x 1 brian brian 16777216 Sep 21 01:05 card.armc-03.rpi3bp.img
-rwxr-xr-x 1 brian brian 35208 Sep 21 01:05 kernel.armc-03.rpi3bp.elf*
-rwxr-xr-x 1 brian brian 268 Sep 21 01:05 kernel.armc-03.rpi3bp.img*
The next step is how to get this kernel.img file onto an SD Card so we can boot the card and run our newly compiled code.
A script, called make_card.sh, was run during the build.sh script run. Have a look in the build.sh script to see the call near the end of the script. It does the work of generating an SD Card image that can be written directly to an SD card.
The /card/make_card.sh script is worth a look. It can generate a card image without the need for super user privileges. It always uses the latest firmware available from the RPi Foundation GitHub repository.
Writing the image to the card can be done using cat so long as you know what the SD Card device is.
If you'd rather, you can use the write_card.sh script in the card directory which you can use interactively to select the SD Card.
If you prefer to do things manually you can insert the SD Card and then run dmesg | tail to view messages which will show you which device reference was used for the SD Card or else use lsblk to list all of the block devices available.
DON'T GET THE SD CARD DEVICE WRONG OR YOU'LL COMPLETELY WIPE OUT ANOTHER DISK!
When you know what disk to use, you can simply cat the card image to the disk, for example:
cat card.armc-03.rpi3bp.img > /dev/sdg
As this is the first example where code should run and give you a visible output on your RPi, I've included the kernel binaries for each Raspberry-Pi board so that you can load the pre-built binary and see the LED flash before compiling your own kernel to make sure the build process is working for you. After this tutorial, you'll have to build your own binaries!
Although the code may appear to be written a little oddly, please stick with it! There are reasons why it's written how it is. Now you can experiment a bit from a basic starting point, but beware - automatic variables won't work, and nor will initialised variables because we have no C Run Time support yet.
That will be where we start with Step 2 of Bare metal programming the Raspberry-Pi!