Tag Archives: ARM

Adding Networking Support to ELK: Part 1

The holidays are a great time. A little time away from my day job and I can concentrate a little on bigger ELLCC sub-projects. This holiday, I decided to concentrate on adding networking to ELK, ELLCC’s bare metal run time environment, previously mentioned here and here.

ELLCC is, for the most part, made up of other open source projects. ELLCC (the cross compilation tool chain) leverages the clang/LLVM compiler for cross compiling C and C++ source code. I decided early on that the ELLCC run time support libraries would all have permissive BSD or BSD-like licensing, so I use libc++ and libc++abi for C++ library support, musl for standard C library support, and compiler-rt for low level processor support.

For those of you unfamiliar with ELK (which is probably all of you), I’ll give a brief synopsis of ELK’s design. The major design goals of ELK are

  • Use the standard run time libraries, compiled for Linux, unchanged in a bare metal environment.
  • Allow fine grained configuration of ELK at link time to support target environments with widely different memory and processor resources.
  • Have BSD or BSD-like licensing so that it can be used with no encumbrances for commercial and non-commercial projects.

The implications of the first goal are interesting. I wanted a fully configured ELK environment to support, in kernel space, the POSIX environment that user space programs enjoy. In addition, all interactions between the application and the bare metal environment would be through system calls, whether or not an MMU is present or used on the system. I can feel embedded programmers shuddering at the last statement: “What!?! A system call just to write a string to a serial port? What a waste!”. I completely understand, being an old embedded guy myself. But it turns out that there are a few good reasons to use the system call approach. The first is that system calls are the place where a context switch will often occur in a multi-threaded environment. Other than in the most bare metal of environments, ELK supports multi-threading and can take advantage of the context saved at the time of the system call to help implement threading. The second reason for using system calls is that modern Linux user space programs try to do system calls as infrequently as possible. For example, POSIX mutexes and semaphores are built on Linux futexes. A futex is a wonderful synchronization mechanism. The only time a system call is needed when taking a mutex, for example, is when the mutex is contended. Finally, using system calls allows ELK to be built up by implementing system call functionality piece by piece, so you only need to include the system calls that your program needs. I gave a simple example of a system call definition in this post.
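
To make the futex point a little more concrete, here is a minimal sketch of a futex-style mutex. It is not ELK or musl code. The names sketch_mutex, fake_futex_wait() and fake_futex_wake() are made up for illustration, and the two "fake" functions only count how often a real implementation would have entered the kernel.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the futex wait/wake system calls. They only
 * count how many times the "kernel" would have been entered. */
static int syscall_count;

static void fake_futex_wait(atomic_int *addr, int expected)
{
    (void)addr;
    (void)expected;
    syscall_count++;    /* A real futex wait would sleep here. */
}

static void fake_futex_wake(atomic_int *addr)
{
    (void)addr;
    syscall_count++;    /* A real futex wake would wake one waiter. */
}

/* State: 0 = unlocked, 1 = locked, 2 = locked, waiters possible. */
typedef struct {
    atomic_int state;
} sketch_mutex;

/* Returns 1 if the lock was taken without any system call, else 0. */
static int sketch_trylock(sketch_mutex *m)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&m->state, &expected, 1);
}

static void sketch_lock(sketch_mutex *m)
{
    if (sketch_trylock(m))
        return;         /* Fast path: uncontended, no system call. */
    /* Slow path: mark the mutex contended and wait in the kernel.
     * With the fake wait above this degrades to spinning. */
    while (atomic_exchange(&m->state, 2) != 0)
        fake_futex_wait(&m->state, 2);
}

static void sketch_unlock(sketch_mutex *m)
{
    assert(atomic_load(&m->state) != 0);    /* Must be locked. */
    /* Only enter the kernel if someone may be waiting. */
    if (atomic_exchange(&m->state, 0) == 2)
        fake_futex_wake(&m->state);
}
```

Locking and unlocking an uncontended sketch_mutex leaves syscall_count at zero. The counter only moves when the state has been marked as contended.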

At the very lowest level, ELK consists of start-up code that initializes the processor and provides hooks for context switching and system call handling. Here is an example of the ARM start-up code. Above that, ELK consists of several modules, each of which provides system calls related to its functionality. The system call status page gives a snapshot of the system calls currently implemented by ELK, along with the name of the module that the system call has been (or will be) implemented in. When I started working on adding networking to ELK, the set of modules supported was

  • thread – thread related system calls and data structures.
  • memman – memory management: brk(), mmap(), etc.
  • time – time related calls: clock_gettime(), nanosleep(), etc.
  • vfs – Virtual File System: File systems, device nodes, etc.
  • vm – Virtual Memory.

Some of the modules have multiple variations that can be selected at link time. For example, memman has a very simple form (supporting malloc() only) and a full blown version supporting mmap(), while a version of vm exists for both MMU and non-MMU systems. Much of the functionality of these modules was derived from the cool but seemingly abandoned Prex project.

Looking back at what I’ve written here so far, I’m guessing more than one potential reader is thinking “I thought this post was about adding networking support”. Well, I guess it is about time to get to the point.

I had several options for ELK networking. I could write a network stack myself and spend years debugging it or, like many other components of ELLCC, I could look around for a suitably licensed existing open source alternative. Unlike Prex, I wanted the networking code to show signs of being actively maintained, and I wanted to be able to import it and update it with as little change to the source code as possible. I didn’t want to get into the business of doing ongoing maintenance of a one of a kind network stack. I finally settled on LwIP, which I had heard of over the years but never actually used. LwIP has the right kind of license, and even though the last release was in 2012, it is being actively maintained, as evidenced by this recent CERT advisory. In addition, LwIP was originally designed for small, resource limited systems and is highly configurable.

LwIP consists of the core functionality, which is a single threaded network stack with a low level API built around callback functions for network events. In addition, LwIP provides two higher level APIs. The netconn API provides a multi-threaded interface by making the core functionality a thread and communicating with it via messages. Above that, LwIP also provides a Berkeley socket interface API. For ELK, I decided to use the core and netconn functionality and provide my own socket interface API that integrates fully into the existing ELK thread and vfs modules so that file descriptors and vnode interfaces would be consistent.
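
The layering described above can be sketched in a few lines of C. This is not the LwIP API: raw_conn, seq_conn, core_deliver() and seq_recv() are hypothetical names I made up for the sketch. The idea is only that the low level interface hands data to a callback, and a higher level interface is built by having that callback drop the data into a mailbox that the application reads from later.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MBOX_SLOTS  4
#define MAX_PAYLOAD 32

/* One received chunk of data, queued for the application. */
struct packet {
    size_t len;
    char data[MAX_PAYLOAD];
};

/* Callback type the (hypothetical) core invokes when data arrives. */
typedef void (*recv_fn)(void *arg, const char *data, size_t len);

/* Low level connection: just the callback the core will call. */
struct raw_conn {
    recv_fn on_recv;
    void *arg;
};

/* Higher level connection: a raw connection plus a mailbox. */
struct seq_conn {
    struct raw_conn raw;
    struct packet mbox[MBOX_SLOTS];
    unsigned head, tail;    /* head: next read, tail: next write. */
};

/* Runs in the core's context: copy the data into the mailbox. */
static void seq_on_recv(void *arg, const char *data, size_t len)
{
    struct seq_conn *c = arg;
    if (c->tail - c->head == MBOX_SLOTS || len > MAX_PAYLOAD)
        return;             /* Mailbox full or data too big: drop it. */
    struct packet *p = &c->mbox[c->tail++ % MBOX_SLOTS];
    p->len = len;
    memcpy(p->data, data, len);
}

static void seq_conn_init(struct seq_conn *c)
{
    memset(c, 0, sizeof(*c));
    c->raw.on_recv = seq_on_recv;
    c->raw.arg = c;
}

/* Stand-in for the core delivering a network event. */
static void core_deliver(struct raw_conn *rc, const char *data, size_t len)
{
    assert(rc->on_recv != NULL);
    rc->on_recv(rc->arg, data, len);
}

/* Application side: returns the number of bytes copied, or -1 if nothing
 * is queued. A threaded implementation would block instead. */
static int seq_recv(struct seq_conn *c, char *out, size_t outlen)
{
    if (c->head == c->tail)
        return -1;
    struct packet *p = &c->mbox[c->head++ % MBOX_SLOTS];
    size_t n = p->len < outlen ? p->len : outlen;
    memcpy(out, p->data, n);
    return (int)n;
}
```

In a real multi-threaded setup the mailbox would be protected by a semaphore or similar, and seq_recv() would sleep until the core thread posts something.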

The first step was to get LwIP to compile within the ELK build framework. That was easy: I got the latest Git clone and imported it into the ELK source tree, added the core and netconn sources to ELK’s build rules, and provided a couple of configuration headers and glue source files to tie it all together. Fortunately, LwIP has been ported to Linux, and ELK provides a Linux-like environment, so even the glue files already existed.

I was very curious how much adding networking would add to the size of an ELK program, so I built an ELK example (http://ellcc.org/viewvc/svn/ellcc/trunk/examples/elk/) both with and without networking linked in. A full blown configuration, without networking, and with this main.c:

/* ELK running as a VM enabled OS.
 */
#include <stdlib.h>
#include <stdio.h>

#include "command.h"


int main(int argc, char **argv)
{
#if 0
  // LwIP testing.
  void tcpip_init(void *, void *);
  tcpip_init(0, 0);
#endif
  setprogname("elk");
  printf("%s started. Type \"help\" for a list of commands.\n", getprogname());
  // Enter the kernel command processor.
  do_commands(getprogname());
}

Had a size like this:

[~/ellcc/examples/elk] dev% size elk
   text    data     bss     dec     hex filename
 162696    3364   64240  230300   3839c elk
[~/ellcc/examples/elk] dev%

When I enabled the LwIP initialization, I got

[~/ellcc/examples/elk] dev% size elk
   text    data     bss     dec     hex filename
 367390    4024   68428  439842   6b622 elk
[~/ellcc/examples/elk] dev%

Not bad, considering two things. First, I have just about all the LwIP bells and whistles turned on, and secondly I am compiling with no optimization. Total program size is about 100K smaller with -O3.
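
For anyone who wants the difference spelled out, here is a small sketch that does the arithmetic on the two size listings above. The numbers are copied straight from the listings. The struct and function names are just for this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* One row of `size` output. */
struct size_row {
    long text, data, bss;
};

static long total(struct size_row r)
{
    return r.text + r.data + r.bss;
}

/* Numbers copied from the two `size elk` runs above. */
static const struct size_row without_net = { 162696, 3364, 64240 };
static const struct size_row with_net    = { 367390, 4024, 68428 };

static void print_delta(void)
{
    /* The dec column is simply text + data + bss. */
    assert(total(without_net) == 230300);
    assert(total(with_net) == 439842);
    printf("text +%ld, data +%ld, bss +%ld, total +%ld bytes\n",
           with_net.text - without_net.text,
           with_net.data - without_net.data,
           with_net.bss - without_net.bss,
           total(with_net) - total(without_net));
}
```

Calling print_delta() shows text growing by 204694 bytes, data by 660, and bss by 4188, for a total of 209542 bytes. Nearly all of the growth is code.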

The other cool thing is that, at least at this stage, LwIP is starting with no complaint. I can run the example and see that the networking thread has been started:

[~/ellcc/examples/elk] dev% make run
Running elk
enter 'control-A x' to exit QEMU
audio: Could not init `oss' audio driver
elk started. Type "help" for a list of commands.
elk % ps
Total pages: 8192 (33554432 bytes), Free pages: 8131 (33304576 bytes)
   PID    TID       TADR STATE        PRI NAME       
     0      0 0x800683a0 RUNNING        1 kernel
     0      2 0x8006d150 IDLE           3 [idle0]
     0      3 0x80099000 SLEEPING       1 [kernel]
elk % 

That third thread (TID 3) is the network thread in all its glory. Now to make it do something.

In the next installment of this blog, I’ll describe how the ELK network module handles socket related system calls and interacts with the other ELK modules and the LwIP netconn API.

ELK: Closer to Embedded Linux Without the Linux

In a previous post I gave an update on the development status of ELK, the Embedded Little Kernel. ELK allows you to use the ELLCC tool chain to target bare metal environments. ELK is currently under development, but is available and becoming quite usable for the ARM.

ELK can be configured with a range of functionality, from a very simple “hello world” environment where you take control of everything, to a full MMU enabled virtual memory based system. In all cases, ELK uses the musl C standard library compiled for Linux so ELK can provide a very POSIX-like environment in the bare metal world (i.e. kernel space).

An example of ELK in action can be found in the ELLCC source repository in the ELK example directory. You can configure the example to build four configurations:

  • Running from flash with no MMU.
  • Running from flash with virtual memory enabled.
  • Running from RAM with no MMU.
  • Running from RAM with MMU enabled.

The full ELK source code can be found here. Functionality currently supported by ELK:

  • Threading using pthread_create(). Many other thread synchronization functions are available, like POSIX mutexes and semaphores.
  • Virtual file system support, with a RAM, device, and fifo (pipe) file system supported currently.
  • A simple command processor for debugging and system testing.

ELK works by trapping and emulating Linux system calls. The current state of system call support is available on the system call status page.

ELK Status Update

I previously mentioned ELK, an Embedded Little (or Linux) Kernel, that can be used to do bare metal development with ELLCC. The goal of ELK is to use the musl Linux C standard run-time library to provide a POSIX environment on bare metal, i.e. in kernel space. I’ve been focused primarily on the ARM version of ELK as a proof of concept prototype, but I plan to port ELK to all the targets supported by ELLCC.

ELK is able to use the musl library compiled for Linux because it traps the Linux system calls and implements their functionality, or at least enough of their functionality to be useful in a bare metal environment.

I put together a status page that shows the progress I’ve been making on emulating the Linux system calls in ELK. You can find it here. The source for all of this can be found in the ELLCC source repository. In particular, you can look at the glue that helps make this work for the ARM, the ARM crt1.S file mentioned on the status page.

Introducing ELK: A Bare Metal Environment for ELLCC

ELLCC (pronounced “elk”) is a C/C++ programming environment based on the clang/LLVM compiler. The goal of ELLCC is to provide a complete tool chain and run-time environment for developing C/C++ programs for embedded systems. ELLCC supports several Linux targets today, specifically ARM, MIPS, PowerPC, and x86 systems. ELK (which might mean “embedded little kernel” or “embedded Linux kernel”) is a work-in-progress that allows the ELLCC tool chain to target bare metal environments where an OS is not available or needed.

What will differentiate ELK from other bare metal environments is that the goal is to provide a fairly complete Linux-like environment without the need for actually running Linux. ELK is in the design and development stage right now, but the basic functionality has been implemented based on work done using ELLCC for bare metal development and making the ELLCC tool chain easily configurable.

The thing that makes ELK fairly unique is that it uses the C and C++ libraries compiled for Linux. It does this by handling Linux system calls using the normal system call mechanism for each target. ELK has extensible system call handling capability so that new system call emulators can be added as needed. ELK is being developed by setting up the infrastructure for program start up in the bare metal environment and handling system calls and context switching in a small assembly language file. A fairly functional example for ARM can be seen here. System call support can be added easily. Here is a simple exit() system call example:

/* Handle the exit system call.
 */     
#include <bits/syscall.h>       // For syscall numbers.
#include <stdio.h>
#include <kernel.h>

// Make the exit system call handler a loadable feature.
FEATURE(exit, exit)

static int sys_exit(int status)
{
    printf("The program has exited.\n");
    for( ;; )
      continue;
}       
        
CONSTRUCTOR()   
{       
    // Set up a simple exit system call.
    __set_syscall(SYS_exit, sys_exit);
}       
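
To give a feel for what a registration call like the one above might sit on top of, here is a minimal sketch of a system call table with a fallback for unhandled calls. It is not the ELK implementation. sketch_set_syscall(), sketch_dispatch(), the table size, and the choice of -ENOSYS as the fallback result are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define SKETCH_NSYSCALLS 400    /* Assumed upper bound, not ELK's. */

typedef long (*syscall_fn)(long, long, long, long, long, long);

static syscall_fn syscall_table[SKETCH_NSYSCALLS];

/* Register a handler; returns 0 on success, -1 if nr is out of range. */
static int sketch_set_syscall(int nr, syscall_fn fn)
{
    assert(fn != NULL);
    if (nr < 0 || nr >= SKETCH_NSYSCALLS)
        return -1;
    syscall_table[nr] = fn;
    return 0;
}

/* Called from the trap handler with the call number and arguments. */
static long sketch_dispatch(int nr, long a, long b, long c,
                            long d, long e, long f)
{
    if (nr < 0 || nr >= SKETCH_NSYSCALLS || syscall_table[nr] == NULL) {
        printf("unhandled system call (%d) args: "
               "%ld, %ld, %ld, %ld, %ld, %ld\n", nr, a, b, c, d, e, f);
        return -ENOSYS;         /* Assumed convention for this sketch. */
    }
    return syscall_table[nr](a, b, c, d, e, f);
}

/* Example handler: simply returns its first argument. */
static long sys_echo_sketch(long a, long b, long c, long d, long e, long f)
{
    (void)b; (void)c; (void)d; (void)e; (void)f;
    return a;
}
```

Because handlers only land in the table when something registers them, a program that never links a given handler simply gets the "unhandled" path for that call number.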

Here’s an example of a simple ARM program running on QEMU in system (bare metal) mode:

[~] dev% cat hello.cpp
#include <iostream>

int main()
{
  std::cout << "hello world" << std::endl;
}
[~] dev% ~/ellcc/bin/ecc++ -target arm-elk-engeabi hello.cpp -g
[~] dev% ~/ellcc/bin/qemu-system-arm -M vexpress-a9 -m 128M -nographic -kernel a.out
audio: Could not init `oss' audio driver
unhandled system call (256) args: 1, 268472376, 1610613215, 1610613023, 1207961673, 1241513736
unhandled system call (45) args: 0, 0, 0, 1241512408, 590752, 0
unhandled system call (45) args: 4096, 4111, 1241512408, 1241512336, 592800, 1254352
unhandled system call (192) args: 0, 8192, 3, 34, -1, 0
unhandled system call (45) args: 4096, 4111, 34, 1241512336, 592800, -1
unhandled system call (192) args: 0, 8192, 3, 34, -1, 0
hello world
unhandled system call (248) args: 0, 1207962904, 1309668, 0, 1207991180, 1241513736
The program has exited.

Notice that there are several unhandled system calls (I did say that ELK is a work-in-progress) but enough has been implemented that “hello world” comes out.
The C version is a little quieter:

[~] dev% ~/ellcc/bin/qemu-system-arm -M vexpress-a9 -m 128M -nographic -kernel a.out
audio: Could not init `oss' audio driver
unhandled system call (256) args: 1, 268472376, 1610613215, 1610613023, 1207961673, 1241513736
hello world
unhandled system call (248) args: 0, 0, 1207983756, 0, 1207985256, 1241513736
The program has exited.

The current state of ELK is that only the ARM version is at all functional. It supports multiple threads, memory allocation using malloc, semaphores, and timers. ELK started out as a couple of prototypes to test the feasibility of using Linux libraries on bare metal and to enhance the ability to configure the tool chain for multiple environments. I like the way that the development is going so far and hope to have a complete ARM implementation soon and support for the other targets soon after that. You can browse the complete source code for ELK.

ELLCC Bare Metal ARM Update

An update to the ELLCC bare metal experiment last reported here.

Highlights:

  • Targets the vexpress-a9 Cortex-A9 based evaluation card.
  • SP804 timer supported.
  • PL011 Serial port supported for the system console.
  • Fully interrupt driven operation for the timers and serial port.
  • Vectored interrupts supported using the ARM Generic Interrupt Controller.
  • The scheduler now supports multiple thread priorities, with the number of priorities from 1 to N specified at compile time.
  • Round robin scheduling supported.
  • Tick-less kernel scheduling: no timer interrupts occur unless specifically required for time-slicing.
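
The tick-less idea in the last bullet boils down to one decision: when, if ever, does the timer need to fire next? Here is a hedged sketch of that decision. The struct and field names are hypothetical and the real scheduler certainly tracks more state than this.

```c
#include <assert.h>
#include <stdint.h>

#define NO_DEADLINE UINT64_MAX  /* The timer can stay off. */

struct sched_state {
    uint64_t now;               /* Current time in ns. */
    uint64_t next_wakeup;       /* Earliest sleeper, or NO_DEADLINE. */
    unsigned ready_peers;       /* Ready threads at the running priority. */
    uint64_t slice;             /* Time slice length in ns. */
};

/* Return the absolute time of the next timer interrupt that is actually
 * needed, or NO_DEADLINE if none is. */
static uint64_t next_timer_deadline(const struct sched_state *s)
{
    assert(s->slice > 0);
    uint64_t deadline = s->next_wakeup;

    /* Time-slicing only matters when another thread of the same
     * priority is ready to run. */
    if (s->ready_peers > 0) {
        uint64_t slice_end = s->now + s->slice;
        if (slice_end < deadline)
            deadline = slice_end;
    }
    return deadline;
}
```

With no sleepers and no ready peers the function returns NO_DEADLINE, which is the case where no timer interrupt is programmed at all.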

An example of some tests and commands:

../../bin/qemu-system-arm -M vexpress-a9 -m 128M -nographic -kernel kernel.bin
audio: Could not init `oss' audio driver
kernel started. Type "help" for a list of commands.
kernel % thread1
thread started foo
kernel % thread2
thread2 started
kernel % thread3
unhandled system call (175) args: 1, 1216308, 0, 8, 8, 0
thread3 started
kernel % thread4
thread4 started
kernel % thread5
thread5 started
kernel % thread2 still running
is
unrecognized command: is
kernel % ts
     TID  STATE        PRI NAME       
 0x28d58: RUNNING        1 kernel     
 0x2cd98: IDLE           3 idle0      
 0x2f550: MSGWAIT        1 thread1    
 0x305b0: TIMEOUT        1 thread2    
 0x31670: READY          1 clone1     
 0x326e0: SEMWAIT        1 thread4    
 0x33740: SEMTMO         1 thread5    
kernel % thread5 running
thread2 still running
thread5 running
thread2 still running
thread5 running
thread2 still running
thread5 running
thread2 still running
thread5 running
thread2 still running
thread5 running
thread2 still running
kernel % date
Thu Jan  1 00:00:02 1970
kernel % help date
                date: show/set the system clock.
                      The argument used to set the time, if given,
                      is in the form [MMDDhhmm[[CC]YY][.ss]].
kernel % date 050408272014
kernel % date
Sun May  4 08:27:48 2014
kernel % 
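
The argument format shown by "help date" is easy to get wrong, so here is a sketch of how a string like 050408272014 could be picked apart. This is not the code behind the date command. parse_date_arg() is a hypothetical helper, it does no range checking, and the rule it uses for two digit years is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <time.h>

/* Read two decimal digits at p. The caller has already checked them. */
static int two(const char *p)
{
    return (p[0] - '0') * 10 + (p[1] - '0');
}

/* Parse MMDDhhmm[[CC]YY][.ss] into tm. Fields that are not given are
 * left untouched, so the caller can pre-fill tm with the current time.
 * Returns 0 on success, -1 on a malformed argument. */
static int parse_date_arg(const char *arg, struct tm *tm)
{
    assert(arg != NULL && tm != NULL);
    size_t digits = strspn(arg, "0123456789");
    const char *rest = arg + digits;

    if (digits != 8 && digits != 10 && digits != 12)
        return -1;
    if (*rest == '.') {
        if (!isdigit((unsigned char)rest[1]) ||
            !isdigit((unsigned char)rest[2]) || rest[3] != '\0')
            return -1;
        tm->tm_sec = two(rest + 1);
    } else if (*rest != '\0') {
        return -1;
    }

    tm->tm_mon = two(arg) - 1;  /* struct tm months are 0-based. */
    tm->tm_mday = two(arg + 2);
    tm->tm_hour = two(arg + 4);
    tm->tm_min = two(arg + 6);
    if (digits == 12) {
        tm->tm_year = two(arg + 8) * 100 + two(arg + 10) - 1900;
    } else if (digits == 10) {
        /* Assumption: 69-99 mean 19xx and 00-68 mean 20xx. */
        int yy = two(arg + 8);
        tm->tm_year = yy < 69 ? yy + 100 : yy;
    }
    return 0;
}
```

For the argument used in the session above, the sketch fills in month 4 (May, counting from zero), day 4, 08:27, and year 114 (2014 minus 1900).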

The ARM specific source code is available here and the processor independent code is here.

If you want to try this out at home, everything you need except QEMU is packaged as a binary download from ftp://ellcc.org/pub/. Choose the tarball appropriate for your host system, untar it, go into the ellcc/baremetal/arm directory, and type “make run”.

Make sure you have QEMU installed on your system, as I am not currently able to cross make it for all the hosts.

Bare Metal ARM with musl Gets a Little Friendlier

UPDATE: More info on this little project is here.

So, this thing is getting a little out of hand. “This thing”, in case you haven’t been here before, is my experiment with using a standard Linux C library on a bare metal ARM board. The way that I’m making it work is by writing simple Linux system call handlers to do what the normal Linux system call would do but in a much simplified way. In my last post I described a little about how my test kernel is built and run and how to connect to it with GDB for debugging.

This time I have added some interrupt handlers and implemented a few more system calls. My first attempt at interrupt handling is for the SP804 dual timer to handle the POSIX monotonic and realtime system clocks. After one of my last posts, someone asked what ARM I was targeting. It turned out to be a great question that I had no clue how to answer. There are so many ways to say ARM: arm7, armv7, cortex, … Diving into this project, I had no idea about the nuances at first. It turns out that what I’m trying to target at first is the Cortex-A9, specifically as emulated by QEMU on the vexpress-a9 board.

Ah! you say. Then why the heck are you using the SP804 timer for that? You should be using the 64 bit timers in the private memory region! Indeed I should, and that will be an exercise for another day. The a15 is even cooler, with its virtual timer, but I digress.

In this update, I’ve also added some simple command line processing:

[~/ellcc/baremetal/arm] dev% make run
../../bin/qemu-system-arm -M vexpress-a9 -m 128M -nographic -kernel kernel.bin
audio: Could not init `oss' audio driver
kernel started. Type "help" for a list of commands.
kernel % help
                date: show/set the realtime timer.
                time: time the specified command with arguments.
               sleep: sleep for a time period.
Test Commands:
             syscall: test the syscall interface with an unhandled system call.
               yield: yield the current time slice.
             thread1: start the thread1 test case.
               send1: send a message to the thread1 test thread.
             cancel1: cancel the thread1 test thread.
             thread2: start the thread2 test case.
kernel % time date
Thu Jan  1 00:23:50 1970
elapsed time: 0.007252000 sec
kernel %

Kind of cool, right? Well, just to show that some rudimentary threading is going on:

kernel % thread1
unhandled system call (175) args: 1, 1194164, 0, 8, 8, 0
thread started foo
thread self = 0x00029EB0
kernel % send1
thread running 3
kernel % 

Even better, here’s a test case where the thread periodically sleeps:

kernel % thread2
thread2 started
kernel % thread2 still running
thread2 still running
thread2 still running
thread2 still running

The code for thread2 looks like this:

static void *thread2(void *arg)
{
    printf ("thread2 started\n");
    for ( ;; ) {
        // Go to sleep.
        sleep(10);
        printf ("thread2 still running\n");
    }

    return NULL;
}

Unfortunately it will be running forever. I haven’t implemented pthread_cancel() or pthread_kill() support yet. But the command prompt is still active:

thread2 still running
thread2 still running
thread2 still running
timethread2 still running
 sleep 1
elapsed time: 1.000731000 sec
kernel % 

(I had typed in “time sleep 1”)

As usual, the code is available here. Please don’t look at irq.c. The interrupt controller code is a total hack for now.

UPDATE: If you want to try this out at home, everything you need except QEMU is packaged as a binary download from ftp://ellcc.org/pub/. Choose the tarball appropriate for your host system, untar it, go into the ellcc/baremetal/arm directory, and type “make run”.

Make sure you have QEMU installed on your system, as I am not currently able to cross make it for all the hosts.

Even More Bare Metal ARM

I’ve spent much of the weekend (it is a holiday, right?) playing around with my bare metal prototype. In the last post in my Bare Metal ARM series, I did a little context switching. Actually, it implemented simple co-routines that could give up execution to each other. In this installment, I’ve added a rudimentary ready list and implemented a simple message passing scheme allowing threads to communicate with each other and block if no messages are available.

The full source of my little prototype is here. A little explanation of how I’m developing this might be in order. init.S is the processor initialization and exception handling code, written in assembly because most of it has to be. init.S calls into the C standard library by calling __libc_start_main() after setting up a few stack pointers and zeroing out the uninitialized data (.bss) area. __libc_start_main() ends up calling main() after initializing the library.

The Makefile can be used to build and run the code.

[~/ellcc/baremetal/arm] dev% make
../../bin/ecc -target arm-ellcc-linux-eabi5 -march=armv7 -mfpu=vfp -mfloat-abi=softfp  -c init.S
../../bin/ecc -target arm-ellcc-linux-eabi5 -march=armv7 -mfpu=vfp -mfloat-abi=softfp -g -MD -MP -Werror -Wall -Wno-unused-function -c main.c
../../bin/ecc -target arm-ellcc-linux-eabi5 -march=armv7 -mfpu=vfp -mfloat-abi=softfp -g -MD -MP -Werror -Wall -Wno-unused-function -c simple_console.c
../../bin/ecc -target arm-ellcc-linux-eabi5 -march=armv7 -mfpu=vfp -mfloat-abi=softfp -g -MD -MP -Werror -Wall -Wno-unused-function -c simple_memman.c
../../bin/ecc -target arm-ellcc-linux-eabi5 -march=armv7 -mfpu=vfp -mfloat-abi=softfp -g -MD -MP -Werror -Wall -Wno-unused-function -c scheduler.c
../../bin/ecc -target arm-ellcc-linux-eabi5 -nostartfiles -T kernel.ld \
    ../../libecc/lib/arm/linux/crtbegin.o \
    init.o main.o simple_console.o simple_memman.o scheduler.o \
    ../../libecc/lib/arm/linux/crtend.o \
    -o kernel.elf -Wl,--build-id=none
../../bin/ecc-objcopy -O binary kernel.elf kernel.bin
[~/ellcc/baremetal/arm] dev% make run
../../bin/qemu-system-arm -M vexpress-a9 -m 128M -nographic -kernel kernel.bin
audio: Could not init `oss' audio driver
kernel: hello world
unhandled system call (0) args: 1, 2, 3, 4, 5, 6
__syscall(0) = -1, No error information
hello from context 42 code = 3
hello from context 42 code = 6809
prompt: hi there
got: hi there
prompt: 

To exit out of QEMU use the key sequence Control-A x.

By the way, the sending and receiving of messages (the “hello from context” messages) is done in main.c:

...
Queue queue = {};
static intptr_t context(intptr_t arg1, intptr_t arg2)
{
    for ( ;; ) {
      Message *msg = get_message(&queue);
      printf("hello from context %" PRIdPTR " code = %d\n", arg1, msg->code);
    }
    return 0;
}
...
int main(int argc, char **argv)
{
...
    new_thread(context, 4096, 42, 0);
    Message msg = { { NULL, sizeof(msg) }, 3 };
    send_message(&queue, &msg);
    msg.code = 6809;
    send_message(&queue, &msg);
...
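
Since the excerpt above leaves out the queue itself, here is a rough sketch of how such a queue could work. It is not the code from scheduler.c. I am assuming that sending copies the message (which would explain why the same msg variable can be sent twice with different codes), and the sketch returns NULL on an empty queue where the real thing would block the calling thread. All of the sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct message_header {
    struct message_header *next;    /* Link used while queued. */
    size_t size;                    /* Size of the whole message. */
} MessageHeader;

typedef struct {
    MessageHeader header;
    int code;
} Message;

typedef struct {
    MessageHeader *head, *tail;
} Queue;

/* Queue a private copy of msg. Returns 0, or -1 if out of memory. */
static int sketch_send_message(Queue *q, const void *msg)
{
    const MessageHeader *h = msg;
    assert(h->size >= sizeof(MessageHeader));
    MessageHeader *copy = malloc(h->size);
    if (copy == NULL)
        return -1;
    memcpy(copy, msg, h->size);
    copy->next = NULL;
    if (q->tail != NULL)
        q->tail->next = copy;
    else
        q->head = copy;
    q->tail = copy;
    return 0;
}

/* Remove the oldest message, or return NULL when the queue is empty.
 * The caller owns the result and must free() it. */
static void *sketch_try_get_message(Queue *q)
{
    MessageHeader *h = q->head;
    if (h == NULL)
        return NULL;
    q->head = h->next;
    if (q->head == NULL)
        q->tail = NULL;
    return h;
}
```

Sending a message with code 3 and then the same variable with code 6809 yields the two codes back in that order, matching the output shown earlier.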

I do debugging by using two windows, one to start QEMU in debug mode and the other to run GDB. The first window:

[~/ellcc/baremetal/arm] dev% make debug
../../bin/qemu-system-arm -s -S -M vexpress-a9 -m 128M -nographic -kernel kernel.bin
audio: Could not init `oss' audio driver

The other window looks like:

[~/ellcc/baremetal/arm] dev% ~/ellcc/bin/ecc-gdb kernel.elf
GNU gdb (GDB) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from kernel.elf...done.
(gdb) target remote :1234
Remote debugging using :1234
0x60000000 in ?? ()
(gdb) break main
Breakpoint 1 at 0x10408: file main.c, line 40.
(gdb) c
Continuing.

Breakpoint 1, main (argc=1, argv=0x100e8 ) at main.c:40
40          printf("%s: hello world\n", argv[0]);
(gdb) 

One aspect of this code that might be confusing is that there is no apparent explicit initialization code in main() for the serial port, memory allocator, or scheduler. Since this is implemented with a full C library, I take advantage of the fact that static constructors work and are extensions to C in both clang and GCC. An example is at the bottom of the scheduler.c file:

/* Initialize the scheduler.
 */
static void init(void)
    __attribute__((__constructor__, __used__));
    
static void init(void)
{   
    // Set up the main and idle threads.
    idle_thread.saved_sp = (Context *)&idle_stack[IDLE_STACK];
    __new_context(&idle_thread.saved_sp, idle, Mode_SYS, NULL,
                  0, 0);
    
    // The main thread is what's running right now.
    main_thread.next = &idle_thread;
    ready = &main_thread;
        
    // Set up a simple set_tid_address system call.
    __set_syscall(SYS_set_tid_address, sys_set_tid_address);
}

Why did I do it this way? Because I don’t have to change any source code if I want to swap out simple_console.c for a hypothetical interrupt_console.c, for example. I just have to change the SRCS macro in the Makefile.
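
Here is a stripped down sketch of that swap-a-file idea, using the same constructor attribute. The names console_putc, simple_putc() and console_write() are made up for the sketch and are not taken from simple_console.c. A different back end would provide its own constructor that installs a different function.

```c
#include <assert.h>
#include <stdio.h>

/* The rest of the program only sees this function pointer. */
typedef void (*putc_fn)(int c);
static putc_fn console_putc;

/* This back end just forwards to the C library. */
static void simple_putc(int c)
{
    putchar(c);
}

/* Runs before main() under GCC and clang and installs this back end. */
static void simple_console_init(void)
    __attribute__((__constructor__, __used__));

static void simple_console_init(void)
{
    console_putc = simple_putc;
}

/* Generic code: works with whichever back end registered itself. */
static void console_write(const char *s)
{
    assert(console_putc != NULL);
    while (*s != '\0')
        console_putc(*s++);
}
```

Nothing in main() has to mention the console. By the time main() runs, console_putc already points at whatever back end was linked in.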

The next step is to implement interrupt handlers, the timer, and preemptive scheduling. By the way, the full POSIX pthread_create() is still a ways off, but it feels like it is getting closer all the time.

Using ELLCC to Cross Debug an ARM Application

I’m looking at replacing my Linux port of the NetBSD standard library with musl, another library with a BSD-like license. For the past couple of days I’ve been doing a feasibility study on musl, running it through my regression tests for x86_64 and things look very good.

Today, I decided to test the ARM and the first regression test failed. If you’re not familiar with ELLCC, it is a cross development tool chain that uses ecc (based on clang/LLVM) as the compiler. As part of my regression testing, I compile some of the NetBSD user-land utilities and run them using QEMU. When I ran the test of the program cat it failed:

cat ../../../../../src/bin/cat/testinput | ./cat | cmp ../../../../../src/bin/cat/testinput || exit 1
stdout: Bad file number
cmp: EOF on -

Strange. How could stdout have a bad file number? I simplified the test case and found that

~/ellcc/bin/qemu-arm cat < ../../../../../src/bin/cat/testinput

also failed.

I decided to fire up gdb on the simplified test. To do this, I started QEMU with the option to listen for the debugger on port 1234.

~/ellcc/bin/qemu-arm -g 1234 cat < ../../../../../src/bin/cat/testinput

In another window, I started the debugger:

[~/ellcc/test/obj/musl/linux/bin/cat] main% ~/ellcc/bin/ecc-gdb cat
GNU gdb (GDB) 7.4
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/rich/ellcc/test/obj/musl/linux/bin/cat/cat...done.
(gdb) set arch arm
The target architecture is assumed to be arm
(gdb) target remote :1234
Remote debugging using :1234
[New Remote target]
[Switching to Remote target]
0x0000807c in _start ()
(gdb) break cat.c:294
(gdb) break cat.c:310
Breakpoint 2 at 0x8cec: file ../../../../../src/bin/cat/cat.c, line 310.
(gdb) c
Continuing.

Breakpoint 1, raw_cat (rfd=0) at ../../../../../src/bin/cat/cat.c:294
294 wfd = fileno(stdout);
(gdb) next
295 if (buf == NULL) {
(gdb) print wfd
$1 = 1
(gdb) c
Continuing.

Breakpoint 2, raw_cat (rfd=0) at ../../../../../src/bin/cat/cat.c:310
310 if ((nw = write(wfd, buf + off, (size_t)nr)) < 0)
(gdb) print wfd
$2 = 0
(gdb)

Well, that is certainly a puzzling result! What could have changed wfd? Looking at the source of cat, it looks like the only thing that could have is the call to fstat(). What if the struct stat definition doesn't match what QEMU (or even ARM Linux) thinks it should be? It turns out that it is very possible that the struct stat used is right beneath the wfd variable on the stack.

Let's check that hypothesis. I'll set a breakpoint right at the fstat() call:

(gdb) set arch arm
The target architecture is assumed to be arm
(gdb) target remote :1234
Remote debugging using :1234
[New Remote target]
[Switching to Remote target]
0x0000807c in _start ()
(gdb) break cat.c:298
Breakpoint 1 at 0x8c40: file ../../../../../src/bin/cat/cat.c, line 298.
(gdb) c
Continuing.

Breakpoint 1, raw_cat (rfd=0) at ../../../../../src/bin/cat/cat.c:298
298 if (fstat(wfd, &sbuf) == 0 &&
(gdb) print wfd
$1 = 1
(gdb) next
303 if (buf == NULL) {
(gdb) print wfd
$2 = 0
(gdb) print sbuf
$3 = {st_dev = 10, __st_dev_padding = 0, __st_ino_truncated = 8, st_mode = 8576, st_nlink = 1, st_uid = 500,
st_gid = 5, st_rdev = 34821, __st_rdev_padding = 0, st_size = 0, st_blksize = 0, st_blocks = 1024,
st_atim = {tv_sec = 0, tv_nsec = 0}, st_mtim = {tv_sec = 1338126579, tv_nsec = 0}, st_ctim = {
tv_sec = 1338126579, tv_nsec = 0}, st_ino = 1336216504}
(gdb)

This is interesting. The sbuf structure looks like it is incorrectly set. st_nlink is 1, which is good for stdout. st_uid is 500, which is my user id. st_blksize should be 1024, but that value got moved to st_blocks. st_atime (the file access time) is empty and st_ino should be 8 like __st_ino_truncated. It looks like the struct stat definition used by musl for the ARM is incorrect.

I snooped around a little bit and found the problem. The stat struct was defined as:

struct stat
{
    dev_t st_dev;
    int __st_dev_padding;
    long __st_ino_truncated;
    mode_t st_mode;
    nlink_t st_nlink;
    uid_t st_uid;
    gid_t st_gid;
    dev_t st_rdev;
    int __st_rdev_padding;
    off_t st_size;
    blksize_t st_blksize;

    blkcnt_t st_blocks;
    struct timespec st_atim;
    struct timespec st_mtim;
    struct timespec st_ctim;
    ino_t st_ino;
};

It turned out that some padding was missing. I modified it to be

struct stat
{
    dev_t st_dev;
    int __st_dev_padding;
    long __st_ino_truncated;
    mode_t st_mode;
    nlink_t st_nlink;
    uid_t st_uid;
    gid_t st_gid;
    dev_t st_rdev;
    int __st_rdev_padding[2];
    off_t st_size;
    blksize_t st_blksize;
    int __st_rdev_padding2[1];
    blkcnt_t st_blocks;
    struct timespec st_atim;
    struct timespec st_mtim;
    struct timespec st_ctim;
    ino_t st_ino;
};

and voila! The cat was happy again.

Breakpoint 1, raw_cat (rfd=0) at ../../../../../src/bin/cat/cat.c:298
298 if (fstat(wfd, &sbuf) == 0 &&
(gdb) next
303 if (buf == NULL) {
(gdb) print sbuf
$1 = {st_dev = 10, __st_dev_padding = 0, __st_ino_truncated = 8, st_mode = 8576, st_nlink = 1, st_uid = 500,
st_gid = 5, st_rdev = 34821, __st_rdev_padding = {0, 0}, st_size = 0, st_blksize = 1024,
__st_rdev_padding2 = {0}, st_blocks = 0, st_atim = {tv_sec = 1338132494, tv_nsec = 0}, st_mtim = {
tv_sec = 1338132494, tv_nsec = 0}, st_ctim = {tv_sec = 1336216504, tv_nsec = 0}, st_ino = 8}
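
A quick way to see why the extra padding matters is to print field offsets for both layouts. The sketch below is not the musl header. It uses fixed-width stand-ins for the real types, drops the leading underscores from the padding names, leaves out the timestamp fields, and uses #pragma pack(4) to emulate a compiler that only aligns 64-bit members on 4-byte boundaries. That alignment behavior is my assumption about what went wrong in this build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Emulate 4-byte alignment of 64-bit members (an assumption). */
#pragma pack(push, 4)
struct stat_before {
    uint64_t st_dev;
    int32_t  st_dev_padding;
    int32_t  st_ino_truncated;
    uint32_t st_mode;
    uint32_t st_nlink;
    uint32_t st_uid;
    uint32_t st_gid;
    uint64_t st_rdev;
    int32_t  st_rdev_padding;
    int64_t  st_size;
    int32_t  st_blksize;
    int64_t  st_blocks;
};

struct stat_after {
    uint64_t st_dev;
    int32_t  st_dev_padding;
    int32_t  st_ino_truncated;
    uint32_t st_mode;
    uint32_t st_nlink;
    uint32_t st_uid;
    uint32_t st_gid;
    uint64_t st_rdev;
    int32_t  st_rdev_padding[2];
    int64_t  st_size;
    int32_t  st_blksize;
    int32_t  st_rdev_padding2[1];
    int64_t  st_blocks;
};
#pragma pack(pop)

static void print_offsets(void)
{
    /* The block size slot of the padded layout sits exactly where the
     * unpadded layout expects st_blocks. */
    assert(offsetof(struct stat_after, st_blksize) ==
           offsetof(struct stat_before, st_blocks));
    printf("before: st_size %zu, st_blksize %zu, st_blocks %zu\n",
           offsetof(struct stat_before, st_size),
           offsetof(struct stat_before, st_blksize),
           offsetof(struct stat_before, st_blocks));
    printf("after:  st_size %zu, st_blksize %zu, st_blocks %zu\n",
           offsetof(struct stat_after, st_size),
           offsetof(struct stat_after, st_blksize),
           offsetof(struct stat_after, st_blocks));
}
```

Under those assumptions the unpadded layout puts st_size, st_blksize and st_blocks at offsets 44, 52 and 56, while the padded one puts them at 48, 56 and 64. A block size written at offset 56 would therefore show up in st_blocks of the unpadded struct, which is what the first gdb session showed. The padded struct is also 8 bytes longer over these fields, which fits with fstat() writing past the end of sbuf and clobbering wfd.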