How Compilers Actually Link Your Code

When you write #include <stdio.h>, you are pulling in years of infrastructure that most developers never think about. This post peels back the layers: what libc and glibc actually are, how compilers (GCC, Clang) connect your code to them, and why MCU developers reach for newlib instead.


What Is libc?

libc is the C Standard Library: the implementation of everything standardized in ISO C, such as printf, malloc, fopen, memcpy, strlen, and hundreds more. On Linux the same library usually also carries the POSIX interfaces, such as pthread_create.

Nearly every C program on Linux links against it implicitly. It is the bridge between your code and the operating system kernel.

Your Code (.c)
    │
    ▼
  libc  ──── wraps raw Linux syscalls into portable C functions
    │
    ▼
 Linux Kernel  (syscalls: read, write, mmap, ioctl, …)

Without libc, printing "hello\n" would mean invoking the write syscall yourself, through inline assembly or a hand-written syscall stub, every single time.
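To give a sense of what that involves, here is a minimal sketch. It assumes Linux on x86-64 and GCC/Clang inline assembly; the helper name raw_write is made up for illustration. On any other target the sketch falls back to libc's generic syscall() so that it still compiles.

```c
#define _DEFAULT_SOURCE    /* exposes syscall() under strict -std= modes on glibc */
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall(), used only by the fallback path */

/* Hypothetical helper: write bytes to a file descriptor without stdio. */
long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* rax = syscall number, rdi/rsi/rdx = arguments; the syscall
       instruction itself clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"((long)SYS_write), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;   /* byte count, or a negative errno value on failure */
#else
    /* Fallback for other targets: goes through libc, returns -1 on failure. */
    return syscall(SYS_write, fd, buf, len);
#endif
}
```

Calling raw_write(1, "hello\n", 6) prints the string with no formatting, no buffering, and no errno handling, which is roughly the work libc saves you on every call.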


What Is glibc?

glibc (GNU C Library) is the dominant implementation of libc on Linux desktop and server systems. When people say “libc on Linux,” they almost always mean glibc.

$ ls -la /lib/x86_64-linux-gnu/libc.so.6
... libc.so.6        # This IS glibc (a regular file since glibc 2.34; older releases symlink it to libc-2.xx.so)
$ ldd --version
ldd (Ubuntu GLIBC 2.35-0ubuntu3) 2.35       # The glibc version
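The same version string is available from inside a program. The sketch below assumes glibc, which provides gnu_get_libc_version() in <gnu/libc-version.h>; the wrapper name libc_version is hypothetical, and on other C libraries it only returns a placeholder.

```c
#include <stdio.h>              /* any glibc header defines __GLIBC__ */
#ifdef __GLIBC__
#include <gnu/libc-version.h>   /* gnu_get_libc_version() */
#endif

/* Hypothetical helper: name the C library this program runs against. */
const char *libc_version(void)
{
#ifdef __GLIBC__
    return gnu_get_libc_version();   /* a string such as "2.35" */
#else
    return "unknown (not glibc)";
#endif
}
```

Printing the result with printf("%s\n", libc_version()) should match the version that ldd reports on the same machine.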

glibc provides:

  • The full ISO C standard library (stdio.h, stdlib.h, string.h, …)
  • POSIX extensions (unistd.h, sys/socket.h, pthread.h, …)
  • Linux-specific extensions (sys/ioctl.h, sys/epoll.h, …)
  • The dynamic linker/loader (ld-linux.so) that runs before main()
  • Startup code (crt1.o, crti.o, crtn.o) that calls main()

$ file /bin/ls
/bin/ls: ELF 64-bit LSB pie executable, dynamically linked,
         interpreter /lib64/ld-linux-x86-64.so.2   # ← glibc's dynamic linker

libc Is Not Just glibc

glibc is the most feature-complete but also the heaviest. There are several other implementations:

Library | Use Case                   | Notes
glibc   | Desktop / Server Linux     | Default on Debian, Ubuntu, Fedora
musl    | Embedded Linux, containers | Alpine Linux uses musl; small and strict
newlib  | Bare-metal / RTOS          | Used by arm-none-eabi-gcc toolchains

When you build with arm-none-eabi-gcc for a bare-metal MCU, there is no Linux kernel. The libc is newlib, and low-level calls such as write end up in a stub (_write) that you implement yourself, for example to redirect output to a UART instead of stdout.


newlib — libc for Microcontrollers

When you target a bare-metal MCU (STM32, nRF52, RP2040…) there is no OS, no kernel, and no glibc. The toolchain (arm-none-eabi-gcc) ships with newlib as its C library. Understanding how it works explains a lot of the “mysterious” behavior in embedded C projects.

What newlib provides

newlib implements the same C standard headers you know — stdio.h, string.h, stdlib.h, math.h — but without any OS backing:

Your Firmware (.c)
    │
    ▼
  newlib  ──── pure C: printf, malloc, memcpy, ...
    │
    ▼
 Syscall stubs  (_write, _read, _sbrk, _close, ...)
    │
    ▼
 Your hardware  (UART, Flash, RAM — no kernel!)

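In many vendor-generated projects the stubs are defined as weak symbols. If you do not provide your own, the link either fails with undefined references or picks up a default (for example from libnosys via -specs=nosys.specs) that returns an error or, in the case of _exit, loops forever. If you do provide them, newlib routes all I/O through your implementation.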
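The weak-symbol mechanism itself can be tried on any GCC or Clang target. In this sketch the names board_write and log_bytes are hypothetical stand-ins for a stub and its caller. The weak definition is only a fallback: a normal (strong) definition of board_write in another source file would replace it at link time without any change to the caller.

```c
#include <errno.h>

/* Weak default: used only if no other object file defines board_write.
   Hypothetical name; the real newlib stubs are _write, _read, and so on. */
__attribute__((weak)) int board_write(int fd, const char *buf, int len)
{
    (void)fd;
    (void)buf;
    (void)len;
    errno = ENOSYS;   /* "function not implemented" */
    return -1;
}

/* The caller does not know or care which definition the linker picked. */
int log_bytes(const char *buf, int len)
{
    return board_write(1, buf, len);
}
```

Built on its own, log_bytes() fails with ENOSYS. Adding a second file that defines a non-weak board_write changes that behavior without touching this one.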

Retargeting — implementing the syscall stubs

The most common retargeting task is redirecting printf to a UART. You implement _write:

// retarget.c: redirect stdout to UART
#include <stdint.h>   // uint8_t
#include "main.h"     // assumption: project header that pulls in the vendor HAL (STM32Cube layout)

extern UART_HandleTypeDef huart1;   // assumption: UART handle defined elsewhere in the project

// Called by newlib whenever printf/puts/fwrite needs to output bytes
int _write(int file, char *ptr, int len)
{
    (void)file;  // ignore fd, always send to UART1
    for (int i = 0; i < len; i++) {
        // HAL_UART_Transmit blocks until the byte is sent, so avoid printf in ISRs and time-critical code
        HAL_UART_Transmit(&huart1, (uint8_t *)&ptr[i], 1, HAL_MAX_DELAY);
    }
    return len;
}

Other commonly retargeted stubs:

Stub                   | Called by               | What to implement
_write                 | printf, puts, fwrite    | Send bytes to UART / RTT / semihosting
_read                  | scanf, getchar, fread   | Receive bytes from UART
_sbrk                  | malloc, calloc, realloc | Move the heap break pointer in your RAM
_close, _fstat, _lseek | file operations         | Return -1 / ENOSYS on bare-metal
_exit                  | exit(), abort()         | Infinite loop or NVIC_SystemReset()
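For the file-operation row of the table, the stubs can be as small as the sketch below. The signatures follow the simple int-based form used by common stub collections; treat them as an assumption and check them against the prototypes your toolchain expects.

```c
#include <errno.h>
#include <sys/stat.h>   /* struct stat */

/* There is no file system on this target, so every file operation
   reports "function not implemented". */
int _close(int fd)
{
    (void)fd;
    errno = ENOSYS;
    return -1;
}

int _fstat(int fd, struct stat *st)
{
    (void)fd;
    (void)st;
    errno = ENOSYS;
    return -1;
}

int _lseek(int fd, int offset, int whence)
{
    (void)fd;
    (void)offset;
    (void)whence;
    errno = ENOSYS;
    return -1;
}
```

Some projects instead make _fstat succeed and mark the descriptor as a character device so that stdio treats stdout like a terminal. Either choice is a policy decision, not something newlib mandates.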

_sbrk and the heap

malloc in newlib calls _sbrk to grow the heap. You must tell it where the heap lives:

#include <stddef.h>   // ptrdiff_t
#include <errno.h>    // errno, ENOMEM

// Linker script defines these symbols:
extern char _end;      // end of BSS segment, start of heap
extern char _estack;   // top of RAM, where the stack grows down from

void *_sbrk(ptrdiff_t incr)
{
    static char *heap_end = &_end;
    char *prev = heap_end;

    // Keep the heap out of the top 512 bytes of RAM, which are reserved for the stack
    if (heap_end + incr > &_estack - 512) {
        errno = ENOMEM;
        return (void *)-1;
    }
    heap_end += incr;
    return prev;
}

Without a correct _sbrk, malloc will either corrupt the stack or immediately fail.
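Because _end and _estack only exist in a firmware link, the bump-pointer logic is awkward to test on a desk. One option is a host-side twin that uses a static array as the arena. Everything in this sketch (sim_sbrk, ARENA_SIZE, STACK_RESERVE) is hypothetical, and it adds a guard against shrinking below the start of the heap, which the firmware version above does not have.

```c
#include <errno.h>
#include <stddef.h>

#define ARENA_SIZE    1024   /* stand-in for the RAM between _end and _estack */
#define STACK_RESERVE 512    /* same guard band as the firmware version */

static char arena[ARENA_SIZE];

/* Host-side twin of _sbrk: same bump-pointer logic, but the limits
   come from a static array instead of the linker script. */
void *sim_sbrk(ptrdiff_t incr)
{
    static char *heap_end = arena;
    char *prev = heap_end;
    ptrdiff_t used = heap_end - arena;
    ptrdiff_t limit = ARENA_SIZE - STACK_RESERVE;

    /* Reject growth past the guard band and shrinking below the start. */
    if (incr > limit - used || incr < -used) {
        errno = ENOMEM;
        return (void *)-1;
    }
    heap_end += incr;
    return prev;   /* the old break, i.e. the start of the new region */
}
```

With these sizes, two requests of 100 and 200 bytes succeed and return adjacent regions, while a further request of 400 bytes is refused with ENOMEM because it would cross into the reserved area.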

Semihosting — debug I/O without a UART

Semihosting is an ARM mechanism that lets the MCU tunnel I/O through the debug probe (J-Link, ST-Link) to a host terminal. ARM GCC toolchains ship the support code next to newlib as librdimon:

# Link with semihosting support (ARM GCC):
arm-none-eabi-gcc main.c -specs=rdimon.specs -lrdimon -o fw.elf

// With semihosting, printf goes to the debugger console, so no UART is needed
#include <stdio.h>
int main(void) {
    printf("Hello from Cortex-M via semihosting\n");
}

Semihosting is extremely slow (each character traps into the debug interface) and must be disabled before release. Use it only during development.

newlib-nano — the size-optimized variant

STM32CubeIDE and most ARM GCC distributions ship newlib-nano, a stripped-down variant:

# Use newlib-nano (enabled by default in many STM32Cube projects):
arm-none-eabi-gcc main.c -specs=nano.specs -o fw.elf

Differences from full newlib:

Feature               | newlib   | newlib-nano
printf float support  | Yes      | No (add -u _printf_float)
Code size             | Larger   | ~30-50% smaller
malloc implementation | dlmalloc | simpler, smaller allocator
Wide char / wchar_t   | Full     | Minimal
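If you would rather not pay for float formatting at all, a common workaround is to keep values as scaled integers and print them with %d-style conversions only. The sketch below is one way to do that; format_milli is a hypothetical helper that treats its argument as thousandths.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a value given in thousandths (3141 -> "3.141")
   using integer conversions only, so no float printf support is needed. */
int format_milli(char *out, size_t cap, int32_t milli)
{
    /* Work on the magnitude in a wider type so INT32_MIN does not overflow. */
    int64_t v = milli;
    const char *sign = "";

    if (v < 0) {
        sign = "-";
        v = -v;
    }
    return snprintf(out, cap, "%s%ld.%03ld",
                    sign, (long)(v / 1000), (long)(v % 1000));
}
```

For example, format_milli(buf, sizeof buf, -50) writes "-0.050" into buf. The sign is handled separately because -50 / 1000 is 0 in integer arithmetic, which would otherwise lose the minus.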

Comparison at a glance

Property            | glibc                | musl             | newlib         | newlib-nano
Target              | Linux desktop/server | Embedded Linux   | Bare-metal MCU | Bare-metal MCU (small)
OS required         | Linux kernel         | Linux kernel     | None           | None
Size                | Large                | Small            | Medium         | Small
Float printf        | Yes                  | Yes              | Yes            | Opt-in (-u _printf_float)
Actively maintained | Yes                  | Yes              | Yes            | Yes
Toolchain           | x86-64, AArch64…     | x86-64, AArch64… | arm-none-eabi  | arm-none-eabi

How the compiler connects your code to libc is the part most developers take for granted. When you run:

$ gcc main.c -o main

GCC does far more than compile. It orchestrates the entire toolchain pipeline:

main.c
  │
  ▼ (1) Preprocessor  cpp
  │     Expands #include, #define → main.i
  │
  ▼ (2) Compiler      cc1
  │     Translates C to assembly → main.s
  │
  ▼ (3) Assembler     as
  │     Assembles to object file → main.o
  │
  ▼ (4) Linker        ld  (invoked via collect2)
        Links main.o + crt1.o + libc.so → main (ELF executable)

You can inspect each stage:

$ gcc -E  main.c -o main.i   # Stop after preprocessing
$ gcc -S  main.c -o main.s   # Stop after compiling to assembly
$ gcc -c  main.c -o main.o   # Stop after assembling
$ gcc     main.c -o main     # Full pipeline + link
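A tiny input file makes the stages easier to see. The file name area.c and its contents are made up for this purpose; any small file with a macro in it works just as well.

```c
/* area.c: a small file for watching the stages work */
#define SQUARE(x) ((x) * (x))   /* gone after preprocessing */

int area(int side)
{
    return SQUARE(side);        /* the preprocessor rewrites this to ((side) * (side)) */
}
```

Running gcc -E area.c shows the function body with the macro already expanded, gcc -S area.c produces an area.s containing the function as assembly, and gcc -c area.c leaves an area.o in which nm lists area as a defined symbol, ready for the linker.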

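Clang exposes the same four stages and accepts most GCC command-line options, so it works as a drop-in replacement for typical builds (by default it assembles with its own integrated assembler rather than calling as):

$ clang main.c -o main       # Equivalent result on most systems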

The key difference is internal architecture:

Component  | GCC                           | Clang / LLVM
Front-end  | GCC’s own parser              | Clang (separate, reusable AST)
Middle-end | GIMPLE / RTL passes           | LLVM IR + LLVM passes
Back-end   | Target-specific GCC back-ends | LLVM code generators
Tooling    | Separate tools (sparse, etc.) | clangd, clang-tidy, clang-format built-in

Both ultimately call a linker (ld, or lld in LLVM-based setups), which wires your object files up to the libc shared object.
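Since the two compilers are so interchangeable on the command line, it is sometimes useful to know from inside the code which one built it. A minimal sketch using the predefined macros follows; the function name compiler_name is hypothetical.

```c
/* Hypothetical helper: report which compiler built this translation unit. */
const char *compiler_name(void)
{
#if defined(__clang__)
    return "clang";      /* checked first: Clang also defines __GNUC__ */
#elif defined(__GNUC__)
    return "gcc";
#else
    return "other";
#endif
}
```

The order of the checks matters. Clang defines __GNUC__ for compatibility, so testing __GNUC__ first would report every Clang build as GCC.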

The Hidden -lc Flag

You never type -lc, but GCC adds it automatically:

$ gcc -v main.c -o main 2>&1 | grep "\-lc"
# You'll see something like:
# ... /usr/lib/gcc/x86_64-linux-gnu/11/collect2 ... -lc ...

-lc tells the linker to link against libc.so.6 (glibc). To prevent this:

$ gcc -nostdlib main.c -o main  # No libc, no startup code — you're on your own
$ gcc -nodefaultlibs main.c -o main  # No default libs, but keep startup code

These flags are essential in bare-metal or OS kernel development where there is no libc at all.


What Happens Before main()

This is one of the most misunderstood parts. When you run a Linux program, the kernel doesn’t call main() — it jumps to _start, which is provided by glibc’s CRT (C Runtime):

Kernel loads ELF
    │
    ▼
ld-linux.so  (dynamic linker — resolves shared library symbols)
    │
    ▼
_start       (from crt1.o, provided by glibc)
    │
    ├── Initializes argc, argv, envp
    ├── Calls __libc_start_main()
    │       ├── Sets up stdio buffers
    │       ├── Registers atexit handlers
    │       ├── Runs .init_array constructors (C++ static objects, __attribute__((constructor)))
    │       └── Calls main()
    │
    └── After main() returns → calls exit() → runs .fini_array destructors → _exit syscall
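The constructor step in the sequence above is easy to observe. This sketch relies on the GCC/Clang constructor attribute; the names ctor_ran, early_init, and ran_before_main are made up for the example.

```c
/* Set by the constructor below; starts at 0 like any static int. */
static int ctor_ran;

/* GCC/Clang extension: the function's address is placed in .init_array,
   so the startup code calls it before main(). */
__attribute__((constructor))
static void early_init(void)
{
    ctor_ran = 1;
}

/* Hypothetical helper so main() can check what happened before it ran. */
int ran_before_main(void)
{
    return ctor_ran;
}
```

By the time main() starts, ran_before_main() already returns 1, even though nothing in main() ever called early_init().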

crt1.o, crti.o, and crtn.o are the C runtime objects that GCC links in silently. You can ask GCC where each one lives:

$ gcc --print-file-name=crt1.o
/usr/lib/x86_64-linux-gnu/crt1.o

glibc as a Syscall Wrapper

The Linux kernel exposes more than 300 system calls (syscalls). glibc wraps most of them in ordinary C functions so you don’t have to write inline assembly:

// Conceptually, what glibc does for read() (the real code uses inline-assembly macros, not syscall()):
ssize_t read(int fd, void *buf, size_t count)
{
    // x86-64: syscall number 0 = __NR_read
    return syscall(__NR_read, fd, buf, count);
}
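You can check that a wrapper and the underlying syscall agree by calling both. The sketch below uses glibc's generic syscall() entry point next to the regular getpid() wrapper; pid_via_syscall is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE    /* exposes syscall() under strict -std= modes on glibc */
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* getpid(), syscall() */

/* Hypothetical helper: ask the kernel for the process ID through the
   generic syscall() entry point instead of the getpid() wrapper. */
long pid_via_syscall(void)
{
    return syscall(SYS_getpid);
}
```

Comparing pid_via_syscall() with getpid() in the same process should give the same number, since both end up at the same kernel entry.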

The raw Linux kernel interface uses CPU registers and the syscall instruction:

; x86-64 raw write syscall (no libc)
mov rax, 1          ; syscall number: write
mov rdi, 1          ; fd: stdout
mov rsi, msg        ; buffer pointer
mov rdx, 13         ; length
syscall

glibc makes all of that invisible. It also handles errno: the kernel returns a negative error code, and glibc stores its positive value in the thread-local errno variable and returns -1 to the caller.

// glibc errno handling (simplified):
long ret = raw_syscall(...);
if (ret < 0 && ret >= -4095) {   // only -4095..-1 are error codes
    errno = -ret;   // e.g. kernel returns -ENOENT → errno = ENOENT (2)
    return -1;
}
return ret;
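The result of that conversion is what application code sees. This sketch triggers it on purpose by opening a path that is assumed not to exist; the path and the helper name errno_for_missing_file are made up for the example.

```c
#include <errno.h>
#include <fcntl.h>    /* open(), O_RDONLY */
#include <unistd.h>   /* close() */

/* Hypothetical helper: try to open a path assumed not to exist and
   report the errno value that the libc wrapper left behind. */
int errno_for_missing_file(void)
{
    errno = 0;
    int fd = open("/this/path/should/not/exist", O_RDONLY);
    if (fd >= 0) {        /* the assumption was wrong: clean up */
        close(fd);
        return 0;
    }
    return errno;         /* expected: ENOENT */
}
```

The caller never sees the kernel's negative return value. It sees -1 from open() and a positive ENOENT in errno, as described above.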

This post is licensed under CC BY 4.0 by the author.