
C/C++: What Are the Steps to Compile a Program?

Have you ever wondered how a C/C++ compiler such as gcc converts your lines of C/C++ code into an executable that your Windows, Linux, or macOS machine can run?

In this article, we will go over the four steps that your C/C++ program goes through to be compiled into a file that can be executed on the machine:

4 Steps to Compile a C/C++ Program

Throughout these steps, we will use the following C/C++ program as an example:

/**
 * @file main.c
 * @author freecoder
 * @brief this program prints the sum of the first 100 integers (0 to 99)
 *
 * @version 1.0
 * @date 10 oct. 2021
 *
 * @copyright Copyright (c) 2021
 *
 */

#include <stdio.h>

/* defines internal constants */
#define FIRST_NUMBER ((unsigned int)0)
#define LAST_NUMBER ((unsigned int)100)
#define CONSOLE_MSG "The sum equal to: %d\n"

/* main program entry */
int main(int argc, char **argv)
{
	/* local variables */
	unsigned int uiCtr;
	unsigned int uiSum = 0; /* must start at 0 before accumulating */

	/* loop over the first 100 integers */
	for (uiCtr = FIRST_NUMBER; uiCtr < LAST_NUMBER; uiCtr++)
	{
		/* accumulate the numbers */
		uiSum += uiCtr;
	}

	/* show the sum result message on the console */
	printf(CONSOLE_MSG, uiSum);

	return 0;
}

#Step 1. Preprocessing

The first step, called preprocessing, processes all of the preprocessor directives in the C/C++ program (roughly, everything that starts with the symbol #) and performs the following tasks:

  • Removes all comments from the program
  • Includes the content of header files, including those of third-party libraries (#include "*.h")
  • Expands constants and macros (#define)
  • Activates or deactivates parts of the program with conditional compilation directives (#ifdef, #elif, #else, #endif)

To examine the result of this first step more closely, run gcc on the source file with the -E option:

gcc -E main.c -o main.i

This step ends by producing a *.i file that contains the program after the preprocessing directives have been applied. It looks like this:

➜  Article-X git:(master) ✗ gcc -E main.c -o main.i
➜  Article-X git:(master) ✗ ll
total 24K
-rw-r--r-- 1 root root 758 Nov 20 12:28 main.c
-rw-r--r-- 1 root root 17K Nov 20 12:48 main.i
➜  Article-X git:(master) ✗ cat main.i
# 0 "main.c"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "<command-line>" 2
# 1 "main.c"
# 13 "main.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 27 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/bits/libc-header-start.h" 1 3 4
# 33 "/usr/include/x86_64-linux-gnu/bits/libc-header-start.h" 3 4
# 1 "/usr/include/features.h" 1 3 4
# 461 "/usr/include/features.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 1 3 4
# 452 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/bits/wordsize.h" 1 3 4
# 453 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 2 3 4
# 1 "/usr/include/x86_64-linux-gnu/bits/long-double.h" 1 3 4
# 454 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 2 3 4
# 462 "/usr/include/features.h" 2 3 4
# 485 "/usr/include/features.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/gnu/stubs.h" 1 3 4
# 10 "/usr/include/x86_64-linux-gnu/gnu/stubs.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/gnu/stubs-64.h" 1 3 4
# 11 "/usr/include/x86_64-linux-gnu/gnu/stubs.h" 2 3 4
# 486 "/usr/include/features.h" 2 3 4
# 34 "/usr/include/x86_64-linux-gnu/bits/libc-header-start.h" 2 3 4
# 28 "/usr/include/stdio.h" 2 3 4

Note:

-E main.c : Generate the preprocessed code from the C/C++ program.

-o main.i : Specify the output file.


#Step 2. Compiling

This step is known as the compilation phase. The compiler checks the preprocessed C/C++ program (generated in the previous step) for syntax errors, then translates it into a set of assembly instructions for the target machine (CPU), which work with resources such as registers, memory, and the stack.

Following this step, we’ll have an intermediate file with assembly code.


To look at this assembly file, run gcc with the -S option on the file created in the previous step (main.i):

gcc -S main.i -o main.s

Then you can look at the output from the console:

➜  Article-X git:(master) ✗ ll
total 24K
-rw-r--r-- 1 root root 758 Nov 20 12:28 main.c
-rw-r--r-- 1 root root 17K Nov 20 12:48 main.i
➜  Article-X git:(master) ✗ gcc -S main.i -o main.s
➜  Article-X git:(master) ✗ ll
total 28K
-rw-r--r-- 1 root root 758 Nov 20 12:28 main.c
-rw-r--r-- 1 root root 17K Nov 20 12:48 main.i
-rw-r--r-- 1 root root 677 Nov 20 22:14 main.s
➜  Article-X git:(master) ✗ cat main.s
        .file   "main.c"
        .text
        .section        .rodata
.LC0:
        .string "The sum equal to: %d\n"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        subq    $32, %rsp
        movl    %edi, -20(%rbp)
        movq    %rsi, -32(%rbp)
        movl    $0, -4(%rbp)
        jmp     .L2
.L3:
        movl    -4(%rbp), %eax
        addl    %eax, -8(%rbp)
        addl    $1, -4(%rbp)
.L2:
        cmpl    $99, -4(%rbp)
        jbe     .L3
        movl    -8(%rbp), %eax
        movl    %eax, %esi
        movl    $.LC0, %edi
        movl    $0, %eax
        call    printf
        movl    $0, %eax
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (GNU) 11.1.0"
        .section        .note.GNU-stack,"",@progbits
➜  Article-X git:(master) ✗ 

Note:

-S main.i : Generate the assembly file from the input preprocessed code (main.i).

-o main.s : Specify the output file (main.s).


#Step 3. Assembling 

In this step, the assembler translates all of the assembly instructions from the previous step into low-level machine code, stored in what is known as an object (binary) file. The only exception is calls to functions from external libraries (e.g. printf()), which are left as unresolved references at this level.


To complete this phase, invoke gcc with the -c option (lowercase: the uppercase -C is a preprocessor option that keeps comments and does not stop gcc before the linking step):

gcc -c main.s -o main.o

The result on the console is shown below:

➜  Article-X git:(master) ✗ ll
total 28K
-rw-r--r-- 1 root root 758 Nov 20 12:28 main.c
-rw-r--r-- 1 root root 17K Nov 20 12:48 main.i
-rw-r--r-- 1 root root 677 Nov 20 22:14 main.s
➜  Article-X git:(master) ✗ gcc -C main.s -o main.o
➜  Article-X git:(master) ✗ ll
total 44K
-rw-r--r-- 1 root root 758 Nov 20 12:28 main.c
-rw-r--r-- 1 root root 17K Nov 20 12:48 main.i
-rwxr-xr-x 1 root root 16K Nov 20 22:37 main.o
-rw-r--r-- 1 root root 677 Nov 20 22:14 main.s

Note:

-c main.s : Generate the object file from the input assembly code (main.s).

-o main.o : Specify the output file (main.o).

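The content of the generated file can be inspected with objdump. The listing below contains startup code such as _init and _start next to the program's own code, which only appears in a file that has already been linked; an object file produced with -c contains just the code of main.c: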

➜  Article-X git:(master) ✗ objdump -d main.o      
main.o:     file format elf64-x86-64
Disassembly of section .init:
0000000000401000 <_init>:
  401000:       48 83 ec 08             sub    $0x8,%rsp
  401004:       48 8b 05 ed 2f 00 00    mov    0x2fed(%rip),%rax        # 403ff8 <__gmon_start__>
  40100b:       48 85 c0                test   %rax,%rax
  40100e:       74 02                   je     401012 <_init+0x12>
  401010:       ff d0                   callq  *%rax
  401012:       48 83 c4 08             add    $0x8,%rsp
  401016:       c3                      retq   
Disassembly of section .plt:
0000000000401020 <.plt>:
  401020:       ff 35 e2 2f 00 00       pushq  0x2fe2(%rip)        # 404008 <_GLOBAL_OFFSET_TABLE_+0x8>
  401026:       ff 25 e4 2f 00 00       jmpq   *0x2fe4(%rip)        # 404010 <_GLOBAL_OFFSET_TABLE_+0x10>
  40102c:       0f 1f 40 00             nopl   0x0(%rax)
0000000000401030 <printf@plt>:
  401030:       ff 25 e2 2f 00 00       jmpq   *0x2fe2(%rip)        # 404018 <printf@GLIBC_2.2.5>
  401036:       68 00 00 00 00          pushq  $0x0
  40103b:       e9 e0 ff ff ff          jmpq   401020 <.plt>
Disassembly of section .text:
0000000000401040 <_start>:
  401040:       31 ed                   xor    %ebp,%ebp
  401042:       49 89 d1                mov    %rdx,%r9
  401045:       5e                      pop    %rsi
  401046:       48 89 e2                mov    %rsp,%rdx
  401049:       48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
  40104d:       50                      push   %rax
  40104e:       54                      push   %rsp
  40104f:       49 c7 c0 d0 11 40 00    mov    $0x4011d0,%r8
  401056:       48 c7 c1 70 11 40 00    mov    $0x401170,%rcx
  40105d:       48 c7 c7 26 11 40 00    mov    $0x401126,%rdi
  401064:       ff 15 86 2f 00 00       callq  *0x2f86(%rip)        # 403ff0 <__libc_start_main@GLIBC_2.2.5>
  40106a:       f4                      hlt    
  40106b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
....

#Step 4. Linking

In this final step, the linker takes the object file (main.o) generated in the previous phase and resolves its references to external library functions (e.g. printf()) by linking them with their actual definitions.

The executable can be produced with the gcc command below, which runs all four steps, including linking, in one go from the source file:

gcc main.c -o main

The result on the console is shown below:

➜  Article-X git:(master) ✗ gcc main.c -o main
➜  Article-X git:(master) ✗ ll
total 60K
-rwxr-xr-x 1 root root 16K Nov 20 23:20 main
-rw-r--r-- 1 root root 758 Nov 20 12:28 main.c
-rw-r--r-- 1 root root 17K Nov 20 12:48 main.i
-rwxr-xr-x 1 root root 16K Nov 20 22:37 main.o
-rw-r--r-- 1 root root 677 Nov 20 22:14 main.s
➜  Article-X git:(master) ✗ size main
   text    data     bss     dec     hex filename
   1202     560       8    1770     6ea main

Note:

gcc main.c : Generate the executable file from the input C/C++ code (main.c).

-o main : Specify the output executable file (main).


Note that, by default, gcc links external library functions dynamically rather than statically; static linking can be requested with the -static option.

Conclusion

I hope this short article sheds some light on the dark side of C/C++ compilers and all the gibberish behind them.

I'll let you test this and put it into practice. From now on, if your program fails to build, you should be able to tell which of the four steps failed 🙂

@freecoder

I have been working as an embedded developer for over 15 years and I am very passionate about what I do.
My goal is to write good, clean code that is easy to maintain and extend. I believe that code should be well-tested, readable, and concise.
