Introduction to C
Shameless plug
This course is given to you for free by the Malcore team: https://m4lc.io/course/c/register
Consider registering, and using Malcore, so we can continue to provide free content for the entire community. You can also join our Discord server here: https://m4lc.io/course/c/discord
We offer free threat intel in our Discord via our custom designed Discord bot. Join the Discord to discuss this course in further detail or to ask questions.
You can also support us by buying us a coffee
In a sentence: C is a powerful general-purpose compiled programming language.
In a more descriptive way: C is a powerful compiled programming language that was developed by Dennis Ritchie in 1972 at Bell Labs. It was originally designed for systems programming, specifically operating systems. Over time C became one of the most widely used languages in existence and was used for a lot more than just operating system development. C is known for simplicity, efficiency, and control over system resources. It is a popular choice for applications that require high performance or low-level interaction. There are of course issues with C, and a lot of people consider it a "broken language"; however, we will not get into that debate here.
Issues and benefits of C
It is good for you to understand the issues in C as well as the benefits. We will first start with the issues:
Manual memory management and no garbage collection. You must manage memory yourself in your programs; it is not done for you as it is in languages such as Python.
Lack of object orientation. C has no built-in OOP support, meaning it has no classes.
There is no built-in exception handling. C relies on return codes and manual checks by the caller.
C's low-level access can introduce security issues such as buffer overflows and stack overflows. These issues can be exploited if the code is not carefully written.
These issues give a good picture of what is wrong with the C language and what can go wrong for the developer. However, there are of course benefits of C, such as:
Efficiency and speed. C is highly efficient with minimal runtime overhead. Performance-critical applications are ideal to write in C due to its direct memory access and low-level operation abilities.
C is very portable. It can be ported across multiple platforms easily.
Simplicity. Despite how powerful C is, it is simple. There is a small number of keywords and the syntax is pretty straightforward.
Extensiveness. There is a very large ecosystem of C libraries covering almost any task you can imagine. C has been around long enough that most common problems already have a library.
Now that you know what C is, we should get into the basics of syntax and data types. We will go through a breakdown of the syntax:
Comments:
Single line comment example:
Multi line comment example:
Data types:
int: integer values, e.g. int x = 5;
float: floating point values, e.g. float pi = 3.14;
double: double floating point values, e.g. double pi = 3.14159;
char: single character, e.g. char letter = 'A';
void: signifies no value or empty set of values.
Variables:
Variables must always be declared before use. For example:
It is also possible to initialize a variable when it is declared.
Constants:
You can declare a constant using the const keyword, e.g. const int AGE = 33;
You can also declare them using #define, e.g. #define PI 3.14
Operators:
Arithmetic operators:
+ : addition
- : subtraction
* : multiplication
/ : division
% : modulus
Comparison operators:
== : equal to
!= : not equal to
> : greater than
< : less than
>= : greater than or equal to
<= : less than or equal to
Logical operators:
&& : logical AND
|| : logical OR
! : logical NOT
Arrays:
Arrays allow you to store multiple values of the same type, e.g. int numbers[5] = {1, 2, 3, 4, 5};
Structures (structs):
A structure (or struct) is a grouping of different data types. Example:
Enumerations (enums):
An enumeration (enum) is a way to assign names to integer values. Example:
Modifiers:
C offers modifiers that can be applied to basic data types. They change the range and size of the type.
signed: can store both positive and negative integers, e.g. signed int num = -5;
unsigned: can store only non-negative values (zero and positive), e.g. unsigned int num = 15;
short: typically reduces the storage size, e.g. short int num = 50;
long: typically increases the storage size, e.g. long int num = 100000;
Typedef:
Used to give a new name to an existing type
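Size and range of basic data types:
| Type | Size (bytes) | Signed range | Unsigned range | Notes |
| --- | --- | --- | --- | --- |
| char | 1 | -128 - 127 | 0 - 255 | char can be signed or unsigned depending on the compiler |
| int | 4 | -2,147,483,648 - 2,147,483,647 | 0 - 4,294,967,295 | n/a |
| float | 4 | 3.4E-38 - 3.4E+38 | n/a | n/a |
| double | 8 | 1.7E-308 - 1.7E+308 | n/a | n/a |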
Control flow is the order in which statements, instructions, or functions are executed or evaluated. There are several control flow options in C.
Types of control flow:
Conditional statements: execute different code based on certain conditions.
Loops: repeat a block of code multiple times (or forever).
Jump statements: transfer control unconditionally to another part of the program.
Conditional statements:
if/else if/else statement
This statement allows execution of a block of code if a specific condition is true. You may also use an else if to test additional conditions. Otherwise, the alternative block of code in the else is executed.
Switch statements
Execute one block of code among several depending on the value of an expression. Switch statements can have a default case, which is executed if none of the case values match the expression.
Loops:
A for loop repeats a block of code a known number of times.
A while loop is used to repeat code while a condition evaluates to true.
A do-while loop is similar to a while loop. However, the condition is checked AFTER the loop body has already executed. This guarantees that the loop will run at least once.
Jump statements:
break: exits a loop (or switch) early, regardless of the loop condition.
continue: skips the rest of the current iteration and moves to the next one.
goto: jumps to a predefined label within the same function.
NOTE: goto is usually frowned upon because it can make the code harder to follow. However, do what you want. It's your code. It is also worth noting that goto is useful for breaking out of deeply nested loops.
Return statements:
Used to exit a function and, optionally, return a value to the caller.
Summary of control flows cheat sheet:
| Construct | Type | Description | Syntax |
| --- | --- | --- | --- |
| if | Conditional Statement | Executes a block of code if the condition is true. | if (condition) { // code } |
| if-else | Conditional Statement | Executes one block of code if the condition is true, another if the condition is false. | if (condition) { // code } else { // code } |
| else-if | Conditional Statement | Tests multiple conditions sequentially. | if (condition1) { // code } else if (condition2) { // code } else { // code } |
| switch | Conditional Statement | Selects one block of code to execute from multiple options based on a variable’s value. | switch (variable) { case value1: // code; break; ... default: // code } |
| for | Loop | Repeats a block of code a specific number of times. | for (initialization; condition; increment) { // code } |
| while | Loop | Repeats a block of code while a condition is true. | while (condition) { // code } |
| do-while | Loop | Executes a block of code at least once and repeats while the condition is true. | do { // code } while (condition); |
| break | Jump Statement | Exits from a loop or switch statement immediately. | break; |
| continue | Jump Statement | Skips the current iteration of a loop and continues with the next iteration. | continue; |
| goto | Jump Statement | Transfers control to a labeled part of the code. Use is discouraged. | goto label; ... label: // code |
| return | Function Control | Exits a function and optionally returns a value to the calling function. | return value; |
A function in C is a block of code that can be called multiple times throughout the program. Functions are usually designed to do one thing and do it well. This allows for easily readable code that provides better maintainability. It allows you to write a chunk of code one time and use it over and over again.
When declaring a function in C you use a prototype. You must declare a function before use. This is typically done before the main function of the program. The declaration informs the compiler about the name, return type, and parameters of the function.
The syntax of a function declaration is: return_type function_name(parameters);
Function definitions specify the actual code that is executed when the function is called. This is the set of instructions that make up the function's body.
Once the function has been declared and defined you can call it by using its name and passing the required arguments.
It is entirely possible to create a function that returns nothing. To do so, give it the void return type.
Recursion is when a function calls itself. This can be useful for problems that break down into smaller versions of themselves.
Notice how the function is called again from within its own body.
Key concept cheat sheet of functions:
| Element | Description |
| --- | --- |
| Function Declaration | Specifies the function's name, return type, and parameters. |
| Function Definition | Contains the code (body) that performs the function's task. |
| Function Call | Invokes the function, passing the necessary arguments to the function. |
| Return Type | Indicates the type of value the function returns (e.g. int, float, or void if there is no return value). |
| Parameters | Inputs to the function, passed when calling the function. |
| Void Function | A function that performs an action but does not return any value. |
| Recursion | A function that calls itself for repetitive tasks. |
A pointer is a variable that stores the memory address of another variable. Instead of holding the value of the variable it "points to" the memory address where the value is stored. This provides a powerful and efficient way to manipulate memory and is essential for things like dynamic memory allocation, arrays, and function calls.
Declaring a pointer
To declare a pointer you specify the data type of the variable it points to, followed by an asterisk (*) and finally the pointer name.
For example, in the declaration int *a; the *a indicates that the variable a is a pointer to an integer.
Initially the pointer will contain an arbitrary memory address and should be assigned a valid address before use.
Initialize a pointer
To initialize a pointer you assign it the address of another variable using the address-of operator (&). For example, given int x = 12; and int *p = &x;
The x variable stores the integer 12.
The p variable stores the memory address of the x variable, but not its value.
Dereference a pointer
Dereferencing a pointer means accessing the value stored at the memory address it holds, using the * operator.
Null pointers
A null pointer is a pointer that points to nothing. When you declare a pointer that does not point to anything yet, it is good practice to initialize it to NULL.
This indicates that the pointer is not pointing to anything valid yet.
Arrays
In most expressions an array's name is converted to a pointer to the first element in the array. You can use pointers to iterate through the array. This means you don't need the address-of operator (&) to get a pointer to the start of an array.
Pointer arithmetic
Pointer arithmetic is when you navigate through memory locations that are contiguous (such as arrays) using a pointer.
You can increment and decrement a pointer in order to move to the next value.
You can also add or subtract an integer to move the pointer by that many elements.
Using pointers with functions can allow you to modify a variable or array directly without returning a value.
Pointerception
C allows you to have pointers to pointers. It's like pointerception, but less cool and more confusing.
Downsides and upsides of pointers
There are of course downsides and upsides of pointers. We will start with the downsides:
Pointers can become extremely complex and confusing. They can make your code extremely hard to read, understand, and debug.
Incorrect use of pointers can lead to memory leaks, segmentation faults, and security vulnerabilities.
Now the upsides of pointers:
Pointers are very efficient at managing memory and allow dynamic memory allocation.
They provide direct memory access that allows you to manipulate data at very low levels.
Pointer cheat sheet
Key concepts to remember (cheat sheet):
| Pointer Concept | Description | Example |
| --- | --- | --- |
| Pointer Declaration | Declares a pointer to store the address of a variable. | int *p; |
| Pointer Initialization | Initializes a pointer by assigning the address of a variable. | p = &x; |
| Dereferencing | Accesses the value stored at the memory address a pointer points to. | *p |
| Null Pointer | A pointer that points to nothing. | int *p = NULL; |
| Pointer Arithmetic | Performing arithmetic operations on pointers (incrementing, decrementing, etc.). | p++; |
| Pointer to Array | Points to the first element of an array, allowing easy array traversal. | int *p = arr; |
| Pointers and Functions | Passes pointers to functions, allowing them to modify variables or arrays. | void increment(int *p); |
| Pointer to Pointer | A pointer that points to another pointer. | int **pp = &p; |
How not to use a pointer
As a bonus, and as mentioned earlier, pointers can get extremely complex. So we will show you an example of a complex pointer and why you need to be careful when creating pointers and using them in your code. The code below should never be used in production, or in anything else. Read the comments to understand exactly what is going on:
Memory allocation is the process of allocating and managing memory during the execution of a program. There are two main types of memory allocations in C:
Static Memory Allocation: this memory is allocated at compilation time.
Dynamic Memory Allocation: this memory is allocated dynamically at runtime using pointers.
There are several functions to perform memory management. They are:
malloc(): allocates memory dynamically.
Benefits:
Allows dynamic memory allocation, making it possible to allocate only the memory you need.
Allows flexibility by providing the ability to allocate memory of any size.
Does not have any initialization overhead.
Issues:
The memory is not initialized and contains garbage values. Reading it before it has been initialized leads to undefined behavior.
Risks memory leaks if you don't free() the allocation.
Even though it's faster than calloc(), you have to initialize the memory manually, which adds complexity.
calloc(): allocates memory for an array and initializes it to zero.
Benefits:
Initializes all memory to zero.
Is designed for arrays, which makes it easier to understand.
Issues:
Performance overhead because the memory is initialized to zero.
More complicated to use because it requires two arguments.
Over-allocation can waste memory or cause crashes.
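realloc(): resizes previously allocated memory.
Benefits:
Allows you to resize a memory allocation dynamically as long as that memory is already allocated.
Will preserve existing data on resize (up to the smaller of the old and new sizes).
Issues:
May not be able to expand the current block in place, in which case it moves the existing data to a new block somewhere else.
Has the potential for data loss. If it fails it returns NULL and leaves the original block untouched, so assigning the result straight back to the original pointer loses the only reference to that block.
Can have performance overhead when it has to move the memory to another location.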
free(): frees dynamically allocated memory.
Benefits:
Can help prevent memory leaks.
Is a necessity for memory management in C.
Issues:
If you free memory and continue to use the pointer without resetting it to NULL, it becomes a dangling pointer. Dereferencing such a pointer is undefined behavior.
If you call free on the same pointer more than once, this results in a double free, which may lead to undefined behavior, crashes, and/or vulnerabilities.
If you forget to free dynamically allocated memory, this will result in memory leaks.
Static memory allocation
When you allocate memory statically, the lifetime and size of the variables are determined at compilation time. The memory for these variables is set up when the program starts and released when it ends. Variables declared outside of functions, or declared with the static keyword, fall into this category. Arrays with a fixed size declared in this way also fall into this category.
Dynamic memory allocation
When you allocate memory dynamically you are doing so with pointers at runtime. This allows more flexible memory allocation because you can allocate memory based on the needs of the program.
You can use calloc to allocate the memory for an array.
You can also use realloc to change the size of an already allocated memory section.
You always have to call free once you are done with memory you have allocated.
Memory segments
A C program's memory is divided into multiple segments.
Stack:
Local variables and function call data.
Automatically allocated and deallocated.
Limited; using too much leads to a stack overflow.
Heap:
Dynamically allocated memory comes from here (malloc, calloc, realloc).
Heap memory is managed manually.
There is no automatic deallocation, which leads to memory leaks if free() isn't used.
Global/Static memory:
Variables declared outside of any function and variables using the static keyword.
The size is fixed at compile time and the memory exists for the whole run of the program.
Possible issues with memory
Memory leaks
A memory leak is when dynamically allocated memory is not freed after it is no longer needed.
Dangling pointers
A dangling pointer is a pointer that still points to memory that has already been freed.
Double free
A double free is when free() is used more than once on the same pointer.
Wild pointers
A wild pointer is an uninitialized pointer, meaning a pointer that is used before it has been assigned a valid address.
Memory allocation cheatsheet
| Function | Purpose | Initialization | Usage Example |
| --- | --- | --- | --- |
| malloc() | Allocates memory dynamically. | No (contains garbage values). | int *p = (int*) malloc(10 * sizeof(int)); |
| calloc() | Allocates memory for an array and initializes to zero. | Yes (initialized to 0). | int *p = (int*) calloc(10, sizeof(int)); |
| realloc() | Resizes previously allocated memory. | Preserves existing data. | p = (int*) realloc(p, 20 * sizeof(int)); |
| free() | Frees dynamically allocated memory. | N/A | free(p); |
C provides preprocessor directives and macros that are processed before compilation begins. These allow developers to include files, define constants, and create conditionally compiled code. This improves code organization, readability, and efficiency.
Preprocessor directives
These are lines in the code that start with #. They are instructions to the preprocessor, which runs before the compiler. Common types include:
#include
Include a source file or header file. Header files are either part of the standard C library or user defined.
#define
Define a macro or constant.
#undef
Undefine a macro. This macro must have already been defined using #define.
#ifdef / #ifndef
Conditional compilation based on whether a macro is defined or not. These are useful for platform specific code/debugging.
#if / #elif / #else / #endif
Conditional compilation based on constant expressions. Allows finer control based on the values of macros created with #define. Works like an if/else statement.
#pragma
Special compiler instructions. Behavior varies between compilers; it is used mostly for optimization, warnings, or other compiler specific features.
Common issues with macros
Some common issues encountered with macros are:
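Lack of type safety:
Macros are plain text substitution, so the preprocessor does not check the types of their arguments.
Unexpected side effects:
An argument may be evaluated more than once, and missing parentheses can change the order of operations.
Debugging difficulty:
Since macros are replaced before compilation, debugging issues related to macros can become extremely challenging.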
Directives and macros cheat sheet
| Directive/Macro | Description |
| --- | --- |
| #include | Includes external files (header files). |
| #define | Defines macros or constants. |
| #undef | Undefines a previously defined macro. |
| #ifdef / #ifndef | Conditional inclusion based on whether a macro is defined. |
| #if / #elif / #else / #endif | Conditional compilation based on specific conditions. |
| #pragma | Provides special instructions to the compiler (compiler-specific). |
| Macros | Preprocessor instructions that replace text or expressions in the code. |
Debugging and error handling are crucial for all developers, not just C developers. However, since C has no high-level features like exceptions, developers need to be able to handle errors manually and make effective use of debugging tools/strategies.
Debugging in C
Debugging is the process of identifying and fixing bugs/errors in a program. There are a lot of debugging tools, and you should use what you want. We will not be going into all of them in this course but will be focusing on the basic examples. Some common techniques include:
Print statements
This is by far the simplest method. You put print statements where you think the bug is and see what data is being passed to it. This allows you to print potential problem values and variables.
GDB debugger
The GNU debugger is a powerful tool that allows you to inspect your C code execution, set breakpoints, step through the code, examine variable values, and track the cause of runtime errors.
Compiler warnings and errors
Modern C compilers provide warnings that may catch potential issues at compile time, such as unused variables, incorrect types, possible zero division, etc. You should always enable compiler warnings when compiling your program.
Static analysis tools
Tools like lint analyze source code for possible errors, suspicious constructs, and coding standard violations. These tools are useful for detecting problems before compiling.
Error handling
Error handling is done manually in C. It is done by checking the return values of functions for error codes and handling each case. C lacks built-in error handling like try/catch blocks.
Checking return values:
Many standard library functions return special values when they fail, such as NULL, -1, or EOF.
Using errno for error reporting
errno is a global error indicator declared in errno.h. It stores the error code when certain standard library functions fail. This may provide more detailed information.
Error handling cheat sheet
| Aspect | Description |
| --- | --- |
| Print Statements | Simple debugging method by printing variable values and program state. |
| GDB (Debugger) | Allows stepping through code, setting breakpoints, and inspecting program state. |
| Compiler Warnings | Enable warnings to catch potential issues at compile-time (-Wall). |
| Static Analysis Tools | Tools like lint help detect bugs and improve code quality. |
| Return Value Checking | Manually check function return values to detect errors (e.g., fopen()). |
| errno and strerror() | Provides detailed error reporting for certain standard library functions. |
Now that we've been through everything you need to know to start writing C code, let's go ahead and write a program together. The objective of this program will be to do the following:
Dynamically allocate memory for an array of integers
Fill the array with user input
Calculate the sum of the array elements
Handle possible errors
Use macros to make the code cleaner
Use print statements for debugging
Use simple logging for error capturing
The code
Compiling the file
Now you will need to save and compile this file like so:
What did we just do? Well, we used the gcc compiler to compile the C file into an executable that we can run once compilation is finished. We used the -Wall flag to enable a broad set of compiler warnings during compilation.
Now you have finished the course! We hope you got enough out of this to start writing in C and be confident in what you're doing. We have taught you the bare basics of C and how it works. This course is designed to provide beginners the information they need in order to start writing C code. Please keep in mind:
This course is given to you for free by the Malcore team: https://m4lc.io/course/c/register