So You Want to Write Malware?
Shameless plug
This course is given to you for free by the Malcore team: https://m4lc.io/course/mlwr/register
Consider registering, and using Malcore, so we can continue to provide free content for the entire community. You can also join our Discord server here: https://m4lc.io/course/mlwr/discord
We offer free threat intel in our Discord via our custom designed Discord bot. Join the Discord to discuss this course in further detail or to ask questions.
You can also support us by buying us a coffee
NOTE: This course assumes you know the basics of C, assembly, and Python.
Malware is short for malicious software. It is any software that is intentionally designed to damage, exploit, or disrupt systems, networks, or devices. Malware can take many forms and serve a variety of purposes, such as stealing data, hijacking system resources, or enabling unauthorized access. Typically, malware is installed on the system secretly and is often delivered through deceptive techniques like phishing, infected software, or exploits.
There are many different types of malware, and we created a table of as many as we could think of. We will not go into all of these types, but it is good to know them. It's worth noting that there are other types of malware that are not on this list. We are sticking with the ones that are well known and do not have controversy surrounding them.
| Type | Description | Examples |
|---|---|---|
| Worm | Spreads independently without the need for a host file or user interaction | |
| Trojan | Malicious software disguised as a legitimate program to trick users into running it | |
| Ransomware | Encrypts files or locks users out of the system and demands a ransom to restore access to the files/system | |
| Rootkit | A type of malware that provides admin access to a system while hiding its presence | |
| Bootkit | A type of rootkit that infects the MBR (master boot record), which allows the malware to be loaded before the OS starts | |
| Botnet | A collection of compromised computers/devices (zombies) remotely controlled by an attacker. Typically used for DDoS attacks | |
| Keylogger | Records all the keystrokes that a user makes | |
| Stealer | Focuses on stealing certain information off a user's system, such as passwords, credit cards, or personal information | |
| Polymorphic | Changes its code or appearance after spreading, preventing detection by signature-based detectors | |
| Metamorphic | Like polymorphic, but changes the entire code base each iteration | |
| RAT | Remote access trojan, or remote access tool. Used to provide the attacker with complete control of the system. Typically comes with a keylogger | |
| Logic bomb | Malicious code that lies dormant until triggered by a specific condition | |
| Wiper | Arguably the most destructive on this list. Designed to completely destroy/corrupt the system | |
| Dropper | Typically a small initial stage of a larger attack. Designed to covertly install other malware onto the system | |
It is important to note that I cannot legally tell you to go write malware and encourage you to deploy it and watch what it does. Writing malware comes with significant ethical and legal considerations. Being curious and wanting to learn more is natural and is why you are here reading this Bible, but it is essential to channel that curiosity responsibly in a way that does not infringe on other people or cause them harm.
Your pursuit of knowledge is admirable, and you should never apologize for wanting to fully understand the information you encounter. When approached correctly, studying and writing malware will provide valuable insights and foster immense growth. It is crucial to do this in a controlled environment.
If you get anything from this course, let it be this: curiosity is not a crime, but acting irresponsibly has serious consequences. There are several ethical reasons people learn how to write malware:
- Understanding how it works
  - Provides deeper technical knowledge of the inner workings of malware
  - Provides a better understanding of how to reverse engineer malware successfully
- Building better defenses
  - By understanding the techniques used by attackers, defenders can better defend
  - Can provide better insight into how an attack occurred and make patching easier
- Research and development
  - Test defenses by researching new types of attacks
  - Provide better threat hunting by knowing what the malware does
Payloads are small pieces of code that execute specific actions once deployed on the target. They are usually (ethically) used in the context of penetration testing, or malware research.
When writing your payloads you always want to first define the purpose of the payload:
- Reverse shell: provides remote access to the system
- Data exfil: extracts information from the target
- Keylogging: records every keystroke made by the user of the target system
- System manipulation: alters system settings, changes/deletes files
In this example we will write a basic reverse shell in C. The shell connects from the target system back to our listener and gives us remote access to the target:
The code
Code breakdown:
Winsock
Windows uses Winsock for networking. To use networking on Windows, you first need to initialize Winsock.
Socket creation
A socket is created using the `socket` function, specifying an IPv4 socket (`AF_INET`) that uses TCP (`SOCK_STREAM`).
Connection
The `connect` function is used to connect to the provided IP address. You will need to replace the IP and port with the address and port your listener is using.
Duplicating streams
The `STARTUPINFO` structure is used to redirect the standard input, output, and error streams to the socket. This allows the attacker to send commands and receive the output of those commands over the network.
Spawning cmd.exe
The `CreateProcess` function starts the `cmd.exe` process, which is the shell the attacker will use.
Waiting for the process
`WaitForSingleObject` waits for the `cmd.exe` process to finish, which keeps the connection alive.
Cleanup
After the shell terminates, the socket is closed and Winsock is cleaned up using `WSACleanup()`.
Compiling the code:
You will need to compile this code on Windows. The command to compile the code is:
The `-lws2_32` flag links the required Winsock library.
Set up the listener:
To set up a basic listener all you need is netcat, which is installed by default on many Linux systems. All you need to run is:
Flags:
- `-l`: Listen for incoming connections
- `-v`: Produce more verbose output
- `-p`: Specify the listening port
Example of reverse shell working:
Malware usually attempts to evade detection so that it remains hidden and continues its process on the compromised system. Evading detection is crucial for most malware to succeed and achieve its objectives. In this section we will take our reverse shell and make it harder to detect by using obfuscation techniques.
String obfuscation
A simple technique to make detection harder is to obfuscate the strings you are using. As an example, we can XOR-encode the `cmd.exe` string and decode it at runtime:
This simple string obfuscation technique means the string is no longer readable in the binary file, which makes it harder to search for indicator strings such as `cmd.exe`.
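To make the XOR idea concrete from the analysis side, here is a minimal Python sketch. It is not the course's original code: the key `0x5A`, the helper names, and the sample `blob` are all made up for illustration. It shows that XOR with a single-byte key is its own inverse, that a plain string search misses the encoded value, and that trying all 255 non-zero keys recovers it.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def find_xored_marker(blob: bytes, marker: bytes) -> list[tuple[int, int]]:
    """Try every non-zero one-byte key and report (key, offset) for each hit."""
    hits = []
    for key in range(1, 256):
        offset = xor_bytes(blob, key).find(marker)
        if offset != -1:
            hits.append((key, offset))
    return hits


KEY = 0x5A  # hypothetical key chosen for this illustration
encoded = xor_bytes(b"cmd.exe", KEY)
blob = b"\x00\x01HEADER" + encoded + b"\xff\xfeTRAILER"  # stand-in for a binary

print(b"cmd.exe" in blob)                   # a plain search misses the string
print(find_xored_marker(blob, b"cmd.exe"))  # brute force recovers key and offset
```

Because the key space is so small, single-byte XOR mainly defeats naive string searches rather than a determined analyst.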
Control flow obfuscation
Much like string obfuscation, we can also add control flow obfuscation. Adding useless or "dead" code to the logic of the program can make it much harder to analyze. This is used to disguise the malicious intent of the malware and make it seem like it's doing something else.
NOTE: It is important to mention that maldevs should not rely on known obfuscation techniques. It would behoove you to come up with your own obfuscation techniques. We added these as an example.
Dynamic import loading
Dynamic import loading (or dynamic API resolution) is a way to load imports without having to statically import them. This hides API calls and prevents static analysis of those API calls. This can be done with functions like `GetProcAddress`.
The above code calls `GetProcAddress` to load the API functions instead of calling them directly. This makes it harder to statically analyze these functions within the file itself.
Delaying execution time
Sandboxes usually run on a set timeframe and run the code quickly to gather the information so that you don't have to wait. Sometimes malware uses sleep timers or obfuscated sleep functions to delay their execution and prevent the analysis from seeing their code run.
These timers will prevent the code from executing for 1 minute each.
Environment detection
Malware will often try to detect the environment it is being run in. If the environment isn't favorable, it stops executing, which prevents it from being dynamically analyzed within a sandbox.
This code uses the Windows function `GetTickCount()`, which returns the number of milliseconds since the system was started. Since sandboxes are usually based on virtual environments and are started on a "per malware" basis, we can check how long the system has been up and stop our execution if the uptime is not favorable for us.
Polymorphic payloads
Changing the code on each execution is called polymorphism. This prevents the payload from being analyzed easily by static analysis tools.
NOTE: Polymorphic malware is rare. We are adding a simple polymorphic engine for clarity; it will most likely not work.
Each time this is executed the payload will mutate. This makes it appear different in memory and on disk, allowing it to evade static analysis.
Packing
Packing an executable means compressing it and altering its file structure. This makes it more difficult to analyze with static analysis and usually requires the file to be unpacked first. You can pack using open-source tools like `UPX` or more comprehensive tools like `VMProtect`. These change how the files run and usually unpack the file in memory at runtime and execute it from there.
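Because packed data is compressed, one rough way to spot it during analysis is to measure byte entropy. The sketch below is an illustration and not part of the original course. It uses `zlib` as a stand-in for a real packer, generates its own sample data, and the `7.0` cutoff is an assumed rule of thumb rather than a standard value.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Bits of entropy per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


# Build repetitive, text-like sample data, then compress it as a stand-in for packing.
rng = random.Random(0)
words = [b"push", b"call", b"mov", b"ret", b"jmp", b"lea", b"xor", b"nop"]
plain = b" ".join(rng.choice(words) for _ in range(20000))
packed = zlib.compress(plain, 9)

PACKED_THRESHOLD = 7.0  # assumed rule-of-thumb cutoff, not a standard value

for name, data in (("plain", plain), ("packed", packed)):
    h = shannon_entropy(data)
    print(f"{name}: {h:.2f} bits/byte, looks packed: {h > PACKED_THRESHOLD}")
```

High entropy alone is only a hint, since legitimately compressed or encrypted resources also score high.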
So that we can visualize packing we have taken the liberty to upload the files to Malcore so that you can see how the files behave differently when packed.
Unpacked version: https://m4lc.io/course/unpacked/revshell
Packed version: https://m4lc.io/course/packed/revshell.
Notice the difference between the dynamic analysis of the two:
This difference is present because the file is unpacking itself before it is being run. The first image is the unpacked file, the second image is the packed file. As you can see the packed version first starts to dynamically resolve imports instead of statically loading them. This makes it harder for static analyzers to determine what is being imported by the file.
Anti debugging
Detecting the presence of debuggers allows malware to prevent the program from being debugged by debugger tools. It will alter its behavior if the debugger is detected.
You can also overwrite the `int3` instructions shown in assembly with another instruction. This prevents breakpoints and provides more obfuscation in the assembly.
Another way to detect if the malware is being debugged is by using assembly and checking:
This code checks the PEB (process environment block) for the `BeingDebugged` flag. If the flag is not zero, a debugger is present.
Detecting virtual machines
A lot of malware samples attempt to detect if they are within a virtual machine. They will then modify their behavior if a VM is detected.
The above code checks the CPU's identification string for a value that indicates a VMware virtual machine, and exits if one is detected.
Process injection
Process injection is when malicious code is injected into legitimate processes like `explorer.exe` or `svchost.exe`. This helps hide the malicious intent of the malware by executing within a trusted process.
In this example the shellcode is injected into the trusted process and executed from the trusted process.
Process hollowing
Process hollowing is a technique that malware uses where it suspends a legitimate process, replaces its code with malicious code, and resumes the execution with the malicious code. This allows the malware to seem like it is a legitimate process.
In the above code example we are suspending a legitimate process, replacing its code with our payload, and resuming the thread. This disguises our payload as the legitimate process.
A lot of malware uses something called a c2 (command and control). This is a critical component of many malware families, particularly in APTs (advanced persistent threats), botnets, and other forms of remote access malware. The c2 acts as the central point where an attacker can control and communicate with the malware on infected machines. This infrastructure allows the attacker to issue commands to the malware remotely and orchestrate the activities that the malware will carry out.
What does a c2 usually do?
Infection and callback
Once the malware has infected the host it will perform a callback to the c2 to establish a connection and let the attacker know that the system has been compromised. This is also called `beaconing`.
Usually the malware will have an embedded IP address (it is also possible to have an encrypted config within the malware), domain, or URL that is the contact point of the c2. In sophisticated attacks, the malware may use a DGA (domain generation algorithm) that will create random domains to make it harder to block or track c2 communications.
Persistence
The malware will maintain a persistent connection with the c2 to send data, receive commands, or send periodic "pings" to let the c2 know it's still alive. It may even be a mixture of all of these.
Command execution
A c2 is also used to send commands to the malware remotely. These commands may include:
Additional payload downloading.
Exfiltrating data from the system.
Receiving and executing commands on the system.
Launching attacks on other targets from the infected system.
Data exfiltration
Malware c2 servers are commonly used to exfiltrate data off of the infected system.
Keyloggers may use a c2 to save the logged keys.
Ransomware may use a c2 to exfiltrate data that is later used to blackmail and extort the ransomed company.
Stealers use a c2 to exfiltrate compromised credentials and bank info.
Updates
Sometimes (this is a rare occurrence) the malware may use the c2 to update itself with new functionality. In some sophisticated attacks the malware is modular, which allows the attackers to dynamically load "plugins" or modules into the malware from the c2 depending on the attack phase.
Obfuscation/evasion
A c2 will normally have an encrypted communication channel set up to make it more difficult for prying eyes to determine what the c2 communication is doing.
Redirection and proxying
Fast flux is a technique used by some malware to frequently change the IP address of the associated domain name. It can resolve to a massive pool of IP addresses and those IP addresses act as a temporary proxy to the actual c2 infrastructure.
A c2 server may use a reverse proxy (also known as a jump box). This can be a compromised system that acts as a relay to the original server.
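From the defender's side, the periodic "pings" described under Persistence can give a c2 channel away, because machine-generated check-ins tend to be far more regular than human-driven traffic. The following is a minimal sketch of that idea and not a production detector. The function name, the sample timestamps, and the `0.1` cutoff are assumptions made for illustration.

```python
from statistics import mean, pstdev


def beacon_score(timestamps: list[float]) -> float:
    """Coefficient of variation of the gaps between connections (lower = more regular)."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return float("inf")  # not enough data to judge regularity
    return pstdev(gaps) / mean(gaps)


# Hypothetical connection times (in seconds) from one host to one destination.
regular = [0, 60, 120.5, 180, 239.8, 300]  # roughly every 60 seconds
bursty = [0, 5, 200, 230, 900, 905]        # irregular, human-like activity

BEACON_CUTOFF = 0.1  # assumed threshold; real traffic needs tuning

for name, series in (("regular", regular), ("bursty", bursty)):
    score = beacon_score(series)
    print(f"{name}: score={score:.3f}, beacon-like: {score < BEACON_CUTOFF}")
```

Malware that adds random jitter to its check-in interval raises this score, so a simple cutoff like this is only a first-pass filter.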
Possible c2 protocols
HTTP(s)
Commonly used to hide traffic with normal web traffic. Malware will send GET/POST requests to the c2.
DNS tunneling
Some malware will encode commands or data in a DNS query. This may go unnoticed due to DNS traffic not usually being closely monitored.
Custom protocols
Some more sophisticated malware families will create proprietary protocols to hide their c2 connections.
It is worth noting that this is very hard to accomplish and should not be attempted if you do not know what you are doing.
P2P (peer-to-peer)
Botnets have been known to use P2P communication. This is where each infected system can serve as a c2 if needed, which makes it harder to take down the c2 infrastructure.
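The DNS tunneling entry above notes that encoded data in queries can go unnoticed when DNS is not monitored. As a hedged illustration of what basic monitoring could look like, the sketch below flags query names whose longest label is both unusually long and high in entropy. The function names, the thresholds (`40` characters, `3.5` bits), and the sample queries are assumptions, not values from the course.

```python
import math
from collections import Counter


def label_entropy(label: str) -> float:
    """Shannon entropy of a DNS label in bits per character."""
    if not label:
        return 0.0
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in Counter(label).values())


def looks_like_tunnel(qname: str, max_len: int = 40, min_entropy: float = 3.5) -> bool:
    """Flag names whose longest label is both unusually long and high in entropy."""
    labels = [part for part in qname.rstrip(".").split(".") if part]
    if not labels:
        return False
    longest = max(labels, key=len)
    return len(longest) > max_len and label_entropy(longest) > min_entropy


queries = [
    "www.example.com",
    # Made-up label standing in for an encoded data chunk.
    "abcdefghijklmnopqrstuvwxyz234567abcdefghijklmnop.t.example.com",
]
for q in queries:
    print(q, looks_like_tunnel(q))
```

Some legitimate services also use long, random-looking labels, so a heuristic like this would need allow-listing before being trusted.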
Basic c2 example
As you have read, a c2 is designed to be the "communication hub" through which the malware receives commands and instructions on what to do next. Below is a basic example of a c2 written in Python:
Delivering malware is the process of getting the malicious programs onto the victims’ systems and getting them to execute it. Over the years criminals and attackers have developed more complex delivery mechanisms to spread their malware successfully. Some of the methods are targeted and sophisticated, while some of them are just dumb luck. The best delivery method always depends on the goal of the attacker, the type of malware, and the security posture of the target.
A breakdown of some of the most effective delivery methods that are commonly used is below:
Phishing emails
Phishing is still one of the best delivery methods. This method tricks the recipient into clicking links and downloading malware onto their systems. There are multiple different types of phishing attacks.
Spear phishing is a more targeted form of phishing in which the attacker crafts the email specifically for the target person or organization.
Emotet malware often spread through phishing emails containing malicious Word documents.
Malicious websites
Attackers have been known to compromise legitimate websites or create realistic fake websites to deliver malware to visitors. This is usually effective due to users trusting the sites they visit.
Drive-by downloads occur when a user visits a malicious or compromised site that exploits a vulnerability, causing the malware to be downloaded without the user's knowledge.
Watering holes are attacks that compromise websites that are frequently visited by specific groups. Once the user visits the site, malware is delivered to them.
The Angler exploit kit used drive-by-downloads to exploit known browser vulnerabilities and delivered ransomware, stealers, and trojans.
Malvertising
This is the process of injecting malicious advertisements into legitimate advertising networks. When the user clicks the ad, they are redirected to the attacker's website.
Attackers just buy ad space in the ad network and deliver their malicious ads through it to millions of viewers.
A lot of cryptocurrency based malware (designed to use your CPU to mine cryptocurrency) uses malvertising.
Supply chain attacks
A supply chain attack is when an attacker compromises the vendor of a product and silently adds their malware into the product release.
It's worth noting that attacks of this magnitude are rare but devastating.
SolarWinds is the best-known example of a supply chain attack. The attackers inserted malware dubbed "Sunburst" into a SolarWinds product release.
There are plenty more types of delivery methods, but these are the most useful and well-known delivery methods.
That's all there is to this course. We hope it has given you some useful information about malware development and how to avoid detection. The course is designed to teach the basics of malware development and evasion and to show how they work. We hope you got something out of it, and remember:
This course is given to you for free by the Malcore team: https://m4lc.io/course/mlwr/register