Understanding Syscalls: Direct, Indirect, and Cobalt Strike Implementation
- To bypass user-mode hooks.
- To hide code inside a legitimate process (process injection).
- To avoid EDR alerts.
User-mode functions are hooked by placing a jump to another code section at the start of the function. EDRs use these hooks to inspect the function parameters. For example, changing the memory protection of a data region to make it executable is a very suspicious activity, so EDRs will alert on it. Most hooks are placed at the lowest level of the user-mode interface: the system call stubs in ntdll.dll.
Windows has a defined scheme for how syscalls are used. Most of the documented Windows APIs are just wrappers around lower-level functions in ntdll.dll, which boil down to a syscall with the right SSN (System Service Number). To see how this works, look at how the Nt* version of a higher-level API is implemented.
The stub executes a syscall instruction. This instruction transfers execution to the system call handler in the kernel. The handler is selected by a predefined SSN loaded into the EAX register (in this case, EAX = 0x26).
So, to make a syscall, we need the SSN associated with the function.
The code stub of a syscall is simple.
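To make the stub concrete, here is a minimal sketch that only builds and inspects the bytes of a bare x64 stub (mov r10, rcx; mov eax, SSN; syscall; ret). It does not execute anything. The helper names build_stub and read_ssn are made up for illustration, and real ntdll stubs may contain additional instructions between loading EAX and the syscall.

```python
import struct

def build_stub(ssn: int) -> bytes:
    """Return the bytes of a minimal x64 syscall stub for the given SSN."""
    return (
        b"\x4c\x8b\xd1"                      # mov r10, rcx
        + b"\xb8" + struct.pack("<I", ssn)   # mov eax, <ssn>
        + b"\x0f\x05"                        # syscall
        + b"\xc3"                            # ret
    )

def read_ssn(stub: bytes):
    """Return the SSN if the stub starts with the expected prologue, else None."""
    if stub[:4] != b"\x4c\x8b\xd1\xb8":
        return None  # the prologue was overwritten (for example by a hook)
    return struct.unpack_from("<I", stub, 4)[0]

stub = build_stub(0x26)
print(stub.hex(" "))
print(hex(read_ssn(stub)))
```

Reading the four bytes after the B8 opcode is the same idea the SSN extraction techniques below rely on.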
Now, the missing piece is the syscall number. These numbers change with the Windows build version, and there are several techniques to get them.
SysWhispers generates a table of these numbers in the form of a header file and an assembly file that can be embedded in the code. The generated code contains the syscall numbers for multiple Windows versions, and the right build version is detected at runtime using the PEB structure.
The assembly code generated (Full document at example-output)
- SSN code stub
This technique doesn't look for the SSN itself. Instead, it gets the whole code stub of the required API. This can be done by opening the PE file and parsing the export table of ntdll.dll.
- Extract SSN
This technique extracts the SSN from ntdll by parsing the export table. The difference from the previous one is that it extracts only the syscall number. Both methods first load ntdll.dll from disk using the Win32 API OpenFile, which might itself be hooked. See Hell's Gate for more.
- Syscall number sequence
This method takes advantage of the fact that SSNs are sequential: if a syscall number is 0x26, the next one will be 0x27, and so on. It also relies on the fact that not all system calls are hooked. So, to get the SSN of a hooked function, you find the nearest unhooked syscall and adjust its SSN by the distance between the two. This was presented by Halo's Gate. However, it is not valid on newer versions of Windows, as the SSN sequence no longer holds.
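The neighbour search can be sketched on a simulated memory region. This is only an illustration under stated assumptions: stubs are laid out back to back at a fixed 32-byte spacing, a hook is modelled as a 5-byte jmp rel32 over the start of the stub, and the names make_stub, ssn_at and neighbour_ssn are hypothetical.

```python
import struct

STUB_SIZE = 32                   # assumed spacing between neighbouring stubs
PROLOGUE = b"\x4c\x8b\xd1\xb8"   # mov r10, rcx ; mov eax, imm32

def make_stub(ssn, hooked=False):
    body = PROLOGUE + struct.pack("<I", ssn) + b"\x0f\x05\xc3"
    if hooked:
        body = b"\xe9\x00\x00\x00\x00" + body[5:]  # jmp rel32 overwrites the start
    return body.ljust(STUB_SIZE, b"\xcc")

def ssn_at(mem, offset):
    """Read the SSN of an unhooked stub at offset, or return None."""
    in_range = 0 <= offset <= len(mem) - STUB_SIZE
    if in_range and mem[offset:offset + 4] == PROLOGUE:
        return struct.unpack_from("<I", mem, offset + 4)[0]
    return None

def neighbour_ssn(mem, offset, max_distance=16):
    """Derive the SSN of a (possibly hooked) stub from its nearest clean neighbour."""
    direct = ssn_at(mem, offset)
    if direct is not None:
        return direct
    for distance in range(1, max_distance + 1):
        before = ssn_at(mem, offset - distance * STUB_SIZE)
        if before is not None:
            return before + distance
        after = ssn_at(mem, offset + distance * STUB_SIZE)
        if after is not None:
            return after - distance
    return None

# Stubs for SSNs 0x24..0x28, with 0x24 and 0x26 hooked.
mem = b"".join(make_stub(n, hooked=n in (0x24, 0x26)) for n in range(0x24, 0x29))
print(hex(neighbour_ssn(mem, 2 * STUB_SIZE)))
```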
- Parallel loading
This is an interesting technique explained in this blog. It uses a feature introduced in Windows 10 that loads DLLs through multiple threads instead of the single thread used in older versions of Windows. It was found that the syscall stubs of some native functions, such as ZwMapViewOfFile(), are copied into the LdrpThunkSignature array (a lot happens between the two actions; see the previously mentioned blog for a detailed explanation). This is done to check the integrity of the functions' code. The syscall numbers of these APIs can be used to load a fresh copy of ntdll.dll from disk and avoid any user-mode hooks.
- Sorting by system call address
This technique uses the relation between the address of the system call stub and the SSN. It is known as FreshyCalls. In simple words, it walks the Export Address Table of ntdll and saves the name (or a hash of the name) and the address of each entry in a table. Then it sorts the entries by address in ascending order. It was found that the first function by address, NtAccessCheck, has SSN = 0, and if we disassemble the next function, which starts one byte after the end of the previous one (as the ret opcode is one byte), we find that its SSN is 1!
So, by sorting the functions by address, we get the SSNs. For the code, look at the MDSec blog (8. Sorting by System Call Address) or see the FreshyCalls implementation.
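A minimal sketch of the idea, assuming the export names and addresses have already been collected into a dictionary. The addresses below are made up for illustration, and the function name ssn_by_address is hypothetical.

```python
def ssn_by_address(exports):
    """Map each Zw* export to its SSN by sorting the stubs by address.

    exports: dict of export name -> stub address.
    """
    zw = {name: addr for name, addr in exports.items() if name.startswith("Zw")}
    ordered = sorted(zw, key=zw.get)           # ascending by address
    return {name: index for index, name in enumerate(ordered)}

# Made-up addresses; only their relative order matters.
exports = {
    "ZwAcceptConnectPort": 0x1040,
    "ZwAccessCheck": 0x1000,
    "RtlInitUnicodeString": 0x0800,            # not a syscall stub, ignored
    "ZwWorkerFactoryWorkerReady": 0x1020,
}
print(ssn_by_address(exports))
```

Because only the ordering is used, the result does not depend on the stub bytes themselves, which is why this approach is unaffected by hooks that overwrite the start of a stub.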
The system call is not executed directly with its own syscall instruction. Instead, it uses the method explained below. Briefly, it uses the syscall instructions from ntdll.
All the methods described are workarounds to get the system call number without getting caught. However, executing the syscall instruction directly still reveals that some suspicious activity is going on. This can be detected using KPROCESS!InstrumentationCallback in Windows.
Any time Windows is done with a syscall and returns to user mode, it checks this member. If it is not NULL, execution is transferred to that pointer. To check whether the syscall is legitimate, the return address after the syscall finishes is checked to see whether it comes from a valid place. If the address is inside the image of the running binary, it is not a legitimate place to make a syscall. This check was used by ScyllaHide to detect manual syscalls, and the source code can be found here.
It checks the return address of the successful system call. If it resides in the address space of the binary we are running, that is an indication of a manual system call.
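The check itself boils down to a range comparison. The sketch below is only an illustration of that logic, not ScyllaHide's code; the function name and the example addresses are assumptions.

```python
def is_manual_syscall(return_address, image_base, image_size):
    """True if the syscall returned into the main image instead of ntdll."""
    return image_base <= return_address < image_base + image_size

# Hypothetical layout: a 0x20000-byte image at 0x140000000.
IMAGE_BASE, IMAGE_SIZE = 0x140000000, 0x20000
print(is_manual_syscall(0x140001234, IMAGE_BASE, IMAGE_SIZE))    # inside the image
print(is_manual_syscall(0x7FFB12340000, IMAGE_BASE, IMAGE_SIZE)) # somewhere else
```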
The solution to this detection method comes from the Bouncy Gate and Recycled Gate methods. The idea is quite simple: it is an adjusted version of Hell's Gate. Instead of directly executing the syscall instruction and getting caught by static signatures and the system call callbacks described above, the author replaces the syscall instruction with a trampoline jump (JMP) to the address of a syscall instruction in ntdll.dll. Now there is no direct syscall instruction, and the system call originates from a legitimate place, ntdll. This is also implemented in SysWhispers3. To get the address of the syscall instruction in ntdll, we can parse the export table and search for the syscall; ret opcodes 0F 05 C3, or use the constant layout of the syscall stubs in ntdll. If the function is not hooked, the syscall instruction is at offset 0x12 from the function's address, and we can verify that by comparing the opcodes.
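The opcode search can be sketched on a byte buffer. The sample stub below follows the layout commonly seen on recent Windows 10 builds (an extra test and jne before the syscall), which is an assumption here; find_syscall_offset is a hypothetical helper name, and nothing is executed.

```python
SYSCALL_RET = b"\x0f\x05\xc3"   # syscall ; ret
EXPECTED_OFFSET = 0x12

def find_syscall_offset(stub: bytes):
    """Return the offset of the syscall; ret sequence in a stub, or None."""
    if stub[EXPECTED_OFFSET:EXPECTED_OFFSET + 3] == SYSCALL_RET:
        return EXPECTED_OFFSET           # fast path: the usual layout
    index = stub.find(SYSCALL_RET)       # fallback: scan the stub
    return index if index != -1 else None

sample_stub = bytes.fromhex(
    "4c8bd1"            # mov r10, rcx
    "b826000000"        # mov eax, 0x26
    "f604250803fe7f01"  # test byte ptr [0x7FFE0308], 1
    "7503"              # jne +3
    "0f05"              # syscall
    "c3"                # ret
    "cd2e"              # int 0x2e
    "c3"                # ret
)
print(hex(find_syscall_offset(sample_stub)))
```

A trampoline would then jump to the stub address plus this offset after loading R10 and EAX itself. The sequence stays at the same offset even when the first bytes of the stub are overwritten by a hook.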
Indirect syscalls in Cobalt Strike
The sample comes from Dodo's blog, where he already analyzed how indirect syscalls are implemented in Cobalt Strike. For easy access, here are the UnpacMe results: 020b20098f808301cad6025fe7e2f93fa9f3d0cc5d3d0190f27cf0cd374bcf04. The sample is packed, but the unpacking process is easy: just put a breakpoint on VirtualProtect and get the base address (first argument).
sub_18001B6B0 contains the important part: the SSN retrieval and system call execution methods. You can get to this function by following a call instruction through rax, which holds a value loaded from a qword memory location, or a call through the qword directly. These locations are populated in this function with the addresses of the required APIs.
We can see multiple calls to sub_18001A73C with these arguments: qword_*, a hash (such as 0B12B7A69h), a variable passed to the function sub_18001A7F4, and another allocated memory region that is also passed to that function. sub_18001A73C resolves the function address (the syscall stub address) by the hash, and sub_18001A7F4 populates the list with the system call SSN and system call stub. So, sub_18001A7F4 is our target. The following picture shows the beginning of the function.
The function starts by getting a pointer to the first entry in the InLoadOrderModuleList structure by reading the Process Environment Block (PEB). Here in the picture, r10 holds the current entry of the structure and r9 acts as a variable that walks each entry. Comparing the two is the breaking condition of the loop, since the _LIST_ENTRY structure wraps around on itself (it is a circular doubly linked list).
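The termination condition is easier to see in a small model. In this sketch, plain Python objects stand in for _LIST_ENTRY nodes, the class and function names are made up, and the module names are only illustrative.

```python
class ListEntry:
    """Stand-in for a _LIST_ENTRY node; a fresh node points at itself."""
    def __init__(self, name=None):
        self.name = name
        self.flink = self
        self.blink = self

def insert_tail(head, entry):
    last = head.blink
    last.flink = entry
    entry.blink = last
    entry.flink = head
    head.blink = entry

def walk(head):
    """Follow flink pointers until the walk arrives back at the list head."""
    names = []
    current = head.flink
    while current is not head:       # the list wraps around, so this ends
        names.append(current.name)
        current = current.flink
    return names

head = ListEntry()
for module in ("sample.exe", "ntdll.dll", "kernel32.dll"):
    insert_tail(head, ListEntry(module))
print(walk(head))
```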
The next step is to get the export directory of ntdll.dll, but first it needs the address of ntdll in memory.
It looks for the right module in InLoadOrderModuleList by going through each entry. The flink is a pointer to an LDR_DATA_TABLE_ENTRY, from which we can get a pointer to the module. The module is then parsed (by going through the PE file headers) to get the name of the DLL, which resides in the export directory, the first entry of the IMAGE_DATA_DIRECTORY array. The name is then tested to see if it is the target module.
If the module is ntdll, it saves a pointer to AddressOfNameOrdinals. A memory region of size 0x1f40 is then zeroed, as it will hold the structures with the system call information needed.
The next part checks for the function prefix Zw. It also looks for a single function prefixed by Ki, with the hash 8DCD4499h, but I couldn't find a function with this hash (using a debugger). Then a call to a hashing function is made. The hashing function is simple. It uses 0x52964EE9 as the initial key value, then:
- Get 2 bytes of the function name (little endian).
- Rotate the key by 8 bits (2 hex characters).
- Add the rotated key and the 2 bytes of the name.
- Increment the counter by 1. As a result, every character between the first and the last is used twice in the calculation: for example, wZ in the first iteration, Ow in the second, and so on.
- XOR the result of the addition with the key to produce the new key. The hash value returned is the result of the last XOR operation.
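The steps above can be sketched as follows. Several details are assumptions, since the text does not state them: the rotation is taken to be a right rotation, arithmetic wraps at 32 bits, and the final 2-byte read includes the string's null terminator. The function names are hypothetical.

```python
SEED = 0x52964EE9  # initial key value from the sample

def ror8(value):
    """Rotate a 32-bit value right by 8 bits (direction is an assumption)."""
    return ((value >> 8) | (value << 24)) & 0xFFFFFFFF

def hash_name(name: str) -> int:
    data = name.encode("ascii") + b"\x00"   # assumed: last read covers the terminator
    key = SEED
    for i in range(len(data) - 1):          # the counter advances one byte at a time
        word = data[i] | (data[i + 1] << 8)  # 2 bytes, little endian
        key ^= (word + ror8(key)) & 0xFFFFFFFF
    return key

print(hex(hash_name("A")))   # 0xbbc4d866
```

Because the window moves by one byte but reads two, each inner character contributes to two consecutive iterations, which matches the overlap described in the list.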
The resulting value is stored in the pre-allocated space in the following form:
- The first DWORD is the hash.
- The second DWORD is the Relative Virtual Address (RVA) of the system call stub.
- The third member, a QWORD, is the Virtual Address (VA) of the system call stub (RVA + ntdll base address).
So, it can be written as:
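As a rough model of that layout (not the author's original listing), the entry can be described with ctypes. The type and field names are made up for illustration; 0x1F4 is the entry count the sample uses.

```python
import ctypes

class SyscallEntry(ctypes.Structure):
    """One 0x10-byte record of the table (field names are hypothetical)."""
    _fields_ = [
        ("hash", ctypes.c_uint32),      # hash of the function name
        ("rva", ctypes.c_uint32),       # RVA of the system call stub
        ("address", ctypes.c_uint64),   # VA = ntdll base + RVA
    ]

MAX_ENTRIES = 0x1F4
SyscallTable = SyscallEntry * MAX_ENTRIES

print(hex(ctypes.sizeof(SyscallEntry)))   # 0x10
print(hex(ctypes.sizeof(SyscallTable)))   # 0x1f40
```

The total size matches the 0x1f40-byte region that the sample zeroes before filling it.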
After the structure is populated with the addresses, its elements are sorted by the RVA of the system call stub (the second member of the structure).
After the sorting algorithm is done, the memory structure looks like the following:
The first entry is the function with the lowest address, ZwMapUserPhysicalPagesScatter (this could be different on newer versions of Windows), at address 00000000774E1340. If we look at its system call stub, the
system call number is zero. This is how it gets the SSN for any function: it iterates over the structure until it finds the right hash, and the loop counter is the SSN (SSN = counter).
So far, this is remarkably like the MDSec (8. Sorting by System Call Address) implementation of the sorting-by-address technique described earlier.
We could rewrite the technique using the MDSec implementation as follows:
The next step is to use the structure to get the SSN and the address of the syscall instruction to call. The function that does this takes the following parameters:
- The array of structures that holds the system call info.
- The constant value 0x1F4, the maximum number of entries in the array (total size = 0x1F4 * 0x10).
- Pre-allocated memory.
- The function hash.
- A global variable that receives the system call SSN and stub.
The function is simple: it searches the populated array for the given hash. If the hash is found, the counter value is taken as the SSN and is also used to get the address of the system call stub. To get that address, the counter is multiplied by 0x10 (the struct size), added to the base address of the array, and then 8 is added to reach the last QWORD.
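A sketch of that lookup over a packed byte buffer, using the 0x10-byte entry layout described in the text (DWORD hash, DWORD RVA, QWORD address). The function name, the hashes, and the base address below are made up for illustration.

```python
import struct

ENTRY_SIZE = 0x10  # DWORD hash, DWORD RVA, QWORD stub address

def find_syscall(table: bytes, count: int, wanted_hash: int):
    """Return (ssn, stub_address) for the hash, or None if it is absent."""
    for index in range(count):
        (entry_hash,) = struct.unpack_from("<I", table, index * ENTRY_SIZE)
        if entry_hash == wanted_hash:
            # base + index * 0x10 + 8 is the last QWORD of the entry
            (stub,) = struct.unpack_from("<Q", table, index * ENTRY_SIZE + 8)
            return index, stub    # the index in the sorted table is the SSN
    return None

# Hypothetical table, already sorted by RVA.
BASE = 0x7FF800000000
rows = [(0x11111111, 0x1000), (0x22222222, 0x1020), (0x33333333, 0x1040)]
table = b"".join(struct.pack("<IIQ", h, rva, BASE + rva) for h, rva in rows)
print(find_syscall(table, len(rows), 0x22222222))
```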
The address is then passed to get_syscall_ret_address to get the address of the syscall ret sequence, which is used to execute the system call and bypass the callback mentioned before (call stack tracing can still be used to detect this trick).
The global variable is used to store:
- A QWORD for the system call address (the function's address).
- A QWORD for the address of the ret instruction sequence.
- A DWORD for the system call number (SSN).
We can rewrite it as follows:
(Creative names I know :) )
There are several ways to call the required function. The choice is based on the value of a global variable (0x18004BC6C):
- 1: Direct call using the first member of the structure (the address of the function).
- 2: Indirect system call through a trampoline jump, using the system call number and the syscall address stored before.
- Anything else: Direct call to the Win32 API.
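The selection logic amounts to a three-way switch. The sketch below only models the decision, with made-up names for the enum and function; it does not perform any call.

```python
from enum import Enum

class CallMethod(Enum):
    NTDLL_FUNCTION = "call the function address stored in the structure"
    INDIRECT_SYSCALL = "load the SSN and jump to the stored syscall address"
    WIN32_API = "call the regular Win32 API"

def choose_method(mode: int) -> CallMethod:
    """Map the value of the global variable to a calling method."""
    if mode == 1:
        return CallMethod.NTDLL_FUNCTION
    if mode == 2:
        return CallMethod.INDIRECT_SYSCALL
    return CallMethod.WIN32_API   # any other value

for mode in (0, 1, 2):
    print(mode, choose_method(mode).name)
```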
System calls can be used to bypass user-mode hooks, but there are other methods to detect direct and indirect syscalls.
To detect direct system calls, Windows provides a large set of callback functions, one of which is KPROCESS!InstrumentationCallback. This callback is triggered whenever the system returns from kernel mode to user mode. It can be used to check the return address of the syscall, which reveals the location where the syscall instruction was executed. This location should be in ntdll, but in the case of direct system calls it will be in the .text section of the PE file. This was used by ScyllaHide.
Indirect system calls solve this problem by getting the address of a syscall instruction in ntdll and jumping to it. To detect indirect syscalls, call stack tracing can be used to check where the system call originated before it jumped to ntdll. This can also be bypassed by creating a new thread to get a new call stack, using callback functions like RtlQueueWorkItem. If you want to know more about this, you can read Hiding In PlainSight 1&2.
Note: These are personal notes I wrote while learning about syscalls. If anything is inaccurate, please let me know.