
Obfuscator-LLVM++ | Not a crypter/morpher

Automatic escrow can be used in this thread!

Status
Closed to further replies.

drpalpatine
(L3) cache · User
Registered: 04.08.2021 · Posts: 260 · Solutions: 1 · Reactions: 108 · Escrow deals: 2 · Deposit: 0.0001

Before we start, let's establish a few things -->
1. Strength that comes merely from repeated application of transformation passes, nesting, or combinations is mostly bad, and technically speaks poorly of the obfuscator's author.
2. I will not speak for MorphMePlease, nor am I affiliated with Octavian in any way. The project is interesting, but its benchmarks are somewhat vague in terms of mathematical metrics and strength against CFG, MBA deobfuscation, and symbolic execution attacks. Still, I am impressed by his technical feat of working successfully at the AST level (especially for modern C/C++ syntax) instead of opting for a lower-level intermediate representation (for me, it is more convenient to work with SSA-style, well-formed IR than with trees), and I respect the author for his contributions to the forum and his interesting research. I will not speak against MMP or make assumptions about it in any way, since there is no public knowledge about its obfuscation capabilities. I do not provide basic definitions here, for example of what an opaque predicate or an MBA is. I encourage buyers to read the MMP sale page, google the existing obfuscation techniques, and practically test the capabilities of existing obfuscators first.

[1] OLLVM github.com/obfuscator-llvm/obfuscator
[2] Hikari github.com/HikariObfuscator/Core
[3] Pluto github.com/bluesadi/Pluto-Obfuscator
[4] YANSO github.com/emc2314/YANSOllvm
[5] GDop https://faculty.ist.psu.edu/wu/papers/opaque-isc16.pdf github.com/s3team/gdop/
[6] github.com/quarkslab/llvm-passes, github.com/Deniskore/llvm
[7] Tigress tigress.wtf/transformations/
[8] *MorphMePlease /threads/73254/
[9] Armariris github.com/GoSSIP-SJTU/Armariris
[10] IOLLVM arxiv.org/abs/2203.03169
[11] Automatic generation of opaque constants based on the k-clique problem for resilient data obfuscation https://ieeexplore.ieee.org/document/7884620
Experimental assessment of XOR-Masking data obfuscation based on K-Clique opaque constants https://www.sciencedirect.com/science/article/abs/pii/S0164121219302663
[12] LOOP: Logic-Oriented Opaque Predicate Detection in Obfuscated Binary Code https://dl.acm.org/doi/10.1145/2810103.2813617 github.com/s3team/loop
[13] blog.quarkslab.com/deobfuscation-recovering-an-ollvm-protected-program/
[14] hex-rays.com/blog/hex-rays-microcode-api-vs-obfuscating-compiler/
[15] eshard.com/posts/D810-a-journey-into-control-flow-unflattening/
[16] govcert.ch/whitepapers/unflattening-confuserex-code-in-ida/
[17] MBA-Blast usenix.org/conference/usenixsecurity21/presentation/liu-binbin
https://github.com/softsec-unh/MBA-Blast
[18] https://news.sophos.com/en-us/2022/05/04/attacking-emotets-control-flow-flattening/
[19] https://www.politoinc.com/post/2020...ation-of-windows-malwareexploits-using-o-llvm

The name is somewhat of a clickbait (stealing the spotlight from the original OLLVM) for potential customers.
But I was hesitant to build it under the name OLLVM++ for an interesting reason: most people don't know that when OLLVM was released, all of its techniques were already easily deobfuscatable. The same applies to Pluto and Hikari (open and closed source; commercial Hikari isn't as strong as you think). The only place where the commercial version of Hikari outshines OLLVM++ is its more robust runtime implementation, which does not break in rare, nuanced, bespoke scenarios; OLLVM++ is strong in this area as well and will get better over time.
At the time of release, the core obfuscation logic of OLLVM++ cannot be attacked because of the current limits of deobfuscation techniques, both the widely used ones and the experimental ones still in the research stage. Moreover, practical tools in cybersecurity are always many steps behind what has already been scientifically researched.
--> Technically, every language that compiles to LLVM IR is supported. Nothing major at the LLVM IR level breaks based on the choice of frontend language, operating system, or architecture.
C/C++, ObjC (clang), Swift, Rust (rustc), Zig, etc.; x86, x86-64, ARM, etc. (semantic morphing of control flow, some of the call graph passes, and MBAs generally don't break on other architectures like MIPS, but I have not done proper testing there)


In my opinion, all existing (non-adversarial) code obfuscation techniques fall into three categories -->
1. Data
2. Control Flow Graph (CFG) (intra function)
3. Call Graph (inter function)

Data
---
OLLVM: doesn't provide.
Hikari: String (globals) Encryption (XOR mask + decrypt under a global ctor)
Pluto: MBA
YANSO: ObfCon
GDop: not concerned
*MorphMePlease: 1. Constant Folding (MBA), 4. String encryption, 8. Legitimate strings generator, 12. Expression morphing
Armariris: doesn't provide
IOLLVM: doesn't provide

K-Clique paper [11] --> Ideally this is perfect: it uses NP problems that are evaluated in the process of XOR masking. In essence, a hard 3SAT construction (based on the old paper by Bart Selman) --> reduced to k-clique via Karp reduction --> breaking == solving an NP problem. Our technique can be considered an NP problem as well, because there is no quick way to solve the MBA in polynomial time, while a given solution can be verified in polynomial time (at runtime). But the technique proposed in the paper (the final math expressions) has 100x overhead compared to our technique with math expressions.

Parameters to evaluate the quality of an MBA -->
1. Strength: can it be deobfuscated by state-of-the-art MBA solvers like MBA-Blast (USENIX) [17]? If it can be simplified, in how much time?
2. Runtime overhead: essentially the compute required, which is the FLOP count of the expression

All of the MBAs in the obfuscators listed above are simplified instantly by MBA solvers. A new technique was developed using algebraic geometry, special functions, secrets... that generates MBAs that cannot be solved by state-of-the-art MBA solvers and that also have a low FLOP count.
For example, none of the available techniques can be extended from 2-variable formulae to n variables. This can be done with Pluto's truth table technique, but the cost grows exponentially with n; even 3 variables become impossibly heavy computationally when obfuscating a simple program.
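
For readers who have not seen an MBA before, here is a minimal sketch in C of the textbook linear identity x + y == (x ^ y) + 2 * (x & y), checked exhaustively over 8-bit inputs. It is not the technique offered in this thread, and the names add_plain, add_mba and check_mba are made up for illustration. Two-variable identities of this kind are the baseline that, according to the post, solvers simplify instantly.

```c
#include <assert.h>
#include <stdint.h>

/* Plain addition: the expression an MBA rewrite is meant to hide. */
static uint8_t add_plain(uint8_t x, uint8_t y)
{
    return (uint8_t)(x + y);
}

/* Textbook linear MBA rewrite: x + y == (x ^ y) + 2 * (x & y).
   XOR adds without carries, AND shifted left by one supplies the carries. */
static uint8_t add_mba(uint8_t x, uint8_t y)
{
    return (uint8_t)((x ^ y) + 2 * (x & y));
}

/* Exhaustive check over all 8-bit input pairs (65536 cases). */
static void check_mba(void)
{
    for (unsigned x = 0; x < 256; x++)
        for (unsigned y = 0; y < 256; y++)
            assert(add_mba((uint8_t)x, (uint8_t)y) ==
                   add_plain((uint8_t)x, (uint8_t)y));
}
```

Calling check_mba() from a test harness confirms the two functions agree on every input, including the pairs where the sum wraps around modulo 256.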


Control Flow Graph
---
Flattening
---
This is the least interesting and (due to overhead) even the worst transformation on this list, yet most authors market their obfuscators with it, with few or no modifications (like nested switching or bogus blocks).
I made my own variant of this transform (no nesting, no bogus blocks, nothing I could find so far in public research) which puts radical complexity into the control flow graph. If you believe in cyclomatic complexity as a metric for this, then it is 10 times stronger than OLLVM. I am not giving away the graph structures (which vary on every transformation) here; buyers will get a detailed walkthrough with graphs, cyclomatic complexity before and after, and most importantly how it differs from OLLVM, IOLLVM, etc.
If overhead is not a factor, then I encourage the application of flattening.

All existing flattening techniques and their slightly modified variants have many well-documented deobfuscation techniques. I propose a stronger variant that significantly increases cyclomatic complexity, bypasses all of these deobfuscation techniques, and behaves somewhat differently in terms of graph structure compared to existing techniques.

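
As a reference point for the metric mentioned above: McCabe's cyclomatic complexity of a single connected CFG is M = E - N + 2 (edges minus nodes plus two). The sketch below computes it from an adjacency matrix for the four-block CFG of a simple counted loop. The block layout, MAX_BLOCKS and the function names are assumptions for illustration, not output of any obfuscator.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 8

/* McCabe's metric for one connected CFG: M = E - N + 2.
   adj[i][j] != 0 means there is an edge from block i to block j. */
static int cyclomatic_complexity(unsigned char adj[MAX_BLOCKS][MAX_BLOCKS],
                                 size_t nodes)
{
    int edges = 0;
    for (size_t i = 0; i < nodes; i++)
        for (size_t j = 0; j < nodes; j++)
            if (adj[i][j])
                edges++;
    return edges - (int)nodes + 2;
}

/* CFG of a plain counted loop: entry(0), cond(1), body(2), exit(3). */
static int loop_cfg_complexity(void)
{
    unsigned char adj[MAX_BLOCKS][MAX_BLOCKS] = {{0}};
    adj[0][1] = 1; /* entry -> cond */
    adj[1][2] = 1; /* cond  -> body */
    adj[2][1] = 1; /* body  -> cond (back edge) */
    adj[1][3] = 1; /* cond  -> exit */
    return cyclomatic_complexity(adj, 4);
}

/* Four edges, four nodes: M = 4 - 4 + 2 = 2 (one decision point). */
static void check_complexity(void)
{
    assert(loop_cfg_complexity() == 2);
}
```

A "10 times stronger" claim in this metric would mean the transformed function's M is roughly ten times the M that OLLVM's flattening produces for the same input; the numbers themselves are not given in the post.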
Opaque Predicates
---
The strongest adversary of opaque predicates is symbolic execution. Symbolic execution was already very strong at the time OLLVM and Pluto were released, and all of their predicates would have been solved even then. Now it has gotten even stronger, and by far the most mature and strongest engine is Angr. The benchmarks on this page are predominantly done with Angr. Soon I will try to upload benchmarks with KLEE, BAP, and Triton (if necessary; I am not fully familiar with some of the APIs), but I believe that experts who have worked with symbolic engines will agree that Angr is the ultimate test.

OLLVM: BogusControlFlow (y < 10 || x * (x + 1) % 2 == 0)
Hikari: BogusControlFlow, IndirectBranch, SplitBasicBlocks
Pluto: BogusControlFlow (y < 10 || x * (x + 1) % 2 == 0) + TrapAngr (n - 3 * ((n * (uint64_t)0xAAAAAAAB) >> 33)) + RandomControlFlow (x = (x - 1) * (x + 3) - (x + 4) * (x - 3) - 9) (x = x * (x + 1) - x^2) (x = 3 * x * (x - 2) - 3 * x^2 + 7 * x)
YANSO: Connect
GDop: the first to make real progress, with the introduction of "dynamic" opaque predicates, but these can be executed symbolically today
*MorphMePlease: 2. Basic block opaque predicates morphing, 3. Symbolic execution protection, 5. Smart trash-code generation
Armariris: doesn't provide
IOLLVM: In-Degree obfuscation

All of these "magic number" conditions are simplified instantly by symbolic execution attacks and MBA solvers. I propose generating opaque predicates with a minimal (not merely low) FLOP count that delay symbolic execution attacks significantly, with each insertion taking anywhere from instant solving to n seconds, calculated from statistical analysis (configurable, with many templates to limit the formation of signatures) + the cfgs are fucked anyway
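
To make the "magic number" point concrete, the sketch below spot-checks three of the public expressions listed above: the OLLVM/Pluto invariant x * (x + 1) % 2 == 0, one of Pluto's RandomControlFlow identities, and the TrapAngr multiply-and-shift form, which is the usual compiler idiom for n % 3 on 32-bit values. The function names are made up, and unsigned arithmetic is an assumption chosen here so that wraparound is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* A product of two consecutive integers is always even, so this is
   constant-true. The low bit survives wraparound modulo 2^32. */
static int consecutive_product_is_even(uint32_t x)
{
    return (x * (x + 1u)) % 2u == 0u;
}

/* (x - 1)(x + 3) - (x + 4)(x - 3) - 9 expands to x, also modulo 2^32. */
static uint32_t identity_a(uint32_t x)
{
    return (x - 1u) * (x + 3u) - (x + 4u) * (x - 3u) - 9u;
}

/* n - 3 * ((n * 0xAAAAAAAB) >> 33) equals n % 3 for any 32-bit n;
   the multiplication is done in 64 bits. */
static uint32_t mod3_magic(uint32_t n)
{
    return n - 3u * (uint32_t)(((uint64_t)n * 0xAAAAAAABull) >> 33);
}

/* Strided spot-check over the 32-bit range plus the maximum value. */
static void check_predicates(void)
{
    for (uint64_t i = 0; i <= UINT32_MAX; i += 65521u) {
        uint32_t x = (uint32_t)i;
        assert(consecutive_product_is_even(x));
        assert(identity_a(x) == x);
        assert(mod3_magic(x) == x % 3u);
    }
    assert(consecutive_product_is_even(UINT32_MAX));
    assert(identity_a(UINT32_MAX) == UINT32_MAX);
    assert(mod3_magic(UINT32_MAX) == UINT32_MAX % 3u);
}
```

Each of these reduces to a constant or to the input itself by plain algebra, which is why the post treats them as a baseline rather than as protection.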

Call Graph
---
OLLVM: SplitBasicBlocks, Substitution
Hikari: FunctionCallObfuscate, FunctionWrapper, Substitution
Pluto: Instruction Substitution (similar to OLLVM Substitution)
YANSO: Merge, BB2func, ObfCall, VM (similar to OLLVM Substitution)
GDop: not concerned
*MorphMePlease: 6. Executable fake functions generation, 7. Function calls obfuscation (chaining calls), 10. Spaghetti basic block morphing (control flow morphing), 11. Operations function replacement, 14. Function args obfuscation
Armariris: doesn't provide
IOLLVM: doesn't provide

There is no proper benchmark/metric for this. BinDiff? DeepBinDiff? These are the usual ones. I propose similar techniques plus a few extra techniques of my own, taking inspiration from combinatorics and graph theory, to mathematically confuse reversing.
---


Ideally I am looking for a team to work with on a % basis --> I will listen to every proposal in PM --> the tool has to be fine-tuned and tested to ensure a comfortable balance of overhead and obfuscation --> checks for breakage at runtime + finishing touches + fixes, blah blah --> tailored to your specifics by manual testing
everything will be easier with --> a team that understands what code obfuscation is --> how it differs from crypters, morphing, etc. --> understands deobfuscation and benchmarks (provided in private) --> knows where it is not suitable, in areas like payload delivery with file size limits --> has experience with other obfuscators --> can have a technical discussion about obfuscation (without using such words --> "so it's a crypter?", "how many detects?", "sophos bypassed?")
we will sit through testing different settings depending on the case --> for example, if speed is important --> I need a team with a little patience to optimize for the best config with me --> everything is 'perf' on AES++
(researchers, avers, cops, sniffers, scammers, brian crabs, drug addicts --> throw away)

Price --> $4k
(The obfuscation logic is ready, but all the pieces of joining, clean CLI, exposing configs, transforms pipeline, final cleanups, blah blah graphic designing, oil painting, stickers (front end) is not finished, everything has to be processed together manually by me each step individually)


I plan on working on this project to further improve this, I have a list of fixes, bugs, notes, todos. More techniques have been researched but some nuances, todos, testing remains. Based on the interest of the team, I will consider this.
The project will not include source.
XSS Auto-guarant
Distribution of product, obfuscated binaries to others as a service is restricted.
I am not on any public XMPP, TOX, telegram. First contact in PM.
I am frequently away from the forum --> customers will be given a private XMPP/TOX for proper support
Do not encrypt correspondence if you throw away the keys.
 
similar to threads/34888/
In some cases, yes, this can be considered a morpher because of the ambiguity in how those terms are used. "Morpher" is not a common term in the white hat world and is mostly used on forums like ours.
No, you cannot build something like this using llvmlite, which is just an incomplete, bug-ridden wrapper around the LLVM C API. It was built only for numba's JIT compilation needs and covers a very specific subset of the API (not even the LLVM C++ API). https://github.com/numba/llvmlite
Individual testing is encouraged if there is actual interest in buying after studying the description.

any language is supported, not just c/c++/rust/swift as long as you can find a suitable compiler that compiles to llvm IR

I will not take more than 2 participants.
questions that will be of interest to potential buyers --> please post below or in private
 
screenshot243.png



the test sample taken is from https://github.com/m3y54m/aes-in-c/
***mangled name from the compiler
more specifically this simple innocent function with a xor was fucked up
C:
void addRoundKey(unsigned char *state, unsigned char *roundKey)
{
    int i;
    for (i = 0; i < 16; i++)
        state[i] = state[i] ^ roundKey[i];
}
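
For comparison, here is a hand-written sketch of what classic switch-based flattening (the public OLLVM-style transform) does to a loop like this one. It is not the variant sold here and not what produced the screenshot; the names add_round_key_plain, add_round_key_flat and check_flat, as well as the state numbering, are assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

/* Reference version, same logic as addRoundKey above. */
static void add_round_key_plain(unsigned char *state,
                                const unsigned char *roundKey)
{
    for (int i = 0; i < 16; i++)
        state[i] ^= roundKey[i];
}

/* Hand-flattened equivalent: every basic block becomes a switch case,
   and a dispatcher variable selects which block runs next. */
static void add_round_key_flat(unsigned char *state,
                               const unsigned char *roundKey)
{
    int i = 0;
    int next = 0; /* dispatcher state */
    for (;;) {
        switch (next) {
        case 0: i = 0; next = 1; break;                         /* init */
        case 1: next = (i < 16) ? 2 : 3; break;                 /* condition */
        case 2: state[i] ^= roundKey[i]; i++; next = 1; break;  /* body */
        default: return;                                        /* exit */
        }
    }
}

/* Both versions must produce the same 16 bytes for the same inputs. */
static void check_flat(void)
{
    unsigned char key[16], a[16], b[16];
    for (int i = 0; i < 16; i++) {
        key[i] = (unsigned char)(i * 17 + 3);
        a[i] = b[i] = (unsigned char)(255 - i);
    }
    add_round_key_plain(a, key);
    add_round_key_flat(b, key);
    assert(memcmp(a, b, 16) == 0);
}
```

The original loop structure is replaced by a single dispatcher loop, which is exactly the pattern the unflattening write-ups in the reference list target.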


interested buyers --> please send a sample code you want to me for testing --> we will go through testing


even this function with a single LLVM basic block was morphed up 😂
C:
unsigned char getRconValue(unsigned char num)
{
    return Rcon[num];
}

the level of all this mangling can be controlled to a large extent
screenshot244.png
 
developed using algebraic geometry, special functions
from combinatorics, graph theory to mathematically confuse
Somehow I hadn't paid attention to these fantasies of yours; maybe this is something for the contest?
http://xssforum7mmh3n56inuf2h73hvhnzobi7h2ytb3gvklrfqm7ut3xdnyd.onion/threads/101484/
 
That's all well and good, but what about detections on the output? With the standard LLVM obfuscator you immediately get a Christmas tree in static analysis: Trojan/Win32.Generic, Win64:Malware-gen, Unsafe, and so on, no matter how I tweaked the settings. And simply "obfuscating" (diluting with branches, loops, math operations) makes no sense when, for example, your imports (imphash) and resources are the same in every build.
You can look somewhere in my posts on the forum; I wrote about what real morphing should look like so that your software lives for years without cleaning, with daily spreading and with the software rented out. I just don't want to repeat myself.
 
please read the description again --> i have mentioned it specifically + multiple times to not let such comments fly --> if you work in the white --> there is no usage of "output detections" + "obfuscation" in the same page
moreover there is no extensive usage of such term "morpher" in the white --> this is somehow the translator or the forum culture that breed such terms
moreover such working with IAT, trampolines blah blah --> everything can be added but this is not the interesting theme of the product --> for a similar idea --> http://xssforum7mmh3n56inuf2h73hvhnzobi7h2ytb3gvklrfqm7ut3xdnyd.onion/threads/34888/ exactly matches my product

but thanks for the reply --> i already agree with you + i made this very clear in the post and everybody knows this

moreover --> a team was found; admin, please close the thread
if there is a technical discussion/debate about "obfuscation" == semantic scrambling of the code --> please open a thread in the forum --> everybody can discuss
(moreover, I am frustrated with the usage of the word "morpher")
 
Sorry for barging into your thread again, but if it's for the "white" side, there are already ready-made solutions in the form of code virtualization, such as VMProtect or its analogues. There is nothing better for protecting your software; LLVM doesn't come close to virtualization.
 
1. https://xss.pro/threads/73254/post-587374
The best protection is polymorphic virtualization and polyform code virtualization (just putting in my 5 cents). In the end, it will bypass static detection 100%, and the runtime already depends on you and not on the morph)
I absolutely agree with this statement. Yes, virtualization is, IMHO, top-tier skill in terms of protection against reversing/research. From the anti-detection point of view, creating a polymorphic virtual machine whose bootloader changes from build to build, so that it is impossible to attach a signature, is very difficult, IMHO.
It is impossible to create a virtual machine solo - teams have been doing this for years.
I don't know who you have to be to create something like this.

llvm is not even close to virtualization
so according to you with llvm --> you only write such passes at the IR level and there is no control of the later part? --> who handles the compilation to binaries in the first place?
 
so according to you with llvm --> you only write such passes at the IR level and there is no control of the later part?
If you understand LLVM well, you must understand that you can integrate your passes at any stage: front end, middle end, and back end. This is a powerful approach.

 