This repository has been archived by the owner on May 14, 2020. It is now read-only.

Add base64 decoding for some rules #369

Closed
6 tasks
lifeforms opened this issue Jun 18, 2016 · 50 comments

Comments

@lifeforms
Contributor

lifeforms commented Jun 18, 2016

Some vulnerabilities are being exploited using base64-encoded payloads, as suggested in #353.

  • rule 930110 (vuln)
  • rule 933140 (found in logs Jun 2016)
  • rule 933150 (found in logs Aug 2016)
  • rule 933160 (vuln 2)
  • rule 933170 (vuln 1, vuln 2)
  • review if this is interesting for other rules as well (grep logs)

Consider if we can use transformation operators with multimatch in the existing rule, or if we should make siblings.

There may be performance problems associated with this transformation, so test the performance impact carefully. If it is significant, we are better off adding these as siblings, for example in paranoia level 2.

@csanders-git
Contributor

This might be tricky depending on what these rules look like. The problem is that there is no way to treat a transformation as independent, save for breaking it out into another rule. By default base64Decode gets kinda touchy: if it's at the beginning and the input isn't base64 encoded, then all the other transformations will be applied to garbage. If it's at the end and the input is base64 encoded, something like t:lowercase will already have corrupted the encoded value before it gets decoded.
I think the best way I know of for dealing with this is to break them out into their own rules.
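
The ordering problem can be reproduced outside ModSecurity. The following is a minimal Python sketch, not rule language: `str.lower()` stands in for t:lowercase and `base64.b64decode` stands in for t:base64Decode, which is an assumption about equivalent behaviour rather than a statement about the ModSecurity implementation.

```python
import base64

payload = b"../../../wp-config.php"
encoded = base64.b64encode(payload).decode("ascii")

# Case 1: a case-folding step (standing in for t:lowercase) runs BEFORE the decode.
# Base64 is case-sensitive, so the decoded bytes no longer match the payload.
lowered_then_decoded = base64.b64decode(encoded.lower())

# Case 2: the decode runs FIRST on input that was never base64 encoded.
# "test" happens to be valid base64, so it decodes to unrelated bytes
# and every later check sees those bytes instead of "test".
plain_decoded_first = base64.b64decode("test")

print(lowered_then_decoded == payload)  # False
print(plain_decoded_first)              # b'\xb5\xeb-'
```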

@lifeforms
Contributor Author

Ack! Creating many rules with base64-encoded variants is nontrivial, so I'm scheduling this for later.

@dune73
Contributor

dune73 commented Jun 28, 2016

Outside of base64 encoding, there might be applications that try to decode input via other methods before they execute it, which opens a wide field of combinations. I agree with @lifeforms that we should probably postpone this and try to get a clear concept of which encodings affect security, and in which combinations.

I have also been thinking about a higher PL rule which detects double encodings etc.

Close this issue?

@lifeforms
Contributor Author

lifeforms commented Jun 28, 2016

@dune73 Let's keep the issue on the table, just for a later release. Hopefully log analysis will give us a better picture of how encoding is used nowadays.

Besides using the transformation, we might use other methods to catch base64 too. For instance, if we look for a short string, we can try to match on the encoded byte strings too.
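
As a rough illustration of matching on the encoded form: a short string can show up in base64 output in three different ways, depending on whether it starts at byte offset 0, 1 or 2 within a 3-byte group. The Python sketch below computes the stable fragment for each alignment. The helper name `base64_fragments` is hypothetical, and the fragments are short, so matching on them alone would likely produce false positives.

```python
import base64

def base64_fragments(needle: bytes) -> list:
    """Return the base64 fragments that depend only on `needle`, one per alignment."""
    fragments = []
    for offset in range(3):
        # Prefix with `offset` filler bytes to shift the needle inside the 3-byte group.
        padded = b"\x00" * offset + needle
        encoded = base64.b64encode(padded).decode("ascii").rstrip("=")
        # Skip leading characters that contain bits of the filler bytes.
        start = (offset * 8 + 5) // 6
        # Keep only characters whose 6 bits all come from the needle.
        end = (len(padded) * 8) // 6
        fragments.append(encoded[start:end])
    return fragments

print(base64_fragments(b"../"))  # ['Li4v', '4uL', 'uLi']
```

Whatever precedes or follows `../` in the original input, one of these fragments appears in its base64 encoding.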

If the transformation turns out to be expensive from a performance perspective, we might also coalesce various regexps together into one big 'base 64 rule'. Do the transform once, check several exploits with an assembled regexp. This however would complicate rule maintenance a bit (violation of DRY).

@dune73
Contributor

dune73 commented Jun 28, 2016

I introduced two new labels to tag issues like this. I think this is a necessity. Feel free to edit the labels to your convenience (not sure about the colours).

@csanders-git
Contributor

@dune73 you're a perfect human being, that is exactly what I was thinking of doing :)

@dune73
Contributor

dune73 commented Jun 28, 2016

Thanks. ;)

@lifeforms lifeforms removed their assignment Aug 3, 2016
@lifeforms
Contributor Author

lifeforms commented Aug 5, 2016

Saw these ones today:

POST /wp-includes/category.php array=ZXZhbCgnZWNobyAoMTIzNDU0MzIwKzEpO2V4aXQoKTsnKTs
POST /wp-admin/options-smedia.php z=ZXZhbCgnZWNobyAoMTIzNDU0MzIwKzEpO2V4aXQoKTsnKTs

This string is the base64 encoding of eval('echo (123454320+1);exit();'); which would have been caught by 933150.
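
The logged value can be checked with a few lines of Python. Note that it is 47 characters long, so the trailing `=` padding was stripped; the sketch restores it before decoding. The helper name `decode_unpadded` is made up for this example.

```python
import base64

def decode_unpadded(value: str) -> bytes:
    """Decode base64 whose trailing '=' padding may have been stripped."""
    padding = "=" * (-len(value) % 4)
    return base64.b64decode(value + padding)

logged = "ZXZhbCgnZWNobyAoMTIzNDU0MzIwKzEpO2V4aXQoKTsnKTs"
print(decode_unpadded(logged).decode("ascii"))
# eval('echo (123454320+1);exit();');
```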

@dune73
Contributor

dune73 commented Aug 5, 2016

In fact it's detected by a lot of rules, but not 933150. Namely: 931100, 932110, 933160 in PL1, another 4 with PL4.

It really forces us to do multimatch which is a resource-hog.

Thought: In PL2, update action on all PL1 rules and add multimatch. This would keep PL1 performing, give us the desired coverage at PL2 and still no need to do strict siblings of all PL1 rules (which would be a pain to do and maintain).

@lifeforms
Contributor Author

lifeforms commented Aug 5, 2016

@dune73 Yeah, true that more rules catch it (by accident, though why 931100???). Applications actually decoding base64 will usually be PHP/Java/.NET apps, so my idea is that we won't need to add this transformation + multimatch to many rules, just the rules that specifically pertain to those platforms, plus some generic ones like LFI/RFI. My gut feeling right now is that adding this transformation to most rules would not be helpful. But I will keep looking for base64 in my logs!

Maybe we should also try to define coding standards for when to apply which transformations - e.g. I don't fully understand when t:urldecodeuni is applied, for instance. And ideally we might lint for those standards too, e.g. PHP rules should get t:base64decode + multimatch, SQL injection rules should have t:urldecodeuni,t:removeComments,... etc.

@dune73
Contributor

dune73 commented Aug 5, 2016

+1 on the coding standard!

Otherwise, let's let this sink in for some time, gather infos and experience and we'll find a common ground.

@lifeforms
Contributor Author

lifeforms commented Aug 16, 2016

Found some more injection attempts via base64 encoding against a WordPress plugin. The plugin just attempts to base64-decode POST and unserializes whatever's in it.

@lifeforms
Contributor Author

lifeforms commented Oct 27, 2016

Found in logs: /wp-content/plugins/photocart-link/decode.php?id=Li4vLi4vLi4vd3AtY29uZmlnLnBocA==, base64 encoded ../../../wp-config.php (vuln)

@umarfarook882
Contributor

umarfarook882 commented May 22, 2017

Is it possible to detect base64-encoded payloads with the following process?

First, use a regex to detect base64 data. If it matches, decode the data with the transformations t:urlDecode,t:base64Decode and chain to a second rule that matches the decoded payload against a regex and blocks it.

Is it possible? Give your ideas, folks :)

@csanders-git
Contributor

It isn't possible, as far as I'm aware, to detect base64 encoded data using modsec (you could detect it statistically, perhaps). Instead we might consider adding the base64 encoded variants of things we check for, like ../ :)

@umarfarook882
Contributor

umarfarook882 commented May 22, 2017

So we can't detect any base64 encoded payload attack with ModSecurity? How can we prevent it?
What if someone tries an injection attack using a base64 encoded payload in some input field or HTTP header? Any ideas?

@csanders-git
Contributor

well the thing about this is that the application logic is written to know that something is base64 encoded. In that sense it is very binary: something is b64 or not, and it shouldn't change (often). As a result, per application, b64 is easy to deal with. The problem is generalizing this. @lifeforms this might be a good early blog post, doing something like a Lua-based b64 detection script.

@umarfarook882
Contributor

@csanders-git hey, it worked, as we talked about previously.

> is possible to detect base64 encoded payloads

> @csanders-git It isn't possible as far as i'm aware to detect base64 encoded data using modsec (you could detect it statistically perhaps. Instead we might consider adding the base64 encoded varients of things we check for like ../ :)

It's working :)

I created a regex to match any base64 encoded value in user input. If an input value is base64 encoded, a chained rule checks it against a regex pattern for injection payloads, and then processing continues with the remaining rules.

Check my GitHub repository: I have a demo there with an explanation of how to detect and block base64 encoded injection payloads.

@csanders-git
Contributor

csanders-git commented May 26, 2017

It looks like you did just as I suggested, which is to add base64 encoded variants of specific things.

This isn't actually generically solving the problem. The same could be accomplished by just assuming everything was base64 encoded and decoding it with t:base64decode.

but looks good.

@dune73
Contributor

dune73 commented May 26, 2017

This brings me back to the topic of one of my favorite ModSec stunts that I have not managed to accomplish.

I would like to take all arguments and then apply a transformation and then check for all arguments if the transformation was applied and if we received a useful output. With this we could easily check if a payload was processed by base64 or some other encoding technique.
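
A minimal sketch of the "decode, then check whether the output is useful" idea, written in Python because the rule language does not express it directly. The function name `useful_base64_decode` and the "printable ASCII" definition of useful are assumptions made for illustration; a real heuristic would need tuning.

```python
import base64
import string
from typing import Optional

PRINTABLE = set(string.printable.encode("ascii"))

def useful_base64_decode(value: str) -> Optional[bytes]:
    """Return the decoded bytes if they look like text, else None."""
    try:
        decoded = base64.b64decode(value, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
    # Treat empty or non-printable output as "the decode was not meaningful".
    if not decoded or any(byte not in PRINTABLE for byte in decoded):
        return None
    return decoded

for candidate in ["YWRtaW4nIG9yIDE9MSM=", "admin' or 1=1#", "test", "hello"]:
    print(repr(candidate), "->", useful_base64_decode(candidate))
```

Only the first candidate yields a result here; "test" is syntactically valid base64 but decodes to non-printable bytes, so it is rejected by the heuristic.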

@umarfarook882
Contributor

@csanders-git hey, you didn't get what I am saying? If we treat every user input as base64 encoded and decode it using the transformation t:base64decode, it will not work out.

For example, following your idea, we treat everything as base64:
if the user input is admin' or 1=1#
then decoding admin' or 1=1# gives nothing.
t:base64decode will try to decode this data and will get nothing out of the decoding process.
Then how can it detect anything, or move on to check whether the user input contains an injection payload?

It will only work out when we first match that the user input really is base64 :)
I wrote a regex to check whether any user input is a base64 value, and then move on with a chained rule that applies t:base64decode.

For example, in my case:
User input: YWRtaW4nIG9yIDE9MSM=
t:base64decode will try to decode this data and will get admin' or 1=1# out of the decoding process.
So it can then be matched against a regex to identify any injection payload :)

Hope you get it now

@csanders-git
Contributor

As I said previously, this is not possible to do generically. You can do things to determine it's NOT base64 - for instance the presence of a space or another invalid char. You can try to detect the equals sign... this will only work in 2/3 of cases. Otherwise you're left with something like ^[a-zA-Z0-9+/]+={0,2}$ which matches 'hello' and will create massive false positives.
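
Both claims are easy to check. The Python sketch below counts how many input lengths produce `=` padding (every length that is not a multiple of 3) and confirms that the loose pattern accepts an ordinary word. The variable names are illustrative only.

```python
import base64
import re

LOOSE_B64 = re.compile(r"^[a-zA-Z0-9+/]+={0,2}$")

# Padding appears only when the input length is not a multiple of 3.
padded_lengths = [
    n for n in range(1, 31)
    if base64.b64encode(b"x" * n).endswith(b"=")
]
print(len(padded_lengths), "of 30 lengths end in '='")  # 20 of 30 lengths end in '='

# The loose pattern cannot tell base64 apart from plain words.
print(bool(LOOSE_B64.match("hello")))  # True
```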

The other options are that we do as @dune73 described where we base64 decode and try and add some logic to determine if it was successful or we add individual base64 samples to rules and variants of rules.

This is also a slippery slope since there are infinite different ways of encoding but i think that b64 is one worth considering.

@umarfarook882
Contributor

@csanders-git I am working on securing vulnerable applications like OWASP Mutillidae II, Xtreme Vulnerable Web Application (XVWA), SQLi Labs... to improve my defense level on web apps. I came across base64 payloads when I was testing SQLi Labs, so that is why I am working on improving my defenses against all encoding techniques.

But as far as I have tested, the regex I use for identifying base64 works well... check the regex:
^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$

Am I wrong or right?

Tell me your view after testing the regex with base64 encoded payloads.. :)

@csanders-git
Contributor

@umarfarook882 I like what you're doing and i think it's something that will teach you a lot and your feedback is good.

Notice however that this regex is too broad. it triggers on the word test
https://regex101.com/r/2C3O9R/1
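
The same check can be reproduced locally. This Python sketch runs the stricter regex from the comment above against a few ordinary words; any alphanumeric string whose length is a multiple of 4 passes, whether or not it was ever base64 encoded.

```python
import re

STRICT_B64 = re.compile(
    r"^([A-Za-z0-9+/]{4})*"
    r"([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$"
)

for word in ["test", "password", "admin", "YWRtaW4="]:
    print(word, bool(STRICT_B64.match(word)))
# test True
# password True
# admin False
# YWRtaW4= True
```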

@csanders-git
Contributor

@dune73 if the point of the rule is that the underlying logic is going to be detecting it after a base64decode, then booting up the entire regex engine is excessive, it is just doing a small sort which will almost always come back as true. Rather than doing that if the approach is to be more performant it should just always match, decode and then check for maliciousness in a chain.

@dune73
Contributor

dune73 commented May 26, 2017

Oh, I see. Smart. But how do you carry the variable between the first and the 2nd rule in the chain? MATCHED_VAR will be gone the moment you hit the 2nd rule, won't it?

@csanders-git
Contributor

@dune73 I imagine it would have about the same efficiency as capture (using setvar that is)

@dune73
Contributor

dune73 commented May 26, 2017

Setvar brings the alleged performance bottlenecks of opening too many variables that zimmerle claims. I am really not sure you can construct this the way you want and the variables with the right decoding end up in the right position for the 2nd rule. But I am very interested to see an example.

@csanders-git
Contributor

@dune73 you are correct, you must use a var, but at least you don't have to boot up the regex engine. I think the alternative is to start building base64 versions of strings into existing rules, but this is a DEEP rabbit hole

@umarfarook882
Contributor

@dune73 Oh, I see. Smart. But how do you carry the variable between the first and the 2nd rule in the chain? MATCHED_VAR will be gone the moment you hit the 2nd rule, won't it?

Yes, MATCHED_VAR will be gone for the next rule. But I am using the base64 detection regex in every rule as the base to detect, and then move on in the chain to check the respective rule's payload regex. :)

@dune73
Contributor

dune73 commented May 26, 2017

I think I have an idea: How about we run basic decodings like base64decode early in phase 2 and if we have a result worth checking (base64decode successful), then we create a variable like TX.args_decoded_<decode_type>_<argname>. Then we include TX:/args_decoded.*/ in all the rules where these evasions matter. This frees us from the burden to copy all the rules and run them separately for the original and the decoded payload.

Still, we would benefit enormously from a rule that would be able to check if payload == decode(payload) but I think this can not be expressed in the rule language as is. Maybe the RULE collection could be extended with such a feature / variable to allow for SecRule ARGS "!@streq %{RULE.pretransformation_var}" "id:...,t:base64decode...setvar:TX.args_decoded_base64decode_%{MATCHED_VAR_NAME}=%{MATCHED_VAR}".
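
The flow of this proposal can be sketched in Python to make the data movement concrete. This is a simulation, not ModSecurity behaviour: the dictionary `tx` stands in for the TX collection, the key prefix `args_decoded_base64_` follows the naming suggested above, and the function names and the "printable and different from the input" success test are assumptions made for illustration.

```python
import base64
import re
from typing import Dict, List, Pattern

def collect_decoded_args(args: Dict[str, str]) -> Dict[str, str]:
    """Early pass: store a decoded copy of every argument that decodes cleanly."""
    tx = {}
    for name, value in args.items():
        try:
            decoded = base64.b64decode(value, validate=True).decode("ascii")
        except ValueError:  # invalid base64 or non-ASCII output
            continue
        # Rough stand-in for "payload != decode(payload) and the result is usable".
        if decoded and decoded != value and decoded.isprintable():
            tx["args_decoded_base64_" + name] = decoded
    return tx

def matching_targets(args: Dict[str, str], tx: Dict[str, str],
                     pattern: Pattern[str]) -> List[str]:
    """Later rule: inspect the raw arguments and the decoded copies together."""
    targets = {**args, **tx}
    return sorted(name for name, value in targets.items() if pattern.search(value))

args = {"id": "Li4vLi4vLi4vd3AtY29uZmlnLnBocA==", "q": "hello world"}
tx = collect_decoded_args(args)
print(tx)
print(matching_targets(args, tx, re.compile(r"\.\./")))
```

The traversal pattern misses the raw `id` argument but matches its decoded copy, which is the coverage the proposal is after without duplicating the detection rule.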

@csanders-git
Contributor

yeah, this isn't a bad idea probably the most efficient use... now the question becomes what is the best way to limit false positives?

@umarfarook882
Contributor

@dune73 Nice idea, let me try it and check how it works without any false positives :)

@dune73
Contributor

dune73 commented May 26, 2017

Good plan!

@umarfarook882
Contributor

@dune73 Your idea is working well, but I have made a little modification to it. I am currently checking for any kind of false positive :)

Let me explain the issue:
SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES "@rx ^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$" \
    "id:333333,phase:2,pass,logdata:'Matched Data: %{MATCHED_VAR} found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR}',t:base64Decode,setvar:'tx.args_decoded_base64payload=%{MATCHED_VAR}'"

In our case:
User input (base64) = YWRtaW4=
The rule above decodes it first and then tries to match the decoded value against the regex pattern, so the match fails.
That's where I made a mistake: we can't do both actions at once, i.e. base64 regex pattern matching and t:base64decode in the same rule. It will not work out.

So I modified the rule into a chain: the base64 regex match happens in the first rule, and the base64 decoding happens in the chained rule.

So my final idea:
SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES "@rx ^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$" \
    "id:333333,chain,phase:2,pass,logdata:'Matched Data: %{MATCHED_VAR} found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR}'"
    SecRule MATCHED_VAR "@rx ^[\W\s\w]+" "t:base64Decode,setvar:'tx.args_encoded_base64payload=%{MATCHED_VAR}'"

I use TX:/args_encoded.*/ in all rules where we need to check encoded payloads :)

Thank you for your idea @dune73

Why can't we ship this method as a separate rule in OWASP CRS and let users modify their rules, i.e. add TX:/args_encoded.*/ to them, depending on their application, to detect encoded payloads?

@dune73
Contributor

dune73 commented May 29, 2017

Hey @umarfarook882. I am not sure I really understand you, but here is what I got:

You apply your regex to a payload in order to find out if the payload is base64 encoded. If the regex matches, it is very likely it is base64 encoded. You then decode it and write the result into a tx variable ready to be used by subsequent rules.

(I appreciate your passion in this. However, please try and describe what you try to do in very simple steps. it is hard to follow you from your writing. This is normal for new people on a project, so no worries. But please help us understand you.)

@umarfarook882
Contributor

Matching the base64 payload, decoding it, and storing the result in a variable:

SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES "@rx ^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$" \
    "id:333333,chain,phase:2,pass,logdata:'Matched Data: %{MATCHED_VAR} found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR}'"
    SecRule MATCHED_VAR "@rx ^[\W\s\w]+" "t:base64Decode,setvar:'tx.args_encoded_base64payload=%{MATCHED_VAR}'"

Now i can use this variable in any rules where i need to check for malicious payloads.

SecRule TX:/args_encoded.*/|REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@rx ...." "id:..,msg:'Detects basic SQL injection attempts',phase:2,block,logdata:'Matched Data: %{MATCHED_VAR} found within %{MATCHED_VAR_NAME}'"

Hope you got it now @dune73
If this idea is OK, we can use it in OWASP CRS as a custom rule for detecting malicious payloads in base64. Anyway, I am still checking for false positives.

@victorhora
Contributor

In a sort of (un)related subject, shouldn't we consider using base64DecodeExt for some of those rules?

@csanders-git
Contributor

Perhaps we should make better use of this, as Python, which is becoming more and more popular, discards invalid chars in a b64 encoded var. It also means that at least the early regex that @umarfarook882 was using can't be used with Python (which doesn't seem to have a way to turn this off), or with PHP if the strict argument isn't provided. Perhaps we should add a note in the reference manual.
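
The evasion described here can be demonstrated with Python's standard library. In this sketch a space and a `!` are inserted into an encoded payload: the strict regex from earlier in the thread no longer matches, yet the default lenient decoder still recovers the payload. For what it's worth, Python 3's `base64.b64decode` does accept `validate=True` to reject such input, but applications have to opt in.

```python
import base64
import binascii
import re

STRICT_B64 = re.compile(
    r"^([A-Za-z0-9+/]{4})*"
    r"([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$"
)

# "YWRtaW4nIG9yIDE9MSM=" with two characters from outside the alphabet inserted.
smuggled = "YWRt aW4nIG9y!IDE9MSM="

print(bool(STRICT_B64.match(smuggled)))  # False: the detection regex is bypassed
lenient = base64.b64decode(smuggled)     # default: non-alphabet characters are discarded
print(lenient)                           # b"admin' or 1=1#"

try:
    base64.b64decode(smuggled, validate=True)
    strict_result = "accepted"
except binascii.Error:
    strict_result = "rejected"
print(strict_result)                     # rejected
```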

@emphazer
Contributor

any progress here?!?

@dune73
Contributor

dune73 commented Oct 27, 2017

Not from my side. Have not looked into this for months, and I doubt this will change. If somebody comes up with a clean and well documented implementation, then I am open to review. If we stick to the proposal laid out by me above, then the first PR should only fill the said variable and then add it to one existing rule or so (how about 942100?). I would then certainly review it.

@spartantri
Contributor

I just found this long discussion. Base64 data will often cause false positives, and a payload may be encoded multiple times, including multiple layers of base64. Doing a streq match against a base64 string is also not a very good idea: it is a padded encoding, so adding spaces will change the encoded string, and no transformations are available to prevent such bypasses. This can turn into a Russian dolls game, and spending too many resources is not desirable either, so a limit should be set, and it could be tuned depending on the PL.
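
The "Russian dolls with a limit" point can be sketched as a bounded peeling loop. This is a Python illustration only; the function name `peel_base64` and the default limit of 3 layers are assumptions, and the limit is the knob that could be tied to the paranoia level.

```python
import base64
from typing import List

def peel_base64(value: str, max_depth: int = 3) -> List[str]:
    """Decode nested base64 layers, stopping at max_depth or the first failure."""
    layers = []
    current = value
    for _ in range(max_depth):
        try:
            text = base64.b64decode(current, validate=True).decode("ascii")
        except ValueError:  # not valid base64, or output is not ASCII text
            break
        if not text:
            break
        layers.append(text)
        current = text
    return layers

inner = base64.b64encode(b"../../../wp-config.php").decode("ascii")
outer = base64.b64encode(inner.encode("ascii")).decode("ascii")
print(peel_base64(outer))
# ['Li4vLi4vLi4vd3AtY29uZmlnLnBocA==', '../../../wp-config.php']
```

Each returned layer would then be a candidate for inspection; the depth cap bounds the work spent on any single parameter.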

Why not use a staggered approach:
- PL1: warn about potential b64 encoded payloads but let them pass
- PL2: b64decode and detectsqli and detectxss
- PL3: string checks with the data files
- PL4: declare valid base64 encoded elements and block all detected base64 fields outside that list

Something like the following:

PL1

SecRule ARGS|ARGS_NAMES|REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES "@rx ^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$" \
    "id:222100,\
    phase:2,\
    pass,\
    capture,\
    log,auditlog,\
    msg:'Found potential base64 encoded parameter %{matched_var_name}',\
    tag:'BASE64',\
    tag:'paranoia-level/1'"

PL2

SecRule ARGS|ARGS_NAMES|REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES "@rx ^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$" \
    "id:222200,\
    phase:2,\
    pass,\
    capture,\
    log,auditlog,\
    msg:'Found potential sql injection in a base64 encoded parameter %{matched_var_name}',\
    tag:'BASE64',\
    chain"
        SecRule MATCHED_VAR "@detectsqli" "t:base64decodeext,t:cmdLine,t:urldecodeUni,\
            t:htmlEntityDecode,t:jsDecode,multiMatch,\
            tag:'paranoia-level/2'"

SecRule ARGS|ARGS_NAMES|REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES "@rx ^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$" \
    "id:222210,\
    phase:2,\
    block,\
    log,auditlog,\
    msg:'Found potential xss in a base64 encoded parameter %{matched_var_name}',\
    tag:'BASE64',\
    chain"
        SecRule MATCHED_VAR "@detectxss" "t:base64decodeext,t:cmdLine,t:urldecodeUni,\
            t:htmlEntityDecode,t:jsDecode,multiMatch,\
            tag:'paranoia-level/2'"

PL3

SecRule ARGS|ARGS_NAMES|REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES "@rx ^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$" \
    "id:222300,\
    phase:2,\
    block,\
    capture,\
    log,auditlog,\
    msg:'Found potential LFI or OS file access in a base64 encoded parameter %{matched_var_name}',\
    tag:'BASE64',\
    chain"
        SecRule MATCHED_VAR "@pmf lfi-os-files.data" "t:base64decodeext,t:cmdLine,t:urldecodeUni,\
            t:htmlEntityDecode,t:jsDecode,multiMatch,\
            tag:'paranoia-level/3'"

PL4

SecAction "id:222400,pass,nolog,noauditlog,\
    setvar:'tx.b64_elements=/mycookie/ /mysecrets/ /otherstuff/'"
SecRule ARGS|ARGS_NAMES|REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES "@rx ^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$" \
    "id:222410,\
    phase:2,\
    block,\
    capture,\
    log,auditlog,\
    msg:'Found base64 encoded parameter %{matched_var_name} outside the list of allowed base64 elements',\
    tag:'BASE64',\
    chain"
        SecRule MATCHED_VAR "!@within %{tx.b64_elements}" "tag:'paranoia-level/4'"

Anyway, this is cool stuff. There are plugins for many pentesting tools that do the encoding automatically, and the WAF would be blind to them in most cases.

Obviously this has to be perfected. I wrote it in 5 minutes, so expect errors, unnecessary captures, and repetitive transforms. I copied the regex for base64, but it needs to be improved: it may miss some valid characters as per RFC 4648 sections 4 and 5. Also, the base64 alphabet should stick to either section 4 or section 5 and not mix them.

@theMiddleBlue
Contributor

Sorry, I deleted my previous comment on this; I had missed @spartantri's comment (using t:base64decodeext looks like a cool approach for handling applications that call base64_decode without the strict parameter).

I'm trying to test all proposed rules in my test env

@github-actions

This issue has been open 120 days with no activity. Remove the stale label or comment, or this will be closed in 14 days

@github-actions github-actions bot added the Stale issue This issue has been open 120 days with no activity. label Nov 21, 2019
@emphazer emphazer removed the Stale issue This issue has been open 120 days with no activity. label Nov 21, 2019
@dune73
Contributor

dune73 commented Feb 11, 2020

During the monthly CRS community chat, we decided to close this in favor of the bigger solution that does decoding through a variety of transformations. This is aimed for CRS 3.3.

Meeting minutes: #1671 (comment)

9 participants