Outline
• Code Obfuscation concepts
• Code Obfuscation metrics
• Practical examples
OWASP Europe Tour 2013
Code Obfuscation
Obfuscation “transforms a program into a form that is more difficult for an adversary to understand or change than the original code”, where more difficult means “requires more human time, more money, or more computing power to analyze than the original program.”
In Collberg, C., and Nagra, J., “Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection”, Addison-Wesley Professional, 2010.
Code Obfuscation
• Lowers the code quality in terms of
– Readability
– Maintainability
• Delays program understanding
• Delays program modification
• Time required to reverse it > program useful lifetime
• Cost of reversing it > cost of developing it from scratch
• Resources needed to reverse it > value obtained from reversing it
What is it good for?
Good
• Prevent code theft and reuse
– E.g. stop a competitor from using your code as a quickstart to build its own
• Protect Intellectual Property
– Hide algorithms
– Hide data
– DRM (e.g. watermarks)
• Enforce license agreements
– E.g. domain-lock the code
• As an extra security layer
– Harder to find vulnerabilities in the client-side
• Test the strength of security controls (IDS/IPS/WAFs/web filters)
Evil
• Test the strength of security controls (IDS/IPS/WAFs/web filters)
• Hide malicious code
– Make it look like harmless code
Why not rely on the Server?
• Question often raised: why not move security-sensitive code to the server and have JS request it whenever needed?
• Sometimes you can... and you should!
• But there are plenty of situations where you can’t:
– You may not have a server
  • Widgets
  • Mobile Apps
  • Standalone, offline-playable games
  • Windows 8 Apps made with WinJS
– You may not want to have a server
  • It may not be cost-effective to do computations on a server (you have to guarantee 100% uptime, support teams)
  • Latency
Measuring Obfuscation
• Potency
• Resilience
• Stealthiness
• Execution Cost
• Maintainability
Next:
• We’ll present each metric using simple examples
• This is intentional, to ease the process of understanding the metrics
• However, they do not represent the full extent of what you can obtain if you combine a large set of different obfuscation transformations.
Measuring Obfuscation: Obfuscation Potency
Generates confusion
• Measure of the confusion that a certain obfuscation adds
• Or “how much harder it gets to read and understand the new form when compared with the original”
• To the left you can see a simple example of a factorial function
Measuring Obfuscation: Obfuscation Potency
Rename all + Comment removal
Generates confusion
• Now to the right you see the result of renaming every symbol to a mix of lower and upper case O’s. We all know that function names and variable names are quite useful for understanding the code. Not only have we lost that, but the new names are also easily confused with one another.
• Comments were also removed. They, too, are important for understanding a program.
• So we can definitely say that the obfuscation introduced a certain degree of confusion. It has added some potency.
Measuring Obfuscation: Obfuscation Potency
Rename all + Comment removal + Whitespace removal
Generates confusion
• Now, below, you can see the result of removing whitespace from the code. It becomes slightly more confusing, so we can say it is slightly more potent than the previous example.
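The slide figures are not reproduced in this text, so the following is only a stand-in sketch of the three forms discussed above: a readable factorial, the same function after renaming and comment removal, and after whitespace removal. All identifiers here are assumptions chosen for illustration, not the names from the original slides.

```javascript
// Original form: meaningful names and comments.
// Computes n! recursively.
function factorial(number) {
  // Base case: 0! and 1! are both 1.
  if (number <= 1) {
    return 1;
  }
  return number * factorial(number - 1);
}

// After "rename all" + comment removal: names are a mix of lower and upper case O's.
function OoOo(oOOo) {
  if (oOOo <= 1) {
    return 1;
  }
  return oOOo * OoOo(oOOo - 1);
}

// After whitespace removal as well (different names so all three can coexist here).
function oOoO(OooO){if(OooO<=1){return 1}return OooO*oOoO(OooO-1)}

console.log(factorial(5), OoOo(5), oOoO(5));
```

All three forms compute the same result; only the effort needed to read them changes.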
Measuring Obfuscation: Obfuscation Resilience
Resistance to deobfuscation techniques, be they manual or automatic
• Represents the measure of the resistance that a certain obfuscation offers to deobfuscation techniques
• Or “how hard it is to undo the obfuscation and get back to the original form”
• To the left you can see the same example function as before
Measuring Obfuscation: Obfuscation Resilience
Rename all + Comment removal
Resistance to deobfuscation techniques, be they manual or automatic
• On the right you can see the result of applying the rename_all obfuscation.
• This is an example of an obfuscation that is 100% resilient because, assuming that you don’t have access to the original source code, it’s impossible to tell what the original names were.
• The comment removal obfuscation is also 100% resilient, as you can’t possibly know whether the original form had any comments, let alone recover them.
Measuring Obfuscation: Obfuscation Resilience
Rename all + Comment removal + String splitting
Resistance to deobfuscation techniques, be they manual or automatic
• At the bottom, you see the result after applying string splitting.
• You can definitely see that it is more potent than the previous form, but if you look carefully, you can see that it’s not hard to revert it to the previous form.
• So we can say that this version does not really add much resilience when compared with the previous form.
Static Code Analysis for defeating obfuscation
One way of attacking obfuscation is using a Static Code Analyser
1. Parses the code
2. Transforms it to fulfill a purpose
– Usually to make it simpler => better performance
– Simpler also fulfills the reverse-engineering purpose
• Example simplifications
– Constant propagation, constant folding
– Remove (some) dead code
• And most importantly, it is automatic!
Constant propagation:
x = 10; y = 7 - x / 2;  =>  x = 10; y = 7 - 10 / 2;
Constant folding:
N = 12 + 4 - 2;  =>  N = 14;
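To make constant folding concrete, here is a minimal sketch over a tiny hand-built expression tree. The node shapes (`lit`, `bin`, and the `var` node) are assumptions made for this example and are not the representation used by any particular analyser. The same folding step also undoes string splitting, because adjacent string literals joined by `+` collapse back into one literal.

```javascript
// Tiny expression tree: literals and binary operations only (hypothetical shape).
const lit = (value) => ({ type: "lit", value });
const bin = (op, left, right) => ({ type: "bin", op, left, right });

// Recursively replace any operation on two literals with the computed literal.
function fold(node) {
  if (node.type !== "bin") return node;
  const left = fold(node.left);
  const right = fold(node.right);
  if (left.type === "lit" && right.type === "lit") {
    switch (node.op) {
      case "+": return lit(left.value + right.value);
      case "-": return lit(left.value - right.value);
      case "*": return lit(left.value * right.value);
      case "/": return lit(left.value / right.value);
    }
  }
  // Not foldable: keep the operation, but with its (possibly folded) operands.
  return bin(node.op, left, right);
}

// N = 12 + 4 - 2 folds to a single literal.
const numeric = fold(bin("-", bin("+", lit(12), lit(4)), lit(2)));

// "fac" + "tor" + "ial" folds back to the unsplit string.
const text = fold(bin("+", bin("+", lit("fac"), lit("tor")), lit("ial")));

// A variable reference blocks folding at that level; the literal side is still simplified.
const partial = fold(bin("+", { type: "var", name: "x" }, bin("*", lit(2), lit(3))));

console.log(numeric, text, partial);
```

The last case hints at why resilience depends on the transformation: folding only works when every operand is known statically.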
• We used Google Closure Compiler, which performs static code analysis, to simplify the code.
• The result is on the right: as you can see, it returned code that is much easier to read.
• If we compare the code on the right with the original code (on the left), we can see that they are not far apart.
• So the potency of the obfuscation is only apparent. The real potency, the one we should consider, is what remains after using automated ways of reversing the code.
• This does not mean that the string-splitting obfuscation is useless. It has to be combined with other obfuscations that provide more resilience.
Dynamic Code Analysis for defeating obfuscation
• Another way of attacking obfuscation
• Analysis performed by executing the code
– Retrieves the control flow graph (CFG) of the code executed
– Retrieves values of some expressions
• How it can be used to defeat obfuscation
– Understand (one instance of) the program execution
  • You can focus on parts that you are sure are executed
– Retrieve values of some expressions
  • Aids code simplification
  • Find the needle in the haystack => e.g. retrieve an encryption key
– Bypasses dead code
– Not very good for automatic reversal of obfuscation
  • May not “see” all useful code
  • If you need to make sure the code will remain 100% functional, you cannot use this technique
– Gather knowledge for manual reverse engineering
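As a rough illustration of retrieving expression values at run time, the sketch below wraps a function so that every call records its arguments. The `app` object, its `xor` and `run` methods, and the `trace` helper are all hypothetical names invented for this example; the point is only that a key assembled at run time becomes visible once the code is executed under observation.

```javascript
// Hypothetical obfuscated module: the key never appears as a plain literal.
const app = {
  xor(key, bytes) {
    return bytes.map((b, i) => b ^ key.charCodeAt(i % key.length));
  },
  run(bytes) {
    const key = ["t", "e", "r", "c", "e", "s"].reverse().join("");
    return this.xor(key, bytes);
  },
};

// Replace object[name] with a wrapper that logs arguments and results.
function trace(object, name, log) {
  const original = object[name];
  object[name] = function (...args) {
    const result = original.apply(this, args);
    log.push({ name, args, result });
    return result;
  };
}

const log = [];
trace(app, "xor", log);
const output = app.run([1, 2, 3]);

// The first recorded call exposes the key that was built at run time.
const recoveredKey = log[0].args[0];
console.log(recoveredKey, output);
```

Only the calls that actually happen are recorded, which matches the limitation noted above: code that is never executed is never seen.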
Measuring Obfuscation: Obfuscation Stealthiness
• How hard is it to spot?
– Or “how hard it is to spot the changes performed by the obfuscation”
– Or “how successful the obfuscation was in making the obfuscated targets look like other parts of the code”
• An obfuscation is more stealthy if it avoids common telltale indicators
– eval()
– unescape()
– Large blocks of meaningless text
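A crude way to see why these indicators hurt stealthiness is to write the check an observer might run. The function below is a heuristic sketch only; the name `telltaleIndicators` and the 200-character threshold for a "large block of meaningless text" are assumptions made for this example.

```javascript
// Return the list of telltale indicators found in a piece of source text.
function telltaleIndicators(source) {
  const findings = [];
  if (/\beval\s*\(/.test(source)) findings.push("eval()");
  if (/\bunescape\s*\(/.test(source)) findings.push("unescape()");
  // Assumption: a quoted run of 200+ characters with no whitespace looks like an encoded blob.
  if (/["'][^"'\s]{200,}["']/.test(source)) findings.push("long opaque string");
  return findings;
}

const packed = 'eval(unescape("%61%6c%65%72%74"))';
const plain = "function add(a, b) { return a + b; }";
const blob = 'var s = "' + "A".repeat(300) + '";';

console.log(telltaleIndicators(packed));
console.log(telltaleIndicators(plain));
console.log(telltaleIndicators(blob));
```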
Measuring Obfuscation: Obfuscation Execution Cost
• Impact on performance
– Runs per second
– FPS (e.g. games)
– Obfuscation usually does not have a positive impact on performance, but it does not necessarily have a negative one either. It depends on the mix of transformations chosen and on the nature of the original source code.
  • E.g. renaming symbols => same execution cost
• Impact on loading times
– Time before execution starts
– Usually a function of file size
– Obfuscation usually tends to grow the file size, but there are also some obfuscation transformations that make it smaller.
  • E.g. renaming symbols again
Measuring Obfuscation: Obfuscation & Maintainability
• Effect on maintainability = 1/potency (after static code analysis)
• Lower maintainability => mitigates code theft and reuse
• This is one of the most important concepts around obfuscation
PART 1 – OBFUSCATION CONCEPTS
PART 2 – OBFUSCATION METRICS
PART 3 – JAVASCRIPT OBFUSCATION PRACTICAL EXAMPLES
Compression/Minification vs Obfuscation
• This is a compressed version of it
• It really seems to be more potent. No doubt about it.
A simple trick will do it
eval( (function(....)) );
• By replacing the eval() with a document.write (just one way to do it) you get access to the decoded source.
document.write('<textarea>' + (function(...)) + '</textarea>');
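The same trick can be sketched outside a browser. The `packed` function below is a made-up stand-in for a packer's output (the real packed code is elided on the slide), and `capture` is a hypothetical helper that takes the place of eval() so that the decoded source is returned instead of being executed.

```javascript
// Hypothetical packer output: a function that rebuilds the source text at run time.
const packed = function () {
  const codes = [97, 108, 101, 114, 116, 40, 49, 41];
  return codes.map((code) => String.fromCharCode(code)).join("");
};

// The packed file would normally end with eval(packed()).
// Swapping eval for a function that only captures its argument exposes the source.
function capture(source) {
  return source;
}

const decoded = capture(packed());
console.log(decoded);
```

Because the decoding logic has to ship with the code, any encoding of this kind can be unwrapped by letting it decode itself.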
How is that possible?
• Using type coercion and browser quirks
• We can obtain alphanumeric characters indirectly
Easy to get any number:
+[] -> 0
+!+[] -> 1
+!+[]+!+[] -> 2
Type coercion to number: +"1" -> 1
Type coercion to string: ""+1 -> "1"
How to get letters?
+"a" -> NaN
+"a"+"" -> "NaN"
(+"a"+"")[+[]] -> "N"
Ok, but now without alphanumerics. How to get an "a"?
![] -> false
![]+"" -> "false"
(![]+"")[+!+[]] -> "a"
(+(![]+"")[+!+[]]+"")[+[]] -> "N"
eval( (![]+"")[+!+[]]+"lert(1)" );
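The coercion steps above can be checked directly in any JavaScript engine. The variable names below are only labels added for this sketch; the expressions on the right-hand side are the ones that matter. The final string is built but deliberately not passed to eval().

```javascript
// Numbers from empty arrays and boolean coercion.
const zero = +[];
const one = +!+[];
const two = +!+[] + !+[];

// Letters from the string forms of false and NaN.
const falseText = ![] + "";
const letterA = (![] + "")[+!+[]];
const letterN = (+(![] + "")[+!+[]] + "")[+[]];

// Source text assembled with the first letter obtained indirectly.
const payload = (![] + "")[+!+[]] + "lert(1)";

console.log(zero, one, two, falseText, letterA, letterN, payload);
```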
Wait... What about the eval?
• “eval” uses alphanumeric characters!
• eval() is not the only way to eval()!
• There are 4 or 5 more methods
• Examples
– Function("alert(1)")()
– Array.constructor("alert(1)")()
– []["sort"]["constructor"]("alert(1)")()
• Subscript notation
• Strings (we already know how to convert them)
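The alternatives listed above all reach the `Function` constructor by different routes. The sketch below uses a harmless body (`return 6 * 7`, chosen for this example) instead of `alert(1)` so it can run outside a browser; note that environments enforcing a strict Content Security Policy may block these calls just as they block eval().

```javascript
const body = "return 6 * 7";

// Direct use of the Function constructor.
const viaFunction = Function(body)();

// Array is itself a function, so Array.constructor is Function.
const viaArray = Array.constructor(body)();

// Subscript notation: only strings and brackets are needed to reach Function.
const viaSubscript = []["sort"]["constructor"](body)();

console.log(viaFunction, viaArray, viaSubscript);
```

The subscript form matters for non-alphanumeric obfuscation because property names become strings, and strings can be built with the coercion tricks shown earlier.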
Let me see that again!
Non-alphanumeric Obfuscation
• 100% potent
• 0% stealthy (when you see it, you know someone is trying to hide something)
• High execution cost
– eval is a bit slower
– But the worst is: the file is much larger => slower loading times
• May not work in all browsers
• What about resilience?
– Unfortunately, not much (you can get a parser to simplify it back to the original source)
• Good for bypassing filters (e.g. WAFs)
Deadcode injection
[Figure: original source code next to the result of deadcode injection + rename local]
Can you spot where the dead code is?
Deadcode injection
• Dead code insertion is a natural way of adding confusion to source code, and thus of increasing the potency of the obfuscation.
• Being dead code, it is never really executed, so it has no impact on execution cost.
• Would a Static Code Analyser remove this particular dead code?
• No, because it relies on opaque predicates
– Not removable using Static Code Analysis
– Predicates similar to ones found in the original source ( ++stealthiness )
• Randomly injected ( ++potent )
• Increases the complexity of the control flow ( ++potent )
• Dummy statements created out of the program’s own code ( ++potent & ++stealthiness )
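For readers unfamiliar with opaque predicates, here is a minimal sketch of one. The predicate relies on the fact that the product of two consecutive integers is always even, so the guarded branch can never run, yet a tool has to prove that arithmetic fact before it can delete the branch. The functions `sumToOriginal` and `sumTo` are invented for this example and are not taken from the slides.

```javascript
// Original function: sum of the integers 1..limit.
function sumToOriginal(limit) {
  let total = 0;
  for (let i = 1; i <= limit; i++) {
    total += i;
  }
  return total;
}

// Same function with injected dead code guarded by an opaque predicate.
function sumTo(limit) {
  let total = 0;
  for (let i = 1; i <= limit; i++) {
    // i * (i + 1) is a product of consecutive integers, hence always even:
    // the condition below is never true.
    if ((i * (i + 1)) % 2 === 1) {
      total = total * i - limit; // dummy statement built from the function's own variables
      continue;
    }
    total += i;
  }
  return total;
}

console.log(sumToOriginal(100), sumTo(100));
```

The dummy statement reuses `total`, `i`, and `limit`, which is what makes it blend in with the real code.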
All Together Now
• Up to now we have mostly seen no more than two or three obfuscation transformations working together.
• Let’s go back to the first example and see what happens when we mix a larger number of code obfuscation transformations together.
[Figure: HTML5 Canvas example from mozilla.org]
All Together Now
• remove comments
• dot notation
• rename local
• member enumeration
• literal hooking :low
• deadcode injection
• string splitting :high
• function reordering
• function outlining
• literal duplicates
• expiration date "2199-12-31 00:00:00"
All Together Now
• As you can see, you get a heavily obfuscated result.
• We intentionally didn’t use any encoding-based obfuscation in this example, to let you see the effect of these transformations together. Also, you are not seeing the whole code here.
• For the record, not all encoding transformations are easily reversed. We could use, for instance, a domain-lock encoding, which needs to get the correct information from the browser to decode properly.
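The slides do not describe how a domain-lock encoding is implemented, so the following is only one conceivable sketch of the idea: the payload is XOR-encoded with the licensed hostname, and decoding with any other hostname yields garbage. The function names, the hostnames, and the XOR scheme itself are all assumptions made for illustration. In a browser the key would come from `location.hostname`; here it is passed in as a parameter so the example runs anywhere.

```javascript
// XOR each character code with the key, repeating the key as needed.
// Applying it twice with the same key restores the original text.
function xorWithKey(text, key) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) ^ key.charCodeAt(i % key.length));
  }
  return out;
}

const licensedHost = "shop.example.com"; // hypothetical licensed domain
const encoded = xorWithKey("return 6 * 7", licensedHost);

// Decode using whatever hostname the code finds itself running on.
function decodeFor(hostname) {
  return xorWithKey(encoded, hostname);
}

const onLicensedHost = decodeFor("shop.example.com");
const onOtherHost = decodeFor("evil.example.net");

console.log(onLicensedHost === "return 6 * 7", onOtherHost === "return 6 * 7");
```

A plain XOR like this is weak on its own; the sketch only shows why the decoder needs environment information that a copy of the file on another domain does not have.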
After Closure Compiler
• And this is the result after running the code through Google Closure Compiler.
• It didn’t improve the readability much, because the obfuscation transformations offered a good degree of resilience.
Conclusion
• Don’t forget execution cost
– And where the code is executed. A smartphone usually has fewer resources than a desktop computer. Obfuscation should be tuned to the platform where the code is being executed.
• Obfuscation can be very effective as a way to prevent code theft and reuse, by
– Making it a real pain to understand the code
– Making it a real pain to change the code successfully
– Significantly lowering the value that an attacker can obtain from reversing the code
Contact Information
Pedro Fortuna, Owner & Co-Founder & CTO
email@example.com
Phone: +351 917331552
Porto - Headquarters: Edifício Central da UPTEC, Rua Alfredo Allen, 455, 4200-135 Porto, Portugal
Lisbon office: Startup Lisboa, Rua da prata, 121 5A, 1100-415 Lisboa, Portugal