Tuesday, 22 November 2011

Javascript Obfuscation - Properties access

The way we access properties of an object in Javascript is pretty straightforward, with a few small exceptions. What is interesting for obfuscation, though, is how we can use it, and that's what we will see.

Common object properties access

The most common way to access a property of an object is the dot notation. It's very simple, but it's not very convenient for obfuscation, since the property name has to be written out explicitly. There is a second way to access a property that is more convenient for obfuscation: the bracket notation. Example :

var foo = {a : 1};
foo.a // dot notation
foo["a"] // bracket notation


The reason the second one is more convenient for obfuscation is that it takes a string and, as we have seen before, we can easily use obfuscation techniques to produce the string we want. So the previous example can be rewritten as :

var foo = {a : 1};
foo[(!1+"")[1]]


Number properties access 

Numbers have 2 extra ways to access their properties. They exist because the dot notation doesn't work directly on a number literal: the dot right after the digits is read as a decimal point. For that reason we have the "dot-dot" notation and the "space-dot" notation to access the properties of a number. Most developers, even experienced Javascript developers, don't know about them. Example :

1..toFixed(2) === "1.00"
(1 .constructor+"")[11] === "m"
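As a small sketch of the notations above side by side (the variable names are only there for illustration), all of these reach the same "toFixed" method on the number 1 :

```javascript
// Four spellings of the same property access on a number literal.
var viaDotDot = 1..toFixed(2);     // the first dot ends the literal "1.", the second one accesses
var viaSpaceDot = 1 .toFixed(2);   // the space ends the literal before the dot
var viaParens = (1).toFixed(2);    // the parentheses end the literal
var viaBracket = 1["toFixed"](2);  // bracket notation needs no trick at all
```

The bracket form is the one that combines best with obfuscated strings, since "toFixed" can then be built at runtime.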

Sunday, 20 November 2011

Javascript Obfuscation - The numbers

The numbers

In this section we will explore the different ways we can get number values. The way Javascript handles numbers and operators has a good set of peculiar behaviors that can be surprising, and we will look at them in detail. Also, numbers in Javascript only exist in one flavor: 64-bit float. Whether we are talking about 1 or 0.5, it's always a 64-bit float.

Number declaration

There are multiple ways of declaring a number. Most of them are simple, but it can always be interesting to use a variety of them to confuse the reader.

Notation Expression
Decimal 100
Octal 0144
Hexadecimal 0x64

Using parseInt

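The "parseInt" function has two particularities that are very interesting for obfuscation. The first one is that if you don't pass a 2nd argument to the function, it doesn't simply default to base 10: it tries to guess the base from the prefix of the string. A "0x" prefix is read as hexadecimal and, in older engines (before ECMAScript 5), a leading "0" is read as octal.

parseInt("10") === 10
parseInt("0x10") === 16
parseInt("010") === 8 // older engines only, ES5 and later give 10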


The second particularity of the function is that you can pass anything as the first argument, including objects and functions. When you pass it something that isn't a string, it will internally cast it to a string. Here are a few examples that use this :

parseInt([].sort, 16) === 15 // function ... with base 16
parseInt([][[]], 31) === 26231474015353 // undefined ... with base 31
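Here is a short sketch of both points, with the base always passed explicitly so the result doesn't depend on what the engine guesses. The variable names are mine, for illustration only :

```javascript
// With an explicit radix the prefix guessing no longer matters.
var asDecimal = parseInt("010", 10);  // 10
var asOctal = parseInt("010", 8);     // 8
var asHex = parseInt("0x10", 16);     // 16, the "0x" prefix is accepted with radix 16

// Non-string arguments are converted to a string first.
var fromUndefined = parseInt(undefined, 31);  // parses "undefined" in base 31
var fromNull = parseInt(null, 36);            // parses "null" in base 36
```

Any value whose string form only contains digits valid in the chosen base can be turned into a number this way.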


Casting anything to number

It's also possible to obtain a number by using "+" as a unary operator. Prefixing a boolean gives 1 or 0, and prefixing a number or a numeric string gives that number. Be careful, the result is not based on whether the value is truthy or falsy: "+[]" is 0 even though "[]" is truthy, because the array is first converted to the empty string. Values that don't convert to a number, like "+{}", give NaN. Here's a good summary of what you can do with it :

Expression Result
+[] 0
+"" 0
+!![] 1
+null 0
+true 1
+false 0
+"10" 10
+"010" 10

Note: I left the last one to point out that the "+" operator doesn't treat a leading zero as octal, the string is read in base 10.
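A few more conversions, as a sketch, to show values that fall outside the 0 and 1 results of the table. The property names of "results" are only labels I picked :

```javascript
var results = {
  emptyArray: +[],      // [] becomes "" which becomes 0
  singleItem: +[5],     // [5] becomes "5" which becomes 5
  emptyObject: +{},     // {} becomes "[object Object]" which is NaN
  undef: +undefined,    // NaN
  hexString: +"0x10",   // 16, the "0x" prefix is understood
  leadingZero: +"010"   // 10, a leading zero is ignored
};
```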

Sunday, 30 October 2011

Javascript Obfuscation - Getting "window"

In this section, I will show how to get the "window" global variable in an obfuscated way. This section is strongly related to how context ("this") works in Javascript. If you have no clue what a context is in Javascript, I suggest you read up on it before continuing.

The most common method in Javascript obfuscation to access "window" is to leak it. In non-strict mode, the global object (window) can leak in some cases. Here's a quick example to show how you can leak it :

function test() {
    return this;
}

a = test();


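The variable "a" will now contain "window". This is a simple example, but it's not that great for obfuscation. What is better for obfuscation are native methods that can leak "window". One of the simplest and most reliable native methods to leak the global object from is "Array.prototype.concat". Note that this relies on the engine substituting the global object for a missing "this" in native methods. Engines that follow ECMAScript 5 or later are expected to throw a TypeError on this call instead.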

Example :

a = [].concat; // We create a reference to Array.prototype.concat
b = a()[0]; // b now contains "window"


If you want to obfuscate this further, you can always use the tricks learned in the previous blog post and transform it into this :

[_=[][(1+{})[6]+(1+{})[2]+([][0]+"")[1]+(1+{})[6]+(!1+"")[1]+(!0+"")[0]],__=_()[0]]

Now "__" contains the "window" global object.
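Another way to leak the global object, as a sketch that doesn't come from the examples above, is to go through the "Function" constructor. A function created this way is never in strict mode, so calling it without a receiver returns the global object. The variable names are illustrative, and a page with a strict Content Security Policy may refuse to run it :

```javascript
// A function built by the Function constructor is non-strict,
// so "this" is the global object when it is called directly.
var leaked = Function("return this")();

// Same idea, with the constructor reached through bracket notation
// so that the word "Function" never appears in the source.
var leakedAgain = [].concat["constructor"]("return this")();
```

Both strings ("constructor" and "return this") can of course be built with the string tricks from the other posts.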

Javascript Obfuscation - Rewriting block of code

This section is about how to rewrite a block of code in an obfuscated way. One of the obvious things to do is to put all your operations on one line by removing the extra spacing, but this technique won't get you far, since a simple tool like jsbeautifier will deobfuscate your code very easily. What we will see is divided in 2 sections: the first one is about blocks of code that don't use any loop or condition, and the second one is about rewriting code that uses conditions.

Simple block of code

Using array declaration

Array declarations are a nice and compact way to rewrite a block of code, especially if we are re-using the result of a previous operation.

Example :

foo = 1;
bar = foo + 2;


Can be rewritten as :

[bar = [foo = 1][0] + 2];


This example is trivial, but there is one interesting thing to note. Most deobfuscators won't be able to rewrite the code in a nice way. If you use this pattern with a larger amount of code, it will be a pain for people to understand the code even if they use tools.

Note : You can use the same principle with object declarations, but the syntax is heavier and easier to follow.
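For comparison, here is a sketch of the object declaration variant. The property names "first" and "second" are arbitrary, they are only there because an object literal needs keys, and the variables are declared up front so the snippet also runs in strict mode :

```javascript
var foo, bar;

// Property values are evaluated in source order, so foo is set before bar uses it.
({ first: foo = 1, second: bar = foo + 2 });
```

After this runs, "foo" is 1 and "bar" is 3, exactly like the array version.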

Comma and parentheses

Using commas in parentheses is another way to obfuscate code that is very similar to the previous one. Most people don't know about it, and the result can leave them perplexed.

Example :

({a:1},{a:2}).a


What is the result of this expression ? 1, 2 or an error ?

The actual answer is 2, because when you separate multiple operations with commas in parentheses, the result of the parentheses is the result of the last operation. Once you know it, it's simple, but for people that aren't aware of it, it can be puzzling.
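To make the evaluation order visible, here is a small sketch with a hypothetical helper "step" that records when it is called. Every operand is evaluated from left to right, but only the last value is kept :

```javascript
var calls = [];

// Hypothetical helper: remembers its name and passes the value through.
function step(name, value) {
  calls.push(name);
  return value;
}

var result = (step("first", { a: 1 }), step("second", { a: 2 })).a;
```

Here "result" is 2 and "calls" contains "first" then "second", so the first operand did run even though its value was thrown away.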

Lisp style

This last trick is interesting just for the look. It's mainly about syntax that uses an abusive amount of parentheses.

Example :

(function z(){ return(z); })((foo = (1)))((bar = ((foo) + (2))))((alert((bar))))


If you use a lot of a specific set of characters, your code will generally be harder to read. Parentheses here are just an example, but it could also apply to "{" and "}".

Rewriting conditional block of code

In Javascript it's possible to replace blocks of code that use if/else statements with the conditional operator (ternary operator), logical operators (&&, ||), parentheses and commas.

Let's first take a look at what we can do with logical operators, parentheses and commas. This technique primarily uses the fact that logical operators are evaluated lazily, so some blocks of code will only be executed in the cases we want.

Example :

if (test == 2) {
    bob = 1;
    foo = bob + 2;
}


Can be rewritten as :

(test == 2 && (bob = 1, foo = bob + 2))


With this technique we can also transform an else statement with a little bit of logic.

Example :

if (test == 2) {
    bob = 1;
    foo = bob + 2;
} else {
    bob = 2;
}


Can be rewritten as :

((test == 2 && (bob = 1, foo = bob + 2, true)) || (bob = 2))


Note : The "true" is added there to make sure the first part of the expression always evaluates to true if test equals 2. This is a good trick to make sure the code will do the exact same thing even if we swap the "bob = 1, foo = bob + 2" part for something else. "true" can also be replaced with "1" or any expression that is truthy.

There is also the conditional operator (often called the ternary operator), which is useful to achieve the same thing. With the conditional operator, the 2 previous examples would look like this :

(test == 2) ? (bob = 1, foo = bob + 2) : void(0)


and

(test == 2) ? (bob = 1, foo = bob + 2) : (bob = 2)
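As a quick check that the three forms of the if/else example behave the same, here is a sketch that wraps each of them in a function. The function names are mine, and "bob" and "foo" are made local so the snippet is self-contained :

```javascript
function withIf(test) {
  var bob, foo;
  if (test == 2) {
    bob = 1;
    foo = bob + 2;
  } else {
    bob = 2;
  }
  return [bob, foo];
}

function withLogic(test) {
  var bob, foo;
  ((test == 2 && (bob = 1, foo = bob + 2, true)) || (bob = 2));
  return [bob, foo];
}

function withTernary(test) {
  var bob, foo;
  (test == 2) ? (bob = 1, foo = bob + 2) : (bob = 2);
  return [bob, foo];
}
```

Calling any of them with 2 gives [1, 3], and with any other value gives [2, undefined].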

Friday, 21 October 2011

Javascript Obfuscation - Introduction

Introduction

This is the beginning of a series of blog posts about Javascript obfuscation. Obfuscation in Javascript is a very interesting topic, because Javascript has a lot of special syntax and special behaviors that can be abused to produce totally unreadable code. Truly obfuscated code can be so dark that even tools like JS Beautifier won't help you get a clue about what's going on. The only thing missing about Javascript obfuscation is places to find knowledge about it, and this is why I am starting a series of blog posts about it.

Everything in this series should be working in a modern browser (IE7 is not a modern browser) unless it's noted otherwise. If you see any mistake, you can leave me a comment or send me a message and I will correct it.


Tuesday, 27 September 2011

Javascript Obfuscation - The booleans

The booleans

In this section we will explore the different ways we can get boolean values. It might seem at first look that there isn't much to say about the subject, and I hope you will be surprised by the end of the post.

There are 2 common ways to get boolean values in an obfuscated way. The first one we will see is by using the "equality" operators and the second one is by using the "not" operator. It's also good to note that the "or" and "and" operators aren't used much in obfuscation to get booleans, simply because they don't always give a boolean as an output (I wrote about this in another blog post).

The equality operator

Javascript has two operators to test for the "equality" of 2 values. The first one, "==", is more of a similarity operator, and the second one, "===", is a strict equality operator. The first one is particularly interesting because it gives results that can be unexpected in various cases. Here are a few examples of this :

(null == undefined) => true
(false == "false") => false
(false == "0") => true
(NaN == NaN) => false
(2 == [[[2]]]) => true
([2] == [[[2]]]) => false

You can also see this Stack Overflow answer for more examples.

For the "===" operator there is less to say, because it's a strict operator. The only result that can be unexpected is the following :

(NaN === NaN) => false

NaN is a special value in Javascript. Any arithmetic operation that involves NaN gives NaN, and NaN is defined as not being equal to anything, including itself. This is why testing whether something equals NaN is always false.
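Since an equality test can't be used, here is a sketch of the usual ways to detect NaN. "Number.isNaN" was added in ECMAScript 2015, so it isn't available in the older browsers this series was written for, and the variable names are only illustrative :

```javascript
var value = 0 / 0;                   // NaN

var selfCompare = value !== value;   // true only when value is NaN
var viaNumber = Number.isNaN(value); // true, no type conversion involved
var viaGlobal = isNaN("abc");        // true, because "abc" is converted to NaN first
var strictCheck = Number.isNaN("abc"); // false, "abc" is a string and not NaN
```

The self comparison is the one that fits obfuscation best, since it doesn't need any function name.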

The not operator

The not operator "!" in Javascript is commonly used to force something to be cast to a boolean value. What is less known, and that you can use in obfuscation, is the result of this cast on values that don't look at first like they were made to be cast to a boolean. Here are a few examples :

Expression Result
!0 true
!3 false
![] false
!({}) false
!"" true
!"0" false
!NaN true
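Building on the table, here is a short sketch of how the not operator is typically doubled to get "true" and "false" without writing them, and how that combines with the unary "+" to get small numbers. The variable names are illustrative :

```javascript
var castTrue = !![];    // [] is truthy, so this is true
var castFalse = !!"";   // "" is falsy, so this is false

var one = +!![];        // true converted to a number: 1
var zero = +![];        // false converted to a number: 0
var two = !![] + !![];  // true + true: 2
```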

Monday, 26 September 2011

Javascript Obfuscation - The strings

The strings

Strings can be obtained in a couple of unusual ways. Most of them rely on abusing type conversion. Abusing type conversion to produce obfuscated code is in itself a very interesting thing that we will also explore in other parts.

To get started, I will show the basic steps that are commonly used to produce an obfuscated string.

Step 1 - Available string

There is a good number of things that will produce a string in Javascript. Here's a good list to get started :

(1/0)+"" === "Infinity"
1+{} === "1[object Object]"
("a"/2)+"" === "NaN"
false+"" === "false"
true+"" === "true"
[].concat+"" === "function concat() { [native code] }"
[][[]]+"" === "undefined"

Note that all of these force the conversion of a value to a string in various ways. The most common one is adding an empty string (+""). The 2nd example uses the fact that when you add an object and a number, both of them are converted to strings before they get added.

Step 2 - Getting letters

Now that we have a couple of strings available, the next thing to do is to get letters by using the fact that strings are array-like structures. If we combine that with step 1, we can get this :

"undefined"[0] === "u"
([][[]]+"")[0] === "u"

You can get most of the letters you need with the strings given in step 1, but there are letters that you simply can't get using that method. For those letters there are other ways to get them in an obfuscated way. Here are a few ways that you can use :

Name Notation
Unobfuscated "{"
Unicode "\u007b"
Octal ASCII "\173"
Hexadecimal ASCII "\x7b"

Step 3 - Composing string

Now that we can get individual letters, we can combine them to make our own strings. Here are a few examples :

(!1+"")[2]+(1+{})[2]+(!1+"")[2] === "lol"
(1+{})[4]+(!1+"")[3] === "js"

Extra

In addition to this blog post, here's a reference for how to get most of the letters you will need in an obfuscated way.

Letter Obfuscated letter
a (!1+"")[1]
b (1+{})[3]
c (1+{})[6]
d ([][[]]+"")[2]
e ([][[]]+"")[3]
f ([][[]]+"")[4]
i ([][[]]+"")[5]
j (1+{})[4]
l (!1+"")[2]
m* (1..constructor+"")[11]
n ([][[]]+"")[1]
o (1+{})[2]
r (!0+"")[1]
s (!1+"")[3]
t (!0+"")[0]
u ([][[]]+"")[0]
v* ([].sort+"")[23]
y (1/0+"")[7]

* It's not guaranteed by the specification to return this value.
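The reference above can be turned into a tiny encoder. This is only a sketch: "LETTERS" and "obfuscate" are names I made up, and the two starred letters are left out on purpose because their result depends on how the engine prints native functions :

```javascript
// Obfuscated expression for each letter, taken from the reference table.
var LETTERS = {
  a: '(!1+"")[1]',     b: '(1+{})[3]',      c: '(1+{})[6]',
  d: '([][[]]+"")[2]', e: '([][[]]+"")[3]', f: '([][[]]+"")[4]',
  i: '([][[]]+"")[5]', j: '(1+{})[4]',      l: '(!1+"")[2]',
  n: '([][[]]+"")[1]', o: '(1+{})[2]',      r: '(!0+"")[1]',
  s: '(!1+"")[3]',     t: '(!0+"")[0]',     u: '([][[]]+"")[0]',
  y: '(1/0+"")[7]'
};

// Builds the source of an expression that evaluates to the given word.
function obfuscate(word) {
  return word.split("").map(function (letter) {
    if (!Object.prototype.hasOwnProperty.call(LETTERS, letter)) {
      throw new Error("No obfuscated form for: " + letter);
    }
    return LETTERS[letter];
  }).join("+");
}

var encoded = obfuscate("concat");              // source text of the expression
var decoded = Function("return " + encoded)();  // evaluates it back to "concat"
```

Letters that are missing from the table make "obfuscate" throw, which is where the escape notations from step 2 would come in.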

Sunday, 25 September 2011

Javascript Obfuscation - The variables

The variables

One of the things about variables in Javascript that is not well known is that their names can contain a very wide range of characters. Let's take a look at what the Ecmascript specification allows :

  • Your variable name can start with $, _ or a Unicode Letter (Lu, Ll, Lt, Lm, Lo, Nl).
  • For the rest of the name of the variable you can use $, _, Unicode Letter or Unicode Number (Nd).
There's a ton of stuff that is considered a valid variable name. What's really interesting for obfuscation is that a lot of letters look very similar, and there are also letters for which most people will only see a square when they view the source. The only thing you need to be careful about is encoding. If you're using UTF-8 and your Javascript file is not recognized as a UTF-8 file on the client side, that will break your script. The same thing applies to all encodings, of course.

If you aren't comfortable using Unicode extensively, there are other things you can abuse. One of them is the way some characters are displayed. The best example of this is the underscore character. In most text editors, it's very hard to know how many of them there are when you place them one after the other. Are _____ and ______ the same variable name ?
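Here is a sketch of both ideas. To avoid any encoding problem in this post, the Cyrillic letter is written with a Unicode escape, which Javascript also accepts inside identifiers. The variable names are only there for the demonstration :

```javascript
var a = 1;          // Latin small letter a (U+0061)
var \u0430 = 2;     // Cyrillic small letter a (U+0430), looks identical on screen
var same = \u0061;  // the escape of the Latin letter refers to the variable "a"

var _____ = 5;      // five underscores
var ______ = 6;     // six underscores, a different variable
```

With the real Cyrillic character typed in the source instead of the escape, the first two declarations are nearly impossible to tell apart.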



Thursday, 8 September 2011

LiveTool, P2P experiment in Javascript

I have been working for the last month on a project called LiveTool, which was an experiment to see how P2P communication could be used in a Javascript application. The project itself can be found on GitHub : https://github.com/HoLyVieR/LiveTool

In brief, the project was a graphic editor that could be used by multiple people at the same time, and everyone could see the modifications as soon as they were made.

What were the obstacles ?


One of the main obstacles in starting this project was that RTMFP is not a well-known technology and the documentation is very limited. In fact, nearly everything you will find about it is made by Adobe. The way RTMFP works in Actionscript is also not very intuitive. And when I wanted to start this project, I couldn't find any Javascript library that would allow me to use RTMFP without doing any Actionscript. This is the main reason I built the RTMFP-JS library.

What about support for people that don't have Flash ?


This is a point I wanted to experiment with in the project. Using P2P communication is very interesting, but it isn't widely supported, so you still have to think about integrating a fallback for those that don't support it. Technically speaking it isn't hard to implement, but if you don't integrate it from the start, it can be very hard to add later. The fallback that I implemented for LiveTool is very simple: if you don't support RTMFP, or if you want to send data to someone that doesn't support RTMFP, the server is used as a repeater.
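The routing rule described above can be sketched in a few lines. This is not the LiveTool code: "createSender", "p2pSend", "relaySend" and the "supportsP2P" flag are all hypothetical names that I use only to illustrate the decision :

```javascript
// Hypothetical sketch of the fallback rule, not the actual LiveTool implementation.
// p2pSend and relaySend are functions supplied by the caller.
function createSender(localSupportsP2P, p2pSend, relaySend) {
  return function send(peer, message) {
    // Direct connection only when both ends support it.
    if (localSupportsP2P && peer.supportsP2P) {
      p2pSend(peer.id, message);
      return "p2p";
    }
    // Otherwise the server repeats the message to the peer.
    relaySend(peer.id, message);
    return "relay";
  };
}
```

Keeping the decision in one place like this is what makes it easy to have the fallback from the start.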

Will LiveTool have more updates ?


For now, I'm leaving the LiveTool project as it is, since I got to the point where I did the experiment that I wanted to do with it.

What about ... ?


If you have questions about the project, you can leave a comment and I will try to respond to you or do a blog post about it.

Wednesday, 7 September 2011

P2P and Javascript

I have been experimenting with P2P connections in Javascript (in the browser) over the past few months. It's been quite interesting, and it has shown me that it could have a lot of potential.

How do I do P2P communication in Javascript ?


You probably won't hear a lot about P2P communication and Javascript, because there isn't any native API that allows you to have a direct connection with other clients. But it is supported through a common 3rd party plugin : Flash. Inside Flash you can use a protocol called RTMFP which allows P2P connections. If you want to read up about it, you can find more information on Adobe's website.

But you said with Javascript, not Actionscript !


The nice thing about Flash is that you can communicate with Javascript through ExternalInterface. With this we can use Flash technology from Javascript, and we don't have to create a Flash application to use RTMFP. If you don't really care about the Actionscript / ExternalInterface part, you can use an existing library which will take care of it. I built my own, which is available on GitHub : https://github.com/HoLyVieR/RTMFP-JS-Bridge

Why should I use P2P communication ?


P2P communication is mainly aimed at applications that require multiple clients to speak to each other. One of the basic examples of that kind of application would be a chat. All the information that is sent between the clients doesn't have to pass through the server. It reduces the traffic and load on your server, since you are now sending and receiving less data while doing the same thing.

Why shouldn't I use P2P communication ?


There are 2 principal obstacles to P2P communication : network infrastructure and Flash support. If you are inside a corporate or school network, there's a strong chance nobody will be able to connect to you because of the way the network is set up. As for Flash, it isn't supported on some mobile platforms, and even though a large proportion of Internet users have it, some don't for various reasons (OS support, they don't have admin rights on their machine, etc.).

What about security ?


Information about the RTMFP protocol is quite vague or hidden, so it's hard to say how secure the protocol is, but from what Adobe has written so far it seems that the information exchanged between the clients is encrypted.

The only certainty you can have is that the information a client receives comes from an untrusted source (another client, not the server), so you have to implement the same checks that your server would do, but on the client.
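As a sketch of what such client-side checking could look like, here is a validator for a made-up message format. The "line" message, its fields and the "MAX_COORD" limit are assumptions for the example, they don't come from RTMFP or from LiveTool :

```javascript
// Hypothetical message shape: { "type": "line", "x1": n, "y1": n, "x2": n, "y2": n }
var MAX_COORD = 4096; // assumed canvas limit, for illustration only

// Returns a clean copy of the message, or null if anything looks wrong.
function parsePeerMessage(raw) {
  var data;
  try {
    data = JSON.parse(raw);
  } catch (e) {
    return null; // not valid JSON
  }
  if (data === null || typeof data !== "object" || data.type !== "line") {
    return null; // unknown or missing message type
  }
  var fields = ["x1", "y1", "x2", "y2"];
  var clean = { type: "line" };
  for (var i = 0; i < fields.length; i++) {
    var value = data[fields[i]];
    if (typeof value !== "number" || !isFinite(value) || value < 0 || value > MAX_COORD) {
      return null; // coordinate missing, not a number or out of range
    }
    clean[fields[i]] = value;
  }
  return clean; // only the expected fields are copied over
}
```

Copying only the expected fields means that anything extra a peer adds to the message is dropped before the rest of the application sees it.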