You should worry about Vary

We’ve all heard of XSS, SQLi and CSRF. And although they keep occurring all the time, any decent web framework nowadays has some mechanisms to avoid those. Now did you know CSRF had a little sister? It is so poorly known that it seems like it doesn’t even have its own name! Before explaining how it works, let’s see what it can do.

Some background

Meet Bob, a web dev who is in charge of writing a public facing API. Bob sets the Access-Control-Allow-Origin header to allow all origin domains. Indeed, Bob wants any website to be able to talk to his API. Now, being a thoughtful developer, Bob knows about CSRF and he builds an authentication mechanism that delivers access tokens. Without a proper access token, his API won’t talk!

Still following? Good, now you may be wondering what this API does. In fact this API allows different services to store and share credentials in the cloud. The way a service gets the credentials is by issuing a request like this one:

GET /vault/{account}/{service}
Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l

Bob heard one day that Basic auth is unbreakable, so he uses it to pass the username and API key.
On top of that, the backend is doing some pretty dope crypto stuff and requesting the credentials is an expensive operation. Bob adds some Cache-Control information to tell the browser to cache the response for a little bit. This way Bob reduces the load on his server and improves the speed of apps consuming his API.
If the credentials check out, the API returns something like this:

200 OK
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Authorization
Cache-Control: no-transform, max-age=600
Content-Length: 42
Content-Type: application/json

{"username":"Alice","password":"I4mab055"}

Everything works fine until one day, a client complains that her credentials got stolen…
Damn it Bob, again?!

You just got hit

Bob can stare at his server logs for as long as he wants, he will never find anything. The attacker left no evidence behind because he never even had to talk to Bob’s API; everything happened locally, in Alice’s browser. Alice received this stupid email a few days ago and couldn’t resist opening a link to a picture of a cat wearing a ninja-turtle mask.
While she was looking at this picture, a piece of JavaScript issued an HTTP request in the background to Bob’s API and stole her credentials. But how did the malicious website get an API token in the first place, you’ll ask? Well, it didn’t, because it never had to. A simple request without the Authorization header was enough to get the information.

Introducing: Vary

So there’s this thing in the HTTP spec that you may or may not have heard about. It’s called the Vary header. What it does is tell any cache which request headers can cause the response to change. If the Vary header is omitted, a cache ignores the request headers when deciding whether to serve the response from cache. No matter which headers were sent to get the response the first time, the same response will be served back from cache every subsequent time until it expires. Remember how Bob made the response cacheable for 10 minutes? This means that the attack succeeds if it happens within 10 minutes of the original request.
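To make the mechanism concrete, here is a toy cache in Python. It is only a sketch of the lookup rule described above, not how any real browser cache is implemented: TinyCache, its methods and the URL are made up for the example, header names are assumed to be lowercase, and expiry is left out entirely.

```python
from typing import Dict, List, Optional, Tuple


class TinyCache:
    """Toy HTTP cache (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # url -> list of (request header values selected by Vary, body)
        self._entries: Dict[str, List[Tuple[Dict[str, Optional[str]], str]]] = {}

    def store(self, url: str, request_headers: Dict[str, str],
              vary: List[str], body: str) -> None:
        # Remember only the request headers that the response listed in Vary.
        selected = {name.lower(): request_headers.get(name.lower()) for name in vary}
        self._entries.setdefault(url, []).append((selected, body))

    def lookup(self, url: str, request_headers: Dict[str, str]) -> Optional[str]:
        # A stored response matches when every header named in its Vary
        # has the same value in the new request. With an empty Vary,
        # every request for the URL matches.
        for selected, body in self._entries.get(url, []):
            if all(request_headers.get(name) == value
                   for name, value in selected.items()):
                return body
        return None


URL = "/vault/alice/mail"  # made-up account and service
SECRET = '{"username":"Alice","password":"I4mab055"}'
ALICE = {"authorization": "Basic QWxhZGRpbjpPcGVuU2VzYW1l"}

# Bob's original response: no Vary header at all.
leaky = TinyCache()
leaky.store(URL, ALICE, vary=[], body=SECRET)

# Fixed response: Vary: Authorization.
fixed = TinyCache()
fixed.store(URL, ALICE, vary=["Authorization"], body=SECRET)
```

With the leaky cache, a request without any Authorization header gets Alice's stored response back; with the fixed one, the same request misses the cache and has to go to the server.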

To prevent this issue, the API should send the Vary header in its responses to inform caches that the response will be different for different values of the Authorization header. In our example we get:

200 OK
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Authorization
Cache-Control: no-transform, max-age=600
Content-Length: 42
Content-Type: application/json
Vary: Authorization

{"username":"Alice","password":"I4mab055"}

Going beyond

So stealing information is fun, but let’s see if we can use this flaw to do something better. Imagine a web page that allows a user to set a message of the day, stores it in a cookie and displays it whenever the user visits the page for the rest of the day. The page is cached and can be fetched cross-domain just like in the previous example, and it doesn’t send Vary: Cookie. If a user visits my website, I could issue a request to the greetings page with a custom Cookie. If I manage to perform this request when the page is not in cache yet, then the browser will fetch the page with my custom cookie and store it. Yes, we just found ourselves a cache-poisoning vulnerability. Now guess what happens next if this page allows the greeting message to contain JavaScript?

Wrapping up

I hope this article makes at least a few people aware of the dangers associated with Cache-Control and the Vary header. This vulnerability is similar to CSRF in that it allows cross-domain requests to do some damage. It is harder to exploit and does not let the attacker reach the server. This, however, makes attacks impossible to detect on the server side while still allowing some information leakage and defacing. In the case of cache poisoning, it widens the attack surface and can expose other vulnerabilities such as XSS or even SQL injection.

The fact that this vulnerability lies in the shadow of big names like XSS makes it more likely to be overlooked. In fact, it has occurred before and I bet there are exploitable scenarios in the wild like those discussed above. So be careful next time you build an API and take some time to review your response headers. Figuring out which headers make the response change can be trickier than you’d think!

You may want to try out this quick and dirty proof of concept to see this flaw in action.

GitHub’s “merge pull request” is wrong

This is the story of Bob. Bob is working on some cool project on GitHub. He’s written tests and runs them continuously on Travis. Bob asks his contributors to write tests for the new features they introduce. Using GitHub’s pull requests, he can make sure no bugs are introduced by looking for the green tick next to the commit number. If you look at the commit tree in Bob’s repository, you’ll probably find something like this.

Bob states in his contribution guide that people can fork master to work on new features. He guarantees that master’s tests always pass and therefore, individual contributors don’t need to worry that much about errors made by other developers.

If you’re familiar with GitHub then you are probably familiar with this workflow and you’ll think that it’s perfectly reasonable.

But you’d be wrong. Bob got overly enthusiastic about GitHub’s UI and will soon discover that he’s been wrong all this time.

A catastrophic scenario

Say two contributors, Alice and Charlie, are writing code for Bob’s project in their respective branches. In this project, we find the following code for operating Bob’s car:

private void doSomething(String action) {
  switch (action) {
    case "OPEN_DOOR":
       open_door();
       break;
  }
}

Now Alice thinks it would be better if Bob used an enum instead of a string. She changes the function to the following and opens a pull request on Bob’s project.

private void doSomething(Action action) {
  switch (action) {
    case Action.OPEN_DOOR:
       open_door();
       break;
  }
}

Bob opens his project page and sees a pull request. He thinks Alice’s modification makes sense and, since all tests passed, he clicks on the green button.

The button of doom

Meanwhile, Charlie thinks Bob will need to close his car’s door at some point. Charlie adds a new case to the switch:

private void doSomething(String action) {
  switch (action) {
    case "OPEN_DOOR":
       open_door();
       break;
    case "CLOSE_DOOR":
       close_door();
       break;
  }
}

The next day, Bob opens his project on GitHub and sees Charlie’s pull request. All tests passed, no merge conflict, that’s a clear LGTM and Bob merges the code.

A few days later, Bob receives complaints from multiple developers: they forked master and it doesn’t build!

What could go wrong?

The problem lies in two points that are commonly misunderstood:

  • Different PRs must be completely independent
  • No merge conflicts ≠ No conflicts

If you let either of them slip, expect your main branch to fail tests!

If we speak in terms of diff, then Alice’s diff is the following:

- private void doSomething(String action) {
+ private void doSomething(Action action) {
    switch (action) {
-     case "OPEN_DOOR":
+     case Action.OPEN_DOOR:
         open_door();
...

While Charlie’s diff looks like this:

...
         break;
+     case "CLOSE_DOOR":
+        close_door();
+        break;
  }
...

Put together we get

- private void doSomething(String action) {
+ private void doSomething(Action action) {
    switch (action) {
-     case "OPEN_DOOR":
+     case Action.OPEN_DOOR:
         open_door();
         break;
+     case "CLOSE_DOOR":
+        close_door();
+        break;
...

This diff does not cause a merge conflict but it is actually in conflict!
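The same situation can be reproduced in a few lines of Python. This is a toy line-based three-way merge built on difflib, not git's actual algorithm; changed_hunks and naive_merge are names made up for this sketch, and a "conflict" here simply means that two hunks touch or overlap in the base file.

```python
import difflib


def changed_hunks(base, branch):
    """Return (start, end, new_lines) for each region of base rewritten by branch."""
    matcher = difflib.SequenceMatcher(a=base, b=branch, autojunk=False)
    return [(i1, i2, branch[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]


def naive_merge(base, ours, theirs):
    """Apply both sets of hunks to base; fail only when hunks touch or overlap."""
    hunks = sorted(changed_hunks(base, ours) + changed_hunks(base, theirs),
                   key=lambda h: (h[0], h[1]))
    for (_, end1, _), (start2, _, _) in zip(hunks, hunks[1:]):
        if start2 <= end1:
            raise ValueError("merge conflict")
    merged, pos = [], 0
    for start, end, new_lines in hunks:
        merged += base[pos:start] + new_lines
        pos = end
    return merged + base[pos:]


BASE = [
    "private void doSomething(String action) {",
    "  switch (action) {",
    '    case "OPEN_DOOR":',
    "       open_door();",
    "       break;",
    "  }",
    "}",
]
# Alice switches to an enum (lines 0 and 2 of BASE).
ALICE = [BASE[0].replace("String", "Action"), BASE[1],
         "    case Action.OPEN_DOOR:"] + BASE[3:]
# Charlie adds a string case after the first break.
CHARLIE = BASE[:5] + ['    case "CLOSE_DOOR":', "       close_door();",
                      "       break;"] + BASE[5:]

MERGED = naive_merge(BASE, ALICE, CHARLIE)  # merges cleanly, no exception
```

The merge succeeds because the two branches edit different lines, yet MERGED now switches on an Action while still containing a string case label: a textually clean result that no longer type-checks.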

--no-ff and the button of doom

The flaw resides in that big old “merge pull request” button. While the green checks above the button and its fat style make pressing it rewarding, the button hides a real danger: a blind merge!

The above example looks like this in terms of commit history:

        +-PR_Charlie-+
       /              \
      +--PR_Alice--+   \
     /              \   \
 ---+----------------x---x
                         |
                        master

All changes introduced by Alice and Charlie on their respective PRs have been tested, but the merge commits (marked ‘x’ above) have not! This is called a “non fast-forward” merge (--no-ff in git) and is basically like committing untested code directly on master.

The right way to do it

Assuming you are like Bob and want your main branch to always pass tests, here is the right way to do it: merge master into the PR first, then merge the PR into master only if it is a fast-forward. It would look like this:


Initial state: both PRs are unmerged:

        +-PR_Charlie-+
       /              
      +--PR_Alice--+  
     /                 
 ---+
    |
  master

on master: git merge --ff-only PR_Alice

      +-PR_Charlie-+
     /                              
 ---+--PR_Alice--+
                 |
               master

on PR_Charlie: git merge master

                  merge branch master into PR_Charlie   
                     |
      +-PR_Charlie-+-+
     /              /                  
 ---+--PR_Alice----+
                   |
                 master

Then on master: git merge --ff-only PR_Charlie

      +-PR_Charlie-+
     /              \                  
 ---+--PR_Alice----+-+
                     |
                   master

The end result is exactly the same as before EXCEPT that master was merged into the PR first, and this (if properly pushed to GitHub) allows the automated tools to test the last commit. This ensures that master builds, because master will always be on a commit that has been tested before.

Many beginner (and even advanced) programmers get fooled into thinking that merging a green pull request can do no harm. In an ideal world, clicking the green button would merge master into the branch, wait for the tests to pass and, on success, fast-forward master to the merge commit. Unfortunately, I don’t think this is coming anytime soon. In the meantime, think twice before you press the button of doom. And when in doubt, run the merge sequence manually in the right order. This may save you a few hours of debugging.

Hacking FunRun2 — how to reverse engineer a Corona app

Disclaimer: This article is a bit old and does not apply to recent versions of the Corona SDK. It is put here for educational purposes, showing how one may try to break into this kind of application. It is not a magic formula for hacking all Corona apps across all versions of the SDK.

[Figure: funrun2 hack]

If you haven’t played FunRun yet, you probably will soon. FunRun2 is one of the best multiplayer mobile games out there. It’s one of those rare games that are fun from the very first round and stay enjoyable after months of playing. It’s my favorite game on mobile and since I had some spare time, I decided to tear it open (yeah, that’s what I do to things I like). It was a bit harder than expected and involved some not so trivial hacking as well as a bit of luck. By the way, yes, all these buttons work.

The easy part

First I need to get my hands on the .apk file. There are plenty of tools out there to do that, some of them even open source.

An apk file is an archive, just like a jar or a zip, so I can unpack it with regular unzipping tools. This gives me access to a well documented directory structure. The two things I want are the classes.dex file, containing the app’s Java code compiled to Dalvik bytecode, and the assets directory. Reversing the first one gives me access to Java source code, so it’s pretty obvious why I want to look at it.
To understand why the second one is interesting though, you must consider that this game was built with a cross-platform framework, namely Corona. This kind of framework generally lets you code your app in some scripting language that gets executed the same way on every platform. Therefore, unless the script is cross-compiled to native code (which nobody ever does), it must be stored somewhere. That’s where the assets folder comes into play. It contains only a few images and this resource.car file, whose extension conveniently matches Corona ARchive. How cool is that?
I open the file in a text editor and mostly get binary-unreadable-shit but there are also strings like “lua.whatever.that.is.lu”. These are the scripts.
In case you don’t know: Lua is one of the main scripting languages used in games, with its open-source libraries and tools.

Now I still haven’t figured out where the images and sounds are (the Corona archive weighs a mere 2MB so it doesn’t contain any assets), but this will come later.

Things getting interesting

Trying to open the Corona archive, I quickly realize it’s not a common archive format (even 7-zip can’t open it!). So there are two ways to go: finding the loading procedure in the Java code or analyzing the file structure by hand. The first seems more reasonable.
But first, let’s decompile the Dalvik bytecode into human-readable Java. These tools are helpful: dex2jar to convert the .dex to .class files (Dalvik to JVM) and jd to go back to plain Java source code.

Now I have a load of Java source code to look into, great! Let’s look for the string “resource.car”: no results. Looking manually, I quickly find Corona’s source code in com/ansca/corona; that’s where the magic must happen. However, nothing comes close to dealing with a “resource” or a “.car” file. I do find references to images, though. They are actually located on the device’s external storage (sdcard/Android/obb) in a .obb file. This is something common on Android devices (that I wasn’t aware of before doing this hack); my guess is that moving resources out makes the apk lighter and the internal storage footprint smaller. A .obb file is just a zip with a fancy extension. I can grab this file on a computer, edit the images/sounds/whatever and put it back in place. The game still runs with my custom images.

[Figure: funrun_image]
Besides adding childish drawings to the game, there’s not much that can be done this way.

I’m at a dead end with the Java code. Let’s take a shot at manually reversing the resource.car file:

I open it with a hex editor and quickly realize the format used is trivial:

First, a header to rule them all, then an index made of each file name + the position (address) of its contents in the archive, then the data block with each file’s contents.

[header] 16 bytes

[index]
  [entry]
   [header] (4 bytes) 1
   [block addr] (4 bytes)
   [nameS] (4 bytes)
   [name] (nameS+1 bytes) 0-terminated
   [padding] (0s to complete 4x addr)

[files]
  [entry]
    [header] (4 bytes) 2
    [??] (4 bytes)
    [size] (4 bytes)
    [file] (no 0 termination)
    [padding] (to next 4x addr)

The resource.car file format

Now I didn’t have to code a packer for this format myself because luckily, someone already did that. So I grabbed this corona archive packer/unpacker which you won’t find a link to on my blog.

It turns out the archive contains uncompressed precompiled Lua scripts. Using luadec you can decompile these scripts, and with luac you can compile them again, which is very convenient. These Lua scripts define absolutely every aspect of the game, from where a button might be to how fast a player can move and what a specific powerup does.
Injecting a custom resource.car into the app is a big deal because it would let me modify anything in the game.

First I try modifying something inside the precompiled lua scripts, such as a string since we can easily recognize them. If this works, I can later try to recompile one file from the (modified) source.
I repack the resource and put it back in the apk file, sign it with dex2jar (as explained here) and install it on my device. I launch the game and, with no real surprise, it crashes.

I figured there’s probably some kind of check performed against the archive to guarantee integrity. However, the loader is nowhere to be found in the java source. This battle is lost but not the war.

Going down

[Figure: a88]

Yeah, maybe Di Caprio is right, we need to go low-level and find this resource loading procedure.
Inside the .apk, there’s another directory called “lib”. It contains architecture-specific libraries. You’d usually build a native library either when you have no choice but use C (for instance, lua is written in C) or when you want higher-than-java performances (for a game framework, it makes sense).

In this lib folder, there’s some lua libraries, but also random libs for sound, video, analytics and whatever. Most importantly I find here a lib called libcorona.so. Let’s quickly open it with a text editor to search for the string “resource.car”, bingo!

I’ll have to disassemble the lib in order to understand what’s happening in there. It turns out there’s an amazing tool that does this and it’s called IDA. It’s more than a cross-architecture disassembler, it does lots of stuff, but for my purpose I’ll just use it as a disassembler. It handles libcorona very well and kindly gives us the assembly code.

FYI: I picked the armeabi-v7a version, so what follows would be different with an apk targeting another platform.

I look again for the string “resource.car” and find it referenced at offset 0x1081D4.

[Figure: the only piece of code referring to “resource.car”]

It took me a while to figure out how things worked, but at some point I had a theory:
From the occurrence of “resource.car”, I clearly see two execution paths: one that leads to a portion of code referencing the string “could not verify” and exiting, and one that leads to a subroutine that I deduced was meant to initialize the Lua runtime or something like that. What stands between them is a fragile Branch if Not Equal (BNE) that we would like to turn into a rock-solid Branch.

[Figure: take the yellow jump to get to the blue subroutine and avoid the nasty red branches]

Looking at this document and some other branch instructions around in the code, I deduced that I needed to change the byte at 0x108247 from 1A to EA to turn my BNE into a B. Going back to my hex editor (I should really learn how to use IDA properly some day) I change the aforementioned value.
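The same one-byte change can be scripted instead of done by hand in a hex editor. Below is a minimal Python sketch; patch_byte and the constant names are made up for the example, and the offset is the one found above for this particular build of libcorona.so. The reason a single byte is enough: a 32-bit ARM instruction keeps its condition code in the top four bits, and since the instruction is stored little-endian, those bits live in the last of its four bytes. 0x1A is condition NE followed by the branch opcode, 0xEA is condition AL (always) followed by the same opcode.

```python
PATCH_OFFSET = 0x108247  # offset found above, specific to this build
BNE_BYTE = 0x1A          # cond = 0001 (NE), then 1010 (branch, no link)
B_BYTE = 0xEA            # cond = 1110 (always), then 1010 (branch, no link)


def patch_byte(image: bytearray, offset: int, expected: int, replacement: int) -> None:
    """Replace one byte in place, refusing to touch anything unexpected."""
    if image[offset] != expected:
        raise ValueError(
            f"byte at {offset:#x} is {image[offset]:#04x}, expected {expected:#04x}")
    image[offset] = replacement


def patch_library(path_in: str, path_out: str) -> None:
    """Read the library, turn the BNE into a B and write the result to a new file."""
    with open(path_in, "rb") as f:
        image = bytearray(f.read())
    patch_byte(image, PATCH_OFFSET, BNE_BYTE, B_BYTE)
    with open(path_out, "wb") as f:
        f.write(image)
```

Checking the expected value first means running the script against a different version of the library fails loudly instead of silently corrupting it.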

[Figure: it’s a branch!]


Now let’s pack our patched lib back into the apk, sign it, install it, run it… It runs! Let’s see if our custom value is used.

[Figure: first time ever a guessed binary patch works on the first try.]

Let’s try the full pipeline: modify the lua source code, compile it, repack the .car and place it back in the .apk. It works provided we use the proper luac version (a quick search for the version string in liblua.so saves you the guessing). That’s it, the path is paved for hours of fun modding the game!
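The version lookup can be done from Python as well. This is a small sketch that assumes the library embeds a release string of the form “Lua 5.1.4”, as stock Lua builds do; find_lua_versions is a made-up helper name.

```python
import re

# Matches strings such as "Lua 5.1" or "Lua 5.1.4" inside binary data.
LUA_VERSION_RE = re.compile(rb"Lua \d+\.\d+(?:\.\d+)?")


def find_lua_versions(blob: bytes) -> list:
    """Return the distinct Lua version strings found in blob, sorted."""
    return sorted({match.decode("ascii") for match in LUA_VERSION_RE.findall(blob)})
```

Reading liblua.so in binary mode and passing its contents to find_lua_versions should point at the luac release to use for recompiling the scripts.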

Conclusion

I had seen many game cheats and always wondered how they actually work; now I know, at least for this one. To illustrate this article, I managed to strip all ads from the game and create a speed hack, a fly mode and unlimited powerups. Hacking the in-game money is harder because only the server issues coins, so even if you trick your phone into thinking you are rich, the server knows how much you really have.
I’d like to emphasize that I personally think buying or downloading a cheat for a game is lame and those who do that just deserve a trojan. However, hacking a game is really something fun that one should try. It requires patience and the ability to keep a clear mind even after hours of work leading nowhere.