Exetools

Exetools (https://forum.exetools.com/index.php)
-   General Discussion (https://forum.exetools.com/forumdisplay.php?f=2)
-   -   Any pointers on this troublesome algorithm? (https://forum.exetools.com/showthread.php?t=17967)

Cryo 10-14-2016 08:10

Any pointers on this troublesome algorithm?
 
For a few weeks now, I've been trying to figure out a cryptography problem that I just couldn't wrap my mind around, and I think I might need guidance and/or recommendations on where to learn more about the math/logic behind it.

I've got an program whose decryption method I want to reverse engineer; I know the exact decryption routine and subroutines, I know the encryption key(s) used in the process, and I've already analyzed it and created a fully working PoC decrypter whose input is legitimate license file data and whose output is the decrypted license data in a hex dump-like format (only 64 bytes).

I have multiple license files (which they allow)—all of which are radically different in terms of original encrypted data, but whose decrypted data is all the exact same except for one single 4-byte value, which is seemingly random and pre-determined upon license file generation. In other words, every license file decrypts to the same exact values each time, but that one DWORD in the decrypted data is different across the different license files; the other remaining 60 bytes are exactly the same across all of the files.

To exemplify this, I've noted the data of 3 legitimate license files below:
Code:

f595c42c 856dba93 c4c0d727 ffd57d2b
856ed27b b9cc5541 8f0319ce 5ba13412
31fbd174 63e16ce8 f09f8ce7 198db818
c74906ce 95bbaf74 dd94c717 60b43434
38c9b73b 1f480921 a2b4c9e8 2b29aed2
123914a2 abde2de8 4755fe59 19054298
4be819bc 6bdd26a9 33f993da 28c6d1f6
03484b72 56366f7d 37d36a4c 8c9d7c72

bca7ea45 9caa59b6 b0ed8a69 c46dc7a7
daddc9be 7eb9ac5a bc52d5ac ae60e5b1
c3d8e996 64596547 6f72317f 927b9b4c
ecc35ec0 c38f569f 2a134665 13b80770
6be8d4c1 ee0f273d 30953dcb f87254dd
78d9dbbb 25c279a4 43b59bf9 44b51c8b
aab2e8f0 bdb13c03 afa0b98a 2c5c2c67
930d251e 09b5efed 0a9417bf 4d650961

2e505aed 22ffb2f4 a9c6767a 69a5e2b9
566e4415 f70a0d01 f38a4b66 cae31d07
22971759 8a8ea209 2b5f5a0f 8d1b7247
01d11a7d c7fb87a3 712f4cc1 0d0b99ba
95079986 387d677d 1f8d4334 f3d7f586
50bc363d b5cc9fa7 6dbb58e5 b5637206
5112fd20 1d784619 4bbedde5 a7efbb67
7a2d1078 2b321fee 7b077516 953030e3

However, when each of these files is decrypted, they produce the following output, respectively:
Code:

00000001 13a303bf 04010503 02070609
140f760b 00000001 02a00bd1 00000001
ffffff00 ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff 0001ffff

00000001 13a303bf 04010503 02070609
1c42bafb 00000001 02a00bd1 00000001
ffffff00 ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff 0001ffff

00000001 13a303bf 04010503 02070609
4207e5e1 00000001 02a00bd1 00000001
ffffff00 ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff 0001ffff

In terms of algorithm specifics, it's a 17-pass loop whose output is directly fed as its input. On the first step of decryption, the input data is recursively multiplied against itself to produce a temporary key. However, the only exception is the last pass (pass 17), during which the output of step 16 is multiplied against the original encrypted file data to produce the temporary key.

With my current level of knowledge, I'm not sure how this would be able to be reversed, since one of the first steps would involve knowledge of the final output, which would be seemingly impossible. =/

At this stage, I need help reversing the process. I only want to create a PoC encrypter; I don't want to patch the program or otherwise bypass the mechanism. Is there anywhere I can go from here?

It's verbosely commented and kind of messy-ish, but you can find the source code for my PoC decrypter here:
Code:

https://mega.nz/#!Z4F2nIbC!jMVjh0jxz5jBU6SpvAicSaydPkx9S-jyEbGoMfJVL88


NOTE: I decided to post in General Discussion because of the educational opportunities I'm hoping to gain from this (so I can start contributing more), but feel free to delete this thread if it's too request-y for this section. ^^;

dila 10-20-2016 08:31

It's an interesting algorithm. I looked, but I am not good enough to solve it :(

Just wanted to let you know that some people saw this and tried. Maybe others have thoughts?

Syoma 10-20-2016 15:29

From the description you provide I can guess it is something asymmetric.
16 steps may be x^2 16 times, then step 17 +x. Totally 65537 prime.
Then full algo may be powmod(data, 0x10001, N), where N == magic?

void decode_1(unsigned int *src, unsigned int iter, unsigned int *dest) = bigint.mul
void decode_2(unsigned int *src, unsigned int *key) = bigint.sub

Cryo 10-21-2016 10:18

Quote:

Originally Posted by Syoma (Post 107464)
From the description you provide I can guess it is something asymmetric.
16 steps may be x^2 16 times, then step 17 +x. Totally 65537 prime.
Then full algo may be powmod(data, 0x10001, N), where N == magic?

void decode_1(unsigned int *src, unsigned int iter, unsigned int *dest) = bigint.mul
void decode_2(unsigned int *src, unsigned int *key) = bigint.sub

...Woah. I think that just kind of blew my mind. It makes a lot more sense, now that I look at it and step through it as a whole instead of as individual functions.

I manually calculated the return values in each step of the process using your advice, with the first 16 iterations being (x^2)%N where x is the data, and the last being (x*y)%N where x is the data and y is the original data, and sure enough, the result was exactly as expected!

Checking your proposed algorithm of (data ^ 0x10001) % N, it also yields the exact same results, which is kind of unreal to me at this point that it can be simplified so easily. Thank you so much! :D

You described it at such a base level and with such ease that I can't help but be impressed; for the sake of additional learning opportunities, how would you suggest that I build upon these kinds of skills? Would reversing crackme challenges be helpful, or would you recommend something else?

Syoma 10-21-2016 14:31

You can be even better. Recipe: read books, get online courses, take part in the forums and conferences and, for sure, work a lot in the area you like. If you will do at least half of noted above you become a real pro very soon. ;)

t3xc0d3 10-21-2016 18:22

Your algorithm is one form of fast binary exponentiation, known as exponentiation by squaring. It is a common algorithm for asymmetric cryptography since it is much more efficient than naive multiplication. It is, for instance, an important core of RSA and algorithms based on discrete logarithms. Variations do also exist for other domains, such as the double-and-add algorithm for elliptic curves.

For x ** e mod n, you calculate x * (x ** 2) ** ((n-1)/2) if n is odd (square and multiply), otherwise (x**2)**(n/2) (square).

To exemplify:

x=17
e=33
n=73

33 is the same as 0b100001.

1. square -> 17 * 17 = 70 mod 73 = x ** {0b10}
2. square -> 70 * 70 = 9 mod 73 = x ** {0b100}
3. square -> 9 * 9 = 8 mod 73 = x ** {0b1000}
4. square -> 8 * 8 = 64 mod 73 = x ** {0b10000}
5. square -> 64 * 64 = 8 mod 73 = x ** {0b100000}
multiply -> 8 * 17 = 63 mod 73 = x ** {0b100001}

In (not so great) python code it looks as

Code:

def exp(a,e,p):
        binary = bin(e).lstrip("0b")

        res=a
        for i in xrange(1, len(binary)):
                if binary[i] == "1":
                        res = (res ** 2) % p
                        res= (res * a) % p
                else:
                        res = (res ** 2) % p
        return res


Cryo 10-22-2016 08:20

Ahh, just did a bit of research into some of the specifics of prime calculations and discrete logarithms in cryptography, and this does certainly seem to match up with that. Thanks for the detailed explanation, t3xc0d3!

I can't find anything on easily reversing modular exponentiation other than brute force, but would brute force really be the only option to approaching this with such large inputs?

t3xc0d3 10-22-2016 11:19

Quote:

Originally Posted by Cryo (Post 107482)
I can't find anything on easily reversing modular exponentiation other than brute force, but would brute force really be the only option to approaching this with such large inputs?

Generally speaking, yes. If a general algorithm that is simpler than brute force would exist, then the cryptosystems would not longer be secure. To be exact, it is it known if simpler methods exist. It *might* be the case, but they have not been found.

One thing you might try is to analyse if they have been included their secret key somewhere. You will not find anything, if they have done their homework. However, I have found private RSA keys in binaries several times...

In addition, if you are able to determine which algorithm they use and which configuration (e.g., which prime values, ...) they use, you might be able to find an attack vector. If they used a insecure configuration (e.g., too short primes), you win. The analysis can (will) be really cumbersome.

sendersu 10-22-2016 17:41

>> I have found private RSA keys in binaries several times...
thats just a biggest mistake whatsoever :)

chants 10-25-2016 00:19

The general number field sieve is faster than a naive brute force, and there are a variety of other techniques to reduce prime checking in DLs. It is sub-exponential but seems to be above polynomial in time to find a solution.

I would suggest studying discrete mathematics, computer algebra, cryptography and algorithms if you want to be great at reversing these types of scenarios. Sitting and looking at assembly code is not so helpful if you are not versed in the algorithms and mathematics behind what is going on. However it is a theoretical and long journey to master. You can known Fermat's little theorem and Carmichael numbers and other useful tools in primality testing and modular exponentiation and so forth.

An example solution which is more efficient than exhaustive brute force is as follows: given b^l=n (mod p) and you want to solve for the smallest l with just normal integer sizes:

Code:

                unsigned int c;
                long long an = 1; //x0 = b ^ sqrt(p) % p
                for (c = 0; c * c < p; c++) {
                        an = an * b % p;
                }
                unsigned int d = c; //sqrt(p)
                std::map<long long, long long> m;
                long long cur = an; //x(i) = b ^ sqrt(p) % p to b ^ (sqrt(p) ^ sqrt(p)) % p
                for (c = 1; c <= d; c++) {
                        if (!m.count(cur))
                                m[cur] = c;
                        cur = cur * an % p;
                }
                cur = n; //y(i) = nb % p to n(b ^ sqrt(p)) % p
                long long minans = -1;
                for (c = 0; c <= d; c++) {
                        if (m.count(cur)) {
                                long long ans = m[cur] * d - c;
                                if (ans < p) {
                                        minans = minans == -1 ? ans : (ans < minans ? ans : minans);
                                }
                        }
                        cur = cur * b % p;
                }
                if (minans != -1) std::cout << minans << std::endl;
                else std::cout << "no solution" << std::endl;

If you can explain the mathematics behind it, then you will be at a very good problem solving level in this area.

Cryo 11-08-2016 06:07

After working on this off and on for quite a while, I can't seem to really find any way to crack this. However, one thing I have noticed is that each of the generated ciphertext all have a GCD of 9. Is it a common expectation of RSA-generated ciphertexts to have a similar GCD, or does the fact that they all have a GCD > 1 mean that the possibilities can be reduced by any amount?

Thanks again for the helpful advice! I've been studying crypto more, but couldn't quite find whether this situation was expected or whether it indicated a vulnerability in this particular implementation.

chants 12-05-2016 07:35

It looks like it is not encryption/decryption but RSA signature verification/signing:

Code:

00000001 13a303bf 04010503 02070609
4207e5e1 00000001 02a00bd1 00000001
ffffff00 ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff 0001ffff

Notice the 0001 which generally indicates PKCS#1 v1.5 RSA signing/private key.

See the RFC for yourself
Quote:

https://tools.ietf.org/html/rfc2313
on page 9.

What you show is not "decryption" but "verification" using the public exponent of a signature. After all the 0xFF there is a 0x00, then an ASN.1 code and finally a hash of a message being signed.

So you are looking for a private key used to sign messages.

Using chosen plaintext attacks, you might find a way to break this so that you can sign messages without even knowing the private key as long as you can verify signed messages and if the padding is bad. I suggest looking up chosen ciphertext attacks against RSA and look up Bleichenbacher in particular.

If the public exponent is low enough (like e=3), then you can forge messages almost certainly. Using mere small numbers and cube/cube root properties if the padding is not checked properly.


All times are GMT +8. The time now is 14:54.

Powered by vBulletin® Version 3.8.8
Copyright ©2000 - 2026, vBulletin Solutions, Inc.
Always Your Best Friend: Aaron, JMI, ahmadmansoor, ZeNiX