10 September 2005

RSA Security - The RSA Challenge Numbers

While trying to decide what practicum project to have a capstone student do for me, I realized that writing an AI to solve the RSA factoring challenge might be too extensive for a capstone. So, I am debating working on it in my spare time. Here, I am going to put a few notes that I have figured out this morning to make sure I don't forget.

The RSA FAQ says that the best known method for the factorization is the General Number Field Sieve: a sieving stage collects a huge number of "relations", and then a large matrix step combines them to produce the factors. While the first is a clever algorithm, the second is closer to brute-force linear algebra. It is because of the length of time these two steps take that RSA feels these numbers are secure for "decades". I believe that an AI could solve the problem quicker (even if it still took months).
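For context, both stages are aiming at a congruence of squares: if x² ≡ y² (mod n) with x ≢ ±y (mod n), then gcd(x − y, n) is a nontrivial factor of n. Here is a toy sketch of that idea on a tiny semiprime (the function name is mine; real sieves assemble the congruence from many small relations instead of searching directly):

```python
from math import gcd, isqrt

def congruent_squares_factor(n):
    """Fermat-style toy search for x with x^2 - n a perfect square.

    Once x^2 ≡ y^2 (mod n) with x != ±y (mod n), gcd(x - y, n) splits n.
    Real sieves build such a congruence out of many small relations; this
    direct search only works for tiny numbers.
    """
    x = isqrt(n) + 1
    while True:
        y2 = x * x - n          # x^2 ≡ y2 (mod n) by construction
        y = isqrt(y2)
        if y * y == y2:         # y2 is a perfect square, so we are done
            return gcd(x - y, n)
        x += 1

# 8051 = 83 * 97; found at x = 90, y = 7 since 90^2 - 8051 = 49
print(congruent_squares_factor(8051))  # -> 83
```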

We can speed up the first part (the sieve) by matching patterns. For example, some basic things we know:

  • base2

    • is 2048 bits long, based on the specific challenge number to solve

    • has to start with a 1, or it would be a 2047-bit number

    • actually, every number written in binary starts with a 1, primes included

    • has to end in 1, or it would be a multiple of 2

  • base6

    • has to end in 1 or 5: ending in 0, 2, or 4 would make it even, and ending in 0 or 3 would make it a multiple of 3

  • base10

    • can not end in 2,4,6,8, or 0, or it would be a multiple of 2

    • can not end in 5 or 0, or it would be a multiple of 5
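Taken together, the base observations above boil down to residue tests: a candidate prime factor cannot be divisible by 2, 3, or 5. That alone rejects 22 out of every 30 integers (only the 8 residues coprime to 30 survive). A minimal quick-reject filter, just to make the idea concrete:

```python
def passes_residue_filters(n):
    """Quick-reject test matching the base-2/6/10 observations:
    a candidate prime factor cannot be divisible by 2, 3, or 5.
    Survivors still need a real primality check (e.g. 49 passes)."""
    return n % 2 != 0 and n % 3 != 0 and n % 5 != 0

candidates = [n for n in range(7, 50) if passes_residue_filters(n)]
print(candidates)  # [7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 49]
```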

One AI solution to speed things up would be to locate patterns in different bases (as seen with base2, base6, and base10 above). This would speed up the sieve engine. Speeding up the matrix/brute-force step might best be done with a median approach, but that is hardly AI-based.
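One way to hunt for such patterns automatically is to tabulate the last digit of many small primes in each base; any residue that never occurs can be excluded for large candidates. A rough sketch using only the standard library (in fact, the residues that survive are exactly the ones coprime to the base, so the "discovered" pattern has a closed form):

```python
def primes_up_to(limit):
    """Simple Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def last_digit_residues(base, limit=10_000):
    """Sorted set of last digits (p mod base) among primes larger than base."""
    return sorted({p % base for p in primes_up_to(limit) if p > base})

print(last_digit_residues(6))   # [1, 5]  -- matches the base6 note above
print(last_digit_residues(10))  # [1, 3, 7, 9]
```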

We can assume that teams are currently working to factor the RSA-640 challenge number. As such, an alternative to attacking RSA-2048 up front would be to run the program against RSA-704, then move on to the next number once that one has been solved. If we wanted to take this approach, the program should be written so that as much as possible of the reusable data it collects/creates is available for the next run.
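Carrying data forward between runs could be as simple as checkpointing state to disk after each challenge. A sketch of that idea, assuming hypothetical file and field names of my own choosing (what actually goes in the state would depend on the sieve):

```python
import json
import os

STATE_FILE = "sieve_state.json"  # hypothetical checkpoint file name

def load_state():
    """Reload whatever earlier runs collected (relations, progress, ...)."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {"challenge": None, "relations": []}

def save_state(state):
    """Write to a temp file then rename, so a crash mid-write
    cannot corrupt work saved by previous runs."""
    tmp = STATE_FILE + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, STATE_FILE)

state = load_state()
state["challenge"] = "RSA-704"
state["relations"].append({"x": 12345, "factors": [2, 3, 7]})  # made-up record
save_state(state)
print(load_state()["challenge"])  # RSA-704
```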

Also, keeping track of every 2048-bit candidate in memory is going to take a lot of space, which is probably part of the reason the attack needs so many resources. Perhaps a better approach would be to use a random-access memory-mapped file, so that only a minimal amount of data is held in memory at any given point in time.