Search This Blog

10 September 2005

RSA Security - The RSA Challenge Numbers

While trying to decide what practicum project to have a capstone student do for me, I realized that writing an AI to solve the RSA factoring challenge might be too extensive for a capstone. So, I am debating working on it in my spare time. Here, I am going to put a few notes that I have figured out this morning to make sure I don't forget.

The RSA FAQ says that the best known method for solving the factorization is to use a General Number Sieve and then Matrix all the resulting primes to find the correct one. While the first is an algorithm, the second is obvious brute force. It is because of the length of time these two take that the RSA feels that these algorithms are secure for "decades". I believe that an AI could solve the problem quicker (even if it still took months).

We can speed up the first part (the sieve) by matching patterns. For example, some basic things we know:

  • base2

    • has 2048-bits, based on the specific challenge to solve

    • has to start with a 1, or it would be a 2047-bit number

    • actually, it appears all primes start with 1

    • has to end in 1, or it would be a multiple of 2

  • base6

    • has to end in 1,5, or 0, or it would be a multiple of 3

  • base10

    • can not end in 2,4,6,8, or 0, or it would be a multiple of 2

    • can not end in 5 or 0, or it would be a multiple of 5




One AI solution to speed things up would to be locate patterns in different bases (as seen with base2, base10 and base6 above). This would speed the sieve engine. Speeding up the matrix/brute force might best be done with a median-approach, but that is hardly AI-based.

We can assume that teams are currently working to break the RSA-640 algorithm. As such, an alternative to the RSA-2048 algorithm up front would be to run the program against the RSA-704, then move on to the next one once it has been solved. If we wanted to take this approach, then the program should be written so that it makes as much of the reusable data collected/created available for the next run.

Also, 2048-bit keys are going to take a lot of memory to keep track of every single one in memory. This is probably part of the reason it takes so much resources. Perhaps a better approach would be to utilize a random-access memory mapped file, so that we can keep the minimal amount of data in memory at any given point in time.

24 comments:

  1. um, well, duh -- all primes start in base2 as '1' because leading 0s are always ignored in base2 ;)

    oops

    ReplyDelete
  2. in base4, all numbers (above 2) end in 1 or 3

    ReplyDelete
  3. in base8, all primes (above 2) end in either 1, 3, 5 or 7

    in base12, all primes (above 3) end in either 1, 5, 7, b

    in base14, all primes (above 7) end in 1,3,5,9,b,d

    ReplyDelete
  4. The base2 filter makes the base4 filter obsolete. The base2 and base4 filters make the base8 filter obsolete. The base2,4,6,8 filters makes the base12 filter obsolete.

    ReplyDelete
  5. base16, they all end in an odd-number (in base 16)... so, 1,3,5,7,9,b,d,f

    ReplyDelete
  6. Hmmm, I was looking at what it would take to do balanced ternary as well... took some notes on an alternate page:

    http://eoti.org/~malachi/blog/2005/09/balanced-ternary.html

    ReplyDelete
  7. Name: RSA-2048
    Prize: $200000
    Digits: 617
    Digit Sum: 2738
    25195908475657893494027183240048398571429282126204032027777137836043662020707595556264018525880784406918290641249515082189298

    55914917618450280848912007284499268739280728777673597141834727026189637501497182469116507761337985909570009733045974880842840

    17974291006424586918171951187461215151726546322822168699875491824224336372590851418654620435767984233871847744479207399342365

    84823824281198163815010674810451660377306056201619676256133844143603833904414952634432190114657544454178424020924616515723350

    778707749817125772467962926386356373289912154831438167899885040445364023527381951378636564391212010397122822120720357


    KEYS -- NUM BITS: 2048

    START:

    10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    000000000000000000000000000000000000000000000000

    END:

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

    111111111111111111111111111111111111111111111111

    START:

    16158503035655503650357438344334975980222051334857742016065172713762327569433945446598600705761456731844358980460949009747059

    77957524546054754407619322414156031543868365049804587509887519482605339802881919203378413839610932130987808091904716923808523

    52908229260181525214437879457705329043037761995619651927609571666948341712103424873932822847474280880176631610290389028296655

    13096354230157075129296432088558362971801859230928678799175576150822952201848806616643615613562842355410104862578550863465661

    734839271290328348967522998634176499319107762583194718667771801067716614802322659239302476074096777926805529798115328

    END:

    32317006071311007300714876688669951960444102669715484032130345427524655138867890893197201411522913463688717960921898019494119

    55915049092109508815238644828312063087736730099609175019775038965210679605763838406756827679221864261975616183809433847617047

    05816458520363050428875758915410658086075523991239303855219143333896683424206849747865645694948561760353263220580778056593310

    26192708460314150258592864177116725943603718461857357598351152301645904403697613233287231227125684710820209725157101726931323

    469678542580656697935045997268352998638215525166389437335543602135433229604645318478604952148193555853611059596230655

    ReplyDelete
  8. Well, those numbers in the last comment are cut off.... need to fix the blog to display them scrollable...

    so, one of the issues is how to reduce the mathematical computations done once we find all those primes...

    there are some simple ways to narrow it down... for example... the smallest 2048-bit number ends in '328' and the largest ends in '655'. We know that the one we are looking for ends in '357'. Thus, we don't need to do the math. Why? the last digit will be 0 if we multiply those together.

    ReplyDelete
  9. CORRECTION:

    answer is 2048 bits NOT the keys

    patterns allow us to narrow down which primes to check

    3bits * 3bits
    100 * 100 through 111 * 111
    10000 through 110001
    5bits - 6bits

    4bits * 4bits
    1000 * 1000 through 1111 * 1111
    1000000 through 11100001
    7bits - 8bits

    3bits * 3bits = 5-6bits
    4bits * 4bits = 7-8bits
    5bits * 5bits = 9-10bits

    3bits * 4bits = 6-7bits
    4bits * 5bits = 8-9bits

    3bits * 5bits = 7-8bits

    THUS
    x bits * y bits = (x+y-1)bits to (x+y)bits

    THEREFORE
    for 2048 bit answer
    x+y = 2048 | 2049

    THEREFORE
    1024b*1024b possible
    1024b*1025b possible
    512b*1536b possbile
    513b*1536b possible
    512b*1537b possible
    513b*1537b NOT possible
    511b*1535b NOT possible

    SO....
    only have to do the math, IF:
    1) x bits + y bits = 2048 or 2049
    2) last x digit * last y digit ends in last result digit (ie: ....2 * ....5 is NOT ...3)
    3) x is prime
    4) y is prime

    -----------------------------------------
    http://members.fortunecity.com/jonhays/activithm.htm

    multiples of 9 add up to 9
    18=1+8=9
    99=9+9=18=1+8=9
    etc
    but, our base6 removes all multiples of 3, and thus multiples of 9

    multiples of 4 do the same thing
    12478=1+2+4+7+8=22=2+2=4
    also, we can remove the 9s from it
    12478=1&8, 2&7, leaving 4
    also, rearrange
    12478,87421,48217 all %9=4
    but... our base2 filter should already have gotten rid of those

    11s....
    alternate adding and subtracting
    thus...
    176: 1 - 7 + 6 = 0, thus multiple of 11
    1353? 1 - 3 + 5 - 3 = 0... hmmm...

    to do... read the section on 16s, 144s, octal and the Bookkeepers Check
    --------------------------------------------
    The author presents a sieve algorithm that determines all primes between 2 and N in time O(N log log N) with storage requirement O(N). Consequently, neglecting constant factors, this algorithm has the same complexity as Eratosthenes' sieve. The new algorithm, however, decreases the constants. This linear speedup is achieved by dealing with the elements 5, 7, 11, 3i + 2, 3(i + 1) + 1, . . . , - N only, where i is odd. Moreover, this new algorithm is even more compact than Eratosthenes' sieve. Nevertheless, the paper would have been more readable if the author had used only one “programming language” to formulate the algorithms, or if he had been denotationally more precise. Finally, since he states that Pritchard's O(N/log log N) additive sieve algorithm has more theoretical than practical significance, it would have been better to compare the new algorithm with Pritchard's sieve instead of with Eratosthenes's.

    hmm, I will have to check those out... sounds like the same basic idea I am doing -- and maybe reinventing the wheel

    this guy mentioned a few of them...
    http://cpe000103c34069-cm014300001653.cpe.net.cable.rogers.com/weblogs/ben/programming/A-Segmented-Sieve-A-Reflection.writeback

    Pritchard's stuff
    http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/p/Pritchard%3APaul.html
    but I only recommend that if you really like mathematical set notation ;)
    and even more so with this: http://www.cs.wisc.edu/~cdx/Sieve.pdf

    this has a very easy to understand explanation of the wheel sieve
    http://www.physicsforums.com/archive/t-86174_Seiving_for_Primes.html

    that page also points to
    http://mathworld.wolfram.com/PrimalityTest.html
    and
    http://primes.utm.edu/notes/faq/LongestList.html
    "If you want an even longer list, run a sieve program on your machine. Folks quite regularly resieve to find all the primes up to 1,000,000,000,000, this should take well less than a minute."
    WHAT???? then mine is going REAL slow....


    --------------------------------------------
    some more multiple tricks
    http://www.mathsyear2000.org/xplusyfiles/home/issue_11/multiple_tricks/
    it has a multiple of 13 trick, but doesn't seem efficient for really large numbers...
    --------------------------------------------
    http://www.geocities.com/CapeCanaveral/Launchpad/5577/flesson/base12.html
    shows an interesting diagonal pattern for primes...

    this shows some interesting math behind the number 13
    http://www.shyamsundergupta.com/unlucky13.htm

    this has some REALLY good ones
    http://home.egge.net/~savory/maths1.htm

    in fact... that last one might be a good start for an algorithm or AI?

    quote:
    Other blogreaders (sadly even people from .edu domains, who should be able to do the elementary algebra themselves) have asked why I sometimes say ADD and for other primes say SUBTRACT, and ask where the apparently arbitrary factors come from. So let us do some algebra to show the method in my madness.

    We have displayed the recursive divisibility test of number N as f-M*r where f are the front digits of N, r is the rear digit of N and M is some multiplier. And we want to see if N is divisible by some prime P. We need a method to work out the values of M. What you do is to calculate (mentally) the smallest multiple of P which ends in a 9 or a 1. If it's a 9 we are going to ADD, if it's a 1 we are going to SUBTRACT later. Then we will use the leading digit(s) of the multiple as our multiplier M.

    Example for P=17 : three times 17 is 51 which is the smallest multiple of 17 that ends in a 1 or 9. Since it's a 1 we are going to SUBTRACT later. The leading digit is a 5, so we are going to SUBTRACT five times the remainder r. The algorithm was stated above. Now let's do the algebraic proof. Writing N=10f+r, we can multiply by -5 (as shown in the example for 17), getting -5N=-50f-5r. Now we add 51f to both sides (because 51 was the smallest multiple of P=17 to end in a 1 or a 9), giving one f (which we want), so 51f-5N=f-5r. Now if N is divisible by P (here P=17), we can substitute to get 51f-5*17*x=f-5r and rearrange the left side as 17*(3f-5x)=f-5r and therefore f-5r is a multiple of P=17 also. Q.E.D.

    ReplyDelete
  10. Ok, here's a thought

    smaller numbers are faster to sieve without filtering

    larger numbers are faster to filter before sieving

    what if we right an AI/Algorithm that starts off with say, sieving the first 100 primes.. then, uses those to create filters for the next 1000 primes... then uses those for more filters for the next 10000 primes, etc.... maybe then the program would become more efficient as the numbers get larger without sacrificing efficiency early on.

    ReplyDelete
  11. Well, I have rewritten my program to:
    1) use BigInteger instead of Integer
    2) create filters automatically for every prime >= 7
    3) filter population instead of populating all possible and then removing them...

    it seems a LOT slower at the moment, though it gets the correct results and, in theory, is not limited to size....

    one issue is that for the first 123456 numbers, it finds 11602 primes... that means that we have an ArrayList<BigInteger> of 11602 elements and also an ArrayList<CompositeFilter> of 11602 elements... that is obviously not very memory efficient...

    two possible alternatives....
    1) instead of creating a new filter for every single possible prime, maybe we can make it a self-modifying program that just alters a single filter?
    2) instead of creating a list of all the primes in memory, maybe do a random-access memory mapped file... each bit represents that that position is/not a prime... ie: the 3123 bit would tell you whether 3123 is a prime?

    ReplyDelete
  12. Ok, I think what I need to do is use some kind of BitSet... but... BigInteger allows numbers larger than int/Integer -- yet the bit operations on it and on BitSet are limited to int/Integer... might have to do my own implementation that takes BigInteger for position instead of int.

    ReplyDelete
  13. wait a sec...
    an int can hold (2^31)-1 bits....
    so, the same is true with BigInteger and BitSet.

    We are only looking at going up to 2048 bits for the RSA stuff... which is only 2^11, right?

    that shouldn't be a problem. in fact, maybe I didn't need to convert my program to use BigInteger... oh well...

    ReplyDelete
  14. so max the BitSet can handle is 0x7fffffff / 2147483647 / 1111111111111111111111111111111 which is nowhere near big enough to handle the RSA-2048 challenge... it is 617-digits and this one is just 10... hmmm. need to figure out better way to do this...

    how do the Cryptography packages do it?

    ReplyDelete
  15. The largest byte value is 127
    The largest short value is 32767
    The largest integer value is 2147483647
    The largest long value is 9223372036854775807
    The largest float value is 3.40282e+38
    The largest double value is 1.79769e+308

    ReplyDelete
  16. byte
    0x7F
    127

    short
    0x7FFF
    32767

    integer
    0x7fffffff
    2147483647

    long
    0x7fffffffffffffffL
    9223372036854775807

    float
    0x1.fffffeP+127f
    3.4028235e+38f

    double
    0x1.fffffffffffffP+1023
    1.7976931348623157e+308

    ReplyDelete
  17. byte
    8 bits
    -2^7 to 2^7-1

    short
    16 bits
    2 bytes
    -2^15 to 2^15-1

    int
    32 bits
    4 bytes
    -2^31 to 2^31-1

    long
    64 bits
    8 bytes
    -2^63 to 2^63-1

    float
    32 bits
    4 bytes
    6-7 significant digits

    double
    64 bits
    8 bytes
    14-15 significant digits

    ReplyDelete
  18. hmmm. so I am going to have to figure out a way to handle numbers larger than 64-bits... preferably without requiring that ALL numbers (even 2) be handled as 1024-bits...

    hmmm. UTF-8 style perhaps?

    ReplyDelete
  19. It appears that the Keys in the JDK use multiple BigIntegers (exponent & value, modulus & value)...

    Thus, to represent the RSAPublicKey, for example, they use 4 BigIntegers...

    hmm, seems a bit complicated for what I want to do... not really sure what the best approach would be... but obviously using one BigInteger isn't going to cut it for coming up with the primes necessary for testing...

    ReplyDelete
  20. this page shows how to use BigInteger to get bigger values

    http://www.leepoint.net/notes-java/data/numbers/10biginteger.html

    ReplyDelete
  21. tried this program up to 200, and it seemed to work just fine:

    http://www.leepoint.net/notes-java/data/numbers/60factorial.html

    ReplyDelete
  22. Well, my idea was working pretty well:

    Displayed 183073 primes <= 2500000 in 197.175 seconds


    until I realized how many bits I would need to keep track of all 2048-bit possibilities... specifically, 3674018224877614477545066243768582348871416646683824641528456757842452036455585004341310356212820063739027086844759880875030780674761777060562980339733290193252159652649992960022040932743681435885725663682031559833011694836684427646314144833294923327068959889861196761886228147379945972785157255103050645803013405087502672179276143513853249884829912848013728086934612026576704260231561001336360572266150611167852685380064696955468929349115301864049308461884866941457758649074869505253122622630932178648172173618679232496907161165421825744285288825084244005613936921626330968666174894520389915236768940031 TERRABYTES!

    I think I am going to have to rethink this whole approach, and design the AI to either somehow recognize primes without figuring out their predescesors, or a way to ignore whether a number is prime to find the solution.

    ReplyDelete
  23. It's really a cool and helpful piece of information. I'm satisfied that you just shared this helpful info with us.
    Please keep us informed like this. Thanks for sharing.



    Here is my web site :: continue

    ReplyDelete
  24. Hi, I believe your site may be having internet browser compatibility issues.
    When I take a look at your web site in Safari, it looks fine however, when
    opening in Internet Explorer, it's got some overlapping issues. I just wanted to give you a quick heads up! Aside from that, fantastic site!

    Also visit my web site; Nike Free

    ReplyDelete