Interesting People mailing list archives

IP: CRYPTO An Analysis of Shamir's Factoring Device (fwd)

From: Dave Farber <farber () cis upenn edu>
Date: Wed, 05 May 1999 09:47:26 -0400

From: 7Pillars Partners <partners () sirius infonex com>
Reply-To: iwar () sirius infonex com
To: g2i list <g2i () xmission com>, IWAR list <iwar () sirius infonex com>
Subject: [IWAR] CRYPTO An Analysis of Shamir's Factoring Device

The symbols don't translate into text; please follow the link using a
graphical browser if you need the specifics.

http://www.rsa.com/rsalabs/html/twinkle.html

   An Analysis of Shamir's Factoring Device
   Robert D. Silverman
   RSA Laboratories
   
   May 3, 1999
   
   At a Eurocrypt rump session, Professor Adi Shamir of the Weizmann
   Institute announced the design for an unusual piece of hardware. This
   hardware, called "TWINKLE" (which stands for The Weizmann INstitute Key
   Locating Engine), is an electro-optical sieving device which will
   execute sieve-based factoring algorithms approximately two to three
   orders of magnitude as fast as a conventional fast PC.  The announcement
   only presented a rough design, and there a number of practical
   difficulties involved with fabricating the device.  It runs at a very
   high clock rate (10 GHz), must trigger LEDs at precise intervals of
   time, and uses wafer-scale technology.  However, it is my opinion that
   the device is practical and could be built after some engineering effort
   is applied to it.  Shamir estimates that the device can be fabricated
   (after the design process is complete) for about $5,000.
   
   What is a sieve-based factoring algorithm?
   
   A sieve based algorithm attempts to construct a solution to the
   congruence A2 = B2 mod N,  whence GCD(A-B,N) is a factor of N.  It does
   so by attempting to factor many congruences of the form  C = D mod N,
   where there is some special relation between C and D.  Each of C and D
   is attempted to be factored with a fixed set of prime numbers called a
   factor base.  This yields congruences of the form:
   
                            (Pia) =  (pi b) mod N
                                       
   where  Pi are the primes in the factor base associated with C and  pi
   are the primes in the factor base associated with D. These factored
   congruences are found by sieving all the primes in the factor base over
   a long sieve interval.   One collects many congruences of this form (as
   many as there are primes in the two factor bases) then finds a set of
   these congruences which when multiplied together yields squares on both
   sides. This set is found by solving a set of linear equations mod 2.
   Thus, there are two parts to a sieve-based algorithm: (1) collecting the
   equations by sieving, and (2) solving them.  The number of equations
   equals the sum of the sizes of the factor bases.  A variation allows
   somewhat larger primes in the factorizations than those in the factor
   bases.  This has the effect of greatly speeding the sieving process, but
   makes the number of equations one needs to solve much larger. One could
   choose not to use the larger primes, but then one needs a much larger
   factor base, once again resulting in a larger matrix.
   
   It should be noted that sieve based algorithms can also be used to solve
   discrete logarithm problems as well as factor.  This applies to discrete
   logs over finite fields, but not to elliptic curve discrete logs.
   Solving discrete logs takes about the same amount of time as factoring
   does for same-sized keys.  However, the required space and time for the
   matrix is much larger for discrete logs. One must solve the system of
   equations modulo the characteristic of the field, rather than mod 2.
   
   What has been achieved so far with conventional hardware?
   
   Recently, a group led by Peter Montgomery announced the factorization of
   RSA-140, a 465-bit number.  The effort took about 200 computers, running
   in parallel, about 4 weeks to perform the sieving, then it took a large
   CRAY about 100 hours and 810 Mbytes of memory to solve the system of
   equations.  The size of the factor bases used totaled about 1.5 million
   primes resulting in a system of about 4.7 million equations that needed
   to be solved.
   
   How long would RSA-140 take with TWINKLE?
   
   Each device is capable of accommodating a factor base of about 200,000
   primes and a sieve interval of about 100 million. RSA-140 required a
   factor base of about 1.5 million, and the sieve interval is adequate, so
   about 7 devices would be needed.  One can use a somewhat smaller factor
   base, but a substantially smaller one would have the effect of greatly
   increasing the sieving time. This set of devices would be about 1000
   times faster than a single conventional computer, so the sieving could
   be done in about 6 days with 7 devices.  The matrix would still take 4
   days to solve, so the net effect would be to reduce the factorization
   time from about 33 days to 10 days, a factor of 3.3.   This is an
   example of Amdahl's law which says that in a parallel algorithm the
   maximum amount of parallelism that can be achieved is limited by the
   serial parts of the algorithm. The time to solve the matrix becomes a
   bottleneck.  Even though the matrix solution for RSA-140 required only a
   tiny fraction of the total CPU hours, it represented a fair fraction of
   the total ELAPSED time: it took about 15% of the elapsed time with
   conventional hardware for sieving. It would take about 40% of the
   elapsed time with devices. Note further that even if one could sieve
   infinitely fast, the speedup obtained would only be a factor of 8 over
   what was actually achieved.
   
   How long would a 512-bit modulus take with TWINKLE?
   
   A 512-bit modulus would take 6 to 7 times as long for the sieving and 2
   to 3 times the size of the factor bases as RSA-140.  The size of the
   matrix to be solved grows correspondingly, and the time to solve it
   grows by a factor of about 8. Thus, 15 to 20 devices could do the
   sieving in about 5-6 weeks. Doubling the number will cut sieving time in
   half. The matrix would take another 4 weeks and about 2 Gbytes of memory
   to solve. The total time would be 9-10 weeks.  With the same set of
   conventional hardware as was used for RSA-140, the sieving would take 6
   to 7 months and the matrix solving resources would remain the same.
   
   Please note that whereas with RSA-140, solving the matrix would take 40%
   of the elapsed time, with a 512-bit number it would take just a bit
   more.  This problem will get worse as the size of the numbers being
   factored grows.
   
   How well will TWINKLE scale to larger numbers? 
   
   A 768 bit number will take about 6000 times as long to sieve as a
   512-bit number and will require a factor base which is about 80 times
   large.  The length of the sieve interval would also increase by a factor
   of about 80. Thus, while about 1200 devicess could accommodate the
   factor base, they would have to be redesigned to accommodate a much
   longer sieve interval. Such a set of machines would still take 6000
   months to do the sieving.  One can, of course, reduce this time by
   adding more hardware.  The memory needed to hold the matrix would be
   about 64 Gbytes and would take about 24,000 times as long to solve.
   
   A 1024-bit number is the minimum size recommended today by a variety of
   standards (ANSI X9.31, X9.44,  X9.30, X9.42).  Such a number would take
   6 to 7 million times as long to do the sieving as a 512-bit number.  The
   size of the factor base would grow by a factor of about 2500, and the
   length of the sieve interval would also grow by about 2500.  Thus, while
   about 45,000 devices could accommodate the factor base, they would again
   have to be redesigned to accommodate much longer sieve intervals.  Such
   a set  would still take 6 to 7 million months (500,000 years) to do the
   sieving.
   
   The memory required to hold the matrix would grow to 5 to 10 Terabytes
   and the disk storage to hold all the factored relations would be in the
   Petabyte range.  Solving the matrix would take "about" 65 million times
   as long as with RSA-512.  These are rough estimates, of course, and can
   be off by an order of magnitude either way.
   
   
   What are the prospects for using a smaller factor base?
   
   The Number Field Sieve finds its successfully factored congruences by
   sieving over the norms of two sets of integers.  These norms are
   represented by polynomials.  As the algorithm progresses, the
   coefficients of the polynomials become larger, and the rate at which one
   finds successful congruences drops dramatically.  Most of the successes
   come very early in the running of the algorithm. If one uses a
   sub-optimally sized factor base, the 'early' polynomials do not yield
   enough successes for the algorithm to succeed at all. One can try
   sieving more polynomials, and with a faster sieve device this can
   readily be done. However, the yield rate can drop so dramatically that
   no additional amount of sieving can make up for the too-small factor
   base.
   
   The situation is different if one uses the Quadratic Sieve. For this
   algorithm all polynomials are 'equal', and one can use a sub-optimal
   factor base. However, for large numbers, QS is much less efficient than
   NFS.  At 512-bits, QS is about 4 times slower than NFS.  Thus, to do
   512-bit numbers with devices,  QS should be the algorithm of choice,
   rather than NFS.  However, for 1024-bit numbers,  QS is slower than NFS
   by a factor of about 4.5 million. That's a lot. And the factor base will
   still be too large to manage, even for QS.
   
   What are the prospects for speeding the matrix solution?
   
   Unlike the sieving phase, solving the matrix does not parallelize
   easily.  The reason is that while the sieving units can run
   independently, a parallel matrix solver would require the processors to
   communicate frequently and both bandwidth and communication latency
   would become a bottleneck.  One could try reducing the size of the
   factor bases, but too great a reduction would have the effect of vastly
   increasing the sieving time. Dealing with the problems of matrix storage
   and matrix solution time seems to require some completely new ideas.
   
   Key Size Comparison
   
   The table below gives, for different RSA key sizes, the amount of time
   required by the Number Field Sieve to break the key (expressed in total
   number of arithmetic operations), the size of the required factor base,
   the amount of memory, per machine, to do the sieving, and the final
   matrix memory.
   
   The time column in the table below is useful for comparison purposes. It
   would be difficult to give a meaningful elapsed time, since elapsed time
   depends on the number of machines available. Further, as the numbers
   grow, the devices would need to grow in size as well. RSA-140 (465 bits)
   will take 6 days with 7 devices, plus the time to solve the matrix.
   This will require about 2.5 * 1018 arithmetic operations in total. A
   1024-bit key will be 52 million times harder in time,  and about 7200
   times harder in terms of space.
   
   The data for numbers up to 512-bits may be taken as accurate. The
   estimates for 768 bits and higher can easily be off by an order of
   magnitude.
   
   Keysize
   
   Total Time
   
   Factor Base
   
   Sieve Memory
   
   Matrix Memory
   
   428
   
   5.5 * 1017
   
   600K
   
   24Mbytes
   
   128M
   
   465
   
   2.5 * 1018
   
   1.2M
   
   64Mbytes
   
   825Mbytes
   
   512
   
   1.7 * 1019
   
   3M
   
   128Mbytes
   
   2 Gbytes
   
   768
   
   1.1 * 1023
   
   240M
   
   10Gbytes
   
   160Gbytes
   
   1024
   
   1.3 * 1026
   
   7.5G
   
   256Gbytes
   
   10Tbytes
   
   Conclusion
   
   The idea presented by Dr. Shamir is a nice theoretical advance, but
   until it can be implemented and the matrix difficulties resolved it will
   not be a threat to even 768-bit RSA keys, let alone 1024.
   
   References:
   
   (1) A.K. Lenstra & H.W. Lenstra (eds),  The Development of the Number
   Field Sieve, Springer-Verlag Lecture Notes in Mathematics #1554
   
   (2) Robert D. Silverman,  The Multiple Polynomial Quadratic Sieve,
   Mathematics of Computation, vol. 48, 1987, pp. 329-340
   
   (3) H. teRiele,  Factorization of RSA-140,  Internet announcement in
   sci.crypt and sci.math, 2/4/99
   
   (4)   R.M. Huizing,   An Implementation of the Number Field Sieve,   CWI
   Report NM-R9511,  July 1995

   Questions and Answers: Shamir's Factoring Device and RSA
   What is RSA announcing?
   
   At the Eurocrypt '99 conference this week in Prague, Adi Shamir, a
   coinventor of the RSA public-key algorithm and a professor at the
   Weizmann Institute in Israel, is presenting a design for a special
   hardware device that would speed up the first part of the process of
   factoring a large number. The design, called TWINKLE, which stands for
   "The Weizmann INstitute Key Locating Engine," is based on
   opto-electronics. Shamir estimates that the device would be as powerful
   as about 100 to 1,000 PCs in the factoring process called "sieving," and
   would cost only about $5,000 in quantity.
   
   Does this mean that RSA can be cracked?
   
   No. Shamir's device offers the possibility of recovering keys less
   expensively than with a network of PCs, but does not crack RSA in the
   sense of making it easy to recover keys of any size. Rather, the device
   speeds up the "sieving" step of known methods of factoring large
   numbers, which are the primary avenues for attacking the RSA public-key
   algorithm. The design confirms what was previously expected about the
   appropriateness of certain RSA key sizes, including 512 bits. Larger RSA
   key sizes are still out of reach, one of the obstacles being the amount
   of work and storage involved in the rest of the process of factoring a
   large number.
   
   What would it take to build the new device?
   
   Building the device would involve a fair amount of opto-electronic
   engineering, but it is likely to be feasible.
   
   RSA sponsors competitions that demonstrate that DES can be cracked by a
   group of determined computer enthusiasts using networked computers. Why
   can't this be applied to RSA?
   
   Actually, such competitions can and have been applied to RSA. Since
   several years before RSA started the
   DES Challenges, which offer prizes for successful recovery of a 56-bit
   DES key, RSA has been awarding prizes for successful factorizations of
   large numbers, including many RSA numbers. Very few of the RSA numbers
   have been factored so far, however, the largest one being about 450 bits
   long, still short of the 512-bit mark targeted by the new device.
   
   Perhaps the new device, if built, may figure into future cracking
   efforts around the 512-bit level, just as the Electronic Frontier
   Foundation's Deep Crack device has facilitated the last two cracking
   efforts for DES.
   
   What can developers do to safeguard their products against advances due
   to the new device?
   
   One of the benefits of the RSA public-key algorithm is that it has a
   variable key size, so, in effect, it has variable strength. This is in
   contrast to DES, which has a fixed, 56-bit key size and is difficult to
   safeguard if the key size is found to be insufficient. Products based on
   RSA can be protected against the new device and other developments in
   factoring technology with appropriate key sizes.
   
   In the 1980s, the "default" key size for many RSA implementations was
   512 bits, which even as of this writing has not been broken. Several
   years ago, recognizing that 512-bit keys might be at risk in the near
   future, RSA Laboratories recommended that developers choose a minimum
   key size of 768 bits for user keys and 1024 bits for enterprise keys.
   (The recommendation for certificate authority "root" keys was 2048
   bits.) Products following these recommendations are safe against the new
   threat, and products that support a variable key size can be safeguarded
   through the deployment of longer keys.
   
   Last week, RSA issued a bulletin about a weakness in the ISO 9796
   signature format. Today, you're disclosing that it is possible to break
   RSA.  Is RSA's encryption technology still secure?
   
   Yes, RSA's encryption technology is still secure. Last week's
   announcement was about a weakness in an alternative format for preparing
   messages for RSA signatures that is not supported by RSA's products. The
   weakness was related to the format, and not the RSA public-key algorithm
   itself. Today's announcement is about a clever design for a hardware
   device that speeds up known methods of factoring large numbers, not a
   new method of attacking the RSA public-key algorithm. In both cases, the
   strength of the methods supported in RSA's products and of the key sizes
   recommended by RSA Laboratories have been confirmed.
   
   RSA Laboratories will continue to report on these and similar
   developments.

Copyright  1999, RSA Data Security, Inc.  All Rights Reserved.
Current thread:

IP: CRYPTO An Analysis of Shamir's Factoring Device (fwd) Dave Farber (May 05)