CMSC 603 Notes, 2/19/99

Bianca Benincasa

bbenin1@cs.umbc.edu

1. The set of functions mapping N to N is uncountable.

The proof is by contradiction. Assume the set is countable; then the functions mapping N to N can be listed as f_0, f_1, f_2, ..., so that every such function is f_i for some i in N.

We construct a function f mapping N to N as follows: f(i) = f_i(i) + 1, for all i in N. f maps N to N since the natural numbers are closed under addition (and 1 is a natural number).

If the function f is in the listing, then f is f_j for some j in N. But this implies that f_j(j) = f(j) = f_j(j) + 1, and so (by subtracting f_j(j) from both sides of the equation) 0 = 1, which is a contradiction.

Therefore, our assumption that the set of functions mapping N to N is countable causes a contradiction, and thus the set must be uncountable. Q.E.D.
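
The diagonal construction can be tried on any concrete listing. The following is a minimal Python sketch; the listing sample_listing (with f_i(n) = i*n) is a made-up example, not part of the proof, and only finitely many indices can be checked by running code.

```python
def diagonal(listing):
    """Given listing(i) -> f_i, return f with f(i) = f_i(i) + 1."""
    def f(i):
        return listing(i)(i) + 1
    return f


def sample_listing(i):
    """Hypothetical listing used only for illustration: f_i(n) = i * n."""
    return lambda n: i * n


f = diagonal(sample_listing)

# f disagrees with f_j at input j for every j we try.
for j in range(5):
    print(j, sample_listing(j)(j), f(j))
```

Whatever listing is supplied, f differs from the j-th function at input j, which is exactly why f cannot appear anywhere in the listing.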

2. P vs. NP

Diagonalization proofs (such as the one above) were for a long time the main method used by those trying to prove or disprove that P = NP. However, Baker, Gill, and Solovay later showed that no such proof (or disproof) of this conjecture can exist. To understand their result, we must discuss relativized Turing machines.

A relativized Turing machine is a Turing machine with the addition of an "oracle": an extra, initially empty tape on which the machine may write a question (a string over the alphabet Sigma of the machine) and then receive the answer to that question in one cycle. The oracle is identified with the set A of strings for which the answer is "yes".

Let P(A) denote the set of problems solvable in deterministic polynomial time with oracle A, and NP(A) denote the set of problems solvable in _nondeterministic_ polynomial time with oracle A. Baker, Gill, and Solovay showed that there exist oracles A and B such that P(A) = NP(A), but P(B) != NP(B). Since a diagonalization proof goes through unchanged when every machine involved is given access to the same oracle, any diagonalization proof that P=NP would also prove P(B) = NP(B), which is false. (Similarly, any diagonalization proof that P!=NP would also prove P(A) != NP(A).) Thus, no diagonalization proof can demonstrate whether P=NP, and so complexity theorists are forced to look to other methods to decide this question.

A related topic is the Random Oracle Hypothesis. This hypothesis states that if a statement p holds relative to a randomly chosen oracle A with probability 1, then p must be true without an oracle. (Oracles define probability distributions; an example follows: Any oracle A either has P(A) = NP(A) or P(A) != NP(A). Thus, we can partition the set of oracles into those for which P(A) = NP(A) and those for which P(A) != NP(A). Choosing an oracle at random, by deciding independently with probability 1/2 whether each string belongs to A, then gives a well-defined probability that P(A) = NP(A). Because there are uncountably many oracles, this probability is a measure rather than a literal ratio of counts.) There is no general consensus yet on whether the Random Oracle Hypothesis is true.

3. Maxwell's Demon

An aid to showing that Maxwell's perpetual motion machine cannot exist is Landauer's Principle. This principle states that for every bit lost during a computation, the computing device dissipates at least kT*ln(2) of energy, where k is Boltzmann's constant and T is the absolute temperature. (We say that a quantity x is lost during computation if x is used as part of an intermediate computation but not saved; for example, if x+y is computed without saving at least one of x and y, x and y have both been lost, since the sum does not tell us which two addends formed it. If x had been saved, y could be recovered by taking (x+y) - x, and no bits would have been lost. The same would occur had y been saved.) We refer to a calculation during which bits are lost as a nonreversible calculation.
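
To get a feel for the size of this bound, the formula can be evaluated directly. This is a small Python sketch; the function name landauer_energy and the choice of 300 K as "room temperature" are assumptions made for the example.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann's constant in joules per kelvin (SI value)


def landauer_energy(bits_lost, temperature_kelvin):
    """Minimum energy in joules dissipated when bits_lost bits are lost at temperature T."""
    return bits_lost * BOLTZMANN_K * temperature_kelvin * math.log(2)


# One bit lost at an assumed room temperature of 300 K: roughly 2.87e-21 joules.
print(landauer_energy(1, 300.0))
```

The bound scales linearly in both the number of lost bits and the temperature, so losing a billion bits at 300 K still costs only on the order of 1e-12 joules.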

If all computations by a given machine are reversible, the heat generated during computation can be made arbitrarily small. (There exist "reversible compilers" which can compile any program into an equivalent program with all calculations reversible, although the programs they generate require exponential space in order to save all intermediate values during calculations.)
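
The addition example from the discussion of Landauer's Principle can be written out as a Python sketch. The function names are invented for illustration; the point is only that keeping one addend makes the step invertible, while keeping just the sum does not.

```python
def add_irreversible(x, y):
    """Keeps only the sum, so x and y cannot be recovered from the output."""
    return x + y


def add_reversible(x, y):
    """Keeps x alongside the sum, so the step can be undone."""
    return x, x + y


def undo_add(x, total):
    """Inverse of add_reversible: recovers the original pair (x, y)."""
    return x, total - x


# Two different inputs collide under the irreversible version...
print(add_irreversible(1, 4), add_irreversible(2, 3))
# ...but the reversible version can always be run backwards.
print(undo_add(*add_reversible(2, 3)))
```

Saving x costs extra storage, which is the price a reversible program pays for never losing a bit.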

4. Rabin's Cryptosystem

Rabin's cryptosystem encodes a member x of Zn to another member y of Zn by the formula y = x^2 (mod n), where n is the product of two large primes p and q. The security of this cryptosystem is based on the assumption that factoring n is "hard".

To understand why this assumption assures the security of the system, we first need some number-theoretic background. One elementary method of factoring an integer n is by exhaustion: simply divide n by each prime p which is at most the square root of n; if the quotient is an integer, then n is broken down into factors p and n/p. (The process may be repeated on n/p, if desired.) However, there is a way which is (in theory) faster.
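
Factoring by exhaustion can be sketched in a few lines of Python. This version is an illustration rather than the method as stated in lecture: it tries every integer d up to the square root instead of only primes, which gives the same answer because a composite d can no longer divide n once its prime factors have been removed.

```python
def trial_division(n):
    """Return the prime factors of n (with multiplicity) in increasing order."""
    factors = []
    d = 2
    while d * d <= n:
        # Divide out d as many times as it goes in.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors.append(n)
    return factors


print(trial_division(77))
```

The loop runs on the order of sqrt(n) times, which is exponential in the number of digits of n; this is why the method is useless against a product of two large primes.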

If we were aware of two numbers x and y in Zn such that x^2 = y^2 (mod n), then that would imply that n divides the quantity x^2 - y^2. We can factor this difference of squares to obtain that n divides the quantity (x+y)(x-y). Since n = pq, pq divides (x+y)(x-y), and since p and q are primes, p must divide either (x+y) or (x-y) (or both), and so must q. If p divides (x+y) but not (x-y), and q divides (x-y), then we have the result that gcd(n,x-y) = q. (If p and q both divide (x-y), then x = y (mod n) and this gcd is n; if p and q both divide (x+y) and neither divides (x-y), then x = -y (mod n) and this gcd is 1. Either way, we don't learn anything about n's factorization.) So in the first case, by taking the greatest common divisor of n and (x-y), we have factored n into q (!=1) and n/q = p. (The same result would occur with p and q reversed.)

Thus, for any x,y in Zn such that x^2 and y^2 are congruent mod n, if p divides exactly one of (x+y),(x-y) and q divides the other, we can easily factor n.
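
The gcd step can be checked on a small example. In this Python sketch the function name factor_from_squares and the toy modulus n = 77 = 7 * 11 are assumptions chosen for illustration; 9^2 = 81 = 4 = 2^2 (mod 77), so x = 9 and y = 2 form a useful pair.

```python
from math import gcd


def factor_from_squares(n, x, y):
    """Given x^2 = y^2 (mod n), return a nontrivial factorization of n or None."""
    if (x * x - y * y) % n != 0:
        raise ValueError("x^2 and y^2 are not congruent mod n")
    g = gcd(n, (x - y) % n)
    if g == 1 or g == n:
        return None  # x = y or x = -y (mod n): nothing learned
    return g, n // g


print(factor_from_squares(77, 9, 2))   # useful pair: 9 - 2 = 7 and 9 + 2 = 11
print(factor_from_squares(77, 2, 75))  # 75 = -2 (mod 77): no information
```

Computing a gcd takes time polynomial in the number of digits of n, so all of the difficulty lies in finding a suitable pair x, y.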

We reduced the problem of factoring to the problem of breaking Rabin's system, by showing that if there were a (deterministic polynomial time) algorithm that could break Rabin's system, that algorithm could be used to factor n=pq. The proof went as follows.

Suppose such an algorithm A existed. Choose a number x in Zn at random and calculate x^2 (mod n). Then input this x^2 (mod n) and n into algorithm A to obtain y such that y^2 = x^2 (mod n). If p divides exactly one of (x+y),(x-y) and q divides the other, then we can factor n into p and q as outlined above. How often can we expect this to happen?

Since n=pq and p,q are distinct odd primes, Zn is isomorphic to Zp x Zq (by the Chinese Remainder Theorem). A quadratic residue that is relatively prime to n has two square roots mod p and two square roots mod q, and each of the four combinations gives a distinct square root mod n. So x^2 has four square roots in Zn: x, -x, and two others, z and -z, where z agrees with x mod p and with -x mod q. If A outputs y = x or y = -x, the gcd is trivial; if A outputs y = z or y = -z, then one of p, q divides (x-y) and the other divides (x+y), which is the case in which we can easily factor n. Since the algorithm A was only given x^2 (mod n) and not x, and x was chosen at random, all four roots are equally likely to be our x from A's point of view; even a "malicious" algorithm could not return x or -x any more often than this would happen randomly. Thus, the output of our algorithm A will enable us to factor n into p and q half the time. Therefore, we expect that on average after running algorithm A as described above twice, we will have x and y which will enable us to easily factor n.
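
The whole reduction can be simulated on a toy modulus. In this Python sketch, brute_force_root is a hypothetical stand-in for algorithm A (it finds a square root by exhaustive search, which is only feasible for tiny n), and factor_with_oracle is an invented name for the procedure described above.

```python
import random
from math import gcd


def brute_force_root(c, n):
    """Hypothetical stand-in for algorithm A: the smallest y with y^2 = c (mod n)."""
    return next(y for y in range(n) if (y * y) % n == c)


def factor_with_oracle(n, oracle, rng=None):
    """Factor n = pq using a square-root oracle; returns (factor, cofactor, attempts)."""
    rng = rng or random.Random()
    attempts = 0
    while True:
        attempts += 1
        x = rng.randrange(1, n)
        g = gcd(x, n)
        if g > 1:
            return g, n // g, attempts  # x itself happens to share a factor with n
        y = oracle((x * x) % n, n)      # the oracle sees x^2 but never x
        g = gcd(n, (x - y) % n)
        if 1 < g < n:
            return g, n // g, attempts  # y was one of the two "other" roots


print(factor_with_oracle(77, brute_force_root))
```

Each pass through the loop succeeds with probability at least 1/2, so the number of attempts reported is small on average, in line with the expected two runs of A.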

This reduction proves that if an efficient algorithm to break Rabin's system existed, then an efficient algorithm (as outlined above) would exist to factor n where n is a product of two large primes. This means that if no efficient algorithm to factor such n exists, then no efficient algorithm to break Rabin's system exists. Thus, the system is secure as long as factoring is an intractable problem. (Breaking the system is "at least as hard as" factoring.)

Another cryptosystem is the RSA (Rivest, Shamir, Adleman) scheme, in which the encryption of x is x^e (mod n), where there exists a decryption exponent d such that ed = 1 (mod phi(n)), where phi(n) denotes Euler's totient function. There is currently no known reduction from factoring to the problem of breaking this cryptosystem.
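
A toy instance makes the relation ed = 1 (mod phi(n)) concrete. The primes 61 and 53 and the exponent 17 in this Python sketch are small values assumed for illustration and offer no security; the three-argument pow with exponent -1 needs Python 3.8 or later.

```python
def rsa_keypair(p, q, e):
    """Return (n, e, d) for distinct primes p, q and e coprime to phi(n)."""
    n = p * q
    phi = (p - 1) * (q - 1)  # Euler's totient of n = pq
    d = pow(e, -1, phi)      # modular inverse: e * d = 1 (mod phi)
    return n, e, d


def rsa_encrypt(x, e, n):
    return pow(x, e, n)


def rsa_decrypt(y, d, n):
    return pow(y, d, n)


n, e, d = rsa_keypair(61, 53, 17)  # toy parameters, assumed for the example
ciphertext = rsa_encrypt(65, e, n)
print(n, d, rsa_decrypt(ciphertext, d, n))
```

Anyone who can factor n can compute phi(n) and hence d, so breaking RSA is no harder than factoring; it is the converse direction that is not known.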

5. Recurrence Relation from Homework 3

The solution to the recurrence

T(n) = T(floor(an)) + T(ceiling(bn)) + n, for n>=2

T(n) = 1, for n < 2, where a+b < 1

was discussed.

The claim to be proven is that there exist positive constants c and n_0 such that for all n>=n_0, T(n) <= cn. This can be proven by induction on n.

Let S be the set of positive integers n which satisfy T(n) <= cn.

Base case: every n with 1 <= n <= n_0 is in S; this holds as long as c >= T(n)/n for each of these finitely many n.

Inductive step: Let N >= n_0 be an integer, and assume that {1,...,N} is a subset of S.

Then T(N+1) = T(floor(a(N+1))) + T(ceiling(b(N+1))) + N+1

<= c*floor(a(N+1)) + c*ceiling(b(N+1)) + N+1

by the inductive hypothesis, as long as n_0 is chosen large enough that 1 <= floor(a(N+1)) and ceiling(b(N+1)) <= N (this uses a, b > 0)

<= ca(N+1) + c(b(N+1) + 1) + N+1

= (a+b)c(N+1) + c + N + 1

= (1-epsilon)c(N+1) + c + N + 1, where epsilon=1-(a+b)

= c(N+1) + c + N + 1 - epsilon(cN + c)

= c(N+1) + c + N + 1 - epsilon(cN) - epsilon(c)

<= c(N+1) + c + N + 1 - epsilon(cN)

since epsilon,c>0

= c(N+1) + N(1 - c*epsilon) + c + 1

<= c(N+1), if N(1 - c*epsilon) + c + 1 <= 0.

This condition is equivalent to N(c*epsilon - 1) >= c + 1

or, once c*epsilon - 1 is positive, N >= (c + 1)/(c*epsilon - 1).

The expression on the right hand side is just a constant, and so by choosing n_0 to be at least this constant, we can ensure that the inequality holds. (To ensure that (c*epsilon - 1) is positive, and thus that dividing by it does not reverse the inequality, we must choose c to be greater than 1/epsilon. Since (c + 1)/(c*epsilon - 1) decreases toward 1/epsilon as c grows, we can fix n_0 > 1/epsilon first and then take c large enough for both this bound and the base case.)
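
The linear bound can also be checked numerically for particular constants. In this Python sketch the values a = 1/3 and b = 1/2 are assumptions chosen for illustration (so epsilon = 1/6); exact fractions are used so that the floor and ceiling are not affected by rounding.

```python
from fractions import Fraction
from functools import lru_cache
from math import ceil, floor

A = Fraction(1, 3)  # assumed value of a
B = Fraction(1, 2)  # assumed value of b; a + b = 5/6 < 1


@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(floor(an)) + T(ceiling(bn)) + n for n >= 2, and T(n) = 1 for n < 2."""
    if n < 2:
        return 1
    return T(floor(A * n)) + T(ceil(B * n)) + n


def max_ratio(limit):
    """Largest value of T(n)/n over 1 <= n <= limit."""
    return max(Fraction(T(n), n) for n in range(1, limit + 1))


print(float(max_ratio(5000)))
```

For these constants the ratio T(n)/n stays below 1/epsilon = 6 over the whole range tested, which is consistent with the claim T(n) <= cn, although a finite check is of course no substitute for the induction.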