CMSC 28400 Introduction to CryptographyAutumn 2019Notes #9: Pseudorandom Functions and (Many-Time) CPA SecurityInstructor: David CashContents9.1Pseudorandom Functions19.2Birthday Attacks39.3Chosen-Plaintext Security59.1Pseudorandom FunctionsWe now give an abstraction describing a main security goal of block ciphers. In this definition wewill need the concept of an oracle for an algorithm. You can think about oracles intuitively as“subroutines” that an algorithm can call. When A is an oracle algorithm and O1 is a function, wewrite AO1 for A connected to the oracle (subroutine) O1 . If O2 is another oracle, we write AO2for A connected to O2 , and so on. A key point in this formalism is that A can only observe theinput/output behavior of its oracle, and not the code implementing the oracle. So if O1 and O2 arethe same function, then AO1 and AO2 will behave exactly the same.The following definition also needs the concept of a random function f : {0, 1} {0, 1} . Youcan think of such a random variable as a table listing an output for every possible input, and suchthat the outputs are all uniformly random and independent. For a concrete example with 3,we might have f represented by the table:x000001010011100100101110111f (x)100011000101011111001000110So “picking a random function” means filling in the right side of the table with uniformly randomand independent entries. These entries may repeat, but the outcome of one entry does not influencethe outcome of any other entry.1

Example 9.1. Let f be a random function from {0, 1}128 to {0, 1}128 . Then the following hold:Pr[f (0128 ) 0128 ] 12128,1Pr[f (0128 ) starts with 0] ,21Pr[f (0128 ) f (1128 )] 128 ,21128128128Pr[f (0 ) f (1 ) 1 ] 128 .2These can be proved rigorously by the independence and uniformity of f (0128 ) and f (1128 ) (or, onecould actually count the fraction of all possible functions for which these hold).Things are not always so simple. For instance,Pr[f (f (0128 )) 0128 ] 12128 (1 12128)12128,which can be seen by applying the law of total probability to split up the probability into the caseswhere f (0128 ) 0128 and f (0128 ) 6 0128 .Exercise 9.1. Let Func({0, 1} , {0, 1} ) be the set of all functions from {0, 1} to {0, 1} . This set has size 2 2 . Using the table point of view on functions, or another method, explain this formula.We can now state the definition, and afterwards interpret it.Definition 9.1. Let E : {0, 1}n {0, 1} {0, 1} and A be an adversary. Let K be uniformon {0, 1}n , and let f be a random function from {0, 1} to {0, 1} . We define the pseudorandomfunction (PRF) distinguishing advantage of A against E to beE(K,·)Advprf 1] Pr[Af (·) 1] .E (A) Pr[AAn adversary A can be judged according to its runtime, the number of queries it issues, and itsadvantage. We won’t distinguish much between runtime and queries, but in practice an adversarywill usually have far more computation time than it does queries, because the queries need to be runby the parties under attack. They won’t be enabling anything like 280 queries, but the adversarymight have that much computation. If the advantage of every reasonable A is small (say 2164 or1, depending on the application), we informally say that E is a good PRF.2128In that case we think of E as “essentially looking like a random function,” and all of theproperties of random functions are thus inherited by E. So E(K, x) and E(K, x0 ) should look liketotally random, independent strings, when x 6 x0 , even when they differ only by one bit. If not,then there would be an adversary with good PRF advantage.Example 9.2. Suppose E : {0, 1}128 {0, 1}128 {0, 1}128 satisfies E(k, 0128 ) 0128 for everyk {0, 1}128 . We show that E is not a good PRF. Consider AO that queries 0128 to its oracle Oand calls the response y. If y 0128 then A outputs 1, else 0. ThenPr[AE(K,·) 1] 1andPr[Af (·) 1] 21,2128

the latter because a random function satisfies f (0128 ) 0128 with probabilityAdvprfE (A) 1 121281.2128Thus,which is high.Note that we had to define A with respect to a generic oracle O; We don’t get to say how Aworks with E(K, ·) and f (·) separately, since A only gets to see their input/output behavior. Morecomplicated patterns can be found with more clever attacks.Example 9.3. Define E : {0, 1}128 {0, 1}128 {0, 1}128 by E(k, x) k x. We show thatE is not a good PRF. Consider AO that fixes any x1 6 x2 {0, 1}128 , queries y1 O(x1 ), andy2 O(x2 ). If y1 y2 x1 x2 then A outputs 1, else it outputs 0.Pr[AE(K,·) 1] 1and1,2128the latter because a random function satisfies that equation with probabilityare uniform and independent). ThusPr[Af (·) 1] Pr[f (x1 ) f (x2 ) x1 x2 ] AdvprfE (A) 1 1212812128(because f (x1 ), f (x2 ),which is high. Note that we needed two queries to break this one – It’s actually impossible to breakit in one query!Finally we address the need for oracles in the definition. It would be simpler to, say, provide afunction table (like the one we diagrammed above) to A, and ask it if the table represents E(K, ·)or f . But this table would gigantic for the sizes we care about: 2128 · 128 bits for AES. But noalgorithm with realistic resources could even read its input in that case. We resort to oracles toallow “selective” access to parts of the function without paying a huge runtime.9.2Birthday AttacksYou may have noticed that in the definition of PRF advantage, when the oracle is E(K, ·), thenthe adversary will never see a repeated output for two distinct inputs. But when the oracle is f (·)then two distinct inputs will produce the same output with non-zero probability. This suggests thefollowing adversary, which attacks E : {0, 1}n {0, 1}k {0, 1}k and issues some number q 2 queries:Adversary AOInitialize an empty set SFor i 1, . . . , q:Let xi {0, 1} be the -bit encoding of iQuery yi O(xi )If yi S: Output 1Add yi to SOutput 03

The set S could be implemented with a hash table or another data structure that allows for fastmembership testing and fast addition.Let’s find the advantage of this adversary. First it is simple to show thatPr[AE(K,·) 1] 0because the yi will all be distinct points, and hence will never be in S in the for loop. This isbecause the yi are the outputs of a block cipher applied with the same key to distinct inputs. Onthe other hand,Pr[Af (·) 1]is less obvious to calculate. We will need the following interesting and well-known “birthday bound”.Theorem 1. Let N, q 1 be integers, and let Col(N, q) be the probability that there is a repeatedvalue (a collision) in q independent uniform samples from a set of size N . Then1 exp q(q 1)/2N Col(N, q) 0.5If q q(q 1).N2N , then this implies0.3q(q 1)q(q 1) Col(N, q) 0.5.NNAbove exp(x) ex is the usual exponential function from calculus.Exercise 9.2. Prove the upper bound of the theorem via a union bound.The lower bound is only slightly harder to prove; See page 273 of rogaway/classes/227/spring05/book/main.pdf.What is surprising about this bound is that the probability of a collision is approximately q 2 /Nand not q/N . In particular, once q is only a little larger than N , the probability jumps upexponentially close 1 and is therefore almost certain. This results in some surprising phenomena,like the result of the following exerciseExercise 9.3. Use the theorem to find the minimum number q of people required to have a 1/2 orgreater chance of two people having the same birthday. Assume birthdays are uniformly randomand independent values amongst the 365 days of the year. Returning to our adversary, and assuming q 2 · 2 , we havePr[Af (·) 1] 0.3q(q 1),2 because the yi values in the algorithm are all uniform and independent samples from a set of size2 when the oracle is a random function. ThusE(K,·)Advprf 1] Pr[Af (·) 1]E (A) Pr[A 0 Pr[Af (·) 1] Pr[Af (·) 1]q(q 1) 0.3.2 4

Thus, for -bit outputs, we only need q 2 /2 , not q 2 queries, to have good PRF advantage!For AES, this means about 264 queries suffice, which is astronomically smaller than 2128 .These attacks are called “birthday attacks,” and are avoidable if one insists on working with ablock cipher (as we do). Thus we’ll aim for a block cipher to have only this defect, and remain agood PRF until about 2 /2 queries are issued.9.3Chosen-Plaintext SecurityWe now extend the one-time CPA definition to many queries.Definition 9.2. Let Π (Enc, Dec) be a randomized encryption scheme with key-space K, messagespace M, randomness-space R, and ciphertext-space C. We assume that the message-space M isa set of bit strings, i.e. M {0, 1} . Let A be an algorithm. Define algorithm ExptcpaΠ (A) asAlg ExptcpaΠ (A) Pick k K, b {0, 1}02 Run ALRk,b (·,·) , where the oracle is given below. Eventually A halts with output b̂03 If b̂ b: Output 106 Else: Output 001Oracle LRk,b (m0 , m1 )If m0 , m1 are not the same length: Return Pick r RCompute c Enc(k, mb , r)Return cDefine the CPA advantage of A against Π ascpaAdvcpaΠ (A) Pr[ExptΠ (A) 1] 12The definition for the stateful version of the scheme is the same, except the oracle works asfollows:Oracle LRk,b (m0 , m1 )If m0 , m1 are not the same length: Return Compute (c, s0 ) Enc(k, mb , s)Overwrite the current state s with s0Return c5