the design space of encrypted algorithms
An encrypted algorithm is an algorithm that operates over encrypted data. These algorithms are at the heart of encrypted systems and, in particular, of encrypted databases. In this series, we will learn how to design such algorithms, focusing on practical solutions that are efficient enough to be deployed in real-world systems.
Before jumping into the design process, it is useful to understand the design space because encrypted algorithms come in different forms, each of which is better suited to specific types of problems.
Interactive vs. non-interactive solutions. One way to partition the design space is into different interaction regimes. Specifically, solutions can be:
- non-interactive, meaning they do not require interaction during computation. Non-interactive solutions are sometimes referred to as single-round because the client needs to send at least one message with the encrypted input and receive at least one message with the encrypted result.
- interactive, meaning they require multiple rounds of interaction between the client and server.
Linear vs. sub-linear solutions. Another way to partition the design space is with respect to running time. Encrypted algorithms can be:
- super-linear, where the server-side runtime is \(\Omega(n)\), where \(n\) is the size of the input.
- linear, where the server-side runtime is \(\Theta(n)\).
- sub-linear, where the server-side runtime is \(o(n)\).
A natural question is how an algorithm can run in sub-linear time since it does not even have time to read the input. The answer lies in the use of data structures: we first spend linear time to preprocess the input into a data structure that allows for efficient querying and management of the data, enabling sub-linear query times.
This distinction between linear and sub-linear solutions is important not only algorithmically but also cryptographically because it provides a starting point for designing encrypted algorithms. This is due to the fact that the cryptographic primitives at our disposal can also be categorized along the linear vs. sub-linear axis. More precisely, primitives like fully-homomorphic encryption (FHE) and secure multi-party computation (MPC) can be used to design linear solutions whereas primitives like property-preserving encryption (PPE), structured encryption (StE), and oblivious RAM (ORAM) can be used to design sub-linear solutions. But I’ll note that this is not completely true because MPC and ORAM can be combined to get sub-linear solutions.
Native vs. emulated operations. To better understand why certain primitives result in linear solutions while others enable sub-linear solutions, it is helpful to consider the design process. In actuality, all of the cryptographic methods to compute on encrypted data mentioned above only support a very limited set of operations “natively” or explicitly. Specifically,
- FHE natively supports arithmetic operations (i.e., addition and multiplication) on finite fields;
- MPC natively supports either Boolean operations (i.e., AND, OR, and NOT) on bits or arithmetic operations on finite fields;
- ORAM natively supports read and write operations on arrays;
- encrypted multi-maps (EMM), a common type of StE scheme, support get and put operations on multi-map data structures;
While these primitives natively support only a limited set of operations on encrypted data, they can support a broader range of operations. For example:
- FHE and MPC can support any operation by first compiling/transforming it into arithmetic and Boolean circuits, respectively, and evaluating the circuits using the natively-supported operations on encrypted data;
- ORAMs can emulate any operation by compiling it into standard instructions (i.e., read, write, addition, multiplication, etc.) and executing the reads and writes over the ORAM and the arithmetic operations locally;
- EMMs can emulate any operation in the same manner as ORAMs.
So the total cost of a solution basically depends on two factors: the cost of the native operations themselves; and the overhead introduced by the compilation process. In the case of FHE, for example, the native operations have improved in recent years (e.g., TFHE-rs library can perform 32-bit unsigned integer additions in \(106\) ms and multiplications in \(207\) ms on an AWS hpc7a.96xlarge instance) but the number of homomorphic operations required is linear in the data size.
Big data vs. small data. Linear-time algorithms are often sufficient for small-scale data. But, as data size grows or time constraints tighten, sub-linear solutions become necessary. For instance, computing the average exam score of 300 students given \(24\) hours is perfectly feasible with a linear-time algorithm; but searching for a single record in a billion record database within a second requires a sub-linear solution.
As mentioned above, sub-linear (plaintext) algorithms crucially rely on data structures. So it should be no surprise that designing sub-linear encrypted algorithms will crucially rely encrypted data structures. And, as we will see in future posts, we will design those structures using structured encryption schemes.