Fast Fully Oblivious Compaction and Shuffling

Sajin Sasy, Aaron Johnson, and Ian Goldberg

Citation: Sajin Sasy, Aaron Johnson, Ian Goldberg. "Fast Fully Oblivious Compaction and Shuffling". 29th ACM Conference on Computer and Communications Security, November 2022.

Paper: CCS 2022 version; extended version. The extended version makes a small correction to the pseudocode for BORPStream in Figures 9 and 10, as well improving its notation. It also provides appendices including definitions, proofs, and additional experimentation graphs.

Abstract: Several privacy-preserving analytics frameworks have been proposed that use trusted execution environments (TEEs) like Intel SGX. Such frameworks often use compaction and shuffling as core primitives. However, due to advances in TEE side-channel attacks, these primitives, and the applications that use them, should be fully oblivious; that is, perform instruction sequences and memory accesses that do not depend on the secret inputs. Such obliviousness would eliminate the threat of leaking private information through memory or timing side channels, but achieving it naively can result in a significant performance cost.

In this work, we present fast, fully oblivious algorithms for compaction and shuffling. We implement and evaluate our designs to show that they are practical and outperform the state of the art. Our oblivious compaction algorithm, ORCompact, is always faster than the best alternative and can yield up to a 5× performance improvement. For oblivious shuffling, we provide two novel algorithms: ORShuffle and BORPStream. ORShuffle outperforms prior fully oblivious shuffles in all experiments, and it provides the largest speed increases—up to 1.8×—when shuffling a large number of small items. BORPStream outperforms all other algorithms when shuffling a large number of large items, with a speedup of up to 1.4× in such cases. It can obtain even larger performance improvements in application settings where the items to shuffle arrive incrementally over time, obtaining a speedup of as much as 4.2×. We additionally give parallel versions of all of our algorithms, prove that they have low parallel step complexity, and experimentally show a 5–6× speedup on an 8-core processor.

Finally, ours is the first work with the explicit goal of ensuring full obliviousness of complex functionalities down to the implementation level. To this end, we design Fully Oblivious Assembly Verifier (FOAV), a tool that verifies the binary has no secret-dependent conditional branches.

Code: Version 20220901