In the GhostRider paper, we presented a memory-trace-oblivious system comprising a type system, compiler, ISA extensions, and an FPGA hardware implementation. GhostRider uses compiler and microarchitecture co-design to improve upon past systems such as Ascend and Phantom, which protect against attackers snooping the off-chip DRAM address bus. By exposing on-chip scratchpads at the ISA level, GhostRider lets the compiler control when off-chip accesses occur, preventing timing side channels. Similarly, the compiler directly controls whether data is placed in Oblivious RAM (ORAM) or Encrypted RAM (ERAM) based on static analysis of the program's access pattern, which improves performance while keeping the same security guarantee as placing all data in ORAM. The GhostRider prototype was implemented using JavaCC for the compiler and a modified RISC-V Rocket Chip running on the Convey HC-2ex.
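The placement decision described above can be illustrated with a minimal sketch. It is not the GhostRider compiler: the `ArrayAccessSummary` type, the `place` function, and the array names are hypothetical, and the rule used here (an array goes to ORAM only if its index may depend on secret data) is an assumed simplification of what a static analysis of the access pattern would decide.

```python
from dataclasses import dataclass
from enum import Enum


class Bank(Enum):
    ERAM = "ERAM"  # encrypted RAM: contents hidden, addresses visible
    ORAM = "ORAM"  # oblivious RAM: contents and access pattern hidden


@dataclass(frozen=True)
class ArrayAccessSummary:
    """Hypothetical per-array result of a static analysis."""
    name: str
    index_depends_on_secret: bool


def place(arrays):
    """Map each array name to a memory bank.

    Assumed rule: only arrays whose addresses may depend on secret data
    need ORAM; every other array can live in the cheaper ERAM.
    """
    return {
        a.name: Bank.ORAM if a.index_depends_on_secret else Bank.ERAM
        for a in arrays
    }


# Hypothetical program with one sequentially scanned array and one
# array indexed by secret values.
summaries = [
    ArrayAccessSummary("input_stream", index_depends_on_secret=False),
    ArrayAccessSummary("histogram", index_depends_on_secret=True),
]
placement = place(summaries)
```

Under this assumed rule, `input_stream` lands in ERAM and `histogram` in ORAM, so only the accesses that could leak through the address bus pay the ORAM cost.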
In our prototype, the latencies reported for ERAM and ORAM accesses include the overhead of moving data to and from the scratchpads. This overhead is specific to our choice of the Convey HC-2ex platform and to our hardware design. Our simple academic implementation makes many tradeoffs that do not reflect fundamental characteristics of the architecture. For example, the transfer unit expects to receive data from an ORAM controller on a separate FPGA. This adds latency because the cross-FPGA link is only 32 bits wide, and the data must be received and formatted before it is written to the scratchpad. Similarly, the ERAM access latency is dominated by data movement that could be optimized with more engineering effort. As a result, the ratio of ORAM to ERAM latency is less than 10x in our prototype, whereas prior work reports ORAM bandwidth overheads of almost 100x.
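A small worked example shows how data-movement overhead can compress the observed ratio. The cycle counts below are hypothetical, not measurements from the prototype, and the model assumes for simplicity that ERAM and ORAM accesses pay the same fixed transfer cost.

```python
def latency_ratio(oram_raw, eram_raw, transfer_overhead):
    """Observed ORAM/ERAM latency ratio when both pay a fixed transfer cost."""
    return (oram_raw + transfer_overhead) / (eram_raw + transfer_overhead)


# Hypothetical cycle counts, chosen only to illustrate the effect.
raw = latency_ratio(oram_raw=5000, eram_raw=50, transfer_overhead=0)
observed = latency_ratio(oram_raw=5000, eram_raw=50, transfer_overhead=600)

print(f"ratio without transfer overhead: {raw:.1f}x")
print(f"ratio with transfer overhead:    {observed:.1f}x")
```

With these assumed numbers, a 100x gap in raw access latency shrinks to under 10x once the shared overhead is added, because the overhead dominates the ERAM access time.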
In light of this discrepancy, one might question whether the GhostRider prototype is unrealistic and whether the experimental results presented in the paper are invalid. However, this reasoning misses the point of why we do experiments in systems and architecture research: to convince ourselves that our idea works and that we did not overlook something simple, and to convince the community that the idea has potential benefit and deserves further investigation. The absolute latencies we reported are artifacts of our platform, not properties of the architecture. Thus, when comparing a new system to GhostRider, it is vital to ensure that parameters such as ERAM and ORAM latency are consistent with the experimental setup of the new system. An even better approach would be to re-implement the compiler analysis and scratchpad architecture in the same system for comparison.
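To see why consistent latency parameters matter, consider a toy cost model. The function names, access counts, and latencies are all hypothetical. The sketch only counts memory cycles and ignores compute, scratchpad hits, and everything else a real evaluation would measure.

```python
def memory_cycles(eram_accesses, oram_accesses, eram_latency, oram_latency):
    """Total cycles spent on off-chip accesses under a simple additive model."""
    return eram_accesses * eram_latency + oram_accesses * oram_latency


def speedup_over_all_oram(eram_accesses, oram_accesses, eram_latency, oram_latency):
    """Speedup of a split ERAM/ORAM placement over placing all data in ORAM."""
    total = eram_accesses + oram_accesses
    baseline = memory_cycles(0, total, eram_latency, oram_latency)
    split = memory_cycles(eram_accesses, oram_accesses, eram_latency, oram_latency)
    return baseline / split


# Hypothetical workload: 900 accesses can go to ERAM, 100 must go to ORAM.
# Evaluate the same workload under two assumed ORAM/ERAM latency ratios.
speedups = {
    ratio: speedup_over_all_oram(900, 100, eram_latency=1, oram_latency=ratio)
    for ratio in (10, 100)
}

for ratio, s in speedups.items():
    print(f"ORAM/ERAM latency ratio {ratio:>3}x -> speedup {s:.2f}x")
```

The same workload and the same placement yield different speedups depending on the assumed latency ratio, which is why numbers taken from one platform should not be compared directly against results measured on another.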
I am currently a graduate student working on a Ph.D. in Electrical & Computer Engineering in the SPARK Research Lab. I am supervised by Professor Mohit Tiwari.