Keith Adams of Facebook pointed me to this paper:
Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors
The fun part is that we have a new and exciting way to induce memory corruption: reading memory.
The interesting part is the proposal of a probabilistic algorithm to address the issue.
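As I read it, the proposal (the paper calls it PARA, for probabilistic adjacent row activation) is roughly this: each time a row is activated, the memory controller refreshes one of its neighbours with some small probability p. Here is a toy sketch of that idea, not the paper's implementation. The function name `hammer`, the single integer disturbance counter per row, and the `threshold` parameter are all my own simplifications for illustration; real DRAM does not fail at a fixed count.

```python
import random

def hammer(activations, threshold, p, rng):
    """Repeatedly activate row 1 in a toy model.

    Returns True if either neighbour (row 0 or row 2) accumulates
    `threshold` disturbances without being refreshed in between.
    The counter model and parameter names are hypothetical.
    """
    disturb = {0: 0, 2: 0}  # disturbance counts for the two neighbours
    for _ in range(activations):
        # Each activation of row 1 disturbs both neighbours.
        for row in disturb:
            disturb[row] += 1
            if disturb[row] >= threshold:
                return True  # modelled bit flip
        # PARA-style step: with probability p, refresh one neighbour
        # chosen at random, which resets its disturbance count.
        if rng.random() < p:
            disturb[rng.choice((0, 2))] = 0
    return False

rng = random.Random(0)
unprotected = hammer(1000, threshold=1000, p=0.0, rng=rng)
protected = hammer(10000, threshold=100, p=1.0, rng=rng)
```

With p set to 0 nothing is ever refreshed, so the flip is guaranteed once the activation count reaches the threshold. With p greater than 0, a flip requires a long streak of unlucky coin tosses, and the chance of that shrinks exponentially with the threshold. That is what makes a stateless, probabilistic fix plausible.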
This continues my enduring belief that reliance on reliable hardware to make software work is increasingly a fool's errand. As we patch more and more of the cracks, eventually we will have to stare at the chasms and rethink software design.
In the meantime, the next time I get a memory corruption I am pointing my boss to this paper.
Alan Yoder says
Back in the day, all satellite or other remote data got encoded using Hamming codes and what have you. It wasn’t the hardware we didn’t trust, it was the signal. Same diff.