|
### Using paper as apocalyptic-resilient backup storage ###
Today, out of the blue, I was thinking about paper. You know, that
cellulose-based thingy that Chinese people came up with back in the good old
days of the Han dynasty. Since those remote times, humanity has relied on
paper for storing information and, as the roughly two thousand years since
have demonstrated, paper has proved to be an excellent long-term medium.
Nowadays we use diskettes, magnetic drives, flash memory and CD/DVD
discs... But none of these comes anywhere close to paper's longevity and
resilience. Any serious magnetic disturbance will wipe the data from most of
our computers, mild UV exposure will erase the content of our writable
CD/DVDs, and in any case, all of these modern storage contraptions are
unlikely to be readable in 50-100 years.
So I thought: wouldn't it be nice to use paper as storage for binary data?
After some quick research I found out (unsurprisingly) that I am not the
first to have come up with such an idea. There are at least three software
packages out there that are dedicated to exactly that:
* PaperDisk, a commercial offering from Cobblestone Software (Win 3.1, 9x)
* PaperBack, open-source software by Michael Mohr (Windows only, sadly)
* Optar, open-source software by Karel Kulhavy (POSIX)
All these solutions share one big limitation, though: paper has a relatively
low data density, which translates to low data capacity. Apparently, it is
realistic to expect about 300-500 KB of storage on a single A4 sheet. This is
not much, but still, if we own some data that has a high value-per-byte
ratio, such a backup could be an excellent long-term storage strategy.
Before using paper-based backups, one needs to think about a major (and
easily overlooked) potential problem. Let's assume our paper backup sheet
survives 3000 years, and in some dystopian future a sentient race discovers
it. How would they know how to decode it? Surely they will not have access
to a PC running Windows 3.11 with the proper software. Even without going to
such an extreme scenario, this problem might just as well apply to the paper
backup's own author, 20 years later: "I encoded important data on this A4
sheet 20 years ago using some exotic software - how do I decode it now?"
For the above reason, it may be safer (although much less efficient) to opt
for a paper-based backup solution that prints data in a way that a human
could easily understand and decode. Writing data encoded as a sequence of
letters comes naturally to mind: after all, it is what we have been doing for
some 6000 years now, with very good results. Binary data could therefore be
encoded in a BASE-32 "alphabet", composed of the most recognizable Latin
symbols. Instructions on how to decode it would be short enough to be written
in the margin of the sheet. A single A4 sheet can comfortably fit 10,000
characters. Each BASE-32 character carries 5 bits, so this translates to
50,000 bits, or 6,250 bytes: a usable storage of about 6 KiB. That is much
less than with the sophisticated methods used by the dedicated software
listed before, but it has the advantage of being trivial to decode either by
hand, or using any kind of OCR software - now or in the future.
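To make the idea a bit more concrete, here is a minimal sketch in Python of
what such an encoder could look like. It is only an illustration under a few
assumptions of mine: it uses the standard RFC 4648 BASE-32 alphabet (A-Z and
2-7) from Python's base64 module rather than any hand-picked set of "most
recognizable" symbols, the grouping into blocks of 5 characters and 10 blocks
per line is an arbitrary layout choice, and the function names are made up
for this example.

```python
import base64

CHARS_PER_SHEET = 10_000  # figure quoted in the text above
GROUP = 5                 # characters per group (assumed layout)
GROUPS_PER_LINE = 10      # groups per printed line (assumed layout)


def encode_for_paper(data: bytes) -> str:
    """Return BASE-32 text split into groups and lines for printing."""
    # Padding '=' signs carry no data, so they are dropped to save space.
    text = base64.b32encode(data).decode("ascii").rstrip("=")
    groups = [text[i:i + GROUP] for i in range(0, len(text), GROUP)]
    lines = [" ".join(groups[i:i + GROUPS_PER_LINE])
             for i in range(0, len(groups), GROUPS_PER_LINE)]
    return "\n".join(lines)


def decode_from_paper(printed: str) -> bytes:
    """Reverse encode_for_paper: drop whitespace, restore padding, decode."""
    text = "".join(printed.split()).upper()
    text += "=" * (-len(text) % 8)  # b32decode expects a multiple of 8 chars
    return base64.b32decode(text)


def sheet_capacity_bytes(chars: int = CHARS_PER_SHEET) -> int:
    """Each BASE-32 character carries 5 bits; 8 bits make a byte."""
    return chars * 5 // 8
```

With this layout, encode_for_paper(b"hello") gives "NBSWY 3DP", and
sheet_capacity_bytes() gives 6250, which matches the "about 6 KiB" estimate.
A real-world version would probably also want a checksum per line, so that a
single misread character can be located without retyping the whole sheet.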
Attached to this article is a copy of the latest (as of February 2019)
PaperBack source code, the shareware PaperDisk 1.0 release (1997), an
extremely interesting white paper on the subject from Cobblestone Software,
as well as the source code and website copy of Optar (2007).
=== Attachments ==========================================
|