Destroying Data on Magnetic Disks - Linux or Windows
© 2005 by Increa Technology (Brian Mork), Rev 2.2
Introduction
Many people no longer use floppy disks with their computers. New laptops may not have a 3.5” floppy drive, and I haven't seen a 5.25” floppy drive in a new computer for years. Now might be the time to copy onto CD-ROM all the information you want from the old floppies lying around the house, and sell off your floppies. I've had reasonable success selling stacks of floppy disks through web auction sites, but I had no idea who I was sending the disks to, so I wanted to ensure any private information was not recoverable from the disk. Of course, deleting a file with normal operating system methods doesn't erase the data. It simply makes the table-of-contents on the disk report that the data is no longer available. So I knew that method wasn't any good for my purposes. To further understand the problem, you might want to read Dean Devera's Computer Science 574 thesis on the problem of permanently erasing digital data, written in December 2001, and titled "The Difficulty of Data Annihilation from Disk Drives: or Exnihilation Made Easy". It is available on the web as a pdf, and referenced in the Bibliography of this security software page.
I initially looked for answers to permanently erase data in the Windows 2000 environment, but in the end, Linux gave me the greatest flexibility for saving data I wanted and securely erasing the disks. I could have just magnetically erased them all, but I found that buyers on web auctions pay noticeably more for pre-formatted disks, so formatting was an important part of my plan. Additionally, assertively re-writing the disk with new data makes it harder to recover old data.
A secondary goal of the project was to automate the process so I could execute it over and over conveniently. To this end, the commands are bundled into Linux shell scripts or Windows batch files. This article highlights a number of differences between Windows and Linux, as I go about a real-life computer task. Lastly, this exercise also served an educational goal to learn more about Linux system tools, which has been important to me in recent months because I'm converting my home network away from high dollar operating systems.
This article describes how to accomplish nearly equivalent tasks in any of three environments:
Large Linux environment – I used Mandrake 2006.0 (aka Mandriva 2006.0) and Red Hat 9.0. Both offer a full GUI environment with complete sets of system administration tools. Downloading either over the internet reasonably requires a fast connection because each comes as a 3 CDROM set, plus extras if you want. Installation can take an entire afternoon, just like a major Windows OS installation.
Small Linux environment – I used tomsrtbt (ver 2.0.103), which is available for free on the internet. The entire bootable command line operating system fits onto a single bootable 3.5” disk, so it's reasonable to download even over a slow dial-up connection. Setup consists of using a provided tool to copy a disk image onto a 1.44 MB disk. Nothing gets installed on the hard drive; you just boot the disk.
Microsoft OS environment – MS-DOS 6.22, Windows 98SE, and Windows 2000.
Technical Background
Floppy disks are divided into concentric rings of data called tracks, and each track is divided up into a number of sectors. Most disks these days also include data on both sides. If you want to know more of this type of information, visit http://www.jegsworks.com/Lessons/lesson6/lesson6-2.htm, which gives a beginner level tutorial on storage media.
For this article, I'm not concerned about the physical organization of the disk. So instead, I conceptually unwrap all the sectors and tracks, leaving a long string of data. When thought of this way, each sector has a “sector number” that is really an offset ranging from 0 (zero) up to the size of the disk. Certain portions of the disk are used for different purposes. A garden-variety disk, formatted under the FAT (File Allocation Table) format that originated at Microsoft and was carried through MS-DOS and Windows, is arranged in this order: boot sector, two copies of the FAT, root directory, and user data.
The boot sector is only 512 bytes. On a bootable disk, this contains executable code that has just enough to tell the computer how to access the rest of the disk, such as MSDOS.SYS and IO.SYS files in the Windows environment. In the Linux environment, it allows the computer to load in the kernel image. MSDOS.SYS, IO.SYS, or a boot kernel are nothing more than files stored in the “user data” section of the disk.
A floppy has two copies of the FAT. The duplicate FATs offer some degree of security, in case one gets damaged. A FAT uses either 12 bits, 16 bits or 32 bits to contain the address or number of all the data clusters on the disk. Notice that if 12 bits are used, that caps the number of data clusters to 2^12, or 4096. Generally, as disks got bigger through the years, the standard shifted upwards. An exception to this is floppy disks, which are naturally limited in size and therefore don't need more addressable clusters. They've standardized on 12-bit FATs.
The last section of the system data is the root directory, which includes things like filenames, file sizes, dates of creation, etc. Most people don't realize there's a physical limit to the number of files you can have in the root directory of a disk. Practically, this isn't a problem because subdirectories are created in the user data section of the disk. So unless you try to jam all your files in the root directory, you'll probably never hit this limit.
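The root directory limit can be worked out from its size. The sketch below assumes the standard 32-byte FAT directory entry (a detail this article does not state) and uses the root directory sizes from the table later in this article; the function name is only illustrative.

```shell
# Number of directory entries that fit in a FAT root directory.
# Assumption: each entry occupies 32 bytes, as in standard FAT12/FAT16.
root_dir_entries() {
    sectors=$1                      # root directory size in 512-byte sectors
    echo $(( sectors * 512 / 32 ))
}

echo "360 KB floppy (7 sectors):        $(root_dir_entries 7) entries"
echo "1.2/1.44 MB floppy (14 sectors):  $(root_dir_entries 14) entries"
```

Long filenames take up extra entries, so the practical number of files can be lower than this count.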
Formatting a floppy disk under Windows XP or Windows 2000 puts information in the boot track identifying the disk as MS-DOS 5.0, along with a few other things such as a message saying the disk is not bootable. Of course, if you make a bootable floppy, then this section has computer code that allows it to look further for the IO.SYS and MSDOS.SYS files.
Windows 2000 also zeroed the FAT tables and the root directory, and all the data area was filled with F6h. Formatting under MS-DOS 6.22 was noticeably worse from a security standpoint because it re-wrote the system area, but left the original data intact on the user section.
Using the Norton Utilities program wipeinfo.exe zeroed the user section of the disk, but left data on the disk in the 512-byte boot sector. For example, after wiping a disk, a pkzip backup disk still contained the text “this is a pkzip backup disk", among other things, in that portion of the disk. Additionally, multi-tasking operating systems such as Windows 2000 and Windows 98 would not let wipeinfo do its job on the disk. Instead, I had to boot to the Windows 98 real mode, or use MSDOS 6.22. I assume newer versions meant to run under Windows 2000 would work, but one of the advantages of Linux is that more versatile programs are available for free.
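Leftover text like that pkzip message is easy to look for under Linux. The following is a minimal sketch, not a tool used in this article: it relies only on head, tr and grep, the function name is made up for illustration, and the demonstration runs on a scratch file rather than a real disk. On a real floppy, the argument would be the device (for example /dev/fd0).

```shell
# List runs of 4 or more printable characters found in the first
# 512 bytes (the boot sector) of a disk image or device.
boot_sector_text() {
    head -c 512 "$1" | LC_ALL=C tr -c '[:print:]' '\n' | grep -E '.{4,}'
}

# Demonstration on a scratch file standing in for a wiped disk:
# zero-filled sectors with a leftover message at byte offset 64.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=4 2>/dev/null
printf 'this is a pkzip backup disk' | dd of="$img" bs=1 seek=64 conv=notrunc 2>/dev/null
boot_sector_text "$img"
rm -f "$img"
```

If the function prints nothing, no readable text of that length remains in the boot sector.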
Security Thoughts
Doing both an operating system format, and a Norton wipe would replace data on the entire disk, but I don't want any evidence of software that has used the disk, or any of the data stored on the disk. If done as described above, the boot track advertises at least what operating system I'm running, or worse, what software packages had used the disk. Years of working around classified military information have made me interested in doing better. I've accepted it as an educational challenge to remove any possibility of data recovery in some absolute sense.
Realize that serious data recovery experts can play games with disk drive hardware and work “miracles” you may not have thought of. Think NSA-type activities. One could pull data from the magnetic media by reading the disk with a known offset of the repeating F6s or 00s, at the magnetic hardware level. What's obtained with this method is the noisy remnants of the “underneath” data. Your drives and mine can't do this. Others can. Averaging many reads of this noisy data has a good chance of reconstructing the original data. As an aside, if you don't believe this can be done, acquire the free audio editing program Audacity, and try recording some cassette tapes back and forth with your sound card. Subtract audio tracks from each other, and you'll be surprised what can be pulled out of the background.
One tactic I use to make data recovery more difficult is to leave the disk filled with random bytes. This makes it impossible to “read through” a fixed pattern (such as F6s or 00s) with hardware circuitry. The random bytes have to be read into software, processed there, and then fed back into the hardware circuitry in real time to do background correction. One more layer of complication tilts toward my advantage of keeping old data gone for good.
However, all the disk writing in the world may not help under other circumstances. For example, if disk drive heads are slightly misaligned, re-writes may not matter since a clean, narrow copy of the original data may still exist bordering the re-written data – kind of like the shoulders of a road. If you're processing older 360 KB disks on 1.2 MB drives, you may never be able to totally erase data because the magnetic heads of the 1.2 MB drive are narrower, and are unable to write a big fat data track like those found on 360 KB drives. In general, the center of the data tracks may be rewritten, but the edges aren't.
If you're worried about misaligned heads or are using a 1.2 MB drive with 360 KB disks, the best thing you can do is use a bulk magnetic eraser. The problem with these is you don't know how much exposure makes a disk unreadable. Even if a disk is unreadable on your computer, others might find success using a drive specially tuned to pick out noisy disk signals at a hardware level. And with modified drive hardware, recovering weak data is easier than recovering data that's been written over. It becomes a simpler job of averaging the signal over and over. In fact, Steve Gibson (http://www.grc.com) has had commercial success doing data averaging with normal drives for disks that have lost strength or were accidentally exposed to magnetic fields. Check out his SpinRite software product, if you’re interested.
Personally, I overdose disks with a magnetic eraser, and then still run them through the processes I discuss below to leave them in a pre-formatted state with new data written over the old.
This article is about magnetic floppy drives. I've also been interested in destroying CD data, but in the end, that's a trivial problem. I noticed a $40 CD-ROM data destroyer for sale the other day, and couldn't, for the life of me, understand why someone would buy one. I experimented with my kitchen microwave. The response with a metallic CD-ROM is rather spectacular. Two seconds isn't enough for any visible effect. Four seconds is definitely too much, and could be a fire hazard. Three seconds with my microwave gives a very comforting visible and audible confirmation that the CD-ROM is no longer readable unless someone can pick data off individual metal flakes. Not that they couldn't, mind you. A laser scanning reflectometer comes to mind. But securing data is all about the cost of effort. You always want to make the data recovery much more costly than the process used to destroy it.
Introduction to the Tool Set
Here are the tools I'll use:
format.exe (W98 and W2K) – fills the boot area with a message, zeroes the system area, and fills the data area with F6h.
format.exe (MS-DOS 6.22) – fills the boot area with a message, zeroes the system area, and leaves existing data untouched.
Filesnoop.exe (Windows) – PC Magazine utility that allows inspection of binary files.
wipeinfo.exe (Windows 98 in real mode, or MS-DOS 6.22) – wipes out the data section of the disk.
Rawread.exe (Windows or MS-DOS) – reads the entire disk and records an image on a different drive.
Rawrite2.exe (Windows or MS-DOS) – reads a properly sized file from disk and writes the entire image onto a floppy disk.
fdformat (Linux) – fills the entire disk with F6h.
mkdosfs (Linux) – fills the boot area with a message, and zeroes the system area.
dd (Linux) – fills any portion of the disk with data from any source file. It also works in reverse, making copies of any sectors of bootable or non-bootable disks. This is how I archive bootable floppies before erasing them.
hexedit (Linux) – displays binary files in a conveniently navigable terminal window. Use the -s option to break data into 512-byte screen pages.
Although this isn't part of the scope of this article, I can't resist pointing out that two of the above tools also do efficient jobs of copying entire disk images into backup files before the disks are recycled. Under Linux, you can copy direct from drive to drive, so it can be used for hard drives, also. Copying to a file can be as simple as:
dd if=/dev/fd0 of=image.bin (under Linux), or
rawread -d a: -f image.bin (under Windows).
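Before erasing a disk, it is worth confirming that the backup image really matches it. The sketch below is one way to do that with dd and cmp; the function name is illustrative, and the demonstration uses a scratch file in place of /dev/fd0 so it can be tried without a floppy in the drive.

```shell
# Copy a disk (or any file) to an image and confirm the copy is identical.
# $1 = source (e.g. /dev/fd0), $2 = image file to create.
image_and_verify() {
    dd if="$1" of="$2" bs=512 2>/dev/null && cmp -s "$1" "$2"
}

# Demonstration with a scratch file standing in for a 1.44 MB floppy.
src=$(mktemp); img=$(mktemp)
dd if=/dev/urandom of="$src" bs=512 count=2880 2>/dev/null
if image_and_verify "$src" "$img"; then
    echo "image verified: $(( $(wc -c < "$img") )) bytes"
fi

# dd "in reverse": pull only the boot sector (sector 0) out of the image.
dd if="$img" of="$img.boot" bs=512 skip=0 count=1 2>/dev/null
rm -f "$src" "$img" "$img.boot"
```

The skip parameter counts 512-byte sectors from the start of the input, so any region from the table below can be extracted the same way.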
The following table shows the actual sizes of the disk sections diagrammed above. If you have a different format, Windows won't help you, and tomsrtbt environment is too limited because of the small size. You'll probably need a full Linux installation like Mandrake or Red Hat, which include drivers that will properly read and write from other floppy formats.
| | 360 kB (5a000h) | 1.2 MB (12c000h) | 1.2 MB (12c000h) | 1.44 MB (168000h) | 1.44 MB (168000h) |
|---|---|---|---|---|---|
| Formatted by | format.exe | format.exe, Kfloppy, or fdformat & mkdosfs | Gfloppy | format.exe, Kfloppy, or fdformat & mkdosfs | Gfloppy |
| Total Size | 720 | 2400 | 2400 | 2880 | 2880 |
| System Area (boot, FATs, root directory) | 12 | 29 | 35 | 33 | 35 |
| Boot Code | 1 @ 0h | 1 @ 0h | 1 @ 0h | 1 @ 0h | 1 @ 0h |
| FATs (2 copies) | 4 @ 200h | 14 @ 200h | 2 @ 200h | 18 @ 200h | 2 @ 200h |
| Root Directory | 7 @ a00h | 14 @ 1e00h | 32 @ 600h | 14 @ 2600h | 32 @ 600h |
| Data Area (files, sub directories) | 708 @ 1800h | 2371 @ 3a00h | 2365 @ 4600h | 2847 @ 4200h | 2845 @ 4600h |
| Tracks | 40 | 80 | 80 | 80 | 80 |
| Sectors (per track) | 9 | 15 | 15 | 18 | 18 |
| Allocation (sectors per cluster) | 2 | 1 | 1 | 1 | 1 |
Hex numbers in the table header are the number of bytes of storage on the disk. Hex numbers in the table are offsets to the starting location. Decimal numbers in the table are the number of 512-byte sectors in that portion of the disk. Additional observations:
Notice that (tracks) x (sectors) x (2 sides) gives you the total size in 512-byte sectors. Multiplying the number of sectors by 512 gives you the total disk size in bytes. Sector size is broadly standardized at 512 bytes across many operating systems.
The KDE Linux desktop ships with Kfloppy. It handles raw (magnetically erased) disks. It reports a number as “Raw Capacity” when it finishes, e.g. 737280 for a 1.44 MB floppy. This number appears to be half of the total byte size of the disk, so I think it's a bug in the program.
The Gnome Linux desktop ships with Gfloppy. It chokes on raw (magnetically erased) disks. More interesting is that, when given a Windows formatted disk, it recognizes the file system, and then formats it with a file system using system components sized differently. These disks are not recognized under Windows, so I don’t use this program.
The fdformat/mkdosfs tool chain (two programs used in sequence) is available on nearly every Linux installation, including tomsrtbt. These are the programs used in the scripts I provide.
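The arithmetic behind the table can be checked with a few lines of shell. This is only a sketch with illustrative function names; the numbers fed in are the 1.44 MB values for format.exe or mkdosfs from the table.

```shell
# Total sectors from disk geometry (2 sides).
# $1 = tracks, $2 = sectors per track.
disk_sectors() { echo $(( $1 * $2 * 2 )); }

# Byte offset of 512-byte sector number $1, printed in hex as in the table.
sector_offset_hex() { printf '%xh\n' $(( $1 * 512 )); }

# 1.44 MB floppy: 1 boot sector, 18 FAT sectors, 14 root directory sectors.
total=$(disk_sectors 80 18)
root=$(( 1 + 18 ))          # first root directory sector
data=$(( root + 14 ))       # first data area sector
echo "total sectors: $total"
echo "total bytes:   $(sector_offset_hex "$total")"
echo "FATs at:       $(sector_offset_hex 1)"
echo "root dir at:   $(sector_offset_hex "$root")"
echo "data area at:  $(sector_offset_hex "$data")"
```

Swapping in the track, sector, and section counts from another column reproduces that column's offsets.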
Random Byte File
One goal I had in mind was to leave the disk filled with random bytes. Linux offers a built-in device (read like a file, the same way a modem or scanner is) that provides random bytes. I wanted to use the /dev/random Linux device as a source for random bytes directly, but it can deliver only a limited number of bytes per second, because it generates real hardware-dependent random numbers. Instead, I ended up using the algorithm-based /dev/urandom, which should be okay since I'm not doing heavy cryptography, but rather just looking for varying patterns to write on the magnetic medium.
If you use Linux (either a big distribution or tomsrtbt), the need is solved. If you're using Windows or MS-DOS, tomsrtbt provides the necessary environment to make the random files. Of course, you can execute these same commands at the command line of a full hard-drive based Linux installation, if you already have that. After booting tomsrtbt, remove the boot floppy, and put a different formatted floppy into the drive. Then log in as user “root” (default password is “xxxx”), and execute this command to store a bunch of random bytes on the floppy:
dd if=/dev/urandom of=/dev/fd0 count=2880
The parameter “if” stands for “in file”, and “of” stands for “out file”. Use count=2880 if you'll be processing 1.44 MB floppies, or 2400 if you're doing 1.2 MB floppies, or 720 if you're doing 360 KB floppies. The “fd0” refers to your first floppy – whatever would be “Disk A:” under Windows. Use “fd1” if you want to use the 2nd drive in your computer.
Then boot up the computer under Windows, and use rawread.exe to read the random bytes off the floppy disk and into your working directory. Use the destination filename random.bin if you want the batch files I give you later to work, unchanged.
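Whichever route the random file takes, its size has to match the disk exactly. The sketch below shows the same dd call writing to an ordinary file and checks the resulting byte count. Producing the file directly on a Linux machine and carrying it over to Windows some other way is an assumption here, not the floppy-based route described above, and the function names are illustrative.

```shell
# Create a file of random bytes sized for a given floppy.
# $1 = output file, $2 = total size in 512-byte sectors (720, 2400 or 2880).
make_random_file() {
    dd if=/dev/urandom of="$1" bs=512 count="$2" 2>/dev/null
}

# Size of a file in bytes, as a plain number.
file_bytes() { echo $(( $(wc -c < "$1") )); }

out=$(mktemp)
make_random_file "$out" 2880
echo "wrote $(file_bytes "$out") bytes, expected $(( 2880 * 512 ))"
rm -f "$out"
```

Here bs=512 spells out the 512-byte block size that the count values in this article assume.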
Linux
Formatting a floppy disk using fdformat under Linux gives a disk full of hex F6. Unlike Windows, I mean really full – from byte 1 to the last byte. Of course, this isn't recognized by any operating system because there are no FATs or root directories. If you put it into a Windows computer, it will identify it as a “raw” format disk. Choose the command below based on what size floppy you're processing; these are what I used. Notice the change between “fd0” and “fd1”, because my 1.44 MB drive is the first, and my 1.2 MB drive is the second. Additionally, I've noticed some variance between Linux installations in the 4th letter (h vs. u vs. H). If it doesn't work as given below, try the other letters.
fdformat /dev/fd1h1200
fdformat /dev/fd1h360
fdformat /dev/fd0u1440
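Rather than guessing the 4th letter, a short loop can report which device node is actually present. This is a sketch with an illustrative function name; it only tries the three letters mentioned above, and it takes the directory as a parameter so it can be pointed at /dev or tested elsewhere.

```shell
# Print the first floppy device node that exists for a drive and capacity.
# $1 = directory to search (normally /dev), $2 = drive (fd0 or fd1),
# $3 = capacity in KB (360, 1200 or 1440).
find_floppy_node() {
    for letter in u h H; do
        node="$1/$2$letter$3"
        if [ -e "$node" ]; then
            echo "$node"
            return 0
        fi
    done
    return 1
}

find_floppy_node /dev fd0 1440 || echo "no fd0 node for 1440 KB found"
```

The name it prints is the one to hand to fdformat.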
You have to install a file system on the disk before it can be used as a normal floppy. All the disks I was using have less than 2^12 allocation units, so Windows and Linux make a FAT12 file system by default.
/sbin/mkdosfs -n volume_name /dev/fd1
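The FAT12 claim can be checked against the table. The sketch below counts allocation units for each disk size and estimates how many sectors one 12-bit FAT needs. It assumes the standard FAT layout in which two reserved entries precede the first data cluster, which is not something this article states; the function names are illustrative.

```shell
# Clusters in the data area.
# $1 = data-area sectors, $2 = sectors per allocation unit.
cluster_count() { echo $(( $1 / $2 )); }

# Sectors needed for one FAT12 table holding $1 clusters.
# Each entry is 12 bits; 2 reserved entries are assumed (standard FAT).
fat12_sectors() {
    bits=$(( ($1 + 2) * 12 ))
    bytes=$(( (bits + 7) / 8 ))
    echo $(( (bytes + 511) / 512 ))
}

for spec in "360KB 708 2" "1.2MB 2371 1" "1.44MB 2847 1"; do
    set -- $spec
    c=$(cluster_count "$2" "$3")
    echo "$1: $c clusters (limit 4096), $(fat12_sectors "$c") sectors per FAT"
done
```

Doubling the per-FAT figures for the two copies should line up with the “FATs (2 copies)” row in the format.exe and mkdosfs columns.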
Under Linux, I use the scripts shown below. Pick the one that matches your disk drive. The fdformat command is used to fill the disk with F6h. Then a full disk of random bytes is written using the dd command. Then I use mkdosfs to generate the system artifacts (boot sector, FAT tables, and root directory), which make the disk usable by an operating system. Lastly, dd is used to write different random bytes to all of the disk space following the system areas. These scripts aren't high-powered, automated processes. I've intentionally kept them trivially sequential so you can type the commands manually and try experimenting if you wish. If you enter the commands manually at the command line, skip the “#!/bin/bash” line.
#!/bin/bash
echo "1.2 MB disk cleaner by Brian Mork - Fall 2004"
echo "Formatting (all F6)."
fdformat /dev/fd1h1200
echo "Writing all random bytes."
dd if=/dev/urandom of=/dev/fd1 count=2400
echo "Installing file system."
/sbin/mkdosfs -n volume_name /dev/fd1
echo "Filling with random data, offset 29x512 bytes."
dd if=/dev/urandom of=/dev/fd1 seek=29 count=2371
echo "COMPLETE (Be sure format verify gave no errors, above)"
#!/bin/bash
echo "1.44 MB disk cleaner by Brian Mork - Fall 2004"
echo "Formatting (all F6)."
fdformat /dev/fd0u1440
echo "Writing all random bytes."
dd if=/dev/urandom of=/dev/fd0 count=2880
echo "Installing file system."
/sbin/mkdosfs -n volume_name /dev/fd0
echo "Filling with random data, offset 33x512 bytes (past boot & FAT)."
dd if=/dev/urandom of=/dev/fd0 seek=33 count=2847
echo "COMPLETE (Be sure format verify gave no errors, above)"
#!/bin/bash
echo "360 KB disk cleaner by Brian Mork - Fall 2004"
echo "Formatting (all F6)."
fdformat /dev/fd1h360
echo "Writing all random bytes."
dd if=/dev/urandom of=/dev/fd1 count=720
echo "Installing file system."
/sbin/mkdosfs -n volume_name /dev/fd1
echo "Filling with random data, offset 12x512 bytes."
dd if=/dev/urandom of=/dev/fd1 seek=12 count=708
echo "COMPLETE (Be sure format verify gave no errors, above)"
Notice that on my computer, the 3.5” drive is installed in the /dev/fd0 slot. This is the same as the Windows A: position. My 5.25” drive is installed as /dev/fd1, and corresponds to a Windows B: designation. You'll have to adjust the fd0 vs. fd1 references if your drives are installed differently.
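The final seek step in the scripts above can be rehearsed on an ordinary file before trusting it on a real disk. The sketch below uses a zero-filled scratch file as a stand-in for a freshly formatted 1.44 MB floppy, which is an assumption for illustration, since a real system area holds a boot sector, FATs, and a root directory rather than plain zeroes. The function name is illustrative.

```shell
# Overwrite the data area of an image with random bytes.
# $1 = image, $2 = first data sector, $3 = number of data sectors.
# conv=notrunc matters for a regular file: without it dd first truncates
# the file at the seek point, discarding anything beyond the region being
# written. A floppy device has a fixed size, so the scripts do not need it.
fill_data_area() {
    dd if=/dev/urandom of="$1" bs=512 seek="$2" count="$3" conv=notrunc 2>/dev/null
}

img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=2880 2>/dev/null
head -c $(( 33 * 512 )) "$img" > "$img.sys"      # copy of the system area

fill_data_area "$img" 33 2847

# The first 33 sectors should be untouched and the size unchanged.
head -c $(( 33 * 512 )) "$img" | cmp -s - "$img.sys" && echo "system area intact"
echo "image size: $(( $(wc -c < "$img") )) bytes"
rm -f "$img" "$img.sys"
```

The seek value is the “System Area” row of the table and the count is the “Data Area” row, so the same check works for the other disk sizes.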
Windows
Use an old boot floppy to get an MS-DOS command line, or use S)tart, R)un under Windows and enter either “cmd” or “command” (no quotes) in the prompt. For Windows 2K, I use the cmd processor. For Win98, I use the command processor.
A disk format is done first in case the disk was erased magnetically. Format.exe includes writing a boot sector, a FAT table filled with zeroes, root directory filled with zeroes, and F6h over the data area. In all the testing I did, Windows identified MSDOS5.0 as the author of the boot sector, and identified a 12-bit FAT. At this point, I don't need the file system re-built in the system area (the boot sector, FAT, and root directory), but under Windows, you can’t decouple the two.
Once the disk is formatted, you can put down random bytes on the sectors. Windows doesn't have an ability to write only part of a random byte file to disk (similar to the dd count parameter). So you have to generate random byte files of the exact size of your disk (total size from the table, times 512), per the earlier instructions. Also, two different sets of random numbers aren’t used by the scripts I provide. If you want to write two different sets of random numbers, you'll have to generate two files and modify the scripts to use both, in sequence.
Lastly, I run a Windows “quick” format, which re-writes only the file system (system areas) and writes a unique serial number on the disk. It leaves the random bytes spread out on the data section of the disk.
The batch files for Windows are shown here. Again, these are set up for my computer, which has a 1.44 MB drive at A: and a 1.2 MB drive at B:. You'll have to make a few changes if yours is different. If you have to make changes, just run the programs at the command line, using the “/?” option and they will tell you what notation to use on the command line.
echo off
echo "1.2 MB disk cleaner by Brian Mork - Fall 2004"
echo "Formatting (zero system, F6 data)."
format /f:1.2 b:
echo "Writing all random bytes."
rawrite2 -f random.bin -d b
echo "Formatting (put file system back)."
format /q /v:volume_name b:
echo on
echo off
echo "360 KB disk cleaner by Brian Mork - Fall 2004"
echo "Formatting (zero system, F6 data)."
format /4 b:
rem above statement assumes you have a 1.2 MB drive
rem if you have a 360 KB drive, use the following, instead
rem format /f:360 b:
rem rawrite reports "can't figure out how many sectors/track on this diskette"
rem echo "Writing all random bytes."
rem rawrite2 -f random.bin -d b
rem
echo "Formatting (put file system back)."
format /q /v:volume_name b:
echo on
echo off
echo "1.44 MB disk cleaner by Brian Mork - Fall 2004"
echo "Formatting (zero system, F6 data)."
format /f:1.44 a:
echo "Writing all random bytes."
rawrite2 -f random.bin -d a
echo "Formatting (put file system back)."
format /q /v:volume_name a:
echo on
Tomsrtbt
Tomsrtbt can use the same Linux shell scripts as presented earlier, with a few modifications. Instead of using “u1440” in the device name, you have to use “h1440” to specify a 3.5” drive with a 1.44 MB disk.
As shipped, tomsrtbt does not handle 5.25” drives, and it only can do 2 format types on 3.5” disks. I've accepted this limit because it's self contained in a very small footprint, and has a very reproducible environment that will work on essentially every PC-compatible computer that has a floppy drive, regardless of what the normal operating system is.
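Since the scripts differ only in device names and sector counts, one way to adapt them is to print the command sequence for a given disk and review it before running anything. The following is a sketch of that idea, not one of the scripts I used; the function name is made up, and the sector counts come from the table earlier in this article.

```shell
# Print the cleaning commands for one disk type instead of running them.
# $1 = format device (e.g. /dev/fd0h1440), $2 = plain device (e.g. /dev/fd0),
# $3 = total sectors, $4 = system-area sectors.
print_clean_plan() {
    echo "fdformat $1"
    echo "dd if=/dev/urandom of=$2 count=$3"
    echo "/sbin/mkdosfs -n volume_name $2"
    echo "dd if=/dev/urandom of=$2 seek=$4 count=$(( $3 - $4 ))"
}

# 1.44 MB disk under tomsrtbt (h1440) and under a full distribution (u1440).
print_clean_plan /dev/fd0h1440 /dev/fd0 2880 33
print_clean_plan /dev/fd0u1440 /dev/fd0 2880 33
```

Once the printed commands look right, they can be typed in by hand exactly as shown.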
Conclusion
Whether you use Windows, tomsrtbt, or a more complete Linux installation, these procedures can give you high confidence that personal data won't be recovered from floppies you pass on to others. Tomsrtbt is universally available through the web and can service all of your 3.5” floppy needs. Windows can do 3.5” and 5.25” formats, but requires a small workaround because it can't copy partial floppy images to disk. A full Linux installation gives the greatest capability and flexibility.
Author Biography
Brian Mork, Ph.D, is an Engineer, Scientist, and Aviator. He works at Edwards AFB as a Directed Energy Systems Engineer and at USAF Test Pilot School. He has experience in data collection, industrial control, program management, classified aerospace systems, and has 2500+ hrs in a variety of civilian and military aircraft. He used Linux first in 1993, and committed to “do the switch” in 2004. Contact him via email, or visit http://www.increa.com.