Knute Snortum wrote: Personally, if I bought a cheap HDD from someone I don't know, and it said not to format the drive, I would be worried that it contains some kind of malware.
So would I.
However, as Martin has pointed out, things were different in the past. Take floppies, in particular. There were actually 2 types of floppies. One was hard-sectored, one was soft-sectored. A hard-sectored disk had multiple index holes, one marking the start of each sector. Hard-sectored disks weren't common, however.
More often, floppies were soft-sectored. As Martin mentioned, instead of physical sector marks, the drive used the single index hole as a baseline and all the other sectors on the track were indicated by metadata markers visible to the hardware.
What few people knew was that this was actually the IBM CKD disk formatting technique. CKD stands for Count, Key, Data and refers to the IBM architecture where a disk sector's metadata contained a count (the sector length), a key (so that you could rapidly search files based on a key value), and data, which was the primary payload. It was, in fact, the exact same architecture used on their mainframe DASD devices. DASD stands for Direct Access Storage Device, which is IBM-ese for disk, the way IPL is IBM-ese for boot. Well, technically, IBM did make some direct-access CKD devices using other technologies like drums and datacells, but for most people, DASD meant disk.
You didn't format a DASD device. Instead, when you brought it into service, you initialized it. That process consisted of writing a set of 80-byte data-only ANSI-format label records to track zero. The most important of these was the VOL1 label, which contained the 6-character volume ID in EBCDIC. The ANSI labels were all text, incidentally, no binary data.
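To make the label idea concrete, here is a minimal sketch in Python of building an 80-byte, all-text VOL1 record in EBCDIC. It only covers what is described above (the "VOL1" identifier followed by the 6-character volume ID). The function name make_vol1_label is made up for illustration, and leaving the rest of the record blank is an assumption, not the real field layout.

```python
LABEL_LENGTH = 80  # label records are 80 bytes, as described above


def make_vol1_label(volume_id: str) -> bytes:
    """Build a simplified 80-byte VOL1 label encoded in EBCDIC.

    Hypothetical helper: only the "VOL1" identifier and the volume ID
    are filled in; the remaining bytes are left as EBCDIC blanks.
    """
    if not 1 <= len(volume_id) <= 6:
        raise ValueError("volume ID must be 1 to 6 characters")
    # Identifier, then the volume ID left-justified and blank-padded to 6.
    text = "VOL1" + volume_id.upper().ljust(6)
    # Assumption: the rest of the record is simply blank-filled.
    text = text.ljust(LABEL_LENGTH)
    # cp037 is one of the EBCDIC codecs shipped with Python.
    return text.encode("cp037")


label = make_vol1_label("WORK01")
print(len(label))                      # length of the record in bytes
print(label.decode("cp037").rstrip())  # readable text form of the label
```

Because the label is pure text, decoding it with the same codec gives back something a human can read, which is the point being made about ANSI labels.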
You also had the Volume Table of Contents, or VTOC. This was the "directory" of the disk and mapped the 44-character filenames to specific cylinder/track number extents on the disk. A file might spread over multiple extents. The directory was flat. No subdirectories.
Allocating an extent didn't format the extent itself, just marked its location. When you wrote a file to the disk, you'd write data or write key and data on the fly. Parts of the extent beyond the last physical record were undefined garbage. CKD not only supported hardware search by key, but also variable length records. And that's all I need to say, if not more. If you want further details, most of that documentation is online.
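As a rough illustration of the count/key/data idea, here is a small Python model of a track holding variable-length records, with a search by key. This is a simplification under stated assumptions: the names CkdRecord and search_key are invented for the example, the "count" here is just the key and data lengths, and the real hardware did the search in the controller rather than in software.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CkdRecord:
    """Simplified model of one CKD record (hypothetical layout)."""
    key: bytes   # empty for a data-only record
    data: bytes  # payload; length may differ from record to record

    @property
    def count(self) -> Tuple[int, int]:
        # Assumption: the count is modelled as (key length, data length).
        return (len(self.key), len(self.data))


def search_key(track: List[CkdRecord], wanted: bytes) -> Optional[CkdRecord]:
    """Scan the records on a track and return the first one whose key matches."""
    for record in track:
        if record.key == wanted:
            return record
    return None


# A track with records of different lengths, one of them data-only.
track = [
    CkdRecord(key=b"ALPHA", data=b"x" * 120),
    CkdRecord(key=b"", data=b"y" * 40),
    CkdRecord(key=b"OMEGA", data=b"z" * 800),
]

hit = search_key(track, b"OMEGA")
print(hit.count if hit else "not found")
```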
So CKD didn't work with fixed-length sectors. Unless it wanted to. And the length and existence of each sector were defined solely by the programmer. CKD did have its advantages, though. The gap between one sector and the next was "wasted" disk capacity back when a 5MB (!) disk was a respectable size. So breaking a disk track into lots of small sectors was usually undesirable.
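The cost of the gaps is easy to see with a little arithmetic. The sketch below uses made-up numbers (a 10,000-byte track and a 100-byte gap per record); they are purely illustrative and not the geometry of any real device. The function name usable_bytes is likewise hypothetical.

```python
def usable_bytes(track_capacity: int, record_size: int, gap_size: int) -> int:
    """Payload bytes that fit on a track when every record costs one gap."""
    per_record = record_size + gap_size
    records = track_capacity // per_record
    return records * record_size


# Hypothetical figures, chosen only to show the trend.
TRACK_CAPACITY = 10_000
GAP_SIZE = 100

for record_size in (100, 900, 4900):
    payload = usable_bytes(TRACK_CAPACITY, record_size, GAP_SIZE)
    print(record_size, payload)
```

With these assumed figures, the smallest record size leaves only half the track for payload, while the largest leaves nearly all of it.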
When PCs came along, they ignored the key-related parts of the disk controller logic and drives because the original PCs were pretty limited. They also lifted their filesystem pretty much straight from Digital Equipment Corporation systems, which is where we got the A:, B:, C: stuff and the 8.3 filenames. Using fixed-length sectors was simply easier. However, you could choose your sector length when you formatted the disk, as long as you had your BIOS set up to work with that length. On CP/M systems, you actually built your own customized BIOS - it wasn't in ROM like on the IBM PCs.
SCSI disks introduced the concept of hard-sectored drives. With SCSI, you didn't address records in terms of cylinder, track, record, you used a simple block offset. This practice got carried into the IDE architecture and IBM-compatible BIOS configuration got a lot easier as a result. So did the location logic in PC operating systems. It used to be that you'd set up the sectors as part of installing a new disk (hard formatting). These days, hard formatting is done at the factory and you only do the volume initialization (soft formatting). Soft formatting comes in 2 flavours: short, which just initializes basic volume info and full, which physically erases (and thereby tests) each physical sector. Full formatting can also be done on a per-partition/filesystem basis for only that partition.
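For anyone curious how a geometry-style address relates to a simple block offset, here is a sketch in Python of the conventional PC mapping between cylinder/head/sector and a linear block number. It uses the PC terms (head and sector) rather than the mainframe track and record, and the geometry values in the example are just sample numbers, not those of any particular drive.

```python
from typing import Tuple


def chs_to_lba(cylinder: int, head: int, sector: int,
               heads_per_cylinder: int, sectors_per_track: int) -> int:
    """Convert a cylinder/head/sector address to a linear block number.

    Sectors are numbered from 1 in this convention; cylinders and heads from 0.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba: int, heads_per_cylinder: int,
               sectors_per_track: int) -> Tuple[int, int, int]:
    """Inverse mapping: linear block number back to (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


# Sample geometry for illustration: 16 heads, 63 sectors per track.
block = chs_to_lba(2, 3, 4, heads_per_cylinder=16, sectors_per_track=63)
print(block)
print(lba_to_chs(block, heads_per_cylinder=16, sectors_per_track=63))
```

With block addressing, the operating system only needs the single number; the geometry arithmetic stays inside the drive.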
OK, so much for the history lesson. What this all means in practical terms is that it's possible that funny things were done with Campbell's disk to make it look bigger than it is. Although that's just one of many possibilities. Some that I can think of are:
A) Possible malware hidden on it.
B) Making the disk look bigger than it is.
C) It actually is a bigger disk, but there's bad zones, so it was formatted to look smaller, avoiding the bad zones. Or the physical access mechanism cannot reliably move to some parts of the physical disk.
D) ???
I wouldn't be surprised to see that it's option C. It's common practice to sell not-up-to-spec devices as smaller/slower units. Even the manufacturers themselves do that. Although when a no-name supplier does it, there's a high probability that they didn't fully test the downscaled unit before fobbing it off.
On a Linux computer, the best bet would be to run the badblocks utility with its destructive write-mode test (the -w option). This will totally destroy the data on the disk, but it will exhaustively test every block and record whether each one reads and writes correctly, including under stress conditions. A few bad blocks can be mapped out by the filesystem. If there are large bad zones, you can isolate them into unused partitions and use the rest of the disk. I have a few of those.
Of course, any drive with massive bad spot infection is more likely to fail catastrophically, so I'd keep it for non-critical applications.
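If you do end up with a list of bad blocks, a few lines of Python can help you see whether they cluster into zones worth fencing off. The sketch below assumes you have saved the scan results with badblocks' output-file option, which writes one bad block number per line. The function names and the max_gap parameter are my own invention for the example, and turning the zones into actual partition boundaries is left to you.

```python
from typing import Iterable, List, Tuple


def parse_bad_blocks(text: str) -> List[int]:
    """Parse a list of bad block numbers, one per line, ignoring blank lines."""
    return [int(line) for line in text.splitlines() if line.strip()]


def group_bad_zones(bad_blocks: Iterable[int],
                    max_gap: int = 0) -> List[Tuple[int, int]]:
    """Merge bad blocks into (first, last) zones.

    Blocks separated by at most max_gap good blocks are treated as one zone
    (max_gap is a hypothetical tuning knob, not a badblocks feature).
    """
    zones: List[List[int]] = []
    for block in sorted(set(bad_blocks)):
        if zones and block - zones[-1][1] <= max_gap + 1:
            zones[-1][1] = block           # extend the current zone
        else:
            zones.append([block, block])   # start a new zone
    return [(first, last) for first, last in zones]


# Made-up sample data standing in for a saved scan result.
sample = "5\n6\n7\n100\n101\n500\n"
blocks = parse_bad_blocks(sample)
print(group_bad_zones(blocks))
print(group_bad_zones(blocks, max_gap=100))
```

A handful of isolated blocks is something the filesystem can map out; a few wide zones suggests the partition approach described above.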