The following paragraphs describe such a program.
I. How to read a full track
I.1. Technical background
The PC-AT controls the floppy disk drives using a standard NEC µPD765 floppy disk controller (FDC), addressable at I/O addresses 3F4h and 3F5h, plus two output registers (the Digital Output Register at I/O address 3F2h and the Configuration Control Register at I/O address 3F7h) and one input register (the Digital Input Register, also at I/O address 3F7h).
The following technique relies on the architecture of the 765, and on the fact that, from a programming point of view, the 765 is not linked to the other hardware registers.
The Digital Output Register controls the disk drive motors and selects a particular disk drive for disk operations. The Configuration Control Register selects the bit rate for disk read/write operations. Both controls operate independently of the NEC 765.
I.2. The algorithm
This algorithm requires two installed disk drives (see the 1DISKDRV.TXT file for an explanation of why reading raw data with only one disk drive is generally physically impossible).
Insert the disk you want to fully read in one drive (let’s say A: in this example).
Insert another disk in the other drive (B: in this example). This disk must be IBM-formatted.
Select drive A: using port 3F2h. Also turn on motors for both A: and B:.
Go to the desired track and side using standard FDC 765 commands.
Swap to B: using port 3F2h.
Select density appropriate for IBM-formatted disk in B: using port 3F7h.
Issue a “Read a Track (Diagnostic)” command using the FDC 765. The parameters should match a sector that is physically present on the IBM-formatted disk, for example sector #1. For this command, set a sector size of at least 8KB (even if the physical sector is 512 bytes long). 16KB and 32KB sector sizes can be set to read more raw data. DMA registers should have been set accordingly.
Continuously watch the DMA address until it differs from the starting address. Once it does, the 765 has begun transferring sector data, which means it has already found the sector header on B:.
As soon as the DMA address has increased, swap to A: using port 3F2h. This is the main idea behind this technique. The 765 has no way of knowing that the disk selection has changed, because port 3F2h is not linked to it. This 1st step was discovered on the 11th of December, 1999.
Change density (bit rate) using port 3F7h. For a full track read, including MFM synchronization bits, you must set a bit rate twice the standard value. For example, when reading a 250 000 bits/sec track (double-density track), set the bit rate to 500 000 bits/sec. This 2nd step was discovered on the 18th of December, 1999.
From now on, the FDC 765 reads the disk in drive A:, thinking it is one big sector on the disk in drive B:. Wait for the FDC interrupt. Of course, most status bits at the end of this operation should simply be ignored, such as the data error (CRC) flag (which will obviously be set). The main indicator of a successful operation is the DMA counter or address. For a 32KB “sector” read, the DMA address will equal the starting address plus 32768 if the operation was successful.
Since the first bytes were read from drive B:, and the swap of drives and bit rates takes a little time to settle, it is wise to discard the first 50 bytes read.
If the track to be read contains an IBM sector, drive swapping may not be necessary. Yet, bit rate swapping can be useful, especially for protected or non-standard tracks.
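The register values used in the steps above can be sketched as follows. This is a minimal illustration, not the author's program: it performs no port I/O and only computes the bytes that would be written. The helper names (dor_value, size_code, read_track_command, port_writes, read_succeeded) are hypothetical, and the EOT, gap length and DTL values are assumed placeholders. The example assumes A: is drive 0, B: is drive 1, and that the disk in B: is a double-density IBM disk.

```python
# Sketch only: these helpers compute register values and perform no port I/O.
DOR_PORT = 0x3F2  # Digital Output Register
CCR_PORT = 0x3F7  # Configuration Control Register (write side)

# Data-rate codes written to bits 0-1 of the CCR.
RATE_CODES = {500_000: 0b00, 300_000: 0b01, 250_000: 0b10}


def dor_value(drive, motors):
    """Build a DOR byte: bits 0-1 select the drive, bit 2 keeps the controller
    out of reset, bit 3 enables DMA/IRQ, bits 4-7 switch on motors 0-3."""
    value = (drive & 0x03) | 0x04 | 0x08
    for motor in motors:
        value |= 0x10 << motor
    return value


def size_code(nbytes):
    """Return the FDC size code N such that 128 << N == nbytes."""
    n = (nbytes // 128).bit_length() - 1
    if nbytes < 128 or (128 << n) != nbytes:
        raise ValueError("sector size must be 128 times a power of two")
    return n


def read_track_command(drive, cylinder, head, sector, nbytes, gap=0x1B):
    """Nine command bytes of an MFM Read Track command.
    EOT, gap length and DTL are assumed placeholder values."""
    return [
        0x42,                 # Read Track opcode (02h) with the MFM bit (40h)
        (head << 2) | drive,  # head and drive select
        cylinder, head, sector,
        size_code(nbytes),    # oversized "sector" length code
        sector,               # EOT (assumption: same as the sector number)
        gap,                  # gap length (assumption)
        0xFF,                 # DTL, unused when the size code is non-zero
    ]


def port_writes(source=0, reference=1, reference_rate=250_000, raw_rate=500_000):
    """List the (port, value) writes of the procedure, in order. Seeking, the
    Read Track command and the DMA polling happen between these writes."""
    motors = (source, reference)
    return [
        (DOR_PORT, dor_value(source, motors)),     # select A:, both motors on
        (DOR_PORT, dor_value(reference, motors)),  # swap to B: after seeking
        (CCR_PORT, RATE_CODES[reference_rate]),    # rate of the IBM disk in B:
        (DOR_PORT, dor_value(source, motors)),     # swap back once DMA has moved
        (CCR_PORT, RATE_CODES[raw_rate]),          # twice the track's bit rate
    ]


def read_succeeded(dma_start, dma_end, nbytes):
    """The DMA address must have advanced by exactly the 'sector' size."""
    return dma_end - dma_start == nbytes
```

With these assumptions, port_writes() yields 3Ch and 3Dh for the two drive selections on port 3F2h, and the rate codes 2 (250 000 bits/s) then 0 (500 000 bits/s) on port 3F7h.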
II. Example of data written to disk
Let’s take the example of a byte written to disk. Let this byte be 4Eh (ASCII character ‘N’). In binary, this is 01001110. We’ll consider it is written on a double-density disk (at 250 000 bits/s).
The MFM encoding will insert a synchronization bit between every two bits. A ‘0’ synchronization bit is placed if either of the two neighboring bits is a ‘1’ (or both). A ‘1’ synchronization bit is placed otherwise, i.e. if both neighboring bits have the value ‘0’.
In our example, the character ‘N’ is encoded this way (if the bit on the left was ‘0’):
0 1 0 0 1 1 1 0 The character ‘N’.
1 0 0 1 0 0 0 0 The MFM synchronization bits.
1 0 0 1 0 0 1 0 0 1 0 1 0 1 0 0 The resulting data, written to disk.
In hexadecimal notation, the written data corresponds to 9254h.
This encoding ensures that not too many ‘0’s are written consecutively. A ‘0’ corresponds to no change in the magnetization on the surface of the disk, and a ‘1’ corresponds to a change in the magnetization. If too many ‘0’s were written consecutively, the FDC could lose synchronization with the data stream from the disk drive when reading.
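The worked example above can be reproduced with a few lines of Python. This is a minimal sketch of the encoding rule as stated in this section; the function name mfm_encode and its prev_bit parameter (the data bit that precedes the first byte) are illustrative choices.

```python
def mfm_encode(data, prev_bit=0):
    """Return the MFM bit cells for `data` as an integer, 16 cells per byte.
    Each data bit is preceded by a synchronization (clock) bit that is 1
    only when both neighboring data bits are 0."""
    cells = 0
    for byte in data:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            clock = 1 if (prev_bit == 0 and bit == 0) else 0
            cells = (cells << 2) | (clock << 1) | bit
            prev_bit = bit
    return cells


print(f"{mfm_encode(b'N'):04X}h")  # prints 9254h
```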
The FDC 765 encodes data using MFM, and it likewise always decodes data on the assumption that it was MFM-encoded.
The FDC first looks for three standard MFM synchronization words. These are different from the synchronization bits of the MFM encoding. The standard MFM synchronization word has the encoded value 4489h (which would decode to A1h), but it deliberately breaks the MFM encoding rule: a regular MFM encoding of A1h would be 44A9h. As a result, the value 4489h is never produced when normal data is encoded. The FDC uses this word to find the start of sector headers and sector data.
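To see how 4489h differs from a regular encoding of A1h, the 16 cells can be split into their synchronization (clock) bits and data bits. The sketch below uses two illustrative helpers, split_cells and expected_clock, which are not part of any FDC interface.

```python
def split_cells(word):
    """Split a 16-cell MFM word into (clock_byte, data_byte).
    Cells alternate clock, data, clock, data, ... starting from the left."""
    clock = data = 0
    for i in range(15, -1, -2):
        clock = (clock << 1) | ((word >> i) & 1)
        data = (data << 1) | ((word >> (i - 1)) & 1)
    return clock, data


def expected_clock(data, prev_bit=0):
    """Clock bits that the regular MFM rule would produce for `data`."""
    clock = 0
    for i in range(7, -1, -1):
        bit = (data >> i) & 1
        clock = (clock << 1) | (1 if (prev_bit == 0 and bit == 0) else 0)
        prev_bit = bit
    return clock


for word in (0x44A9, 0x4489):
    clock, data = split_cells(word)
    regular = clock == expected_clock(data)
    print(f"{word:04X}h: data {data:02X}h, clock {clock:02X}h, regular MFM: {regular}")
```

Both words carry the data byte A1h, but 4489h has clock bits 0Ah where the regular rule gives 0Eh: one clock bit is missing, which is what makes the word recognizable.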
One of the many differences between IBM and Amiga sectors, for example, is that the IBM format requires three synchronization words, whereas the Amiga format requires only two.
III. What is read of these data based on various methods
When using the technique described in this paper, the FDC will decode data as it does usually. This will of course result in a buffer where only half the full data is present.
Since the data read from the disk in drive A: were aligned not with the synchronization words of a sector header on the disk in drive A:, but with those on the disk in drive B:, the data read can be misaligned. The probability of misalignment is 50%.
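One way to picture the two possible alignments is to model normal decoding as keeping every second bit cell, which is a simplification of what the FDC does. In the sketch below, every_second_bit is a hypothetical helper and the input is the 9254h example from section II.

```python
def every_second_bit(cells, ncells, phase):
    """Pack every second cell of an `ncells`-cell stream into an integer,
    starting at cell `phase` (0 or 1), counting from the left."""
    out = 0
    for pos in range(phase, ncells, 2):
        out = (out << 1) | ((cells >> (ncells - 1 - pos)) & 1)
    return out


raw = 0x9254  # MFM cells of the character 'N' (4Eh)
print(f"{every_second_bit(raw, 16, 1):02X}h")  # 4Eh: aligned, the data bits
print(f"{every_second_bit(raw, 16, 0):02X}h")  # 90h: misaligned, the synchronization bits
```

With only two possible phases, a decoder that locks onto an unrelated header has an even chance of landing on either one.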
Two different methods can be considered when reading data.
III.1. “Normal” density reading
...
III.2. “Double” density reading
...
V. What to do with full data from a track
When the full data from a track are extracted, they can be either stored in this raw format for later use, or analyzed.
One of the first tasks is to detect synchronization words (4489h) in the case of a standard MFM track (IBM, Amiga, …). Sector headers, sector data, and gap areas can be parsed to create a higher level, structured buffer.
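Detecting the synchronization words can be sketched as a bit-level search, since a raw track buffer is not necessarily byte-aligned with the data on disk. The function name find_sync_words is illustrative, and the buffer is assumed to hold the raw bit cells most significant bit first.

```python
SYNC = 0x4489  # standard MFM synchronization word (encoded value)


def find_sync_words(raw):
    """Return the bit offsets at which the 16-cell pattern 4489h starts.
    `raw` is a bytes-like buffer of bit cells, most significant bit first."""
    offsets = []
    window = 0
    for index, byte in enumerate(raw):
        for i in range(7, -1, -1):
            # Shift the newest cell into a sliding 16-cell window.
            window = ((window << 1) | ((byte >> i) & 1)) & 0xFFFF
            bitpos = index * 8 + (7 - i)
            if bitpos >= 15 and window == SYNC:
                offsets.append(bitpos - 15)
    return offsets


print(find_sync_words(bytes([0x44, 0x89, 0x44, 0x89])))  # [0, 16]
print(find_sync_words(bytes([0x02, 0x24, 0x48])))        # [5]
```

Consecutive offsets that are 16 cells apart indicate a run of synchronization words, which is where a sector header or data field can be expected to start.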
As a last step, standard tracks can be detected (9-sector IBM track, 11-sector Amiga track, …) and the buffer reduced to the decoded sector data only.
In any case, the track buffer should be saved to disk in an appropriate format. It can also be exploited directly, for example within an emulator or a file manager.