I was working on a contract where we were going to sell SCSI-connected floppy disk drives. We got a drive in, but it didn't work, so the problem was assigned to me. I added some debugging code, and it looked like the drive was wonky. The computer would send a "read" command, and the device would respond "OK". This confused the computer deeply, as it expected the drive to return either a block of data followed by "OK", or an error. Just responding "OK" didn't make sense. All the other devices on the bus worked just fine.

The hardware engineer on the project rented a SCSI analyzer, which also insisted the drive was misbehaving. It clearly showed the read command going out, and the "OK" status coming back. So we ordered another drive, and when it arrived, it did exactly the same thing. Monitoring other devices showed the expected behaviour: send a read command, get data, then get the OK status.

I, however, did not trust the SCSI analyzer. It operated on the assumption that everything was operating according to specifications, and was designed to show what data was going back and forth, not investigate weird protocol violations.

Accordingly, I went and rounded up a logic analyzer, which just shows the raw signals and does not interpret them at all. It is more effort to figure out what's going on from the raw logic levels, but the logic analyzer doesn't hide anything either. And sure enough, when I puzzled out what the logic analyzer was telling me, it became clear what was happening. The computer would put the read command on the bus, one byte at a time, assert the strobe signal to indicate that the command byte was ready to read, take away the byte, and wait for the "ack" (acknowledge) signal back from the target device. And this is wrong. What it should do is leave the byte on the bus until it gets the ack back.

The SCSI control chip in the computer was very simple, and did not do the signal sequencing itself, depending on its device driver to do so. And the device driver source code (fortunately, we had access to it) showed the same sequence of events: put data on bus, assert strobe, take data away, wait for ack. So I swapped two lines in the driver, so it would put the data on the bus, assert strobe, wait for ack, and then take the data away.
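The ordering problem is easy to see in a toy model. This is only a sketch, not the actual driver code: the `Bus` and `SlowTarget` classes and the `send_buggy` / `send_fixed` functions are made-up names, the target is assumed to read the data lines only at the moment it acknowledges, and the example command is a standard 6-byte READ(6) command block (opcode 0x08) with an arbitrary block address.

```python
IDLE = 0x00  # value the data lines read as when nobody is driving them


class Bus:
    """Toy model of the data lines plus the strobe signal."""

    def __init__(self):
        self.data = IDLE
        self.strobe = False


class SlowTarget:
    """Hypothetical target that samples the data lines only when it acks."""

    def __init__(self):
        self.received = []

    def ack(self, bus):
        self.received.append(bus.data)


def send_buggy(bus, target, byte):
    """Original order: data, strobe, take data away, wait for ack."""
    bus.data = byte
    bus.strobe = True
    bus.data = IDLE      # byte removed before the target has acknowledged
    target.ack(bus)      # "wait for ack": the target samples the bus here
    bus.strobe = False


def send_fixed(bus, target, byte):
    """Corrected order: data, strobe, wait for ack, take data away."""
    bus.data = byte
    bus.strobe = True
    target.ack(bus)      # byte is still on the bus when the target samples it
    bus.data = IDLE
    bus.strobe = False


def send_command(send, command):
    """Send every byte of a command block and return what the target saw."""
    bus, target = Bus(), SlowTarget()
    for byte in command:
        send(bus, target, byte)
    return bytes(target.received)


# READ(6): opcode 0x08, block address 0x000010, one block, control byte 0.
READ6 = bytes([0x08, 0x00, 0x00, 0x10, 0x01, 0x00])

print(send_command(send_buggy, READ6).hex())
print(send_command(send_fixed, READ6).hex())
```

With the buggy order the slow target sees six zero bytes; with the two lines swapped it sees the read command intact.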

And lo, the SCSI floppy disk drive started to work perfectly! The remaining question was, why did the other devices work? My theory is that the other, fancier, devices had hardware SCSI interfaces that latched the incoming command bytes immediately upon strobe, so they didn't care that the data went away immediately afterward. The floppy drive, on the other hand, implemented its SCSI interface with a microcontroller. The strobe signal would send an interrupt to the microcontroller, which would then go read the data byte off the SCSI bus. Unfortunately, by the time it got around to it, the data was gone, and the bus terminators had pulled the data lines back to their idle state of zero. And, sure enough, a SCSI command block of all zeroes is a valid command: "test unit ready", for which the correct response is simply "OK".
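To make that theory concrete, here is a rough sketch of the two kinds of target. The tick counts are invented for illustration (nobody measured the real hold time or interrupt latency), and `sample`, `receive`, and `describe` are hypothetical helper names. The two opcodes are the standard ones: 0x00 is TEST UNIT READY and 0x08 is READ(6).

```python
IDLE = 0x00  # what the data lines read as once the byte has been removed

# Only the two opcodes that matter for this story.
OPCODES = {0x00: "TEST UNIT READY", 0x08: "READ(6)"}


def sample(byte, hold_ticks, latency_ticks):
    """Value a target sees if it reads the bus latency_ticks after strobe,
    given that the byte stays valid for hold_ticks after strobe."""
    return byte if latency_ticks <= hold_ticks else IDLE


def receive(command, hold_ticks, latency_ticks):
    """Command block as seen by a target with the given read latency."""
    return bytes(sample(b, hold_ticks, latency_ticks) for b in command)


def describe(cdb):
    """Name the command from its first byte (the opcode)."""
    return OPCODES.get(cdb[0], "unknown")


READ6 = bytes([0x08, 0x00, 0x00, 0x10, 0x01, 0x00])

HOLD = 1           # assumed: buggy driver removes the byte almost at once
LATCH_LATENCY = 0  # assumed: hardware interface latches right on strobe
MCU_LATENCY = 5    # assumed: interrupt handler gets there several ticks late

print(describe(receive(READ6, HOLD, LATCH_LATENCY)))
print(describe(receive(READ6, HOLD, MCU_LATENCY)))
```

The latching target decodes a read; the interrupt-driven one decodes an all-zero block, which is a perfectly valid "test unit ready", so it answers "OK" and sends no data.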

When I was working at the particle beam lab at the University of Maryland, I learned not to say whether I knew how to do something, but just to assume that I could learn how. One of the researchers asked me to build a high-voltage pulser with a very short risetime (less than a nanosecond).

This was a tall order, so I went to hit the books. It turned out that there was a special tube designed for just such a purpose, known as a krytron. This was an obscure beast, which actually used radioactive nickel to keep the gas in the tube partially ionized, ready to switch at any moment. It had originally been designed for firing explosives in nuclear weapons, and had been classified. But this was an advanced lab and had accumulated a great assortment of oddball parts. Some time spent asking questions and rummaging around actually managed to produce a krytron.

The tube was a little thing, about the size of a peanut. I built a charged transmission line setup, with the tube switching it into a 50 ohm load, and ran the thing at a few thousand pulses a second, using a sampling plug-in on an oscilloscope to fine-tune it and measure the actual risetime (which turned out to be an astonishing 370 picoseconds or so).

So I went to show the finished apparatus to the researcher, only to find out he had basically given me the assignment as a prank, figuring a young college student who wasn't even in his research program wouldn't be able to solve such a difficult and arcane problem. He'd also given it to one of his EE grad students, who'd assembled this huge board with a chain of "avalanche" transistors in series to do the switching. It took about ten minutes between pulses, and would fry the transistors every dozen cycles or so. My board had run for hours at thousands of pulses per second, and was still on the original tube.



November 2013
