Yes, the randomization is done behind the scenes; otherwise it wouldn't be a true double-blind test. The entire point of an ABX test is to find out whether you can tell the difference between two tracks. Typically, one track is lossless and the other is a lossy encode of it (MP3, for example), so you are testing the transparency of the lossy codec. But you could also compare MP3 vs. AAC, or MP3s at different bitrates. It doesn't really matter what is being compared; the key is whether you can detect any difference. If you can't, then the codec, the bitrate, etc. isn't critical to that listener. ABX doesn't say one track is better; it merely tests whether the listener can detect that the two tracks are *different*. And it turns out that with normal music it is quite difficult to detect differences between lossless originals and lossy versions, so long as the lossy version is good enough (which for many people means a bitrate of about 192 kbps or more). More here:
http://en.wikipedia.org/wiki/ABX_test
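
For what it's worth, here is a rough Python sketch of the idea: X is picked at random out of the listener's sight on every trial, and the final score is compared against what pure guessing would give. The names `run_abx`, `answer_fn` and `p_value` are made up for illustration, and the listener is simulated by a callback instead of real audio playback, so treat it as a toy under those assumptions and not as how any particular ABX tool works.

```python
import random
from math import comb


def run_abx(n_trials, answer_fn, rng=None):
    """Run n_trials ABX trials and return how many the listener got right.

    answer_fn is a hypothetical stand-in for the listener: it receives the
    hidden identity of X (in a real test the listener would only hear X)
    and returns the guess, "A" or "B".
    """
    rng = rng or random.Random()
    correct = 0
    for _ in range(n_trials):
        x = rng.choice("AB")   # hidden assignment, made behind the scenes
        guess = answer_fn(x)   # listener says whether X is A or B
        if guess == x:
            correct += 1
    return correct


def p_value(correct, n_trials):
    """Probability of scoring at least `correct` by pure guessing (p = 0.5)."""
    ways = sum(comb(n_trials, k) for k in range(correct, n_trials + 1))
    return ways / 2 ** n_trials


if __name__ == "__main__":
    rng = random.Random()
    # Simulated listener who hears no difference and just guesses.
    score = run_abx(16, lambda x: rng.choice("AB"), rng)
    print(score, "of 16 correct, p =", round(p_value(score, 16), 4))
```

By this calculation, a listener who gets 12 of 16 right would do at least that well by guessing about 3.8% of the time, while someone hovering around 8 of 16 is indistinguishable from a coin flip.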