But how does it detect data like this? I can understand it if the file system is ISO9660 or UDF. On the other hand, Nintendo disc images (GCN and Wii; I am not sure about the Wii U as I don't have one) use a file system that some call FST.
All compressed formats from the PSP (CSO, ZSO, DAX and JSO) can generally be described as a file containing a header (with a magic number to identify the format, plus the information needed to read and decompress the data), followed by a table containing the offset of every block (a block is generally equal to a DVD sector, but it can be bigger). After that there might be more data, depending on the format, and finally the actual data (the blocks) starts.
To mark a block as not compressed, the various formats came up with different solutions:
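To make that layout a bit more concrete, here is a minimal Python sketch that parses a header and the block offset table. It assumes the CSO v1 layout as it is commonly documented: a 24-byte little-endian header (magic `CISO`, header size, uncompressed size, block size, version, index alignment, 2 reserved bytes), followed by one 32-bit entry per block plus one final entry marking the end of the data. The other formats order and size their fields differently. `parse_cso_header` and `read_offset_table` are names made up for this example.

```python
import struct

# Assumed CSO v1 header: magic, header_size, total_bytes, block_size,
# version, align, then 2 reserved (padding) bytes. Little-endian, 24 bytes.
HEADER_FMT = "<4sIQIBB2x"
HEADER_SIZE = struct.calcsize(HEADER_FMT)

def parse_cso_header(data):
    """Parse the header at the start of `data` and return its fields."""
    magic, header_size, total_bytes, block_size, version, align = \
        struct.unpack_from(HEADER_FMT, data, 0)
    if magic != b"CISO":
        raise ValueError("not a CSO file")
    # Round up so that a partial last block still gets a table entry.
    num_blocks = (total_bytes + block_size - 1) // block_size
    return {
        "total_bytes": total_bytes,
        "block_size": block_size,
        "version": version,
        "align": align,
        "num_blocks": num_blocks,
    }

def read_offset_table(data, header):
    """Return the raw 32-bit table entries (num_blocks + 1 of them)."""
    count = header["num_blocks"] + 1
    return list(struct.unpack_from("<%dI" % count, data, HEADER_SIZE))
```

The extra entry at the end of the table is what lets a reader compute the size of the last block as the difference between two consecutive offsets.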
- CSO/ZSO: the offset of the block (in the offset table at the beginning of the file, right after the header) is encoded so that the top bit of the offset is a flag telling whether the block is compressed or not. So let's say that block number 4 starts at offset 0x00001234, but the block is not compressed; its offset will then be encoded as 0x80001234.
- DAX: version 0 doesn't support non-compressed blocks, which actually results in some "compressed" blocks being bigger than when uncompressed. Version 1 uses something called NCAreas, which is data appended after the block offset table and before the blocks actually start. Parsing this information is a pain in the ass and I don't recommend it; you can use the same trick that JSO uses for determining NC blocks (see below).
- JSO: this one is quite simple, and can actually be applied to other formats (CSOv2 adopts this method). You just check the size of the "compressed" block: if it's equal to the size of an uncompressed block (or bigger, in CSOv2), then the "compressed" block is not actually compressed.
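The two checks above (flag bit and size comparison) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the table is taken to be a plain list of 32-bit entries, and for the JSO-style check I am assuming the entries are plain offsets with nothing else packed into them. The `align` shift is part of the CSO header as I understand it and defaults to 0 here.

```python
CSO_PLAIN_FLAG = 0x80000000   # top bit set: block is stored uncompressed
CSO_OFFSET_MASK = 0x7FFFFFFF  # remaining 31 bits: the actual offset

def cso_block_info(table, i, align=0):
    """CSO/ZSO style: return (file_offset, stored_size, is_compressed)."""
    raw = table[i]
    start = (raw & CSO_OFFSET_MASK) << align
    # The next entry (flag stripped) tells us where this block ends.
    end = (table[i + 1] & CSO_OFFSET_MASK) << align
    return start, end - start, not (raw & CSO_PLAIN_FLAG)

def jso_block_info(table, i, block_size):
    """JSO style: a stored size of block_size or more means 'not compressed'."""
    start, end = table[i], table[i + 1]
    size = end - start
    return start, size, size < block_size
```

With the example from the list, an entry of 0x80001234 gives a file offset of 0x1234 and `is_compressed == False`.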
So the way a CSO/ZSO/DAX/JSO reader would work (very simplified) is as follows:
- Game requests reading N sectors starting with sector S.
- The CSO reader checks the block offset table to get the starting offset of sector S and the ending offset (the table entry for sector S+N, which is where the last requested sector, S+N-1, ends). So it now knows exactly how much compressed data it needs to read (end_offset - first_sector_offset).
- Now the reader has to iterate over every sector and handle it: if it's compressed, decompress it; if not, memcpy it.
- The end result is the game obtaining the data exactly as if it were a RAW sector read.
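Here is how those steps could look in Python, as a very simplified sketch rather than a real reader. It assumes CSO-style table entries (top bit means "not compressed"), raw deflate streams for the compressed blocks, an alignment of 0, and offsets that are relative to the start of the block data (there is no header in this toy image). `build_image` and `read_sectors` are names invented for the example, and `build_image` only exists so there is something to read back.

```python
import zlib

SECTOR = 2048
PLAIN = 0x80000000        # flag bit: block stored uncompressed
OFFSET_MASK = 0x7FFFFFFF

def build_image(sectors):
    """Pack sectors into (image, table), keeping a sector raw if deflate doesn't shrink it."""
    image, table = bytearray(), []
    for s in sectors:
        c = zlib.compressobj(9, zlib.DEFLATED, -15)  # -15: raw deflate, no zlib header
        packed = c.compress(s) + c.flush()
        if len(packed) < len(s):
            table.append(len(image))
            image += packed
        else:
            table.append(len(image) | PLAIN)
            image += s
    table.append(len(image))  # final entry marks the end of the last block
    return bytes(image), table

def read_sectors(image, table, first, count):
    """Return `count` decompressed sectors starting at sector `first`."""
    # One contiguous read: from the start of block `first`
    # up to the start of block `first + count`.
    start = table[first] & OFFSET_MASK
    end = table[first + count] & OFFSET_MASK
    chunk = image[start:end]
    out = bytearray()
    for i in range(first, first + count):
        lo = (table[i] & OFFSET_MASK) - start
        hi = (table[i + 1] & OFFSET_MASK) - start
        block = chunk[lo:hi]
        if table[i] & PLAIN:
            out += block                         # the "memcpy" case
        else:
            out += zlib.decompress(block, -15)   # raw deflate
    return bytes(out)

sectors = [bytes([n]) * SECTOR for n in range(4)]
image, table = build_image(sectors)
assert read_sectors(image, table, 1, 2) == sectors[1] + sectors[2]
```

A real reader would do the single contiguous read from the file (seek plus read) instead of slicing an in-memory buffer, but the bookkeeping with the offset table is the same.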