Tes4Mod:BSA File Format
BSA files are the resource archive files used by Oblivion. The Oblivion BSA format is similar, but not identical, to the Morrowind BSA format. If you're familiar with the generic IFF format, you'll find it very easy to understand: BSA, ESM/ESP and ESS are IFF variants.
|version||ulong||Currently 103 (0x67).|
|offset||ulong||Offset of beginning of folder records. All headers are the same size, therefore this value is 36 (0x24)|
|archiveFlags||ulong||List of flags: (ones not shown here are considered unknown)
|folderCount||ulong||Count of all folders in archive.|
|fileCount||ulong||Count of all files in archive.|
|totalFolderNameLength||ulong||Total length of all folder names, including \0's but not including the prefixed length byte.|
|totalFileNameLength||ulong||Total length of all file names, including \0's.|
|fileFlags||ulong||Not sure yet -- seems to be a set of flags indicating what kinds of files are in the archive.
List of flags: (ones not shown here are considered unknown)
|folderRecords||Folder Record[folderCount]||See specification of Folder Record below.|
|fileRecordBlocks||File Record blocks[...]||See specification of File Record blocks below.|
|fileNameBlock||File Name block||A list of filenames. Each filename ends in \0. See specification of File Name block below.|
|files||data or specification of Compressed File block||Raw file data. If the file is compressed, its data follows the Compressed File block layout instead.|
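The header fields above can be read with Python's `struct` module. The following is a minimal sketch, not a reference implementation. The table lists eight ulongs (32 bytes) while the folder records are said to start at byte 36, so the sketch assumes a 4-byte identifier precedes `version`. The names `fileId`, `parse_header`, `HEADER_FORMAT` and `HEADER_FIELDS` are hypothetical.

```python
import struct

# Little-endian: an assumed 4-byte identifier followed by the eight ulongs
# from the header table. The identifier is an assumption made to account
# for the stated header size of 36 bytes.
HEADER_FORMAT = "<4s8I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4 + 8 * 4 = 36

HEADER_FIELDS = (
    "version", "offset", "archiveFlags", "folderCount", "fileCount",
    "totalFolderNameLength", "totalFileNameLength", "fileFlags",
)

def parse_header(data):
    """Return the header fields of a BSA as a dict (hypothetical helper)."""
    values = struct.unpack_from(HEADER_FORMAT, data, 0)
    header = dict(zip(HEADER_FIELDS, values[1:]))
    header["fileId"] = values[0]  # hypothetical name for the leading 4 bytes
    return header
```

A caller would typically check that `header["offset"]` equals `HEADER_SIZE` before reading the folder records.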
 Folder Record
|nameHash||hash||Hash of the folder name (e.g. menus\chargen). The name must be all lower case and use backslashes as directory delimiters.|
|count||ulong||Number of files in this folder.|
|offset||ulong||Offset to the name of this folder + totalFileNameLength. (An earlier description read: offset to file records for this folder, which seems to include totalFileNameLength.)|
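A folder record as described here is 16 bytes: an 8-byte hash followed by two ulongs. The sketch below reads a run of folder records and recovers the position of a folder's name by subtracting totalFileNameLength from the stored offset, as the description of the offset field suggests. The function names are hypothetical and the code has not been checked against a real archive.

```python
import struct

FOLDER_RECORD_FORMAT = "<QII"  # 64-bit hash, ulong count, ulong offset
FOLDER_RECORD_SIZE = struct.calcsize(FOLDER_RECORD_FORMAT)  # 16 bytes

def parse_folder_records(data, start, folder_count):
    """Read folder_count folder records beginning at byte position start."""
    records = []
    for i in range(folder_count):
        pos = start + i * FOLDER_RECORD_SIZE
        name_hash, count, offset = struct.unpack_from(FOLDER_RECORD_FORMAT, data, pos)
        records.append({"nameHash": name_hash, "count": count, "offset": offset})
    return records

def folder_name_position(record, total_file_name_length):
    """Undo the totalFileNameLength bias to get the position of the folder name."""
    return record["offset"] - total_file_name_length
```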
 File Record blocks
|name||bstring||Name of the folder, ends with \0.|
|fileRecords||variable (File Record)||One File Record per file; the number of records is the count given in the associated folder record.|
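Reading the folder name at the start of a File Record block might look like the sketch below. It assumes that the prefixed length byte counts the trailing \0 and that names are plain ASCII; neither is stated on this page, so treat both as assumptions. `read_folder_name` is a hypothetical helper.

```python
def read_folder_name(data, pos):
    """Read a length-prefixed, \\0-terminated name starting at pos.

    Returns (name, position of the first byte after the name).
    Assumes the length byte includes the trailing \\0.
    """
    length = data[pos]                        # prefixed length byte
    raw = data[pos + 1 : pos + 1 + length]    # characters plus trailing \0
    name = raw.rstrip(b"\x00").decode("ascii")
    return name, pos + 1 + length
```

The returned position is where the file records for that folder would begin.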
 File Record
|nameHash||hash||Hash of the file name (e.g. race_sex_menu.xml). Must be all lower case.|
|size||ulong||Size of the file data.
If the (1<<30) bit is set in the size:
If the file is compressed, the file data follows the Compressed File block layout. In that case the size of the compressed data is taken to be the ulong "original size" plus the compressed data size (4 + compressed size).
|offset||ulong||Offset to the raw file data for this file. Note that an "offset" is measured from file byte zero (the start of the archive), NOT from this location.|
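As a rough illustration of the File Record layout, the sketch below unpacks one record, separates the (1<<30) bit from the size, and slices the file data using the absolute offset. How the bit is interpreted is left to the caller. The names `parse_file_record`, `read_file_data` and `sizeFlag` are hypothetical.

```python
import struct

SIZE_FLAG_BIT = 1 << 30  # the flag bit carried in the size field

def parse_file_record(data, pos):
    """Unpack one 16-byte file record: 64-bit hash, ulong size, ulong offset."""
    name_hash, raw_size, offset = struct.unpack_from("<QII", data, pos)
    return {
        "nameHash": name_hash,
        "size": raw_size & ~SIZE_FLAG_BIT,           # size with the flag bit cleared
        "sizeFlag": bool(raw_size & SIZE_FLAG_BIT),  # True if the (1<<30) bit is set
        "offset": offset,
    }

def read_file_data(archive, record):
    """Slice the stored bytes; the offset counts from byte zero of the archive."""
    start = record["offset"]
    return archive[start:start + record["size"]]
```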
 File Name block
A block of lower case file names, one after another, each ending in a \0. They appear in the same order as the file records in the File Record blocks of the archive. These are all the files contained in the archive, such as "cuirass.nif" and "cuirass.dds" (no paths, just the file names).
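Splitting the File Name block into individual names is straightforward. The sketch below assumes the block has already been read into a bytes object of totalFileNameLength bytes and that names are ASCII (an assumption); `parse_file_names` is a hypothetical helper.

```python
def parse_file_names(block):
    """Split a block of \\0-terminated names into a list of strings."""
    names = block.split(b"\x00")
    if names and names[-1] == b"":
        names.pop()  # drop the empty piece after the final terminator
    return [name.decode("ascii") for name in names]
```

Because the names follow the order of the file records, the n-th name in the returned list belongs to the n-th file record read from the File Record blocks.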
 Compressed File block
|originalSize||ulong||Size of uncompressed data.|
|data||ubyte[compressedSize]||File data that has been compressed with zlib.|
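A Compressed File block can be unpacked with Python's standard `zlib` module. This is a minimal sketch that assumes `block` holds exactly the bytes stored for one file (the 4-byte originalSize followed by the zlib stream); `read_compressed_block` is a hypothetical name.

```python
import struct
import zlib

def read_compressed_block(block):
    """Decompress a Compressed File block and check the recorded size."""
    (original_size,) = struct.unpack_from("<I", block, 0)
    data = zlib.decompress(block[4:])
    if len(data) != original_size:
        raise ValueError("decompressed size does not match originalSize")
    return data
```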
 Files and Dirs order
Inside a BSA, folders and the files within each folder must be sorted by hash value. Sort all folders first, then the files within each folder, keeping each folder's files contiguous. The hash is compared as a 64-bit unsigned integer.
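The ordering rule can be sketched as follows, assuming the hashes have already been computed and are held as non-negative Python integers (which compare as unsigned values). The input shape and the name `sort_for_archive` are hypothetical choices for this example.

```python
def sort_for_archive(folders):
    """Order folders, and the files inside each folder, by hash value.

    folders: list of (folder_hash, folder_name, files), where files is a
    list of (file_hash, file_name) tuples.
    """
    ordered = []
    for folder_hash, folder_name, files in sorted(folders, key=lambda f: f[0]):
        # Files stay grouped under their folder; only their order changes.
        ordered.append((folder_hash, folder_name, sorted(files, key=lambda f: f[0])))
    return ordered
```

When the hashes are read back into a language with signed 64-bit integers, they would need to be compared as unsigned for the order to match.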
 Encoding Numbers in the BSA
Numbers are encoded low byte to high byte (little-endian). A ulong of 0xABCDEF01 would have 0x01 in the first file byte, 0xEF in the second file byte, 0xCD in the third file byte, and 0xAB in the fourth and last file byte. This is true of the 64-bit (8 byte) hash value as well.
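The byte order described above corresponds to the `<` (little-endian) prefix of Python's `struct` module. The helper names below are hypothetical and only illustrate the encoding.

```python
import struct

def encode_ulong(value):
    """Encode a 32-bit unsigned integer low byte first."""
    return struct.pack("<I", value)

def encode_hash(value):
    """Encode a 64-bit hash value low byte first."""
    return struct.pack("<Q", value)

def decode_ulong(data, pos=0):
    """Decode a 32-bit unsigned integer stored low byte first."""
    return struct.unpack_from("<I", data, pos)[0]

# The example from the text: 0xABCDEF01 is stored as the bytes 01 EF CD AB.
example = encode_ulong(0xABCDEF01)
```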
 Hash Calculation
Here's a Python version of the hash calculation.
import os

def tesHash(fileName):
    """Returns tes4's two hash values for filename.
    Based on TimeSlip's code with cleanup and pythonization."""
    root, ext = os.path.splitext(fileName.lower())  #--"bob.dds" >> root = "bob", ext = ".dds"
    #--Hash1
    chars = list(map(ord, root))  #--'bob' >> chars = [98,111,98]
    #--The conditional expression avoids indexing chars[-2] when the root is too short.
    hash1 = chars[-1] | (chars[-2] if len(chars) > 2 else 0)<<8 | len(chars)<<16 | chars[0]<<24
    if ext == '.kf': hash1 |= 0x80
    elif ext == '.nif': hash1 |= 0x8000
    elif ext == '.dds': hash1 |= 0x8080
    elif ext == '.wav': hash1 |= 0x80000000
    #--Hash2
    #--Python integers have no upper limit. Use uintMask to restrict these to 32 bits.
    uintMask, hash2, hash3 = 0xFFFFFFFF, 0, 0
    for char in chars[1:-2]:  #--Slice of the chars array
        hash2 = ((hash2 * 0x1003F) + char) & uintMask
    for char in map(ord, ext):
        hash3 = ((hash3 * 0x1003F) + char) & uintMask
    hash2 = (hash2 + hash3) & uintMask
    #--Done
    return (hash2<<32) + hash1  #--Return as uint64