cascette-rs Documentation
About This Project
cascette-rs is part of the wowemulation-dev initiative to build open source tooling for World of Warcraft emulation. The project focuses on modern WoW Classic clients (1.13+, 2.5+, 3.4+) which use Blizzard’s NGDP content distribution system.
Why Modern Clients?
The WoW emulation and modding community has historically focused on the 3.3.5a client from 2008. While functional, this approach has limitations:
- Outdated technology: MPQ archives, no content addressing, manual patching
- Fragmented tooling: Many tools exist only as abandoned Windows binaries
- Limited modding: Technical constraints restrict what can be modified
Modern Classic clients differ from 3.3.5a:
- Active development: Blizzard continues updating these clients
- Better architecture: NGDP/CASC enables content addressing and streaming
- Cross-platform: Same content format works on Windows, macOS, and Linux
- Preservation: Community CDN mirrors ensure historical builds remain available
What You Can Do with cascette-rs
For Emulator Developers
- Download specific WoW Classic builds for server development
- Extract game data files (DBCs, maps, models) for server-side use
- Verify client installations match expected versions
- Serve game content to clients via the Agent API
For Archivists
- Mirror complete WoW builds from Blizzard’s CDN
- Preserve historical game versions before they disappear
- Access builds from community CDN mirrors when Blizzard removes them
- Track version history across all WoW products
For Modders
- Extract assets from any WoW build for modification
- Understand file relationships through encoding and root manifests
- Work with modern file formats instead of legacy MPQ tools
- Build custom content distribution for modified clients
For Tool Developers
- Parse all NGDP/CASC binary formats with the cascette-formats library
- Build applications on top of cascette’s CDN and protocol layers
- Integrate CASC reading into existing toolchains
- Create cross-platform tools that work on Linux, macOS, and Windows
What is NGDP?
NGDP (Next Generation Distribution Pipeline) is Blizzard’s content distribution system. It replaced MPQ/P2P/Torrent distribution with World of Warcraft 6.0 in 2014.
For technical details, see NGDP on wowdev.wiki.
System Overview
NGDP consists of three components:
- Ribbit API: Provides product versions, CDN endpoints, and configuration data
- CDN Distribution: Delivers game content through HTTP/HTTPS
- Agent: Local HTTP service (port 1120) that manages downloads and installations
Key Differences from MPQ
- Distribution Method: CDN-based delivery instead of P2P/Torrent
- Content Addressing: Files identified by content hashes rather than names
- Update Mechanism: Incremental updates through partial file downloads
- Archive Format: CASC (Content Addressable Storage Container) replaces MPQ archives
- Content Protection: Encryption support for secure pre-release distribution
Benefits of NGDP
For Distribution
- Reduced Server Load: CDN infrastructure handles content delivery
- Faster Downloads: Users connect to nearest CDN nodes
- Incremental Updates: Only changed content needs downloading
- Parallel Downloads: Multiple files retrieved simultaneously
- Pre-release Distribution: Encrypted content can be distributed before launch
For Development
- Content Deduplication: Identical files stored once across versions
- Stream Installation: Games playable before download completes
- Platform Independence: Same content system across operating systems
Core Concepts
Content Addressing
Files are identified by MD5 hashes of their content. Identical content produces identical hashes, enabling deduplication, integrity verification, and cache efficiency.
System Files
NGDP uses metadata files to manage content:
- Root File: Maps game files to content keys
- Encoding File: Maps content to compressed versions
- Install Manifest: Defines installation requirements
- Download Manifest: Sets download priorities
BLTE Format
BLTE (Block Table Encoded) is the container format for game data. It supports:
- Block-based compression
- Multiple compression algorithms
- Encryption per block
- Chunked processing
Content Encryption
NGDP supports Salsa20 encryption for distributing content before its release date. Files can be pre-positioned on CDN while remaining inaccessible until decryption keys are provided.
Technical Specifications
- Byte Order: Big-endian (network byte order)
- Hash Algorithm: MD5 for content identification
- Key Size: 128-bit (16 bytes)
- Compression: zlib, lz4, and other algorithms per block
- Encryption: Salsa20 stream cipher for content protection
Format Organization
NGDP/CASC formats are organized by their storage location and usage context:
1. CDN Formats (Network/Remote)
Formats served by Blizzard CDN servers via HTTP/HTTPS.
2. CASC Formats (Local/Client)
Formats created and managed by the Battle.net client on local storage.
3. Shared Formats
Formats used in both CDN and local contexts.
Component Documentation
Service Discovery
Service discovery components handle version information, CDN endpoint discovery, and product configuration metadata:
- Ribbit Protocol - TCP-based discovery and version information API
- BPSV Format - Blizzard Pipe-Separated Values format for API responses
CDN Formats
Configuration Files (Text)
- Build Config - Build-specific settings (/config/{hash})
- CDN Config - CDN server and archive lists (/config/{hash})
- Product Config - Product settings and versions (/config/{hash})
- Patch Config - Differential patch information (/config/{hash})
Content Files (Binary)
Immutable, content-addressed files served from CDN:
- CDN Archives - BLTE containers with game content (/data/{prefix}/{hash}.archive)
- CDN Indices - Maps keys to archive locations (/data/{prefix}/{hash}.index)
- Encoding File - Maps content to encoding keys (/data/{prefix}/{hash})
- Root File - Maps files to content keys (/data/{prefix}/{hash})
- Install Manifest - Installation requirements (/data/{prefix}/{hash})
- Download Manifest - Download priorities (/data/{prefix}/{hash})
- Patch Archives - Delta patches (/patch/{prefix}/{hash}.archive)
- Patch Indices - Patch archive index (/patch/{prefix}/{hash}.index)
Modern Additions (WoW 8.2+)
- TVFS - Virtual file system manifest (via vfs-* fields in BuildConfig)
CASC Local Formats
Client-side storage structures created and managed by Battle.net:
Local Indices
- IDX Journal - Bucket-based local index (Data/indices/{bucket}.idx)
- Archive Groups - Combined archive index (client-generated optimization)
- Shadow Memory - Memory-mapped cache (Data/shmem)
Local Archives
- data.### - Combined CDN archives (Data/data/data.###)
- patch.### - Combined patch archives (Data/patch/patch.###)
Local Configuration
- .build.info - Local build configuration (root directory), BPSV-formatted
- DBCache - Hotfix database cache (Cache/ADB/*.bin)
Shared Formats
Container Formats
- BLTE Format - Block compression/encryption (all content storage)
- ESpec Format - Encoding specifications (compression definitions)
Cryptographic
- MD5 Keys - Content addressing (all key references)
- Salsa20 Encryption - Stream cipher (content protection)
- TACT Keys - Key management (decryption keys)
Supporting Systems
- CDN Architecture - Content distribution network structure
- CDN Mirroring - Historical preservation strategies
- FileDataId - Persistent file identification across builds
Format Relationships
CDN Download Flow
flowchart TB
subgraph Discovery
Ribbit["Ribbit (BPSV)"]
ProductConfig["Product Config"]
CDNConfig["CDN Config"]
BuildConfig["Build Config"]
end
subgraph Content
Archives["CDN Archives + Indices"]
Encoding["Encoding File"]
Root["Root File"]
Manifests["Install/Download Manifests"]
end
Ribbit --> ProductConfig --> CDNConfig --> BuildConfig
BuildConfig --> Archives
Archives --> Encoding --> Root
Root --> Manifests
Content Resolution
flowchart LR
subgraph Input
File["Filename/FileDataId"]
end
subgraph Lookup
Root["Root File"]
CKey["Content Key"]
Encoding["Encoding File"]
EKey["Encoding Key + ESpec"]
end
subgraph Retrieval
Index["CDN Index"]
Location["Archive Location"]
Archive["CDN Archive"]
BLTE["BLTE Data"]
end
subgraph Output
Decompress["Decompression"]
Raw["Raw Content"]
end
File --> Root --> CKey
CKey --> Encoding --> EKey
EKey --> Index --> Location
Location --> Archive --> BLTE
BLTE --> Decompress --> Raw
Glossary
Key terms used throughout this documentation. If you’re coming from the MPQ/3.3.5a modding scene, pay attention to the “MPQ Equivalent” notes.
Content Identification
Content Key (CKey)
MD5 hash of a file’s uncompressed content. Used to identify files regardless of how they’re compressed or stored.
- Size: 16 bytes (128 bits)
- MPQ Equivalent: Similar to how MPQ uses filenames, but content-based
- Example: a1b2c3d4e5f6... (32 hex characters)
Encoding Key (EKey)
MD5 hash of a file’s compressed/encoded BLTE data. Used to locate files on CDN and in archives.
- Size: 16 bytes (128 bits)
- Relationship: CKey → Encoding File → EKey
- Example: Files with identical content share a CKey but may have different EKeys
FileDataID (FDID)
Numeric identifier for a file, persistent across game versions. Replaced filename-based lookups in WoW 8.0+.
- Size: 4 bytes (32-bit integer)
- Range: 0 to ~4 million (as of 2024)
- MPQ Equivalent: None - MPQ used filenames exclusively
- Example: 1234567 refers to a specific texture, model, or data file
Name Hash
Jenkins96 hash of a file’s path. Used in older builds (pre-8.0) to look up files by name.
- Algorithm: Jenkins96 (lookup3)
- MPQ Equivalent: Similar to MPQ’s hash table for filename lookup
- Note: Deprecated in favor of FileDataID in modern builds
File Formats
BLTE (Block Table Encoded)
Container format that wraps all CASC content. Provides compression and optional encryption.
- MPQ Equivalent: Similar to MPQ’s sector-based compression
- Key difference: BLTE supports multiple compression algorithms per file
- Compression: None, zlib, LZMA, LZ4, Zstd
- Encryption: Salsa20, ARC4 (older builds)
Encoding File
Maps CKeys to EKeys. The central lookup table for content resolution.
- Purpose: Find where a file’s compressed data lives
- MPQ Equivalent: None - MPQ stored files directly by name
Root File
Maps FileDataIDs (or name hashes) to CKeys. The entry point for file lookup.
- Purpose: Find what content hash a file has
- MPQ Equivalent: Combines MPQ’s hash table and block table functions
- Contains: FileDataID, locale flags, content flags, CKey
Install Manifest
Lists files required for a minimal installation (enough to launch the game).
- Purpose: Prioritize essential files for streaming installs
- MPQ Equivalent: None - MPQ required full downloads
Download Manifest
Prioritizes files for background downloading after initial install.
- Purpose: Order non-essential downloads by importance
- MPQ Equivalent: None
Storage Concepts
Archive
Large file containing many compressed files, identified by EKey.
- CDN archives: ~256 MB bundles served via HTTP
- Local archives: data.xxx files in the Data folder
- MPQ Equivalent: Similar to .mpq files, but content-addressed
Archive Index
Maps EKeys to offsets within an archive file.
- CDN index: .index file paired with each archive
- Local index: .idx files in Data/indices/
- MPQ Equivalent: Similar to MPQ’s block table
Archive Group
Combined index covering multiple archives. Optimization for faster lookups.
- Location: Generated locally by the client from downloaded archive indices
- Purpose: Single lookup instead of checking each archive index
- Note: Never downloaded from CDN - always client-generated
CASC (Content Addressable Storage Container)
The local storage system. Everything is identified by content hash.
- MPQ Equivalent: Replaces MPQ archives entirely
- Key difference: Files found by hash, not by name
Network Concepts
CDN (Content Delivery Network)
Servers that host game content. Blizzard uses Akamai, Level3, and others.
- Structure: https://{cdn}/{product}/{type}/{hash[:2]}/{hash[2:4]}/{hash}
- Types: config, data, patch
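The URL structure above can be sketched as a small helper that derives the two prefix directories from the hash. The function name cdn_path and the example product path are assumptions made for illustration and are not taken from cascette-rs.

```rust
/// Builds the path portion of a CDN URL from a hex-encoded hash.
/// `content_type` is expected to be "config", "data", or "patch".
/// Returns None unless the hash is exactly 32 ASCII hex characters.
pub fn cdn_path(product_path: &str, content_type: &str, hash: &str) -> Option<String> {
    if hash.len() != 32 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Lowercase so the prefix directories match the file name
    let hash = hash.to_ascii_lowercase();
    Some(format!(
        "{}/{}/{}/{}/{}",
        product_path,
        content_type,
        &hash[0..2], // hash[:2]
        &hash[2..4], // hash[2:4]
        hash
    ))
}
```

Prepending https://{cdn}/ to the returned path gives the full URL for a given CDN host.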
Ribbit
Protocol for querying product versions and CDN information.
- Port: 1119 (TCP) or HTTP
- Purpose: Discover what versions exist and where to download them
- MPQ Equivalent: None - MPQ versions were distributed manually
Agent
Local HTTP service (port 1120) that manages downloads and installations.
- Purpose: Background downloading, installation management
- MPQ Equivalent: None - MPQ required manual patching
Configuration
Build Config
Per-build settings including root/encoding file hashes and encryption keys.
- Location: CDN /config/{hash}
- Contains: Root CKey, encoding CKey, patch info, VFS info
CDN Config
Lists available CDN servers and archive hashes.
- Location: CDN /config/{hash}
- Contains: Archive list, server URLs, file groups
Product Config
Product-wide settings spanning multiple builds.
- Location: CDN /config/{hash}
- Contains: Decryption keys, feature flags
Encryption
TACT Key
Encryption key for protected content. Named keys are published, unnamed are secret.
- Size: 16 bytes
- Algorithm: Used with Salsa20 stream cipher
- Source: Community-maintained key databases
Salsa20
Stream cipher used for content encryption in modern builds.
- Key size: 128 bits (16-byte TACT key); the nonce is derived from the IV stored with each encrypted BLTE block
- Replaces: ARC4 (used in older builds)
MPQ to CASC Quick Reference
| MPQ Concept | CASC Equivalent |
|---|---|
| .mpq file | Archive (data.xxx) |
| Filename | FileDataID or CKey |
| Hash table | Root file |
| Block table | Archive index |
| Sector compression | BLTE blocks |
| Patch MPQ | Patch archives + encoding |
| listfile.txt | Community listfiles |
| Manual patching | Agent + CDN |
Encoding File Format
The encoding file is the gateway to all CASC content. It maps content keys (unencoded file hashes) to encoding keys (encoded/compressed file hashes) and provides essential metadata for content resolution.
Overview
The encoding file serves multiple critical functions:
- Content Resolution: Maps content keys to encoding keys for CDN retrieval
- Compression Metadata: Specifies ESpec encoding for each file
- Size Information: Tracks both compressed and decompressed sizes
- Multi-Version Support: Handles multiple encoding keys per content key
File Structure
The encoding file is BLTE-encoded and consists of:
[BLTE Container]
[Header] (22 bytes)
[ESpec Table] (variable)
[CKey Page Index] (variable)
[CKey Pages] (variable)
[EKey Page Index] (variable)
[EKey Pages] (variable)
[File ESpec] (variable) - The encoding file's own ESpec
Binary Format
Header (22 bytes)
struct EncodingHeader {
uint16_t magic; // 0x00: 'EN' (0x454E)
uint8_t version; // 0x02: Version (1)
uint8_t ckey_size; // 0x03: Content key size (16)
uint8_t ekey_size; // 0x04: Encoding key size (16)
uint16_t ckey_page_size; // 0x05: CKey page size in KB (BE)
uint16_t ekey_page_size; // 0x07: EKey page size in KB (BE)
uint32_t ckey_page_count; // 0x09: Number of CKey pages (BE)
uint32_t ekey_page_count; // 0x0D: Number of EKey pages (BE)
uint8_t flags; // 0x11: Flags (must be 0)
uint32_t espec_size; // 0x12: ESpec table size (BE)
};
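As a minimal sketch of reading this layout, the following parses the 22-byte header from a byte slice using the offsets listed above. The struct and function names are hypothetical and do not reflect the cascette-formats API; error handling is reduced to strings for brevity.

```rust
/// Parsed encoding header (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct EncodingHeader {
    pub version: u8,
    pub ckey_size: u8,
    pub ekey_size: u8,
    pub ckey_page_size_kb: u16,
    pub ekey_page_size_kb: u16,
    pub ckey_page_count: u32,
    pub ekey_page_count: u32,
    pub flags: u8,
    pub espec_size: u32,
}

/// Parses the fixed 22-byte header; all multi-byte fields are big-endian.
pub fn parse_encoding_header(data: &[u8]) -> Result<EncodingHeader, String> {
    if data.len() < 22 {
        return Err(format!("need 22 bytes, got {}", data.len()));
    }
    if &data[0..2] != b"EN" {
        return Err("bad magic, expected 'EN'".to_string());
    }
    let be16 = |o: usize| u16::from_be_bytes([data[o], data[o + 1]]);
    let be32 = |o: usize| u32::from_be_bytes([data[o], data[o + 1], data[o + 2], data[o + 3]]);
    Ok(EncodingHeader {
        version: data[0x02],
        ckey_size: data[0x03],
        ekey_size: data[0x04],
        ckey_page_size_kb: be16(0x05),
        ekey_page_size_kb: be16(0x07),
        ckey_page_count: be32(0x09),
        ekey_page_count: be32(0x0D),
        flags: data[0x11],
        espec_size: be32(0x12),
    })
}
```

Reading by explicit offset avoids relying on struct packing, since the header has fields at odd offsets.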
ESpec String Table
Immediately follows the header. Contains null-terminated strings referenced by entries:
"z\0b:{0,4}\0b:{0,4},z\0b:{0,2},z:{0,6}\0...\0"
Common ESpec patterns:
- z - ZLib compression
- n - No compression
- b:{start,size} - Block encoding (see ESpec)
- Empty string for uncompressed files
Page Index Tables
CKey Page Index
For each CKey page:
struct PageIndex {
uint8_t first_key[ckey_size]; // First key in the page
uint8_t page_hash[16]; // MD5 of the page data
};
EKey Page Index
Similar structure but uses ekey_size for the first key.
Content Key (CKey) Pages
Pages are sorted by content key for binary search. Each page contains multiple entries:
struct CKeyEntry {
uint8_t ekey_count; // Number of encoding keys
uint8_t file_size[5]; // Decompressed size (40-bit BE)
uint8_t ckey[ckey_size]; // Content key
uint8_t ekeys[ekey_size * ekey_count]; // Encoding keys
};
Entry layout (sizes from header):
[count:1] [size:5] [ckey:ckey_size] [ekey1:ekey_size] [ekey2:ekey_size] ...
Multiple EKeys: A single content key can map to multiple encoding keys, allowing:
- Different compression algorithms for the same content
- Regional variations with different encryption
- Platform-specific optimizations
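The entry layout above, including the 40-bit big-endian size, could be decoded roughly as follows. This is a sketch under stated assumptions: the type and function names are hypothetical, and an ekey_count of 0 is treated as padding that ends the page.

```rust
/// One parsed CKey page entry (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct CKeyEntry {
    pub file_size: u64,
    pub ckey: Vec<u8>,
    pub ekeys: Vec<Vec<u8>>,
}

/// Reads a big-endian integer from a short slice (5 bytes for 40-bit sizes).
fn read_u40_be(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
}

/// Parses one entry from the start of `page`.
/// Returns the entry and the number of bytes consumed, or None on
/// padding (ekey_count == 0), truncated data, or zero key sizes.
pub fn parse_ckey_entry(
    page: &[u8],
    ckey_size: usize,
    ekey_size: usize,
) -> Option<(CKeyEntry, usize)> {
    if ckey_size == 0 || ekey_size == 0 {
        return None;
    }
    let ekey_count = usize::from(*page.first()?);
    if ekey_count == 0 {
        return None; // padding: no more entries in this page
    }
    let ckey_start = 1 + 5; // count byte + 40-bit size
    let ekeys_start = ckey_start + ckey_size;
    let total = ekeys_start + ekey_size * ekey_count;
    if page.len() < total {
        return None;
    }
    let entry = CKeyEntry {
        file_size: read_u40_be(&page[1..6]),
        ckey: page[ckey_start..ekeys_start].to_vec(),
        ekeys: page[ekeys_start..total]
            .chunks_exact(ekey_size)
            .map(|k| k.to_vec())
            .collect(),
    };
    Some((entry, total))
}
```

A caller would loop over a page, advancing by the returned byte count until the function returns None.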
Encoding Key (EKey) Pages
Maps encoding keys to ESpec entries:
struct EKeyEntry {
uint8_t ekey[ekey_size]; // Encoding key
uint32_t espec_index; // Index into ESpec table (BE)
uint8_t file_size[5]; // Encoded file size (40-bit BE)
};
Padding Detection: EKey pages may contain padding entries that must be skipped. Two sentinel patterns indicate padding:
- espec_index == 0xFFFFFFFF (Agent.exe sentinel)
- espec_index == 0 with all key bytes 0x00 (zero-fill padding)
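The two padding patterns can be checked while decoding an entry. The sketch below is illustrative only: the names are hypothetical, and it returns None both for padding and for truncated input, which a real parser would probably distinguish.

```rust
/// One parsed EKey page entry (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct EKeyEntry {
    pub ekey: Vec<u8>,
    pub espec_index: u32,
    pub encoded_size: u64,
}

/// Parses one EKey entry: [ekey][espec_index: u32 BE][size: 40-bit BE].
/// Returns None for padding entries and for truncated input.
pub fn parse_ekey_entry(data: &[u8], ekey_size: usize) -> Option<EKeyEntry> {
    let total = ekey_size + 4 + 5;
    if ekey_size == 0 || data.len() < total {
        return None;
    }
    let ekey = &data[..ekey_size];
    let idx = &data[ekey_size..ekey_size + 4];
    let espec_index = u32::from_be_bytes([idx[0], idx[1], idx[2], idx[3]]);

    // Sentinel pattern 1: espec_index of all ones
    let is_sentinel = espec_index == 0xFFFF_FFFF;
    // Sentinel pattern 2: zero index together with an all-zero key
    let is_zero_fill = espec_index == 0 && ekey.iter().all(|&b| b == 0);
    if is_sentinel || is_zero_fill {
        return None;
    }

    let encoded_size = data[ekey_size + 4..total]
        .iter()
        .fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    Some(EKeyEntry {
        ekey: ekey.to_vec(),
        espec_index,
        encoded_size,
    })
}
```

Note that an entry with espec_index 0 and a non-zero key is valid, so the key bytes must be inspected before skipping it.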
Content Resolution Process
- Find CKey Entry:
  - Binary search CKey page index for target page
  - Linear search within page for content key
  - Extract encoding key(s) and decompressed size
- Find EKey Entry (optional):
  - Binary search EKey page index
  - Locate entry to get ESpec index and compressed size
- Parse ESpec:
  - Index into ESpec string table
  - Parse encoding specification for compression details
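The first step, a binary search over the page index followed by a linear scan of the selected page, might look like the sketch below. It works on simplified in-memory pages with 16-byte keys; the CKeyPage type and find_encoding_keys function are assumptions for illustration, not the cascette-formats API.

```rust
/// Simplified in-memory CKey page (hypothetical type for this sketch).
pub struct CKeyPage {
    /// First content key in the page, as stored in the page index
    pub first_key: [u8; 16],
    /// (content key, encoding keys) pairs, sorted by content key
    pub entries: Vec<([u8; 16], Vec<[u8; 16]>)>,
}

/// Finds the encoding keys for `ckey`.
/// `pages` must be sorted by `first_key`, mirroring the page index.
pub fn find_encoding_keys<'a>(
    pages: &'a [CKeyPage],
    ckey: &[u8; 16],
) -> Option<&'a [[u8; 16]]> {
    // Binary search: count pages whose first key is <= the target.
    // The last of those is the only page that can contain the key.
    let idx = pages.partition_point(|p| p.first_key <= *ckey);
    let page = pages.get(idx.checked_sub(1)?)?;

    // Linear search within the selected page
    page.entries
        .iter()
        .find(|(k, _)| k == ckey)
        .map(|(_, ekeys)| ekeys.as_slice())
}
```

Byte arrays compare lexicographically in Rust, which matches the sort order the binary search relies on.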
Usage
Parsing
use cascette_formats::encoding::EncodingFile;

// From decompressed data
let encoding = EncodingFile::parse(&data)?;

// From BLTE-encoded CDN data
let encoding = EncodingFile::parse_blte(&blte_data)?;
Content Key Lookup
use cascette_crypto::ContentKey;

// Single lookup (binary search on page index, linear within page)
if let Some(ekey) = encoding.find_encoding(&content_key) {
    println!("Encoding key: {:?}", ekey);
}

// Get all encoding keys for a content key
let ekeys = encoding.find_all_encodings(&content_key);

// Batch lookup (sort-merge across pages)
let results = encoding.batch_find_encodings(&content_keys);
EKey to ESpec Lookup
use cascette_crypto::EncodingKey;

if let Some(espec) = encoding.find_espec(&encoding_key) {
    println!("Compression spec: {}", espec);
}
Building
use cascette_formats::encoding::{EncodingBuilder, CKeyEntryData, EKeyEntryData};

let mut builder = EncodingBuilder::new(); // 4KB pages

builder.add_ckey_entry(CKeyEntryData {
    content_key,
    file_size: 524_288,
    encoding_keys: vec![encoding_key],
});

builder.add_ekey_entry(EKeyEntryData {
    encoding_key,
    espec: "z".to_string(),
    file_size: 187_234,
});

let encoding_file = builder.build()?;
Page Structure
All pages are loaded eagerly. Each page preserves its original binary data for byte-exact round-trip reconstruction:
// Page<T> holds parsed entries and raw bytes
pub struct Page<T> {
    pub entries: Vec<T>,
    pub original_data: Vec<u8>,
}

// IndexEntry holds first key + MD5 checksum for integrity
pub struct IndexEntry {
    pub first_key: [u8; 16],
    pub checksum: [u8; 16],
}
All multi-byte header and page fields are big-endian.
ESpec Integration
The ESpec strings define how files are encoded:
Common Patterns
- Uncompressed: Empty string or n
- ZLib: z
- Partial compression: b:{0,1000},z,b:{1000,500},n
  - Bytes 0-1000: ZLib compressed
  - Bytes 1000-1500: Uncompressed
Parsing ESpec
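enum ESpecOp {
    None,
    ZLib,
    ByteRange { start: u32, size: u32 },
}

// Parse "b:{start,size}" into (start, size); malformed input yields None
fn parse_range(s: &str) -> Option<(u32, u32)> {
    let inner = s.strip_prefix("b:{")?.strip_suffix('}')?;
    let (start, size) = inner.split_once(',')?;
    Some((start.trim().parse().ok()?, size.trim().parse().ok()?))
}

fn parse_espec(spec: &str) -> Vec<ESpecOp> {
    if spec.is_empty() || spec == "n" {
        return vec![ESpecOp::None];
    }
    // Split on commas outside braces only, so "b:{0,1000}" stays one token
    let mut parts = Vec::new();
    let (mut depth, mut begin) = (0usize, 0usize);
    for (i, c) in spec.char_indices() {
        match c {
            '{' => depth += 1,
            '}' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => {
                parts.push(&spec[begin..i]);
                begin = i + 1;
            }
            _ => {}
        }
    }
    parts.push(&spec[begin..]);

    parts
        .into_iter()
        .map(|part| match part {
            "z" => ESpecOp::ZLib,
            "n" => ESpecOp::None,
            s if s.starts_with("b:") => match parse_range(s) {
                Some((start, size)) => ESpecOp::ByteRange { start, size },
                None => ESpecOp::None,
            },
            _ => ESpecOp::None,
        })
        .collect()
}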
Multi-Version Support
Files can have multiple encoding keys (different compression/encryption):
struct CKeyEntry {
    ekey_count: u8,        // Usually 1, can be 2+
    file_size: u64,        // Same for all versions
    ckey: [u8; 16],        // Content key
    ekeys: Vec<[u8; 16]>,  // Multiple encoding keys
}
Use cases include different regional encryption and progressive quality levels.
Performance Considerations
Memory-Mapped Access
For large encoding files (100MB+):
use memmap2::{Mmap, MmapOptions};
use std::fs::File;
use std::path::Path;

struct EncodingFile {
    mmap: Mmap,
    header: EncodingHeader,
    // ...
}

impl EncodingFile {
    fn open(path: &Path) -> Result<Self> {
        let file = File::open(path)?;
        // SAFETY: the mapping is only sound while no other process
        // truncates or rewrites the file
        let mmap = unsafe { MmapOptions::new().map(&file)? };
        // Parse header from mmap
        let header = EncodingHeader::read(&mmap[..22])?;
        Ok(Self { mmap, header })
    }
}
Page Caching
Cache frequently accessed pages:
struct PageCache {
    entries: LruCache<u32, Arc<CKeyPage>>,
}
Validation
Checksums
Each page has an MD5 checksum in the index:
fn validate_page(index: &PageIndex, data: &[u8]) -> bool {
    let computed = md5::compute(data);
    computed.0 == index.page_hash
}
Size Constraints
- Page sizes must be > 0 (no power-of-2 requirement enforced)
- Key sizes in range [1, 16] bytes
- Page counts must be > 0
- ESpec size must be > 0
- File sizes use 40-bit integers (up to 1TB)
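These constraints translate directly into a header check. The following is a minimal sketch with a hypothetical HeaderSizes struct and error strings chosen for illustration; it is not how cascette-formats necessarily reports validation errors.

```rust
/// Header fields relevant to the size constraints (hypothetical type).
pub struct HeaderSizes {
    pub ckey_size: u8,
    pub ekey_size: u8,
    pub ckey_page_size_kb: u16,
    pub ekey_page_size_kb: u16,
    pub ckey_page_count: u32,
    pub ekey_page_count: u32,
    pub espec_size: u32,
}

/// Largest value that fits in a 40-bit size field.
pub const MAX_FILE_SIZE: u64 = (1u64 << 40) - 1;

/// Checks the constraints listed above, returning the first violation.
pub fn check_size_constraints(h: &HeaderSizes) -> Result<(), &'static str> {
    if h.ckey_page_size_kb == 0 || h.ekey_page_size_kb == 0 {
        return Err("page size must be > 0");
    }
    if !(1..=16).contains(&h.ckey_size) || !(1..=16).contains(&h.ekey_size) {
        return Err("key size must be in [1, 16]");
    }
    if h.ckey_page_count == 0 || h.ekey_page_count == 0 {
        return Err("page count must be > 0");
    }
    if h.espec_size == 0 {
        return Err("ESpec size must be > 0");
    }
    Ok(())
}
```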
File’s Own ESpec
After all the data structures, the encoding file contains its own ESpec string describing how it’s compressed. This self-referential metadata is an intentional, documented feature of the NGDP format.
Official Documentation
The wowdev.wiki TACT specification explicitly lists this as the 5th component:
- Header
- Encoding specification data (ESpec)
- Content key → encoding key table
- Encoding key → encoding spec table
- “Encoding specification data for the encoding file itself”
Reference Implementation
TACT.Net explicitly handles this in EncodingFile.cs:
- Line 151: // remainder is an ESpec block for the file itself
- Implements GetFileESpec() method to generate this when writing
Real-World Examples
wow_classic 5.5.0.62655 (60 bytes):
b:{22=n,76025=z,223424=n,28598272=n,146656=n,18771968=n,*=z}
wow_classic_era 1.15.7.61582 (55 bytes):
b:{22=n,2069=z,65536=n,8388608=n,43008=n,5505024=n,*=z}
Meaning:
- 22=n: Header (22 bytes) uncompressed
- 76025=z: ESpec table compressed with ZLib
- 223424=n: CKey index uncompressed
- 28598272=n: CKey pages uncompressed
- 146656=n: EKey index uncompressed
- 18771968=n: EKey pages uncompressed
- *=z: Remainder (the file’s own ESpec) compressed
This self-referential design allows files to describe their own compression structure using the same ESpec format as all other files.
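To make the size=codec notation concrete, here is a small sketch that splits such a string into its chunks. It only handles the flat form shown in the examples above (n and z codecs, with * as the final size); the type and function names are hypothetical, and the full ESpec grammar is not covered.

```rust
/// Compression applied to one chunk (only the codecs seen in the examples).
#[derive(Debug, PartialEq)]
pub enum Codec {
    None,
    ZLib,
}

/// One `size=codec` chunk; `size` is None for the `*` (remainder) chunk.
#[derive(Debug, PartialEq)]
pub struct Chunk {
    pub size: Option<u64>,
    pub codec: Codec,
}

/// Parses a flat block spec such as "b:{22=n,2069=z,*=z}".
/// Returns None if any part does not match the expected shape.
pub fn parse_block_espec(spec: &str) -> Option<Vec<Chunk>> {
    let inner = spec.strip_prefix("b:{")?.strip_suffix('}')?;
    inner
        .split(',')
        .map(|part| {
            let (size, codec) = part.split_once('=')?;
            let size = if size == "*" {
                None
            } else {
                Some(size.parse().ok()?)
            };
            let codec = match codec {
                "n" => Codec::None,
                "z" => Codec::ZLib,
                _ => return None,
            };
            Some(Chunk { size, codec })
        })
        .collect()
}
```

Summing the fixed sizes gives the offset at which the file’s own ESpec begins in the decoded data.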
Common Issues
- Page Boundary Errors: Entries never span pages; unused space at the end of a page is padding
- Endianness: All multi-byte values are big-endian
- ESpec Index: Zero-based into string table
- CKey Padding: Entries with ekey_count = 0 indicate end of page data
- EKey Padding: Entries with espec_index = 0xFFFFFFFF or all-zero keys indicate padding (see Padding Detection above)
- File Size: Remember to account for the file’s own ESpec at the end
Real-World Example
Using wow_classic_era 1.15.7.61582:
Encoding file: bbf06e7476382cfaa396cff0049d356b
Header:
Magic: 0x454E ('EN')
Version: 1
CKey/EKey size: 16 bytes each
CKey pages: 4KB × 127 pages
EKey pages: 4KB × 127 pages
ESpec table: 1,234 bytes
Example CKey entry:
Content Key: 3ce96e7a9e3b6f5c9d99c8b4e0a4f3d2
EKey count: 1
File size: 524,288 bytes (512KB)
Encoding Key: 7f8a9b3c4d5e6f7081929a3b4c5d6e7f
Corresponding EKey entry:
Encoding Key: 7f8a9b3c4d5e6f7081929a3b4c5d6e7f
ESpec index: 1 (points to "z" - ZLib)
Compressed size: 187,234 bytes
This shows a typical game asset compressed from 512KB to 183KB using ZLib.
Implementation Flow
use cascette_formats::encoding::EncodingFile;
use cascette_crypto::ContentKey;

// 1. Parse encoding file from BLTE-encoded CDN data
let encoding = EncodingFile::parse_blte(&cdn_data)?;

// 2. Look up content by content key
let ekey = encoding.find_encoding(&content_key)
    .ok_or("content key not found")?;

// 3. Optionally get the compression spec
let espec = encoding.find_espec(&ekey);

// 4. Fetch actual file from CDN using encoding key, then decompress
Version History
The Encoding file format currently has only one version:
Version 1 (Current)
- Header Size: 22 bytes
- Magic: “EN” (0x454E)
- Features:
- Content key to encoding key mapping
- Dual page index system (CKey and EKey pages)
- ESpec string table for compression metadata
- 40-bit file sizes (up to 1TB per file)
- Multiple encoding keys per content key support
- Page-based binary search
- MD5 page checksums for integrity
Version Detection
All known encoding files use version 1. The version field is at offset 2 in the header. If future versions are introduced, parsers should check this field after validating the “EN” magic bytes.
References
- See ESpec Documentation for encoding specifications
- See BLTE Format for container structure
- See CDN Architecture for retrieval patterns
- See Format Transitions for format evolution tracking
Root File Format
The Root file is the primary catalog of all files stored in CASC archives. It maps file paths or FileDataIDs to content keys, enabling game clients to locate and retrieve specific assets.
Overview
The Root file serves as the master index for all game content:
- Maps FileDataIDs to content keys
- Supports multiple locales and content flags
- Groups files into blocks for efficient lookup
- Handles both named and unnamed entries
File Structure
The Root file is BLTE-encoded and organized into blocks:
[BLTE Container]
[Header]
[Block 1]
[Block 2]
...
[Block N]
Binary Format
Version Detection
The Root file format has evolved significantly:
- Pre-30080: No MFST magic, raw block data
- Build 30080+ (v2): MFST magic with file counts
- Build 50893+ (v3): Added header_size/version fields
- Build 58221+ (v4): Extended content flags to 40 bits
Header Structures
Version 2 (Build 30080+)
struct RootHeaderV2 {
uint32_t magic; // 'MFST' (0x4D465354) or 'TSFM' (0x5453464D)
uint32_t total_file_count; // Total number of files
uint32_t named_file_count; // Number of named entries
};
Note: Some builds use ‘TSFM’ magic instead of ‘MFST’. This appears to be a little-endian representation. Both should be accepted as valid.
Version 3 (Build 50893+)
struct RootHeaderV3 {
uint32_t magic; // 'MFST' (0x4D465354) or 'TSFM' (0x5453464D)
uint32_t header_size; // Size of header (20 bytes)
uint32_t version; // Version (1)
uint32_t total_file_count; // Total number of files
uint32_t named_file_count; // Number of named entries
uint32_t padding; // Padding (0)
};
Note: Version 3 also uses TSFM magic in observed builds, maintaining consistency with Version 2.
Version Detection Heuristic: After reading the magic, check the next two u32 values. If the first value (header_size) is in range [16, 100) and the second value (version) is less than 10, the file is v3+. Otherwise treat the first value as total_file_count (v2). Version 1 maps to V2 block format.
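The heuristic can be sketched as follows. The little-endian reads are an assumption (the text above does not state the byte order of these fields), and the RootLayout enum and function names are hypothetical.

```rust
/// Coarse header layout as determined by the heuristic (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum RootLayout {
    /// No MFST/TSFM magic: pre-30080 raw block data
    NoMagic,
    /// Magic followed directly by total_file_count
    V2,
    /// Magic followed by header_size and version
    V3Plus,
}

/// Reads a u32 at `off`, assuming little-endian byte order.
fn read_u32_le(data: &[u8], off: usize) -> Option<u32> {
    let b = data.get(off..off + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Applies the header_size/version heuristic to the start of a root file.
/// Returns None if the data is too short to decide.
pub fn detect_root_layout(data: &[u8]) -> Option<RootLayout> {
    let magic = data.get(0..4)?;
    if magic != b"MFST" && magic != b"TSFM" {
        return Some(RootLayout::NoMagic);
    }
    let first = read_u32_le(data, 4)?;
    let second = read_u32_le(data, 8)?;
    // header_size in [16, 100) and version < 10 indicates v3 or later
    if (16..100).contains(&first) && second < 10 {
        Some(RootLayout::V3Plus)
    } else {
        Some(RootLayout::V2)
    }
}
```

Because this is a heuristic, a v2 file with a very small total_file_count would be misclassified; real root files contain far more entries than that.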
Block Structure
Each block contains file entries for specific locale and content flag combinations. Important: The block header format changed significantly between V1 and V2+.
V1 Block Header (Pre-30080, 12 bytes)
V1 files have no MFST/TSFM magic and use a 12-byte block header with interleaved record format:
struct RootBlockHeaderV1 {
uint32_t num_records; // Number of records in block
uint32_t content_flags; // Content flags (32-bit)
uint32_t locale_flags; // Locale flags (language/region)
// FileDataID deltas (delta-encoded)
int32_t fileDataIDDeltas[num_records];
// Interleaved record data (content_key + name_hash per record)
RootRecordInterleaved records[num_records];
};
V2+ Block Header (Build 30080+, 17 bytes)
V2 and later versions have MFST/TSFM magic and use a 17-byte block header with separated arrays. Per wowdev.wiki documentation for Version 2 (11.1.0+):
#pragma pack(push, 1)
struct RootBlockHeaderV2 {
uint32_t num_records; // Number of records in block
uint32_t locale_flags; // Locale flags (MOVED - was third in V1!)
uint32_t content_flags; // Content flags (was second in V1)
uint32_t unk2; // Unknown field 2
uint8_t unk3; // Unknown field 3 (flags via bit-shift)
// FileDataID deltas (delta-encoded)
int32_t fileDataIDDeltas[num_records];
// Separated arrays (all content_keys, then all name_hashes)
uint8_t content_keys[num_records][16];
uint8_t name_hashes[num_records][8]; // Optional based on flags
};
#pragma pack(pop)
Critical Implementation Note: The field order change from V1 to V2+ is a common source of parsing bugs. In V1, the order is num_records, content_flags, locale_flags. In V2+, the order is num_records, locale_flags, content_flags, unk2, unk3.
V4 Extended Content Flags
V4 (Build 58221+) extends content flags to 40 bits, increasing the block header to 18 bytes (the content_flags field grows from 4 to 5 bytes).
The 40-bit value is read as a u32 (4 bytes) plus a u8 (1 byte):
uint32_t content_flags_low; // Bits 0-31
uint8_t content_flags_high; // Bits 32-39
// Combined: content_flags = (uint64_t)content_flags_low | ((uint64_t)content_flags_high << 32)
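In Rust, the same combination needs an explicit widening to 64 bits before the shift. The sketch below assumes the low word is stored little-endian; both function names are hypothetical.

```rust
/// Combines the 32-bit low part and 8-bit high part into a 40-bit value.
pub fn combine_content_flags(low: u32, high: u8) -> u64 {
    // Widen first: shifting a u8 or u32 left by 32 would overflow
    u64::from(low) | (u64::from(high) << 32)
}

/// Reads a 5-byte content_flags field: u32 (assumed little-endian) + u8.
pub fn read_content_flags_v4(bytes: &[u8]) -> Option<u64> {
    let b = bytes.get(0..5)?;
    let low = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    Some(combine_content_flags(low, b[4]))
}
```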
Record Formats
Old Format (Interleaved)
struct RootRecordOld {
uint8_t content_key[16]; // MD5 content key
uint8_t name_hash[8]; // Jenkins96 name hash (optional)
};
New Format (Separated)
struct RootRecordNew {
// Arrays stored separately
uint8_t content_keys[num_records][16];
uint8_t name_hashes[num_records][8]; // Optional
};
Content Flags
Content flags specify platform, architecture, and file attributes:
32-bit Flags (v2-v3)
Values match CascLib (CascLib.h), TACTSharp, and WoWDev wiki:
| Value | Flag | Description |
|---|---|---|
| 0x00000004 | Install | Install manifest entry |
| 0x00000008 | LoadOnWindows | Windows platform |
| 0x00000010 | LoadOnMacOS | macOS platform |
| 0x00000020 | x86_32 | 32-bit x86 architecture |
| 0x00000040 | x86_64 | 64-bit x86 architecture |
| 0x00000080 | LowViolence | Censored content |
| 0x00000100 | DoNotLoad | Skip file |
| 0x00000800 | UpdatePlugin | Launcher plugin |
| 0x00008000 | Arm64 | ARM64 architecture |
| 0x08000000 | Encrypted | Encrypted content |
| 0x10000000 | NoNameHash | No name hash in block |
| 0x20000000 | UncommonResolution | Non-standard resolution |
| 0x40000000 | Bundle | Bundled content |
| 0x80000000 | NoCompression | Uncompressed |
40-bit Flags (v4+)
Build 58221+ extends to 40 bits, stored as u32 + u8:
- Bits 0-31: Standard content flags (same as v2/v3)
- Bits 32-39: Extended flags (single byte, shifted left by 32)
Common combinations:
- 0x00000000: All platforms, default
- 0x00000008: Windows only
- 0x00000010: macOS only
- 0x08000000: Encrypted content
- 0x10000000: No name hash present
Locale Flags
32-bit field representing language/region:
| Value | Locale | Description |
|---|---|---|
| 0x00000002 | enUS | English (US) |
| 0x00000004 | koKR | Korean |
| 0x00000010 | frFR | French |
| 0x00000020 | deDE | German |
| 0x00000040 | zhCN | Chinese (Simplified) |
| 0x00000080 | esES | Spanish (Spain) |
| 0x00000100 | zhTW | Chinese (Traditional) |
| 0x00000200 | enGB | English (UK) |
| 0x00000400 | enCN | English (China) |
| 0x00000800 | enTW | English (Taiwan) |
| 0x00001000 | esMX | Spanish (Mexico) |
| 0x00002000 | ruRU | Russian |
| 0x00004000 | ptBR | Portuguese (Brazil) |
| 0x00008000 | itIT | Italian |
| 0x00010000 | ptPT | Portuguese (Portugal) |
| 0xFFFFFFFF | All | All locales |
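Since each locale is a single bit, selecting blocks for a locale can be done with a bitwise AND. The sketch below uses a few values from the table; the constant and function names are hypothetical, and treating any shared bit as a match is an assumption about how a client would filter.

```rust
// A few locale bits from the table above
pub const LOCALE_EN_US: u32 = 0x0000_0002;
pub const LOCALE_DE_DE: u32 = 0x0000_0020;
pub const LOCALE_RU_RU: u32 = 0x0000_2000;
pub const LOCALE_ALL: u32 = 0xFFFF_FFFF;

/// Returns true if a block's locale mask shares at least one bit
/// with the wanted locale mask.
pub fn block_matches_locale(block_locale_flags: u32, wanted: u32) -> bool {
    block_locale_flags & wanted != 0
}
```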
FileDataID Delta Encoding
FileDataIDs use delta encoding for compression:
fn decode_file_data_ids(deltas: &[i32]) -> Vec<u32> {
    let mut ids = Vec::with_capacity(deltas.len());
    let mut current_id = 0u32;

    for &delta in deltas {
        // Add the delta to the running ID. Starting from 0, the first
        // delta is therefore the first FileDataID itself.
        current_id = current_id.wrapping_add_signed(delta);
        ids.push(current_id);

        // Important: Increment for next iteration
        current_id = current_id.wrapping_add(1);
    }

    ids
}
Note: The algorithm increments current_id by 1 after each entry, then applies the next delta. This handles sequential FileDataIDs efficiently.
Lookup Process
- Parse Root file: Decompress BLTE, read header and blocks
- Filter by flags: Select blocks matching desired locale/content
- Find FileDataID: Binary search or iterate through blocks
- Extract content key: Retrieve corresponding MD5 hash
- Resolve via encoding: Use content key to find encoding key
Name Hash Calculation
For named files, Jenkins96 hash (hashlittle2) is used:
fn jenkins96_hash(filename: &str) -> u64 {
    // Normalize path: uppercase with backslashes (matching CascLib's
    // NormalizeFileName_UpperBkSlash)
    let normalized = filename.to_uppercase().replace('/', "\\");
    let bytes = normalized.as_bytes();
    // Jenkins hashlittle2 with pc=0, pb=0
    let hash = Jenkins96::hash(bytes);
    // Return (pc << 32) | pb directly (no word swap)
    // Matches CascLib's CalcNormNameHash
    hash.hash64
}
Important Jenkins96 Details:
- Paths are normalized to uppercase with backslashes (not forward slashes)
- The hash is 64-bit (8 bytes), not 96-bit, despite the name
- Some blocks have the NoNameHash flag, omitting name hashes entirely
- Uses Bob Jenkins’ lookup3.c algorithm (hashlittle2 function)
- Processes data in 12-byte chunks with little-endian byte order
- The 0xDEADBEEF constant is added during initialization
- Python validation tool available in cascette-py project: https://github.com/wowemulation-dev/cascette-py
Example Hashes:
- Empty string: 0xDEADBEEFDEADBEEF
- Interface\Icons\INV_Misc_QuestionMark.blp: 0x9EB59E3C76124837
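For readers without a Jenkins96 helper at hand, the following is a standalone sketch of hashlittle2 written from the public lookup3.c description. The names hashlittle2 and name_hash are chosen for this illustration and are not the cascette-rs API. The tail is zero-padded to 12 bytes, which is equivalent to lookup3's byte-wise switch on little-endian input.

```rust
/// lookup3 `hashlittle2`: returns the updated (pc, pb) pair.
fn hashlittle2(data: &[u8], pc: u32, pb: u32) -> (u32, u32) {
    // Little-endian u32 starting at `offset`.
    fn word(bytes: &[u8], offset: usize) -> u32 {
        u32::from_le_bytes([
            bytes[offset],
            bytes[offset + 1],
            bytes[offset + 2],
            bytes[offset + 3],
        ])
    }

    let mut a = 0xdead_beef_u32
        .wrapping_add(data.len() as u32)
        .wrapping_add(pc);
    let mut b = a;
    let mut c = a.wrapping_add(pb);

    // Consume 12-byte chunks while more than 12 bytes remain.
    let mut rest = data;
    while rest.len() > 12 {
        a = a.wrapping_add(word(rest, 0));
        b = b.wrapping_add(word(rest, 4));
        c = c.wrapping_add(word(rest, 8));
        // mix(a, b, c)
        a = a.wrapping_sub(c); a ^= c.rotate_left(4);  c = c.wrapping_add(b);
        b = b.wrapping_sub(a); b ^= a.rotate_left(6);  a = a.wrapping_add(c);
        c = c.wrapping_sub(b); c ^= b.rotate_left(8);  b = b.wrapping_add(a);
        a = a.wrapping_sub(c); a ^= c.rotate_left(16); c = c.wrapping_add(b);
        b = b.wrapping_sub(a); b ^= a.rotate_left(19); a = a.wrapping_add(c);
        c = c.wrapping_sub(b); c ^= b.rotate_left(4);  b = b.wrapping_add(a);
        rest = &rest[12..];
    }

    // Empty input skips the final mixing step entirely.
    if rest.is_empty() {
        return (c, b);
    }

    // Last 1..=12 bytes, zero-padded.
    let mut tail = [0u8; 12];
    tail[..rest.len()].copy_from_slice(rest);
    a = a.wrapping_add(word(&tail, 0));
    b = b.wrapping_add(word(&tail, 4));
    c = c.wrapping_add(word(&tail, 8));

    // final(a, b, c)
    c ^= b; c = c.wrapping_sub(b.rotate_left(14));
    a ^= c; a = a.wrapping_sub(c.rotate_left(11));
    b ^= a; b = b.wrapping_sub(a.rotate_left(25));
    c ^= b; c = c.wrapping_sub(b.rotate_left(16));
    a ^= c; a = a.wrapping_sub(c.rotate_left(4));
    b ^= a; b = b.wrapping_sub(a.rotate_left(14));
    c ^= b; c = c.wrapping_sub(b.rotate_left(24));
    (c, b)
}

/// Normalize a path as described above and return (pc << 32) | pb.
fn name_hash(path: &str) -> u64 {
    let normalized = path.to_uppercase().replace('/', "\\");
    let (pc, pb) = hashlittle2(normalized.as_bytes(), 0, 0);
    (u64::from(pc) << 32) | u64::from(pb)
}
```

The empty-string result follows directly from the initialization: with no input bytes, both words stay at 0xDEADBEEF. The second example hash listed above is a useful check for the normalization step when validating an implementation against real data.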
Implementation Example
struct RootFile {
    header: RootHeader,
    blocks: Vec<RootBlock>,
}

impl RootFile {
    pub fn find_file(&self, file_data_id: u32) -> Option<MD5Hash> {
        for block in &self.blocks {
            // Check if block matches desired flags
            if !self.matches_flags(block) {
                continue;
            }
            // Search for FileDataID
            if let Some(idx) = block.find_file_index(file_data_id) {
                return Some(block.records[idx].content_key);
            }
        }
        None
    }
}
Version History
- Build 18125 (6.0.1): Initial CASC Root format (V1)
  - No magic header
  - 12-byte block header: num_records, content_flags, locale_flags
  - Interleaved record format: (ckey, name_hash) per record
- Build 30080 (8.2.0): Added MFST magic signature (V2)
  - MFST/TSFM magic header with file counts
  - 17-byte block header: num_records, locale_flags, content_flags_1, content_flags_2, content_flags_3
  - Field order changed: locale_flags moved before content_flags
  - Combined content flags: content_flags_1 | content_flags_2 | (content_flags_3 << 17)
  - Separated array format: all ckeys, then all name_hashes
- Build 50893 (10.1.7): Added header_size/version fields (V3)
  - Extended header with header_size, version, padding fields
  - Same 17-byte block header format as V2
- Build 58221 (11.1.0): Extended content flags to 40 bits (V4)
  - 18-byte block header (content_flags grows from 4 to 5 bytes)
  - 40-bit content flags stored as u32 + u8
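The difference between the 12-byte and 17-byte block headers can be sketched as two small parsers. This is an illustration under assumptions: integers are read little-endian (as in the version detection code), and the 17 bytes are taken to be four u32 fields followed by a single byte for content_flags_3. The struct and function names are hypothetical.

```rust
#[derive(Debug, PartialEq)]
struct BlockHeader {
    num_records: u32,
    content_flags: u64,
    locale_flags: u32,
}

/// Read a little-endian u32 at `offset`, or None if the data is too short.
fn le_u32(data: &[u8], offset: usize) -> Option<u32> {
    let b = data.get(offset..offset + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// V1 (12 bytes): num_records, content_flags, locale_flags.
fn parse_block_header_v1(data: &[u8]) -> Option<BlockHeader> {
    Some(BlockHeader {
        num_records: le_u32(data, 0)?,
        content_flags: u64::from(le_u32(data, 4)?),
        locale_flags: le_u32(data, 8)?,
    })
}

/// V2/V3 (17 bytes): num_records, locale_flags, then three content flag parts.
fn parse_block_header_v2(data: &[u8]) -> Option<BlockHeader> {
    let flags1 = le_u32(data, 8)?;
    let flags2 = le_u32(data, 12)?;
    let flags3 = u32::from(*data.get(16)?); // assumed to be one byte
    Some(BlockHeader {
        num_records: le_u32(data, 0)?,
        locale_flags: le_u32(data, 4)?,
        content_flags: u64::from(flags1 | flags2 | (flags3 << 17)),
    })
}
```

Both parsers return the same struct, so code after the header does not need to know which layout was on disk.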
Version Detection Code
fn detect_root_version(data: &[u8]) -> RootVersion {
    if data.len() < 4 {
        return RootVersion::Invalid;
    }
    // Check for MFST or TSFM magic
    let magic = &data[0..4];
    if magic != b"MFST" && magic != b"TSFM" {
        return RootVersion::V1; // Pre-30080, no magic
    }
    // The magic is followed by two u32 values; reject truncated headers
    // instead of panicking on the slices below
    if data.len() < 12 {
        return RootVersion::Invalid;
    }
    let value1 = u32::from_le_bytes(data[4..8].try_into().unwrap());
    let value2 = u32::from_le_bytes(data[8..12].try_into().unwrap());
    // Heuristic: header_size in [16, 100) and version < 10
    // indicates v3+ with explicit header_size/version fields
    if (16..100).contains(&value1) && value2 < 10 {
        match value2 {
            4.. => RootVersion::V4,
            _ => RootVersion::V3, // version 1-3 all use V2/V3 block format
        }
    } else {
        RootVersion::V2 // 30080+, value1 is total_file_count
    }
}
Parser Implementation Status
The Python parser (cascette-py) currently supports:
- Version detection (MFST/TSFM magic)
- Version 1-3 parsing
- Block-based extraction
- Content key retrieval
- Delta encoding detection (identifies but doesn’t decode)
The parser can extract FileDataID to content key mappings from all current WoW root file versions.
See https://github.com/wowemulation-dev/cascette-py for the Python implementation.
Common Issues
- V2 block header size: V2+ uses a 17-byte block header, not 12 bytes like V1. Using the wrong header size causes all subsequent parsing to fail with garbage FileDataIDs and content keys.
- V2 field order change: V2+ swapped locale_flags and content_flags positions. In V1: num_records, content_flags, locale_flags. In V2+: num_records, locale_flags, content_flags, unk2, unk3.
- Multiple matches: Same file may exist in multiple blocks with different locales
- Missing entries: Not all FileDataIDs have corresponding entries
- Flag interpretation: Game-specific flag meanings vary
- Delta overflow: Large gaps in FileDataIDs can cause integer overflow
Implementation Notes
Version Detection Heuristic
The version detection uses value2 < 10 to identify extended headers, which is
broader than the strict matches!(value2, 1..=4) check. Version 1 is accepted
and maps to V2 block format (17-byte header, locale_flags first). This matches
CascLib and TACTSharp behavior. The heuristic may need tightening if future
versions use values in the 5-9 range for non-version purposes.
Block Header Dispatch
The current dispatch is verified correct:
- Plain V1 files (no MFST/TSFM magic) use the 12-byte header (content_flags first)
- All MFST/TSFM files (including Classic Era) use the 17-byte header (locale_flags first)
- V4 files use the 18-byte header (40-bit content flags)
The V2 17-byte format applies to all MFST/TSFM files regardless of the header version field value. The 12-byte format is only used for pre-magic V1 files.
References
- See Encoding Documentation for content key resolution
- See BLTE Format for container structure
- See CDN Architecture for file retrieval
- wowdev.wiki TACT documentation - Authoritative source for CASC/TACT format specifications including Root file structure
Install Manifest Format
The Install manifest tracks which game files should be installed on disk and manages file tags for selective installation based on system requirements and user preferences.
Overview
The Install manifest maps content keys to installation paths and uses a tag bitmap system for selective installation based on platform, architecture, and locale. File sizes in entries support installation size estimation.
File Structure
The Install manifest is BLTE-encoded and contains:
[BLTE Container]
[Header]
[Tag Section]
[File Entries]
Binary Format
Header
struct InstallHeader {
    uint16_t magic;            // 'IN' (0x494E)
    uint8_t  version;          // Version (1 or 2)
    uint8_t  ckey_length;      // Content key length in bytes (16)
    uint16_t tag_count;        // Number of tags (big-endian)
    uint32_t entry_count;      // Number of file entries (big-endian)
    // Version 2+ fields (6 additional bytes, total 16 bytes)
    uint8_t  content_key_size; // Content key size (Agent.exe) / loose file type (CascLib)
    uint32_t entry_count_v2;   // Additional entry count (big-endian)
    uint8_t  unknown;          // Unknown byte
};
For version 1, the content key size is derived as ckey_length + 4 (content key +
4-byte file size). Version 2 specifies content_key_size explicitly.
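The header layout above can be read with a few fixed offsets. This is a minimal sketch, not the cascette-formats parser: the Rust struct and function names are hypothetical, and only the fields needed to continue parsing are kept.

```rust
#[derive(Debug, PartialEq)]
struct InstallHeader {
    version: u8,
    ckey_length: u8,
    tag_count: u16,
    entry_count: u32,
    /// Only present in version 2 headers.
    content_key_size: Option<u8>,
}

/// Returns the header and the number of bytes it occupies (10 or 16).
fn parse_install_header(data: &[u8]) -> Option<(InstallHeader, usize)> {
    if data.len() < 10 || &data[0..2] != b"IN" {
        return None;
    }
    let version = data[2];
    if version == 0 || version > 2 {
        return None;
    }
    let mut header = InstallHeader {
        version,
        ckey_length: data[3],
        tag_count: u16::from_be_bytes([data[4], data[5]]),
        entry_count: u32::from_be_bytes([data[6], data[7], data[8], data[9]]),
        content_key_size: None,
    };
    let mut size = 10;
    if version == 2 {
        // Version 2 appends 6 bytes; the first is content_key_size.
        if data.len() < 16 {
            return None;
        }
        header.content_key_size = Some(data[10]);
        size = 16;
    }
    Some((header, size))
}
```

Returning the consumed size lets the caller start reading the tag section at the right offset for either version.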
Tag Section
Tags categorize files for selective installation. Each tag consists of:
struct InstallTag {
    char     name[];     // Null-terminated tag name
    uint16_t type;       // Tag type (big-endian)
    uint8_t  bit_mask[]; // Bit mask ((entry_count + 7) / 8 bytes)
};
Important: The bit mask uses big-endian (MSB-first) bit ordering within each byte:
- Bit 7 (MSB) corresponds to file index byte_index * 8 + 0
- Bit 0 (LSB) corresponds to file index byte_index * 8 + 7
- The mask for a given file index is 0x80 >> (file_index % 8)
File Entry
File entries follow the tag section:
struct InstallFileEntry {
    char     path[];          // Null-terminated file path
    uint8_t  content_key[16]; // MD5 content key
    uint32_t file_size;       // File size (big-endian)
};
Tag associations are determined by bit positions in each tag’s bit mask.
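Reading one file entry amounts to scanning for the path terminator and then taking two fixed-width fields. The sketch below assumes a 16-byte content key and stores the path as a String; both the struct and the function name are illustrative only.

```rust
#[derive(Debug, PartialEq)]
struct InstallFileEntry {
    path: String,
    content_key: [u8; 16],
    file_size: u32,
}

/// Parse one entry; returns the entry and the number of bytes consumed.
fn parse_install_entry(data: &[u8]) -> Option<(InstallFileEntry, usize)> {
    // The path runs up to (not including) the first NUL byte.
    let nul = data.iter().position(|&b| b == 0)?;
    let path = String::from_utf8_lossy(&data[..nul]).into_owned();

    let key_start = nul + 1;
    let key = data.get(key_start..key_start + 16)?;
    let size = data.get(key_start + 16..key_start + 20)?;

    let mut content_key = [0u8; 16];
    content_key.copy_from_slice(key);
    let file_size = u32::from_be_bytes([size[0], size[1], size[2], size[3]]);

    Some((InstallFileEntry { path, content_key, file_size }, key_start + 20))
}
```

Calling the function in a loop, advancing by the returned length each time, walks all entry_count entries.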
Tag System
Tag Types
| Type | Value | Description | Examples |
|---|---|---|---|
| Platform | 0x0001 | Operating system tags | Windows, OSX, Android, IOS |
| Architecture | 0x0002 | CPU architecture tags | x86_32, x86_64, arm64 |
| Locale | 0x0003 | Language/region tags | enUS, deDE, frFR |
| Category | 0x0004 | Content category tags | speech, text |
| Unknown | 0x0005 | Unknown tag type | (seen in manifests) |
| Component | 0x0010 | Component tags | game, launcher |
| Version | 0x0020 | Version tags | live, ptr, beta |
| Optimization | 0x0040 | Optimization tags | retail, debug |
| Region | 0x0080 | Region tags | US, EU, KR |
| Device | 0x0100 | Device tags | desktop, mobile |
| Mode | 0x0200 | Mode tags | online, offline |
| Branch | 0x0400 | Branch tags | main, experimental |
| Content | 0x0800 | Content tags | cinematics, audio |
| Feature | 0x1000 | Feature tags | graphics, physics |
| Expansion | 0x2000 | Expansion tags | base, expansion1 |
| Alternate | 0x4000 | Alternate content | Alternate, HighRes |
| Option | 0x8000 | Option tags | (optional features) |
Common Tags
Platform Tags:
- Windows, OSX, Android, IOS, Web
Architecture Tags:
- x86_32, x86_64, arm64
Locale Tags:
- enUS, enGB, deDE, frFR, esES, esMX, itIT, ruRU, koKR, zhTW, zhCN, ptBR, ptPT
Category Tags:
- speech, text
Alternate Tags:
- Alternate, HighRes
Tag Mask Usage
Tags use bit masks to indicate which files they apply to:
fn should_install(
    file_index: usize,
    tag: &InstallTag,
    selected: bool,
) -> bool {
    let byte_index = file_index / 8;
    let bit_offset = file_index % 8;
    if byte_index >= tag.bit_mask.len() {
        return false;
    }
    // MSB-first bit ordering within each byte: offset 0 maps to bit 7
    let has_tag = (tag.bit_mask[byte_index] & (0x80 >> bit_offset)) != 0;
    has_tag && selected
}
Installation Planning
Size Calculation
Calculate installation size for selected tags:
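fn calculate_install_size(
    entries: &[InstallFileEntry],
    selected_tags: &[&InstallTag],
) -> u64 {
    entries.iter()
        .enumerate()
        // Tag membership is looked up by entry index in each tag's bit mask.
        // An entry counts if any selected tag covers it; adjust this rule if
        // tags must be combined differently.
        .filter(|(index, _)| {
            selected_tags.iter().any(|tag| should_install(*index, tag, true))
        })
        .map(|(_, e)| e.file_size as u64)
        .sum()
}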
Path Resolution
Convert relative paths to absolute:
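use std::path::{Path, PathBuf};

fn resolve_install_path(
    base_dir: &Path,
    entry: &InstallFileEntry,
) -> PathBuf {
    // Manifest paths are raw bytes; decode lossily instead of panicking
    // on invalid UTF-8
    let relative_path = String::from_utf8_lossy(&entry.path);
    // Split on both separator styles so the result uses the host separator
    relative_path
        .split(|c: char| c == '/' || c == '\\')
        .filter(|part| !part.is_empty())
        .fold(base_dir.to_path_buf(), |path, part| path.join(part))
}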
File Categories
Essential Files
Files with tag mask 0x0000 or 0xFFFF:
- Core executables
- Essential libraries
- Base configuration
- Critical game data
Optional Content
Files with specific tag requirements:
- High-resolution textures (HighResTextures tag)
- Cinematics (Cinematics tag)
- Additional languages (locale tags)
- Developer tools (DevTools tag)
Implementation Example
struct InstallFile {
    header: InstallHeader,
    tags: Vec<InstallTag>,
    entries: Vec<InstallFileEntry>,
}

impl InstallFile {
    pub fn get_install_list(&self, tag_names: &[String]) -> Vec<InstallItem> {
        let selected = self.select_tags(tag_names);
        self.entries.iter()
            .enumerate()
            // Keep entries covered by at least one selected tag's bit mask
            .filter(|(index, _)| {
                selected.iter().any(|tag| should_install(*index, tag, true))
            })
            .map(|(_, e)| InstallItem {
                content_key: e.content_key,
                install_path: String::from_utf8_lossy(&e.path).to_string(),
                file_size: e.file_size,
            })
            .collect()
    }

    // Tags are matched by name. Each tag carries its own per-file bit mask,
    // so a fixed-width integer mask of tag IDs is not needed.
    fn select_tags(&self, tag_names: &[String]) -> Vec<&InstallTag> {
        self.tags.iter()
            .filter(|t| tag_names.iter().any(|name| *name == t.name))
            .collect()
    }
}
Selective Installation
Platform-Specific
Install only files for current platform:
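fn get_platform_tags() -> Vec<String> {
    let mut tags = vec!["Base".to_string()];
    #[cfg(target_os = "windows")]
    tags.push("Windows".to_string());
    #[cfg(target_os = "macos")]
    tags.push("OSX".to_string());
    // Architecture names follow the tags listed under Common Tags
    #[cfg(target_arch = "x86_64")]
    tags.push("x86_64".to_string());
    #[cfg(target_arch = "aarch64")]
    tags.push("arm64".to_string());
    tags
}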
Language Selection
Install specific language assets:
fn get_locale_tags(selected_locale: &str) -> Vec<String> {
    vec![
        "Base".to_string(),
        selected_locale.to_string(),
    ]
}
Optimization Strategies
Parallel Installation
Install multiple files concurrently:
use rayon::prelude::*;

fn install_files(items: Vec<InstallItem>) {
    items.par_iter()
        .for_each(|item| {
            download_and_install(item);
        });
}
Incremental Updates
Track installed files for patching:
use std::collections::HashMap;
use std::path::PathBuf;
use std::time::SystemTime;

struct InstalledFiles {
    entries: HashMap<PathBuf, InstalledFileInfo>,
}

struct InstalledFileInfo {
    content_key: [u8; 16],
    file_size: u32,
    modified_time: SystemTime,
}
Validation
Post-Installation Verification
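use std::error::Error;
use std::fs;
use std::path::Path;

fn verify_installation(
    install_dir: &Path,
    install_file: &InstallFile,
    selected_tags: &[&InstallTag],
) -> Result<(), Box<dyn Error>> {
    for (index, entry) in install_file.entries.iter().enumerate() {
        // Skip entries not covered by any selected tag's bit mask
        if !selected_tags.iter().any(|tag| should_install(index, tag, true)) {
            continue;
        }
        let path = install_dir.join(String::from_utf8_lossy(&entry.path).as_ref());
        // Verify file exists
        if !path.exists() {
            return Err(format!("Missing file: {}", path.display()).into());
        }
        // Verify file size
        let metadata = fs::metadata(&path)?;
        if metadata.len() != entry.file_size as u64 {
            return Err(format!("Size mismatch: {}", path.display()).into());
        }
    }
    Ok(())
}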
Repair Process
Detect and repair corrupted installations:
fn repair_installation(
    install_file: &InstallFile,
    install_dir: &Path,
) -> Vec<RepairAction> {
    let mut actions = Vec::new();
    for entry in &install_file.entries {
        // Entry paths are raw bytes; decode before joining
        let path = install_dir.join(String::from_utf8_lossy(&entry.path).as_ref());
        if !path.exists() {
            actions.push(RepairAction::Download(entry.content_key));
        } else if !verify_file(&path, entry) {
            actions.push(RepairAction::Redownload(entry.content_key));
        }
    }
    actions
}
Common Issues
- Tag conflicts: Multiple tags may include same file
- Path separators: Handle platform-specific separators
- Case sensitivity: File systems vary in case handling
- Symlink support: Some platforms don’t support symlinks
- Permission issues: Installation may require elevation
Special Considerations
Shared Files
Files used by multiple products:
struct SharedFile {
    content_key: [u8; 16],
    products: Vec<String>,
    ref_count: u32,
}
Uninstall Tracking
Track files for clean uninstall:
use std::path::PathBuf;

struct UninstallManifest {
    files: Vec<PathBuf>,
    directories: Vec<PathBuf>,
    registry_keys: Vec<String>, // Windows only
}
Parser Implementation Status
Python Parser (cascette-py)
Status: Complete
Capabilities:
- Version 1 header parsing with IN magic detection
- Tag extraction with big-endian (MSB-first) bit ordering
- Platform/architecture/locale tag type classification
- File entry parsing with path, content key, and size
- Tag-to-file association via bitmask resolution
- BLTE decompression for compressed manifests
Verified Against:
- WoW 11.0.5.57689 (242 entries, 28 tags)
- Multiple WoW Classic builds
- Cross-platform tag validation (Windows, OSX, mobile)
Known Issues: None
See https://github.com/wowemulation-dev/cascette-py for the Python implementation.
Version History
The Install manifest format has two versions:
Version 1
- Header Size: 10 bytes
- Magic: “IN” (0x494E)
- Entry Size: Derived as ckey_length + 4
- Features:
  - File path to content key mapping
  - Tag-based selective installation
  - Platform/architecture/locale filtering
  - Bit mask system for tag associations
  - Big-endian (MSB-first) bit ordering in tag masks
  - Tag type classification (17 types from Platform through Option)
Version 2
- Header Size: 16 bytes (10 base + 6 additional)
- Added Fields: content_key_size (1 byte), entry_count_v2 (4 bytes BE), unknown (1 byte)
- Features: All version 1 features plus explicit content key size
Version Detection
The version field is at offset 2 in the header. The agent accepts versions 1 and 2 (validates non-zero and <= 2).
Implementation Status
- cascette-formats: Full support for versions 1 and 2 with validation
- cascette-py: Complete parsing for version 1 with tag extraction
References
- See Root File for file catalog
- See Download Manifest for download prioritization
- See Encoding Documentation for content resolution
- See Format Transitions for format evolution tracking
Download Manifest Format
The Download manifest manages content streaming and prioritization during game installation and updates. It defines which files are essential for gameplay and their download order.
Overview
The Download manifest assigns a priority to each file entry so the client can download essential content first (enabling play before full download) and stream remaining content in the background. Tag bitmaps enable per-platform and per-locale filtering. File sizes in entries support progress estimation.
File Structure
The Download manifest is BLTE-encoded and contains:
[BLTE Container]
[Header]
[File Entries]
[Tag Section]
Binary Format
Header
struct DownloadHeader {
    char     magic[2];      // "DL" (0x44, 0x4C)
    uint8_t  version;       // Version (1, 2, or 3)
    uint8_t  ekey_size;     // Encoding key size in bytes (16)
    uint8_t  has_checksum;  // Checksum presence flag
    uint32_t entry_count;   // Number of entries (big-endian)
    uint16_t tag_count;     // Number of tags (big-endian)
    // Version 2+ fields (header grows to 12 bytes)
    uint8_t  flag_size;     // Number of flag bytes per entry (max 4)
    // Version 3+ fields (header grows to 16 bytes)
    int8_t   base_priority; // Base priority offset
    uint8_t  _reserved[3];  // Reserved (agent does not validate these)
};
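The version-dependent header size can be handled by looking up the size first and then reading the optional fields only when the version provides them. This is a sketch with hypothetical names, not the cascette-formats implementation.

```rust
#[derive(Debug, PartialEq)]
struct DownloadHeader {
    version: u8,
    ekey_size: u8,
    has_checksum: bool,
    entry_count: u32,
    tag_count: u16,
    flag_size: u8,     // 0 for version 1
    base_priority: i8, // 0 for versions 1 and 2
}

/// Header size in bytes for each known version.
fn download_header_size(version: u8) -> Option<usize> {
    match version {
        1 => Some(11),
        2 => Some(12),
        3 => Some(16),
        _ => None,
    }
}

fn parse_download_header(data: &[u8]) -> Option<DownloadHeader> {
    if data.len() < 11 || &data[0..2] != b"DL" {
        return None;
    }
    let version = data[2];
    let size = download_header_size(version)?;
    if data.len() < size {
        return None;
    }
    Some(DownloadHeader {
        version,
        ekey_size: data[3],
        has_checksum: data[4] != 0,
        entry_count: u32::from_be_bytes([data[5], data[6], data[7], data[8]]),
        tag_count: u16::from_be_bytes([data[9], data[10]]),
        flag_size: if version >= 2 { data[11] } else { 0 },
        base_priority: if version >= 3 { data[12] as i8 } else { 0 },
    })
}
```

File entries begin at the offset returned by download_header_size for the parsed version.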
Entry Order
The download manifest stores data in this order:
- Header
- All file entries
- All tags (appear after entries)
File Entry
struct DownloadEntry {
    uint8_t  ekey[16];     // Encoding key (variable size from header)
    uint8_t  file_size[5]; // 40-bit file size (big-endian)
    int8_t   priority;     // Download priority (adjusted by base_priority)
    // Optional fields
    uint32_t checksum;     // If has_checksum is true (big-endian)
    uint8_t  flags[N];     // If version >= 2, N = flag_size
};
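Two details of the entry layout are easy to get wrong: the 40-bit size has no native integer type, and the entry length depends on header fields. The helpers below sketch both; their names are illustrative only.

```rust
/// Read a 40-bit big-endian integer into a u64.
fn read_u40_be(bytes: &[u8; 5]) -> u64 {
    bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
}

/// Bytes occupied by one entry: key + 5-byte size + 1-byte priority,
/// plus the optional 4-byte checksum and `flag_size` flag bytes.
fn download_entry_size(ekey_size: u8, has_checksum: bool, flag_size: u8) -> usize {
    let checksum = if has_checksum { 4 } else { 0 };
    usize::from(ekey_size) + 5 + 1 + checksum + usize::from(flag_size)
}
```

With a 16-byte key, no checksum, and no flags, each entry is 22 bytes, so the tag section starts at header size + 22 * entry_count.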
Tag Entry
Tags appear after all file entries in the manifest:
struct DownloadTag {
    char     name[];   // Null-terminated tag name
    uint16_t type;     // Tag type (big-endian)
    uint8_t  bitmap[]; // Bit mask ((entry_count + 7) / 8 bytes)
};
Each bit in the bitmap corresponds to a file entry index. If bit N is set, entry N has this tag.
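A bitmap lookup might look like the sketch below. It assumes the same MSB-first bit ordering that the Install manifest uses; this section does not state the ordering for Download tags, so treat that as an assumption to verify against real data. The Rust struct is a hypothetical in-memory form of the tag.

```rust
struct DownloadTag {
    name: String,
    tag_type: u16,
    bitmap: Vec<u8>,
}

impl DownloadTag {
    /// True if entry `index` carries this tag (MSB-first ordering assumed).
    fn has_file(&self, index: usize) -> bool {
        self.bitmap
            .get(index / 8)
            .map_or(false, |byte| byte & (0x80 >> (index % 8)) != 0)
    }
}
```

Indices past the end of the bitmap are reported as untagged rather than causing a panic.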
Priority System
Priority Calculation
In version 3+, priorities are adjusted:
final_priority = entry.priority - header.base_priority
Priority Levels
Lower values indicate higher priority:
| Priority | Category | Typical Content |
|---|---|---|
| < 0 | Critical | Must download before game starts |
| 0 | Essential | Required for basic gameplay |
| 1-2 | High | Important for full experience |
| 3-5 | Normal | Standard content |
| > 5 | Low | Optional/deferred content |
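The adjustment and the table above can be combined in two small functions. This is a sketch: the names are hypothetical, and the result is widened to i16 because subtracting two i8 values can exceed the i8 range.

```rust
/// final_priority = entry.priority - header.base_priority, without overflow.
fn final_priority(entry_priority: i8, base_priority: i8) -> i16 {
    i16::from(entry_priority) - i16::from(base_priority)
}

/// Map an adjusted priority onto the categories in the table above.
fn priority_category(priority: i16) -> &'static str {
    match priority {
        p if p < 0 => "Critical",
        0 => "Essential",
        1..=2 => "High",
        3..=5 => "Normal",
        _ => "Low",
    }
}
```

For example, an entry priority of 0 with a base priority of 2 yields -2, which falls in the Critical band.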
Priority-Based Download
fn get_download_order(entries: &[DownloadFileEntry]) -> Vec<&DownloadFileEntry> {
    let mut sorted = entries.iter().collect::<Vec<_>>();
    // Lower priority values download first; smaller files break ties
    sorted.sort_by_key(|e| (e.priority, e.file_size));
    sorted
}
Streaming Strategy
Minimum Playable Set
Calculate minimum download for gameplay:
fn get_minimum_download(
    download_file: &DownloadFile,
) -> (Vec<DownloadFileEntry>, u64) {
    let essential: Vec<_> = download_file.entries
        .iter()
        .filter(|e| e.priority <= 0) // Critical (< 0) + Essential (0)
        .cloned()
        .collect();
    let total_size = essential.iter()
        .map(|e| e.file_size as u64)
        .sum();
    (essential, total_size)
}
Progressive Download
Download in priority order while game runs:
use std::collections::{HashSet, VecDeque};

struct DownloadManager {
    queue: VecDeque<DownloadItem>,
    active: Vec<DownloadTask>,
    completed: HashSet<[u8; 16]>,
}

impl DownloadManager {
    pub fn start_progressive_download(&mut self) {
        // VecDeque has no sort method; sort its contiguous slice instead
        self.queue.make_contiguous().sort_by_key(|item| item.priority);
        // Start downloads in priority order until the concurrency limit
        // is reached or the queue is empty
        while self.active.len() < MAX_CONCURRENT {
            let Some(item) = self.queue.pop_front() else {
                break;
            };
            self.start_download(item);
        }
    }
}
Tag-Based Filtering
Platform-Specific Downloads
Tags are stored separately from entries. Each tag contains a bitmap indicating which entries it applies to. To filter by tag, find the tag by name and check its bitmap:
fn filter_by_tag<'a>(
    manifest: &'a DownloadManifest,
    tag_name: &str,
) -> Vec<(usize, &'a DownloadFileEntry)> {
    let tag = match manifest.tags.iter().find(|t| t.name == tag_name) {
        Some(t) => t,
        None => return Vec::new(),
    };
    manifest.entries.iter().enumerate()
        .filter(|(index, _)| tag.has_file(*index))
        .collect()
}
Language Packs
fn get_language_pack<'a>(
    manifest: &'a DownloadManifest,
    locale: &str,
) -> Vec<&'a DownloadFileEntry> {
    let tag = match manifest.tags.iter().find(|t| t.name == locale) {
        Some(t) => t,
        None => return Vec::new(),
    };
    manifest.entries.iter().enumerate()
        .filter(|(index, _)| tag.has_file(*index))
        .map(|(_, entry)| entry)
        .collect()
}
Download Optimization
Bandwidth Management
struct BandwidthManager {
    max_bandwidth: u64, // Bytes per second
    current_usage: u64,
    priority_limits: Vec<u64>, // Per-priority limits
}

impl BandwidthManager {
    pub fn allocate_bandwidth(&mut self, priority: u8) -> u64 {
        // Priorities without a configured limit are capped only by what is left
        let priority_limit = self.priority_limits
            .get(priority as usize)
            .copied()
            .unwrap_or(u64::MAX);
        // saturating_sub avoids underflow when usage exceeds the cap
        let available = self.max_bandwidth.saturating_sub(self.current_usage);
        std::cmp::min(priority_limit, available)
    }
}
Chunk-Based Downloads
For large files, download in chunks:
struct ChunkedDownload {
    encoding_key: [u8; 16],
    total_size: u64,
    chunk_size: u64,
    chunks_completed: Vec<bool>,
}

impl ChunkedDownload {
    pub fn get_next_chunk(&self) -> Option<(u64, u64)> {
        for (idx, &completed) in self.chunks_completed.iter().enumerate() {
            if !completed {
                let offset = idx as u64 * self.chunk_size;
                let size = std::cmp::min(
                    self.chunk_size,
                    self.total_size - offset,
                );
                return Some((offset, size));
            }
        }
        None
    }
}
Progress Tracking
Download Statistics
struct DownloadProgress {
    total_files: u32,
    completed_files: u32,
    total_bytes: u64,
    downloaded_bytes: u64,
    current_speed: f64,
    eta_seconds: u64,
}

impl DownloadProgress {
    pub fn update(&mut self, bytes_downloaded: u64) {
        self.downloaded_bytes += bytes_downloaded;
        self.current_speed = self.calculate_speed();
        self.eta_seconds = self.calculate_eta();
    }

    pub fn completion_percentage(&self) -> f32 {
        // An empty plan has nothing left to download; avoid dividing by zero
        if self.total_bytes == 0 {
            return 100.0;
        }
        (self.downloaded_bytes as f64 / self.total_bytes as f64 * 100.0) as f32
    }
}
Implementation Example
struct DownloadFile {
    header: DownloadHeader,
    priorities: Vec<DownloadPriority>,
    tags: Vec<DownloadTag>,
    entries: Vec<DownloadFileEntry>,
}

impl DownloadFile {
    pub fn get_download_plan(
        &self,
        tags: &[String],
        max_priority: i8,
    ) -> DownloadPlan {
        // Tags are stored separately from entries, so resolve the requested
        // names to their bitmaps first
        let selected: Vec<&DownloadTag> = self.tags
            .iter()
            .filter(|t| tags.iter().any(|name| *name == t.name))
            .collect();
        let files: Vec<_> = self.entries
            .iter()
            .enumerate()
            .filter(|(_, e)| e.priority <= max_priority)
            .filter(|(index, _)| selected.iter().any(|t| t.has_file(*index)))
            .map(|(_, e)| e.clone())
            .collect();
        let total_size = files.iter()
            .map(|f| f.file_size as u64)
            .sum();
        DownloadPlan {
            files,
            total_size,
            estimated_time: self.estimate_time(total_size),
        }
    }
}
On-Demand Streaming
Asset Request Handling
struct OnDemandManager {
    download_file: DownloadFile,
    cache: LruCache<[u8; 16], Vec<u8>>,
}

impl OnDemandManager {
    pub async fn get_asset(&mut self, encoding_key: &[u8; 16]) -> Result<Vec<u8>> {
        // Check cache first
        if let Some(data) = self.cache.get(encoding_key) {
            return Ok(data.clone());
        }
        // Find in download manifest
        if let Some(entry) = self.find_entry(encoding_key) {
            // Download with high priority
            let data = self.download_immediate(entry).await?;
            self.cache.put(*encoding_key, data.clone());
            return Ok(data);
        }
        Err("Asset not found")
    }
}
Verification
Checksum Validation
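fn verify_download(
    data: &[u8],
    entry: &DownloadFileEntry,
    compute_checksum: impl Fn(&[u8]) -> u32,
) -> bool {
    // The entry format stores a 32-bit checksum, present only when
    // has_checksum is set. This section does not name the algorithm,
    // so the caller supplies it.
    match entry.checksum {
        Some(expected) => compute_checksum(data) == expected,
        None => true, // No checksum to verify
    }
}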
Common Issues
- Priority conflicts: Multiple systems requesting same file
- Bandwidth throttling: ISP or network limitations
- Incomplete downloads: Handle partial file recovery
- Cache corruption: Verify cached files periodically
- Tag mismatches: Platform detection errors
Special Features
Differential Downloads
Download only changed portions:
struct DifferentialDownload {
    old_version: [u8; 16],
    new_version: [u8; 16],
    patches: Vec<PatchInfo>,
}
Peer-to-Peer Support
Share downloaded content locally:
use std::collections::HashSet;

struct P2PManager {
    local_peers: Vec<PeerInfo>,
    shared_files: HashSet<[u8; 16]>,
}
Parser Implementation Status
Python Parser (cascette-py)
Status: Complete
Capabilities:
- Version 1-3 header parsing with DL magic detection
- 40-bit big-endian compressed size parsing
- Priority system with base priority adjustment (v3)
- Tag parsing with bitmap support (tags stored after all entries)
- Platform/architecture tag identification with type classification
- Sample entry display (first 100 entries)
- Format evolution tracking across versions
- BLTE decompression for compressed manifests
- Correct entry/tag ordering (entries first, then tags)
Verified Against:
- WoW 11.0.5.57689 (2.4M entries, 28 tags)
- WoW 9.0.2.37176 (Shadowlands)
- WoW 7.3.5.25848 (Legion)
- WoW Classic builds
Known Issues: None
See https://github.com/wowemulation-dev/cascette-py for the Python implementation.
Version History
The Download manifest format has evolved through 3 versions:
Version 1 (Initial)
- Header Size: 11 bytes
- Features: Basic download prioritization with encoding keys, file sizes, optional checksums
- Fields: magic, version, ekey_size, has_checksum, entry_count, tag_count
Version 2 (Flag Support)
- Header Size: 12 bytes
- Added Features: Entry-level flags for additional metadata
- New Fields: flag_size (number of flag bytes per entry, max 4)
- Use Cases: Platform-specific flags, content type markers
Version 3 (Priority System)
- Header Size: 16 bytes
- Added Features: Base priority adjustment for dynamic prioritization
- New Fields: base_priority (signed adjustment), reserved (3 bytes)
- Priority Calculation: final_priority = entry.priority - header.base_priority
Version Detection
Parsers detect version by reading the version field at offset 2 in the header. All versions use the same “DL” magic bytes and big-endian encoding.
Implementation Status
- cascette-formats: Full support for versions 1-3 with version-aware parsing
- cascette-py: Complete parsing for versions 1-3 with validation
References
- See Install Manifest for installation management
- See Encoding Documentation for key resolution
- See CDN Architecture for download sources
- See Format Transitions for version evolution timeline
Size Manifest Format
The Size manifest maps encoding keys to estimated file sizes (eSize). It is used when compressed size (cSize) is unavailable, allowing the agent to estimate disk space requirements and report download progress for content that has not yet been downloaded.
Overview
The Size manifest provides:
- Estimated file sizes for pre-download space allocation
- Progress bar calculations during installation
- Disk space requirement checks
- Fallback sizing when compressed size is unknown
The agent log message “Loose files will estimate using eSize instead of cSize” indicates when this manifest is active.
Build Configuration Reference
The Size manifest is referenced by the size key in build configuration files:
size = d1d9e612a645cc7a7e4b42628bde21ce 0d5704735f4985e555907a7e7647099a
size-size = 3637629 3076687
The first hash is the content key, the second is the encoding key used for CDN
fetch. The size-size field contains the unencoded and encoded sizes. Like other
manifests, the Size manifest is BLTE-encoded on CDN.
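A build configuration line of this shape can be split into its name and two values with plain string handling. The helper below is a hypothetical sketch, not the cascette-rs config parser, and it performs no validation of the hash strings.

```rust
/// Parse a `name = first second` line into (name, first, second).
fn parse_key_pair(line: &str) -> Option<(String, String, String)> {
    let (name, value) = line.split_once('=')?;
    let mut parts = value.split_whitespace();
    let first = parts.next()?;
    let second = parts.next()?;
    Some((name.trim().to_string(), first.to_string(), second.to_string()))
}
```

For the size line above, the first value is the content key and the second is the encoding key; for size-size, the same helper returns the two sizes as strings that can be parsed with str::parse::<u64>().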
The config key .tact:size_manifest also references this manifest in the agent’s
internal configuration.
Community Documentation
This format is documented on wowdev.wiki as the
“Download Size” manifest. The wiki documents version 1 from an older Agent build
(6700). The TACT 3.13.3 agent binary supports versions 1 and 2. The wiki’s
“EKey Size” byte at offset 3 corresponds to the flags field described below.
The version 2 format with its 40-bit total size field is not documented on the
wiki.
File Structure
The Size manifest is BLTE-encoded and contains:
[BLTE Container]
[Header]
[Entries]
Binary Format
All multi-byte integers are big-endian.
Header
struct SizeManifestHeader {
    char     magic[2];      // "DS" (0x44, 0x53)
    uint8_t  version;       // Version (1 or 2)
    uint8_t  flags;         // Flags byte
    uint32_t entry_count;   // Number of entries (big-endian)
    uint16_t key_size_bits; // Key size in bits (big-endian)
    // Version-specific fields follow
};
Version 1 Header Extension (offset 10)
struct SizeManifestHeaderV1 {
    // ... base header fields above ...
    uint64_t total_size;  // Total size across all entries (big-endian)
    uint8_t  esize_bytes; // Byte width of eSize per entry (1-8)
};
// Total header size: 19 bytes (0x13)
The esize_bytes field determines how many bytes each entry’s size value
occupies. Valid values are 1 through 8. Invalid values produce: “Invalid eSize
byte count ‘%u’ in size manifest header.”
Version 2 Header Extension (offset 10)
struct SizeManifestHeaderV2 {
    // ... base header fields above ...
    uint8_t total_size[5]; // Total size as 40-bit big-endian integer
};
// Total header size: 15 bytes (0x0F)
Version 2 fixes esize_bytes at 4 (32-bit sizes per entry). The total size
uses a 40-bit integer (5 bytes), reducing header size compared to version 1.
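The two header variants differ only after offset 10, so one parser can branch on the version byte. The sketch below uses hypothetical names and returns None for every error case instead of the agent's messages.

```rust
#[derive(Debug, PartialEq)]
struct SizeHeader {
    version: u8,
    flags: u8,
    entry_count: u32,
    key_size_bits: u16,
    total_size: u64,
    esize_bytes: u8,
    header_len: usize,
}

fn parse_size_header(data: &[u8]) -> Option<SizeHeader> {
    if data.len() < 15 || &data[0..2] != b"DS" {
        return None;
    }
    // Big-endian integer of arbitrary width (up to 8 bytes).
    let be = |bytes: &[u8]| bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    let version = data[2];
    let (total_size, esize_bytes, header_len) = match version {
        1 => {
            if data.len() < 19 {
                return None;
            }
            (be(&data[10..18]), data[18], 19)
        }
        // Version 2: 40-bit total size, esize width fixed at 4
        2 => (be(&data[10..15]), 4, 15),
        _ => return None,
    };
    if !(1..=8).contains(&esize_bytes) {
        return None;
    }
    Some(SizeHeader {
        version,
        flags: data[3],
        entry_count: u32::from_be_bytes([data[4], data[5], data[6], data[7]]),
        key_size_bits: u16::from_be_bytes([data[8], data[9]]),
        total_size,
        esize_bytes,
        header_len,
    })
}
```

Storing esize_bytes and header_len in the result means entry parsing does not need to repeat the version check.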
Minimum Size Validation
The parser validates two minimum sizes:
- 15 bytes (0x0F) – enough to read magic, version, entry_count, and key_size_bits
- 19 bytes (0x13) – full version 1 header (version 2 headers are shorter and pass this check)
If the data is too small: “Detected truncated size manifest. Only got %u bytes, but minimum header size is %u bytes.”
Entry Format
Entries are stored sequentially after the header:
struct SizeManifestEntry {
    uint8_t  key[];    // Encoding key, null-terminated
    uint16_t key_hash; // 16-bit hash/identifier (big-endian)
    uint8_t  esize[];  // Estimated size (esize_bytes width, big-endian)
};
The key field length in bytes is (key_size_bits + 7) / 8, which rounds the
bit count up to the nearest byte. The key is stored as a null-terminated byte
string within this field.
Key Hash Validation
The 2-byte key_hash field after the key is validated. Values 0x0000 and
0xFFFF are treated as invalid sentinel values and cause the parser to reject
the entry.
Entry Size Field
The esize field width depends on the version:
| Version | esize width | Source |
|---|---|---|
| 1 | esize_bytes from header (1-8) | Variable |
| 2 | 4 bytes (fixed) | Hardcoded |
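The per-entry length follows from the key length formula and the esize width. The helpers below are a sketch with hypothetical names. They assume an entry is exactly the key field, the 2-byte key_hash, and the esize field, with any null terminator stored inside the key field as described above.

```rust
/// Key field length in bytes: bit count rounded up to whole bytes.
fn key_len_bytes(key_size_bits: u16) -> usize {
    (usize::from(key_size_bits) + 7) / 8
}

/// Total bytes per entry: key + 2-byte key_hash + esize.
fn entry_len(key_size_bits: u16, esize_bytes: u8) -> usize {
    key_len_bytes(key_size_bits) + 2 + usize::from(esize_bytes)
}

/// Read a big-endian esize of 1 to 8 bytes.
fn read_esize(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
}
```

For a 128-bit key and a version 2 manifest (4-byte esize), each entry occupies 22 bytes under these assumptions.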
Version History
| Version | Header size | esize width | total_size width | Notes |
|---|---|---|---|---|
| 1 | 19 bytes | Variable (1-8) | 64-bit | Original format, documented on wowdev.wiki |
| 2 | 15 bytes | Fixed (4) | 40-bit | Compact header, undocumented on wiki |
Relationship to Other Manifests
The Size manifest is one of six manifest types in TACT:
| Config key | Magic | Format |
|---|---|---|
| encoding | EN | Content key to encoding key mapping |
| root | (varies) | Path to content key mapping |
| install | IN | Install manifest with file tags |
| download | DL | Download manifest with priorities |
| patch | PA | Patch manifest for delta updates |
| size | DS | Size manifest (this format) |
Validation
The parser validates manifests at parse time and via an explicit validate()
method:
- Entry count matches the header’s entry_count field
- Sum of all entry esize values matches the header’s total_size field
- key_size_bits must be > 0
- Key hash sentinel values (0x0000, 0xFFFF) are rejected
Error Messages
| Condition | Message |
|---|---|
| Truncated data | “Detected truncated size manifest. Only got %u bytes, but minimum header size is %u bytes.” |
| Bad magic | “Invalid magic string in size manifest.” |
| Bad version | “Unsupported size manifest version: %u. This client only supports non-zero versions <= %u” |
| Bad esize width | “Invalid eSize byte count ‘%u’ in size manifest header.” |
| Zero key size | “Invalid key size: key_size_bits must be > 0” |
| Bad key hash | “Invalid key hash sentinel value: 0x{value:04X}” |
| Entry count mismatch | “Entry count mismatch: header says {expected}, found {actual}” |
| Total size mismatch | “Total size mismatch: header says {expected}, sum of esizes is {actual}” |
Implementation Status
Implemented in cascette-formats crate (crates/cascette-formats/src/size/).
The implementation provides:
- Parser and builder for both version 1 and version 2 formats
- Manual BinRead/BinWrite implementations for headers and entries
- Variable-width esize field support (1-8 bytes for V1, fixed 4 bytes for V2)
- 40-bit total_size handling for V2 headers
- Key hash sentinel validation (rejects 0x0000 and 0xFFFF)
- CascFormat trait implementation for round-trip support
- Builder pattern for constructing manifests
Archive Files and Indices
CASC/TACT archives are container files that store game content in a packed format. They work with index files to enable efficient content retrieval without unpacking entire archives. The system uses different formats for network (TACT) and local storage (CASC).
Overview
The archive system provides:
- Bulk storage of game assets in .archive files
- Index files for fast content location
- Support for partial downloads via HTTP range requests
- Deduplication through content addressing
Archive Files
CDN Archives vs Local Archives
CDN Archives (TACT - served over HTTP):
- Named using 32-character hash keys (e.g., 86b6b0daf3d8ef68271b15567c37300c)
- Accessed via URL path: /tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}
- Paired with Archive Index files (.index) for content location
- Single BLTE-encoded container format
- Part of the TACT (Trusted Application Content Transfer) protocol
Local Client Archives (CASC - stored on disk):
- Named with numeric indices: data.001, data.002, etc.
- Use IDX Journal files (.idx) for local content access
- Multiple BLTE files concatenated together
- Part of the CASC (Content Addressable Storage Container) system
- Optimized for memory-mapped access
CDN Archive Structure
CDN archives are single BLTE-encoded containers, while local archives contain multiple BLTE files:
CDN Archive Format (TACT): Local Archive Format (CASC):
┌──────────────────┐ ┌──────────────────┐
│ BLTE Container │ │ BLTE File 1 │
├──────────────────┤ ├──────────────────┤
│ Header & Blocks │ │ BLTE File 2 │
├──────────────────┤ ├──────────────────┤
│ Content Blocks │ │ BLTE File 3 │
│ (concatenated) │ │ ... │
└──────────────────┘ └──────────────────┘
Verified Archive Characteristics
Based on examination of sample archives:
- File sizes: Range from ~7MB to 268MB when compressed
- Compression ratios: 4.9x to 190x compression achieved via BLTE
- Content types: WDB Cache files (WDC3), textures, models, and other game assets
- Decompressed content: Much smaller than archive size (1-2MB typical)
- Access pattern: Content addressed via hash keys in index files
CRITICAL: Two Completely Different Index Systems
⚠️ CDN Archive Index (.index) vs Local Storage Index (.idx)
NEVER CONFUSE THESE TWO FORMATS - THEY ARE COMPLETELY DIFFERENT:
- CDN Archive Index Files (.index): TACT format with 28-byte footer, variable-length encoding keys
- Local Storage Index Files (.idx): CASC format with header, fixed 9-byte content key buckets
These systems serve different purposes and use entirely different formats, key types, and data structures.
CDN Archive Index Format (TACT Protocol)
File Extension: .index
Location: Downloaded from CDN
Purpose: Maps variable-length encoding keys to CDN archive locations
Key Type: Encoding keys (from Encoding file)
Key Length: Variable, as specified in footer’s ekey_length field
(typically 16 bytes, sometimes 9)
Implementation: cascette-formats/src/archive/index.rs
Archive Index Files (.index) - TACT Protocol
Based on analysis of actual CDN index files from various WoW builds.
CDN archive indexes use a chunk-based format with footer metadata:
Archive Index Structure
Index File Layout:
┌────────────────┐
│ Data Chunks │ <- 4KB chunks containing entries
│ (4096 bytes) │
├────────────────┤
│ ... │
├────────────────┤
│ Last Chunk │ <- Table of contents + entries
├────────────────┤
│ Footer │ <- Metadata (variable length)
└────────────────┘
CDN Index Entry Format (Variable Length)
struct CDNArchiveIndexEntry {
uint8_t ekey[ekey_length]; // Encoding key (variable length from footer)
uint32_t encoded_size; // BLTE encoded size (big-endian)
uint32_t archive_offset; // Offset in archive (big-endian)
};
Entry Size: Variable = ekey_length + size_bytes + offset_bytes (from footer)
Typical Sizes:
- With 16-byte keys: 16 + 4 + 4 = 24 bytes per entry
- With 9-byte keys: 9 + 4 + 4 = 17 bytes per entry
Key Properties:
- Encoding key length specified in footer’s ekey_length field
- All multi-byte fields use big-endian encoding
- NEVER assume fixed 9-byte keys - always read from footer
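Because the entry layout depends on three footer fields, an entry decoder has to take those widths as parameters. The following is a minimal sketch of that idea. The names `IndexEntry`, `read_be`, and `parse_entry` are hypothetical and are not the cascette-formats API.

```rust
/// One decoded CDN archive index entry.
#[derive(Debug, PartialEq)]
struct IndexEntry {
    ekey: Vec<u8>,
    encoded_size: u64,
    archive_offset: u64,
}

/// Reads a big-endian unsigned integer from up to 8 bytes.
fn read_be(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
}

/// Decodes one entry from the start of `buf` using field widths taken
/// from the footer. Returns `None` if `buf` is too short or a numeric
/// field is wider than 8 bytes.
fn parse_entry(
    buf: &[u8],
    ekey_length: usize,
    size_bytes: usize,
    offset_bytes: usize,
) -> Option<IndexEntry> {
    if size_bytes > 8 || offset_bytes > 8 {
        return None;
    }
    let entry_len = ekey_length + size_bytes + offset_bytes;
    let raw = buf.get(..entry_len)?;
    // Field order follows the struct above: key, encoded size, offset.
    let (ekey, rest) = raw.split_at(ekey_length);
    let (size, offset) = rest.split_at(size_bytes);
    Some(IndexEntry {
        ekey: ekey.to_vec(),
        encoded_size: read_be(size),
        archive_offset: read_be(offset),
    })
}
```

Passing the widths explicitly keeps the same code path usable for 16-byte and 9-byte keys, and for the wider offset fields described in the footer section.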
Archive Index Footer (TACT)
Archive Index files use a 28-byte footer at the end of the file:
struct ArchiveIndexFooter { // 28 bytes total
uint8_t toc_hash[8]; // MD5(toc_keys || block_hashes)[:footer_hash_bytes]
uint8_t version; // Must be 0 or 1
uint8_t reserved[2]; // Must be [0, 0]
uint8_t page_size_kb; // Must be 4 (4KB pages)
uint8_t offset_bytes; // Archive offset field size (4, 5, or 6)
uint8_t size_bytes; // Compressed size field size (always 4)
uint8_t ekey_length; // EKey length in bytes (16 for full MD5)
uint8_t footer_hash_bytes; // Footer hash length (always 8)
uint32_t element_count; // Number of entries (little-endian - special case!)
uint8_t footer_hash[8]; // MD5 footer validation (first 8 bytes)
};
Verified Footer Properties:
- Standard values: offset_bytes=4, size_bytes=4, ekey_length=16 (1-16 valid)
- offset_bytes can be 4 (regular archives), 5 (archives >4GB), or 6 (archive-groups: 2-byte archive index + 4-byte offset)
- Page/chunk size consistently 4096 bytes
- Item length consistently 24 bytes (0x18)
- Archive filename = MD5 hash of the footer
- Footer validation uses MD5 hashing (first 8 bytes of hash)
- Mixed endianness: element_count field is little-endian while all other multi-byte fields are big-endian
- TOC hash field is present but not validated in practice. No known reference implementation (CascLib, TACT.Net, rustycasc) validates this field. Testing against real files shows the stored values do not match any standard hash algorithm applied to the TOC data.
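The footer layout and the constraints listed above can be combined into a small reader. This is a sketch under stated assumptions: the Rust struct mirrors the C layout shown earlier, `parse_footer` is a hypothetical name, and the function checks only the structural constraints (it does not verify either hash).

```rust
/// Fields of the 28-byte CDN archive index footer.
#[derive(Debug, PartialEq)]
struct ArchiveIndexFooter {
    toc_hash: [u8; 8],
    version: u8,
    page_size_kb: u8,
    offset_bytes: u8,
    size_bytes: u8,
    ekey_length: u8,
    footer_hash_bytes: u8,
    element_count: u32,
    footer_hash: [u8; 8],
}

const FOOTER_SIZE: usize = 28;

/// Parses the footer from the last 28 bytes of an index file.
fn parse_footer(file: &[u8]) -> Option<ArchiveIndexFooter> {
    let start = file.len().checked_sub(FOOTER_SIZE)?;
    let f = &file[start..];
    let mut toc_hash = [0u8; 8];
    toc_hash.copy_from_slice(&f[0..8]);
    let mut footer_hash = [0u8; 8];
    footer_hash.copy_from_slice(&f[20..28]);
    let footer = ArchiveIndexFooter {
        toc_hash,
        version: f[8],
        page_size_kb: f[11],
        offset_bytes: f[12],
        size_bytes: f[13],
        ekey_length: f[14],
        footer_hash_bytes: f[15],
        // element_count is the one little-endian field.
        element_count: u32::from_le_bytes([f[16], f[17], f[18], f[19]]),
        footer_hash,
    };
    let valid = footer.version <= 1
        && f[9] == 0
        && f[10] == 0
        && footer.page_size_kb == 4
        && matches!(footer.offset_bytes, 4 | 5 | 6)
        && footer.size_bytes == 4
        && (1..=16).contains(&footer.ekey_length)
        && footer.footer_hash_bytes == 8;
    if valid { Some(footer) } else { None }
}
```

Rejecting footers that violate the documented constraints early gives a clear failure point before any entry parsing starts.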
Implementation Notes:
- Extended Block Offsets: The agent logs “Archive w/ Extended Block Offset Found” for archive index entries that use larger-than-4-byte offsets (for archives exceeding 4GB)
- Archive Count Limit: The agent has a casc_supports_1023_archives configuration flag, indicating a maximum of 1023 archives per CASC storage
Sample Analysis Results
File Sizes Observed:
- Small indexes: ~8KB (few hundred entries)
- Medium indexes: ~50-200KB (thousands of entries)
- Large indexes: ~300KB+ (tens of thousands of entries)
Index Distribution (from sample builds):
- WoW retail: 400-1400+ archives per build
- WoW Classic: 1000-1400+ archives per build
- Beta builds: 400-800 archives per build
Chunk Structure:
- All indexes use 4KB chunks. Max entries per chunk = 4096 / (ekey_length + offset_bytes + size_bytes). With default 16+4+4 fields: 170 entries per chunk.
- Table of contents (TOC) is stored after data chunks and contains two sections:
  - Last encoding key of each data chunk (for binary search)
  - Per-block MD5 hash of each data chunk (truncated to footer_hash_bytes)
- TOC hash = MD5(toc_keys || block_hashes)[:footer_hash_bytes]
- Chunk structure enables streaming and memory-efficient processing
- Chunks are padded with zeros to maintain 4KB alignment
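The chunk arithmetic and the TOC-based chunk selection described above can be sketched as follows. Both function names are hypothetical, and the TOC is modelled simply as a sorted list of last keys.

```rust
const CHUNK_SIZE: usize = 4096;

/// Maximum number of whole entries that fit in one 4KB chunk.
fn entries_per_chunk(
    ekey_length: usize,
    offset_bytes: usize,
    size_bytes: usize,
) -> Option<usize> {
    let entry_len = ekey_length + offset_bytes + size_bytes;
    if entry_len == 0 {
        None
    } else {
        Some(CHUNK_SIZE / entry_len)
    }
}

/// Picks the data chunk that could contain `key`, given the TOC list of
/// last keys (one per chunk, in ascending order). Returns `None` when
/// `key` sorts after the last key of the final chunk.
fn find_chunk(toc_last_keys: &[Vec<u8>], key: &[u8]) -> Option<usize> {
    // Index of the first chunk whose last key is >= the search key.
    let idx = toc_last_keys.partition_point(|last| last.as_slice() < key);
    if idx < toc_last_keys.len() {
        Some(idx)
    } else {
        None
    }
}
```

With the chunk index known, a reader only needs to scan one 4KB chunk rather than the whole index.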
Archive Index Access Pattern
CDN URL Format:
https://cdn.domain.com/tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}.index
Lookup Process:
- Get archive content key from CDN configuration
- Append ‘.index’ to form index URL
- Fetch and parse index file
- Search entries for target EKey
- Use offset/size to retrieve from corresponding .archive file
Self-Referential Naming:
The archive index filename (hash) is the MD5 of its own footer structure, providing a unique identifier that validates the index contents.
Local Storage Index Format (.idx files)
File Extension: .idx
Location: Client-side storage directory (Data/data/)
Purpose: Maps content keys to local data file locations using bucket algorithm
Key Type: Content keys (MD5 hashes from Root file)
Key Length: ALWAYS 9 bytes (truncated for space efficiency in local storage)
Implementation: cascette-client-storage/src/index.rs
See the comparison table at the end of this document for a full side-by-side comparison.
IDX Journal Files (.idx) - CASC Local Storage
Local CASC storage uses IDX Journal files for indexing:
IDX Journal Structure
struct IDXJournalHeader { // 16 bytes + block table
uint32_t data_size; // Size of header data
uint32_t data_hash; // Jenkins hash validation
uint16_t version; // Journal version
uint8_t bucket; // Bucket ID (0x00-0xFF)
uint8_t unused; // Padding
uint8_t length_size; // Size field bytes
uint8_t location_size; // Location field bytes (5 = 1 archive + 4 offset)
uint8_t key_size; // Key field bytes (9 or 16)
uint8_t segment_bits; // Segment size bits
// Followed by block table entries
};
Key Differences from Archive Indexes:
- Bucket-based structure (256 buckets, 00-FF)
- Jenkins hash validation instead of footer hash
- Fixed key size (9-byte truncated keys, not variable length)
- Header at start instead of footer at end
- One journal file per bucket
Loose Files Index
For files not in archives:
struct LooseFilesIndex {
uint32_t magic; // 'LIDX'
uint32_t version;
uint32_t entry_count;
struct Entry {
uint8_t encoding_key[16];
uint32_t file_size;
uint8_t file_hash[16]; // For verification
} entries[];
};
Archive Lookup Process
- Get encoding key: From encoding file lookup
- Check indices: Search all index files for key
- Locate in archive: Extract offset and size
- Retrieve data: Read from archive at offset
- Decompress: Process BLTE container
Implementation Example
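// Minimal illustrative definitions; the real types carry more fields.
struct ArchiveIndexHeader {
    key_size: u8, // key length in bytes, as stored in the index
}
struct ArchiveIndexEntry {
    key: Vec<u8>,
    offset: u64,
    size: u32,
}
struct ArchiveIndex {
    header: ArchiveIndexHeader,
    entries: Vec<ArchiveIndexEntry>, // must be sorted by key
}
impl ArchiveIndex {
    /// Returns (offset, size) for the key, or None if the key is absent
    /// or shorter than the index key size.
    pub fn find_file(&self, encoding_key: &[u8]) -> Option<(u64, u32)> {
        // Truncate search key to index key size (None instead of a panic if too short)
        let search_key = encoding_key.get(..self.header.key_size as usize)?;
        // Binary search entries (sorted by key)
        let idx = self
            .entries
            .binary_search_by(|e| e.key.as_slice().cmp(search_key))
            .ok()?;
        let entry = &self.entries[idx];
        Some((entry.offset, entry.size))
    }
}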
HTTP Range Requests
For CDN retrieval without downloading entire archives:
GET /data/5e/16/5e16b6ff530b1816c7b32296e0875ed4 HTTP/1.1
Host: cdn.example.com
Range: bytes=1048576-2097151
Response:
HTTP/1.1 206 Partial Content
Content-Range: bytes 1048576-2097151/134217728
Content-Length: 1048576
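The Range value in the request above follows directly from an index entry's offset and size. A minimal sketch, with `range_header` as a hypothetical helper name:

```rust
/// Builds the value of an HTTP `Range` header covering `size` bytes
/// starting at `offset`. Returns `None` for a zero-length request or
/// on arithmetic overflow.
fn range_header(offset: u64, size: u32) -> Option<String> {
    if size == 0 {
        return None;
    }
    // HTTP byte ranges are inclusive on both ends, so the last byte
    // requested is offset + size - 1.
    let last = offset.checked_add(u64::from(size) - 1)?;
    Some(format!("bytes={}-{}", offset, last))
}
```

For the example request, an offset of 1048576 and a size of 1048576 produce `bytes=1048576-2097151`.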
Archive Creation
When building archives:
- Group related files: Minimize seeks during loading
- Align boundaries: 4KB alignment for efficient I/O
- Order by access: Frequently accessed files first
- Compress individually: Each file is BLTE-encoded
- Update indices: Generate index entries
Optimization Strategies
Memory Mapping
For local archives:
use memmap2::Mmap;
struct ArchiveReader {
    mmap: Mmap,
}
impl ArchiveReader {
    /// Returns the bytes in [offset, offset + size), or None if the
    /// range falls outside the mapped archive.
    pub fn read_file(&self, offset: u64, size: u32) -> Option<&[u8]> {
        let start = usize::try_from(offset).ok()?;
        let end = start.checked_add(size as usize)?;
        self.mmap.get(start..end)
    }
}
Index Caching
Keep frequently used indices in memory:
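use std::collections::HashMap;
use std::sync::Arc;
use lru::LruCache; // assumes the `lru` crate

struct IndexCache {
    // Parsed indices, keyed by index hash
    indices: HashMap<String, Arc<ArchiveIndex>>,
    // Tracks recency so the least recently used index can be evicted
    lru: LruCache<String, ()>,
}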
Archive Validation
Checksum Verification
When checksums are present:
fn verify_file(data: &[u8], expected_checksum: &[u8; 16]) -> bool {
    let computed = md5::compute(data);
    computed.0 == *expected_checksum
}
Size Validation
Always verify extracted size matches expected:
fn check_size(decompressed: &[u8], expected_size: u64) -> Result<(), &'static str> {
    if decompressed.len() as u64 != expected_size {
        return Err("Size mismatch");
    }
    Ok(())
}
Common Issues
- Key collisions: Truncated keys may collide (handle gracefully)
- Archive corruption: Verify checksums when available
- Missing indices: Some files may only exist as loose files
- Version mismatches: Handle different index versions
- Alignment padding: Account for alignment bytes
Archive Groups
Archive Groups are client-generated mega-indices that combine multiple CDN archive
indices into a single lookup structure, reducing search time from scanning hundreds
of individual .index files to a single binary search. They use 6-byte offset
fields (2-byte archive index + 4-byte offset) and are identified by archive-group
and patch-archive-group fields in CDN config.
See Archive-Groups for the full format specification.
File Organization
Typical CASC repository structure:
data/
├── config/ # Configuration files
├── data/ # Archive files
│ ├── 00/
│ │ ├── 00/{hash}.archive
│ │ └── ...
│ └── ff/
│ └── ff/{hash}.archive
├── indices/ # Index files
│ ├── {hash}.index
│ └── ...
└── patch/ # Patch archives
Version History
CDN Archive Index Format (.index files)
The CDN Archive Index format currently has only one version:
Version 1 (Current)
- Footer Size: 28 bytes
- Location: End of file
- Features:
  - Variable-length encoding keys (footer’s ekey_length field)
  - 4KB chunk-based structure with table of contents
  - MD5 hash validation (footer hash and TOC hash)
  - Self-referential naming (filename = MD5 of footer)
  - Mixed endianness (element_count is little-endian, others big-endian)
  - Typical entry size: 24 bytes (16-byte key + 4-byte size + 4-byte offset)
Version Detection
The version field is at offset 8 in the 28-byte footer. All known CDN archive indices use version 1.
Implementation Status
- cascette-formats: Full support for version 1 with parser
- Archive-groups: Client-side mega-indices combine multiple CDN indices (6-byte offset variant)
Local Storage Index Format (.idx files)
The Local Storage Index (IDX Journal) format currently has only one version:
Version 7 (Current - IDX Journal v7)
- Header Size: 16 bytes
- Location: Start of file
- Features:
  - Fixed 9-byte truncated content keys (space optimization)
  - 18-byte entries (9-byte key + 5-byte location + 4-byte size)
  - 256 bucket-based organization (0x00-0xFF)
  - Packed 5-byte location field (10-bit archive ID + 30-bit offset)
  - Jenkins hash validation
  - Mixed endianness (header little-endian, entries mixed)
  - Bucket algorithm: XOR first 9 bytes, then XOR nibbles
  - Filename format: {bucket:02x}{version:06x}.idx
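The packed 5-byte location field (10-bit archive ID + 30-bit offset) can be split with plain bit arithmetic. The sketch below rests on an assumption that this section does not state: the 40 bits are read as one big-endian integer, with the archive ID in the top 10 bits. Both function names are hypothetical.

```rust
/// Splits a packed 5-byte location into (archive_id, offset).
///
/// Assumption: the 40 bits form one big-endian integer, with the
/// archive ID in the top 10 bits and the offset in the low 30 bits.
fn unpack_location(loc: [u8; 5]) -> (u16, u32) {
    let packed = loc.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    let archive_id = (packed >> 30) as u16; // top 10 bits
    let offset = (packed & 0x3FFF_FFFF) as u32; // low 30 bits
    (archive_id, offset)
}

/// Inverse of `unpack_location`. Returns `None` if either value does
/// not fit in its bit field.
fn pack_location(archive_id: u16, offset: u32) -> Option<[u8; 5]> {
    if archive_id >= 1 << 10 || offset >= 1 << 30 {
        return None;
    }
    let packed = (u64::from(archive_id) << 30) | u64::from(offset);
    let be = packed.to_be_bytes(); // 8 bytes; the value occupies the last 5
    Some([be[3], be[4], be[5], be[6], be[7]])
}
```

A 10-bit archive ID tops out at 1023, and a 30-bit offset addresses up to 1 GiB within one data file.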
Version Detection
The version field is at offset 8 in the header (16-bit little-endian). The implementation validates version equals 7 and warns on unexpected versions.
Implementation Status
- cascette-client-storage: Full support for version 7 with parser and builder
- No earlier versions documented (version 7 is standard for modern CASC)
Key Differences Between Index Systems
| Feature | CDN Index (.index) | Local Index (.idx) |
|---|---|---|
| Version | 1 (footer-based) | 7 (header-based) |
| Protocol | TACT (network) | CASC (local storage) |
| Key Type | Encoding keys | Content keys |
| Key Length | Variable (16 typical) | Fixed 9-byte truncated |
| Structure | Sequential chunks | Bucket algorithm |
| Validation | MD5 hash | Jenkins hash |
| Endianness | Mixed (mostly big) | Mixed (header little) |
| Entry Size | Variable (24 typical) | Fixed 18 bytes |
| Location | CDN download | Client Data/ directory |
| Crate | cascette-formats | cascette-client-storage |
References
- See Encoding Documentation for key lookup
- See BLTE Format for archive content structure
- See CDN Architecture for remote retrieval
- See Format Transitions for format evolution tracking
Archive-Groups
Archive-groups are locally generated mega-indices that combine multiple CDN archive indices into a single unified lookup structure. They are created client-side by merging downloaded archive index files, never downloaded directly from the CDN. They are essential for Battle.net client compatibility and enable efficient content resolution.
Format Specification
Archive-groups use the same binary format as regular CDN archive indices with one critical difference:
| Field | Regular Index | Archive-Group |
|---|---|---|
| Encoding Key | Variable (9-16 bytes) | 16 bytes |
| Offset | 4 bytes | 6 bytes |
| Size | 4 bytes | 4 bytes |
The 6-byte offset field contains:
- Bytes 0-1: Archive index (big-endian u16)
- Bytes 2-5: Offset within archive (big-endian u32)
Critical Findings - SOLVED
Archive Index Mapping Uses Hash-Based Assignment
CONFIRMED: ALL archive-groups use the full u16 range (0-65535) for archive indices:
archive_index = hash(encoding_key) % 65536
This explains why:
- All archive-groups use indices 0-65535 despite only ~606 CDN archives existing
- Archive 0 consistently receives 6-8% of entries (hash distribution)
- The pattern is universal across all Battle.net installations
- Archive-groups are generated locally using this deterministic hash-based assignment algorithm
CDN Configuration
Archive-groups are referenced in CDN config files by their hash:
archive-group = 6d08c5f69f6a2cf70a50cd40efdcd2fb
patch-archive-group = a5fb3ed088333348d93983d7e8693956
These hashes identify the locally generated archive-group files stored in
Data/indices/. The client generates these files locally and stores them using
the computed hash as the filename.
Size Characteristics
Archive-groups are significantly larger than regular indices:
- Regular CDN indices: 4KB - 2MB
- Archive-groups: 50MB - 150MB
- Entry count: 2-5 million entries
Growth over time (WoW Classic):
- Version 1.13.2: 54MB, 2.1M entries
- Version 1.14.0: 73MB, 2.8M entries
- Version 1.15.2: 126MB, 5.0M entries
Archive Index Distribution
Due to hash-based assignment, archive indices follow a predictable distribution:
- Archive 0: ~6-8% of entries (150K-350K entries)
- Archive 1: ~0.6% of entries (13K entries)
- Archive 2-65535: Distributed based on hash function
This distribution is consistent across all Battle.net installations.
Implementation Requirements
Detection
fn is_archive_group(data: &[u8]) -> bool {
    if data.len() < 28 {
        return false;
    }
    // Check offset_bytes field at position -16 from end
    // (footer byte 12 of the 28-byte footer)
    data[data.len() - 16] == 6
}
Parsing
// For archive-groups with 6-byte offsets
// (`data` is the raw index bytes, `pos` the start of the offset field)
let archive_index = u16::from_be_bytes([data[pos], data[pos + 1]]);
let offset = u32::from_be_bytes([data[pos + 2], data[pos + 3], data[pos + 4], data[pos + 5]]);
Content Resolution
When resolving content in a Battle.net-compatible installation:
- Look up encoding key in archive-group
- Extract 2-byte archive index from entry
- Map archive index to actual CDN archive (requires mapping table)
- Read content from archive at specified offset
Implementation Strategy for Cascette
To achieve binary-identical Battle.net installations:
Required Actions
- Generate Archive-Groups Locally
  - Parse CDN config to find all individual archive index hashes
  - Download all individual .index files from CDN
  - Merge them locally into unified archive-group structures
  - Store generated archive-groups in Data/indices/ using computed hash as filename
- Implement Hash-Based Archive Assignment
  - Use deterministic algorithm: archive_index = hash(encoding_key) % 65536
  - Ensure identical results to Battle.net client generation
  - Apply to all entries during archive-group creation
- Implement Archive Index Mapping
  - Create mapping table: archive_group_index -> actual_cdn_archive_hash
  - The 65536 virtual indices map to ~606 actual CDN archives
  - Use for content resolution when accessing actual archive data
- Support Both Types
  - Generate regular archive-group for main content from base archive indices
  - Generate patch-archive-group for patch content from patch archive indices
  - Both use same local generation process with 6-byte offsets
Why Binary-Identical Matters
For cascette to be a trustworthy Battle.net replacement:
- Trust: Users need confidence we produce EXACTLY what Battle.net would
- Compatibility: Some third-party tools may depend on exact format
- Verification: Binary matching allows easy validation
- Completeness: Understanding the full algorithm proves our analysis
Footer Structure
Archive-groups are identified by the offset_bytes field in the footer:
Footer (28 bytes):
[0:8] TOC hash: MD5(toc_keys || block_hashes)[:footer_hash_bytes]
[8] Version (always 1)
[9:11] Reserved
[11] Page size in KB
[12] Offset bytes (4 for regular, 6 for archive-group)
[13] Size bytes (always 4)
[14] Key bytes (16 for archive-groups)
[15] Footer hash bytes
[16:20] Entry count (little-endian u32)
[20:28] Footer hash
Example Archive-Group Entry
Entry from 6d08c5f69f6a2cf70a50cd40efdcd2fb.index:
Key: 000003bafc39011c91accae47b94fb2d (16 bytes)
Archive: 0 (from first 2 bytes of offset field)
Offset: 0x5dfd00d7 (from last 4 bytes of offset field)
Size: 92,211,754 bytes
This entry indicates:
- Content is in archive index 0
- Starts at offset 0x5dfd00d7 in that archive
- Compressed size is 92,211,754 bytes
Validation
Archive-groups contain entries for all game content:
- Every encoding key should be findable
- Archive indices use full u16 range (0-65535)
- Entries are sorted by encoding key for binary search
- Total entries match the entry_count in footer
Battle.net Client Behavior
The Battle.net client:
- Downloads individual archive index files during installation
- Generates archive-group locally by merging multiple archive indices
- Stores generated archive-group in Data/indices/{hash}.index
- Uses hash-based assignment algorithm for consistent archive index mapping
- Uses archive-group for all subsequent content lookups
Common Issues
Incorrect Detection
- Checking file size alone is insufficient
- Must verify offset_bytes == 6 in footer
- Some patch archives are large but not archive-groups
Index Mapping Confusion
- Archive index in archive-group ≠ CDN archive position
- Indices 0-65535 map to ~600 actual archives
- Mapping requires modulo or lookup table
Parser Assumptions
- Never hardcode 9-byte keys for archive-groups
- Archive-groups always use 16-byte keys
- Respect the key_bytes field in footer
References
- Analysis of WoW Classic installations (1.13.2 through 1.15.2)
- wowdev.wiki Archive documentation
- Empirical testing with cascette-py parser
TVFS (TACT Virtual File System)
TVFS is the virtual file system introduced in WoW 8.2 (CASC v3), providing a unified interface for managing content across multiple products and build configurations. It replaces direct file path mappings with a more flexible namespace-based system.
How TVFS is Accessed
TVFS manifests are referenced through vfs-* fields in BuildConfig files:
- BuildConfig contains vfs-root and numbered vfs-1 through vfs-N fields
- Each VFS field contains two hashes: content key and encoding key
- The encoding key (second hash) is used to fetch the TVFS manifest from CDN
- The manifest is BLTE-encoded and must be decompressed
- Once decoded, the manifest describes the virtual file system structure
Example from BuildConfig:
vfs-root = fd2ea24073fcf282cc2a5410c1d0baef 14d8c981bb49ed169e8558c1c4a9b5e5
vfs-root-size = 50071 33487
Modern builds contain 1,500+ VFS entries for different product/region/platform combinations.
Overview
TVFS organizes content into namespaces rather than per-build file trees. This allows multiple products and regions to share common assets through a single content-addressed storage layer, with deduplication across products.
Architecture
Namespace Hierarchy
TVFS Root
├── Product Namespace (e.g., "wow")
│ ├── Build Namespace (e.g., "1.15.7.61582")
│ │ ├── Root Files
│ │ └── Content Trees
│ └── Shared Namespace
│ └── Common Assets
└── Global Namespace
└── Cross-Product Assets
File Structure
TVFS manifest is BLTE-encoded:
[BLTE Container]
[Header]
[Namespace Definitions]
[Directory Entries]
[File Entries]
[Content Mappings]
Binary Format
Based on analysis of 5 TVFS samples from WoW builds 11.0.2.56313 through 11.2.0.62748.
TVFS Header
struct TvfsHeader { // 38 bytes minimum, 46 with EST table
uint8_t magic[4]; // "TVFS" (0x54564653)
uint8_t format_version; // Format version (1; agent accepts <= 1)
uint8_t header_size; // Header size (not read by agent parser)
uint8_t ekey_size; // EKey size (always 9)
uint8_t pkey_size; // PKey size (always 9)
uint32_t flags; // Format flags (big-endian)
uint32_t path_table_offset; // Offset to path table (big-endian)
uint32_t path_table_size; // Size of path table (big-endian)
uint32_t vfs_table_offset; // Offset to VFS table (big-endian)
uint32_t vfs_table_size; // Size of VFS table (big-endian)
uint32_t cft_table_offset; // Offset to container file table (big-endian)
uint32_t cft_table_size; // Size of container file table (big-endian)
uint16_t max_depth; // Maximum path depth
// Optional EST fields (only present if TVFS_FLAG_ENCODING_SPEC is set)
uint32_t est_table_offset; // Encoding spec table offset
uint32_t est_table_size; // Encoding spec table size
};
Verified Header Properties:
- Magic bytes: Always “TVFS” (0x54564653) in ASCII
- Format version: Always 1 across all samples
- Header size: 38 bytes minimum, 46 with EST table
- EKey size: 9 bytes (TACT standard)
- PKey size: 9 bytes (TACT standard)
- All multi-byte integer fields are big-endian (NGDP standard)
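The header layout above maps onto fixed byte offsets, so it can be read without a parsing framework. The following is a minimal sketch, not the cascette-formats implementation: the Rust struct, `be32`, and `parse_tvfs_header` are hypothetical names, and offset/size pairs are grouped into tuples for brevity.

```rust
/// Fixed part of the TVFS header plus the optional EST fields.
#[derive(Debug, PartialEq)]
struct TvfsHeader {
    format_version: u8,
    header_size: u8,
    ekey_size: u8,
    pkey_size: u8,
    flags: u32,
    path_table: (u32, u32), // (offset, size)
    vfs_table: (u32, u32),
    cft_table: (u32, u32),
    max_depth: u16,
    est_table: Option<(u32, u32)>,
}

const TVFS_FLAG_ENCODING_SPEC: u32 = 0x02;

/// Reads a big-endian u32 at byte offset `at`.
fn be32(d: &[u8], at: usize) -> Option<u32> {
    let b = d.get(at..at + 4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

fn parse_tvfs_header(d: &[u8]) -> Option<TvfsHeader> {
    // 38 bytes is the minimum header size (no EST fields).
    if d.len() < 38 || &d[0..4] != b"TVFS" {
        return None;
    }
    let flags = be32(d, 8)?;
    let pair = |at: usize| Some((be32(d, at)?, be32(d, at + 4)?));
    // The EST fields at offset 38 exist only when the flag is set.
    let est_table = if flags & TVFS_FLAG_ENCODING_SPEC != 0 {
        Some(pair(38)?)
    } else {
        None
    };
    Some(TvfsHeader {
        format_version: d[4],
        header_size: d[5],
        ekey_size: d[6],
        pkey_size: d[7],
        flags,
        path_table: pair(12)?,
        vfs_table: pair(20)?,
        cft_table: pair(28)?,
        max_depth: u16::from_be_bytes([d[36], d[37]]),
        est_table,
    })
}
```

The function returns `None` for a bad magic, a truncated header, or a header whose flags promise EST fields that are not present.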
Format Flags (Implementation Details):
// TVFS format flags
const TVFS_FLAG_INCLUDE_CKEY: u32 = 0x01; // Include content keys
const TVFS_FLAG_ENCODING_SPEC: u32 = 0x02; // Encoding spec table (EST) present
const TVFS_FLAG_PATCH_SUPPORT: u32 = 0x04; // Patch support enabled
- Value 7 (0x7): Include C-key + Encoding spec + Patch support (all features)
- EST Table Present: When bit 1 (0x02) is set. The agent checks flags & 2 for encoding specifier presence.
- Header Size: 38 bytes minimum (without EST), 46 bytes with EST table fields
Sample Analysis Results:
- File sizes: 49,896 - 50,844 bytes (decompressed)
- All files use identical header format
- Table offsets and sizes are consistent with file structure
- Two retail builds (11.2.0.62706 and 11.2.0.62748) are byte-identical
Table Structure
Path Table (PathTableOffset + PathTableSize):
Recursive prefix tree (trie) encoding file paths. Each entry has:
- Optional 0x00 path separator bytes (before/after name fragments)
- Length-prefixed name fragment (1-byte length + N bytes)
- 0xFF marker followed by 4-byte big-endian NodeValue:
  - Bit 31 set: folder node, lower 31 bits = folder data length (includes the 4-byte NodeValue). Children are inline within that byte range.
  - Bit 31 clear: file node, value = byte offset into the VFS table.
Maximum depth is tracked in the header.
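The NodeValue rule above (bit 31 selects folder or file) is easy to express as an enum. A minimal sketch with hypothetical names:

```rust
/// Decoded form of the 4-byte NodeValue that follows a 0xFF marker.
#[derive(Debug, PartialEq)]
enum NodeValue {
    /// Folder node: length of the folder data, including the 4-byte
    /// NodeValue itself.
    Folder { data_len: u32 },
    /// File node: byte offset of the entry in the VFS table.
    File { vfs_offset: u32 },
}

const FOLDER_BIT: u32 = 0x8000_0000;

fn decode_node_value(bytes: [u8; 4]) -> NodeValue {
    let raw = u32::from_be_bytes(bytes);
    if raw & FOLDER_BIT != 0 {
        // Lower 31 bits carry the folder data length.
        NodeValue::Folder { data_len: raw & !FOLDER_BIT }
    } else {
        NodeValue::File { vfs_offset: raw }
    }
}
```

A recursive path table walker would descend into the next `data_len - 4` bytes for a folder node, and record a file for a file node.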
VFS Table (VfsTableOffset + VfsTableSize):
Span-based entries addressed by byte offset from path table NodeValues. Each entry has:
- span_count (1 byte): 1-224 = file entry, 225-254 = other, 255 = deleted
- Per span (repeated span_count times):
  - file_offset (4 bytes BE): offset within the referenced content
  - span_length (4 bytes BE): content size of this span
  - cft_offset (CftOffsSize bytes BE): byte offset into the CFT
CftOffsSize is computed from cft_table_size using GetOffsetFieldSize:
>0xFFFFFF = 4 bytes, >0xFFFF = 3 bytes, >0xFF = 2 bytes, else 1 byte.
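The threshold rule for offset field widths translates directly into code. This sketch reuses the name from the text in snake case; the signature is an assumption.

```rust
/// Width in bytes of an offset field that must address a table of
/// `table_size` bytes, following the thresholds listed above.
fn get_offset_field_size(table_size: u32) -> usize {
    if table_size > 0xFF_FFFF {
        4
    } else if table_size > 0xFFFF {
        3
    } else if table_size > 0xFF {
        2
    } else {
        1
    }
}
```

For a container file table of 29,645 bytes, the function returns 2, so each cft_offset in the VFS table occupies two bytes.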
Container File Table (CftTableOffset + CftTableSize):
Fixed-stride entries addressed by byte offset from VFS span cft_offset
values. Entry layout depends on header flags:
- EKey (ekey_size bytes): encoding key
- EncodedSize (4 bytes BE): encoded (compressed) size
- CKey (pkey_size bytes): content key (if TVFS_FLAG_INCLUDE_CKEY)
- est_index (EstOffsSize bytes BE): EST entry index (if TVFS_FLAG_ENCODING_SPEC)
- patch_offset (CftOffsSize bytes BE): patch entry offset (if TVFS_FLAG_PATCH_SUPPORT)
EstOffsSize is computed from est_table_size using the same
GetOffsetFieldSize function as CftOffsSize.
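Since every CFT entry in a manifest has the same layout, the entry stride can be computed once from the header flags. A minimal sketch, assuming the field widths have already been derived; `cft_entry_stride` is a hypothetical name.

```rust
const TVFS_FLAG_INCLUDE_CKEY: u32 = 0x01;
const TVFS_FLAG_ENCODING_SPEC: u32 = 0x02;
const TVFS_FLAG_PATCH_SUPPORT: u32 = 0x04;

/// Size in bytes of one container file table entry for the given flags.
fn cft_entry_stride(
    flags: u32,
    ekey_size: usize,
    pkey_size: usize,
    est_offs_size: usize,
    cft_offs_size: usize,
) -> usize {
    // EKey and the 4-byte EncodedSize are always present.
    let mut stride = ekey_size + 4;
    if flags & TVFS_FLAG_INCLUDE_CKEY != 0 {
        stride += pkey_size; // CKey
    }
    if flags & TVFS_FLAG_ENCODING_SPEC != 0 {
        stride += est_offs_size; // est_index
    }
    if flags & TVFS_FLAG_PATCH_SUPPORT != 0 {
        stride += cft_offs_size; // patch_offset
    }
    stride
}
```

With flags 0x07, 9-byte keys, a 1-byte EST index, and a 2-byte CFT offset, the stride is 9 + 4 + 9 + 1 + 2 = 25 bytes.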
Encoding Specifier Table (EST) (Optional, if encoding spec flag is set):
- Contains null-terminated encoding spec strings (same format as the ESpec table in the encoding file)
- Only present if flag bit 1 (0x02) is set
- Required for writing files to underlying storage
- Parsed from est_table_offset for est_table_size bytes
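Splitting the EST byte range into its null-terminated strings takes only a short loop. The sketch below assumes the strings are valid UTF-8, which this section does not state; `parse_est` is a hypothetical name.

```rust
/// Splits an EST byte range into its null-terminated strings.
///
/// Bytes after the final NUL are ignored. Empty strings are kept so
/// that entry positions are preserved. Non-UTF-8 data yields `None`.
fn parse_est(table: &[u8]) -> Option<Vec<String>> {
    let mut specs = Vec::new();
    let mut rest = table;
    while let Some(nul) = rest.iter().position(|&b| b == 0) {
        let s = std::str::from_utf8(&rest[..nul]).ok()?;
        specs.push(s.to_owned());
        rest = &rest[nul + 1..];
    }
    Some(specs)
}
```

The input slice would be the manifest bytes from est_table_offset to est_table_offset + est_table_size.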
Sample Table Sizes (Build 11.2.0.62748):
Path Table: Offset 46, Size 11,814 bytes
VFS Table: Offset 41,527, Size 9,317 bytes
Container Table: Offset 11,882, Size 29,645 bytes
Format Analysis Status
Verified against CascLib and CDN data (WoW Retail, Classic, Classic Era):
- Header format, magic bytes, flags, and table offsets
- Path table recursive prefix tree with 0xFF NodeValue markers
- VFS span-based entries with variable-width CFT offsets
- CFT fixed-stride entries with flag-dependent fields
- EST null-terminated encoding spec strings
- Round-trip parse/build produces structurally equivalent output
Usage
Parsing a TVFS Manifest
use cascette_formats::tvfs::TvfsFile;
// From decompressed data
let tvfs = TvfsFile::parse(&data)?;
// From BLTE-encoded CDN data
let tvfs = TvfsFile::load_from_blte(&blte_data)?;
Enumerating Files
// All file entries from the path table
for file in &tvfs.path_table.files {
    println!("{} -> VFS offset {}", file.path, file.vfs_offset);
}
// With VFS entry details
for (file, vfs_entry) in tvfs.enumerate_files() {
    if let Some(entry) = vfs_entry {
        for span in &entry.spans {
            println!("{}: offset={}, length={}, cft_offset={}",
                file.path, span.file_offset, span.span_length, span.cft_offset);
        }
    }
}
Resolving a Path
// Resolve path -> VFS entry -> CFT entry (EKey)
if let Some(container_entry) = tvfs.resolve_path("path/to/file") {
    println!("EKey: {}", container_entry.ekey_hex());
    if let Some(ckey) = container_entry.content_key_hex() {
        println!("CKey: {}", ckey);
    }
}
Building a TVFS Manifest
use cascette_formats::tvfs::TvfsBuilder;
let mut builder = TvfsBuilder::with_flags(0x07); // CKEY + EST + PATCH
builder.add_est_spec("b:256K*=z".to_string());
builder.add_file(
    "path/to/file".to_string(),
    [0x01; 9],        // ekey
    1024,             // encoded_size
    2048,             // content_size
    Some([0x02; 16]), // content_key
);
let data = builder.build()?;
References
- See Root File for legacy file mapping
- See Encoding Documentation for content resolution
- See Archives for storage details
NGDP Configuration File Formats
This document describes the configuration file formats used in NGDP for managing product versions, CDN endpoints, and content distribution.
Overview
NGDP uses five configuration file types:
- Build Configuration - Defines build metadata and system file references
- CDN Configuration - Lists CDN servers and available archives
- Patch Configuration - Contains delta update information
- Keyring Configuration - Encryption keys for Salsa20 decryption
- Product Configuration - Client installation and platform metadata
Configuration File Access
Configuration files are accessed through CDN endpoints using content-addressed paths derived from hashes returned by the Ribbit API.
Path Structure
Configuration files use a two-level directory structure for efficient CDN distribution:
http://<cdn-host>/<path>/<type>/<hash[0:2]>/<hash[2:4]>/<full-hash>
Where:
- <cdn-host>: CDN server hostname
- <path>: Base path from CDN response (e.g., tpr/wow)
- <type>: Content type (config, data, patch)
- <hash[0:2]>: First 2 characters of hash
- <hash[2:4]>: Characters 3-4 of hash (positions 2-3 in 0-indexed)
- <full-hash>: Complete hash value
Example:
# Build config for wow_classic_era 1.15.7.61582
# Hash: ae66faee0ac786fdd7d8b4cf90a8d5b9
# Note: hash[0:2] = "ae", hash[2:4] = "66"
http://cdn.arctium.tools/tpr/wow/config/ae/66/ae66faee0ac786fdd7d8b4cf90a8d5b9
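The path structure above can be assembled with a small helper. This is a minimal sketch: `cdn_url` is a hypothetical name, the `http` scheme is taken from the example, and the 32-character hex check assumes MD5-sized hashes.

```rust
/// Builds a content-addressed CDN URL, or returns `None` if `hash` is
/// not a 32-character hex string.
fn cdn_url(
    cdn_host: &str,
    base_path: &str,
    content_type: &str,
    hash: &str,
) -> Option<String> {
    // Validating ASCII hex first makes the byte-index slicing below safe.
    if hash.len() != 32 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(format!(
        "http://{}/{}/{}/{}/{}/{}",
        cdn_host,
        base_path,
        content_type,
        &hash[0..2],
        &hash[2..4],
        hash
    ))
}
```

Calling it with `cdn.arctium.tools`, `tpr/wow`, `config`, and the hash from the example reproduces the URL shown above.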
Build Configuration
Build configurations define build-specific metadata and reference all system files required for a build.
Format
Key-value pairs, one per line, with " = " as the delimiter (space, equals sign, space).
Values may contain multiple space-separated tokens (e.g., content key + encoding key).
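A line-oriented reader for this format is short. The sketch below is not the cascette parser: `parse_config` is a hypothetical name, and treating lines that start with `#` as comments is an assumption.

```rust
use std::collections::HashMap;

/// Parses "key = value ..." lines into a map from key to its
/// space-separated tokens. Blank lines, lines starting with '#', and
/// lines without the " = " delimiter are skipped.
fn parse_config(text: &str) -> HashMap<String, Vec<String>> {
    let mut map = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once(" = ") {
            let tokens = value.split_whitespace().map(str::to_owned).collect();
            map.insert(key.to_owned(), tokens);
        }
    }
    map
}
```

For keys that carry a content key and an encoding key, the encoding key is the second token of the returned vector.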
Common Keys
| Key | Description | Example |
|---|---|---|
| root | Root file content key (NOT for direct CDN fetch) | ea8aefdebdbd6429da905c8c6a2b1813 |
| install | Install manifest: content key + encoding key | 54c189d60033f93f42e7b91165e7de1c a9dcee49ab3f952d69441eb3fd91c159 |
| encoding | Encoding file: content key + encoding key (use 2nd for CDN) | b07b881f4527bda7cf8a1a2f99e8622e bbf06e7476382cfaa396cff0049d356b |
| encoding-size | Sizes for encoding file versions | 14004322 14003043 |
| download | Download manifest: content key + encoding key | 42a7bb33cd1e9a7b72bef6ee14719b58 53ba96f0965adc306d2d0cf3b457949c |
| size | Size manifest: content key + encoding key | d1d9e612a645cc7a7e4b42628bde21ce 0d5704735f4985e555907a7e7647099a |
| patch | Patch file content key | 658506593cf1f98a1d9300c418ee5355 |
| patch-config | Patch configuration hash (fetch separately) | 17f5bbcb7eae2fc8fb3ea545c65f74d4 |
| patch-index | Patch index files | 3806f4c7b1f179ce976d7685f9354025 eb5758bd78805f0aabac15cf44ea767c |
| patch-size | Size of patch file | 22837 |
| build-name | Human-readable build identifier | WOW-55646patch1.15.3_ClassicRetail |
| build-uid | Unique build identifier | wow_classic_era |
| build-product | Product identifier | WoW |
| build-playbuild-installer | Installer build number | ngdp:wow_classic_era:55646 |
| build-partial-priority | Partial download priorities | Space-separated list |
| build-num | Build number | 61582 |
| build-num-retail | Retail build number | 61582 |
| build-attributes | Build attribute metadata | Attribute string |
| build-file-db | File database for containerless builds | Hash value |
| build-file-db-size | Size of file database | Size in bytes |
| client-version | Client version string | Version string |
| feature-placeholder | Feature placeholder flag | true or absent |
| feature-use-hardlinks | Enable hard link support | true or absent |
| no-frame-encoding | Disable frame encoding (sets v3.0.0) | true or absent |
| vfs-root-espec | ESpec for VFS root manifest | ESpec string |
| install-high-ver | High-version install manifest hash | Hash value |
| install-high-ver-size | Size of high-version install | Size in bytes |
| key-layout-index-bits | Static key layout index bits | Numeric value |
VFS (Virtual File System) Keys
Modern WoW builds (8.2+) include VFS fields that reference TVFS (TACT Virtual File System) manifests:
| Key | Description | Example |
|---|---|---|
| vfs-root | Main TVFS manifest: content key + encoding key | fd2ea24073fcf282cc2a5410c1d0baef 14d8c981bb49ed169e8558c1c4a9b5e5 |
| vfs-root-size | Sizes for TVFS root manifest | 50071 33487 |
| vfs-1 through vfs-N | Additional TVFS manifests for different products/regions | Same format as vfs-root |
| vfs-N-size | Size for corresponding VFS manifest | Same format as vfs-root-size |
| vfs-N-espec | Encoding spec for corresponding VFS manifest | ESpec string |
Important: Each vfs-* field points to a TVFS manifest file that contains
the virtual file system structure. These manifests are BLTE-encoded and
fetched using the encoding key (second hash). See TVFS documentation
for manifest format details.
Modern builds can have 1,500+ VFS fields representing different:
- Product variants (retail, PTR, beta)
- Language/region combinations
- Platform-specific configurations
- Feature flags and optional content
Example
# Build Configuration for wow_classic_era 1.15.7.61582
# URL: http://cdn.arctium.tools/tpr/wow/config/ae/66/ae66faee0ac786fdd7d8b4cf90a8d5b9
root = ea8aefdebdbd6429da905c8c6a2b1813
install = 54c189d60033f93f42e7b91165e7de1c a9dcee49ab3f952d69441eb3fd91c159
install-size = 23038 22281
download = 42a7bb33cd1e9a7b72bef6ee14719b58 53ba96f0965adc306d2d0cf3b457949c
download-size = 5606744 4818287
size = d1d9e612a645cc7a7e4b42628bde21ce 0d5704735f4985e555907a7e7647099a
size-size = 3637629 3076687
encoding = b07b881f4527bda7cf8a1a2f99e8622e bbf06e7476382cfaa396cff0049d356b
encoding-size = 14004322 14003043
patch-index = 5472ee24b5b9d148acfd2a436fc514be 76ce88ecb704dc93849def9fb489a6fb
patch-index-size = 16783 6591
patch = 4f185b4a837d4a363b2490432aaef092
patch-size = 11017
patch-config = 474b9630df5b46df5d98ec27c5f78d07
build-name = WOW-61582patch1.15.7_ClassicRetail
build-uid = wow_classic_era
build-product = WoW
build-playbuild-installer = ngdptool_casc2
Critical Implementation Note
ENCODING KEY VS CONTENT KEY:
- Most build config entries have TWO hashes: <content-key> <encoding-key>
- The content key (first hash) is the unencoded file identifier
- The encoding key (second hash) is what you use for CDN fetches
- EXCEPTION: The encoding file itself can be fetched directly using its encoding key
File Fetch Process:
- Fetch encoding file using its encoding key: bbf06e7476382cfaa396cff0049d356b
- Parse encoding file to find encoding keys for other files
- Use those encoding keys to fetch files from CDN
- The root file CANNOT be fetched using ea8aefdebdbd6429da905c8c6a2b1813 directly
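To make the content key versus encoding key distinction concrete, here is a minimal sketch of a key-value config parser and a helper that picks the hash suitable for a CDN fetch. The names `parse_config` and `cdn_fetch_key` are hypothetical and are not taken from cascette-rs; the sketch assumes each entry sits on a single line with the ` = ` delimiter described above.

```rust
use std::collections::HashMap;

/// Parses `key = value` lines into a map of key -> space-separated tokens.
/// Blank lines and lines starting with `#` are skipped.
fn parse_config(text: &str) -> HashMap<String, Vec<String>> {
    let mut map = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once(" = ") {
            let tokens = value.split_whitespace().map(str::to_string).collect();
            map.insert(key.trim().to_string(), tokens);
        }
    }
    map
}

/// Returns the encoding key (second hash) of an entry, if it has one.
/// Entries with a single hash, such as `root`, yield None because a
/// content key must first be resolved through the encoding file.
fn cdn_fetch_key<'a>(config: &'a HashMap<String, Vec<String>>, name: &str) -> Option<&'a str> {
    config.get(name)?.get(1).map(String::as_str)
}
```

With the example build config, `cdn_fetch_key(&config, "encoding")` would return `bbf06e7476382cfaa396cff0049d356b`, while `cdn_fetch_key(&config, "root")` returns `None`.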
Notes
- Multiple encoding/size entries support different compression levels
- Patch-config reference enables delta updates between builds
- Build-partial-priority lists files for streaming installation
Static Key Layouts
Build configs can contain key-layout-<number> entries that define static
data layout schemes. Each key layout has sub-fields:
- Chunk Bits: Number of bits for chunk addressing
- Archive Bits: Number of bits for archive addressing
- Offset Bits: Number of bits for offset addressing
- Alignment: Data alignment requirement
The key-layout-index-bits field in the build config specifies the number
of index bits for the static key layout system.
Chunk System
Build configs can reference chunk-<number> entries. Chunks are associated
with archives and use a bits-based addressing system. The agent validates
that chunk identifiers follow the chunk-<number> naming pattern.
Hard Link Entries
Build configs can contain hard link entries. The agent validates the format of these entries and uses them for storage optimization on file systems that support hard links.
Manifest Validation
The agent validates that each manifest type (download, install, size, encoding) has matching C-Key/C-Size and E-Key/E-Size pairs. If a size field is specified, the corresponding key must also be present.
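A rough sketch of this check is shown below. It is an interpretation, not the agent's actual logic: it assumes the config has already been parsed into a map of key to tokens, and the function name `validate_manifest_pairs` is hypothetical. It rejects a size entry without its key entry and a mismatch in the number of hashes and sizes.

```rust
use std::collections::HashMap;

/// Checks that every `<name>-size` entry has a matching `<name>` entry
/// and that both carry the same number of tokens (C-Key/C-Size, E-Key/E-Size).
fn validate_manifest_pairs(config: &HashMap<String, Vec<String>>) -> Result<(), String> {
    for name in ["download", "install", "size", "encoding"] {
        let size_name = format!("{}-size", name);
        match (config.get(name), config.get(size_name.as_str())) {
            // A size without its key is invalid.
            (None, Some(_)) => {
                return Err(format!("{} is set but {} is missing", size_name, name));
            }
            // Each hash needs exactly one size.
            (Some(keys), Some(sizes)) if keys.len() != sizes.len() => {
                return Err(format!(
                    "{} has {} hashes but {} has {} sizes",
                    name,
                    keys.len(),
                    size_name,
                    sizes.len()
                ));
            }
            _ => {}
        }
    }
    Ok(())
}
```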
CDN Configuration
CDN configurations list available CDN servers and archive files.
CDN Configuration Format
Key-value pairs with special handling for multi-value keys.
Keys
| Key | Description | Format |
|---|---|---|
| archives | List of archive hashes | Space-separated |
| archive-group | Group identifier for archives | Single hash |
| patch-archives | List of patch archive hashes | Space-separated |
| patch-archive-group | Group identifier for patch archives | Single hash |
| file-index | File index hash | Single hash |
| file-index-size | Size of file index | Integer |
| patch-file-index | Patch file index hash | Single hash |
| patch-file-index-size | Size of patch file index | Integer |
| archives-index-size | Sizes of archive index files | Space-separated integers |
| archive-group-index-size | Size of archive group index | Integer |
| patch-archives-index-size | Sizes of patch archive index files | Space-separated integers |
| patch-archive-group-index-size | Size of patch archive group index | Integer |
| builds | Reference to builds using this CDN config | Space-separated |
CDN Configuration Example
# CDN Configuration for wow_classic_era 1.15.7.61582
# URL: http://cdn.arctium.tools/tpr/wow/config/63/ee/63eee50d456a6ddf3b630957c024dda0
# (Showing first 10 archives of 1000+)
archives = 0017a402f556fbece46c38dc431a2c9b 003b147730a109e3a480d32a54280955 \
00b79cc0eebdd26437c7e92e57ac7f5c 00e43d6a55fe497ebaecece75c464913 \
00f71443fef647344027dd37beda651f 0105f03cb8b8faceda8ea099c2f2f476 \
0128ec2c42df9e7ac7b58a54ad902147 01794f476dce0d0adeb975eaff4ff850 \
01df479cca2ad2a8991bac020db5287e 01f0908f6ece2f26d918d1665f919222
archive-group = 58a3c9e02c964b0ec9dd6c085df99a77
patch-archives = 01c87e5f5e87ffc088c3fe20a7e332ce 0239bc973b31a4e52e8c96652a14b9e0 \
034e2e6e0e5cdecb0f0bc07e87f0e074 04f8e6c8cbfbd6e9fd3e9ccbcd95e53a \
0662e1cf69dbd0c6c10e7e3e6303b8cf 0bffd45f01e8ad33731f973bb96f3db1 \
0d17c61fa98e6db91e14e0b24c8bc9f9 0d47f019c36e88c00fc43b3fe973f3d1 \
101e4f7b592c12bf3c436d3b95e38b8f 1027ab37f63c039a8a3dd8a039e43e81
patch-archive-group = de09c9cd5f93c4e4f6f1f0f4a8edb9c0
file-index = fb37bc7303bae99d6c57e96a079e2c77
file-index-size = 34236152
patch-file-index = eb99f93d5c8dbdbb652f1d71da9c7de6
patch-file-index-size = 5015068
builds = ae66faee0ac786fdd7d8b4cf90a8d5b9
Archive Management
- Archives are immutable once created
- New content creates new archives
- Archive-group combines multiple archives for efficient access
- File-index provides fast lookups across all archives
Patch Configuration
Patch configurations define delta updates between builds. They are referenced
within build configurations using the patch-config field and contain detailed
patch entry definitions.
Access Pattern
Patch configs are accessed through:
- Fetch build config
- Extract patch-config hash from build config
- Fetch patch config using standard config path structure
Patch Configuration Format
Text format with metadata and multiple patch-entry lines.
Patch Entry Format
patch-entry = <type> <content-key> <size> <encoding-key> <encoded-size>
[compression-info] [additional-keys...]
Fields
| Field | Description |
|---|---|
| type | File type (encoding, install, download, size, vfs:*) |
| content-key | Target content key |
| size | Target file size |
| encoding-key | Encoded version key |
| encoded-size | Encoded file size |
| compression-info | Compression blocks (e.g., b:{11=n,4813402=n,793331=z}) |
| additional-keys | Alternative encoding keys and sizes |
The agent validates patch-entry fields including target ESpec validation.
Patch config parsing uses structured per-entry validation.
Patch Configuration Example
# Patch Configuration for wow_classic 1.13.7.38631
# URL: http://cdn.arctium.tools/tpr/wow/config/17/f5/17f5bbcb7eae2fc8fb3ea545c65f74d4
# (Showing metadata and sample entries)
# Patch Configuration
patch = 658506593cf1f98a1d9300c418ee5355
patch-size = 22837
patch-entry = download 6d616efdfd334916898276805f043927 6113132 \
64332f9899b6d42a939fa3e02080bf33 5528795 b:{16=n,5524659=n,588457=z} \
0a45352357be8ddca09749ec421bbb48 6112126 50ac209d796a11818da1429d6cb69c60 12502
patch-entry = encoding fcf166e21580ee48497b4d85e433b900 13084283 \
716906f960db61ea62f07f7e9697127d 13082541 \
b:{22=n,2574=z,61216=n,7835648=n,40192=n,5144576=n,*=z} \
5905362dbda48cebbea7c80d05ef6c60 13084283 ce2c3294ca7e37aa3be1f227bdc9072a 89156
patch-entry = install 179088c6b3495b1a9dec3715e77834e1 15565 \
a75d4aa7e38dff6a1ddc59bd80c2ad3c 15197 b:{610=z,14955=n} \
f66d038c20f580be307f4645c7b5d3f2 15633 072a9339d594a00c884ffea987381883 486
patch-entry = size 5841844a1a1ad48eaeb756c716869bf5 3248493 \
d06fc7a7e4b5d8fb138a2ee27f54674f 2878957 b:{15=n,588457=z,64K*=n} \
2061f6427c842d01d9445d1bcc58d65b 3247949 daccd8bf9f2719ea9dbbb57991a03ed7 452303
Compression Info Format
The b:{...} notation describes block compression:
- n = uncompressed block
- z = zlib compressed block
- Numbers are block sizes in bytes (the values in the examples are not increasing, so they are sizes rather than offsets)
- * = all remaining data
- 64K* = repeated 64KB blocks
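The notation can be parsed with a few string operations. The sketch below is illustrative only: the `BlockSize` type and both function names are hypothetical, and it handles just the forms listed above (plain sizes, a `K` suffix, a trailing `*`, and the `n`/`z` modes). Anything else, including richer ESpec forms, is rejected.

```rust
#[derive(Debug, PartialEq)]
enum BlockSize {
    Bytes(u64),     // e.g. "4813402": one block of this many bytes
    Repeating(u64), // e.g. "64K*": blocks of this size, repeated
    Remainder,      // "*": everything that is left
}

/// Parses the left-hand side of one `size=mode` pair.
fn parse_size(spec: &str) -> Option<BlockSize> {
    if spec == "*" {
        return Some(BlockSize::Remainder);
    }
    let (number, repeating) = match spec.strip_suffix('*') {
        Some(rest) => (rest, true),
        None => (spec, false),
    };
    let (digits, unit) = match number.strip_suffix('K') {
        Some(rest) => (rest, 1024u64),
        None => (number, 1u64),
    };
    let bytes = digits.parse::<u64>().ok()?.checked_mul(unit)?;
    Some(if repeating {
        BlockSize::Repeating(bytes)
    } else {
        BlockSize::Bytes(bytes)
    })
}

/// Parses a `b:{...}` string into (size, mode) pairs, where mode is 'n' or 'z'.
fn parse_block_spec(spec: &str) -> Option<Vec<(BlockSize, char)>> {
    let inner = spec.strip_prefix("b:{")?.strip_suffix('}')?;
    inner
        .split(',')
        .map(|part| {
            let (size, mode) = part.split_once('=')?;
            let mode = match mode {
                "n" => 'n',
                "z" => 'z',
                _ => return None, // unknown mode: reject the whole spec
            };
            Some((parse_size(size)?, mode))
        })
        .collect()
}
```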
Entry Types
Patch configs commonly include:
- System files: download, encoding, install, size, patch-index
- VFS entries: vfs:* with hexadecimal identifiers (e.g., vfs:000000040000::)
- Metadata: patch and patch-size fields for the patch file itself
Availability
Patch configs are found in:
- Classic WoW builds (1.13.x through 5.5.x)
- Older retail builds (pre-8.0)
- Rarely in modern builds (mostly replaced by direct patching)
Keyring Configuration
Keyring configurations contain encryption keys for decrypting protected CASC content. Each entry maps an 8-byte key ID to a 16-byte Salsa20 encryption key.
Discovery
Keyring config hashes are in the Ribbit versions response KeyRing column,
NOT in build configs. The config is fetched from CDN using the standard config
path structure.
Format
Same key-value format as other configs, with = delimiter. Each entry uses
the key- prefix followed by a hex-encoded key ID.
key-{KEY_ID_HEX} = {KEY_VALUE_HEX}
Where:
- KEY_ID_HEX: 16 hex characters (8 bytes) identifying the encryption key
- KEY_VALUE_HEX: 32 hex characters (16 bytes) Salsa20 encryption key
Example
key-4eb4869f95f23b53 = c9316739348dcc033aa8112f9a3acf5d
Validation
Agent.exe (tact::ConfigReader::ValidateKeyringConfig at 0x6e7020) requires
at least one key entry. Duplicate key IDs with different values produce a
warning and the duplicate is ignored (first entry wins).
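The format and validation rules above can be sketched as follows. This is not the Agent.exe implementation: `parse_keyring` and `hex_bytes` are hypothetical names, key ID bytes are stored in the order they are written (an assumption), and duplicates are dropped silently rather than logged.

```rust
use std::collections::HashMap;

/// Decodes exactly N bytes from a hex string, or returns None.
fn hex_bytes<const N: usize>(hex: &str) -> Option<[u8; N]> {
    if hex.len() != N * 2 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; N];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[i * 2..i * 2 + 2], 16).ok()?;
    }
    Some(out)
}

/// Parses `key-{KEY_ID_HEX} = {KEY_VALUE_HEX}` lines into a key ID -> key map.
fn parse_keyring(text: &str) -> Result<HashMap<[u8; 8], [u8; 16]>, String> {
    let mut keys = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let (name, value) = line
            .split_once('=')
            .ok_or_else(|| format!("malformed line: {}", line))?;
        let id_hex = name
            .trim()
            .strip_prefix("key-")
            .ok_or_else(|| format!("not a key entry: {}", line))?;
        let id: [u8; 8] = hex_bytes(id_hex).ok_or_else(|| format!("bad key ID: {}", id_hex))?;
        let key: [u8; 16] =
            hex_bytes(value.trim()).ok_or_else(|| format!("bad key value for {}", id_hex))?;
        // First entry wins; a later duplicate is ignored.
        keys.entry(id).or_insert(key);
    }
    if keys.is_empty() {
        return Err("keyring has no key entries".to_string());
    }
    Ok(keys)
}
```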
Usage
Keys are loaded into a hash map by tact::KeyGetter::LoadKeyring. During BLTE
decryption, the 8-byte key ID from the encrypted block header is used to look
up the 16-byte Salsa20 decryption key.
Distribution
Keyring sizes vary by product:
- WoW Retail: 1 key entry
- Call of Duty (Odin): 1 key entry
- Overwatch 2: 63 key entries (largest observed)
- WoW Classic products: empty keyrings (no KeyRing column in versions response)
Product Configuration
Product configurations contain Battle.net client metadata for installation and platform requirements.
Note: Product config hashes are present in Ribbit/Wago data, and the actual
config files are accessible via CDN using the /tpr/configs/data/ path
structure, as demonstrated in the examples below.
Product Configuration Format
JSON object with nested configuration sections.
Structure
{
"all": {
"config": {
// Global configuration
}
},
"platform": {
"win": { /* Windows-specific */ },
"mac": { /* macOS-specific */ }
},
"<locale>": {
"config": {
// Locale-specific configuration
}
}
}
Product Configuration Example
// Product Configuration for WoW 11.2.0.62748
// URL: http://cdn.arctium.tools/tpr/configs/data/53/02/53020d32e1a25648c8e1eafd5771935f
{
"all": {
"config": {
"product": "WoW",
"update_method": "ngdp",
"data_dir": "Data/",
"supports_multibox": true,
"supports_offline": false,
"supported_locales": ["enUS", "esMX", "ptBR", "deDE", "esES", "frFR"],
"display_locales": ["enUS", "esMX", "ptBR", "frFR", "deDE", "esES"],
"shared_container_default_subfolder": "_retail_",
"enable_block_copy_patch": true
}
},
"platform": {
"win": {
"config": {
"binaries": {
"game": {
"relative_path": "WoW.exe",
"relative_path_arm64": "Wow-ARM64.exe"
}
},
"min_spec": {
"default_required_cpu_speed": 2600,
"default_required_ram": 2048,
"default_requires_64_bit": true
}
}
},
"mac": {
"config": {
"binaries": {
"game": {
"relative_path": "World of Warcraft.app"
}
},
"min_spec": {
"default_required_cpu_speed": 2200,
"default_required_ram": 2048
}
}
}
},
"enus": {
"config": {
"install": [{
"start_menu_shortcut": {
"link": "%commonstartmenu%World of Warcraft/World of Warcraft.lnk",
"target": "%shortcutpath%",
"description": "Click here to play World of Warcraft."
}
}]
}
}
// ... additional locales ...
}
Global Configuration Keys
| Key | Description | Type |
|---|---|---|
| product | Product identifier | String |
| update_method | Update protocol | “ngdp” |
| data_dir | Data directory path | String |
| supported_locales | Available languages | Array |
| display_locales | UI languages | Array |
| launch_arguments | Default launch args | Array |
| supports_multibox | Multiple instances | Boolean |
| supports_offline | Offline play | Boolean |
| enable_block_copy_patch | Block-level patching | Boolean |
| shared_container_default_subfolder | Shared data path | String |
Platform Configuration
{
"platform": {
"win": {
"config": {
"binaries": {
"game": {
"relative_path": "WoWClassic.exe",
"relative_path_arm64": "WowClassic-arm64.exe",
"launch_arguments": []
}
},
"min_spec": {
"default_required_cpu_cores": 1,
"default_required_cpu_speed": 2600,
"default_required_ram": 2048,
"default_requires_64_bit": true,
"required_osspecs": {
"6.1": { "required_subversion": 0 }
}
},
"form": {
"game_dir": {
"default": "Program Files",
"required_space": 11500000000,
"space_per_extra_language": 2000000000
}
}
}
}
}
}
Locale Configuration
{
"enus": {
"config": {
"install": [{
"desktop_shortcut": {
"link": "%desktoppreference%World of Warcraft Classic.lnk",
"target": "%shortcutpath%",
"description": "Click here to play World of Warcraft.",
"args": "--productcode=wow_classic_era"
}
}]
}
}
}
Installation Variables
Product configs use variables resolved by Battle.net:
| Variable | Description |
|---|---|
| %installpath% | Game installation directory |
| %binarypath% | Executable path |
| %shortcutpath% | Launcher path |
| %desktoppreference% | User desktop path |
| %commonstartmenu% | Start menu path |
| %titlepath% | Product root directory |
| %game% | Game data directory |
| %locale% | Current locale |
| %uid% | Unique installation ID |
Parser Implementation Status
Python Parser (cascette-py)
Status: Complete
Capabilities:
- Fetches patch configs from build config references
- Parses patch entry format with compression info
- Analyzes entry types (system files, VFS entries)
- Supports both patch and product config examination
- Handles standard CDN path structure
Verified Against:
- WoW Classic 1.13.7.38631 patch config
- WoW Classic 4.4.2.60142 patch config (205 entries)
- WoW Classic 5.5.0.62655 patch config
Known Issues:
- None identified - both product and patch configs successfully fetched
- Requires fetching build config first to get patch-config hash
See https://github.com/wowemulation-dev/cascette-py for the Python implementation.
Product Configuration Status Summary
ProductConfig contains product-specific metadata and installation parameters. These are referenced in Ribbit responses and are accessible via CDN.
Status: Available via CDN using /tpr/configs/data/ path structure
Format: JSON
Purpose: Product metadata, platform settings, feature flags
Known Fields (from Ribbit)
- Product configuration hash (16 bytes hex)
- Associated with specific product versions
- May be embedded in client or launcher
Configuration Discovery Flow
- Ribbit Query: Get version and CDN information
- Version Lookup: Find build configuration hash and keyring hash
- Build Config: Fetch build metadata and system files
- CDN Config: Get archive lists and CDN servers
- Keyring Config: Fetch encryption keys (if KeyRing column present)
- Patch Config: Retrieve update paths (rarely available)
- Product Config: Client installation metadata (fetched via the /tpr/configs/data/ path)
Implementation Considerations
Parsing
- Build/CDN/Patch/Keyring configs: Simple key-value parser
- Product config: JSON parser
- Handle comments (lines starting with #)
- Support multi-value fields (comma or space separated)
Caching
- Configuration files are immutable (content-addressed)
- Cache indefinitely once fetched
- Validate using content hash
Error Handling
- Retry failed fetches with exponential backoff
- Fall back to alternate CDN servers
- Validate configuration completeness
Security
- Verify content hashes match expected values
- Use HTTPS when available
- Validate file sizes before download
NGDP/TACT Patch System
The NGDP patch system enables incremental updates between game versions using differential patches.
Patch System Architecture
The patch system uses a multi-tier structure:
- Patch Manifests (PA files in /patch/): Index files listing patches between builds
- Patch Archives (ZBSDIFF files in /patch/): Actual differential patch data
- Intermediate Results (in /data/): Results of applying patches in a chain
Patch File Locations
According to wowdev.wiki, the directories are:
- /config/: Build configs, CDN configs, and Patch configs
- /data/: Archives, indexes, and unarchived files (binaries, media, root, install, download)
- /patch/: Patch manifests, patch files, patch archives, patch indexes
Specifically:
- Patch Manifests: https://cdn.host/tpr/wow/patch/{hash[:2]}/{hash[2:4]}/{hash}
  - PA (Patch Archive) format files containing patch entry indices
  - Referenced by patch field in build configs
- Patch Archives: https://cdn.host/tpr/wow/patch/{hash[:2]}/{hash[2:4]}/{hash}
  - ZBSDIFF1 format differential patch files stored in archives
  - Found in patch-entry lines (the patch_hash values)
  - Stored in archives just like regular data files
- Patch Archive Indices: https://cdn.host/tpr/wow/patch/{hash[:2]}/{hash[2:4]}/{hash}.index
  - Index files for patch archives using the same format as data archive indices
  - Map content hashes to locations within patch archives
  - Referenced by patch-archives-index field in CDN configs
  - Use IndexType::Patch (offset_bytes = 0) in the footer
- Patch Results: https://cdn.host/tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}
  - Intermediate or final results of applying patches
  - BLTE-encoded files with DL/EN/IN signatures for manifest types
- Patch Configurations: https://cdn.host/tpr/wow/config/{hash[:2]}/{hash[2:4]}/{hash}
  - Text configs with patch-entry lines describing patch chains
  - Referenced by patch-config field in build configs
Patch Manifest Format
Patch manifests use the PA (Patch Archive) format. All numeric fields are big-endian throughout (header, block table, and block data).
Header Structure (10 bytes)
struct PatchArchiveHeader { // 10 bytes, big-endian
uint8_t magic[2]; // "PA" (0x5041)
uint8_t version; // Format version (1-2)
uint8_t file_key_size; // Target file CKey size (1-16, typically 16)
uint8_t old_key_size; // Source file EKey size (1-16, typically 16)
uint8_t patch_key_size; // Patch EKey size (1-16, typically 16)
uint8_t block_size_bits; // Block size as power of 2 (range [12, 24])
uint16_t block_count; // Number of blocks (big-endian)
uint8_t flags; // Format flags (see below)
};
Flags:
- Bit 0 (0x01): Plain data mode (informational, Agent.exe logs but does not reject)
- Bit 1 (0x02): Extended header present with encoding info. All known CDN patch manifests have this flag set.
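As an illustration of the 10-byte layout, the following sketch reads the header from a byte slice. The Rust struct and function names are hypothetical, and treating the documented ranges as hard validation errors is an assumption made for this example.

```rust
#[derive(Debug, PartialEq)]
struct PatchArchiveHeader {
    version: u8,
    file_key_size: u8,
    old_key_size: u8,
    patch_key_size: u8,
    block_size_bits: u8,
    block_count: u16,
    flags: u8,
}

impl PatchArchiveHeader {
    /// Bit 1 (0x02) signals that the extended header follows.
    fn has_extended_header(&self) -> bool {
        self.flags & 0x02 != 0
    }
}

fn parse_pa_header(data: &[u8]) -> Result<PatchArchiveHeader, String> {
    if data.len() < 10 {
        return Err(format!("need 10 header bytes, got {}", data.len()));
    }
    if !data.starts_with(b"PA") {
        return Err("missing PA magic".to_string());
    }
    let header = PatchArchiveHeader {
        version: data[2],
        file_key_size: data[3],
        old_key_size: data[4],
        patch_key_size: data[5],
        block_size_bits: data[6],
        block_count: u16::from_be_bytes([data[7], data[8]]), // big-endian
        flags: data[9],
    };
    let key_sizes = [header.file_key_size, header.old_key_size, header.patch_key_size];
    if !(1..=2).contains(&header.version) {
        return Err(format!("unsupported version {}", header.version));
    }
    if key_sizes.iter().any(|size| !(1..=16).contains(size)) {
        return Err("key size outside 1..=16".to_string());
    }
    if !(12..=24).contains(&header.block_size_bits) {
        return Err("block_size_bits outside 12..=24".to_string());
    }
    Ok(header)
}
```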
Extended Header (when flags & 0x02)
Present immediately after the 10-byte header. Contains encoding file metadata for the patch manifest:
struct PatchArchiveEncodingInfo {
uint8_t encoding_ckey[file_key_size]; // Encoding file CKey
uint8_t encoding_ekey[file_key_size]; // Encoding file EKey
uint32_t decoded_size; // Decoded size (big-endian)
uint32_t encoded_size; // Encoded size (big-endian)
uint8_t espec_length; // Length of ESpec string
uint8_t espec[espec_length]; // ESpec (length-prefixed, NOT null-terminated)
};
Block Table
Follows the header (or extended header if present). Each entry has a
fixed size of file_key_size + 20 bytes:
struct BlockTableEntry { // file_key_size + 20 bytes per entry
uint8_t last_file_ckey[file_key_size]; // Last (highest) CKey in this block
uint8_t block_md5[16]; // MD5 hash of block data
uint32_t block_offset; // Absolute byte offset (big-endian)
};
The block table is sorted by last_file_ckey. Agent.exe validates sort
order using _memcmp during parsing.
Block Data
At each block_offset, file entries are stored as variable-length records
terminated by a 0x00 sentinel byte:
// Repeat until num_patches == 0:
struct FileEntry {
uint8_t num_patches; // 0 = end of block
uint8_t target_ckey[file_key_size]; // Target file CKey
uint8_t decoded_size[5]; // uint40, big-endian
// Followed by num_patches patch records:
struct {
uint8_t source_ekey[old_key_size]; // Source file EKey
uint8_t source_decoded_size[5]; // uint40, big-endian
uint8_t patch_ekey[patch_key_size]; // Patch data EKey
uint32_t patch_size; // Patch data size (big-endian)
uint8_t patch_index; // Ordering hint
} patches[num_patches];
};
uint8_t end_marker = 0; // Sentinel byte
Decoded sizes use uint40 (5-byte big-endian) to support files up to ~1 TB.
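A minimal sketch of walking one block follows, assuming the block bytes and the three key sizes from the header are already at hand. The `Reader` helper, the Rust struct names, and `parse_block` are hypothetical; the field order follows the layout given above.

```rust
#[derive(Debug)]
struct PatchRecord {
    source_ekey: Vec<u8>,
    source_decoded_size: u64,
    patch_ekey: Vec<u8>,
    patch_size: u32,
    patch_index: u8,
}

#[derive(Debug)]
struct FileEntry {
    target_ckey: Vec<u8>,
    decoded_size: u64,
    patches: Vec<PatchRecord>,
}

struct Reader<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    /// Takes the next `n` bytes, or None if the input is truncated.
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let end = self.pos.checked_add(n)?;
        let slice = self.data.get(self.pos..end)?;
        self.pos = end;
        Some(slice)
    }

    /// Reads an `n`-byte big-endian unsigned integer (n <= 8), e.g. n = 5 for uint40.
    fn uint(&mut self, n: usize) -> Option<u64> {
        Some(self.take(n)?.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b)))
    }
}

/// Parses file entries until the 0x00 sentinel; None means truncated input.
fn parse_block(data: &[u8], file_key: usize, old_key: usize, patch_key: usize) -> Option<Vec<FileEntry>> {
    let mut reader = Reader { data, pos: 0 };
    let mut entries = Vec::new();
    loop {
        let num_patches = reader.uint(1)?;
        if num_patches == 0 {
            return Some(entries); // end-of-block sentinel
        }
        let target_ckey = reader.take(file_key)?.to_vec();
        let decoded_size = reader.uint(5)?;
        let mut patches = Vec::new();
        for _ in 0..num_patches {
            patches.push(PatchRecord {
                source_ekey: reader.take(old_key)?.to_vec(),
                source_decoded_size: reader.uint(5)?,
                patch_ekey: reader.take(patch_key)?.to_vec(),
                patch_size: reader.uint(4)? as u32,
                patch_index: reader.uint(1)? as u8,
            });
        }
        entries.push(FileEntry { target_ckey, decoded_size, patches });
    }
}
```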
Compression Info Format
The compression info string describes byte ranges and their compression:
- Format: {size=method,size=method,...,*=default}
- Methods: n (none), z (zlib)
- Example: {22=n,10044521=z,734880=n,*=z}
Build Config References
Build configurations reference patches through:
- patch: Main patch manifest hash
- patch-size: Size of patch manifest
- patch-index: Patch index files
- patch-config: Patch configuration hash
Patch Configuration
Patch configs contain patch-entry lines describing patch chains between file
versions.
Patch Entry Format
patch-entry = type old_hash old_size new_hash new_size compression_info
[result_hash result_size patch_hash patch_size]+
Components:
- type: Manifest type (download, encoding, install, size, vfs:, etc.)
- old_hash: MD5 of original file content
- old_size: Size of original file
- new_hash: MD5 of final patched content
- new_size: Size of final file
- compression_info: Compression specification (e.g., b:{11=n,8183230=n,1255589=z})
- Followed by repeating groups of:
  - result_hash: MD5 of intermediate/final result (stored in /data/)
  - result_size: Size of result file
  - patch_hash: MD5 of ZBSDIFF patch file (stored in /patch/)
  - patch_size: Size of patch file
Patch Chain Example
patch-entry = download 6afd6862... 9438830 d29e5263... 8190785 b:{...} \
557b46d1... 15384969 08c046c8... 1623773 \
4ebf89a1... 15384925 e960d26b... 1623636
This describes a chain:
- Apply patch 08c046c8 to original 6afd6862 → result 557b46d1
- Apply patch e960d26b to result 557b46d1 → result 4ebf89a1
- Continue until reaching final d29e5263
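One way to turn a patch-entry line into chain steps is sketched below. It follows the field layout given in this section, assumes the entry has already been joined into a single line (no trailing backslashes), and uses hypothetical type and function names. Hashes are kept as strings and are not validated.

```rust
#[derive(Debug, PartialEq)]
struct PatchStep {
    result_hash: String,
    result_size: u64,
    patch_hash: String,
    patch_size: u64,
}

#[derive(Debug)]
struct PatchEntry {
    kind: String,
    old_hash: String,
    old_size: u64,
    new_hash: String,
    new_size: u64,
    compression_info: String,
    steps: Vec<PatchStep>,
}

fn parse_patch_entry(line: &str) -> Option<PatchEntry> {
    let value = line.trim().strip_prefix("patch-entry = ")?;
    let tokens: Vec<&str> = value.split_whitespace().collect();
    // Six fixed fields, then at least one group of four.
    if tokens.len() < 10 || (tokens.len() - 6) % 4 != 0 {
        return None;
    }
    let steps = tokens[6..]
        .chunks(4)
        .map(|group| {
            Some(PatchStep {
                result_hash: group[0].to_string(),
                result_size: group[1].parse().ok()?,
                patch_hash: group[2].to_string(),
                patch_size: group[3].parse().ok()?,
            })
        })
        .collect::<Option<Vec<_>>>()?;
    Some(PatchEntry {
        kind: tokens[0].to_string(),
        old_hash: tokens[1].to_string(),
        old_size: tokens[2].parse().ok()?,
        new_hash: tokens[3].to_string(),
        new_size: tokens[4].parse().ok()?,
        compression_info: tokens[5].to_string(),
        steps,
    })
}
```

Each element of `steps` then corresponds to one "apply patch, obtain result" step of the chain.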
ZBSDIFF1 Format (Zlib-compressed Binary Differential)
ZBSDIFF1 is the binary differential patch format used by NGDP/TACT for efficient file updates:
Header (32 bytes, little-endian)
struct ZbsdiffHeader {
uint8_t signature[8]; // "ZBSDIFF1"
int64_t control_size; // Size of compressed control block (little-endian)
int64_t diff_size; // Size of compressed diff block (little-endian)
int64_t output_size; // Size of final output file (little-endian)
};
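The header can be read with the standard library alone. The sketch below is illustrative: the Rust names are hypothetical, and rejecting negative sizes is an assumption made here rather than documented Agent.exe behavior.

```rust
#[derive(Debug, PartialEq)]
struct ZbsdiffHeader {
    control_size: i64,
    diff_size: i64,
    output_size: i64,
}

fn parse_zbsdiff_header(data: &[u8]) -> Result<ZbsdiffHeader, String> {
    if data.len() < 32 {
        return Err(format!("need 32 header bytes, got {}", data.len()));
    }
    if !data.starts_with(b"ZBSDIFF1") {
        return Err("missing ZBSDIFF1 signature".to_string());
    }
    // Reads one little-endian i64 at the given byte offset.
    let field = |offset: usize| {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&data[offset..offset + 8]);
        i64::from_le_bytes(raw)
    };
    let header = ZbsdiffHeader {
        control_size: field(8),
        diff_size: field(16),
        output_size: field(24),
    };
    if header.control_size < 0 || header.diff_size < 0 || header.output_size < 0 {
        return Err("negative size in header".to_string());
    }
    Ok(header)
}
```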
Three-Block Structure
- Control Block (zlib-compressed):
  - Triple sequences: (diff_size, extra_size, seek_offset)
  - Instructions for applying differences and inserting new data
  - All values are signed 64-bit integers
- Diff Block (zlib-compressed):
  - Byte differences to apply to old data
  - Applied by wrapping byte-wise addition, not XOR: new[i] = (old[i] + diff[i]) mod 256
- Extra Block (zlib-compressed):
  - New data to insert at specified positions
  - Copied directly to output
Streaming Application
ZBSDIFF1 supports streaming application without loading entire files:
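// Patch application over already-decompressed blocks.
// `control` holds (diff_size, extra_size, seek_offset) triples.
fn apply_patch(old: &[u8], control: &[(i64, i64, i64)], diff: &[u8], extra: &[u8]) -> Option<Vec<u8>> {
    let mut new_data = Vec::new();
    let (mut old_pos, mut diff_pos, mut extra_pos) = (0i64, 0usize, 0usize);
    for &(diff_size, extra_size, seek_offset) in control {
        if diff_size < 0 || extra_size < 0 {
            return None; // sizes must be non-negative
        }
        let (diff_size, extra_size) = (diff_size as usize, extra_size as usize);
        // Copy diff_size bytes with differences: new[i] = old[i] + diff[i] (wrapping)
        for (i, d) in diff.get(diff_pos..diff_pos.checked_add(diff_size)?)?.iter().enumerate() {
            // As in classic bspatch, positions outside the old file contribute 0
            let src = old_pos.checked_add(i as i64)?;
            let o = if src >= 0 { old.get(src as usize).copied().unwrap_or(0) } else { 0 };
            new_data.push(o.wrapping_add(*d));
        }
        diff_pos += diff_size;
        old_pos = old_pos.checked_add(diff_size as i64)?;
        // Copy extra_size bytes of new data
        new_data.extend_from_slice(extra.get(extra_pos..extra_pos.checked_add(extra_size)?)?);
        extra_pos += extra_size;
        // Seek in old data (the offset is signed and may move backwards)
        old_pos = old_pos.checked_add(seek_offset)?;
    }
    Some(new_data)
}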
Format Characteristics
- Little-Endian Header: All header fields use little-endian byte order (verified against Agent.exe tact::BsPatch::ParseHeader at 0x6fbd1c)
- Signed Integers: Control block uses signed 64-bit little-endian integers for sizes and offsets
- Zlib Compression: All data blocks compressed independently
- Memory Efficient: Can process large files with minimal RAM usage
- Error Detection: Header validation and decompression errors detected
Patch Archive Storage
Patch data is stored in archives just like regular game data:
- Patch Archives: Large files containing multiple patch data blobs
  - Located in /patch/ directory on CDN
  - Contain BLTE-encoded ZBSDIFF1 patches
  - Named with content hashes like regular archives
- Patch Archive Indices: Map patch hashes to archive locations
  - Use the same .index format as data archives
  - Footer uses IndexType::Patch (offset_bytes = 0)
  - Allow CDN to locate specific patches within archives
- Patch Archive Groups: Client-side optimization structures
  - Use the same Archive Group format as data archives
  - Group related patches for efficient client caching
  - Located in client’s local CASC storage (not on CDN)
  - Referenced in .idx files with grouped archive information
- CDN Config References:
  - patch-archives: List of patch archive hashes
  - patch-archives-index: Corresponding index file hashes
  - patch-archives-index-size: Size of each index file
This completely mirrors the structure used for data archives:
- archives → patch-archives
- archives-index → patch-archives-index
- Archive Groups → Patch Archive Groups
- Same formats, just in /patch/ directory instead of /data/
Patch Chain Building and Validation
Patch Chain Construction
Patches can form chains from one content version to another with cycle detection:
use std::collections::HashSet;

// Method excerpt: `self` is the type that holds the parsed patch entries.
pub fn build_patch_chain(
    &self,
    start_key: &[u8; 16],
    end_key: &[u8; 16],
) -> Option<PatchChain> {
    let mut chain = Vec::new();
    let mut current_key = *start_key;
    let mut visited = HashSet::new();
    while current_key != *end_key {
        // Cycle detection
        if visited.contains(&current_key) {
            return None; // Cycle detected
        }
        visited.insert(current_key);
        let patch_entry = self.find_patch_for_content(&current_key)?;
        current_key = patch_entry.new_content_key;
        chain.push(patch_entry.clone());
        // Safety limit: prevent infinite chains
        if chain.len() > 10 {
            return None; // Chain too long
        }
    }
    Some(PatchChain { steps: chain, start_key: *start_key, end_key: *end_key })
}
Safety Validations
- Cycle Detection: Prevents infinite loops in patch chains
- Chain Length Limits: Maximum 10 steps to prevent excessive processing
- Size Validation: Output size must match header specification
- Checksum Verification: Content keys validated after patch application
- Stream Bounds Checking: Prevents buffer overflows during streaming
Size Limits and Memory Management
// ZBSDIFF1 size limits for safety
const MAX_PATCH_SIZE: usize = 100 * 1024 * 1024; // 100MB max patch
const MAX_OUTPUT_SIZE: usize = 1024 * 1024 * 1024; // 1GB max output
const MAX_CONTROL_ENTRIES: usize = 1_000_000; // Prevent memory exhaustion

// Assumes the header sizes are held as u64 (negative values rejected at parse time).
impl ZbsdiffHeader {
    pub fn validate(&self) -> Result<(), ZbsdiffError> {
        if self.output_size > MAX_OUTPUT_SIZE as u64 {
            return Err(ZbsdiffError::OutputTooLarge(self.output_size));
        }
        // checked_add avoids overflow when both sizes are close to u64::MAX
        let total = self.control_size.checked_add(self.diff_size);
        if total.map_or(true, |size| size > MAX_PATCH_SIZE as u64) {
            return Err(ZbsdiffError::PatchTooLarge);
        }
        Ok(())
    }
}
Patch Application Process
- Fetch patch manifest from CDN using patch hash from build config
- Parse manifest to find patch entry for target file
- Validate patch chain: Check for cycles and reasonable length
- Look up patch in patch archive index to find archive and offset
- Download patch data from archive using index information
- Validate patch size limits before processing
- Decode BLTE wrapper and extract ZBSDIFF1 patch
- Apply patch using streaming algorithm with bounds checking
- Verify result size and hash match expectations
Implementation Notes
- Patches are not BLTE-encoded at the manifest level
- Individual patch data files may be BLTE-encoded
- Block size is typically 64KB (2^16 bytes)
- Version 2 is the current patch format version
- Patches enable efficient updates without re-downloading entire files
BPSV Format Specification
BPSV (Blizzard Pipe-Separated Values) is a structured data serialization format, similar to CSV but using pipes (|) as delimiters with Blizzard-specific schemas. It’s used in Ribbit API responses, configuration files, and version manifests. BPSV is a data format, not a network protocol.
Format Structure
BPSV files contain three components:
- Header line (required)
- Sequence number line (optional)
- Data rows (zero or more)
graph TD
A[BPSV File] --> B[Header Line]
A -.-> C[Sequence Number Line]
A --> D[Data Rows]
B --> E["FieldName!TYPE:length|FieldName2!TYPE:length"]
C -.-> F["seqn = {number}"]
D --> G["value1|value2|value3"]
style A stroke-width:4px
style B stroke-width:3px
style C stroke-width:2px,stroke-dasharray:5 5
style D stroke-width:3px
style E stroke-width:2px
style F stroke-width:2px
style G stroke-width:2px
Header Line Format
The header line defines field structure using pipe-separated field definitions:
FieldName!TYPE:length|FieldName2!TYPE:length|FieldName3!TYPE:length
Each field definition contains:
- Field name (case-sensitive)
- Exclamation mark separator
- Field type (case-insensitive)
- Colon separator
- Length specification
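A header line can be split into field definitions along these separators. The sketch below is a minimal illustration with hypothetical type and function names, not the cascette-rs API.

```rust
#[derive(Debug, PartialEq)]
enum FieldType {
    Str,
    Hex,
    Dec,
}

#[derive(Debug, PartialEq)]
struct Field {
    name: String,
    kind: FieldType,
    length: usize,
}

/// Parses `Name!TYPE:length|Name2!TYPE:length` into field definitions.
/// Returns None if any definition is malformed or uses an unknown type.
fn parse_header(line: &str) -> Option<Vec<Field>> {
    line.split('|')
        .map(|definition| {
            let (name, spec) = definition.split_once('!')?;
            let (kind, length) = spec.split_once(':')?;
            // Type names are case-insensitive; field names are kept as written.
            let kind = match kind.to_ascii_uppercase().as_str() {
                "STRING" => FieldType::Str,
                "HEX" => FieldType::Hex,
                "DEC" => FieldType::Dec,
                _ => return None,
            };
            Some(Field {
                name: name.to_string(),
                kind,
                length: length.parse().ok()?,
            })
        })
        .collect()
}
```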
Sequence Number
The optional sequence number appears on a separate line:
## seqn = 12345
Properties:
- Always starts with ## seqn
- Supported separators: =, :, or space
- Contains integer value
- Used for version tracking and cache invalidation
- Maximum one per file
Accepted formats:
- ## seqn = 12345 (equals with spaces)
- ## seqn: 12345 (colon separator)
- ## seqn 12345 (space only)
- Extra whitespace is trimmed
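The accepted forms can be handled by one small function. This is a lenient sketch under stated assumptions: `parse_seqn` is a hypothetical name, the value is read as `u64`, and any run of `=`, `:`, and whitespace is treated as the separator.

```rust
/// Extracts the sequence number from a `## seqn` line, or returns None
/// when the line is not a sequence number line.
fn parse_seqn(line: &str) -> Option<u64> {
    let rest = line.trim().strip_prefix("## seqn")?;
    // Skip the separator: '=', ':' or whitespace.
    let value = rest.trim_start_matches(|c: char| c == '=' || c == ':' || c.is_whitespace());
    if value.len() == rest.len() {
        return None; // no separator after "seqn"
    }
    value.trim_end().parse().ok()
}
```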
Field Types
BPSV supports three field types:
STRING:length
Text data with length constraints:
- Length 0: unlimited characters
- Length > 0: maximum character count
- Type names: STRING, String, string (case-insensitive)
- UTF-8 encoding
HEX:length
Binary data encoded as hexadecimal:
- Length specifies bytes in binary form
- Requires exactly length × 2 hexadecimal characters
- Valid characters: 0-9, a-f, A-F
- Empty values always valid
- Common usage: HEX:16 for MD5 hashes (32 hex chars)
DEC:length
Decimal integers:
- Length indicates storage size (4 = uint32, 8 = uint64)
- Length not enforced during parsing
- Supports full int64 range
- Type names: DEC, Dec, dec (case-insensitive)
Data Rows
Data rows contain pipe-separated values matching header field definitions:
- Column count must match header field count
- Empty values allowed for all field types
- Values parsed according to field type specifications
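As a rough illustration of the column count rule, a row can be split on pipes and compared against the number of header fields. The helper name split_row is hypothetical, and field_count is assumed to come from an already parsed header.

```rust
/// Split one data row and check it against the header's field count
/// (hypothetical helper; `field_count` comes from the parsed header).
fn split_row(line: &str, field_count: usize) -> Result<Vec<&str>, String> {
    // Accept both Unix and Windows line endings.
    let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
    // `split` keeps empty strings, so `wowt||abc` yields an empty middle value.
    let values: Vec<&str> = line.split('|').collect();
    if values.len() != field_count {
        return Err(format!(
            "expected {} columns, found {}",
            field_count,
            values.len()
        ));
    }
    Ok(values)
}
```

Because empty values survive the split, an empty field stays distinguishable from a missing one, which shows up as a column count error.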
Parsing Flow
flowchart TD
A[Start Parsing] --> B[Read First Line]
B --> C[Parse Header Fields]
C --> D[Read Next Line]
D --> E{"Line starts with '## seqn'?"}
E -->|Yes| F[Parse Sequence Number]
E -->|No| G[Parse as Data Row]
F --> H[Read Next Line]
H --> I{More Lines?}
G --> J[Validate Column Count]
J --> K[Parse Field Values by Type]
K --> L[Store Data Row]
L --> I
I -->|Yes| M[Read Next Line]
I -->|No| N[Parsing Complete]
M --> G
style A stroke-width:4px
style N stroke-width:4px
style C stroke-width:3px
style E stroke-width:3px,stroke-dasharray:5 5
style I stroke-width:2px,stroke-dasharray:5 5
style F stroke-width:2px
style J stroke-width:2px
style K stroke-width:2px
Usage Context
BPSV is a data serialization format used in multiple contexts:
- Ribbit API Responses: Structured data returned by Ribbit protocol
- Product Configuration Files: .product files with version information
- Version Manifests: Build and CDN configuration references
- CDN Configuration: Server URLs and path mappings
- Background Downloads: Download priority information
Note: BPSV is the data format; Ribbit is the protocol that transmits BPSV data.
Implementation Requirements
Type Validation
Parsers must validate field values according to type specifications:
- STRING fields accept any UTF-8 text
- HEX fields require valid hexadecimal characters and exact length
- DEC fields must parse as valid integers
- Empty values are valid for all field types
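The validation rules above might look like this in code. It is a sketch under assumptions: FieldType and validate_value are hypothetical names, HEX carries its byte length, and DEC values are checked against the int64 range mentioned earlier.

```rust
/// Field types named in this document (hypothetical representation).
enum FieldType {
    String,
    /// Declared length in bytes; values need twice as many hex characters.
    Hex(usize),
    Dec,
}

/// Check one raw value against its declared type (hypothetical helper).
fn validate_value(field_type: &FieldType, value: &str) -> Result<(), String> {
    // Empty values are valid for every field type.
    if value.is_empty() {
        return Ok(());
    }
    match field_type {
        // A Rust &str is already valid UTF-8, so any text is accepted here.
        FieldType::String => Ok(()),
        FieldType::Hex(bytes) => {
            if value.len() != bytes * 2 {
                return Err(format!(
                    "expected {} hex chars, found {}",
                    bytes * 2,
                    value.len()
                ));
            }
            if !value.bytes().all(|b| b.is_ascii_hexdigit()) {
                return Err(format!("non-hex character in {value:?}"));
            }
            Ok(())
        }
        // DEC length is not enforced; the value only has to parse as an integer.
        FieldType::Dec => value
            .parse::<i64>()
            .map(|_| ())
            .map_err(|e| format!("invalid integer {value:?}: {e}")),
    }
}
```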
Parsing Architecture
Implementations may use different parsing strategies:
- Zero-copy parsing: Borrow from original string for efficiency
- Owned parsing: Copy data for serialization/storage
- Lazy parsing: Keep raw strings until typed values requested
- Schema validation: Enforce field uniqueness and type compatibility
Error Handling
Common parsing errors:
- Column count mismatch between header and data rows
- Invalid characters in HEX fields
- Incorrect HEX field length (must be exactly length × 2 chars)
- Non-numeric values in DEC fields
- Multiple sequence number lines
- Duplicate field names in schema
Performance Considerations
- Typical file size: < 10MB
- Typical row count: < 10,000
- UTF-8 encoding recommended
- Both Unix (\n) and Windows (\r\n) line endings accepted
Format Examples
Basic Product Configuration
Region!STRING:4|BuildConfig!HEX:16|CDNConfig!HEX:16
## seqn = 98765
us|a1b2c3d4e5f6789012345678a1b2c3d4|f1e2d3c4b5a69870123456789abcdef0
eu|b2c3d4e5f6789012345678a1b2c3d4e5|e2d3c4b5a69870123456789abcdef0f1
CDN Server List
Name!STRING:0|Path!STRING:0|Hosts!STRING:0
## seqn = 54321
us|tpr/wow|us.patch.battle.net level3.blizzard.com
eu|tpr/wow|eu.patch.battle.net level3.blizzard.com
Version Information
Product!STRING:10|Seqn!DEC:4|Flags!HEX:4
wow|12345|00000001
wowt|12346|00000002
Type Casing Examples
Field types accept case variations:
# All valid type specifications
Name!STRING:50|ID!DEC:4|Hash!HEX:16
Name!String:50|ID!Dec:4|Hash!Hex:16
Name!string:50|ID!dec:4|Hash!hex:16
Empty Value Handling
Empty values preserve semantic meaning:
Product!STRING:10|Version!STRING:10|Hash!HEX:16
wow|8.3.0|a1b2c3d4e5f6789012345678a1b2c3d4
wowt||b2c3d4e5f6789012345678a1b2c3d4e5
The second row contains an empty version field, which differs from a missing field.
Implementation Status
Rust Implementation (cascette-formats)
BPSV parser and builder:
- Schema parsing - Field name, type, and size validation (complete)
- Document parsing - Multi-row data with sequence numbers (complete)
- Type support - STRING, HEX, and DEC field types (complete)
- Round-trip validation - parse(build(data)) == data guarantee (complete)
- Case-insensitive types - Accepts STRING, String, string variations (complete)
- Builder support - Programmatic BPSV file creation (complete)
Validation Status:
- Byte-for-byte round-trip validation
- Integration tests with real Ribbit API responses
- Handles empty values, comments, and sequence numbers
- Validated against real Battle.net BPSV files
Analysis and Usage
BPSV format is used throughout the NGDP system for configuration and version data.
NGDP/CASC Format Transitions
This document summarizes verified format transitions discovered through systematic analysis of WoW builds from 2014-2025, starting with CASC’s introduction in Warlords of Draenor (6.0.x) which replaced the MPQ system.
Verification Methodology
Format transitions were identified through:
- Strategic Build Analysis: Examining key builds across WoW versions using tools/examine_build.py
- Chronological Comparison: Tracking format changes between adjacent builds
- Cross-Product Validation: Comparing wow, wow_classic, wow_classic_era, wow_classic_titan, and wow_anniversary
- Automated Verification: Using Python scripts to validate format assumptions
Discovered Format Transitions
Root File Format Evolution
The Root file format has evolved since CASC’s introduction in Warlords of Draenor:
Version 1 (Early CASC, 2014-2021)
- Magic: None initially, later MFST (big-endian)
- First Seen: Warlords of Draenor (6.0.x) - CASC introduction
- Structure: Basic content key mapping with file flags
- Features:
  - FileDataID to content key mapping
  - Basic content/locale flags (32-bit)
  - Jenkins96 hash for named files
- Note: This is the first CASC Root format, replacing the MPQ system
Version 2 (Transitional CASC, 2021)
- Magic: TSFM (little-endian)
- First Seen: Shadowlands (9.0.2)
- Structure: Added size fields and magic signature
- Features:
  - TSFM magic signature introduction
  - Size fields for validation
  - Maintained v1 data structures
Version 3 (Modern CASC, 2021-Present)
- Magic: TSFM (little-endian standard)
- First Seen: Shadowlands late patches
- Structure: Enhanced metadata and extended flags
- Features:
  - Extended content flags (40-bit total)
  - Improved compression efficiency
  - Better locale targeting
Version 4 (Current CASC, 2023-Present)
- Magic: TSFM
- First Seen: Dragonflight (10.x)
- Structure: Further optimizations
- Features:
  - Additional metadata fields
  - VFS integration improvements
Verified Transition Points
Based on build examination across retail and Classic:
WoW Retail (wow) Format Evolution:
| Version | Build Date | Root Version | Magic | Config Fields | Key Changes |
|---|---|---|---|---|---|
| 6.0.1.18125 | 2014-06-20 | 1 | None | 13 | CASC introduction, replacing MPQ |
| 7.3.5.25848 | 2018-01-16 | 1 | None | 15 | Still using v1 format |
| 9.0.2.37176 | 2021-01-13 | 2 | TSFM | 17 | Major transition: TSFM magic, size fields added |
| 10.1.5.51130 | 2023-08-31 | 3 | TSFM | 1,623 | VFS expansion: 1,600+ virtual file system fields added |
| 11.2.0.62748 | 2025-08-22 | 3 | TSFM | 1,716 | Current retail standard with extended features |
WoW Classic (wow_classic) Format Evolution:
| Version | Build Date | Root Version | Magic | Config Fields | Key Changes |
|---|---|---|---|---|---|
| 1.13.0.28211 | 2018-10-23 | 1 | None | 13 | Classic launch using CASC v1 |
| 2.5.2.39926 | 2021-08-31 | 1 | None | 16 | Patch fields added |
| 3.4.2.50063 | 2023-06-20 | 1 | None | 756 | VFS adoption: 740+ VFS fields |
| 3.4.4.61075 | 2025-05-28 | 3 | TSFM | 758 | Format jump: Skipped v2, went directly to v3 |
| 5.5.0.62655 | 2025-08-19 | 3 | TSFM | 905 | Current Classic standard |
Classic Format Lag Pattern
Classic follows retail with significant delays:
- Root v1→v2/v3: Retail (2021) → Classic (2025) = 4 years behind
- VFS Introduction: Retail (2023) → Classic (2023); the sampled builds first show VFS fields in the same year (retail 10.1.5 on 2023-08-31, Classic 3.4.2 on 2023-06-20)
- TSFM Magic: Retail (2021) → Classic (2025) = 4 years behind
Classic skipped Root v2 entirely, jumping directly from v1 to v3, demonstrating selective adoption of retail improvements.
Parser Compatibility Matrix
Based on verified transitions, parsers must support:
| Product | Supported Root Versions | Magic Detection | VFS Support | Timeframe |
|---|---|---|---|---|
| wow_classic_era | v3 only | TSFM | Modern | 2021+ (uses retail backend) |
| wow_classic | v1, v3 | None, TSFM | Legacy → Modern | 2018-2025 |
| wow_classic_titan | v3 only | TSFM | Modern | 2025+ (CN only, WotLK 3.80.x) |
| wow_anniversary | v3 only | TSFM | Modern | 2025+ (TBC 2.5.x) |
| wow | v1, v2, v3 | None, TSFM | Legacy → Modern | 2014-2025 |
Implementation Recommendation: Always attempt v3 parsing first with TSFM magic detection, then fall back to v1 legacy format. Root v2 is retail-specific and uncommon.
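The recommended detection order can be sketched as a check on the first four bytes. This assumes the magic is stored as the ASCII bytes TSFM at offset 0; RootLayout and detect_root_layout are hypothetical names.

```rust
/// Root layouts a parser has to tell apart, per the matrix above.
#[derive(Debug, PartialEq)]
enum RootLayout {
    /// File starts with the TSFM magic (v2 and later).
    Tsfm,
    /// No magic: the legacy v1 layout.
    Legacy,
}

/// Decide which parser to try first (hypothetical helper).
/// Assumption: the magic is stored as the ASCII bytes "TSFM" at offset 0.
fn detect_root_layout(data: &[u8]) -> RootLayout {
    if data.starts_with(b"TSFM") {
        RootLayout::Tsfm
    } else {
        RootLayout::Legacy
    }
}
```

Telling v2 from v3 within the TSFM branch needs the fields that follow the magic, which this sketch does not cover.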
Build Configuration Evolution
Build configurations have evolved to support new file types and compression methods:
Early CASC (6.0.x)
root = <content_key>
encoding = <content_key> <encoding_key>
install = <content_key> <encoding_key>
download = <content_key> <encoding_key>
Modern CASC (11.x)
root = <content_key>
encoding = <content_key> <encoding_key>
install = <content_key> <encoding_key>
download = <content_key> <encoding_key>
patch = <patch_key>
size = <content_key> <encoding_key>
Evolution Pattern:
- Root field remains a single content key
- New fields added (patch, size) for enhanced functionality
- Encoding/install/download maintain dual-key format
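A parser that tolerates both the early and the modern field sets can store every field as a list of values. The sketch below is illustrative: parse_build_config is a hypothetical name, and skipping blank lines and lines starting with # is an assumption about the file layout.

```rust
use std::collections::BTreeMap;

/// Parse `key = value [value...]` lines into a map (hypothetical helper).
/// Blank lines and lines starting with '#' are skipped, which is an
/// assumption about the file layout rather than something stated here.
fn parse_build_config(text: &str) -> BTreeMap<String, Vec<String>> {
    let mut fields = BTreeMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            // Dual-key fields hold a content key followed by an encoding key.
            let values: Vec<String> =
                value.split_whitespace().map(str::to_string).collect();
            fields.insert(key.trim().to_string(), values);
        }
    }
    fields
}
```

Unknown keys are kept rather than rejected, so fields such as patch and size in newer builds need no parser changes.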
BLTE Format Evolution
BLTE (Block Table Encoded) compression has remained stable but usage patterns evolved:
Compression Type Usage by Era
| Era | None (N) | ZLIB (Z) | Encrypted (E) | Frame (F) |
|---|---|---|---|---|
| Early CASC | 20% | 75% | 0% | 5% |
| Modern CASC | 15% | 60% | 5% | 20% |
Key Changes:
- Increased use of Frame compression for nested compression
- Introduction of encrypted blocks for sensitive data
- ZLIB remains primary compression method
Block Structure Evolution
- Single Block: Simpler files, configuration data
- Multi Block: Large files, game assets
- Trend: Growing use of multi-block for better streaming
Verification Scripts
Format verification tools have been moved to the cascette-py project: https://github.com/wowemulation-dev/cascette-py
The Python implementation includes:
- Cache management for downloaded files
- Root file version detection testing
- Build configuration evolution tracking
- BLTE compression pattern analysis
- Complete format verification suite
See the cascette-py documentation for setup and usage instructions.
Implementation Impact
For Rust Implementation
Based on verified format evolution across retail and Classic:
- Root File Parser:
  - Primary Support: Root v1 (legacy) and v3 (modern) formats
  - Limited Support: Root v2 (retail-only transition format)
  - Magic Detection: TSFM (little-endian) and None (legacy)
  - Version Strategy: Try v3+TSFM first, fall back to v1+None
- Configuration Parser:
  - Early Builds: 13-17 fields (simple key=value)
  - VFS Era: 756-1,716 fields (massive vfs-* expansion)
  - Feature Support: Handle feature-placeholder and VFS fields
  - Backwards Compatibility: Support both v1 (legacy) and v3 (modern) formats
- Product-Specific Logic:
  - wow_classic_era: Always modern format (v3, TSFM)
  - wow_classic: Dual format support with clear transition point (2025)
  - wow_classic_titan: Modern format only (v3, TSFM), 368 VFS entries, CN region only
  - wow_anniversary: Modern format only (v3, TSFM), 325 VFS entries, all regions
  - wow retail: Full format evolution support (2014-2025)
- BLTE Decoder: All compression types (N, Z, E, F) with consistent usage patterns across all product lines
Key Architectural Decisions
- Version Detection Strategy:
  // Recommended parsing order
  if has_tsfm_magic() {
      try_root_v3_format()
  } else {
      try_root_v1_format()
  }
- Configuration Parsing:
  - VFS Detection: Fields starting with vfs- indicate modern builds
  - Feature Detection: feature-placeholder indicates latest builds
  - Backwards Compatibility: Always support minimal 13-field format
- Product Detection:
  - Use Wago.tools build database for version context
  - Classic Era assumes modern format post-2021
  - Classic has explicit v1→v3 transition in May 2025
- Testing Strategy: Verify against all transition points with real build data
Future Analysis
Formats not yet tracked for transitions:
- Encoding file table structure changes
- Install/Download tag system evolution
- Archive index format stability
- Patch file introduction timeline
References
Last Updated: 2025-08-23
Verification Status: Automated verification scripts created and tested
Next Review: After implementing Rust parsers based on verified formats
BLTE (Block Table Encoded) Format
BLTE is NGDP’s container format for compressed and optionally encrypted content. It provides block-based compression, encryption support, and efficient streaming capabilities for game data delivery.
Overview
BLTE files wrap game content with:
- Optional multi-block structure for large files
- Per-block compression (none, zlib, or others)
- Optional encryption (Salsa20 or ARC4)
- MD5 checksums for integrity verification
Binary Format
File Structure
BLTE File Layout:
┌─────────────────────────┐
│ BLTE Header (8 bytes) │
├─────────────────────────┤
│ Extended Header │ (optional, if header_size > 0)
│ - Flags (1 byte) │
│ - Chunk Count (3 bytes) │
├─────────────────────────┤
│ Chunk Info Table │ (24 bytes per chunk)
│ - Compressed Size │
│ - Decompressed Size │
│ - MD5 Checksum │
├─────────────────────────┤
│ Data Block 1 │
│ - Encoding Type (1 byte)│
│ - Compressed Data │
├─────────────────────────┤
│ Data Block 2 │
│ ... │
└─────────────────────────┘
Header Format
#![allow(unused)]
fn main() {
// Primary BLTE header (always 8 bytes)
struct BlteHeader {
magic: [u8; 4], // "BLTE" (0x424C5445 in big-endian)
header_size: u32, // Big-endian, total header size including these 8 bytes
}
}
Header Size Values
- header_size == 0: Single chunk file, no extended header
- header_size > 0: Multi-chunk file with extended header
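Reading the primary header and branching on header_size could look like the following sketch. The parse_blte_header function and the is_single_chunk method are hypothetical names chosen for illustration.

```rust
/// Parsed primary BLTE header (hypothetical type).
#[derive(Debug, PartialEq)]
struct BlteHeader {
    /// Total header size including the 8-byte primary header; 0 means single chunk.
    header_size: u32,
}

impl BlteHeader {
    /// True when the file has no extended header and holds one chunk.
    fn is_single_chunk(&self) -> bool {
        self.header_size == 0
    }
}

/// Read the 8-byte primary header (hypothetical helper).
fn parse_blte_header(data: &[u8]) -> Result<BlteHeader, String> {
    if data.len() < 8 {
        return Err("need at least 8 bytes".to_string());
    }
    if !data.starts_with(b"BLTE") {
        return Err("missing BLTE magic".to_string());
    }
    // header_size is stored big-endian.
    let header_size = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);
    Ok(BlteHeader { header_size })
}
```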
Extended Header
Present only when header_size > 0:
#![allow(unused)]
fn main() {
struct ExtendedHeader {
flags: u8, // 0x0F = standard, 0x10 = extended
chunk_count: [u8; 3], // 24-bit big-endian chunk count
}
}
Chunk Information Table
Standard Format (flags = 0x0F)
Each chunk has a 24-byte entry:
#![allow(unused)]
fn main() {
struct ChunkInfo {
compressed_size: u32, // Big-endian
decompressed_size: u32, // Big-endian
checksum: [u8; 16], // MD5 of compressed chunk data
}
}
Extended Format (flags = 0x10)
Each chunk has a 40-byte entry:
#![allow(unused)]
fn main() {
struct ExtendedChunkInfo {
compressed_size: u32, // Big-endian
decompressed_size: u32, // Big-endian
checksum: [u8; 16], // MD5 of compressed chunk data
decompressed_checksum: [u8; 16], // MD5 of decompressed chunk data
}
}
This extended format provides additional integrity checking with MD5 checksums of both compressed and decompressed data.
Formula Validation
For standard chunks (flags = 0x0F):
header_size = 12 + (chunk_count * 24)
For extended chunks (flags = 0x10):
header_size = 12 + (chunk_count * 40)
Where:
- 12 = 8 (BLTE header) + 1 (flags) + 3 (chunk count)
- 24 = size of standard ChunkInfo entry
- 40 = size of extended ChunkInfo entry
The header_size field includes the 8-byte BLTE header (“BLTE” magic + header_size u32). Data starts at offset header_size from the beginning of the file.
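The two formulas can be folded into one check that a parser runs before trusting the chunk table. This is a sketch; expected_header_size is a hypothetical helper name.

```rust
/// Expected header_size for a multi-chunk file, or None for unknown flags
/// (hypothetical helper).
fn expected_header_size(flags: u8, chunk_count: u32) -> Option<u32> {
    // Per-chunk entry size depends on the flags byte.
    let entry_size: u32 = match flags {
        0x0F => 24, // compressed size + decompressed size + MD5
        0x10 => 40, // adds an MD5 of the decompressed data
        _ => return None,
    };
    // 12 = 8 (BLTE header) + 1 (flags) + 3 (chunk count).
    chunk_count.checked_mul(entry_size)?.checked_add(12)
}
```

A parser would compare this result with the header_size field it read and reject the file on mismatch.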
Encoding Types
Each data block starts with a single-byte encoding type:
| Byte | Character | Type | Description |
|---|---|---|---|
| 0x4E | ‘N’ | None | Uncompressed data |
| 0x5A | ‘Z’ | ZLib | ZLib compressed (deflate) |
| 0x34 | ‘4’ | LZ4 | LZ4HC high compression |
| 0x45 | ‘E’ | Encrypted | Encrypted data block |
| 0x46 | ‘F’ | Frame | Recursive BLTE (deprecated) |
Compression Formats
None (0x4E)
Uncompressed data follows immediately after the encoding byte:
[0x4E] [raw data...]
ZLib (0x5A)
Standard zlib compression:
[0x5A] [2-byte zlib header] [deflate stream...]
Important: Most implementations skip the zlib header and use raw deflate.
LZ4 (0x34)
LZ4HC (high compression) format:
[0x34] [decompressed_size:8] [compressed_lz4_data...]
- decompressed_size: 64-bit little-endian size
- Data following the prefix is a single LZ4 block (no sub-blocks)
- Provides ~200-300 MB/s decompression speed
Format discrepancy: The WoWDev wiki describes a different LZ4 format with
headerVersion (1 byte), 64-bit big-endian size, blockShift (1 byte, range
5-16), and multiple sub-blocks of 1 << blockShift bytes each. Agent.exe
3.13.3 uses the 8-byte LE prefix + single block format documented above.
tact::Codec::DecodeLZ4 at 0x6f5fdb is a stub in Agent.exe 3.13.3 (returns
error 5), so the LZ4 format cannot be fully verified from this binary version.
cascette-rs matches the Agent.exe format. The wiki format may apply to a newer
protocol version or a different product.
Encryption Format
Encrypted Block Structure
[0x45] [key_name_size:1] [key_name:8] [iv_size:1] [iv:4] [type:1]
[encrypted_data...]
Fields:
- key_name_size: Usually 8
- key_name: 64-bit key identifier
- iv_size: Usually 4
- iv: Initialization vector
- type: 0x53 (‘S’) for Salsa20, 0x41 (‘A’) for ARC4 (legacy, not used in TACT 3.13.3+)
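Because both the key name and the IV are length-prefixed, a parser should read the size bytes rather than hard-code 8 and 4. The sketch below splits a block into its fields without decrypting anything; EncryptedHeader and parse_encrypted_block are hypothetical names.

```rust
/// Metadata that precedes the ciphertext in an 'E' block (hypothetical type).
#[derive(Debug, PartialEq)]
struct EncryptedHeader<'a> {
    key_name: &'a [u8],
    iv: &'a [u8],
    /// 0x53 = Salsa20, 0x41 = ARC4.
    cipher: u8,
    ciphertext: &'a [u8],
}

/// Split an encrypted block into header fields and ciphertext
/// (hypothetical helper; `block` starts at the 0x45 encoding byte).
fn parse_encrypted_block(block: &[u8]) -> Option<EncryptedHeader<'_>> {
    let (&encoding, rest) = block.split_first()?;
    if encoding != 0x45 {
        return None;
    }
    // Length-prefixed key name (usually 8 bytes).
    let (&key_len, rest) = rest.split_first()?;
    if rest.len() < key_len as usize {
        return None;
    }
    let (key_name, rest) = rest.split_at(key_len as usize);
    // Length-prefixed IV (usually 4 bytes).
    let (&iv_len, rest) = rest.split_first()?;
    if rest.len() < iv_len as usize {
        return None;
    }
    let (iv, rest) = rest.split_at(iv_len as usize);
    // One byte selecting the cipher, then the encrypted payload.
    let (&cipher, ciphertext) = rest.split_first()?;
    Some(EncryptedHeader { key_name, iv, cipher, ciphertext })
}
```

Borrowing slices from the input avoids copying the ciphertext before the key lookup has succeeded.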
IV Extension and Modification for Chunks
The IV (typically 4 bytes) is zero-padded to 8 bytes for the Salsa20 nonce:
#![allow(unused)]
fn main() {
let mut nonce = [0u8; 8]; // zero-initialized
nonce[..iv_size].copy_from_slice(&iv);
// Remaining bytes stay zero (NOT duplicated)
}
For multi-chunk files, the IV is XORed with the chunk index before extension:
#![allow(unused)]
fn main() {
fn modify_iv(iv: &mut [u8], chunk_index: usize) {
for i in 0..4 {
iv[i] ^= ((chunk_index >> (i * 8)) & 0xFF) as u8;
}
}
}
Parsing Algorithm
Step 1: Read BLTE Header
#![allow(unused)]
fn main() {
let magic = read_u32_be(); // Must be 0x424C5445 ("BLTE")
let header_size = read_u32_be();
}
Step 2: Determine Structure
#![allow(unused)]
fn main() {
if header_size == 0 {
// Single chunk file
// Data starts at offset 8
// Chunk size = file_size - 8 - 1 (encoding byte)
} else {
// Multi-chunk file
// Read extended header and chunk table
// Data starts at offset header_size from the start of the file
}
}
The data offset for multi-chunk files is always header_size from the start
of the file. The header_size field includes the 8-byte BLTE header.
Step 3: Read Extended Header (if present)
#![allow(unused)]
fn main() {
let flags = read_u8(); // 0x0F for standard, 0x10 for extended
let chunk_count = read_u24_be(); // 24-bit big-endian
// Read chunk information table
let mut chunks = Vec::with_capacity(chunk_count as usize);
for _ in 0..chunk_count {
chunks.push(ChunkInfo {
compressed_size: read_u32_be(),
decompressed_size: read_u32_be(),
checksum: read_bytes(16),
});
}
}
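The step above relies on reader helpers that are not shown. A self-contained variant over a byte slice, covering only the standard (flags = 0x0F) table, might look like this; parse_chunk_table is a hypothetical name and the input is assumed to start right after the 8-byte BLTE header.

```rust
/// One entry of the standard (flags = 0x0F) chunk table.
#[derive(Debug, PartialEq)]
struct ChunkInfo {
    compressed_size: u32,
    decompressed_size: u32,
    checksum: [u8; 16],
}

/// Parse flags, chunk count and the standard chunk table
/// (hypothetical helper; `data` starts right after the 8-byte BLTE header).
fn parse_chunk_table(data: &[u8]) -> Result<Vec<ChunkInfo>, String> {
    if data.len() < 4 {
        return Err("truncated extended header".to_string());
    }
    if data[0] != 0x0F {
        return Err(format!("unsupported flags 0x{:02X}", data[0]));
    }
    // 24-bit big-endian chunk count, widened to 32 bits.
    let count = u32::from_be_bytes([0, data[1], data[2], data[3]]) as usize;
    let table = &data[4..];
    if table.len() < count * 24 {
        return Err("truncated chunk table".to_string());
    }
    let chunks = table
        .chunks_exact(24)
        .take(count)
        .map(|e| {
            let mut checksum = [0u8; 16];
            checksum.copy_from_slice(&e[8..24]);
            ChunkInfo {
                compressed_size: u32::from_be_bytes([e[0], e[1], e[2], e[3]]),
                decompressed_size: u32::from_be_bytes([e[4], e[5], e[6], e[7]]),
                checksum,
            }
        })
        .collect();
    Ok(chunks)
}
```

Supporting the extended (0x10) format would mean switching the entry size to 40 bytes and reading the second checksum.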
Step 4: Process Data Blocks
#![allow(unused)]
fn main() {
let mut output = Vec::new();
// Sizes are u32 in the header; slice indices are usize
let mut offset = header_size as usize;
for chunk_info in chunks {
    // Read chunk data
    let size = chunk_info.compressed_size as usize;
    let chunk_data = &data[offset..offset + size];
    // Optionally verify MD5 checksum (not done automatically during parsing)
    // let hash = md5::compute(chunk_data);
    // assert_eq!(hash.0, chunk_info.checksum);
    // Decompress based on encoding type, propagating any error
    let decompressed = decompress_chunk(chunk_data)?;
    output.extend_from_slice(&decompressed);
    offset += size;
}
}
Decompression Implementation
#![allow(unused)]
fn main() {
fn decompress_chunk(data: &[u8]) -> Result<Vec<u8>> {
if data.is_empty() {
return Err("Empty chunk");
}
match data[0] {
0x4E => {
// None - return raw data
Ok(data[1..].to_vec())
},
0x5A => {
// ZLib - decompress using deflate
// Skip: [0x5A] [78 9C] (zlib header)
let deflate_data = &data[3..];
decompress_deflate(deflate_data)
},
0x34 => {
// LZ4 - high compression
let decompressed_size = u64::from_le_bytes(
data[1..9].try_into()?
);
let compressed_data = &data[9..];
decompress_lz4(compressed_data, decompressed_size as usize)
},
0x45 => {
// Encrypted - requires key
decrypt_chunk(&data[1..])
},
0x46 => {
// Frame - recursive BLTE
let inner_blte = &data[1..];
parse_blte(inner_blte)
},
_ => Err("Unknown encoding type"),
}
}
}
Real-World Example
Let’s examine the encoding file we fetched earlier:
00000000: 424c 5445 0000 00b4 0f00 0007 0000 0017 BLTE............
^^^^^^^^^ ^^^^^^^^^ ^^ ^^^^^^^ ^^^^^^^^^
Magic Hdr Size F Count CompSize
Breaking down the header:
- Magic: 0x424C5445 = "BLTE"
- Header Size: 0x000000B4 = 180 bytes
- Flags: 0x0F (standard chunk format)
- Chunk Count: 0x000007 = 7 chunks
- First Chunk Compressed Size: 0x00000017 = 23 bytes
This indicates:
- Multi-chunk file (header_size > 0)
- 7 chunks total
- Header size = 12 + (7 * 24) = 180 bytes, matching the header_size field
Performance Characteristics
Compression Mode Comparison
| Mode | Compression Speed | Decompression Speed | Compression Ratio | Memory Usage |
|---|---|---|---|---|
| None | ~500 MB/s | ~500 MB/s | 1.0x | Minimal |
| LZ4 | ~200 MB/s | ~300 MB/s | 2-4x | ~64 KB |
| ZLib | ~50-150 MB/s | ~100-200 MB/s | 3-8x | ~256 KB |
Data Type Recommendations
| Data Type | Recommended Mode | Reasoning |
|---|---|---|
| Text/Config | ZLib (level 6-9) | High compressibility, access infrequent |
| Textures | LZ4 or None | Often pre-compressed, need fast access |
| Audio | None or LZ4 | Poor compressibility, streaming required |
| Models | ZLib (level 3-6) | Structured data compresses well |
| Temporary | None | Speed critical, short-lived |
Special Cases
Headerless Files
When header_size == 0:
- Single chunk only
- No chunk information table
- Data starts immediately at offset 8
- Entire remaining file is one compressed block
Empty Chunks
Some chunks may have:
- compressed_size == 0
- decompressed_size == 0
- Usually placeholders or removed content
Large Files
Multi-chunk structure enables parallel decompression and partial/resumable downloads, allowing streaming installation of large files.
Error Handling
Critical checks:
- Verify BLTE magic number
- Validate that flags is 0x0F (standard) or 0x10 (extended chunk info) when an extended header is present
- Check chunk count > 0 when header_size > 0
- MD5 checksums are available via verify_checksum() on each chunk (not verified automatically during parsing)
- Handle unknown encoding types gracefully
- Ensure decompressed size matches expected
- Enforce maximum decompression size (1 GB) to prevent decompression bombs
Implementation Considerations
- Process chunks incrementally rather than loading entire files into memory
- Decompress chunks in parallel where possible
- Checksum verification is a separate step from parsing (call verify_checksum() on chunk data)
- Maximum decompression size is 1 GB (MAX_DECOMPRESSION_SIZE). Chunks claiming a larger decompressed size are rejected
Integration with NGDP
BLTE files in NGDP context:
- Fetched using encoding keys from CDN
- May be stored in archives or as loose files
- Encoding file maps content keys to BLTE-encoded versions
- Archive indices point to BLTE data within archives
Debugging Tips
Identifying BLTE Files
# Check for BLTE magic
xxd -l 4 file.bin
# Should show: 424c 5445 (BLTE)
# Check header size (bytes 4-7)
xxd -s 4 -l 4 file.bin
# The four bytes shown, read left to right, are the big-endian u32 value
Common Issues
- Wrong endianness: BLTE uses big-endian, not little-endian
- Skipping zlib header: Most implementations skip bytes 1-2 after 0x5A
- IV modification: Remember to XOR IV with chunk index for encryption
- Checksum validation: Use MD5 of compressed data, not decompressed
Implementation Status
Rust Implementation (cascette-formats)
BLTE parser and builder:
- None (N) - Uncompressed passthrough (complete)
- ZLib (Z) - Deflate compression using flate2 (complete)
- LZ4 (4) - LZ4 compression with proper size headers (complete)
- Encrypted (E) - Salsa20 and ARC4 encryption with multi-chunk support (complete)
- Frame (F) - Recursive BLTE support (not implemented, deprecated format)
- Extended Format - Full support for 0x10 format with dual checksums (complete)
Validation Status:
- Byte-for-byte round-trip validation with real WoW files
- Successfully processes encoding, root, install, and download files
- Integration tests with WoW Classic Era production data
- Builder support for creating valid BLTE files programmatically
- Both standard (0x0F) and extended (0x10) chunk formats supported
Python Tools (cascette-py)
Analysis and decompression tool supports:
- None (N), ZLib (Z), Frame (F) modes
- LZ4 (4) - Analysis only, decompression requires Rust implementation
- Encrypted (E) - Detection and metadata extraction
See https://github.com/wowemulation-dev/cascette-py for the Python implementation.
References
- wowdev.wiki BLTE documentation
- See ESpec Format for encoding specification strings
- See Salsa20 Encryption for encrypted block details
ESpec (Encoding Specification) Documentation
Overview
ESpec is a domain-specific language used throughout NGDP for specifying BLTE encoding instructions. It defines how content blocks are compressed, encrypted, and structured within BLTE containers. ESpec appears in patch configurations, encoding files, and BLTE block headers.
Grammar Components
ESpec uses single-character identifiers for encoding operations:
Basic Encodings
- n: Plain/uncompressed data
- z: Zlib compression
- e: Encryption
- b: Block-based encoding
- c: BCPack compression
- g: GDeflate compression
Encoding Combinations
ESpec supports nested and sequential encoding operations through composition.
Block Syntax
Size Specifications
Block sizes support unit suffixes:
- K: Kilobytes (1024 bytes)
- M: Megabytes (1024 * 1024 bytes)
- No suffix: Bytes
Count Specifications
Block counts can be:
- Exact number: Specific block count (e.g., 3)
- Variable: Asterisk (*) for variable block count
- Dynamic sizing: Block count of zero with an average block size. Block boundaries are determined dynamically based on content. Distinct from variable (*) block count.
Block Format
b:{size[*count]=encoding}
Components:
- size: Block size with optional unit suffix
- count: Block count (optional, defaults to 1)
- encoding: Encoding specification for blocks
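The size and count part of a block spec can be parsed independently of the encoding that follows the equals sign. The sketch below follows the examples in this document, where a trailing * with no number means a variable count; parse_size_and_count is a hypothetical helper and BlockCount mirrors the enum shown later under Parser Implementation.

```rust
/// Block count as described in this section.
#[derive(Debug, PartialEq)]
enum BlockCount {
    Exact(u32),
    /// Written as a bare `*` after the size.
    Variable,
}

/// Parse the `size[*count]` part of a block spec, e.g. "16K*" or "4K*3"
/// (hypothetical helper). Returns the size in bytes and the block count.
fn parse_size_and_count(spec: &str) -> Result<(u64, BlockCount), String> {
    // Everything before the first '*' is the size; the rest is the count.
    let (size_part, count_part) = match spec.split_once('*') {
        Some((size, count)) => (size, Some(count)),
        None => (spec, None),
    };
    // Apply the optional unit suffix.
    let (digits, multiplier) = if let Some(d) = size_part.strip_suffix('K') {
        (d, 1024u64)
    } else if let Some(d) = size_part.strip_suffix('M') {
        (d, 1024 * 1024)
    } else {
        (size_part, 1)
    };
    let size = digits
        .parse::<u64>()
        .map_err(|e| format!("bad size {size_part:?}: {e}"))?
        .checked_mul(multiplier)
        .ok_or_else(|| "size overflows u64".to_string())?;
    let count = match count_part {
        // The count is optional and defaults to 1.
        None => BlockCount::Exact(1),
        // A bare `*` means a variable number of blocks.
        Some("") => BlockCount::Variable,
        Some(n) => BlockCount::Exact(
            n.parse::<u32>()
                .map_err(|e| format!("bad count {n:?}: {e}"))?,
        ),
    };
    Ok((size, count))
}
```

A full block parser would split each spec at the equals sign, pass the left side to this function, and parse the right side as a nested encoding.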
Grammar Reference
Simple Encodings
plain := "n"
zlib := "z" [ ":" ( level | "{" zlib_params "}" ) ]
zlib_params := ( level | variant ) [ "," ( variant | window_bits ) ] [ "," window_bits ]
encryption := "e" ":" "{" key "," iv "," content_encoding "}"
Zlib supports multiple syntax forms: z, z:9, z:{9}, z:{9,mpq},
z:{9,15}, z:{9,mpq,15}, z:{mpq}, z:{mpq,15}. The second parameter
can be either a variant name or a numeric window_bits value.
Block Encoding
block := "b" ":" ( "{" block_spec { "," block_spec } "}" | encoding )
block_spec := size [ "*" count ] "=" encoding
size := number [ unit ]
unit := "K" | "M"
count := number | "*"
A block table can omit braces when it contains a single encoding with no size
specification: b:z is equivalent to a single block with no explicit size.
Complex Encodings
encoding := plain | zlib | encryption | block | bcpack | gdeflate
bcpack := "c" [ ":" "{" bcn "}" ]
gdeflate := "g" [ ":" "{" level "}" ]
Examples
Simple Block Encoding
b:{495=z,9673=n}
This specifies:
- First block: 495 bytes, zlib compressed
- Second block: 9673 bytes, uncompressed
Variable Block Sizes
b:{16K*=z}
This specifies:
- Variable number of 16KB blocks
- All blocks use zlib compression
Encrypted Blocks
b:{256K*=e:{key,iv,z}}
This specifies:
- Variable number of 256KB blocks
- Each block is encrypted with specified key and IV
- Content is zlib compressed before encryption
Compression Levels
b:{16K*=z:{6,mpq}}
This specifies:
- Variable number of 16KB blocks
- Zlib compression level 6
- MPQ-compatible compression settings
Mixed Block Types
b:{1K=n,4K*=z,2K=n}
This specifies:
- First block: 1KB uncompressed
- Variable number of 4KB zlib-compressed blocks
- Final block: 2KB uncompressed
Zlib Compression Levels
Level Specification
Zlib compression supports level, variant, and window bits parameters:
z:{level}
z:{level,window_bits}
z:{level,variant}
z:{level,variant,window_bits}
Standard Levels
Valid levels are 1-9:
- 1: Fastest compression
- 6: Default compression (balance of speed/size)
- 9: Maximum compression
Level 0 is not accepted.
Variant Specifications
- mpq: MPQ-compatible compression settings
- zlib: Standard zlib settings
- lz4hc: LZ4HC-compatible compression settings
Window Bits
Zlib window bit count can be specified in range [8, 15]. Two values can be provided (must match). Default is 15.
Compression Examples
z:{1} # Fast compression
z:{9} # Maximum compression
z:{6,mpq} # MPQ-compatible level 6
z:{6,zlib,15} # Zlib variant with explicit window bits
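A parser for these zlib forms has to decide whether a number is a level or a window bits value from its position. The sketch below treats the first number seen before any variant as the level and every later number as window bits; ZlibParams and parse_zlib are hypothetical names, and the parser is deliberately lax about repeated parameters.

```rust
/// Parsed zlib parameters; None means "not specified" (hypothetical type).
#[derive(Debug, PartialEq, Default)]
struct ZlibParams {
    level: Option<u8>,
    variant: Option<String>,
    window_bits: Option<u8>,
}

/// Parse "z", "z:9" or "z:{...}" (hypothetical helper following the forms
/// listed in this document).
fn parse_zlib(spec: &str) -> Result<ZlibParams, String> {
    let mut params = ZlibParams::default();
    let rest = spec.strip_prefix('z').ok_or("not a zlib spec")?;
    if rest.is_empty() {
        return Ok(params);
    }
    let body = rest.strip_prefix(':').ok_or("expected ':' after 'z'")?;
    // Braces are optional around a single level: `z:9` and `z:{9}`.
    let body = body
        .strip_prefix('{')
        .and_then(|b| b.strip_suffix('}'))
        .unwrap_or(body);
    for part in body.split(',') {
        match part {
            "mpq" | "zlib" | "lz4hc" => params.variant = Some(part.to_string()),
            _ => {
                let n: u8 = part
                    .parse()
                    .map_err(|_| format!("bad parameter {part:?}"))?;
                if params.level.is_none() && params.variant.is_none() {
                    // First number, before any variant: compression level 1-9.
                    if !(1..=9).contains(&n) {
                        return Err(format!("level {n} outside 1-9"));
                    }
                    params.level = Some(n);
                } else {
                    // Any later number: window bits in [8, 15].
                    if !(8..=15).contains(&n) {
                        return Err(format!("window bits {n} outside 8-15"));
                    }
                    params.window_bits = Some(n);
                }
            }
        }
    }
    Ok(params)
}
```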
Encryption Specification
Format
e:{key,iv,content_encoding}
Components
- key: Encryption key identifier or value
- iv: Initialization vector
- content_encoding: Encoding applied before encryption
Key Format
Keys must be exactly 16 hex characters (8 bytes):
e:{0123456789abcdef,fedcba98,z}
This specifies:
- Encryption key: 0123456789abcdef (16 hex chars, 8 bytes)
- IV: fedcba98 (8 hex chars, 4 bytes)
- Content: zlib compressed before encryption
The parser rejects keys that are not exactly 16 hex characters. The IV must be exactly 8 hex characters (4 bytes).
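The length rules for key and IV can be enforced while decoding the hex text. In this sketch the bytes are returned in the order they appear in the text; whether that matches the byte order used inside BLTE blocks is not something this example claims. parse_key_and_iv is a hypothetical helper.

```rust
/// Decode the key and IV of an `e:{key,iv,...}` spec (hypothetical helper).
/// Bytes are returned in the order they appear in the hex text.
fn parse_key_and_iv(key: &str, iv: &str) -> Result<([u8; 8], [u8; 4]), String> {
    let is_hex = |s: &str| s.bytes().all(|b| b.is_ascii_hexdigit());
    // Keys must be exactly 16 hex characters (8 bytes).
    if key.len() != 16 || !is_hex(key) {
        return Err(format!("key must be 16 hex characters, got {key:?}"));
    }
    // IVs must be exactly 8 hex characters (4 bytes).
    if iv.len() != 8 || !is_hex(iv) {
        return Err(format!("iv must be 8 hex characters, got {iv:?}"));
    }
    let key = u64::from_str_radix(key, 16).map_err(|e| e.to_string())?;
    let iv = u32::from_str_radix(iv, 16).map_err(|e| e.to_string())?;
    // Big-endian conversion keeps the bytes in textual order.
    Ok((key.to_be_bytes(), iv.to_be_bytes()))
}
```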
BCPack Compression
BCPack Usage
c
c:{3}
BCPack compression uses a proprietary algorithm optimized for specific content types. An optional BCn (block compression number) parameter selects the mode, in range [1, 7]:
bcpack := "c" [ ":" "{" bcn "}" ]
Block-Based BCPack
b:{64K*=c}
b:{64K*=c:{5}}
Variable 64KB blocks using BCPack compression.
GDeflate Compression
GDeflate Usage
g
g:{6}
GDeflate is a GPU-accelerated deflate variant designed for DirectStorage. An optional compression level parameter can be specified in range [1, 12]:
gdeflate := "g" [ ":" "{" level "}" ]
Block-Based GDeflate
b:{32K*=g}
b:{32K*=g:{8}}
Variable 32KB blocks using GDeflate compression.
Usage Contexts
PatchConfig Files
ESpec appears in patch-entry lines:
patch-entry = source_hash target_hash size espec
Example:
patch-entry = 1234567890abcdef abcdef1234567890 524288 b:{16K*=z}
Encoding Files
Encoding files use ESpec for content encoding specifications:
content_key encoded_key size espec
BLTE Data Blocks
BLTE headers contain ESpec for block processing instructions:
graph TD
A[BLTE Header] --> B[Block Count]
A --> C[ESpec]
C --> D[Block 1 Processing]
C --> E[Block 2 Processing]
C --> F[Block N Processing]
Parser Implementation
Tokenization
ESpec parsing requires tokenization of:
- Identifiers: Single characters (n, z, e, b, c, g)
- Numbers: Decimal integers
- Units: Size suffixes (K, M)
- Delimiters: Braces, colons, commas, equals, asterisks
Grammar Rules
#![allow(unused)]
fn main() {
// Example parser structure
enum ESpec {
Plain,
Zlib { level: Option<u8>, variant: Option<String> },
Encryption { key: String, iv: String, content: Box<ESpec> },
Block { specs: Vec<BlockSpec> },
BCPack,
GDeflate,
}
struct BlockSpec {
size: u64,
count: BlockCount,
encoding: ESpec,
}
enum BlockCount {
Exact(u32),
Variable,
}
}
Error Handling
Common parsing errors:
-
Invalid identifier characters
-
Malformed block specifications
-
Missing required parameters
-
Invalid size or count values
-
Unbalanced braces or parentheses
Validation Rules
Size Constraints
-
Block sizes must be positive integers
-
Maximum block size typically limited to several MB
-
Minimum block size typically 1 byte
Count Constraints
-
Block counts must be positive integers when specified
-
Variable count (
*) requires size specification -
Total content size must be consistent
Encoding Constraints
-
Encryption requires valid key and IV lengths
-
Compression levels must be within algorithm-specific ranges
-
Nested encodings must be logically valid
Performance Considerations
Block Size Selection
Block sizes depend on usage:
- Small blocks (1-4KB): Better for streaming, higher overhead
- Medium blocks (16-64KB): Balanced performance
- Large blocks (256KB+): Better compression ratios, higher memory usage
Compression Algorithm Selection
Algorithm characteristics:
- zlib: Universal compatibility, good compression
- BCPack: Optimized for specific content types
- GDeflate: Fast compression with good ratios
- None (n): Maximum speed, no space savings
Memory Usage
#![allow(unused)]
fn main() {
// Example memory-efficient processing
fn process_blocks(espec: &ESpec, data: &[u8]) -> Result<Vec<u8>> {
match espec {
ESpec::Block { specs } => {
let mut output = Vec::new();
let mut offset = 0;
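// Simplified: each spec is treated as a single block and spec.count is ignored;
// repeated or variable-count (*) specs would need an inner loop here.
for spec in specs {
// Note: this slice panics if data is shorter than the declared block sizes
let block_data = &data[offset..offset + spec.size as usize];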
let processed = process_encoding(&spec.encoding, block_data)?;
output.extend(processed);
offset += spec.size as usize;
}
Ok(output)
}
// Other encoding types...
}
}
}
Common Patterns
Streaming-Optimized
b:{16K*=z}
Small, consistent block sizes for streaming applications.
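To make the `*` count concrete, the sketch below expands a repeating block size into the list of block lengths for a given content size. It assumes that `*` means "repeat until the content is exhausted" and that the final block may be shorter than the nominal size; `expand_variable_blocks` is a hypothetical helper, not a cascette-formats function.

```rust
/// Expand a repeating spec such as `16K*` into concrete block lengths.
/// Returns `None` for a zero block size, which the size constraints forbid.
pub fn expand_variable_blocks(block_size: u64, total_size: u64) -> Option<Vec<u64>> {
    if block_size == 0 {
        return None;
    }
    let mut sizes = Vec::new();
    let mut remaining = total_size;
    while remaining > 0 {
        // Every block is full-sized except possibly the last one.
        let len = remaining.min(block_size);
        sizes.push(len);
        remaining -= len;
    }
    Some(sizes)
}
```

For example, 40,000 bytes of content under `b:{16K*=z}` would split into two 16,384-byte blocks followed by one 7,232-byte block under these assumptions.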
Storage-Optimized
b:{1M*=z:{9}}
Large blocks with maximum compression for storage efficiency.
Mixed Content
b:{4K=n,64K*=z,4K=n}
Headers and footers uncompressed, bulk content compressed.
Encrypted Streaming
b:{32K*=e:{key,iv,z:{6}}}
Moderate block sizes with encryption and balanced compression.
Debugging and Validation
ESpec Validation
#![allow(unused)]
fn main() {
fn validate_espec(espec: &str) -> Result<ESpec, ESpecError> {
let parsed = parse_espec(espec)?;
validate_constraints(&parsed)?;
Ok(parsed)
}
fn validate_constraints(espec: &ESpec) -> Result<(), ESpecError> {
match espec {
ESpec::Zlib { level: Some(level), .. } if *level > 9 => {
Err(ESpecError::InvalidCompressionLevel(*level))
}
ESpec::Block { specs } if specs.is_empty() => {
Err(ESpecError::EmptyBlockSpec)
}
// Additional validation rules...
_ => Ok(())
}
}
}
Round-Trip Testing
#![allow(unused)]
fn main() {
#[test]
fn test_espec_round_trip() {
let original = "b:{16K*=z:{6}}";
let parsed = parse_espec(original).unwrap();
let serialized = serialize_espec(&parsed);
assert_eq!(original, serialized);
}
}
Integration Examples
BLTE Block Processing
#![allow(unused)]
fn main() {
fn process_blte_block(espec: &ESpec, input: &[u8]) -> Result<Vec<u8>> {
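match espec {
ESpec::Plain => Ok(input.to_vec()),
// The compression level only matters when encoding; decompression ignores it
ESpec::Zlib { .. } => decompress_zlib(input),
ESpec::Encryption { key, iv, content } => {
let decrypted = decrypt(input, key, iv)?;
process_blte_block(content, &decrypted)
}
ESpec::Block { specs } => process_block_specs(specs, input),
// BCPack and GDeflate need their own decoders, which this example omits
ESpec::BCPack | ESpec::GDeflate => unimplemented!("BCPack/GDeflate decoding"),
}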
}
}
Patch Application
#![allow(unused)]
fn main() {
fn apply_patch_with_espec(
source: &[u8],
patch: &[u8],
espec: &ESpec
) -> Result<Vec<u8>> {
let processed_patch = process_blte_block(espec, patch)?;
apply_binary_patch(source, &processed_patch)
}
}
Reference Implementation
Complete Parser
#![allow(unused)]
fn main() {
use nom::{
branch::alt,
bytes::complete::tag,
character::complete::{alphanumeric1, char, digit1},
combinator::{map, opt},
multi::separated_list0,
sequence::{delimited, preceded, separated_pair, tuple},
IResult,
};
pub fn parse_espec(input: &str) -> IResult<&str, ESpec> {
alt((
parse_plain,
parse_zlib,
parse_encryption,
parse_block,
parse_bcpack,
parse_gdeflate,
))(input)
}
fn parse_plain(input: &str) -> IResult<&str, ESpec> {
map(char('n'), |_| ESpec::Plain)(input)
}
fn parse_zlib(input: &str) -> IResult<&str, ESpec> {
map(
tuple((
char('z'),
opt(preceded(
char(':'),
delimited(
char('{'),
separated_pair(
digit1,
opt(char(',')),
opt(alphanumeric1)
),
char('}')
)
))
)),
|(_, params)| match params {
Some((level, variant)) => ESpec::Zlib {
level: level.parse().ok(),
variant: variant.map(|s| s.to_string()),
},
None => ESpec::Zlib { level: None, variant: None },
}
)(input)
}
}
Implementation Status
Rust Implementation (cascette-formats)
ESpec parser:
- Plain (n) - Uncompressed content
- ZLib compression (z) - Level [1,9], variant (mpq/zlib/lz4hc), window bits [8,15]; all optional, 3-param syntax supported
- Encryption (e) - Key, IV, and nested content encoding
- Block-based (b) - Variable and fixed block specifications
- BCPack (c) - Optional BCn version [1,7]; bare c accepted
- GDeflate (g) - Optional level [1,12]; bare g accepted
Parser Features:
- Safe integer casting with try_from to prevent truncation
- Display trait implementation for round-trip string conversion
- Test suite covering production ESpec patterns and edge cases
- Integration with BLTE and Encoding file processing
Analysis and Validation
ESpec patterns are validated across all CASC formats to ensure correct parsing and processing of compression and encryption specifications.
Salsa20 Encryption in CASC
Salsa20 is the primary stream cipher used for encrypting sensitive content in CASC archives. It provides fast, secure encryption for game assets while maintaining streaming capabilities.
Overview
CASC uses Salsa20 with 128-bit (16-byte) keys and the tau (“expand 16-byte k”) constants. Each encrypted BLTE block specifies a 64-bit key name for key store lookup and a 4-byte IV that is extended to 8 bytes by zero-padding.
Algorithm Details
Salsa20 Core
Salsa20 is a stream cipher designed by Daniel J. Bernstein:
- Key size: 128 bits (16 bytes) in CASC; 256 bits (32 bytes) in standard Salsa20
- Nonce/IV size: 64 bits (8 bytes)
- Block size: 512 bits (64 bytes)
- Rounds: 20 (reduced variants use 8 or 12)
Core Function
#![allow(unused)]
fn main() {
fn salsa20_core(input: &[u32; 16]) -> [u32; 16] {
let mut x = *input;
// 20 rounds (10 double-rounds)
for _ in 0..10 {
// Column round
quarter_round(&mut x, 0, 4, 8, 12);
quarter_round(&mut x, 5, 9, 13, 1);
quarter_round(&mut x, 10, 14, 2, 6);
quarter_round(&mut x, 15, 3, 7, 11);
// Row round
quarter_round(&mut x, 0, 1, 2, 3);
quarter_round(&mut x, 5, 6, 7, 4);
quarter_round(&mut x, 10, 11, 8, 9);
quarter_round(&mut x, 15, 12, 13, 14);
}
// Add input to output
for i in 0..16 {
x[i] = x[i].wrapping_add(input[i]);
}
x
}
fn quarter_round(x: &mut [u32; 16], a: usize, b: usize, c: usize, d: usize) {
x[b] ^= (x[a].wrapping_add(x[d])).rotate_left(7);
x[c] ^= (x[b].wrapping_add(x[a])).rotate_left(9);
x[d] ^= (x[c].wrapping_add(x[b])).rotate_left(13);
x[a] ^= (x[d].wrapping_add(x[c])).rotate_left(18);
}
}
CASC Implementation
BLTE Encryption Block
In BLTE files, encrypted blocks use the following format:
[0x45] [key_name_size:1] [key_name:8] [iv_size:1] [iv:4] [type:1]
[encrypted_data...]
Where:
- 0x45: ‘E’ marker for encrypted block
- key_name: 64-bit key identifier
- iv: Initialization vector (1-8 bytes, typically 4)
- type: 0x53 (‘S’) for Salsa20, or 0x41 (‘A’) for ARC4 in legacy CASC versions (not used in TACT 3.13.3+)
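The layout above can be parsed with explicit bounds checks so that truncated input yields an error rather than a panic. The following is a minimal sketch: `EncryptedBlock` and `parse_encrypted_block` are hypothetical names, the key name is read as little-endian to match the decryption example in this document, and only 8-byte key names are accepted.

```rust
/// Parsed view of an encrypted BLTE block (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub struct EncryptedBlock<'a> {
    pub key_name: u64,
    pub iv: &'a [u8],
    pub cipher_type: u8,
    pub ciphertext: &'a [u8],
}

/// Parse `[0x45][key_name_size][key_name][iv_size][iv][type][data...]`.
pub fn parse_encrypted_block(block: &[u8]) -> Result<EncryptedBlock<'_>, &'static str> {
    let (&marker, rest) = block.split_first().ok_or("empty block")?;
    if marker != 0x45 {
        return Err("missing 'E' marker");
    }
    let (&key_name_size, rest) = rest.split_first().ok_or("missing key name size")?;
    if key_name_size != 8 {
        return Err("key name must be 8 bytes");
    }
    if rest.len() < 8 {
        return Err("truncated key name");
    }
    let (key_bytes, rest) = rest.split_at(8);
    let mut key_name = [0u8; 8];
    key_name.copy_from_slice(key_bytes);

    let (&iv_size, rest) = rest.split_first().ok_or("missing IV size")?;
    let iv_size = usize::from(iv_size);
    if iv_size == 0 || iv_size > 8 {
        return Err("IV size must be 1-8 bytes");
    }
    if rest.len() < iv_size {
        return Err("truncated IV");
    }
    let (iv, rest) = rest.split_at(iv_size);

    // The cipher type byte follows the IV; everything after it is ciphertext.
    let (&cipher_type, ciphertext) = rest.split_first().ok_or("missing cipher type")?;
    Ok(EncryptedBlock {
        key_name: u64::from_le_bytes(key_name),
        iv,
        cipher_type,
        ciphertext,
    })
}
```

Returning borrowed slices avoids copying the ciphertext; the caller can then check `cipher_type` (0x53 or 0x41) and dispatch to the matching cipher.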
Key Lookup
CASC uses a 64-bit key name to look up the 16-byte encryption key from a key store. The agent calls a key getter callback with the key name; there is no key derivation in the encryption path.
#![allow(unused)]
fn main() {
struct CASCKeyManager {
keys: HashMap<u64, [u8; 16]>, // key_name -> 16-byte key
}
impl CASCKeyManager {
pub fn get_key(&self, key_name: u64) -> Option<[u8; 16]> {
self.keys.get(&key_name).copied()
}
}
}
IV Modification for Chunks
For multi-chunk BLTE files, the IV is modified per chunk:
#![allow(unused)]
fn main() {
fn modify_iv_for_chunk(base_iv: u32, chunk_index: usize) -> u32 {
let mut iv_bytes = base_iv.to_le_bytes();
// XOR with chunk index
for i in 0..4 {
iv_bytes[i] ^= ((chunk_index >> (i * 8)) & 0xFF) as u8;
}
u32::from_le_bytes(iv_bytes)
}
}
Salsa20 State Setup
State Initialization
#![allow(unused)]
fn main() {
struct Salsa20State {
state: [u32; 16],
counter: u64,
}
impl Salsa20State {
pub fn new(key: &[u8; 16], nonce: &[u8; 8]) -> Self {
let mut state = [0u32; 16];
// Tau constants "expand 16-byte k" (CASC uses 16-byte keys)
state[0] = 0x61707865; // "expa"
state[5] = 0x3120646e; // "nd 1"
state[10] = 0x79622d36; // "6-by"
state[15] = 0x6b206574; // "te k"
// 16-byte key placed at positions 1-4 and duplicated at 11-14
for i in 0..4 {
let word = u32::from_le_bytes([
key[i * 4],
key[i * 4 + 1],
key[i * 4 + 2],
key[i * 4 + 3],
]);
state[1 + i] = word;
state[11 + i] = word; // Duplicate for 16-byte key mode
}
// Counter (initially 0)
state[8] = 0;
state[9] = 0;
// Nonce
state[6] = u32::from_le_bytes([nonce[0], nonce[1], nonce[2], nonce[3]]);
state[7] = u32::from_le_bytes([nonce[4], nonce[5], nonce[6], nonce[7]]);
Salsa20State { state, counter: 0 }
}
}
}
Encryption/Decryption
Stream Generation
#![allow(unused)]
fn main() {
impl Salsa20State {
pub fn generate_keystream(&mut self, output: &mut [u8]) {
let mut pos = 0;
while pos < output.len() {
// Generate next block
let block = salsa20_core(&self.state);
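// Serialize the 16 output words as little-endian bytes, as Salsa20 specifies.
// This avoids unsafe code and gives the same result on big-endian targets.
let mut block_bytes = [0u8; 64];
for (chunk, word) in block_bytes.chunks_exact_mut(4).zip(block.iter()) {
chunk.copy_from_slice(&word.to_le_bytes());
}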
// Copy to output
let copy_len = std::cmp::min(64, output.len() - pos);
output[pos..pos + copy_len]
.copy_from_slice(&block_bytes[..copy_len]);
// Increment counter
self.increment_counter();
pos += copy_len;
}
}
fn increment_counter(&mut self) {
self.counter += 1;
self.state[8] = (self.counter & 0xFFFFFFFF) as u32;
self.state[9] = (self.counter >> 32) as u32;
}
}
}
Decryption Process
#![allow(unused)]
fn main() {
pub fn decrypt_salsa20(
ciphertext: &[u8],
key: &[u8; 16], // CASC uses 16-byte keys, matching Salsa20State::new
nonce: &[u8; 8]
) -> Vec<u8> {
let mut state = Salsa20State::new(key, nonce);
let mut keystream = vec![0u8; ciphertext.len()];
state.generate_keystream(&mut keystream);
// XOR ciphertext with keystream
let mut plaintext = Vec::with_capacity(ciphertext.len());
for i in 0..ciphertext.len() {
plaintext.push(ciphertext[i] ^ keystream[i]);
}
plaintext
}
}
CASC-Specific Usage
BLTE Decryption
#![allow(unused)]
fn main() {
fn decrypt_blte_chunk(
chunk_data: &[u8],
chunk_index: usize,
key_manager: &CASCKeyManager
) -> Result<Vec<u8>> {
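// Parse encryption header. chunk_data starts after the 0x45 ('E') marker byte.
// This example assumes an 8-byte key name and a 4-byte IV; other sizes make
// the try_into() conversions below fail.
let key_name_size = chunk_data[0] as usize;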
let key_name = u64::from_le_bytes(
chunk_data[1..1 + key_name_size].try_into()?
);
let iv_offset = 1 + key_name_size;
let iv_size = chunk_data[iv_offset] as usize;
let base_iv = u32::from_le_bytes(
chunk_data[iv_offset + 1..iv_offset + 1 + iv_size].try_into()?
);
let cipher_type = chunk_data[iv_offset + 1 + iv_size];
if cipher_type != 0x53 { // 'S' for Salsa20
return Err("Not Salsa20 encrypted".into());
}
// Get encryption key
let key = key_manager.get_key(key_name)
.ok_or("Key not found")?;
// Modify IV for chunk
let iv = modify_iv_for_chunk(base_iv, chunk_index);
let mut nonce = [0u8; 8];
nonce[..4].copy_from_slice(&iv.to_le_bytes());
// Decrypt data
let encrypted_offset = iv_offset + 1 + iv_size + 1;
let ciphertext = &chunk_data[encrypted_offset..];
Ok(decrypt_salsa20(ciphertext, &key, &nonce))
}
}
Known Keys
CASC uses various encryption keys for different content:
#![allow(unused)]
fn main() {
// Example key names (actual keys not included for legal reasons)
const CINEMATIC_KEY: u64 = 0xFAC5C7F366D20C85;
const ACHIEVEMENT_KEY: u64 = 0x0123456789ABCDEF;
const PVP_KEY: u64 = 0xDEADBEEFCAFEBABE;
}
Performance Optimization
SIMD Implementation
Using SIMD for parallel processing:
#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;
unsafe fn salsa20_core_simd(input: &[u32; 16]) -> [u32; 16] {
// Load state into SIMD registers
let mut row0 = _mm_loadu_si128(input[0..4].as_ptr() as *const __m128i);
let mut row1 = _mm_loadu_si128(input[4..8].as_ptr() as *const __m128i);
let mut row2 = _mm_loadu_si128(input[8..12].as_ptr() as *const __m128i);
let mut row3 = _mm_loadu_si128(input[12..16].as_ptr() as *const __m128i);
// Perform rounds using SIMD operations
// ... (implementation details)
// Store results
let mut output = [0u32; 16];
_mm_storeu_si128(output[0..4].as_mut_ptr() as *mut __m128i, row0);
_mm_storeu_si128(output[4..8].as_mut_ptr() as *mut __m128i, row1);
_mm_storeu_si128(output[8..12].as_mut_ptr() as *mut __m128i, row2);
_mm_storeu_si128(output[12..16].as_mut_ptr() as *mut __m128i, row3);
output
}
}
Buffered Decryption
For large files:
#![allow(unused)]
fn main() {
struct BufferedSalsa20 {
state: Salsa20State,
buffer: [u8; 4096],
buffer_pos: usize,
}
impl BufferedSalsa20 {
pub fn decrypt_stream<R: Read, W: Write>(
&mut self,
input: &mut R,
output: &mut W
) -> Result<()> {
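let mut cipher_buffer = [0u8; 4096];
loop {
// Fill the buffer completely (or reach EOF) before decrypting.
// generate_keystream consumes whole 64-byte blocks, so a short read that
// is not a multiple of 64 bytes would desynchronize the keystream.
let mut filled = 0;
while filled < cipher_buffer.len() {
let n = input.read(&mut cipher_buffer[filled..])?;
if n == 0 {
break;
}
filled += n;
}
if filled == 0 {
break;
}
self.state.generate_keystream(&mut self.buffer[..filled]);
for i in 0..filled {
self.buffer[i] ^= cipher_buffer[i];
}
output.write_all(&self.buffer[..filled])?;
}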
Ok(())
}
}
}
Security Considerations
- IV Uniqueness: IVs must not be reused with the same key (CASC handles this via chunk index XOR)
- Side Channels: Use constant-time operations for key comparison
- Key Storage: CASC encryption keys are static and community-maintained; the TactKeyStore keeps them in memory with redacted debug output
Testing
Test Vectors
#![allow(unused)]
fn main() {
#[cfg(test)]
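mod tests {
use super::*;
#[test]
fn test_salsa20_encryption() {
let key = [0u8; 16]; // CASC uses 16-byte keys
let nonce = [0u8; 8];
let plaintext = b"Hello, World!";
// Salsa20 is a stream cipher: encryption and decryption are the same XOR,
// so decrypt_salsa20 also serves as the encryption function here
let ciphertext = decrypt_salsa20(plaintext, &key, &nonce);
let decrypted = decrypt_salsa20(&ciphertext, &key, &nonce);
assert_eq!(&plaintext[..], &decrypted[..]);
}
}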
}
cascette-crypto API
The cascette-crypto crate provides a CASC-specific Salsa20 implementation.
Basic Decryption
#![allow(unused)]
fn main() {
use cascette_crypto::salsa20::{decrypt_salsa20, Salsa20Cipher};
// CASC uses 16-byte keys and 4-byte IVs
let key: [u8; 16] = [0x01; 16];
let iv: [u8; 4] = [0x02, 0x03, 0x04, 0x05];
let block_index = 0; // First block in BLTE file
let ciphertext = &[/* encrypted data */];
let plaintext = decrypt_salsa20(ciphertext, &key, &iv, block_index)
.expect("decryption failed");
}
In-Place Processing
#![allow(unused)]
fn main() {
use cascette_crypto::Salsa20Cipher;
let key: [u8; 16] = [0x42; 16];
let iv: [u8; 4] = [0x11, 0x22, 0x33, 0x44];
let mut cipher = Salsa20Cipher::new(&key, &iv, 0)
.expect("cipher creation failed");
let mut data = vec![0u8; 1024];
cipher.apply_keystream(&mut data);
}
TACT Key Management
#![allow(unused)]
fn main() {
use cascette_crypto::{TactKeyStore, TactKey};
// Create store with hardcoded WoW keys
let store = TactKeyStore::new();
// Look up key by ID
let key_id = 0xFA505078126ACB3E_u64;
if let Some(key) = store.get(key_id) {
// Use key for decryption
println!("Found key: {:02X?}", key);
}
// Add custom key
let mut store = TactKeyStore::empty();
let key = TactKey::from_hex(
0x1234567890ABCDEF,
"0123456789ABCDEF0123456789ABCDEF"
).expect("invalid key hex");
store.add(key);
// Load keys from string content (file I/O is caller's responsibility)
let csv_content = "FA505078126ACB3E,BDC51862ABED79B2DE48C8E7E66C6200";
store.load_from_csv(csv_content);
let txt_content = "FA505078126ACB3E BDC51862ABED79B2DE48C8E7E66C6200";
store.load_from_txt(txt_content);
}
Custom Storage Backends
The TactKeyProvider trait allows implementing custom key storage:
#![allow(unused)]
fn main() {
use cascette_crypto::{TactKeyProvider, TactKey, CryptoError};
// Implement for keyring, database, encrypted files, etc.
struct MyKeyStore { /* ... */ }
impl TactKeyProvider for MyKeyStore {
fn get_key(&self, id: u64) -> Result<Option<[u8; 16]>, CryptoError> {
// Look up key from your storage backend
todo!()
}
fn add_key(&mut self, key: TactKey) -> Result<(), CryptoError> {
// Store key in your backend
todo!()
}
// ... other trait methods
}
}
ARC4 (Legacy)
#![allow(unused)]
fn main() {
use cascette_crypto::Arc4Cipher;
// ARC4 used in older BLTE encrypted blocks
let key = b"encryption_key";
let mut cipher = Arc4Cipher::new(key)
.expect("cipher creation failed");
let encrypted = cipher.encrypt(b"plaintext");
// Decrypt requires fresh cipher instance
let mut cipher = Arc4Cipher::new(key)
.expect("cipher creation failed");
let decrypted = cipher.decrypt(&encrypted);
}
Implementation Details
CASC-Specific Differences
The CASC Salsa20 variant differs from standard Salsa20:
| Aspect | Standard Salsa20 | CASC Salsa20 |
|---|---|---|
| Key size | 32 bytes | 16 bytes (duplicated internally) |
| IV/Nonce size | 8 bytes | 4 bytes (extended internally) |
| Constants | “expand 32-byte k” | “expand 16-byte k” |
| Block index | Counter-based | XORed with IV |
Key Duplication
CASC uses 16-byte keys with the “expand 16-byte k” (tau) constants:
#![allow(unused)]
fn main() {
// Tau constants for 16-byte keys
state[0] = 0x61707865; // "expa"
state[5] = 0x3120646e; // "nd 1"
state[10] = 0x79622d36; // "6-by"
state[15] = 0x6b206574; // "te k"
// Key bytes 0-15 placed at positions 1-4
// Key bytes 0-15 repeated at positions 11-14
}
IV Extension
The IV modification and zero-padding algorithm is documented in the CASC Implementation section above.
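As a compact restatement of that algorithm, the sketch below zero-pads the IV to the 8-byte Salsa20 nonce and XORs the block index into the low bytes, little-endian, mirroring the `modify_iv_for_chunk` example. `build_nonce` is a hypothetical helper written for this page, not the cascette-crypto API.

```rust
/// Build the 8-byte Salsa20 nonce from a BLTE IV and block index.
/// Returns `None` if the IV is empty or longer than 8 bytes.
pub fn build_nonce(iv: &[u8], block_index: u32) -> Option<[u8; 8]> {
    if iv.is_empty() || iv.len() > 8 {
        return None;
    }
    // Zero-pad: bytes beyond the IV stay 0 (the IV is not duplicated).
    let mut nonce = [0u8; 8];
    nonce[..iv.len()].copy_from_slice(iv);
    // XOR the block index into the first four bytes, least significant first.
    for (byte, index_byte) in nonce.iter_mut().zip(block_index.to_le_bytes()) {
        *byte ^= index_byte;
    }
    Some(nonce)
}
```

With block index 0 the nonce is simply the zero-padded IV, which is why single-block files decrypt correctly even if the per-block XOR step is skipped.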
Validation Status
- Integration tests with real WoW encryption keys
- Test suite validates against known BLTE ‘E’ mode samples
- Zero-allocation keystream generation for performance
Note: CascLib duplicates the IV (same bug as was in cascette-rs before the fix). The correct behavior is zero-padding.
TACT Key Coverage
The cascette-crypto crate includes hardcoded TACT keys for major WoW expansions:
- Battle for Azeroth, Shadowlands, The War Within, Classic Era
Keys are stored with redacted debug output to prevent accidental logging.
References
- See BLTE Format for encryption in BLTE blocks
- See Archives for encrypted content storage
CDN Architecture Documentation
Overview
NGDP uses a Content Delivery Network (CDN) architecture for distributing game content. The system provides geographical distribution of content through HTTP/HTTPS endpoints, with automatic failover and load balancing capabilities.
Note: Code examples in this document illustrate concepts. For working implementations, see the cascette CLI or the cascette-protocol crate.
Discovery and Access Flow
Product Discovery
Product discovery begins with a v1/summary query to the Ribbit TCP service:
sequenceDiagram
participant Client
participant Ribbit
participant CDN
Client->>Ribbit: v1/summary (TCP)
Ribbit-->>Client: Available products
Client->>Ribbit: v2/versions/{product}
Ribbit-->>Client: Version manifests
Client->>Ribbit: v2/cdns/{product}
Ribbit-->>Client: CDN configurations
Client->>CDN: HTTP GET config files
CDN-->>Client: BuildConfig, CDNConfig
Client->>CDN: HTTP GET content files
CDN-->>Client: Game data
Region Selection
NGDP supports the following regions:
- us: United States
- eu: Europe
- kr: Korea
- tw: Taiwan
- cn: China (restricted access)
- sg: Singapore
HTTPS v2 Endpoints
The v2 API provides three primary endpoints:
- versions: Product version information and build manifests
- cdns: CDN server configurations and endpoints
- bgdl: Background download configurations
Configuration Retrieval Process
1. Query product versions to get current build information
2. Retrieve CDN configurations to get the correct Path value
3. Download BuildConfig and CDNConfig files using the Path from step 2
4. Parse configuration to locate content files
5. Begin content download from CDN servers
CRITICAL: Always extract the Path field from CDN responses. Never assume
paths based on product names. For example, all WoW products (wow,
wow_classic, wow_classic_era, wow_classic_titan, wow_anniversary) use
tpr/wow despite having different product codes.
Content Download Workflow
flowchart TD
A[Get Product Versions] --> B[Select Build]
B --> C[Get CDN Config]
C --> D[Download BuildConfig]
D --> E[Download CDNConfig]
E --> F[Parse Archive Lists]
F --> G[Download Content Files]
G --> H[Verify Content Hashes]
style A stroke-width:3px
style H stroke-width:3px
style C stroke-width:2px,stroke-dasharray:5 5
style E stroke-width:2px,stroke-dasharray:5 5
CDN URL Construction
URL Pattern
http(s)://{cdn_server}/{cdn_path}/{type}/{hash[0:2]}/{hash[2:4]}/{full_hash}
Component Breakdown
- cdn_server: CDN hostname from the Hosts field (e.g., level3.blizzard.com)
- cdn_path: Path from the Path field - MUST be extracted from CDN response
- type: Content type (config, data, patch)
- hash[0:2]: First two characters of content hash
- hash[2:4]: Next two characters of content hash
- full_hash: Complete content hash
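The URL pattern and its components can be assembled as in the sketch below. `ContentType` and `cdn_url` are hypothetical names for illustration; the sketch hardcodes the `http` scheme and only checks that the hash is hexadecimal and long enough to shard, not that it has the full 32-character length.

```rust
/// The three content types that appear in CDN URLs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContentType {
    Config,
    Data,
    Patch,
}

impl ContentType {
    fn as_str(self) -> &'static str {
        match self {
            ContentType::Config => "config",
            ContentType::Data => "data",
            ContentType::Patch => "patch",
        }
    }
}

/// Build `http://{host}/{cdn_path}/{type}/{hash[0:2]}/{hash[2:4]}/{hash}`.
/// `cdn_path` must be the Path value taken from the CDN response.
pub fn cdn_url(host: &str, cdn_path: &str, kind: ContentType, hash: &str) -> Option<String> {
    // Reject anything that is not plain hex; this also keeps slicing safe.
    if hash.len() < 4 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let hash = hash.to_ascii_lowercase();
    let kind = kind.as_str();
    Some(format!(
        "http://{host}/{cdn_path}/{kind}/{}/{}/{hash}",
        &hash[..2],
        &hash[2..4]
    ))
}
```

Taking `cdn_path` as a parameter, rather than deriving it from a product code, keeps the function honest about where that value has to come from.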
Path vs ProductPath Distinction
IMPORTANT: The CDN response contains two path fields that serve different purposes:
- Path (e.g., tpr/wow): Used for ALL game content including:
  - Build configuration files (/config/)
  - CDN configuration files (/config/)
  - Encoding files (/data/)
  - Root files (/data/)
  - Archive files (/data/)
  - Patch files (/patch/)
  - All other game data
- ProductPath (e.g., tpr/configs): Used ONLY for:
  - Product configuration files that Battle.net agent/launcher use
  - These are JSON files containing product metadata and settings
  - Example: http://cdn.arctium.tools/tpr/configs/data/{hash}
Common mistake: Do NOT use ProductPath for build configs, CDN configs, or any game data files. ProductPath is exclusively for Battle.net launcher product configuration.
Directory Sharding
The two-level directory structure (hash[0:2]/hash[2:4]) distributes files
across 65,536 directories, keeping per-directory file counts low for filesystem
and CDN edge server performance.
Example URLs
# Configuration file
http://level3.blizzard.com/tpr/wow/config/12/34/1234567890abcdef1234567890abcdef
# Game data file
http://level3.blizzard.com/tpr/wow/data/ab/cd/abcdef1234567890abcdef1234567890
# Patch data
http://level3.blizzard.com/tpr/wow/patch/56/78/567890abcdef1234567890abcdef123456
Real-World Examples
Examples from wow_classic_era version 1.15.7.61582 (archived on Arctium CDN):
# Build configuration (hash: ae66faee0ac786fdd7d8b4cf90a8d5b9)
http://cdn.arctium.tools/tpr/wow/config/ae/66/ae66faee0ac786fdd7d8b4cf90a8d5b9
# CDN configuration (hash: 63eee50d456a6ddf3b630957c024dda0)
http://cdn.arctium.tools/tpr/wow/config/63/ee/63eee50d456a6ddf3b630957c024dda0
# Patch configuration (hash: 474b9630df5b46df5d98ec27c5f78d07)
http://cdn.arctium.tools/tpr/wow/config/47/4b/474b9630df5b46df5d98ec27c5f78d07
# Product configuration (different path structure)
http://cdn.arctium.tools/tpr/configs/data/c9/93/c9934edfc8f217a2e01c47e4deae8454
# Encoding file (using encoding key, not content key!)
# From build config: encoding = b07b881f4527bda7cf8a1a2f99e8622e bbf06e7476382cfaa396cff0049d356b
# Must use the SECOND hash (encoding key): bbf06e7476382cfaa396cff0049d356b
http://cdn.arctium.tools/tpr/wow/data/bb/f0/bbf06e7476382cfaa396cff0049d356b
# Root file: Cannot be fetched directly!
# The root file's encoding key must be looked up in the encoding file first.
# The hash ea8aefdebdbd6429da905c8c6a2b1813 is the content key, not the encoding key.
Note the different path structures:
- Most files use /tpr/wow/{type}/
- Product configurations use /tpr/configs/data/
- Patch files would be under /tpr/wow/patch/
Configuration Files
BuildConfig, CDNConfig, PatchConfig
See Configuration File Formats for the authoritative documentation of BuildConfig, CDNConfig, and PatchConfig fields, formats, and examples.
The key point for CDN access: most BuildConfig fields contain <content-key> <encoding-key> pairs. Use the encoding key (second hash) for CDN fetches.
The encoding file must be fetched first to resolve encoding keys for other
files.
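The "second hash" rule can be sketched as a small lookup over BuildConfig text. This assumes the `name = value` line format shown in the examples in this document; `encoding_key_for` is a hypothetical helper, and the authoritative field rules are in the Configuration File Formats documentation.

```rust
/// Return the encoding key (second hash) of a BuildConfig field, if present.
/// Fields that carry only a content key, such as `root`, yield `None`.
pub fn encoding_key_for<'a>(build_config: &'a str, field: &str) -> Option<&'a str> {
    build_config.lines().find_map(|line| {
        let (name, value) = line.split_once('=')?;
        if name.trim() != field {
            return None;
        }
        let mut hashes = value.split_whitespace();
        let _content_key = hashes.next()?; // first hash: content key
        hashes.next() // second hash: encoding key, used for CDN fetches
    })
}
```

A `None` result for a field that exists signals that the encoding key has to be resolved through the encoding file instead of being read from the config.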
CDN Response Structure
Field Definitions
- Name: CDN configuration identifier
- Path: Base path for content requests
- Hosts: List of CDN hostnames
- Servers: Legacy server configuration
- ConfigPath: Path to configuration files
Special Parameters
- maxhosts: Maximum number of hosts to use simultaneously
- fallback: Fallback CDN configuration
Example CDN Response
Name!STRING:0|Path!STRING:0|Hosts!STRING:0|Servers!STRING:0|ConfigPath!STRING:0
us|tpr/wow|level3.blizzard.com edgecast.blizzard.com|http://level3.blizzard.com/ http://edgecast.blizzard.com/|tpr/configs/data
eu|tpr/wow|eu.cdn.blizzard.com|http://eu.cdn.blizzard.com/|tpr/configs/data
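A response in this shape can be parsed by reading column names from the header row and then indexing each data row by name. The sketch below is illustrative only: `CdnEntry` and `parse_cdn_response` are hypothetical names, and skipping lines that start with `#` is an assumption about metadata lines that may appear in real responses.

```rust
/// One row of a CDN response, reduced to the fields needed for URL building.
#[derive(Debug, PartialEq, Eq)]
pub struct CdnEntry {
    pub name: String,
    pub path: String,
    pub hosts: Vec<String>,
}

/// Parse a pipe-separated CDN response. Returns `None` if the header lacks
/// the Name, Path, or Hosts column, or if a row is too short.
pub fn parse_cdn_response(text: &str) -> Option<Vec<CdnEntry>> {
    let mut lines = text
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'));

    // Header cells look like `Path!STRING:0`; keep only the name before `!`.
    let header = lines.next()?;
    let columns: Vec<&str> = header
        .split('|')
        .map(|col| col.split('!').next().unwrap_or(col))
        .collect();
    let index_of = |wanted: &str| columns.iter().position(|col| *col == wanted);
    let (name_i, path_i, hosts_i) = (index_of("Name")?, index_of("Path")?, index_of("Hosts")?);

    let mut entries = Vec::new();
    for line in lines {
        let fields: Vec<&str> = line.split('|').collect();
        entries.push(CdnEntry {
            name: fields.get(name_i)?.to_string(),
            path: fields.get(path_i)?.to_string(),
            // Hosts are space-separated within the field.
            hosts: fields
                .get(hosts_i)?
                .split_whitespace()
                .map(str::to_string)
                .collect(),
        });
    }
    Some(entries)
}
```

Looking columns up by name rather than by position keeps the parser working if the server adds or reorders columns.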
Path Types
Content Types
- config: Configuration files (BuildConfig, CDNConfig, etc.)
- data: Game content files and archives
- patch: Differential patch data
Usage Patterns
# Configuration files
/{cdn_path}/config/{hash_dirs}/{hash}
# Game data
/{cdn_path}/data/{hash_dirs}/{hash}
# Patch data
/{cdn_path}/patch/{hash_dirs}/{hash}
Implementation Requirements
Mandatory Components
Both BuildConfig AND CDNConfig are required for proper NGDP operation:
- BuildConfig provides system file references
- CDNConfig specifies content storage locations
- Missing either file prevents content access
CDN Path Resolution
Extract the Path field from CDN responses as described in the
Configuration Retrieval Process section.
Cache the path per product for the session duration.
Fallback Logic
Implement fallback mechanisms:
- CDN Rotation: Cycle through available CDN servers
- Region Fallback: Fall back to alternate regions if available
- Protocol Fallback: HTTPS preferred, HTTP as fallback
- Retry Logic: Exponential backoff for failed requests
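One way to combine CDN rotation, protocol fallback, and backoff is to generate an ordered candidate list up front and compute the delay per attempt. The sketch below assumes that every host is tried over HTTPS before any host is tried over HTTP, and that the delay doubles per attempt; `candidate_urls` and `backoff_delay` are hypothetical helpers, not part of cascette-protocol.

```rust
use std::time::Duration;

/// Order candidate URLs: every host over HTTPS first, then every host over HTTP.
pub fn candidate_urls(hosts: &[&str], resource_path: &str) -> Vec<String> {
    let mut urls = Vec::with_capacity(hosts.len() * 2);
    for scheme in ["https", "http"] {
        for host in hosts {
            urls.push(format!("{scheme}://{host}/{resource_path}"));
        }
    }
    urls
}

/// Exponential backoff: `base * 2^attempt`, with the exponent capped at 16
/// and the multiplication saturating instead of overflowing.
pub fn backoff_delay(base: Duration, attempt: u32) -> Duration {
    base.saturating_mul(1u32 << attempt.min(16))
}
```

Region fallback fits the same shape: append the hosts of an alternate region to the slice before generating candidates.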
Rate Limiting
Implement client-side rate limiting:
- Respect CDN server limitations
- Implement connection pooling
- Use appropriate request timeouts
- Avoid overwhelming CDN infrastructure
Regional Restrictions
China (cn) region has special considerations:
- Limited CDN access
- Different server infrastructure
- Potential connectivity restrictions
- Require region-specific handling
Backup Servers
Community Mirrors
Several community-maintained mirrors provide NGDP content:
cdn.arctium.tools
- Protocol: HTTP only
- Status: Active
- Coverage: Full NGDP content mirror
casc.wago.tools
- Protocol: HTTP with HTTPS redirects
- Status: Active
- Coverage: Full NGDP mirror
archive.wow.tools
- Protocol: HTTPS
- Status: Active
- Coverage: Historical NGDP content archive
Mirror Usage
# Primary CDN (preferred)
curl http://level3.blizzard.com/tpr/wow/data/12/34/1234567890abcdef
# Backup mirror
curl http://cdn.arctium.tools/tpr/wow/data/12/34/1234567890abcdef
File Types
Core Manifests
System files that define content structure:
- root: Maps file paths to content keys
- encoding: Maps content keys to encoded storage keys
- install: Defines installation requirements and file tags
- download: Specifies download priorities for streaming
- size: Contains file size information
Storage Files
Content storage and indexing:
- archives: Bulk content storage containers
- indexes: Index files for locating content within archives
Encryption Files
Content protection and key management:
- KeyRing: Encryption key storage format for protected content
File Type Usage
graph TD
A[BuildConfig] --> B[Root File]
A --> C[Encoding File]
A --> D[Install Manifest]
A --> E[Download Manifest]
B --> F[Game Files]
C --> G[Archive Content]
D --> H[Installation Tags]
E --> I[Download Priorities]
J[CDNConfig] --> K[Archive Files]
K --> L[Archive Indices]
M[KeyRing] --> N[Encryption Keys]
N --> O[Protected Content]
style A stroke-width:4px
style J stroke-width:4px
style M stroke-width:3px,stroke-dasharray:5 5
style B stroke-width:2px
style C stroke-width:2px
style D stroke-width:2px
style E stroke-width:2px
Error Handling
HTTP Status Codes
- 200: Successful content retrieval
- 404: Content not found (may require fallback)
- 416: Range not satisfiable (check request headers)
- 503: Service unavailable (implement retry with backoff)
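The status codes above map naturally onto a small decision enum that the download loop can act on. This is a sketch with hypothetical names (`StatusAction`, `classify_status`); treating 206 as success is an addition not listed above, included because HTTP range requests normally return it, and every other code is treated as a hard failure here.

```rust
/// What the client should do after receiving a given HTTP status.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StatusAction {
    Accept,           // use the response body
    TryNextSource,    // fall back to another CDN host or mirror
    FixRequest,       // the request itself is wrong (e.g. bad Range header)
    RetryWithBackoff, // same host, later
    Fail,             // anything not covered above
}

/// Classify an HTTP status code according to the list above.
pub fn classify_status(status: u16) -> StatusAction {
    match status {
        // 206 (Partial Content) is assumed here for range requests.
        200 | 206 => StatusAction::Accept,
        404 => StatusAction::TryNextSource,
        416 => StatusAction::FixRequest,
        503 => StatusAction::RetryWithBackoff,
        _ => StatusAction::Fail,
    }
}
```

Separating classification from the retry loop makes the policy easy to unit-test without a network.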
Retry Strategies
#![allow(unused)]
fn main() {
// Example retry logic
async fn download_with_retry(url: &str, max_retries: u32) -> Result<Vec<u8>> {
let mut attempts = 0;
loop {
match download(url).await {
Ok(data) => return Ok(data),
Err(e) if attempts < max_retries => {
attempts += 1;
let delay = Duration::from_secs(2_u64.pow(attempts));
tokio::time::sleep(delay).await;
}
Err(e) => return Err(e),
}
}
}
}
Content Verification
Always verify downloaded content:
- Check HTTP response status
- Verify content length if provided
- Validate content hash against expected value
- Retry from alternate CDN on mismatch
Streaming Architecture Implementation
Connection Pooling Architecture
#![allow(unused)]
fn main() {
/// Connection-pooled CDN client with retry logic
pub struct PooledCdnClient {
/// Inner CDN client
inner: CdnClient,
/// Maximum concurrent connections
max_connections: usize,
/// Maximum retry attempts
max_retries: usize,
/// Initial retry delay
retry_delay: Duration,
}
impl PooledCdnClient {
/// Fetch range with exponential backoff retry logic
pub async fn fetch_range_with_retry(
&self,
archive_hash: &str,
offset: u64,
size: u64,
) -> ArchiveResult<Vec<u8>> {
let mut last_error = None;
for attempt in 0..=self.max_retries {
match self.inner.fetch_range(archive_hash, offset, size).await {
Ok(data) => return Ok(data),
Err(e) if attempt < self.max_retries && e.is_retryable() => {
// Exponential backoff: 100ms, 200ms, 400ms, 800ms...
let delay = self.retry_delay * (1u32 << attempt);
tokio::time::sleep(delay).await;
last_error = Some(e);
}
Err(e) => return Err(e),
}
}
Err(last_error.unwrap_or_else(||
ArchiveError::NetworkError("All retries exhausted".to_string())))
}
}
}
CDN Failover Mechanisms
#![allow(unused)]
fn main() {
/// Resilient archive resolver with fallback support
pub struct ResilientArchiveResolver {
/// Primary resolver
primary: CdnArchiveResolver,
/// Fallback resolvers
fallbacks: Vec<CdnArchiveResolver>,
/// Error threshold before switching to fallback
error_threshold: usize,
/// Current error count (atomic for thread safety)
error_count: AtomicUsize,
}
impl ResilientArchiveResolver {
/// Fetch content with automatic fallback
pub async fn fetch_content_resilient(&self, encoding_key: &[u8; 16]) -> ArchiveResult<Vec<u8>> {
// Try primary resolver first
match self.primary.fetch_content(encoding_key).await {
Ok(content) => {
// Reset error count on success
self.error_count.store(0, Ordering::Relaxed);
return Ok(content);
}
Err(e) if e.is_permanent() => return Err(e),
Err(e) => {
self.error_count.fetch_add(1, Ordering::Relaxed);
// Try fallback resolvers if error threshold exceeded
if self.error_count.load(Ordering::Relaxed) >= self.error_threshold {
for fallback in &self.fallbacks {
if let Ok(content) = fallback.fetch_content(encoding_key).await {
return Ok(content);
}
}
}
Err(e)
}
}
}
}
}
Range Request Coalescing
#![allow(unused)]
fn main() {
/// Streaming archive reader for network content
pub struct StreamingArchiveReader {
/// CDN client for network operations
client: Arc<PooledCdnClient>,
/// Current archive being read
archive_hash: String,
/// Current offset in archive
current_offset: u64,
/// Remaining size to read
remaining_size: u64,
/// Chunk size for streaming reads (default 64KB)
chunk_size: u64,
}
impl StreamingArchiveReader {
/// Read next chunk with automatic coalescing
pub async fn read_chunk(&mut self) -> ArchiveResult<Option<Vec<u8>>> {
if self.remaining_size == 0 {
return Ok(None);
}
let chunk_size = self.chunk_size.min(self.remaining_size);
let data = self
.client
.fetch_range_with_retry(&self.archive_hash, self.current_offset, chunk_size)
.await?;
// Verify response size matches request
if data.len() as u64 != chunk_size {
return Err(ArchiveError::IncompleteRangeResponse {
requested: chunk_size,
received: data.len() as u64,
});
}
self.current_offset += chunk_size;
self.remaining_size -= chunk_size;
Ok(Some(data))
}
/// Read all remaining data in one request (coalescing)
pub async fn read_all(&mut self) -> ArchiveResult<Vec<u8>> {
if self.remaining_size == 0 {
return Ok(Vec::new());
}
let data = self
.client
.fetch_range_with_retry(&self.archive_hash, self.current_offset, self.remaining_size)
.await?;
self.current_offset += self.remaining_size;
self.remaining_size = 0;
Ok(data)
}
}
}
Circuit Breaker Pattern
use std::future::Future;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Circuit breaker states for CDN resilience
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CircuitState {
    Closed,   // Normal operation
    Open,     // Failing fast, not attempting requests
    HalfOpen, // Testing if service recovered
}

/// Error returned by the breaker: the call was rejected, or the request failed.
/// (Illustrative wrapper for this example; the crate's own error type may differ.)
#[derive(Debug)]
pub enum CircuitError<E> {
    Open,
    Request(E),
}

/// Circuit breaker for CDN endpoints
pub struct CdnCircuitBreaker {
    state: Arc<Mutex<CircuitState>>,
    failure_count: Arc<AtomicUsize>,
    failure_threshold: usize,
    timeout: Duration,
    last_failure: Arc<Mutex<Option<Instant>>>,
}

impl CdnCircuitBreaker {
    /// Execute request with circuit breaker protection
    pub async fn execute<F, T, E>(&self, request: F) -> Result<T, CircuitError<E>>
    where
        F: Future<Output = Result<T, E>>,
        E: std::fmt::Debug,
    {
        // Check circuit state. The guard is scoped to this block so the state
        // mutex is never re-locked while held and never held across `.await`.
        {
            let mut state = self.state.lock().unwrap();
            if *state == CircuitState::Open {
                // Check if timeout period has passed
                if let Some(last_failure) = *self.last_failure.lock().unwrap() {
                    if last_failure.elapsed() > self.timeout {
                        // Transition to half-open and allow one test request
                        *state = CircuitState::HalfOpen;
                    } else {
                        return Err(CircuitError::Open);
                    }
                }
            }
            // HalfOpen and Closed fall through to the request below
        }
        // Execute request
        match request.await {
            Ok(result) => {
                // Success: reset failure count and close circuit
                self.failure_count.store(0, Ordering::Relaxed);
                *self.state.lock().unwrap() = CircuitState::Closed;
                Ok(result)
            }
            Err(error) => {
                // Failure: increment count and possibly open circuit
                let failures = self.failure_count.fetch_add(1, Ordering::Relaxed) + 1;
                if failures >= self.failure_threshold {
                    *self.state.lock().unwrap() = CircuitState::Open;
                    *self.last_failure.lock().unwrap() = Some(Instant::now());
                }
                Err(CircuitError::Request(error))
            }
        }
    }
}
Caching Strategy
Implement efficient caching:
- Cache configuration files with appropriate TTL
- Use content-addressed storage for game files
- Implement cache invalidation for updated content
- Support offline operation with cached content
Security Considerations
- Transport: Use HTTPS with certificate validation for all CDN requests
- Content Integrity: Verify MD5 content hashes after download; reject mismatches and retry from an alternate CDN
- Encryption Keys: CASC uses static community-maintained keys; see Salsa20 Encryption for key management details
Ribbit Protocol
Ribbit is a TCP-based protocol operating on port 1119 that serves as the discovery mechanism for NGDP. It provides version information, CDN endpoints, and configuration data for Blizzard products.
Protocol Variants
Ribbit has three access methods:
TCP Ribbit
Direct TCP connection to tcp://{region}.version.battle.net:1119
- V1 Protocol: MIME-formatted responses with ASN.1 signatures and SHA-256 checksums
- V2 Protocol: Raw BPSV responses without metadata
- Endpoints: summary, products, certificates, and OCSP
HTTP TACT v1
HTTP wrapper at http://{region}.patch.battle.net:1119
- Endpoints: /{product}/versions, /{product}/cdns, /{product}/bgdl
- Response format: BPSV directly without MIME wrapping
- No authentication: Public access
- Connection pooling: Reusable HTTP connections
HTTPS TACT v2
Secure wrapper at https://{region}.version.battle.net (standard HTTPS port 443)
- Same endpoints as HTTP TACT v1
- TLS encryption: Standard HTTPS security
- HTTP/2 support: Multiplexing for concurrent requests
- Response format: BPSV directly
Protocol Flow
sequenceDiagram
participant Client
participant Ribbit as Ribbit Server
participant Cache as Local Cache
Client->>Cache: Check cached sequence
Cache-->>Client: Return cached seqn
Client->>Ribbit: TCP Connect (port 1119)
Client->>Ribbit: Send command + \n
Ribbit->>Ribbit: Process request
Ribbit-->>Client: Send response
Ribbit->>Client: Close connection
Client->>Client: Parse response
Client->>Client: Extract sequence number
alt Sequence changed
Client->>Cache: Update cache
Client->>Client: Process new data
else Sequence unchanged
Client->>Client: Use cached data
end
Endpoints
Endpoint Comparison
| Endpoint | TCP Ribbit | HTTP TACT v1 | HTTPS TACT v2 |
|---|---|---|---|
| Summary | v1/summary | ✗ | ✗ |
| Product versions | v1/products/{product}/versions | /{product}/versions | /{product}/versions |
| CDN config | v1/products/{product}/cdns | /{product}/cdns | /{product}/cdns |
| Background download | v1/products/{product}/bgdl | /{product}/bgdl | /{product}/bgdl |
| Certificates | v1/certs/{id} | ✗ | ✗ |
| OCSP | v1/ocsp/{id} | ✗ | ✗ |
Response Format Comparison
| Protocol | Response Format | Signature | Checksum |
|---|---|---|---|
| TCP Ribbit V1 | MIME multipart with BPSV | PKCS#7/CMS | SHA-256 |
| TCP Ribbit V2 | Raw BPSV | None | None |
| HTTP TACT v1 | Raw BPSV | None | None |
| HTTPS TACT v2 | Raw BPSV | None | None |
Note: The certificate and OCSP endpoints were part of Blizzard’s custom PKI infrastructure, now replaced by system trust stores.
Certificate and Signature Verification
V1 Signature Structure
V1 responses include PKCS#7/CMS signatures for authenticity:
SignedData Structure
- Content Type: PKCS#7 SignedData (OID: 1.2.840.113549.1.7.2)
- Signer Identification: IssuerAndSerialNumber or SubjectKeyIdentifier
- Certificates: Embedded in CertificateSet or fetched via SKI
- Signed Attributes: Optional, DER-encoded as SET for verification
Supported Algorithms
Digest Algorithms:
- SHA-256 (OID: 2.16.840.1.101.3.4.2.1)
- SHA-384 (OID: 2.16.840.1.101.3.4.2.2)
- SHA-512 (OID: 2.16.840.1.101.3.4.2.3)
Signature Algorithms:
- RSA with SHA-256 (OID: 1.2.840.113549.1.1.11)
- RSA with SHA-384 (OID: 1.2.840.113549.1.1.12)
- RSA with SHA-512 (OID: 1.2.840.113549.1.1.13)
Verification Process
Basic Flow
- Extract Signature: From MIME part with Content-Disposition: signature
- Parse PKCS#7 Structure: Extract SignedData from ContentInfo
- Identify Signer: Match via IssuerAndSerialNumber or SubjectKeyIdentifier
- Extract Public Key: From embedded certificate or fetch via endpoint
- Verify Signature: Process depends on signed attributes presence
- Validate Checksum: SHA-256 of content matches epilogue
Signed Attributes Processing
When signed attributes are present (typical case):
- Re-encode as DER SET:
  - Convert from implicit [0] to SET OF (tag 0x31)
  - Sort attributes in DER canonical order
  - Apply proper DER length encoding
- Verify Against SET:
  - Signature verifies the DER-encoded SET
  - Message digest attribute must match content hash
- Without Signed Attributes:
  - Signature directly verifies message content
  - Direct RSA verification of content hash
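The tag conversion in the first step can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name is hypothetical, and it assumes the signer already emitted the attributes in DER order, so only the leading tag byte changes while the length and contents stay as received.

```rust
/// Returns the bytes a signature over signed attributes is verified against:
/// the attribute block with its implicit [0] tag (0xA0) replaced by the
/// universal SET OF tag (0x31). Hypothetical helper for illustration.
pub fn signed_attrs_for_verification(raw: &[u8]) -> Option<Vec<u8>> {
    match raw.first().copied() {
        Some(0xA0) => {
            let mut der = raw.to_vec();
            der[0] = 0x31; // length and contents are left unchanged
            Some(der)
        }
        // Already carries the SET OF tag: use as-is.
        Some(0x31) => Some(raw.to_vec()),
        // Anything else is not a signed-attributes block.
        _ => None,
    }
}
```

The digest named by the signature algorithm is then computed over the returned bytes rather than over the message content.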
RSA Verification Details
- Padding Scheme: PKCS#1 v1.5
- Key Format: Parse SubjectPublicKeyInfo to extract RSA public key
- Signature Format: Raw signature bytes converted to RSA signature object
- Hash Algorithms: SHA-256, SHA-384, or SHA-512 based on OID
Certificate Fetching
When certificates are not embedded:
- Extract Subject Key Identifier from signer info
- Request certificate via /v1/certs/{ski} endpoint
- Validate SKI matches between signature and certificate
- Extract public key for verification
Implementation Strategies
Parsing Approaches:
- Primary: Use ASN.1/CMS parsing libraries
- Fallback: Pattern-based manual parsing for compatibility
- Handle both embedded and detached signatures
Key Extraction:
- Parse SubjectPublicKeyInfo structure
- Extract RSA public key in PKCS#1 format
- Determine key size from modulus length
Critical Implementation Details:
- SET Encoding: Signed attributes MUST be re-encoded as DER SET for verification
- Canonical Ordering: Attributes sorted for DER canonical form
- Dual Verification Paths: Different handling for signed vs unsigned attributes
- Base64 Detection: Signatures may be binary or base64-encoded in MIME
Error Handling:
- Invalid ASN.1 structures
- Missing or mismatched certificates
- Unsupported algorithms
- Signature verification failures
- DER encoding errors
Regional Servers
Available regions for {region}.version.battle.net:
- us - United States
- eu - Europe
- kr - Korea
- tw - Taiwan
- sg - Singapore
- cn - China (restricted to China-only access)
BPSV Format
Blizzard Pipe-Separated Values (BPSV) is the data format for responses:
Structure
- Header line: Column names with type annotations
- Data lines: Pipe-separated values
- Sequence line: ## seqn = {number} (exact format with spaces required)
Data Types
- STRING:0 - Variable-length string
- HEX:16 - 16-byte hexadecimal value (MD5 hash)
- DEC:4 - 4-byte decimal integer
Example
Region!STRING:0|BuildConfig!HEX:16|CDNConfig!HEX:16|BuildId!DEC:4|VersionsName!String:0
us|be2bb98dc28aee05bbee519393696cdb|fac77b9ca52c84ac28ad83a7dbe1c829|61491|11.1.7.61491
eu|be2bb98dc28aee05bbee519393696cdb|fac77b9ca52c84ac28ad83a7dbe1c829|61491|11.1.7.61491
## seqn = 2241282
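A response like the one above could be parsed along these lines. This is a minimal sketch using only the standard library; the `Column` and `BpsvDocument` types and the `parse_bpsv` function are hypothetical names for illustration and are not the cascette-formats API. Type names are upper-cased so that `String:0` and `STRING:0` are treated alike.

```rust
/// One typed column from the header line, e.g. `BuildId!DEC:4`.
#[derive(Debug, PartialEq)]
pub struct Column {
    pub name: String,
    pub kind: String, // upper-cased type name: STRING, HEX, or DEC
    pub size: usize,
}

/// Parsed BPSV document (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct BpsvDocument {
    pub columns: Vec<Column>,
    pub rows: Vec<Vec<String>>,
    pub seqn: Option<u64>,
}

pub fn parse_bpsv(input: &str) -> Result<BpsvDocument, String> {
    let mut lines = input.lines().filter(|l| !l.trim().is_empty());
    let header = lines.next().ok_or("missing header line")?;

    // Header fields look like `Name!TYPE:size`.
    let mut columns = Vec::new();
    for field in header.split('|') {
        let (name, ty) = field
            .split_once('!')
            .ok_or_else(|| format!("column without type: {field}"))?;
        let (kind, size) = ty
            .split_once(':')
            .ok_or_else(|| format!("type without size: {ty}"))?;
        let size = size
            .parse::<usize>()
            .map_err(|e| format!("bad size in {ty}: {e}"))?;
        columns.push(Column {
            name: name.to_string(),
            kind: kind.to_ascii_uppercase(),
            size,
        });
    }

    let mut rows = Vec::new();
    let mut seqn = None;
    for line in lines {
        if let Some(value) = line.strip_prefix("## seqn = ") {
            let parsed = value.trim().parse::<u64>();
            seqn = Some(parsed.map_err(|e| format!("bad seqn: {e}"))?);
        } else if line.starts_with('#') {
            continue; // other comment lines are ignored in this sketch
        } else {
            // Pipes are not escaped, so a plain split is sufficient.
            let row: Vec<String> = line.split('|').map(str::to_string).collect();
            if row.len() != columns.len() {
                return Err(format!(
                    "expected {} fields, got {}",
                    columns.len(),
                    row.len()
                ));
            }
            rows.push(row);
        }
    }
    Ok(BpsvDocument { columns, rows, seqn })
}
```

Rejecting rows whose field count differs from the header is a choice made for this sketch; a tolerant client might instead skip or pad such rows.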
V1 MIME Response Structure
TCP Ribbit V1 responses use MIME multipart format:
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="{boundary}"
--{boundary}
Content-Type: text/plain
Content-Disposition: data
[BPSV data here]
--{boundary}
Content-Type: application/octet-stream
Content-Disposition: signature
[ASN.1 signature data]
--{boundary}--
Checksum: {64-character SHA-256 hash}
The checksum validation process:
- Search for the "Checksum: " pattern at end of response
- Extract 64-character hexadecimal checksum
- Compute SHA-256 hash of content before checksum line
- Compare with provided checksum (case-insensitive)
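The extraction and comparison steps can be sketched with the standard library alone. The function names here are hypothetical, and the SHA-256 computation itself is left to whichever hashing crate the caller uses, since the standard library does not provide one.

```rust
/// Splits a V1 response into (bytes covered by the checksum, lower-cased
/// hex checksum). Returns None when no well-formed trailing checksum line
/// is present. Hypothetical helper for illustration.
pub fn split_checksum(response: &[u8]) -> Option<(&[u8], String)> {
    const MARKER: &[u8] = b"Checksum: ";
    // Search from the end so the same text inside the body is not matched.
    let start = response
        .windows(MARKER.len())
        .rposition(|w| w == MARKER)?;
    let hex_start = start + MARKER.len();
    let hex_bytes = response.get(hex_start..hex_start + 64)?;
    if !hex_bytes.iter().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Only line-ending bytes may follow the checksum.
    let tail = &response[hex_start + 64..];
    if !tail.iter().all(|b| *b == b'\r' || *b == b'\n') {
        return None;
    }
    let hex = String::from_utf8(hex_bytes.to_ascii_lowercase()).ok()?;
    Some((&response[..start], hex))
}

/// Case-insensitive comparison of the transmitted and computed digests.
pub fn checksum_matches(expected_hex: &str, computed_hex: &str) -> bool {
    expected_hex.eq_ignore_ascii_case(computed_hex)
}
```

A caller would hash the first element of the returned tuple with SHA-256, hex-encode the digest, and pass both strings to `checksum_matches`.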
Connection Handling
TCP Ribbit Connection Flow
graph TD
A[Create TCP Socket] --> B[Connect to server:1119]
B --> C[Send command + \n]
C --> D[Read response until EOF]
D --> E[Server closes connection]
E --> F{Response type?}
F -->|V1| G[Parse MIME]
F -->|V2| H[Parse BPSV]
G --> I[Validate checksum]
I --> J[Extract BPSV data]
H --> K[Process data]
J --> K
style E stroke-width:4px
style A stroke-width:3px
style F stroke-width:3px,stroke-dasharray:5 5
style I stroke-width:2px
style K stroke-width:2px
HTTP/HTTPS TACT Connection Flow
graph TD
A[Connection Pool] --> B{Existing connection?}
B -->|Yes| C[Reuse connection]
B -->|No| D[Create new HTTP connection]
D --> C
C --> E[Send HTTP request]
E --> F[Receive response]
F --> G[Parse BPSV directly]
G --> H[Return connection to pool]
H --> I[Process data]
style H stroke-width:4px,stroke-dasharray:5 5
style A stroke-width:3px
style B stroke-width:3px,stroke-dasharray:5 5
style G stroke-width:2px
style I stroke-width:2px
Key differences:
- TCP: New connection per request, server closes after response
- HTTP/HTTPS: Connection pooling, keep-alive, multiple requests per connection
Unified Client Architecture
Protocol Abstraction
A unified client should abstract protocol differences:
graph TD
A[Unified NGDP Client] --> B{Protocol}
B -->|TCP| C[Ribbit TCP Client]
B -->|HTTP| D[TACT HTTP Client]
B -->|HTTPS| E[TACT HTTPS Client]
C --> F[BPSV Parser]
D --> F
E --> F
C --> G[Response Types]
D --> G
E --> G
style A stroke-width:4px
style B stroke-width:3px,stroke-dasharray:5 5
style F stroke-width:3px
style G stroke-width:2px
style C stroke-width:2px
style D stroke-width:2px
style E stroke-width:2px
Common Interface
All protocol variants share common operations:
- Get product versions
- Get CDN configurations
- Get background download info
- Parse BPSV responses
Protocol-Specific Features
TCP Ribbit Only:
- Summary endpoint
- Certificate/OCSP endpoints
- MIME response parsing
- Signature verification
HTTP/HTTPS TACT Only:
- Connection pooling
- HTTP/2 multiplexing
- Standard HTTP features
Configuration Requirements
Host Configuration:
- Default hosts: {region}.version.battle.net or {region}.patch.battle.net
- Custom hosts: Support for private servers or testing
- Port configuration: 1119 for TCP/HTTP, 443 for HTTPS
Connection Settings:
- Timeout configuration (connect, read, total)
- Retry logic (count, backoff, jitter)
- Pool settings (max connections, idle timeout)
- HTTP/2 settings (multiplexing, window size)
Implementation Requirements
TCP Client
- Create new connection per request (no pooling)
- Send ASCII command terminated with \n
- Read response until server closes connection
- Default connection timeout: 10 seconds
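These requirements map onto a short blocking client built on `std::net`. The sketch below is illustrative only: `v1_command` and `ribbit_request` are hypothetical names, the port is a parameter so the function can be pointed at a test server, and applying the same 10 seconds as a read timeout is an assumption made here.

```rust
use std::io::{Read, Write};
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Builds a V1 product command such as `v1/products/wow/versions`,
/// terminated with the newline the server expects.
pub fn v1_command(product: &str, endpoint: &str) -> String {
    format!("v1/products/{product}/{endpoint}\n")
}

/// Opens a fresh connection, sends one command, and reads the reply until
/// the server closes the connection.
pub fn ribbit_request(host: &str, port: u16, command: &str) -> std::io::Result<Vec<u8>> {
    let timeout = Duration::from_secs(10);
    let mut last_err =
        std::io::Error::new(std::io::ErrorKind::NotFound, "no address resolved");
    // Try every resolved address in turn.
    for addr in (host, port).to_socket_addrs()? {
        match TcpStream::connect_timeout(&addr, timeout) {
            Ok(mut stream) => {
                stream.set_read_timeout(Some(timeout))?;
                stream.write_all(command.as_bytes())?;
                let mut response = Vec::new();
                // EOF marks the end of the response.
                stream.read_to_end(&mut response)?;
                return Ok(response);
            }
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}
```

For example, `ribbit_request("us.version.battle.net", 1119, &v1_command("wow", "versions"))` would return the raw MIME response for further parsing.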
Retry Logic
Production implementations should include retry logic:
- Default: 0 retries for backward compatibility
- Exponential backoff: 100ms initial, 10s maximum, 2x multiplier
- Jitter: 10% randomization to prevent thundering herd
- Retryable: Connection, timeout, network failures
- Non-retryable: Parse errors, validation failures
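The backoff parameters above translate into a small delay calculation. In this sketch the function name is hypothetical, the random jitter value is supplied by the caller so the result is deterministic for a given input, and clamping the jittered delay to the 10 s maximum is an assumption about how the cap and the jitter interact.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based).
/// `jitter` is a caller-supplied random value in [-1.0, 1.0].
pub fn backoff_delay(attempt: u32, jitter: f64) -> Duration {
    const INITIAL_MS: f64 = 100.0;
    const MAX_MS: f64 = 10_000.0;
    const MULTIPLIER: f64 = 2.0;
    const JITTER_FRACTION: f64 = 0.10;

    // Exponential growth, capped at the maximum delay.
    let base = (INITIAL_MS * MULTIPLIER.powi(attempt.min(30) as i32)).min(MAX_MS);
    // Shift by up to 10% in either direction, never exceeding the cap.
    let factor = 1.0 + JITTER_FRACTION * jitter.clamp(-1.0, 1.0);
    let jittered = (base * factor).min(MAX_MS);
    Duration::from_millis(jittered.round() as u64)
}
```

Without jitter the delays run 100 ms, 200 ms, 400 ms and so on, reaching the 10 s cap at the eighth attempt.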
DNS Caching
Implementations may cache DNS lookups:
- TTL: 300 seconds (5 minutes) typical
- Multiple IPs: Try all resolved addresses sequentially
- Thread-safe: Concurrent access protection required
Response Parsing
- V1: Parse MIME structure, validate SHA-256 checksum
- V2/HTTP/HTTPS: Parse BPSV directly
- Handle empty responses (headers without data rows)
- Parse typed column headers correctly
Caching
- Cache responses with key: {endpoint}-{arguments}-{sequence_number}
- Check sequence numbers to detect updates
- Sequence numbers only increase (never decrease)
- Skip re-downloading if sequence unchanged
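One way to apply these rules is an in-memory map keyed as described, with a second map tracking the newest sequence number per endpoint. The `SequenceCache` type below is a hypothetical sketch, not part of cascette-rs; it stores bodies as strings for brevity.

```rust
use std::collections::HashMap;

/// In-memory response cache keyed by `{endpoint}-{arguments}-{seqn}`.
pub struct SequenceCache {
    latest: HashMap<String, u64>,    // "{endpoint}-{arguments}" -> newest seqn seen
    bodies: HashMap<String, String>, // full cache key -> response body
}

impl SequenceCache {
    pub fn new() -> Self {
        Self { latest: HashMap::new(), bodies: HashMap::new() }
    }

    pub fn cache_key(endpoint: &str, arguments: &str, seqn: u64) -> String {
        format!("{endpoint}-{arguments}-{seqn}")
    }

    /// Stores `body` if `seqn` is newer than the cached one.
    /// Returns true when the cache was updated.
    pub fn update(&mut self, endpoint: &str, arguments: &str, seqn: u64, body: &str) -> bool {
        let prefix = format!("{endpoint}-{arguments}");
        if let Some(known) = self.latest.get(&prefix).copied() {
            if seqn <= known {
                return false; // unchanged (or stale): keep cached data
            }
        }
        self.latest.insert(prefix, seqn);
        self.bodies
            .insert(Self::cache_key(endpoint, arguments, seqn), body.to_string());
        true
    }

    /// Returns the body stored for the newest known sequence number.
    pub fn get(&self, endpoint: &str, arguments: &str) -> Option<&str> {
        let seqn = *self.latest.get(&format!("{endpoint}-{arguments}"))?;
        self.bodies
            .get(&Self::cache_key(endpoint, arguments, seqn))
            .map(String::as_str)
    }
}
```

Older bodies are kept under their own keys in this sketch; a real cache would likely evict them once a newer sequence number arrives.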
Product Identifiers
Common product identifiers used with Ribbit:
World of Warcraft
- wow - Retail
- wow_beta - Beta
- wow_classic - Classic
- wow_classic_era - Classic Era
- wow_classic_ptr - Classic PTR
- wow_classic_titan - Classic Titan (CN region only, WotLK 3.80.x with upgraded Classic/TBC raids)
- wow_anniversary - Classic Anniversary (TBC 2.5.x, progression through Classic branches on a shortened timeline)
- wowt - Public Test Realm
- wowz - Internal/Development
Other Products
- agent - Battle.net Agent
- bna - Battle.net Application
Version Response Fields
| Field | Type | Description |
|---|---|---|
| Region | STRING:0 | Region identifier |
| BuildConfig | HEX:16 | Build configuration hash |
| CDNConfig | HEX:16 | CDN configuration hash |
| KeyRing | HEX:16 | Encryption keys hash |
| BuildId | DEC:4 | Build number |
| VersionsName | String:0 | Version string |
| ProductConfig | HEX:16 | Product configuration hash |
CDN Response Fields
| Field | Type | Description |
|---|---|---|
| Name | STRING:0 | CDN name |
| Path | STRING:0 | Base path for content |
| Hosts | STRING:0 | Space-separated host list |
| Servers | STRING:0 | Full URLs with parameters |
| ConfigPath | STRING:0 | Path to configuration files |
Error Handling
Connection Errors
- Connection timeout: Implement 10-30 second timeout (not automatic)
- CN region: Only accessible from within China (will time out from elsewhere)
- Network failures: TCP connection may fail or drop
Response Errors
- Empty responses: Some endpoints return headers only (especially bgdl)
- 404 errors: Not all products have all endpoints
- Malformed MIME: V1 responses may have invalid structure
- Invalid checksum: V1 checksum validation may fail
- Buffer overflow: No standard response size limit
Parsing Errors
- Type inconsistency: Handle String:0 vs STRING:0 in BPSV
- Column mismatch: Data rows may not match header count
- Invalid sequence format: Must match ## seqn = exactly (with space after equals)
- Escaped characters: Pipe characters in values are not escaped
Implementation Notes
Buffer Management
- Use appropriate buffer sizes for TCP reads (typically 4KB-8KB)
- Stream responses to avoid loading entire response in memory
- No standard maximum response size - implement limits as needed
MIME Parsing Complexity
- V1 MIME parsing requires multipart message handling
- Consider using established MIME libraries
- First chunk typically contains BPSV data
- Signature chunk identified by Content-Disposition header
Ribbit Server
cascette-ribbit implements a Ribbit protocol server that serves BPSV-formatted game version and CDN configuration data over HTTP and TCP.
For protocol specification details, see Ribbit Protocol.
Architecture
graph TD
A[cascette-ribbit binary] --> B[Server]
B --> C[HTTP Server - axum]
B --> D[TCP Server - tokio]
C --> E[AppState]
D --> E
E --> F[BuildDatabase]
E --> G[CdnConfig]
C --> H[HTTP Handlers]
H --> I[BpsvResponse]
D --> J{Protocol Version}
J -->|v1| K[MIME Wrapper + SHA-256]
J -->|v2| L[Raw BPSV]
K --> I
L --> I
Components
| Component | File | Purpose |
|---|---|---|
| ServerConfig | config.rs | CLI arguments, env vars, TLS paths |
| CdnConfig | config.rs | CDN host/path resolution per region |
| BuildDatabase | database.rs | JSON build record storage with product indexing |
| BuildRecord | database.rs | Single build entry with MD5 hash validation |
| AppState | server.rs | Shared state (database, CDN config, timestamps) |
| Server | server.rs | Orchestrates HTTP + TCP listeners |
| BpsvResponse | responses/bpsv.rs | BPSV response builder (versions, cdns, summary) |
| HTTP handlers | http/handlers.rs | axum route handlers for /{product}/{endpoint} |
| TCP handlers | tcp/handlers.rs | Command routing for v1/ and v2/ prefixes |
| V1 wrapper | tcp/v1.rs | RFC 2046 MIME wrapping with SHA-256 checksums |
| V2 handler | tcp/v2.rs | Raw BPSV TCP responses |
Configuration
CLI Arguments and Environment Variables
| Flag | Env Var | Default | Description |
|---|---|---|---|
| --http-bind | CASCETTE_RIBBIT_HTTP_BIND | 0.0.0.0:8080 | HTTP listen address |
| --tcp-bind | CASCETTE_RIBBIT_TCP_BIND | 0.0.0.0:1119 | TCP listen address |
| --builds | CASCETTE_RIBBIT_BUILDS | ./builds.json | Path to build database JSON |
| --cdn-hosts | CASCETTE_RIBBIT_CDN_HOSTS | cdn.arctium.tools | CDN host(s) |
| --cdn-path | CASCETTE_RIBBIT_CDN_PATH | tpr/wow | CDN base path |
| --tls-cert | CASCETTE_RIBBIT_TLS_CERT | none | TLS certificate path (enables HTTPS) |
| --tls-key | CASCETTE_RIBBIT_TLS_KEY | none | TLS private key path |
Build Database Format
The server reads build records from a JSON file. Each record represents a product build:
| Field | Type | Required | Description |
|---|---|---|---|
| id | u64 | yes | Unique build identifier |
| product | string | yes | Product code (e.g., wow, wowt) |
| version | string | yes | Version string (e.g., 1.14.2.42597) |
| build | string | yes | Build number |
| build_config | string | yes | 32-char hex MD5 hash |
| cdn_config | string | yes | 32-char hex MD5 hash |
| keyring | string | no | 32-char hex MD5 hash |
| product_config | string | no | 32-char hex MD5 hash |
| build_time | string | yes | ISO 8601 timestamp |
| encoding_ekey | string | yes | 32-char hex encoding key |
| root_ekey | string | yes | 32-char hex root key |
| install_ekey | string | yes | 32-char hex install key |
| download_ekey | string | yes | 32-char hex download key |
MD5 hash fields are validated to be exactly 32 lowercase hexadecimal characters.
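The rule can be expressed as a short predicate. The function name is hypothetical; the actual check inside BuildRecord may be structured differently.

```rust
/// True when `value` is exactly 32 lowercase hexadecimal characters.
pub fn is_valid_md5_hex(value: &str) -> bool {
    value.len() == 32
        && value
            .bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}
```

Uppercase digits are rejected rather than normalized, matching the "lowercase" wording above.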
HTTP Endpoints
The HTTP server uses axum with gzip compression and CORS support.
Routes
| Route | Handler | Response |
|---|---|---|
| GET /{product}/versions | handle_versions | BPSV versions table |
| GET /{product}/cdns | handle_cdns | BPSV CDN configuration |
| GET /{product}/bgdl | handle_bgdl | BPSV background download (same as versions) |
All responses use Content-Type: text/plain; charset=utf-8.
Returns HTTP 404 if the product is not found in the database.
TCP Protocol
The TCP server accepts one command per connection. After sending the response, the server closes the connection. A 10-second read timeout applies.
V2 Commands (Raw BPSV)
- v2/products/{product}/versions
- v2/products/{product}/cdns
- v2/products/{product}/bgdl
V1 Commands (MIME-wrapped)
- v1/products/{product}/versions
- v1/products/{product}/cdns
- v1/products/{product}/bgdl
- v1/summary
V1 responses wrap BPSV data in RFC 2046 MIME multipart format with a SHA-256 checksum epilogue. The server does not include PKCS#7 signatures (unlike Blizzard’s production servers).
BPSV Response Format
Versions Response
7 rows, one per region (us, eu, cn, kr, tw, sg, xx):
Region!STRING:0|BuildConfig!HEX:16|CDNConfig!HEX:16|KeyRing!HEX:16|BuildId!DEC:4|VersionsName!STRING:0|ProductConfig!HEX:16
us|0123456789abcdef...|fedcba9876543210...|<keyring>|42597|1.14.2.42597|<product_config>
eu|...|...|...|...|...|...
...
## seqn = 1730534400
CDNs Response
5 rows, one per CDN region (us, eu, kr, tw, cn):
Name!STRING:0|Path!STRING:0|Hosts!STRING:0|Servers!STRING:0|ConfigPath!STRING:0
us|tpr/wow|cdn.arctium.tools|https://cdn.arctium.tools/?maxhosts=4|tpr/wow/config
...
## seqn = 1730534400
Summary Response (TCP v1 only)
One row per product:
Product!STRING:0|Seqn!DEC:4
wow|1730534400
wowt|1730534400
## seqn = 1730534400
Running
Binary
cargo run --bin cascette-ribbit -- --builds ./builds.json
Library
use cascette_ribbit::{Server, ServerConfig};

// Assumes a tokio runtime (the TCP server is built on tokio) and that the
// crate's error types convert into Box<dyn std::error::Error>.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = ServerConfig {
        http_bind: "127.0.0.1:8080".parse()?,
        tcp_bind: "127.0.0.1:1119".parse()?,
        builds: "./builds.json".into(),
        cdn_hosts: "cdn.arctium.tools".to_string(),
        cdn_path: "tpr/wow".to_string(),
        tls_cert: None,
        tls_key: None,
    };
    config.validate()?;
    let server = Server::new(config)?;
    server.run().await?;
    Ok(())
}
Example
cargo run --example simple_server
Then test with:
# HTTP
curl http://localhost:8080/wow/versions
curl http://localhost:8080/wow/cdns
# TCP v2
echo "v2/products/wow/versions" | nc localhost 1119
# TCP v1
echo "v1/products/wow/versions" | nc localhost 1119
Testing
The crate has four test suites:
| Suite | File | Coverage |
|---|---|---|
| HTTP integration | tests/http_test.rs | HTTP endpoints, status codes, BPSV format |
| TCP v1 integration | tests/tcp_v1_test.rs | MIME wrapping, checksums, summary |
| TCP v2 integration | tests/tcp_v2_test.rs | Raw BPSV over TCP, connection lifecycle |
| Contract tests | tests/contract_test.rs | cascette-protocol client against server |
Contract tests verify that cascette-protocol’s RibbitTactClient can query the
server and parse responses correctly. This ensures wire-level compatibility
between client and server implementations.
cargo test -p cascette-ribbit
cargo bench -p cascette-ribbit
TLS Support
Enable TLS with the tls feature flag:
cargo run --bin cascette-ribbit --features tls -- \
--tls-cert /path/to/cert.pem \
--tls-key /path/to/key.pem
When TLS is enabled, the HTTP server serves HTTPS. The TCP server is not affected (Ribbit TCP does not use TLS).
Battle.net Agent
The Battle.net Agent is a local HTTP service that manages game installations and updates. It runs on port 1120 and provides an API for downloading, installing, and managing Blizzard products.
Overview
The agent serves as the bridge between Blizzard’s CDN infrastructure and the local CASC storage. It handles:
- Product installation and updates
- Download management and prioritization
- Local CASC storage maintenance
- Installation verification and repair
HTTP API
The agent exposes a REST API on http://127.0.0.1:1120.
Endpoints
Documentation of the agent’s HTTP endpoints is pending.
Installation Flow
When installing a product, the agent:
- Queries Ribbit for product version information
- Downloads build and CDN configuration
- Fetches encoding and root manifests
- Downloads required archives from CDN
- Writes data to local CASC storage
- Updates local indices
cascette-agent
cascette-agent is a replacement implementation of the Battle.net Agent. It
provides the same HTTP API on port 1120 and can be used as a drop-in
replacement. It supports:
- Downloading products from official Blizzard CDNs
- Fallback to community archive mirrors (cdn.arctium.tools)
- Managing local CASC installations
Differences from Official Agent
- Open source implementation
- Supports community CDN mirrors
- Cross-platform (Linux, macOS, Windows)
- No Battle.net account required for public content
CASC Local Storage
Local CASC storage is the on-disk format used by the Battle.net client to store game data. Unlike CDN archives which are content-addressed, local storage uses optimized indices for fast file lookups.
Directory Structure
A typical CASC installation has the following structure:
<install-dir>/
├── .build.info # Build configuration (BPSV format)
├── Data/
│ ├── data/
│ │ ├── 0000000001.idx # Local index files (16 buckets)
│ │ ├── 0100000001.idx
│ │ ├── ...
│ │ ├── 0f00000001.idx
│ │ ├── data.000 # Combined archive data
│ │ ├── data.001
│ │ ├── ...
│ │ └── *.shmem # Shared memory control file (temp)
│ ├── indices/
│ │ └── ... # CDN index files (not local storage)
│ ├── residency/ # Download state tracking tokens
│ ├── ecache/ # Encoding cache
│ └── hardlink/ # Hard link trie directory
└── Cache/
└── ADB/ # Hotfix database cache
└── *.bin
Local .idx index files and data.NNN archive files both reside in Data/data/.
The Data/indices/ directory holds CDN index files, which are a separate
concern from local storage.
Container Types
CASC manages four container types for local storage:
| Type | Size | Purpose |
|---|---|---|
| Dynamic | 0x3c bytes | Read/write CASC archives (data.NNN files) |
| Static | – | Read-only archives (shared installations) |
| Residency | 0x30 bytes | File state tracking (.residency tokens) |
| Hard Link | 0x30 bytes | Filesystem hard links (trie directory) |
The Dynamic container is the primary read-write storage. It manages archive segments, key state tracking, and shared memory coordination. Access modes: 0=none, 1=read-only, 2=read-write, 3=exclusive.
Index Files (.idx)
Local indices use IDX Journal v7 format with little-endian headers (unlike most NGDP formats which use big-endian).
- Key size: 9 bytes (truncated encoding keys)
- Location size: 5 bytes (1 byte archive high + 4 bytes packed)
- Entry size: 18 bytes (9 key + 5 location + 4 size)
- Bucket distribution: 16 index buckets (0x00-0x0F)
The 9-byte key truncation saves space while maintaining sufficient uniqueness for local lookups. Keys are encoding keys, not content keys.
Index File Format
Each .idx file contains guarded blocks with Jenkins hash validation:
[GuardedBlockHeader] (8 bytes: size + Jenkins hash)
[IndexHeaderV2] (16 bytes: version, bucket, field sizes, segment_size)
[padding] (8 bytes: hash/alignment)
[GuardedBlockHeader] (8 bytes: entry block size + Jenkins hash)
[IndexEntry[]] (N * 18 bytes: sorted by key)
Index Filename Format
{bucket:02x}{version:08x}.idx
Example: 0a00000003.idx = bucket 0x0A, version 3. Total filename length is
14 characters (10 hex digits + .idx).
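The naming scheme can be produced and parsed with standard formatting. Both function names below are hypothetical and serve only to illustrate the layout described above.

```rust
/// Formats an index filename: two hex digits of bucket, eight of version.
pub fn index_filename(bucket: u8, version: u32) -> String {
    format!("{bucket:02x}{version:08x}.idx")
}

/// Parses a filename back into (bucket, version); None if it does not
/// match the 10-hex-digit + ".idx" layout.
pub fn parse_index_filename(name: &str) -> Option<(u8, u32)> {
    let hex = name.strip_suffix(".idx")?;
    if hex.len() != 10 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let bucket = u8::from_str_radix(&hex[..2], 16).ok()?;
    let version = u32::from_str_radix(&hex[2..], 16).ok()?;
    Some((bucket, version))
}
```

When several versions exist for one bucket, a reader would typically pick the file with the highest version; that selection policy is outside this sketch.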
Bucket Assignment
Files are assigned to index buckets using the XOR-fold algorithm on the first 9 bytes of the encoding key:
hash = key[0] ^ key[1] ^ key[2] ^ key[3] ^ key[4] ^ key[5] ^ key[6] ^ key[7] ^ key[8]
bucket = (hash & 0x0F) ^ (hash >> 4)
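Written out in Rust, the same computation looks like this. The function name is hypothetical; it returns None for keys shorter than 9 bytes rather than panicking.

```rust
/// Computes the index bucket (0x00-0x0F) for an encoding key by XOR-folding
/// its first 9 bytes and then folding the two nibbles of the result.
pub fn bucket_index(ekey: &[u8]) -> Option<u8> {
    let prefix = ekey.get(..9)?;
    let hash = prefix.iter().fold(0u8, |acc, b| acc ^ b);
    Some((hash & 0x0F) ^ (hash >> 4))
}
```

Because only the first 9 bytes participate, a full 16-byte key and its truncated 9-byte form map to the same bucket.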
Agent uses a flush-and-bind pattern with 3-retry atomic commits when writing index files.
Key Mapping Table (KMT)
Below the index files, CASC maintains a Key Mapping Table (KMT) as the primary on-disk structure for key-to-location resolution:
- Two-tier LSM-tree: sorted section (0x12-byte entries) + update section (0x200-byte pages)
- Jenkins lookup3 hashes for bucket distribution
- 9-byte EKey prefix binary search within sorted sections
- Update section uses 0x200-byte (512-byte) pages with 0x15 (21) entries per page (minimum 0x7800 bytes)
Data Files (data.NNN)
Data files contain BLTE-encoded content. Each entry has a 30-byte (0x1E) local header before the BLTE data:
Offset Size Field
0x00 16 Encoding key (reversed byte order)
0x10 4 Size including header (big-endian)
0x14 2 Flags
0x16 4 ChecksumA
0x1A 4 ChecksumB
0x1E ... BLTE data
Archive Location Packing
The 5-byte archive location in index entries encodes both archive ID and offset:
Byte 0: archive_id >> 2 (high 8 bits)
Bytes 1-4: (archive_id_low << 30) | (offset & 0x3FFFFFFF) (big-endian)
This gives 10-bit archive IDs (max 1023) and 30-bit offsets (max ~1 GiB).
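The packing can be sketched as an encode/decode pair that follows the byte layout given above. The function names are hypothetical, and out-of-range inputs are rejected instead of being silently truncated.

```rust
/// Packs a 10-bit archive ID and a 30-bit offset into the 5-byte location.
pub fn pack_location(archive_id: u16, offset: u32) -> Option<[u8; 5]> {
    if archive_id > 0x3FF || offset > 0x3FFF_FFFF {
        return None;
    }
    // Low 2 bits of the archive ID share the 32-bit word with the offset.
    let low = ((archive_id as u32 & 0x3) << 30) | offset;
    let mut out = [0u8; 5];
    out[0] = (archive_id >> 2) as u8; // high 8 bits of the archive ID
    out[1..].copy_from_slice(&low.to_be_bytes());
    Some(out)
}

/// Unpacks a 5-byte location into (archive_id, offset).
pub fn unpack_location(bytes: [u8; 5]) -> (u16, u32) {
    let low = u32::from_be_bytes([bytes[1], bytes[2], bytes[3], bytes[4]]);
    let archive_id = ((bytes[0] as u16) << 2) | (low >> 30) as u16;
    (archive_id, low & 0x3FFF_FFFF)
}
```

For archive 5 at offset 0x1234 the packed bytes are 01 40 00 12 34: the high eight ID bits are 0x01 and the low two bits (01) occupy the top of the 32-bit word.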
Container Index
Agent maintains a ContainerIndex with 16 segments and supports frozen/thawed archive management:
- Segments can be frozen (read-only) or thawed (writable)
- 0x1E-byte reconstruction headers per archive entry
- Segment limit configurable up to 0x3FF (1023)
- Per-segment tracking: 0x40 (64) bytes per segment in compactor state
Shared Memory (shmem)
The shmem file provides memory-mapped coordination between the Agent process and game clients:
- Protocol versions 4 (base) and 5 (exclusive access flag at DWORD index 0x54)
- Free space table format identifier at DWORD index 0x42 (value 0x2AB8)
- V5 PID tracking: slot array with PID (u32) and mode (u32) per slot
- Writer lock: named global mutex with Global\ prefix
- DACL: D:(A;;GA;;;WD)(A;;GA;;;AN) (grant all to Everyone + Anonymous)
- Retry logic: 10 attempts with Sleep(0) between failures
- .lock file with 10-second backoff for coordination
LRU Cache
Agent maintains an LRU cache in shared memory:
- Linked-list table structure
- Generation-based checkpoints for eviction
- 20-character hex filenames with .lru extension
.build.info
The .build.info file contains installation metadata in BPSV format:
- Product code and region
- Active build configuration hash
- CDN configuration hash
- Installation tags and flags
Residency Tracking
The Residency container tracks which content keys are fully downloaded:
- .residency token files mark valid containers
- Byte-span tracking for partial downloads (header and data residency)
- Reserve, mark-resident, remove, query operations
- Scanner API for enumeration
- Drive type check prevents unsupported storage media
Hard Link Storage
The Hard Link container uses a TrieDirectory for content sharing:
- Hard links allow multiple keys to reference the same physical file
- 32-character hex filename validation
- Unlinked key collection (link count <= 1)
- Recursive compaction
- LRU file descriptor cache with two open modes (handle vs async IO)
- 3-retry delete before hard link creation
- Falls back to residency when hard links are unsupported
Maintenance Operations
Compaction
Two-phase process: archive merge then extract-compact.
- Defrag algorithm: removes gaps between files, reorganizes positions
- Fillholes algorithm: estimates free space without moving data
- Merge threshold: float in [0.0, 0.4]
- Async read/write pipeline with 128 KB minimum buffer
- Per-segment span validation with overlap detection
Garbage Collection
4-stage pipeline:
- Remove unreferenced keys from dynamic container
- Remove obsolete config files
- Remove CDN index files
- Clean up empty directories recursively
Build Repair
Multi-stage pipeline using marker files for crash recovery:
- RepairMarker.psv (pipe-separated, writable keys)
- CASCRepair.mrk (V2 marker format)
- Stages: read config, init CDN index, repair containers (data/ecache/hardlink sequentially), data repair, post-repair cleanup
Differences from CDN Storage
| Aspect | CDN | Local |
|---|---|---|
| Key size | 16 bytes | 9 bytes (truncated) |
| Key type | Content keys | Encoding keys |
| Organization | Per-archive indices | 16-bucket index files |
| Entry header | None | 30-byte local header |
| Index format | CDN index footer | IDX Journal v7 with guarded blocks |
| Mutability | Immutable | Updated during patches |
| Containers | Single type | 4 types (dynamic/static/residency/hardlink) |
CDN Content Caching
The cascette-cache crate provides multi-layer caching for NGDP/CDN content.
It optimizes network bandwidth and latency by caching frequently accessed data
at multiple levels.
Architecture
graph TD
A[Application] --> B[Multi-Layer Cache]
B --> C[L1: Memory Cache]
B --> D[L2: Disk Cache]
D --> E[CDN]
C --> E
subgraph "Cache Layers"
C
D
end
L1: Memory Cache
Fast in-memory cache with LRU eviction:
- Immediate access for hot data
- Size-based eviction when memory limit reached
- TTL-based expiration for stale data
- Zero-copy data sharing with bytes::Bytes
L2: Disk Cache
Persistent disk cache for larger datasets:
- Survives application restarts
- Atomic writes with fsync for durability
- Configurable storage limits
- Asynchronous I/O with tokio
NGDP-Specific Caches
The crate provides specialized caches for NGDP content types:
Resolution Cache
Caches the NGDP resolution chain:
Root File → Content Key
Content Key → Encoding Key
Encoding Key → CDN Location
Content-Addressed Cache
Stores content by its MD5 hash (ContentKey):
- Automatic validation on retrieval
- Deduplication across builds
- Supports partial content access
BLTE Block Cache
Caches individual BLTE blocks for large files:
- Enables partial file access without full download
- Block-level validation
- Decompressed and raw block storage
Archive Range Cache
Caches byte ranges from CDN archives:
- Coalesces nearby requests into larger ranges
- Reduces CDN round-trips
- Supports range request optimization
Memory Pooling
NGDP files have predictable size distributions. The memory pool uses size classes optimized for these patterns:
| Size Class | Range | Typical Content |
|---|---|---|
| Small | < 16 KB | Config files, small assets |
| Medium | < 256 KB | Most game files |
| Large | < 8 MB | Textures, models |
| Huge | > 8 MB | Large archives, cinematics |
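Selecting a pool by size class might look like the following. The `SizeClass` enum and `size_class` function are hypothetical names, KB and MB are taken as binary units, and placing a buffer of exactly 8 MB in the Huge class is an assumption, since the table leaves that boundary open.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SizeClass {
    Small,
    Medium,
    Large,
    Huge,
}

/// Maps an allocation size in bytes to the pool size class from the table.
pub fn size_class(len: usize) -> SizeClass {
    const KB: usize = 1024;
    const MB: usize = 1024 * KB;
    if len < 16 * KB {
        SizeClass::Small
    } else if len < 256 * KB {
        SizeClass::Medium
    } else if len < 8 * MB {
        SizeClass::Large
    } else {
        SizeClass::Huge // assumption: exactly 8 MB counts as Huge
    }
}
```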
Benefits:
- Reduced allocation overhead
- Better memory locality
- Thread-local pools for zero-contention
Content Validation
All cached content is validated on retrieval:
MD5 Validation
Content keys are MD5 hashes of the data:
let content_key = ContentKey::from_data(&data);
// Cache validates: MD5(data) == content_key
Jenkins96 Validation
Archive indices use Jenkins96 for fast hashing:
let hash = Jenkins96::hash(path.as_bytes());
// Validates archive index lookups
TACT Key Validation
Encrypted content requires TACT key verification before decryption.
SIMD Optimizations
Hash operations use SIMD acceleration when available:
| Instruction Set | Vector Width | Speedup |
|---|---|---|
| SSE2 | 128-bit | 2x |
| SSE4.1 | 128-bit | 2x |
| AVX2 | 256-bit | 4x |
| AVX-512 | 512-bit | 8x |
Runtime CPU detection selects the best available implementation.
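Runtime selection of this kind is commonly built on the standard library's `is_x86_feature_detected!` macro. The sketch below shows only the detection step and is not the crate's dispatch code: `best_simd_level` is a hypothetical name, AVX-512 is left out of the example, and non-x86 targets fall back to a scalar path.

```rust
/// Report the widest instruction set this sketch would dispatch to.
pub fn best_simd_level() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // Check from widest to narrowest so the best match wins.
        if is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if is_x86_feature_detected!("sse4.1") {
            return "sse4.1";
        }
        if is_x86_feature_detected!("sse2") {
            return "sse2";
        }
    }
    "scalar"
}
```

The result is usually computed once and cached, for example in a function pointer chosen at startup.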
Configuration
Memory Cache
MemoryCacheConfig {
    max_size: 256 * 1024 * 1024,    // 256 MB limit
    ttl: Duration::from_secs(3600), // 1 hour TTL
    eviction_batch_size: 100,       // Evict 100 items at a time
}
Disk Cache
DiskCacheConfig {
    cache_dir: PathBuf::from("/var/cache/cascette"),
    max_size: 10 * 1024 * 1024 * 1024, // 10 GB limit
    sync_writes: true,                 // fsync after writes
}
Multi-Layer
MultiLayerConfig {
    l1: MemoryCacheConfig::default(),
    l2: DiskCacheConfig::default(),
    write_through: true,  // Write to both layers
    promote_on_hit: true, // Copy L2 hits to L1
}
CDN Integration
The cache integrates with CDN clients for miss handling:
sequenceDiagram
participant App
participant L1 as Memory Cache
participant L2 as Disk Cache
participant CDN
App->>L1: get(key)
alt L1 Hit
L1-->>App: data
else L1 Miss
L1->>L2: get(key)
alt L2 Hit
L2-->>L1: data
L1-->>App: data
else L2 Miss
L2->>CDN: fetch(key)
CDN-->>L2: data
L2-->>L1: data
L1-->>App: data
end
end
Features:
- Automatic CDN fallback on cache miss
- Retry logic with exponential backoff
- Multiple CDN endpoint failover
- Range request support for partial content
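The lookup order in the sequence diagram can be condensed into a few lines. This is a simplified sketch, not the crate's API: `TwoLayerSketch` and `get_or_fetch` are hypothetical names, both layers are plain in-memory maps, and the `fetch` closure stands in for a CDN client with its retry and failover logic.

```rust
use std::collections::HashMap;

/// Hypothetical two-layer cache; both layers are plain maps for brevity.
#[derive(Default)]
pub struct TwoLayerSketch {
    pub l1: HashMap<String, Vec<u8>>,
    pub l2: HashMap<String, Vec<u8>>,
}

impl TwoLayerSketch {
    /// Look up `key` in L1, then L2, then fall back to `fetch`.
    pub fn get_or_fetch<F>(&mut self, key: &str, fetch: F) -> Option<Vec<u8>>
    where
        F: FnOnce(&str) -> Option<Vec<u8>>,
    {
        if let Some(data) = self.l1.get(key) {
            return Some(data.clone()); // L1 hit
        }
        if let Some(data) = self.l2.get(key).cloned() {
            // L2 hit: copy into L1 (the promote_on_hit behavior).
            self.l1.insert(key.to_string(), data.clone());
            return Some(data);
        }
        // Miss in both layers: fetch, then populate both (write_through).
        let data = fetch(key)?;
        self.l2.insert(key.to_string(), data.clone());
        self.l1.insert(key.to_string(), data.clone());
        Some(data)
    }
}
```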
Streaming
Large files are processed in chunks to avoid memory exhaustion:
StreamingConfig {
    chunk_size: 64 * 1024,   // 64 KB chunks
    max_buffered_chunks: 16, // 1 MB max buffer
    validate_chunks: true,   // Validate each chunk
}
Streaming enables:
- Processing files larger than available memory
- Progressive validation during download
- Early error detection
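Chunked processing can be sketched over any `std::io::Read` source. The function below is an illustration, not the crate's streaming API: `process_in_chunks` is a hypothetical name, and the `on_chunk` callback stands in for per-chunk validation or decoding.

```rust
use std::io::{self, Read};

/// Read `reader` in chunks of at most `chunk_size` bytes, passing each chunk
/// to `on_chunk`. Returns the total number of bytes processed.
pub fn process_in_chunks<R, F>(
    mut reader: R,
    chunk_size: usize,
    mut on_chunk: F,
) -> io::Result<u64>
where
    R: Read,
    F: FnMut(&[u8]) -> io::Result<()>,
{
    // One reusable buffer keeps memory use bounded regardless of file size.
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of stream
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // An error from the callback stops the stream early.
        on_chunk(&buf[..n])?;
        total += n as u64;
    }
    Ok(total)
}
```

Because a reader may return fewer bytes than requested, chunks can be shorter than `chunk_size`; a validator that needs fixed-size blocks would have to buffer across calls.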
Metrics
The cache tracks performance metrics:
- Hit rate (L1, L2, overall)
- Miss rate and CDN fallback frequency
- Eviction counts and reasons
- Memory and disk usage
- Validation success/failure rates
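Hit and miss rates follow from three counters. The sketch below is illustrative only: `CacheCounters` and its methods are hypothetical names, and a real implementation would likely use atomic counters shared across threads.

```rust
/// Hypothetical counters for a two-layer cache.
#[derive(Debug, Default, Clone, Copy)]
pub struct CacheCounters {
    pub l1_hits: u64,
    pub l2_hits: u64,
    pub misses: u64, // lookups that fell through to the CDN
}

impl CacheCounters {
    pub fn total(&self) -> u64 {
        self.l1_hits + self.l2_hits + self.misses
    }

    /// Fraction of all lookups represented by `part`; 0.0 when no lookups yet.
    fn ratio(&self, part: u64) -> f64 {
        let total = self.total();
        if total == 0 {
            0.0
        } else {
            part as f64 / total as f64
        }
    }

    pub fn l1_hit_rate(&self) -> f64 {
        self.ratio(self.l1_hits)
    }

    pub fn overall_hit_rate(&self) -> f64 {
        self.ratio(self.l1_hits + self.l2_hits)
    }

    pub fn miss_rate(&self) -> f64 {
        self.ratio(self.misses)
    }
}
```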
CDN Mirroring and Archival Strategy
Overview
This document outlines strategies for mirroring Blizzard’s CDN content for WoW using NGDP/CASC.
Note: Python code examples in this document are conceptual pseudocode illustrating mirroring workflows. For working code, see the cascette mirror CLI command or reference implementations in References.
Rationale for Mirroring
Blizzard removes older builds from CDN within days to weeks of new patches (see Archival Urgency below). Mirroring preserves builds that would otherwise be lost, enabling:
- Preservation: Maintain access to historical builds after CDN removal
- Development: Test CASC implementations against known data offline
- Performance: Local access avoids CDN latency and bandwidth limits
Target Products
Focus on World of Warcraft products:
| Product Code | Description | Update Frequency |
|---|---|---|
| wow | Retail/Live | Weekly patches |
| wowt | Public Test Realm | Frequent updates |
| wow_beta | Beta servers | Daily during beta |
| wow_classic | Classic (Wrath/Cata) | Bi-weekly |
| wow_classic_era | Classic Era (Vanilla) | Rare updates |
| wow_classic_ptr | Classic PTR | During test cycles |
| wow_classic_titan | Classic Titan (CN only, WotLK 3.80.x) | Unknown |
| wow_anniversary | Classic Anniversary (TBC 2.5.x) | Unknown |
Archival Urgency
Based on testing CDN retention windows:
| Product | Retention Window | Archival Priority |
|---|---|---|
| wow (Retail) | 14-15 days | High - Daily checks |
| wow_classic | 2-4 weeks | Medium - Weekly checks |
| wow_classic_era | ~3 months | Low - Monthly checks |
| wow_beta | 7-10 days | Critical - Continuous |
| wowt (PTR) | 10-14 days | High - Every 2-3 days |
Critical Finding: Retail builds disappear within 2 weeks of new patches.
Build Discovery
Track new builds via Ribbit protocol:
Sequence Number Monitoring
# Query summary endpoint
echo -e "v1/summary\r\n" | nc us.version.battle.net 1119
# Response includes sequence numbers
## seqn = 2241282
Monitor sequence number changes:
async def check_for_updates():
    summary = await ribbit_client.get_summary()
    for product in summary.products:
        stored_seqn = database.get_sequence(product.name)
        if product.seqn > stored_seqn:
            # New build detected!
            await process_new_build(product)
            database.update_sequence(product.name, product.seqn)
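For a Rust-side counterpart, the sequence number can be pulled out of the raw response text by looking for the `## seqn = ...` line shown in the shell example. This is a sketch under assumptions: `parse_seqn` and `is_newer` are hypothetical helpers, and only that one line format is handled.

```rust
/// Extract the sequence number from a Ribbit response body, if present.
pub fn parse_seqn(response: &str) -> Option<u64> {
    response.lines().find_map(|line| {
        let rest = line.trim().strip_prefix("## seqn")?;
        let value = rest.trim_start().strip_prefix('=')?;
        value.trim().parse().ok()
    })
}

/// Decide whether `current` is newer than the stored value.
/// A product with no stored sequence number is always treated as new.
pub fn is_newer(stored: Option<u64>, current: u64) -> bool {
    stored.map_or(true, |seen| current > seen)
}
```

Comparing sequence numbers avoids re-downloading and diffing the full version table on every poll.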
Version Information
# Get specific product versions
echo -e "v1/products/wow/versions\r\n" | nc us.version.battle.net 1119
CDN Path Discovery
Critical: Always Extract CDN Paths
# Get CDN information - NEVER hardcode paths!
echo -e "v1/products/wow/cdns\r\n" | nc us.version.battle.net 1119
Example response:
Region!STRING:0|Hosts!STRING:0|Path!STRING:0|ConfigPath!STRING:0
us|level3.blizzard.com edgecast.blizzard.com|tpr/wow|tpr/configs/data
eu|level3.blizzard.com edgecast.blizzard.com|tpr/wow|tpr/configs/data
CRITICAL: The Path field (tpr/wow) must be used for URL construction:
# CORRECT - Uses path from CDN response
cdn_url = f"http://{host}/{path}/data/{hash[:2]}/{hash[2:4]}/{hash}"
# WRONG - Hardcoded path
cdn_url = f"http://{host}/tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}"
All WoW products use tpr/wow regardless of product code:
- wow, wow_classic, wow_classic_era, wow_classic_titan, wow_anniversary all use tpr/wow
- Never assume paths based on product names
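The URL rule above translates to Rust as follows. This is an illustrative helper rather than the cascette-rs client API: `cdn_data_url` is a hypothetical name, and the hex-length check and lowercasing are assumptions added for safety.

```rust
/// Build a CDN data URL from a host, the Path field of the CDN response,
/// and a 32-character hex hash. Returns `None` for a malformed hash.
pub fn cdn_data_url(host: &str, path: &str, hash: &str) -> Option<String> {
    let is_hex = hash.len() == 32 && hash.bytes().all(|b| b.is_ascii_hexdigit());
    if !is_hex {
        return None;
    }
    let hash = hash.to_ascii_lowercase();
    let path = path.trim_matches('/');
    // The first two byte pairs of the hash form the directory fan-out.
    Some(format!(
        "http://{host}/{path}/data/{}/{}/{hash}",
        &hash[..2],
        &hash[2..4]
    ))
}
```

Taking `path` as a parameter keeps the value returned by the CDN endpoint as the single source of truth.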
Essential Files
Priority order for archival:
1. Configuration Files (Critical)
- BuildConfig: Build-specific settings
- CDNConfig: CDN and archive information
- ProductConfig: Product metadata
2. System Files (Required)
- Encoding: Content key mappings (~500MB-2GB)
- Root: File manifest
- Install: Installation manifest
- Download: Download priority
3. Indices (Important)
- Archive indices (.index files)
- Patch indices for updates
4. Data Archives (Bulk)
- Archive files (data.###)
- Largest storage requirement
- Can be fetched on-demand
Mirroring Architecture
Storage Structure
/mirror
├── configs/
│ └── data/
│ ├── {hash[0:2]}/
│ │ └── {hash[2:4]}/
│ │ └── {hash}
├── data/
│ ├── {hash[0:2]}/
│ │ └── {hash[2:4]}/
│ │ └── {hash}
├── indices/
│ └── *.index
└── metadata.db
Database Schema
CREATE TABLE builds (
id SERIAL PRIMARY KEY,
product VARCHAR(50),
build_config VARCHAR(32),
cdn_config VARCHAR(32),
build_name VARCHAR(100),
detected_at TIMESTAMP,
archived BOOLEAN DEFAULT FALSE
);
CREATE TABLE files (
hash VARCHAR(32) PRIMARY KEY,
size BIGINT,
type VARCHAR(20),
downloaded_at TIMESTAMP
);
Download Strategy
Priority-Based Downloading
class MirrorStrategy:
    def __init__(self, full_mirror=False):
        # Whether to also fetch the bulk data archives (off by default)
        self.full_mirror = full_mirror
        self.priorities = {
            'configs': 1,   # Highest priority
            'encoding': 2,
            'root': 3,
            'install': 4,
            'indices': 5,
            'data': 10      # Lowest priority
        }

    async def mirror_build(self, build_info):
        # 1. Download configs first
        await self.download_configs(build_info)
        # 2. Get encoding file
        encoding = await self.download_encoding(build_info)
        # 3. Download indices
        indices = await self.download_indices(build_info)
        # 4. Optional: Download data archives
        if self.full_mirror:
            await self.download_archives(indices)
Bandwidth Management
- Concurrent downloads: 4-8 connections
- Rate limiting: Respect CDN limits
- Retry logic: Handle transient failures
- Resume support: Continue interrupted downloads
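Retry handling for transient failures usually relies on a capped exponential delay between attempts. The helper below is a minimal sketch, not code from cascette-rs: `backoff_delay` and its parameters are hypothetical, and jitter is omitted for clarity.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`. Overflow saturates to `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}
```

With a 500 ms base and an 8 s cap, successive retries wait 0.5 s, 1 s, 2 s, 4 s, and then 8 s for every later attempt. Adding random jitter to each delay helps avoid many mirrors retrying in lockstep.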
Incremental Updates
Track changes efficiently:
async def incremental_update(product):
    current_build = await get_current_build(product)
    stored_build = database.get_latest_build(product)
    if current_build != stored_build:
        # Download only new/changed files
        new_files = await diff_builds(current_build, stored_build)
        await download_files(new_files)
        database.update_build(product, current_build)
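Because files are content-addressed, the diff step reduces to a set difference over hashes. The Rust sketch below illustrates that idea; `files_to_download` is a hypothetical helper, and builds are represented simply as slices of hex hash strings.

```rust
use std::collections::HashSet;

/// Return the hashes referenced by `current` that are not already in `stored`,
/// in first-seen order and without duplicates.
pub fn files_to_download(current: &[&str], stored: &[&str]) -> Vec<String> {
    let have: HashSet<&str> = stored.iter().copied().collect();
    let mut seen = HashSet::new();
    current
        .iter()
        .copied()
        // Keep a hash only if it is missing locally and not queued already.
        .filter(|hash| !have.contains(hash) && seen.insert(*hash))
        .map(String::from)
        .collect()
}
```

Unchanged files share the same hash across builds, so they drop out of the result without any byte-level comparison.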
Verification
Ensure data integrity:
Hash Verification
def verify_file(filepath, expected_hash):
    actual_hash = calculate_md5(filepath)
    if actual_hash != expected_hash:
        raise IntegrityError(f"Hash mismatch: {filepath}")
Archive Integrity
- Verify BLTE headers
- Check chunk checksums
- Validate encoding entries
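The first of these checks can be sketched as a magic-and-size check on the leading eight bytes of a file: the 4-byte "BLTE" magic followed by a big-endian header size. The function and error names below are hypothetical, and the sketch stops short of parsing the chunk table.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum BlteHeaderError {
    TooShort,
    BadMagic,
}

/// Check the "BLTE" magic and return the big-endian header size field.
pub fn parse_blte_header_size(data: &[u8]) -> Result<u32, BlteHeaderError> {
    let header = data.get(..8).ok_or(BlteHeaderError::TooShort)?;
    if &header[..4] != b"BLTE" {
        return Err(BlteHeaderError::BadMagic);
    }
    Ok(u32::from_be_bytes([header[4], header[5], header[6], header[7]]))
}
```

Rejecting files at this stage is cheap and catches truncated or misnamed downloads before any checksum work begins.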
Storage Optimization
Deduplication
Content-addressed storage automatically deduplicates:
def store_file(content, hash):
    path = get_path_from_hash(hash)
    if not os.path.exists(path):
        # Only store if not already present
        write_file(path, content)
Compression
- Keep BLTE files compressed
- Use filesystem compression for configs
- Consider archive formats for old builds
Historical Build Recovery
Using External Sources
- Community Archives:
  - Shared build collections
  - Private archives
- Wayback Machine:
  - Historical Ribbit responses
  - Cached configuration files
- Torrent archives:
  - Community-shared build collections
  - Distributed preservation efforts
Reconstruction
Rebuild missing content:
flowchart TD
A[Partial Build] --> B[Identify Missing]
B --> C[Search Mirrors]
C --> D{Found?}
D -->|Yes| E[Download Missing]
D -->|No| F[Check Archives]
F --> G{In Archive?}
G -->|Yes| H[Extract Content]
G -->|No| I[Search Community]
E --> J[Verify Hashes]
H --> J
I --> K{Available?}
K -->|Yes| L[Request Copy]
K -->|No| M[Document Gap]
L --> J
J --> N[Update Archive]
M --> O[Gap Report]
style A stroke-width:4px
style N stroke-width:4px
style O stroke-width:3px,stroke-dasharray:5 5
style D stroke-width:3px,stroke-dasharray:5 5
style G stroke-width:3px,stroke-dasharray:5 5
style K stroke-width:3px,stroke-dasharray:5 5
style J stroke-width:2px
style B stroke-width:2px
Legal Considerations
Fair Use
Archival under fair use principles:
- Research: Academic study of game development
- Education: Teaching game architecture
- Preservation: Cultural heritage of gaming
- Non-commercial: No monetization of archives
Best Practices
- Respect intellectual property
- Don’t distribute copyrighted content
- Use for personal/research purposes
- Cooperate with takedown requests
Reference Implementations
For detailed analysis of NGDP/CASC reference implementations, see references.md.
Key implementations examined:
- CascLib: Complete C++ library with 10+ years of development
- TACT.Net: C# architecture with modular design
- rustycasc: Rust implementation with type safety
- BlizzTrack: Production monitoring with database persistence
- blizztools: Rust CLI for NGDP operations
- blizzget: C++ downloader with custom version support
- tactmon: Advanced C++ monitoring with template ORM
- TACTSharp: .NET extraction library with memory-mapped files
These implementations informed cascette-rs design for CDN interaction and content resolution.
Implementation Examples
Build Tracker
import asyncio

class BuildTracker:
    def __init__(self, products):
        self.products = products
        self.check_interval = 300  # 5 minutes

    async def run(self):
        while True:
            for product in self.products:
                await self.check_product(product)
            await asyncio.sleep(self.check_interval)

    async def check_product(self, product):
        try:
            versions = await ribbit.get_versions(product)
            cdns = await ribbit.get_cdns(product)
            for region in versions.regions:
                build_config = region.build_config
                if not self.is_archived(build_config):
                    await self.archive_build(product, region, cdns)
        except Exception as e:
            logger.error(f"Failed to check {product}: {e}")
Archive Manager
class ArchiveManager:
    def __init__(self, storage_path):
        self.storage = storage_path
        self.cdn_client = CDNClient()

    async def archive_build(self, build_info):
        # Create build directory
        build_dir = self.storage / build_info.product / build_info.build_config
        build_dir.mkdir(parents=True, exist_ok=True)
        # Download in priority order
        await self.download_configs(build_info)
        await self.download_encoding(build_info)
        await self.download_root(build_info)
        # Mark as archived
        self.mark_archived(build_info)
Monitoring and Alerts
Health Checks
import shutil
from datetime import datetime

class MirrorHealth:
    async def check_health(self):
        return {
            'disk_space': self.check_disk_space(),
            'cdn_connectivity': await self.check_cdn(),
            'database': self.check_database(),
            'last_check': datetime.now()
        }

    def check_disk_space(self):
        usage = shutil.disk_usage(self.storage_path)
        return {
            'used': usage.used,
            'free': usage.free,
            'percent': (usage.used / usage.total) * 100
        }
Disaster Recovery
Backup Strategy
- Primary Mirror: Fast SSD storage
- Secondary Backup: HDD archive
- Cloud Backup: Critical configs only
- Community Sharing: Torrent distribution
Recovery Procedures
# Restore from backup
rsync -av /backup/mirror/ /primary/mirror/
# Verify integrity
find /mirror -type f -name "*.index" | xargs -I {} md5sum {}
# Rebuild database
python rebuild_metadata.py /mirror
Community Coordination
Shared Resources
- Mirror status: Track who has what builds
- Gap identification: Find missing builds
- Bandwidth sharing: Distribute download load
- Verification: Cross-check integrity
Future Considerations
- Automated build discovery with predictive downloading before CDN removal
- Differential compression between builds to reduce storage
- Geographic replication for redundancy
Tools and Resources
Existing Tools
- CASCExplorer: Browse CASC archives
- WoW.tools: Online CASC viewer
- TACTSharp: .NET extraction library
- CascLib: C++ CASC library
Monitoring Services
- BlizzTrack: Real-time build tracking
- Wago.tools: API for build information
Community
- Discord servers: Coordinate archival efforts
- GitHub repos: Share tools and scripts
- Forums: Technical discussions
The 14-15 day retention window for retail WoW makes automated monitoring and archival essential.
Reference Implementations
This document lists NGDP/CASC implementations useful for understanding the system. These projects have informed cascette-rs development and serve as references for format details and edge cases.
C++ Implementations
ladislav-zezula/CascLib
The original C++ CASC library by the author of StormLib (MPQ library).
- Repository: https://github.com/ladislav-zezula/CascLib
- Use for: Binary format details, algorithm verification, edge cases
- Features: Complete CASC support, local and online archives, multiple games
heksesang/CascLib
C++17 header-only library from the WoW 6.0 era.
- Repository: https://github.com/heksesang/CascLib
- Use for: Simplified CASC reading, header-only integration
- Note: Early implementation, lacks modern features (LZMA, LZ4, Zstd, encryption)
C# Implementations
Marlamin/CascLib
C# fork with WoW-specific enhancements, used by wow.tools.
- Repository: https://github.com/Marlamin/CascLib
- Use for: Encryption keys, root handlers, CDN index parsing, BLTE decoding
- Features: Game-specific root handlers for 20+ Blizzard titles
wowdev/TACTSharp
Memory-mapped C# implementation focused on performance.
- Repository: https://github.com/wowdev/TACTSharp
- Use for: Performance patterns, zero-copy techniques, CDN optimization
- Features: Efficient handling of large encoding files
wowdev/TACT.Net
C# library for TACT extraction operations.
- Repository: https://github.com/wowdev/TACT.Net
- Use for: Extraction patterns, multiple input/output formats
- Features: EKey, CKey, FileDataID, and filename-based extraction
WowDevTools/CASCHost
Server-side CASC hosting for modding.
- Repository: https://github.com/WowDevTools/CASCHost
- Use for: CASC building, CDN structure generation, content serving
- Note: Server-focused (produces content), opposite of cascette-rs (consumes content)
danielsreichenbach/BuildBackup
C# CDN backup tool (maintained fork of TACTAdder).
- Repository: https://github.com/danielsreichenbach/BuildBackup
- Use for: Mirror command reference, CDN failover, parallel downloads
- Features: Archive size caching, resume support, multi-product mirroring
Rust Implementations
ferronn-dev/rustycasc
Rust CASC types and FrameXML extractor.
- Repository: https://github.com/ferronn-dev/rustycasc
- Use for: Rust type definitions, archive index parsing
- Note: Hardcodes 4-byte offsets (doesn’t handle archive-groups)
ohchase/blizztools
Rust CLI for NGDP/TACT operations.
- Repository: https://github.com/ohchase/blizztools
- Use for: Ribbit protocol, install manifest parsing, async download patterns
- Features: Version queries, manifest parsing, file downloads
Other Tools
Warpten/tactmon
C++ CDN tracker with Ribbit monitoring.
- Repository: https://github.com/Warpten/tactmon
- Use for: Ribbit protocol implementation, CDN monitoring, product tracking
- Features: Template-based ORM, database persistence, production monitoring
funjoker/blizzget
Windows GUI CDN downloader.
- Repository: https://github.com/funjoker/blizzget
- Use for: Download workflow, custom version configs, tag selection
- Note: GUI-focused, Windows-only
Kruithne/wow.export
Node.js/TypeScript export toolkit.
- Repository: https://github.com/Kruithne/wow.export
- Use for: File extraction patterns, M2/WMO handling, BLP conversion
- Features: Visual export interface, multiple format support
Marlamin/wow.tools.local
Local wow.tools implementation.
- Repository: https://github.com/Marlamin/wow.tools.local
- Use for: File history tracking, DB2 diffing, hotfix management
- Features: Web-based content browser, model viewer, database browser
Community Resources
wowdev.wiki
Community wiki documenting WoW file formats and systems.
- URL: https://wowdev.wiki
- Key pages: NGDP, CASC, TACT
wago.tools
Build database with 1,900+ WoW builds.
- URL: https://wago.tools/builds
- Use for: Build history, version information, product tracking
Community CDN Mirrors
Community-operated mirrors preserving historical WoW builds. These provide access to game data after Blizzard removes it from official CDNs.
cdn.arctium.tools
- URL: https://cdn.arctium.tools
- Coverage: WoW 6.x onwards (2014+)
- Products: World of Warcraft (all variants)
casc.wago.tools
- URL: https://casc.wago.tools
- Coverage: Recent WoW builds
- Products: World of Warcraft
archive.wow.tools
- URL: https://archive.wow.tools
- Coverage: Various WoW builds
- Products: World of Warcraft, historical data
cascette-rs supports automatic fallback between these mirrors when official Blizzard CDNs are unavailable.
Project Setup
This page covers the requirements and setup for developing cascette-rs.
Requirements
Rust Toolchain
- Minimum Supported Rust Version (MSRV): 1.92.0
- Edition: Rust 2024
Install the required toolchain:
rustup install 1.92.0
rustup default 1.92.0
Required components:
rustup component add rustfmt clippy
For WASM development:
rustup target add wasm32-unknown-unknown
Development Tools
| Tool | Purpose | Installation |
|---|---|---|
| cargo-deny | Dependency auditing | cargo install cargo-deny |
| cargo-nextest | Test runner | cargo install cargo-nextest |
| cargo-llvm-cov | Code coverage | cargo install cargo-llvm-cov |
| mdbook | Documentation | cargo install mdbook or via mise install |
Optional Tools
| Tool | Purpose | Installation |
|---|---|---|
| ripgrep | Code search | cargo install ripgrep or system package |
| hyperfine | Benchmarking | cargo install hyperfine |
| cargo-watch | Auto-rebuild | cargo install cargo-watch |
Repository Structure
cascette-rs/
├── crates/ # Workspace members
│ ├── cascette-crypto/ # Cryptographic primitives
│ ├── cascette-formats/ # Binary format parsers
│ └── ...
├── docs/ # mdBook documentation
│ ├── src/ # Documentation source
│ └── book.toml # mdBook configuration
├── deny.toml # cargo-deny configuration
├── Cargo.toml # Workspace manifest
└── AGENTS.md # AI assistant guidance
First-Time Setup
- Clone the repository:
git clone https://github.com/wowemulation-dev/cascette-rs.git
cd cascette-rs
- Verify the toolchain:
rustc --version # Should be 1.92.0 or later
cargo --version
- Build the workspace:
cargo build --workspace
- Run tests:
cargo nextest run --workspace
- Verify lints pass:
cargo fmt --all -- --check
cargo clippy --workspace --all-targets
IDE Configuration
VS Code
Recommended extensions:
- rust-analyzer - Rust language support
- Even Better TOML - TOML file support
- crates - Dependency version management
Settings (.vscode/settings.json):
{
"rust-analyzer.check.command": "clippy",
"rust-analyzer.check.allTargets": true,
"editor.formatOnSave": true,
"[rust]": {
"editor.defaultFormatter": "rust-lang.rust-analyzer"
}
}
JetBrains (RustRover/IntelliJ)
- Install the Rust plugin
- Enable “Run rustfmt on save”
- Configure clippy as the external linter
Quality Gate
All changes must pass the CI workflow before merging. Run these checks locally:
# Full CI check (run before committing)
cargo fmt --all -- --check && \
cargo clippy --workspace --all-targets && \
cargo nextest run --profile ci --workspace && \
cargo doc --workspace --no-deps
Individual checks:
| Command | Purpose |
|---|---|
| cargo fmt --all -- --check | Format verification |
| cargo clippy --workspace --all-targets | Lint checks |
| cargo nextest run --profile ci --workspace | Unit and integration tests |
| cargo doc --workspace --no-deps | Documentation build |
| cargo deny check | Dependency audit |
WASM Compatibility
Core libraries must compile to WASM:
cargo check --target wasm32-unknown-unknown -p cascette-crypto
cargo check --target wasm32-unknown-unknown -p cascette-formats
Documentation
Build and serve the documentation locally:
# Build HTML documentation
mdbook build docs
# Serve locally with auto-reload
mdbook serve docs --open
The documentation will be available at http://localhost:3000.
Workspace Configuration
The workspace uses strict linting. Key settings from Cargo.toml:
[workspace.lints.clippy]
# Lint groups
all = { level = "warn", priority = -1 }
pedantic = { level = "warn", priority = -1 }
nursery = { level = "warn", priority = -1 }
cargo = { level = "warn", priority = -1 }
# Safety lints (higher priority)
unwrap_used = { level = "warn", priority = 2 }
panic = { level = "warn", priority = 2 }
todo = { level = "warn", priority = 2 }
unimplemented = { level = "warn", priority = 2 }
expect_used = { level = "warn", priority = 2 }
Library code should avoid unwrap(), expect(), and panic!(). Use Result
types and proper error handling instead.
Testing Guidelines
This page covers testing conventions and practices for cascette-rs.
Test Organization
Module Structure
Tests live in the same file as the code they test, using a #[cfg(test)] module:
pub fn parse_header(data: &[u8]) -> Result<Header, ParseError> {
    // Implementation
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_header_with_valid_data_returns_header() {
        // Test implementation
    }
}
Nested Modules for Large Files
For files with many tests, use nested modules to group related tests:
#[cfg(test)]
mod tests {
    use super::*;

    mod parsing {
        use super::*;

        #[test]
        fn test_parse_entry_from_valid_bytes() { ... }

        #[test]
        fn test_parse_entry_from_truncated_bytes_returns_error() { ... }
    }

    mod building {
        use super::*;

        #[test]
        fn test_builder_with_entries_produces_sorted_output() { ... }
    }

    mod edge_cases {
        use super::*;

        #[test]
        fn test_edge_empty_input_returns_empty_result() { ... }
    }
}
Test Naming Convention
Pattern
Use this naming pattern for test functions:
test_<subject>_<condition>_<expected_outcome>
Components:
| Part | Description | Example |
|---|---|---|
| subject | What is being tested | parser, builder, entry |
| condition | The scenario or input | with_valid_data, from_empty_input |
| expected_outcome | What should happen | returns_struct, returns_error |
Examples
Parsing tests:
// Good - specific and descriptive
fn test_parse_header_with_valid_magic_returns_header() { ... }
fn test_parse_header_with_invalid_magic_returns_error() { ... }
fn test_parse_entry_from_truncated_data_returns_incomplete_error() { ... }

// Bad - too vague
fn test_parse() { ... }
fn test_header() { ... }
fn test_error() { ... }
Building tests:
// Good
fn test_builder_with_single_entry_creates_valid_output() { ... }
fn test_builder_with_unsorted_entries_sorts_before_writing() { ... }

// Bad
fn test_builder() { ... }
fn test_build() { ... }
Round-trip tests:
// Good - suffix with _round_trip
fn test_index_entry_round_trip_preserves_all_fields() { ... }
fn test_blte_compression_round_trip_matches_original() { ... }

// Bad
fn test_round_trip() { ... } // Round trip of what?
Category Prefixes
Use consistent prefixes for special test categories:
| Prefix | Use Case | Example |
|---|---|---|
| test_edge_* | Edge cases and boundary conditions | test_edge_empty_input_handled |
| test_error_* | Error path validation | test_error_invalid_checksum_detected |
| *_round_trip | Serialization/deserialization | test_config_round_trip |
Edge case examples:
fn test_edge_empty_index_builds_successfully() { ... }
fn test_edge_single_entry_is_searchable() { ... }
fn test_edge_max_u32_offset_handled() { ... }
fn test_edge_zero_length_data_returns_empty() { ... }
Error handling examples:
fn test_error_truncated_footer_returns_parse_error() { ... }
fn test_error_invalid_checksum_returns_mismatch() { ... }
fn test_error_unsorted_entries_rejected() { ... }
Test Types
Unit Tests
Test individual functions in isolation:
#[test]
fn test_jenkins96_hash_with_known_input_produces_expected_output() {
    let result = Jenkins96::hash(b"test");
    // Placeholder constant: replace with the verified hash of the input
    assert_eq!(result.hash32, 0x12345678);
}
Integration Tests
Place in tests/ directory for testing public APIs:
crates/cascette-formats/
├── src/
│ └── lib.rs
└── tests/
└── archive_integration.rs
Property-Based Tests
Use proptest for testing invariants across many inputs:
#[cfg(test)]
mod proptest_tests {
    use super::*; // brings Entry, build, and parse into scope
    use proptest::prelude::*;

    proptest! {
        #[test]
        fn round_trip_preserves_entries(entries in prop::collection::vec(any::<Entry>(), 0..100)) {
            let built = build(&entries);
            let parsed = parse(&built)?;
            prop_assert_eq!(entries, parsed);
        }
    }
}
Property test naming (inside proptest! macro):
- No test_ prefix needed (macro adds it)
- Describe the property being verified
- Examples: round_trip_preserves_entries, checksum_detects_corruption
Assertions
Use pretty_assertions
Import pretty_assertions for better diff output on failures:
#[cfg(test)]
mod tests {
    use pretty_assertions::assert_eq;

    #[test]
    fn test_something() {
        assert_eq!(expected, actual); // Shows colored diff on failure
    }
}
Common Assertions
| Assertion | Use Case |
|---|---|
| assert_eq!(expected, actual) | Value equality |
| assert_ne!(a, b) | Values differ |
| assert!(condition) | Boolean conditions |
| assert!(result.is_ok()) | Success check |
| assert!(result.is_err()) | Error check |
| assert!(matches!(value, pattern)) | Pattern matching |
Error Assertions
Test specific error types:
#[test]
fn test_parse_with_invalid_data_returns_checksum_error() {
    let result = parse(invalid_data);
    assert!(matches!(
        result,
        Err(ParseError::ChecksumMismatch { .. })
    ));
}
Running Tests
This project uses cargo-nextest for faster, parallel test execution with better output formatting.
Basic Commands
# Run all tests with nextest (recommended)
cargo nextest run --workspace
# Run tests with CI profile (stricter timeouts, immediate output on failures)
cargo nextest run --profile ci --workspace
# Run tests for a specific crate
cargo nextest run -p cascette-formats
cargo nextest run --profile ci -p cascette-formats
# Run tests matching a pattern
cargo nextest run --workspace edge_ # All edge case tests
cargo nextest run --workspace error_ # All error tests
cargo nextest run --workspace round_trip # All round-trip tests
# Run a specific test
cargo nextest run -p cascette-formats test_parse_header_with_valid_data
Feature Combinations
Test with different feature combinations:
# Default features
cargo test --workspace
# No default features (minimal build)
cargo test --workspace --no-default-features
# All features
cargo test --workspace --all-features
Code Coverage
Generate coverage reports:
# Generate LCOV report
cargo llvm-cov --workspace --lcov --output-path lcov.info
# Generate HTML report
cargo llvm-cov --workspace --html
# Open HTML report
open target/llvm-cov/html/index.html
Test Data
Embedded Test Data
For small test cases, embed data directly in tests:
#[test]
fn test_parse_minimal_header() {
    let data = [
        0x42, 0x4C, 0x54, 0x45, // Magic: "BLTE"
        0x00, 0x00, 0x00, 0x10, // Header size: 16
    ];
    let header = parse_header(&data).expect("should parse");
    assert_eq!(header.magic, b"BLTE");
}
Test Fixtures
For larger test files, use the include_bytes! macro or test fixtures:
const TEST_INDEX: &[u8] = include_bytes!("fixtures/sample.index");

#[test]
fn test_parse_real_index_file() {
    let index = ArchiveIndex::parse(TEST_INDEX).expect("should parse");
    assert!(!index.entries.is_empty());
}
Property Test Strategies
Define reusable strategies for property tests:
use proptest::prelude::*;

fn valid_entry_strategy() -> impl Strategy<Value = IndexEntry> {
    (
        prop::array::uniform16(any::<u8>()), // 16-byte key
        0u32..u32::MAX,                      // offset
        1u32..1_000_000,                     // size
    )
        .prop_map(|(key, offset, size)| IndexEntry {
            key: key.to_vec(),
            offset,
            size,
            archive_index: None,
        })
}
CI Integration
Tests run automatically on every pull request using cargo-nextest. The CI workflow:
- Runs cargo nextest run --profile ci --workspace with default features
- Runs tests with --no-default-features on changed crates
- Tests each changed crate individually on stable Rust
- Collects code coverage using cargo llvm-cov --nextest and uploads to Codecov
See .github/workflows/ci.yml for the full configuration.
Nextest Profiles
The project uses three nextest profiles configured in .config/nextest.toml:
| Profile | Description | Use Case |
|---|---|---|
| default | Standard timeouts, final output on completion | Local development |
| ci | Stricter timeouts, immediate output on failures | CI, PR checks |
| release | Release build with optimizations | Performance testing |
Cargo Aliases
Convenient cargo aliases are defined in .cargo/config.toml:
cargo nextest-all # All tests with default profile
cargo nextest-lib # Library tests only
cargo nextest-ci # All tests with CI profile
cargo nextest-release # All tests with release profile
cargo nextest-unit # Unit tests only
cargo nextest-integration # Integration tests only
Performance Profiling
Flamegraphs
The project supports flamegraph generation using cargo-flamegraph. Flamegraphs help visualize CPU time spent in different functions during execution.
Generating Flamegraphs Locally
# Generate flamegraph for benchmarks
cargo flamegraph --bench throughput -- --bench
# Generate flamegraph for a binary
cargo flamegraph --bin cascette-ribbit -- --help
# Generate flamegraph for tests
cargo flamegraph --test integration
# Specify output location (flamegraph.svg is created in working directory by default)
cargo flamegraph --output target/flamegraphs/flamegraph.svg --bench throughput -- --bench
When written with --output as in the last command, flamegraphs are stored in target/flamegraphs/, which is ignored by git.
CI Flamegraph Generation
The .github/workflows/profiling.yml workflow generates flamegraphs automatically:
- Trigger: Manual via workflow_dispatch or commits with [perf] in the message
- Targets: bench (default), test, binary
- Output: Uploaded as artifacts and posted to PR comments
To trigger a flamegraph run:
git commit -m "Add performance optimization [perf]"
git push
Or manually trigger via GitHub Actions UI with a target selector.
Benchmarking
The project uses criterion for benchmarking.
# Run all benchmarks
cargo bench
# Run specific benchmark
cargo bench --bench throughput
# View the HTML report (criterion writes it during a normal run when its html_reports feature is enabled)
cargo bench --bench throughput
open target/criterion/report/index.html
Benchmark Regression Detection
The profiling workflow automatically detects performance regressions:
- Runs on main branch pushes
- Uses benchmark-action/github-action-benchmark to store results
- Alerts when performance degrades by >200%
- Posts comments to commits with regression alerts
Benchmark data is stored in GitHub Actions cache for historical comparison.
Coding Standards
This page covers coding conventions and style guidelines for cascette-rs.
Formatting
All code must be formatted with rustfmt. Run before committing:
cargo fmt --all
The workspace uses default rustfmt settings. No custom configuration is needed.
Linting
The workspace enables strict clippy lints. All warnings must be resolved:
cargo clippy --workspace --all-targets
Lint Configuration
From Cargo.toml:
[workspace.lints.clippy]
# Lint groups at low priority
all = { level = "warn", priority = -1 }
pedantic = { level = "warn", priority = -1 }
nursery = { level = "warn", priority = -1 }
cargo = { level = "warn", priority = -1 }
# Safety lints at higher priority
unwrap_used = { level = "warn", priority = 2 }
panic = { level = "warn", priority = 2 }
todo = { level = "warn", priority = 2 }
unimplemented = { level = "warn", priority = 2 }
expect_used = { level = "warn", priority = 2 }
Error Handling
Library Code
Library crates must use proper error handling:
// Good - returns Result
pub fn parse(data: &[u8]) -> Result<Header, ParseError> {
    if data.len() < HEADER_SIZE {
        return Err(ParseError::InsufficientData {
            expected: HEADER_SIZE,
            actual: data.len(),
        });
    }
    // ...
}
// Bad - panics
pub fn parse(data: &[u8]) -> Header {
    assert!(data.len() >= HEADER_SIZE); // Don't do this
    // ...
}
Error Types
Use thiserror for error definitions:
use thiserror::Error;
#[derive(Debug, Error)]
pub enum ParseError {
    #[error("insufficient data: expected {expected} bytes, got {actual}")]
    InsufficientData { expected: usize, actual: usize },
    #[error("invalid magic: expected {expected:?}, got {actual:?}")]
    InvalidMagic { expected: [u8; 4], actual: [u8; 4] },
    #[error("checksum mismatch")]
    ChecksumMismatch { expected: [u8; 8], actual: [u8; 8] },
}
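For readers new to thiserror, the derive above mainly saves writing the Display and Error impls by hand. The following is a rough std-only sketch of the same idea for a single variant, paired with a function that returns the error instead of panicking. The HEADER_SIZE value and the header_size helper are hypothetical names chosen for illustration, not part of cascette-rs.

```rust
use std::error::Error;
use std::fmt;

/// Illustrative header length; not taken from a real format.
pub const HEADER_SIZE: usize = 8;

#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    InsufficientData { expected: usize, actual: usize },
}

// Roughly what `#[error("...")]` generates: one Display arm per variant.
impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::InsufficientData { expected, actual } => write!(
                f,
                "insufficient data: expected {expected} bytes, got {actual}"
            ),
        }
    }
}

// `#[derive(Error)]` also provides this impl.
impl Error for ParseError {}

/// Returns the big-endian u32 stored in bytes 4..8, or an error if the input is too short.
pub fn header_size(data: &[u8]) -> Result<u32, ParseError> {
    let field = data.get(4..8).ok_or(ParseError::InsufficientData {
        expected: HEADER_SIZE,
        actual: data.len(),
    })?;
    Ok(u32::from_be_bytes([field[0], field[1], field[2], field[3]]))
}
```

Callers can match on the variant or print the message; neither path panics on short input.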
Avoiding unwrap() and expect()
Library code should avoid unwrap() and expect(). Use these alternatives:
// Instead of unwrap(), propagate errors
let value = map.get(&key).ok_or(Error::KeyNotFound)?;
// Instead of expect(), use ok_or_else() with context
let value = map.get(&key)
    .ok_or_else(|| Error::KeyNotFound { key: key.clone() })?;
// For truly impossible cases, use unreachable!() with comment
match validated_enum {
    Known::Variant => { /* ... */ }
    // Validation already checked all variants
    _ => unreachable!("validated_enum was checked before this match"),
}
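As a self-contained illustration of the ok_or_else() pattern, the sketch below looks up a value in a HashMap and reports which key was missing. LookupError and lookup_size are hypothetical names used only for this example.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum LookupError {
    KeyNotFound { key: String },
}

/// Looks up a size by key, reporting which key was missing instead of panicking.
pub fn lookup_size(sizes: &HashMap<String, u32>, key: &str) -> Result<u32, LookupError> {
    sizes
        .get(key)
        .copied()
        // The closure only runs on the error path, so the String is allocated lazily.
        .ok_or_else(|| LookupError::KeyNotFound { key: key.to_owned() })
}
```

Using ok_or_else() rather than ok_or() matters here because building the error allocates; with a unit-like error variant, ok_or() is enough.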
When expect() is unavoidable (e.g., in binrw map functions), add a file-level allow with documentation:
//! Module description
//!
//! Uses expect in binrw map functions where Result types cannot be used.
#![allow(clippy::expect_used)]
Test Code
Test code may use unwrap() and expect() with the allow attribute:
#[cfg(test)]
#[allow(clippy::unwrap_used, clippy::expect_used, clippy::panic)]
mod tests {
    // Tests can use unwrap/expect/panic freely
}
Binary Format Parsing
Use binrw
All binary formats use the binrw crate for parsing and building:
use binrw::{BinRead, BinWrite};
// PartialEq and Eq let round-trip tests compare headers with assert_eq!
#[derive(Debug, PartialEq, Eq, BinRead, BinWrite)]
#[brw(big)] // NGDP uses big-endian
pub struct Header {
    #[brw(magic = b"BLTE")]
    pub magic: (),
    pub header_size: u32,
    pub flags: u8,
}
Big-Endian Default
NGDP/CASC formats use big-endian byte order. Always specify:
#[derive(BinRead, BinWrite)]
#[brw(big)] // Required for NGDP formats
pub struct Entry {
    pub offset: u32,
    pub size: u32,
}
If a field uses little-endian (rare), annotate explicitly:
#[derive(BinRead, BinWrite)]
#[brw(big)]
pub struct MixedEntry {
    pub big_endian_field: u32,
    #[brw(little)]
    pub little_endian_field: u32, // Exception - document why
}
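To see why the annotation matters, the std-only sketch below decodes the same eight bytes as big-endian and as little-endian. The function names are hypothetical. The layout mirrors the two u32 fields of the Entry struct above.

```rust
/// Decodes the two u32 fields of an 8-byte entry as big-endian (offset, size).
pub fn decode_entry_be(bytes: [u8; 8]) -> (u32, u32) {
    let [a, b, c, d, e, f, g, h] = bytes;
    (u32::from_be_bytes([a, b, c, d]), u32::from_be_bytes([e, f, g, h]))
}

/// Same bytes read as little-endian, showing what a wrong byte order produces.
pub fn decode_entry_le(bytes: [u8; 8]) -> (u32, u32) {
    let [a, b, c, d, e, f, g, h] = bytes;
    (u32::from_le_bytes([a, b, c, d]), u32::from_le_bytes([e, f, g, h]))
}
```

For the bytes 00 00 01 00 00 00 00 10, the big-endian reading gives (256, 16) while the little-endian reading gives (65536, 268435456). A byte-order mistake therefore tends to produce plausible but wrong offsets rather than an immediate parse error.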
Round-Trip Testing
Every format must have round-trip tests:
use std::io::Cursor;
use binrw::{BinRead, BinWrite};
#[test]
fn test_header_round_trip_preserves_all_fields() {
    let original = Header {
        magic: (), // unit field; the BLTE magic is written by the #[brw(magic)] attribute
        header_size: 16,
        flags: 0x01,
    };
    let mut buffer = Vec::new();
    original.write(&mut Cursor::new(&mut buffer)).unwrap();
    let parsed = Header::read(&mut Cursor::new(&buffer)).unwrap();
    // assert_eq! requires Header to derive PartialEq
    assert_eq!(original, parsed);
}
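The round-trip property itself does not depend on binrw. As a minimal std-only sketch, assuming a hand-written encoder and decoder for a hypothetical MiniHeader type (not a cascette-rs type), the property is that decoding the encoded bytes yields the original value:

```rust
/// Hypothetical header used only to illustrate the round-trip property.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MiniHeader {
    pub header_size: u32,
    pub flags: u8,
}

const MAGIC: &[u8; 4] = b"BLTE";
const ENCODED_LEN: usize = 9; // 4 magic + 4 header_size + 1 flags

impl MiniHeader {
    /// Serializes as magic, big-endian header_size, flags.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(ENCODED_LEN);
        out.extend_from_slice(MAGIC);
        out.extend_from_slice(&self.header_size.to_be_bytes());
        out.push(self.flags);
        out
    }

    /// Parses the layout written by `to_bytes`; None on short input or wrong magic.
    pub fn from_bytes(data: &[u8]) -> Option<Self> {
        if data.len() < ENCODED_LEN || &data[..4] != MAGIC {
            return None;
        }
        let header_size = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);
        Some(Self { header_size, flags: data[8] })
    }
}
```

A round-trip test then asserts both directions: the encoded bytes match the documented layout, and from_bytes(to_bytes(x)) equals x.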
Documentation
Public API Documentation
All public items require documentation:
/// Parses a BLTE header from the given data.
///
/// # Arguments
///
/// * `data` - Raw bytes containing the BLTE header
///
/// # Returns
///
/// The parsed header on success, or an error if parsing fails.
///
/// # Errors
///
/// Returns `ParseError::InsufficientData` if the data is too short.
/// Returns `ParseError::InvalidMagic` if the magic bytes don't match.
///
/// # Examples
///
/// ```
/// use cascette_formats::blte::parse_header;
///
/// let data = include_bytes!("../fixtures/sample.blte");
/// let header = parse_header(data)?;
/// println!("Header size: {}", header.header_size);
/// # Ok::<(), cascette_formats::blte::ParseError>(())
/// ```
pub fn parse_header(data: &[u8]) -> Result<Header, ParseError> {
    // ...
}
Binary Format Documentation
Document binary formats with exact byte layouts:
/// Archive index entry.
///
/// ## Binary Layout
///
/// | Offset | Size | Field | Description |
/// |--------|------|-------|-------------|
/// | 0x00 | 16 | key | Encoding key (MD5 hash) |
/// | 0x10 | 4 | size | Compressed size in bytes |
/// | 0x14 | 4 | offset | Offset into archive file |
///
/// Total size: 24 bytes (0x18)
///
/// All multi-byte fields are big-endian.
#[derive(Debug, BinRead, BinWrite)]
#[brw(big)]
pub struct IndexEntry {
    pub key: [u8; 16],
    pub size: u32,
    pub offset: u32,
}
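One way to keep a documented layout honest is to encode the table as constants and check it at compile time. The sketch below does this for the layout in the doc comment above; the constant and function names are assumptions made for illustration.

```rust
// Offsets and sizes copied from the documented layout table.
pub const KEY_OFFSET: usize = 0x00;
pub const KEY_LEN: usize = 16;
pub const SIZE_OFFSET: usize = 0x10;
pub const OFFSET_OFFSET: usize = 0x14;
pub const ENTRY_SIZE: usize = 0x18;

// Compile-time check: the fields are contiguous and add up to the documented total.
const _: () = assert!(
    SIZE_OFFSET == KEY_OFFSET + KEY_LEN
        && OFFSET_OFFSET == SIZE_OFFSET + 4
        && ENTRY_SIZE == OFFSET_OFFSET + 4
);

/// Reads the big-endian `size` and `offset` fields from a raw 24-byte entry.
pub fn read_size_and_offset(entry: &[u8; ENTRY_SIZE]) -> (u32, u32) {
    let be = |at: usize| {
        u32::from_be_bytes([entry[at], entry[at + 1], entry[at + 2], entry[at + 3]])
    };
    (be(SIZE_OFFSET), be(OFFSET_OFFSET))
}
```

If someone edits one offset without updating the others, the const assertion fails the build instead of producing a silently wrong parser.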
Naming Conventions
Types and Traits
| Item | Convention | Example |
|---|---|---|
| Structs | PascalCase | ArchiveIndex, BlteHeader |
| Enums | PascalCase | CompressionType, ParseError |
| Traits | PascalCase | CascFormat, KeyStore |
| Type aliases | PascalCase | ContentKey, EncodingKey |
Functions and Methods
| Item | Convention | Example |
|---|---|---|
| Functions | snake_case | parse_header, build_index |
| Methods | snake_case | self.get_entry(), self.is_valid() |
| Constructors | new or from_* | Header::new(), Key::from_hex() |
| Conversions | to_* or into_* | to_bytes(), into_vec() |
| Getters | no prefix | fn size(&self) not fn get_size(&self) |
| Boolean getters | is_* or has_* | is_empty(), has_entries() |
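The conventions in the table can be seen together on one small type. The Key type below is hypothetical and exists only to illustrate the naming rules; it is not the cascette-rs key type.

```rust
/// Hypothetical 16-byte key, used only to illustrate the naming rules.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Key {
    bytes: [u8; 16],
}

impl Key {
    /// Constructor: `new`.
    pub fn new(bytes: [u8; 16]) -> Self {
        Self { bytes }
    }

    /// Constructor from another representation: `from_*`.
    pub fn from_hex(hex: &str) -> Option<Self> {
        // Exactly 32 hex digits; this also guarantees ASCII, so byte slicing is safe.
        if hex.len() != 32 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let mut bytes = [0u8; 16];
        for (i, byte) in bytes.iter_mut().enumerate() {
            *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
        }
        Some(Self { bytes })
    }

    /// Getter: no `get_` prefix.
    pub fn bytes(&self) -> &[u8; 16] {
        &self.bytes
    }

    /// Boolean getter: `is_*`.
    pub fn is_zero(&self) -> bool {
        self.bytes.iter().all(|&b| b == 0)
    }

    /// Borrowing conversion: `to_*`.
    pub fn to_hex(&self) -> String {
        self.bytes.iter().map(|b| format!("{b:02x}")).collect()
    }

    /// Consuming conversion: `into_*`.
    pub fn into_vec(self) -> Vec<u8> {
        self.bytes.to_vec()
    }
}
```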
Constants and Statics
// Constants: SCREAMING_SNAKE_CASE
pub const HEADER_SIZE: usize = 16;
pub const MAGIC_BYTES: [u8; 4] = *b"BLTE";
// Statics (rare): SCREAMING_SNAKE_CASE
// Lazy is assumed here to be once_cell::sync::Lazy; std::sync::LazyLock is the std equivalent
static GLOBAL_CONFIG: Lazy<Config> = Lazy::new(Config::default);
Modules
Module names use snake_case:
mod archive;
mod blte;
mod encoding;
mod root;
File structure mirrors module structure:
src/
├── archive/
│   ├── mod.rs
│   ├── index.rs
│   └── builder.rs
├── blte/
│   ├── mod.rs
│   ├── header.rs
│   └── compression.rs
└── lib.rs
Memory and Performance
Zero-Copy When Possible
Prefer borrowing over copying:
// Good - borrows data
pub fn parse<'a>(data: &'a [u8]) -> Result<Entry<'a>, Error> {
    Ok(Entry {
        key: &data[0..16],
        // ...
    })
}
// Less efficient - copies data
pub fn parse(data: &[u8]) -> Result<Entry, Error> {
    Ok(Entry {
        key: data[0..16].to_vec(),
        // ...
    })
}
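A complete, std-only version of the borrowing approach might look like the sketch below. EntryView and parse_view are hypothetical names. Unlike direct indexing, the length check makes short input return None rather than panic, in line with the error-handling rules earlier on this page.

```rust
/// Borrowed view over one entry; holds slices into the input, not copies.
#[derive(Debug, PartialEq, Eq)]
pub struct EntryView<'a> {
    pub key: &'a [u8],
    pub payload: &'a [u8],
}

/// Splits `data` into a 16-byte key and the remaining payload without copying.
/// Returns None when fewer than 16 bytes are available.
pub fn parse_view(data: &[u8]) -> Option<EntryView<'_>> {
    if data.len() < 16 {
        return None;
    }
    let (key, payload) = data.split_at(16);
    Some(EntryView { key, payload })
}
```

The returned view cannot outlive the input buffer, which the lifetime parameter enforces at compile time.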
Avoid Loading Large Files Into Memory
Stream large files instead of loading them entirely into memory:
use std::io::{Read, Seek};
// Good - streams data
pub fn process_archive<R: Read + Seek>(reader: &mut R) -> Result<(), Error> {
    // read_entry is assumed to return Ok(None) at the end of the archive
    while let Some(entry) = read_entry(reader)? {
        process_entry(&entry)?;
    }
    Ok(())
}
// Bad - loads everything
pub fn process_archive(data: &[u8]) -> Result<(), Error> {
    let archive = parse_entire_archive(data)?; // Out of memory for large files
    // ...
}
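A runnable sketch of the streaming pattern, using only std and an assumed 8-byte record layout (big-endian offset followed by big-endian size, as in the Entry struct earlier), is shown below. read_record and total_size are hypothetical helpers. Memory use stays at one record regardless of input length.

```rust
use std::io::{self, Read};

const RECORD_SIZE: usize = 8;

/// Reads one record. Returns Ok(None) at a clean end of stream and an
/// UnexpectedEof error if the stream ends partway through a record.
fn read_record<R: Read>(reader: &mut R) -> io::Result<Option<[u8; RECORD_SIZE]>> {
    let mut buf = [0u8; RECORD_SIZE];
    let mut filled = 0;
    while filled < RECORD_SIZE {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end of stream
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    match filled {
        0 => Ok(None),
        RECORD_SIZE => Ok(Some(buf)),
        _ => Err(io::Error::new(io::ErrorKind::UnexpectedEof, "truncated record")),
    }
}

/// Sums the `size` field of every record without buffering the whole input.
pub fn total_size<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut total = 0u64;
    while let Some(rec) = read_record(reader)? {
        let size = u32::from_be_bytes([rec[4], rec[5], rec[6], rec[7]]);
        total += u64::from(size);
    }
    Ok(total)
}
```

Distinguishing a clean end of stream from a truncated record is the part that a bare loop around read_exact() would miss.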
Use Appropriate Collection Types
| Use Case | Type |
|---|---|
| Ordered, indexed access | Vec<T> |
| Key-value lookup | HashMap<K, V> or BTreeMap<K, V> |
| Unique values | HashSet<T> or BTreeSet<T> |
| Small fixed-size | [T; N] or ArrayVec<T, N> |
| Bytes | Bytes (from bytes crate) for shared ownership |
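When choosing between HashMap and BTreeMap, iteration order is often the deciding factor: BTreeMap iterates in ascending key order, while HashMap iteration order is unspecified. A minimal sketch, using a hypothetical sorted_keys helper:

```rust
use std::collections::BTreeMap;

/// Collects entries into a BTreeMap and returns the keys in iteration order,
/// which for BTreeMap is ascending by key.
pub fn sorted_keys(entries: &[(u32, &str)]) -> Vec<u32> {
    let map: BTreeMap<u32, &str> = entries.iter().copied().collect();
    map.keys().copied().collect()
}
```

This makes BTreeMap a reasonable default wherever output must be deterministic, such as when writing sorted index files, at the cost of O(log n) rather than average O(1) lookups.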
Unsafe Code
Unsafe code requires explicit documentation:
/// # Safety
///
/// Caller must ensure:
/// - `ptr` is valid for reads of `len` bytes
/// - `ptr` is properly aligned for `T`
/// - The memory is not mutated during this call
pub unsafe fn read_from_ptr<T>(ptr: *const u8, len: usize) -> T {
    // ...
}
Prefer safe abstractions when possible. Use unsafe only when necessary for performance or FFI.
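As a sketch of what a safe abstraction over unsafe code can look like, the function below checks the preconditions itself and exposes a safe signature, with a SAFETY comment justifying the block. A fully safe equivalent follows for comparison. Both function names are hypothetical.

```rust
use std::mem::size_of;
use std::ptr;

/// Safe wrapper: validates the length, then performs an unaligned read.
pub fn read_u32_be(bytes: &[u8]) -> Option<u32> {
    if bytes.len() < size_of::<u32>() {
        return None;
    }
    // SAFETY: the length check above guarantees 4 readable bytes,
    // read_unaligned has no alignment requirement, and every bit
    // pattern is a valid u32.
    let raw = unsafe { ptr::read_unaligned(bytes.as_ptr().cast::<u32>()) };
    // The bytes were stored big-endian; convert to native order.
    Some(u32::from_be(raw))
}

/// Fully safe equivalent using a slice pattern.
pub fn read_u32_be_safe(bytes: &[u8]) -> Option<u32> {
    match bytes {
        [a, b, c, d, ..] => Some(u32::from_be_bytes([*a, *b, *c, *d])),
        _ => None,
    }
}
```

Since both versions return the same results, the safe one is the better default; the unsafe variant would need a measured performance benefit to justify itself.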
WASM Compatibility
Core libraries must compile to WASM. Avoid:
- C dependencies (use pure Rust implementations)
- File system access in library code
- Platform-specific code without #[cfg] guards
Test WASM compilation:
cargo check --target wasm32-unknown-unknown -p cascette-crypto
cargo check --target wasm32-unknown-unknown -p cascette-formats
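One way to follow these rules is to keep the core logic on byte slices and put any file access in a thin wrapper behind a #[cfg] guard. The sketch below uses hypothetical function names and is not taken from the cascette-rs source.

```rust
/// Portable core: operates on bytes only, so it works on any target, wasm32 included.
pub fn has_blte_magic(data: &[u8]) -> bool {
    data.starts_with(b"BLTE")
}

/// Native-only convenience wrapper; compiled out on wasm32, where no usable
/// file system is available.
#[cfg(not(target_arch = "wasm32"))]
pub fn file_has_blte_magic(path: &std::path::Path) -> std::io::Result<bool> {
    use std::io::Read;

    let mut magic = [0u8; 4];
    let mut file = std::fs::File::open(path)?;
    match file.read_exact(&mut magic) {
        Ok(()) => Ok(has_blte_magic(&magic)),
        // A file shorter than 4 bytes simply has no magic.
        Err(e) if e.kind() == std::io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}
```

WASM callers pass in bytes they obtained themselves (for example from a fetch), while native callers can use the wrapper.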