cascette-rs Documentation

About This Project

cascette-rs is part of the wowemulation-dev initiative to build open source tooling for World of Warcraft emulation. The project focuses on modern WoW Classic clients (1.13+, 2.5+, 3.4+) which use Blizzard’s NGDP content distribution system.

Why Modern Clients?

The WoW emulation and modding community has historically focused on the 3.3.5a client from 2008. While functional, this approach has limitations:

  • Outdated technology: MPQ archives, no content addressing, manual patching
  • Fragmented tooling: Many tools exist only as abandoned Windows binaries
  • Limited modding: Technical constraints restrict what can be modified

Modern Classic clients differ from 3.3.5a:

  • Active development: Blizzard continues updating these clients
  • Better architecture: NGDP/CASC enables content addressing and streaming
  • Cross-platform: Same content format works on Windows, macOS, and Linux
  • Preservation: Community CDN mirrors ensure historical builds remain available

What You Can Do with cascette-rs

For Emulator Developers

  • Download specific WoW Classic builds for server development
  • Extract game data files (DBCs, maps, models) for server-side use
  • Verify client installations match expected versions
  • Serve game content to clients via the Agent API

For Archivists

  • Mirror complete WoW builds from Blizzard’s CDN
  • Preserve historical game versions before they disappear
  • Access builds from community CDN mirrors when Blizzard removes them
  • Track version history across all WoW products

For Modders

  • Extract assets from any WoW build for modification
  • Understand file relationships through encoding and root manifests
  • Work with modern file formats instead of legacy MPQ tools
  • Build custom content distribution for modified clients

For Tool Developers

  • Parse all NGDP/CASC binary formats with the cascette-formats library
  • Build applications on top of cascette’s CDN and protocol layers
  • Integrate CASC reading into existing toolchains
  • Create cross-platform tools that work on Linux, macOS, and Windows

What is NGDP?

NGDP (Next Generation Distribution Pipeline) is Blizzard’s content distribution system. It replaced MPQ archives and P2P/torrent distribution, starting with World of Warcraft 6.0 in 2014.

For technical details, see NGDP on wowdev.wiki.

System Overview

NGDP consists of three components:

  1. Ribbit API: Provides product versions, CDN endpoints, and configuration data
  2. CDN Distribution: Delivers game content through HTTP/HTTPS
  3. Agent: Local HTTP service (port 1120) that manages downloads and installations

Key Differences from MPQ

  • Distribution Method: CDN-based delivery instead of P2P/Torrent

  • Content Addressing: Files identified by content hashes rather than names

  • Update Mechanism: Incremental updates through partial file downloads

  • Archive Format: CASC (Content Addressable Storage Container) replaces MPQ archives

  • Content Protection: Encryption support for secure pre-release distribution

Benefits of NGDP

For Distribution

  • Reduced Server Load: CDN infrastructure handles content delivery

  • Faster Downloads: Users connect to nearest CDN nodes

  • Incremental Updates: Only changed content needs downloading

  • Parallel Downloads: Multiple files retrieved simultaneously

  • Pre-release Distribution: Encrypted content can be distributed before launch

For Development

  • Content Deduplication: Identical files stored once across versions

  • Stream Installation: Games playable before download completes

  • Platform Independence: Same content system across operating systems

Core Concepts

Content Addressing

Files are identified by MD5 hashes of their content. Identical content produces identical hashes, enabling deduplication, integrity verification, and cache efficiency.

System Files

NGDP uses metadata files to manage content:

  • Root File: Maps game files to content keys

  • Encoding File: Maps content keys to encoding keys (the hashes of the compressed/encoded data)

  • Install Manifest: Defines installation requirements

  • Download Manifest: Sets download priorities

BLTE Format

BLTE (Block Table Encoded) is the container format for game data. It supports:

  • Block-based compression

  • Multiple compression algorithms

  • Encryption per block

  • Chunked processing

Content Encryption

NGDP supports Salsa20 encryption for distributing content before its release date. Files can be pre-positioned on CDN while remaining inaccessible until decryption keys are provided.

Technical Specifications

  • Byte Order: Big-endian (network byte order)

  • Hash Algorithm: MD5 for content identification

  • Key Size: 128-bit (16 bytes)

  • Compression: zlib, lz4, and other algorithms per block

  • Encryption: Salsa20 stream cipher for content protection

Format Organization

NGDP/CASC formats are organized by their storage location and usage context:

1. CDN Formats (Network/Remote)

Formats served by Blizzard CDN servers via HTTP/HTTPS.

2. CASC Formats (Local/Client)

Formats created and managed by the Battle.net client on local storage.

3. Shared Formats

Formats used in both CDN and local contexts.

Component Documentation

Service Discovery

Service discovery components handle version information, CDN endpoint discovery, and product configuration metadata:

  • Ribbit Protocol - TCP-based discovery and version information API

  • BPSV Format - Blizzard Pipe-Separated Values format for API responses

CDN Formats

Configuration Files (Text)

  • Build Config - Build-specific settings (/config/{hash})

  • CDN Config - CDN server and archive lists (/config/{hash})

  • Product Config - Product settings and versions (/config/{hash})

  • Patch Config - Differential patch information (/config/{hash})

Content Files (Binary)

Immutable, content-addressed files served from CDN:

  • CDN Archives - BLTE containers with game content (/data/{prefix}/{hash}.archive)

  • CDN Indices - Maps keys to archive locations (/data/{prefix}/{hash}.index)

  • Encoding File - Maps content to encoding keys (/data/{prefix}/{hash})

  • Root File - Maps files to content keys (/data/{prefix}/{hash})

  • Install Manifest - Installation requirements (/data/{prefix}/{hash})

  • Download Manifest - Download priorities (/data/{prefix}/{hash})

  • Patch Archives - Delta patches (/patch/{prefix}/{hash}.archive)

  • Patch Indices - Patch archive index (/patch/{prefix}/{hash}.index)

Modern Additions (WoW 8.2+)

  • TVFS - Virtual file system manifest (via vfs-* fields in BuildConfig)

CASC Local Formats

Client-side storage structures created and managed by Battle.net:

Local Indices

  • IDX Journal - Bucket-based local index (Data/indices/{bucket}.idx)

  • Archive Groups - Combined archive index (client-generated optimization)

  • Shadow Memory - Memory-mapped cache (Data/shmem)

Local Archives

  • data.### - Combined CDN archives (Data/data/data.###)

  • patch.### - Combined patch archives (Data/patch/patch.###)

Local Configuration

  • .build.info - Local build configuration (root directory), BPSV-formatted

  • DBCache - Hotfix database cache (Cache/ADB/*.bin)

Shared Formats

Container Formats

  • BLTE Format - Block compression/encryption (all content storage)

  • ESpec Format - Encoding specifications (compression definitions)

Cryptographic

  • MD5 Keys - Content addressing (all key references)

  • Salsa20 Encryption - Stream cipher (content protection)

  • TACT Keys - Key management (decryption keys)

Supporting Systems

  • CDN Architecture - Content distribution network structure

  • CDN Mirroring - Historical preservation strategies

  • FileDataId - Persistent file identification across builds

Format Relationships

CDN Download Flow

flowchart TB
    subgraph Discovery
        Ribbit["Ribbit (BPSV)"]
        ProductConfig["Product Config"]
        CDNConfig["CDN Config"]
        BuildConfig["Build Config"]
    end

    subgraph Content
        Archives["CDN Archives + Indices"]
        Encoding["Encoding File"]
        Root["Root File"]
        Manifests["Install/Download Manifests"]
    end

    Ribbit --> ProductConfig --> CDNConfig --> BuildConfig
    BuildConfig --> Archives
    Archives --> Encoding --> Root
    Root --> Manifests

Content Resolution

flowchart LR
    subgraph Input
        File["Filename/FileDataId"]
    end

    subgraph Lookup
        Root["Root File"]
        CKey["Content Key"]
        Encoding["Encoding File"]
        EKey["Encoding Key + ESpec"]
    end

    subgraph Retrieval
        Index["CDN Index"]
        Location["Archive Location"]
        Archive["CDN Archive"]
        BLTE["BLTE Data"]
    end

    subgraph Output
        Decompress["Decompression"]
        Raw["Raw Content"]
    end

    File --> Root --> CKey
    CKey --> Encoding --> EKey
    EKey --> Index --> Location
    Location --> Archive --> BLTE
    BLTE --> Decompress --> Raw
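
The lookup chain in this flowchart can be sketched with in-memory maps. This is a toy model, not cascette-rs code: the `Resolver` and `ArchiveLocation` types and their fields are hypothetical, and real root, encoding, and index files are paged binary structures rather than hash maps.

```rust
use std::collections::HashMap;

type Key = [u8; 16];

/// Where an encoded file lives inside a CDN archive (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct ArchiveLocation {
    archive: Key,
    offset: u32,
    size: u32,
}

/// Toy stand-ins for the root file, encoding file, and CDN index.
#[derive(Default)]
struct Resolver {
    root: HashMap<u32, Key>,              // FileDataID -> CKey
    encoding: HashMap<Key, Key>,          // CKey -> EKey
    index: HashMap<Key, ArchiveLocation>, // EKey -> archive location
}

impl Resolver {
    /// Follow FileDataID -> CKey -> EKey -> archive location.
    /// Returns None as soon as any stage has no entry.
    fn resolve(&self, file_data_id: u32) -> Option<&ArchiveLocation> {
        let ckey = self.root.get(&file_data_id)?;
        let ekey = self.encoding.get(ckey)?;
        self.index.get(ekey)
    }
}
```

Each stage can fail independently, which is why the sketch returns `Option`: a FileDataID may be absent from the selected root blocks, and an EKey may not appear in any downloaded index.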

Glossary

Key terms used throughout this documentation. If you’re coming from the MPQ/3.3.5a modding scene, pay attention to the “MPQ Equivalent” notes.

Content Identification

Content Key (CKey)

MD5 hash of a file’s uncompressed content. Used to identify files regardless of how they’re compressed or stored.

  • Size: 16 bytes (128 bits)
  • MPQ Equivalent: Similar to how MPQ uses filenames, but content-based
  • Example: a1b2c3d4e5f6... (32 hex characters)

Encoding Key (EKey)

MD5 hash of a file’s compressed/encoded BLTE data. Used to locate files on CDN and in archives.

  • Size: 16 bytes (128 bits)
  • Relationship: CKey → Encoding File → EKey
  • Example: Files with identical content share a CKey but may have different EKeys

FileDataID (FDID)

Numeric identifier for a file, persistent across game versions. Replaced filename-based lookups in WoW 8.0+.

  • Size: 4 bytes (32-bit integer)
  • Range: 0 to ~4 million (as of 2024)
  • MPQ Equivalent: None - MPQ used filenames exclusively
  • Example: 1234567 refers to a specific texture, model, or data file

Name Hash

Jenkins96 hash of a file’s path. Used in older builds (pre-8.0) to look up files by name.

  • Algorithm: Jenkins96 (lookup3)
  • MPQ Equivalent: Similar to MPQ’s hash table for filename lookup
  • Note: Deprecated in favor of FileDataID in modern builds

File Formats

BLTE (Block Table Encoded)

Container format that wraps all CASC content. Provides compression and optional encryption.

  • MPQ Equivalent: Similar to MPQ’s sector-based compression
  • Key difference: BLTE supports multiple compression algorithms per file
  • Compression: None, zlib, LZMA, LZ4, Zstd
  • Encryption: Salsa20, ARC4 (older builds)

Encoding File

Maps CKeys to EKeys. The central lookup table for content resolution.

  • Purpose: Find where a file’s compressed data lives
  • MPQ Equivalent: None - MPQ stored files directly by name

Root File

Maps FileDataIDs (or name hashes) to CKeys. The entry point for file lookup.

  • Purpose: Find what content hash a file has
  • MPQ Equivalent: Combines MPQ’s hash table and block table functions
  • Contains: FileDataID, locale flags, content flags, CKey

Install Manifest

Lists files required for a minimal installation (enough to launch the game).

  • Purpose: Prioritize essential files for streaming installs
  • MPQ Equivalent: None - MPQ required full downloads

Download Manifest

Prioritizes files for background downloading after initial install.

  • Purpose: Order non-essential downloads by importance
  • MPQ Equivalent: None

Storage Concepts

Archive

Large file containing many compressed files, identified by EKey.

  • CDN archives: ~256 MB bundles served via HTTP
  • Local archives: data.xxx files in the Data folder
  • MPQ Equivalent: Similar to .mpq files, but content-addressed

Archive Index

Maps EKeys to offsets within an archive file.

  • CDN index: .index file paired with each archive
  • Local index: .idx files in Data/indices/
  • MPQ Equivalent: Similar to MPQ’s block table

Archive Group

Combined index covering multiple archives. Optimization for faster lookups.

  • Location: Generated locally by the client from downloaded archive indices
  • Purpose: Single lookup instead of checking each archive index
  • Note: Never downloaded from CDN - always client-generated

CASC (Content Addressable Storage Container)

The local storage system. Everything is identified by content hash.

  • MPQ Equivalent: Replaces MPQ archives entirely
  • Key difference: Files found by hash, not by name

Network Concepts

CDN (Content Delivery Network)

Servers that host game content. Blizzard uses Akamai, Level3, and others.

  • Structure: https://{cdn}/{product}/{type}/{hash[:2]}/{hash[2:4]}/{hash}
  • Types: config, data, patch
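
As an illustration of the URL structure above, the following sketch assembles a CDN URL from its parts. The function name `cdn_url` and its parameters are hypothetical, and lowercasing the hash is an assumption about how paths are normalized.

```rust
/// Build a CDN URL from a 32-character hex hash.
/// Returns None if the hash is not 32 hex digits (a 16-byte key).
fn cdn_url(host: &str, product: &str, kind: &str, hash: &str) -> Option<String> {
    let is_hex = hash.len() == 32 && hash.bytes().all(|b| b.is_ascii_hexdigit());
    if !is_hex {
        return None;
    }
    // Assumption: CDN paths use lowercase hex.
    let hash = hash.to_ascii_lowercase();
    // The first two byte pairs of the hash become directory levels.
    Some(format!(
        "https://{}/{}/{}/{}/{}/{}",
        host, product, kind, &hash[..2], &hash[2..4], hash
    ))
}
```

Here `kind` is one of the types listed above (`config`, `data`, `patch`), and `host` and `product` would come from the CDN information returned by Ribbit.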

Ribbit

Protocol for querying product versions and CDN information.

  • Port: 1119 (TCP) or HTTP
  • Purpose: Discover what versions exist and where to download them
  • MPQ Equivalent: None - MPQ versions were distributed manually

Agent

Local HTTP service (port 1120) that manages downloads and installations.

  • Purpose: Background downloading, installation management
  • MPQ Equivalent: None - MPQ required manual patching

Configuration

Build Config

Per-build settings including root/encoding file hashes and encryption keys.

  • Location: CDN /config/{hash}
  • Contains: Root CKey, encoding CKey, patch info, VFS info

CDN Config

Lists available CDN servers and archive hashes.

  • Location: CDN /config/{hash}
  • Contains: Archive list, server URLs, file groups

Product Config

Product-wide settings spanning multiple builds.

  • Location: CDN /config/{hash}
  • Contains: Decryption keys, feature flags

Encryption

TACT Key

Encryption key for protected content. Named keys are published, unnamed are secret.

  • Size: 16 bytes
  • Algorithm: Used with Salsa20 stream cipher
  • Source: Community-maintained key databases

Salsa20

Stream cipher used for content encryption in modern builds.

  • Key size: 128 bits (the 16-byte TACT key); the nonce comes from the IV stored with each encrypted BLTE block
  • Replaces: ARC4 (used in older builds)

MPQ to CASC Quick Reference

| MPQ Concept | CASC Equivalent |
|---|---|
| .mpq file | Archive (data.xxx) |
| Filename | FileDataID or CKey |
| Hash table | Root file |
| Block table | Archive index |
| Sector compression | BLTE blocks |
| Patch MPQ | Patch archives + encoding |
| listfile.txt | Community listfiles |
| Manual patching | Agent + CDN |

Encoding File Format

The encoding file is the gateway to all CASC content. It maps content keys (unencoded file hashes) to encoding keys (encoded/compressed file hashes) and provides essential metadata for content resolution.

Overview

The encoding file serves multiple critical functions:

  1. Content Resolution: Maps content keys to encoding keys for CDN retrieval
  2. Compression Metadata: Specifies ESpec encoding for each file
  3. Size Information: Tracks both compressed and decompressed sizes
  4. Multi-Version Support: Handles multiple encoding keys per content key

File Structure

The encoding file is BLTE-encoded and consists of:

[BLTE Container]
  [Header]           (22 bytes)
  [ESpec Table]      (variable)
  [CKey Page Index]  (variable)
  [CKey Pages]       (variable)
  [EKey Page Index]  (variable)
  [EKey Pages]       (variable)
  [File ESpec]       (variable) - The encoding file's own ESpec

Binary Format

Header (22 bytes)

struct EncodingHeader {
    uint16_t magic;           // 0x00: 'EN' (0x454E)
    uint8_t  version;         // 0x02: Version (1)
    uint8_t  ckey_size;       // 0x03: Content key size (16)
    uint8_t  ekey_size;       // 0x04: Encoding key size (16)
    uint16_t ckey_page_size;  // 0x05: CKey page size in KB (BE)
    uint16_t ekey_page_size;  // 0x07: EKey page size in KB (BE)
    uint32_t ckey_page_count; // 0x09: Number of CKey pages (BE)
    uint32_t ekey_page_count; // 0x0D: Number of EKey pages (BE)
    uint8_t  flags;           // 0x11: Flags (must be 0)
    uint32_t espec_size;      // 0x12: ESpec table size (BE)
};
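
A minimal sketch of reading this 22-byte header from an already BLTE-decoded buffer, using only the standard library. The `EncodingHeader` struct and `parse_encoding_header` function here are illustrative and are not the cascette-formats API; field offsets follow the C struct above.

```rust
#[derive(Debug, PartialEq)]
struct EncodingHeader {
    version: u8,
    ckey_size: u8,
    ekey_size: u8,
    ckey_page_size_kb: u16,
    ekey_page_size_kb: u16,
    ckey_page_count: u32,
    ekey_page_count: u32,
    flags: u8,
    espec_size: u32,
}

fn parse_encoding_header(data: &[u8]) -> Result<EncodingHeader, String> {
    if data.len() < 22 {
        return Err(format!("header needs 22 bytes, got {}", data.len()));
    }
    if &data[0..2] != b"EN" {
        return Err("missing 'EN' magic".to_string());
    }
    // All multi-byte fields are big-endian and unaligned.
    let be16 = |o: usize| u16::from_be_bytes([data[o], data[o + 1]]);
    let be32 = |o: usize| u32::from_be_bytes([data[o], data[o + 1], data[o + 2], data[o + 3]]);
    Ok(EncodingHeader {
        version: data[2],
        ckey_size: data[3],
        ekey_size: data[4],
        ckey_page_size_kb: be16(0x05),
        ekey_page_size_kb: be16(0x07),
        ckey_page_count: be32(0x09),
        ekey_page_count: be32(0x0D),
        flags: data[0x11],
        espec_size: be32(0x12),
    })
}
```

Because the fields sit at odd offsets, the sketch reads them byte by byte instead of casting the buffer to a packed struct.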

ESpec String Table

Immediately follows the header. Contains null-terminated strings referenced by entries:

"z\0b:{0,4}\0b:{0,4},z\0b:{0,2},z:{0,6}\0...\0"

Common ESpec patterns:

  • z - ZLib compression

  • n - No compression

  • b:{start,size} - Block encoding (see ESpec)

  • Empty string for uncompressed files

Page Index Tables

CKey Page Index

For each CKey page:

struct PageIndex {
    uint8_t first_key[ckey_size];  // First key in the page
    uint8_t page_hash[16];         // MD5 of the page data
};

EKey Page Index

Similar structure but uses ekey_size for the first key.

Content Key (CKey) Pages

Pages are sorted by content key for binary search. Each page contains multiple entries:

struct CKeyEntry {
    uint8_t  ekey_count;                    // Number of encoding keys
    uint8_t  file_size[5];                  // Decompressed size (40-bit BE)
    uint8_t  ckey[ckey_size];               // Content key
    uint8_t  ekeys[ekey_size * ekey_count]; // Encoding keys
};

Entry layout (sizes from header):

[count:1] [size:5] [ckey:ckey_size] [ekey1:ekey_size] [ekey2:ekey_size] ...
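
The entry layout can be walked as follows. This is a sketch under stated assumptions: the `CKeyEntry` type and function names are illustrative, keys are copied into owned vectors for simplicity, and an `ekey_count` of zero is treated as the start of page padding.

```rust
/// One parsed CKey page entry (owned copies of the keys).
#[derive(Debug, PartialEq)]
struct CKeyEntry {
    file_size: u64,
    ckey: Vec<u8>,
    ekeys: Vec<Vec<u8>>,
}

/// Decode a 40-bit big-endian integer from a five-byte slice.
fn read_u40_be(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
}

/// Walk a single CKey page until padding or the end of the page.
fn parse_ckey_page(page: &[u8], ckey_size: usize, ekey_size: usize) -> Vec<CKeyEntry> {
    let mut entries = Vec::new();
    if ckey_size == 0 || ekey_size == 0 {
        return entries; // invalid sizes: nothing sensible to parse
    }
    let mut pos = 0;
    while pos < page.len() {
        let ekey_count = usize::from(page[pos]);
        if ekey_count == 0 {
            break; // zero count marks the padded tail of the page
        }
        let ckey_start = pos + 6; // 1-byte count + 5-byte size
        let ekeys_start = ckey_start + ckey_size;
        let end = ekeys_start + ekey_size * ekey_count;
        if end > page.len() {
            break; // truncated entry: stop rather than read out of bounds
        }
        entries.push(CKeyEntry {
            file_size: read_u40_be(&page[pos + 1..ckey_start]),
            ckey: page[ckey_start..ekeys_start].to_vec(),
            ekeys: page[ekeys_start..end]
                .chunks_exact(ekey_size)
                .map(|k| k.to_vec())
                .collect(),
        });
        pos = end;
    }
    entries
}
```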

Multiple EKeys: A single content key can map to multiple encoding keys, allowing:

  • Different compression algorithms for the same content

  • Regional variations with different encryption

  • Platform-specific optimizations

Encoding Key (EKey) Pages

Maps encoding keys to ESpec entries:

struct EKeyEntry {
    uint8_t  ekey[ekey_size];     // Encoding key
    uint32_t espec_index;          // Index into ESpec table (BE)
    uint8_t  file_size[5];         // Encoded file size (40-bit BE)
};

Padding Detection: EKey pages may contain padding entries that must be skipped. Two sentinel patterns indicate padding:

  1. espec_index == 0xFFFFFFFF (Agent.exe sentinel)
  2. espec_index == 0 with all key bytes 0x00 (zero-fill padding)
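
Both padding patterns can be checked with one predicate while walking a page. The sketch below assumes fixed-size entries of `ekey_size + 9` bytes, as in the struct above; the `EKeyEntry` type and function names are illustrative.

```rust
#[derive(Debug, PartialEq)]
struct EKeyEntry {
    ekey: Vec<u8>,
    espec_index: u32,
    encoded_size: u64,
}

/// True for either sentinel pattern described above.
fn is_ekey_padding(ekey: &[u8], espec_index: u32) -> bool {
    espec_index == 0xFFFF_FFFF || (espec_index == 0 && ekey.iter().all(|&b| b == 0))
}

/// Parse one EKey page, skipping padding entries.
fn parse_ekey_page(page: &[u8], ekey_size: usize) -> Vec<EKeyEntry> {
    let entry_len = ekey_size + 4 + 5; // key + u32 index + 40-bit size
    page.chunks_exact(entry_len)
        .filter_map(|raw| {
            let (ekey, rest) = raw.split_at(ekey_size);
            let espec_index = u32::from_be_bytes([rest[0], rest[1], rest[2], rest[3]]);
            if is_ekey_padding(ekey, espec_index) {
                return None;
            }
            // 40-bit big-endian encoded size
            let encoded_size = rest[4..9]
                .iter()
                .fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
            Some(EKeyEntry { ekey: ekey.to_vec(), espec_index, encoded_size })
        })
        .collect()
}
```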

Content Resolution Process

  1. Find CKey Entry:

    • Binary search CKey page index for target page
    • Linear search within page for content key
    • Extract encoding key(s) and decompressed size
  2. Find EKey Entry (optional):

    • Binary search EKey page index
    • Locate entry to get ESpec index and compressed size
  3. Parse ESpec:

    • Index into ESpec string table
    • Parse encoding specification for compression details
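
The binary search over a page index in steps 1 and 2 can be sketched with `slice::partition_point`. The function name `find_page` is illustrative, and the sketch assumes 16-byte keys and that the first keys are already in ascending order.

```rust
/// Return the index of the only page that may contain `target`,
/// given the first key of every page in ascending order.
fn find_page(first_keys: &[[u8; 16]], target: &[u8; 16]) -> Option<usize> {
    // Count the pages whose first key is <= target. Byte arrays compare
    // lexicographically, which matches the on-disk sort order.
    let candidates = first_keys.partition_point(|first| first <= target);
    // If no page starts at or before the target, the key cannot exist.
    candidates.checked_sub(1)
}
```

The returned page still has to be scanned linearly, since the index only records the first key of each page.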

Usage

Parsing

#![allow(unused)]
fn main() {
use cascette_formats::encoding::EncodingFile;

// From decompressed data
let encoding = EncodingFile::parse(&data)?;

// From BLTE-encoded CDN data
let encoding = EncodingFile::parse_blte(&blte_data)?;
}

Content Key Lookup

#![allow(unused)]
fn main() {
use cascette_crypto::ContentKey;

// Single lookup (binary search on page index, linear within page)
if let Some(ekey) = encoding.find_encoding(&content_key) {
    println!("Encoding key: {:?}", ekey);
}

// Get all encoding keys for a content key
let ekeys = encoding.find_all_encodings(&content_key);

// Batch lookup (sort-merge across pages)
let results = encoding.batch_find_encodings(&content_keys);
}

EKey to ESpec Lookup

#![allow(unused)]
fn main() {
use cascette_crypto::EncodingKey;

if let Some(espec) = encoding.find_espec(&encoding_key) {
    println!("Compression spec: {}", espec);
}
}

Building

#![allow(unused)]
fn main() {
use cascette_formats::encoding::{EncodingBuilder, CKeyEntryData, EKeyEntryData};

let mut builder = EncodingBuilder::new(); // 4KB pages
builder.add_ckey_entry(CKeyEntryData {
    content_key,
    file_size: 524_288,
    encoding_keys: vec![encoding_key],
});
builder.add_ekey_entry(EKeyEntryData {
    encoding_key,
    espec: "z".to_string(),
    file_size: 187_234,
});
let encoding_file = builder.build()?;
}

Page Structure

All pages are loaded eagerly. Each page preserves its original binary data for byte-exact round-trip reconstruction:

#![allow(unused)]
fn main() {
// Page<T> holds parsed entries and raw bytes
pub struct Page<T> {
    pub entries: Vec<T>,
    pub original_data: Vec<u8>,
}

// IndexEntry holds first key + MD5 checksum for integrity
pub struct IndexEntry {
    pub first_key: [u8; 16],
    pub checksum: [u8; 16],
}
}

All multi-byte header and page fields are big-endian.

ESpec Integration

The ESpec strings define how files are encoded:

Common Patterns

  1. Uncompressed: Empty string or n
  2. ZLib: z
  3. Partial compression: b:{0,1000},z,b:{1000,500},n
    • Bytes 0-999: ZLib compressed
    • Bytes 1000-1499: Uncompressed

Parsing ESpec

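#![allow(unused)]
fn main() {
#[derive(Debug, PartialEq)]
enum ESpecOp {
    None,
    ZLib,
    ByteRange { start: u32, size: u32 },
}

// Parse "b:{start,size}" into (start, size); None if malformed.
fn parse_range(s: &str) -> Option<(u32, u32)> {
    let inner = s.strip_prefix("b:{")?.strip_suffix('}')?;
    let (start, size) = inner.split_once(',')?;
    Some((start.trim().parse().ok()?, size.trim().parse().ok()?))
}

// Split on commas outside braces, so "b:{0,1000}" stays in one piece.
fn split_top_level(spec: &str) -> Vec<&str> {
    let (mut parts, mut depth, mut start) = (Vec::new(), 0usize, 0usize);
    for (i, c) in spec.char_indices() {
        match c {
            '{' => depth += 1,
            '}' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => {
                parts.push(&spec[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    parts.push(&spec[start..]);
    parts
}

fn parse_espec(spec: &str) -> Vec<ESpecOp> {
    if spec.is_empty() || spec == "n" {
        return vec![ESpecOp::None];
    }

    split_top_level(spec)
        .into_iter()
        .map(|part| match part {
            "z" => ESpecOp::ZLib,
            "n" => ESpecOp::None,
            s if s.starts_with("b:") => match parse_range(s) {
                Some((start, size)) => ESpecOp::ByteRange { start, size },
                None => ESpecOp::None, // malformed range: treat as no-op
            },
            _ => ESpecOp::None,
        })
        .collect()
}
}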

Multi-Version Support

Files can have multiple encoding keys (different compression/encryption):

#![allow(unused)]
fn main() {
struct CKeyEntry {
    ekey_count: u8,        // Usually 1, can be 2+
    file_size: u64,        // Same for all versions
    ckey: [u8; 16],        // Content key
    ekeys: Vec<[u8; 16]>,  // Multiple encoding keys
}
}

Use cases include different regional encryption and progressive quality levels.

Performance Considerations

Memory-Mapped Access

For large encoding files (100MB+):

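#![allow(unused)]
fn main() {
use std::fs::File;
use std::path::Path;

use memmap2::{Mmap, MmapOptions};

struct EncodingFile {
    mmap: Mmap,
    header: EncodingHeader,
    // ...
}

impl EncodingFile {
    // `Result` is the crate's own result alias.
    fn open(path: &Path) -> Result<Self> {
        let file = File::open(path)?;
        // SAFETY: the mapping is only sound while no other process
        // truncates or rewrites the file underneath it.
        let mmap = unsafe { MmapOptions::new().map(&file)? };

        // Parse header from mmap (first 22 bytes; panics on shorter files)
        let header = EncodingHeader::read(&mmap[..22])?;

        Ok(Self { mmap, header })
    }
}
}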

Page Caching

Cache frequently accessed pages:

#![allow(unused)]
fn main() {
struct PageCache {
    entries: LruCache<u32, Arc<CKeyPage>>,
}
}

Validation

Checksums

Each page has an MD5 checksum in the index:

#![allow(unused)]
fn main() {
// `IndexEntry` is the struct shown under Page Structure
fn validate_page(index: &IndexEntry, data: &[u8]) -> bool {
    let computed = md5::compute(data);
    computed.0 == index.checksum
}
}

Size Constraints

  • Page sizes must be > 0 (no power-of-2 requirement enforced)

  • Key sizes in range [1, 16] bytes

  • Page counts must be > 0

  • ESpec size must be > 0

  • File sizes use 40-bit integers (up to 1TB)
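
These constraints translate directly into a header check. The sketch below is illustrative only: the `HeaderFields` struct and the error strings are hypothetical, and a real parser would likely use a dedicated error type.

```rust
/// The header fields that the constraints above apply to.
struct HeaderFields {
    ckey_size: u8,
    ekey_size: u8,
    ckey_page_size_kb: u16,
    ekey_page_size_kb: u16,
    ckey_page_count: u32,
    ekey_page_count: u32,
    espec_size: u32,
}

/// Largest value a 40-bit size field can hold (2^40 - 1 bytes).
const MAX_FILE_SIZE: u64 = (1 << 40) - 1;

fn validate_header(h: &HeaderFields) -> Result<(), &'static str> {
    if !(1..=16).contains(&h.ckey_size) || !(1..=16).contains(&h.ekey_size) {
        return Err("key sizes must be between 1 and 16 bytes");
    }
    if h.ckey_page_size_kb == 0 || h.ekey_page_size_kb == 0 {
        return Err("page sizes must be greater than zero");
    }
    if h.ckey_page_count == 0 || h.ekey_page_count == 0 {
        return Err("page counts must be greater than zero");
    }
    if h.espec_size == 0 {
        return Err("ESpec table must not be empty");
    }
    Ok(())
}
```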

File’s Own ESpec

After all the data structures, the encoding file contains its own ESpec string describing how it’s compressed. This self-referential metadata is an intentional, documented feature of the NGDP format.

Official Documentation

The wowdev.wiki TACT specification explicitly lists this as the 5th component:

  1. Header
  2. Encoding specification data (ESpec)
  3. Content key → encoding key table
  4. Encoding key → encoding spec table
  5. “Encoding specification data for the encoding file itself”

Reference Implementation

TACT.Net explicitly handles this in EncodingFile.cs:

  • Line 151: // remainder is an ESpec block for the file itself

  • Implements GetFileESpec() method to generate this when writing

Real-World Examples

wow_classic 5.5.0.62655 (60 bytes):

b:{22=n,76025=z,223424=n,28598272=n,146656=n,18771968=n,*=z}

wow_classic_era 1.15.7.61582 (55 bytes):

b:{22=n,2069=z,65536=n,8388608=n,43008=n,5505024=n,*=z}

Meaning:

  • 22=n: Header (22 bytes) uncompressed

  • 76025=z: ESpec table compressed with ZLib

  • 223424=n: CKey index uncompressed

  • 28598272=n: CKey pages uncompressed

  • 146656=n: EKey index uncompressed

  • 18771968=n: EKey pages uncompressed

  • *=z: Remainder (the file’s own ESpec) compressed

This self-referential design allows files to describe their own compression structure using the same ESpec format as all other files.
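
Strings of this shape can be split into (size, codec) pairs with a few lines of Rust. This sketch only handles the flat `b:{size=codec,...,*=codec}` form shown in the two examples above; nested specs, size suffixes, and codec parameters are out of scope. The `BlockSize` type and `parse_block_espec` function are illustrative names.

```rust
#[derive(Debug, PartialEq)]
enum BlockSize {
    /// Explicit block length in bytes.
    Bytes(u64),
    /// `*`: everything that remains after the sized blocks.
    Rest,
}

/// Parse a flat block-table ESpec into (size, codec) pairs.
/// Returns None if the string does not match the expected shape.
fn parse_block_espec(spec: &str) -> Option<Vec<(BlockSize, String)>> {
    let inner = spec.strip_prefix("b:{")?.strip_suffix('}')?;
    inner
        .split(',')
        .map(|chunk| {
            let (size, codec) = chunk.split_once('=')?;
            let size = if size == "*" {
                BlockSize::Rest
            } else {
                BlockSize::Bytes(size.parse().ok()?)
            };
            Some((size, codec.to_string()))
        })
        .collect()
}
```

For the wow_classic_era string, the parsed block sizes line up with the file layout: a 65536-byte CKey index at 32 bytes per entry (16-byte first key plus 16-byte MD5) corresponds to 2048 pages, and 2048 pages of 4 KB give the 8388608-byte CKey page block.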

Common Issues

  1. Page Boundary Errors: Entries never span pages; the unused tail of each page is padding
  2. Endianness: All multi-byte values are big-endian
  3. ESpec Index: Zero-based into string table
  4. CKey Padding: Entries with ekey_count = 0 indicate end of page data
  5. EKey Padding: Entries with espec_index = 0xFFFFFFFF or all-zero keys indicate padding (see Padding Detection above)
  6. File Size: Remember to account for the file’s own ESpec at the end

Real-World Example

Using wow_classic_era 1.15.7.61582:

Encoding file: bbf06e7476382cfaa396cff0049d356b

Header:
  Magic: 0x454E ('EN')
  Version: 1
  CKey/EKey size: 16 bytes each
  CKey pages: 4KB × 2,048 pages
  EKey pages: 4KB × 1,344 pages
  ESpec table: 2,069 bytes

Example CKey entry:
  Content Key: 3ce96e7a9e3b6f5c9d99c8b4e0a4f3d2
  EKey count: 1
  File size: 524,288 bytes (512KB)
  Encoding Key: 7f8a9b3c4d5e6f7081929a3b4c5d6e7f

Corresponding EKey entry:
  Encoding Key: 7f8a9b3c4d5e6f7081929a3b4c5d6e7f
  ESpec index: 1 (points to "z" - ZLib)
  Compressed size: 187,234 bytes

This shows a typical game asset compressed from 512KB to 183KB using ZLib.

Implementation Flow

#![allow(unused)]
fn main() {
use cascette_formats::encoding::EncodingFile;
use cascette_crypto::ContentKey;

// 1. Parse encoding file from BLTE-encoded CDN data
let encoding = EncodingFile::parse_blte(&cdn_data)?;

// 2. Look up content by content key
let ekey = encoding.find_encoding(&content_key)
    .ok_or("content key not found")?;

// 3. Optionally get the compression spec
let espec = encoding.find_espec(&ekey);

// 4. Fetch actual file from CDN using encoding key, then decompress
}

Version History

The Encoding file format currently has only one version:

Version 1 (Current)

  • Header Size: 22 bytes
  • Magic: “EN” (0x454E)
  • Features:
    • Content key to encoding key mapping
    • Dual page index system (CKey and EKey pages)
    • ESpec string table for compression metadata
    • 40-bit file sizes (up to 1TB per file)
    • Multiple encoding keys per content key support
    • Page-based binary search
    • MD5 page checksums for integrity

Version Detection

All known encoding files use version 1. The version field is at offset 2 in the header. If future versions are introduced, parsers should check this field after validating the “EN” magic bytes.

Root File Format

The Root file is the primary catalog of all files stored in CASC archives. It maps file paths or FileDataIDs to content keys, enabling game clients to locate and retrieve specific assets.

Overview

The Root file serves as the master index for all game content:

  • Maps FileDataIDs to content keys

  • Supports multiple locales and content flags

  • Groups files into blocks for efficient lookup

  • Handles both named and unnamed entries

File Structure

The Root file is BLTE-encoded and organized into blocks:

[BLTE Container]
  [Header]
  [Block 1]
  [Block 2]
  ...
  [Block N]

Binary Format

Version Detection

The Root file format has evolved significantly:

  • Pre-30080: No MFST magic, raw block data

  • Build 30080+ (v2): MFST magic with file counts

  • Build 50893+ (v3): Added header_size/version fields

  • Build 58221+ (v4): Extended content flags to 40 bits

Header Structures

Version 2 (Build 30080+)

struct RootHeaderV2 {
    uint32_t magic;              // 'MFST' (0x4D465354) or 'TSFM' (0x5453464D)
    uint32_t total_file_count;   // Total number of files
    uint32_t named_file_count;   // Number of named entries
};

Note: Some builds use ‘TSFM’ magic instead of ‘MFST’. This appears to be a little-endian representation. Both should be accepted as valid.

Version 3 (Build 50893+)

struct RootHeaderV3 {
    uint32_t magic;              // 'MFST' (0x4D465354) or 'TSFM' (0x5453464D)
    uint32_t header_size;        // Size of header (20 bytes)
    uint32_t version;            // Version (1)
    uint32_t total_file_count;   // Total number of files
    uint32_t named_file_count;   // Number of named entries
    uint32_t padding;            // Padding (0)
};

Note: Version 3 also uses TSFM magic in observed builds, maintaining consistency with Version 2.

Version Detection Heuristic: After reading the magic, check the next two u32 values. If the first value (header_size) is in range [16, 100) and the second value (version) is less than 10, the file is v3+. Otherwise treat the first value as total_file_count (v2). A header version field of 1 maps to the V2 block format.
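
A sketch of this heuristic in Rust follows. The `RootHeader` enum and `detect_root_header` function are illustrative names, and reading the header fields as little-endian is an assumption that this page does not state explicitly.

```rust
#[derive(Debug, PartialEq)]
enum RootHeader {
    /// No MFST/TSFM magic: pre-30080 file, blocks start at offset 0.
    Legacy,
    V2 { total_file_count: u32, named_file_count: u32 },
    V3Plus { header_size: u32, version: u32, total_file_count: u32, named_file_count: u32 },
}

/// Read a u32 at `offset`. Little-endian is an assumption.
fn u32_le(data: &[u8], offset: usize) -> Option<u32> {
    let b = data.get(offset..offset + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Classify the header; None means the buffer is too short.
fn detect_root_header(data: &[u8]) -> Option<RootHeader> {
    let magic = data.get(0..4)?;
    if magic != &b"MFST"[..] && magic != &b"TSFM"[..] {
        return Some(RootHeader::Legacy);
    }
    let first = u32_le(data, 4)?;
    let second = u32_le(data, 8)?;
    if (16..100).contains(&first) && second < 10 {
        // Plausible header_size and version: v3+ layout
        Some(RootHeader::V3Plus {
            header_size: first,
            version: second,
            total_file_count: u32_le(data, 12)?,
            named_file_count: u32_le(data, 16)?,
        })
    } else {
        // Otherwise the two values are the v2 file counts
        Some(RootHeader::V2 { total_file_count: first, named_file_count: second })
    }
}
```

The heuristic relies on real file counts being far larger than any plausible header size, so a v2 file with fewer than 100 total files would be misclassified.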

Block Structure

Each block contains file entries for specific locale and content flag combinations. Important: The block header format changed significantly between V1 and V2+.

V1 Block Header (Pre-30080, 12 bytes)

V1 files have no MFST/TSFM magic and use a 12-byte block header with interleaved record format:

struct RootBlockHeaderV1 {
    uint32_t num_records;        // Number of records in block
    uint32_t content_flags;      // Content flags (32-bit)
    uint32_t locale_flags;       // Locale flags (language/region)

    // FileDataID deltas (delta-encoded)
    int32_t fileDataIDDeltas[num_records];

    // Interleaved record data (content_key + name_hash per record)
    RootRecordInterleaved records[num_records];
};

V2+ Block Header (Build 30080+, 17 bytes)

V2 and later versions have MFST/TSFM magic and use a 17-byte block header with separated arrays. Per wowdev.wiki documentation for Version 2 (11.1.0+):

#pragma pack(push, 1)
struct RootBlockHeaderV2 {
    uint32_t num_records;        // Number of records in block
    uint32_t locale_flags;       // Locale flags (MOVED - was third in V1!)
    uint32_t content_flags;      // Content flags (was second in V1)
    uint32_t unk2;               // Unknown field 2
    uint8_t  unk3;               // Unknown field 3 (flags via bit-shift)

    // FileDataID deltas (delta-encoded)
    int32_t fileDataIDDeltas[num_records];

    // Separated arrays (all content_keys, then all name_hashes)
    uint8_t content_keys[num_records][16];
    uint8_t name_hashes[num_records][8];  // Optional based on flags
};
#pragma pack(pop)

Critical Implementation Note: The field order change from V1 to V2+ is a common source of parsing bugs. In V1, the order is num_records, content_flags, locale_flags. In V2+, the order is num_records, locale_flags, content_flags, unk2, unk3.
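
One way to keep the two field orders from being confused is to parse both layouts into the same struct. The sketch below is illustrative (`BlockHeader`, `parse_block_header`, and the `has_magic` switch are hypothetical names), covers only the 12-byte and 17-byte layouts, and assumes little-endian fields.

```rust
/// Block header fields common to both layouts.
#[derive(Debug, PartialEq)]
struct BlockHeader {
    num_records: u32,
    content_flags: u32,
    locale_flags: u32,
}

/// Read a u32 at `offset`. Little-endian is an assumption.
fn u32_le(data: &[u8], offset: usize) -> Option<u32> {
    let b = data.get(offset..offset + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parse a block header and return it with the number of bytes consumed.
/// `has_magic` selects the V2+ layout (17 bytes) over V1 (12 bytes).
fn parse_block_header(data: &[u8], has_magic: bool) -> Option<(BlockHeader, usize)> {
    let num_records = u32_le(data, 0)?;
    if has_magic {
        // V2+: num_records, locale_flags, content_flags, unk2, unk3
        let locale_flags = u32_le(data, 4)?;
        let content_flags = u32_le(data, 8)?;
        data.get(16)?; // unk2 (4 bytes) and unk3 (1 byte) must be present
        Some((BlockHeader { num_records, content_flags, locale_flags }, 17))
    } else {
        // V1: num_records, content_flags, locale_flags
        let content_flags = u32_le(data, 4)?;
        let locale_flags = u32_le(data, 8)?;
        Some((BlockHeader { num_records, content_flags, locale_flags }, 12))
    }
}
```

Returning the consumed length lets the caller continue with the FileDataID delta array at the right offset for either layout.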

V4 Extended Content Flags

V4 (Build 58221+) extends content flags to 40 bits, increasing the block header to 18 bytes (the content_flags field grows from 4 to 5 bytes). The 40-bit value is read as a u32 (4 bytes) plus a u8 (1 byte):

uint32_t content_flags_low;   // Bits 0-31
uint8_t  content_flags_high;  // Bits 32-39
// Combined: content_flags = (uint64_t)content_flags_low | ((uint64_t)content_flags_high << 32)
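
In Rust the same combination has to widen both parts to 64 bits before shifting. A minimal sketch, with an illustrative function name:

```rust
/// Combine the 32-bit low part and 8-bit high part into one 40-bit value.
fn combine_content_flags(low: u32, high: u8) -> u64 {
    // Widen first: shifting a u8 or u32 left by 32 would overflow.
    u64::from(low) | (u64::from(high) << 32)
}
```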

Record Formats

Old Format (Interleaved)

struct RootRecordOld {
    uint8_t content_key[16];     // MD5 content key
    uint8_t name_hash[8];        // Jenkins96 name hash (optional)
};

New Format (Separated)

struct RootRecordNew {
    // Arrays stored separately
    uint8_t content_keys[num_records][16];
    uint8_t name_hashes[num_records][8];  // Optional
};

Content Flags

Content flags specify platform, architecture, and file attributes:

32-bit Flags (v2-v3)

Values match CascLib (CascLib.h), TACTSharp, and WoWDev wiki:

| Value | Flag | Description |
|---|---|---|
| 0x00000004 | Install | Install manifest entry |
| 0x00000008 | LoadOnWindows | Windows platform |
| 0x00000010 | LoadOnMacOS | macOS platform |
| 0x00000020 | x86_32 | 32-bit x86 architecture |
| 0x00000040 | x86_64 | 64-bit x86 architecture |
| 0x00000080 | LowViolence | Censored content |
| 0x00000100 | DoNotLoad | Skip file |
| 0x00000800 | UpdatePlugin | Launcher plugin |
| 0x00008000 | Arm64 | ARM64 architecture |
| 0x08000000 | Encrypted | Encrypted content |
| 0x10000000 | NoNameHash | No name hash in block |
| 0x20000000 | UncommonResolution | Non-standard resolution |
| 0x40000000 | Bundle | Bundled content |
| 0x80000000 | NoCompression | Uncompressed |

40-bit Flags (v4+)

Build 58221+ extends to 40 bits, stored as u32 + u8:

  • Bits 0-31: Standard content flags (same as v2/v3)

  • Bits 32-39: Extended flags (single byte, shifted left by 32)

Common combinations:

  • 0x00000000: All platforms, default

  • 0x00000008: Windows only

  • 0x00000010: macOS only

  • 0x08000000: Encrypted content

  • 0x10000000: No name hash present

Locale Flags

32-bit field representing language/region:

| Value | Locale | Description |
|---|---|---|
| 0x00000002 | enUS | English (US) |
| 0x00000004 | koKR | Korean |
| 0x00000010 | frFR | French |
| 0x00000020 | deDE | German |
| 0x00000040 | zhCN | Chinese (Simplified) |
| 0x00000080 | esES | Spanish (Spain) |
| 0x00000100 | zhTW | Chinese (Traditional) |
| 0x00000200 | enGB | English (UK) |
| 0x00000400 | enCN | English (China) |
| 0x00000800 | enTW | English (Taiwan) |
| 0x00001000 | esMX | Spanish (Mexico) |
| 0x00002000 | ruRU | Russian |
| 0x00004000 | ptBR | Portuguese (Brazil) |
| 0x00008000 | itIT | Italian |
| 0x00010000 | ptPT | Portuguese (Portugal) |
| 0xFFFFFFFF | All | All locales |
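
Because the field is a bitmask, one block can apply to several locales at once. A small sketch that expands a mask into locale names, using only the values listed in the table (the `LOCALES` constant and `locale_names` function are hypothetical names for illustration):

```rust
// Bit values taken from the locale table; the names are the table's locale codes.
const LOCALES: [(u32, &str); 15] = [
    (0x0000_0002, "enUS"), (0x0000_0004, "koKR"), (0x0000_0010, "frFR"),
    (0x0000_0020, "deDE"), (0x0000_0040, "zhCN"), (0x0000_0080, "esES"),
    (0x0000_0100, "zhTW"), (0x0000_0200, "enGB"), (0x0000_0400, "enCN"),
    (0x0000_0800, "enTW"), (0x0000_1000, "esMX"), (0x0000_2000, "ruRU"),
    (0x0000_4000, "ptBR"), (0x0000_8000, "itIT"), (0x0001_0000, "ptPT"),
];

/// Returns the locale codes whose bit is set in `flags`, in table order.
fn locale_names(flags: u32) -> Vec<&'static str> {
    LOCALES
        .iter()
        .filter(|(bit, _)| (flags & bit) != 0)
        .map(|(_, name)| *name)
        .collect()
}
```

With this sketch, a mask of 0x00000202 expands to enUS and enGB, and 0xFFFFFFFF expands to every listed locale.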

FileDataID Delta Encoding

FileDataIDs use delta encoding for compression:

fn decode_file_data_ids(deltas: &[i32]) -> Vec<u32> {
    let mut ids = Vec::with_capacity(deltas.len());
    let mut current_id = 0u32;

    for &delta in deltas {
        // Starting from 0, the first delta is the first FileDataID itself;
        // every later delta is relative to (previous ID + 1).
        // Wrapping arithmetic avoids an overflow panic on malformed input.
        current_id = current_id.wrapping_add_signed(delta);
        ids.push(current_id);

        // Important: Increment for next iteration
        current_id = current_id.wrapping_add(1);
    }

    ids
}

Note: The algorithm increments current_id by 1 after each entry, then applies the next delta. A run of consecutive FileDataIDs is therefore stored as a run of zero deltas.
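
To make the scheme concrete, here is a sketch of the inverse operation together with a compact decoder, so the two can be checked against each other. The function names are hypothetical; the decoder repeats the logic described above using 64-bit intermediates.

```rust
/// Encodes FileDataIDs as deltas: the first delta is the first ID, and
/// each later delta is the gap to the previous ID minus one.
fn encode_ids(ids: &[u32]) -> Vec<i32> {
    let mut deltas = Vec::with_capacity(ids.len());
    let mut next_expected = 0i64;
    for &id in ids {
        deltas.push((i64::from(id) - next_expected) as i32);
        next_expected = i64::from(id) + 1;
    }
    deltas
}

/// Decodes deltas back to FileDataIDs (same algorithm as described above).
fn decode_ids(deltas: &[i32]) -> Vec<u32> {
    let mut ids = Vec::with_capacity(deltas.len());
    let mut next = 0i64;
    for &delta in deltas {
        let id = next + i64::from(delta);
        ids.push(id as u32);
        next = id + 1;
    }
    ids
}
```

For example, the IDs 100, 101, 102, 108 encode to the deltas 100, 0, 0, 5: the two consecutive IDs cost a zero each, and the jump to 108 is stored as 108 - 103 = 5.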

Lookup Process

  1. Parse Root file: Decompress BLTE, read header and blocks
  2. Filter by flags: Select blocks matching desired locale/content
  3. Find FileDataID: Binary search or iterate through blocks
  4. Extract content key: Retrieve corresponding MD5 hash
  5. Resolve via encoding: Use content key to find encoding key

Name Hash Calculation

For named files, Jenkins96 hash (hashlittle2) is used:

fn jenkins96_hash(filename: &str) -> u64 {
    // Normalize path: ASCII uppercase with backslashes (matching CascLib's
    // NormalizeFileName_UpperBkSlash)
    let normalized = filename.to_ascii_uppercase().replace('/', "\\");
    let bytes = normalized.as_bytes();

    // Jenkins hashlittle2 with pc=0, pb=0
    // (Jenkins96 stands for a hashing helper that is not defined here)
    let hash = Jenkins96::hash(bytes);

    // Return (pc << 32) | pb directly (no word swap)
    // Matches CascLib's CalcNormNameHash
    hash.hash64
}

Important Jenkins96 Details:

  • Paths are normalized to uppercase with backslashes (not forward slashes)

  • The hash is 64-bit (8 bytes) not 96-bit despite the name

  • Some blocks have NoNameHash flag, omitting name hashes entirely

  • Uses Bob Jenkins’ lookup3.c algorithm (hashlittle2 function)

  • Processes data in 12-byte chunks with little-endian byte order

  • The 0xDEADBEEF constant is added during initialization

  • Python validation tool available in cascette-py project: https://github.com/wowemulation-dev/cascette-py

Example Hashes:

  • Empty string: 0xDEADBEEFDEADBEEF

  • Interface\Icons\INV_Misc_QuestionMark.blp: 0x9EB59E3C76124837
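
The details above can be sketched as a self-contained Rust port of lookup3's hashlittle2. This is an illustrative sketch, not the cascette-rs implementation: the function names are hypothetical, and the path normalization assumes ASCII-only uppercasing.

```rust
// Port of the mix() and final() macros from Bob Jenkins' lookup3.c.
fn mix(mut a: u32, mut b: u32, mut c: u32) -> (u32, u32, u32) {
    a = a.wrapping_sub(c); a ^= c.rotate_left(4);  c = c.wrapping_add(b);
    b = b.wrapping_sub(a); b ^= a.rotate_left(6);  a = a.wrapping_add(c);
    c = c.wrapping_sub(b); c ^= b.rotate_left(8);  b = b.wrapping_add(a);
    a = a.wrapping_sub(c); a ^= c.rotate_left(16); c = c.wrapping_add(b);
    b = b.wrapping_sub(a); b ^= a.rotate_left(19); a = a.wrapping_add(c);
    c = c.wrapping_sub(b); c ^= b.rotate_left(4);  b = b.wrapping_add(a);
    (a, b, c)
}

fn final_mix(mut a: u32, mut b: u32, mut c: u32) -> (u32, u32, u32) {
    c ^= b; c = c.wrapping_sub(b.rotate_left(14));
    a ^= c; a = a.wrapping_sub(c.rotate_left(11));
    b ^= a; b = b.wrapping_sub(a.rotate_left(25));
    c ^= b; c = c.wrapping_sub(b.rotate_left(16));
    a ^= c; a = a.wrapping_sub(c.rotate_left(4));
    b ^= a; b = b.wrapping_sub(a.rotate_left(14));
    c ^= b; c = c.wrapping_sub(b.rotate_left(24));
    (a, b, c)
}

fn word(bytes: &[u8]) -> u32 {
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}

/// Returns (pc, pb) as produced by hashlittle2 for the given seeds.
fn hashlittle2(data: &[u8], pc: u32, pb: u32) -> (u32, u32) {
    let mut a = 0xdead_beef_u32.wrapping_add(data.len() as u32).wrapping_add(pc);
    let mut b = a;
    let mut c = a.wrapping_add(pb);

    // Consume full 12-byte chunks, leaving 1..=12 bytes for the tail
    let mut rest = data;
    while rest.len() > 12 {
        a = a.wrapping_add(word(&rest[0..4]));
        b = b.wrapping_add(word(&rest[4..8]));
        c = c.wrapping_add(word(&rest[8..12]));
        let mixed = mix(a, b, c);
        a = mixed.0;
        b = mixed.1;
        c = mixed.2;
        rest = &rest[12..];
    }
    if rest.is_empty() {
        return (c, b); // Zero-length input skips the final mix
    }

    // Zero-padding the tail is equivalent to lookup3's switch on length
    let mut tail = [0u8; 12];
    tail[..rest.len()].copy_from_slice(rest);
    a = a.wrapping_add(word(&tail[0..4]));
    b = b.wrapping_add(word(&tail[4..8]));
    c = c.wrapping_add(word(&tail[8..12]));
    let (_, b, c) = final_mix(a, b, c);
    (c, b)
}

/// Name hash: normalize the path, then return (pc << 32) | pb.
fn name_hash(path: &str) -> u64 {
    let normalized = path.to_ascii_uppercase().replace('/', "\\");
    let (pc, pb) = hashlittle2(normalized.as_bytes(), 0, 0);
    (u64::from(pc) << 32) | u64::from(pb)
}
```

For empty input the function returns before the final mix, which is why the empty-string hash is simply the 0xDEADBEEF initialization constant in both halves.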

Implementation Example

#![allow(unused)]
fn main() {
struct RootFile {
    header: RootHeader,
    blocks: Vec<RootBlock>,
}

impl RootFile {
    pub fn find_file(&self, file_data_id: u32) -> Option<MD5Hash> {
        for block in &self.blocks {
            // Check if block matches desired flags
            if !self.matches_flags(block) {
                continue;
            }

            // Search for FileDataID
            if let Some(idx) = block.find_file_index(file_data_id) {
                return Some(block.records[idx].content_key);
            }
        }
        None
    }
}
}

Version History

  • Build 18125 (6.0.1): Initial CASC Root format (V1)

    • No magic header
    • 12-byte block header: num_records, content_flags, locale_flags
    • Interleaved record format: (ckey, name_hash) per record
  • Build 30080 (8.2.0): Added MFST magic signature (V2)

    • MFST/TSFM magic header with file counts
    • 17-byte block header: num_records, locale_flags, content_flags_1, content_flags_2, content_flags_3
    • Field order changed: locale_flags moved before content_flags
    • Combined content flags: content_flags_1 | content_flags_2 | (content_flags_3 << 17)
    • Separated array format: all ckeys, then all name_hashes
  • Build 50893 (10.1.7): Added header_size/version fields (V3)

    • Extended header with header_size, version, padding fields
    • Same 17-byte block header format as V2
  • Build 58221 (11.1.0): Extended content flags to 40 bits (V4)

    • 18-byte block header (content_flags grows from 4 to 5 bytes)
    • 40-bit content flags stored as u32 + u8

Version Detection Code

fn detect_root_version(data: &[u8]) -> RootVersion {
    if data.len() < 4 {
        return RootVersion::Invalid;
    }

    // Check for MFST or TSFM magic
    let magic = &data[0..4];
    if magic != b"MFST" && magic != b"TSFM" {
        return RootVersion::V1; // Pre-30080, no magic
    }

    // The two u32 values after the magic need 12 bytes in total;
    // without this check the slices below would panic on short input
    if data.len() < 12 {
        return RootVersion::Invalid;
    }

    // Read the two u32 values after magic
    let value1 = u32::from_le_bytes(data[4..8].try_into().unwrap());
    let value2 = u32::from_le_bytes(data[8..12].try_into().unwrap());

    // Heuristic: header_size in [16, 100) and version < 10
    // indicates v3+ with explicit header_size/version fields
    if (16..100).contains(&value1) && value2 < 10 {
        match value2 {
            4.. => RootVersion::V4,
            _ => RootVersion::V3, // version 1-3 all use V2/V3 block format
        }
    } else {
        RootVersion::V2 // 30080+, value1 is total_file_count
    }
}

Parser Implementation Status

The Python parser (cascette-py) currently supports:

  • Version detection (MFST/TSFM magic)

  • Version 1-3 parsing

  • Block-based extraction

  • Content key retrieval

  • Delta encoding detection (identifies but doesn’t decode)

The parser can extract FileDataID to content key mappings from all current WoW root file versions.

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

Common Issues

  1. V2 block header size: V2+ uses a 17-byte block header, not 12 bytes like V1. Using the wrong header size causes all subsequent parsing to fail with garbage FileDataIDs and content keys.

  2. V2 field order change: V2+ swapped locale_flags and content_flags positions. In V1: num_records, content_flags, locale_flags. In V2+: num_records, locale_flags, content_flags, unk2, unk3.

  3. Multiple matches: Same file may exist in multiple blocks with different locales

  4. Missing entries: Not all FileDataIDs have corresponding entries

  5. Flag interpretation: Game-specific flag meanings vary

  6. Delta overflow: Large gaps in FileDataIDs can cause integer overflow

Implementation Notes

Version Detection Heuristic

The version detection uses value2 < 10 to identify extended headers, which is broader than the strict matches!(value2, 1..=4) check. Version 1 is accepted and maps to V2 block format (17-byte header, locale_flags first). This matches CascLib and TACTSharp behavior. The heuristic may need tightening if future versions use values in the 5-9 range for non-version purposes.

Block Header Dispatch

The current dispatch is verified correct:

  • Plain V1 files (no MFST/TSFM magic) use the 12-byte header (content_flags first)
  • All MFST/TSFM files (including Classic Era) use the 17-byte header (locale_flags first)
  • V4 files use the 18-byte header (40-bit content flags)

The V2 17-byte format applies to all MFST/TSFM files regardless of the header version field value. The 12-byte format is only used for pre-magic V1 files.
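
The 12-byte and 17-byte layouts can be sketched as follows. This is an illustration under stated assumptions: fields are read as little-endian, the 17-byte layout follows the field list and flag combination given under Version History, and the type and function names are hypothetical. The 18-byte V4 layout is left out.

```rust
#[derive(Debug, PartialEq)]
struct BlockHeader {
    num_records: u32,
    content_flags: u32,
    locale_flags: u32,
}

fn u32_le(data: &[u8], offset: usize) -> Option<u32> {
    let b = data.get(offset..offset + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parses a block header and returns it with the number of bytes consumed.
/// `has_magic` is true for MFST/TSFM files and false for plain V1 files.
fn parse_block_header(data: &[u8], has_magic: bool) -> Option<(BlockHeader, usize)> {
    if has_magic {
        // 17 bytes: num_records, locale_flags, then three content flag fields
        let flags_1 = u32_le(data, 8)?;
        let flags_2 = u32_le(data, 12)?;
        let flags_3 = *data.get(16)?;
        let header = BlockHeader {
            num_records: u32_le(data, 0)?,
            locale_flags: u32_le(data, 4)?,
            content_flags: flags_1 | flags_2 | (u32::from(flags_3) << 17),
        };
        Some((header, 17))
    } else {
        // 12 bytes: num_records, content_flags, locale_flags
        let header = BlockHeader {
            num_records: u32_le(data, 0)?,
            content_flags: u32_le(data, 4)?,
            locale_flags: u32_le(data, 8)?,
        };
        Some((header, 12))
    }
}
```

Returning the consumed size alongside the header keeps the caller from hard-coding 12 or 17 when it advances to the record arrays.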


Install Manifest Format

The Install manifest tracks which game files should be installed on disk and manages file tags for selective installation based on system requirements and user preferences.

Overview

The Install manifest maps content keys to installation paths and uses a tag bitmap system for selective installation based on platform, architecture, and locale. File sizes in entries support installation size estimation.

File Structure

The Install manifest is BLTE-encoded and contains:

[BLTE Container]
  [Header]
  [Tag Section]
  [File Entries]

Binary Format

struct InstallHeader {
    uint16_t magic;              // 'IN' (0x494E)
    uint8_t  version;            // Version (1 or 2)
    uint8_t  ckey_length;        // Content key length in bytes (16)
    uint16_t tag_count;          // Number of tags (big-endian)
    uint32_t entry_count;        // Number of file entries (big-endian)

    // Version 2+ fields (6 additional bytes, total 16 bytes)
    uint8_t  content_key_size;   // Content key size (Agent.exe) / loose file type (CascLib)
    uint32_t entry_count_v2;     // Additional entry count (big-endian)
    uint8_t  unknown;            // Unknown byte
};

For version 1, the content key size is derived as ckey_length + 4 (content key + 4-byte file size). Version 2 specifies content_key_size explicitly.
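
A minimal parsing sketch for the header layout above. The struct mirrors the documented fields; the function name, the error strings, and the use of `Option` for the version 2 fields are choices made for this illustration only.

```rust
#[derive(Debug, PartialEq)]
struct InstallHeader {
    version: u8,
    ckey_length: u8,
    tag_count: u16,
    entry_count: u32,
    // Only present in version 2 headers
    content_key_size: Option<u8>,
    entry_count_v2: Option<u32>,
}

/// Parses the header and returns it with its size in bytes (10 or 16).
fn parse_install_header(data: &[u8]) -> Result<(InstallHeader, usize), String> {
    if data.len() < 10 {
        return Err(format!("truncated header: {} bytes", data.len()));
    }
    if &data[0..2] != b"IN" {
        return Err("bad magic, expected 'IN'".to_string());
    }
    let version = data[2];
    if version == 0 || version > 2 {
        return Err(format!("unsupported version {}", version));
    }
    let mut header = InstallHeader {
        version,
        ckey_length: data[3],
        tag_count: u16::from_be_bytes([data[4], data[5]]),
        entry_count: u32::from_be_bytes([data[6], data[7], data[8], data[9]]),
        content_key_size: None,
        entry_count_v2: None,
    };
    if version == 1 {
        return Ok((header, 10));
    }
    if data.len() < 16 {
        return Err(format!("truncated v2 header: {} bytes", data.len()));
    }
    header.content_key_size = Some(data[10]);
    header.entry_count_v2 =
        Some(u32::from_be_bytes([data[11], data[12], data[13], data[14]]));
    // data[15] is the unknown trailing byte; it is skipped here
    Ok((header, 16))
}
```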

Tag Section

Tags categorize files for selective installation. Each tag consists of:

struct InstallTag {
    char     name[];             // Null-terminated tag name
    uint16_t type;               // Tag type (big-endian)
    uint8_t  bit_mask[];         // Bit mask ((entry_count + 7) / 8 bytes)
};

Important: The bit mask uses big-endian (MSB-first) bit ordering within each byte:

  • Bit 7 (MSB) corresponds to file index byte_index * 8 + 0

  • Bit 0 (LSB) corresponds to file index byte_index * 8 + 7

  • The mask for a given file index is 0x80 >> (file_index % 8)
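
Reading one tag and testing a bit can be sketched like this. The field name `tag_type`, the function names, and the lossy UTF-8 conversion of the tag name are assumptions of this sketch.

```rust
struct InstallTag {
    name: String,
    tag_type: u16,
    bit_mask: Vec<u8>,
}

/// Parses one tag and returns it with the number of bytes consumed.
fn parse_install_tag(data: &[u8], entry_count: usize) -> Option<(InstallTag, usize)> {
    // Name runs up to the first null byte
    let name_len = data.iter().position(|&b| b == 0)?;
    let name = String::from_utf8_lossy(&data[..name_len]).into_owned();

    let type_at = name_len + 1;
    let mask_at = type_at + 2;
    let mask_len = (entry_count + 7) / 8;

    let t = data.get(type_at..mask_at)?;
    let bit_mask = data.get(mask_at..mask_at + mask_len)?.to_vec();
    let tag = InstallTag {
        name,
        tag_type: u16::from_be_bytes([t[0], t[1]]),
        bit_mask,
    };
    Some((tag, mask_at + mask_len))
}

/// MSB-first lookup: file index 0 within a byte maps to 0x80.
fn tag_has_file(tag: &InstallTag, file_index: usize) -> bool {
    match tag.bit_mask.get(file_index / 8) {
        Some(byte) => (byte & (0x80u8 >> (file_index % 8))) != 0,
        None => false,
    }
}
```

With nine entries the mask is two bytes long. In the mask bytes 0xA0 0x80, files 0, 2, and 8 carry the tag.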

File Entry

File entries follow the tag section:

struct InstallFileEntry {
    char     path[];             // Null-terminated file path
    uint8_t  content_key[16];    // MD5 content key
    uint32_t file_size;          // File size (big-endian)
};

Tag associations are determined by bit positions in each tag’s bit mask.
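
A sketch of reading one file entry, assuming a 16-byte content key (ckey_length of 16). The struct stores the path as a `String` for convenience, and the function name is hypothetical.

```rust
struct InstallFileEntry {
    path: String,
    content_key: [u8; 16],
    file_size: u32,
}

/// Parses one entry and returns it with the number of bytes consumed.
fn parse_install_entry(data: &[u8]) -> Option<(InstallFileEntry, usize)> {
    // Path runs up to the first null byte
    let path_len = data.iter().position(|&b| b == 0)?;
    let path = String::from_utf8_lossy(&data[..path_len]).into_owned();

    let key_at = path_len + 1;
    let size_at = key_at + 16;
    let end = size_at + 4;

    let mut content_key = [0u8; 16];
    content_key.copy_from_slice(data.get(key_at..size_at)?);

    let s = data.get(size_at..end)?;
    let file_size = u32::from_be_bytes([s[0], s[1], s[2], s[3]]);

    Some((InstallFileEntry { path, content_key, file_size }, end))
}
```

The entry's index in this sequence is the index used by every tag's bit mask, so entries should be kept in parse order.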

Tag System

Tag Types

| Type | Value | Description | Examples |
|---|---|---|---|
| Platform | 0x0001 | Operating system tags | Windows, OSX, Android, IOS |
| Architecture | 0x0002 | CPU architecture tags | x86_32, x86_64, arm64 |
| Locale | 0x0003 | Language/region tags | enUS, deDE, frFR |
| Category | 0x0004 | Content category tags | speech, text |
| Unknown | 0x0005 | Unknown tag type | (seen in manifests) |
| Component | 0x0010 | Component tags | game, launcher |
| Version | 0x0020 | Version tags | live, ptr, beta |
| Optimization | 0x0040 | Optimization tags | retail, debug |
| Region | 0x0080 | Region tags | US, EU, KR |
| Device | 0x0100 | Device tags | desktop, mobile |
| Mode | 0x0200 | Mode tags | online, offline |
| Branch | 0x0400 | Branch tags | main, experimental |
| Content | 0x0800 | Content tags | cinematics, audio |
| Feature | 0x1000 | Feature tags | graphics, physics |
| Expansion | 0x2000 | Expansion tags | base, expansion1 |
| Alternate | 0x4000 | Alternate content | Alternate, HighRes |
| Option | 0x8000 | Option tags | (optional features) |

Common Tags

Platform Tags:

- Windows, OSX, Android, IOS, Web

Architecture Tags:

- x86_32, x86_64, arm64

Locale Tags:

- enUS, enGB, deDE, frFR, esES, esMX, itIT,
  ruRU, koKR, zhTW, zhCN, ptBR, ptPT

Category Tags:

- speech, text

Alternate Tags:

- Alternate, HighRes

Tag Mask Usage

Tags use bit masks to indicate which files they apply to:

fn should_install(
    file_index: usize,
    tag: &InstallTag,
    selected: bool
) -> bool {
    let byte_index = file_index / 8;
    let bit_offset = file_index % 8;

    if byte_index >= tag.bit_mask.len() {
        return false;
    }

    // MSB-first bit ordering within bytes: the first file of each byte
    // (bit_offset 0) maps to bit 7, i.e. mask 0x80
    let has_tag = (tag.bit_mask[byte_index] & (0x80 >> bit_offset)) != 0;
    has_tag && selected
}

Installation Planning

Size Calculation

Calculate installation size for selected tags:

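fn calculate_install_size(
    entries: &[InstallFileEntry],
    selected_tags: &[&InstallTag]
) -> u64 {
    entries.iter()
        .enumerate()
        // Tags are per-tag bit masks indexed by entry position.
        // Simplification: an entry counts if any selected tag covers it.
        .filter(|(index, _)| {
            selected_tags.iter().any(|tag| should_install(*index, tag, true))
        })
        .map(|(_, e)| e.file_size as u64)
        .sum()
}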

Path Resolution

Convert relative paths to absolute:

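use std::path::{Path, PathBuf};

fn resolve_install_path(
    base_dir: &Path,
    entry: &InstallFileEntry
) -> PathBuf {
    // Lossy conversion avoids a panic on non-UTF-8 path bytes
    let relative_path = String::from_utf8_lossy(&entry.path);

    // Join one component at a time so both '\' and '/' in manifest paths
    // map to the platform's separator
    relative_path
        .split(|c| c == '\\' || c == '/')
        .filter(|part| !part.is_empty())
        .fold(base_dir.to_path_buf(), |path, part| path.join(part))
}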

File Categories

Essential Files

Files with tag mask 0x0000 or 0xFFFF:

  • Core executables

  • Essential libraries

  • Base configuration

  • Critical game data

Optional Content

Files with specific tag requirements:

  • High-resolution textures (HighResTextures tag)

  • Cinematics (Cinematics tag)

  • Additional languages (locale tags)

  • Developer tools (DevTools tag)

Implementation Example

struct InstallFile {
    header: InstallHeader,
    tags: Vec<InstallTag>,
    entries: Vec<InstallFileEntry>,
}

impl InstallFile {
    pub fn get_install_list(&self, tag_names: &[String]) -> Vec<InstallItem> {
        let selected = self.find_tags(tag_names);

        self.entries.iter()
            .enumerate()
            // Tag membership lives in each tag's bit mask, indexed by entry
            // position. Simplification: keep entries covered by any selected tag.
            .filter(|(index, _)| {
                selected.iter().any(|tag| should_install(*index, tag, true))
            })
            .map(|(_, e)| InstallItem {
                content_key: e.content_key,
                install_path: String::from_utf8_lossy(&e.path).to_string(),
                file_size: e.file_size,
            })
            .collect()
    }

    // Resolve tag names to parsed tags; unknown names are skipped
    fn find_tags(&self, tag_names: &[String]) -> Vec<&InstallTag> {
        tag_names.iter()
            .filter_map(|name| self.tags.iter().find(|t| t.name == *name))
            .collect()
    }
}

Selective Installation

Platform-Specific

Install only files for current platform:

#![allow(unused)]
fn main() {
fn get_platform_tags() -> Vec<String> {
    let mut tags = vec!["Base".to_string()];

    #[cfg(target_os = "windows")]
    tags.push("Windows".to_string());

    #[cfg(target_arch = "x86_64")]
    tags.push("x86_64".to_string()); // Tag name as listed under Architecture Tags

    tags
}
}

Language Selection

Install specific language assets:

#![allow(unused)]
fn main() {
fn get_locale_tags(selected_locale: &str) -> Vec<String> {
    vec![
        "Base".to_string(),
        selected_locale.to_string(),
    ]
}
}

Optimization Strategies

Parallel Installation

Install multiple files concurrently:

#![allow(unused)]
fn main() {
use rayon::prelude::*;

fn install_files(items: Vec<InstallItem>) {
    items.par_iter()
        .for_each(|item| {
            download_and_install(item);
        });
}
}

Incremental Updates

Track installed files for patching:

#![allow(unused)]
fn main() {
struct InstalledFiles {
    entries: HashMap<PathBuf, InstalledFileInfo>,
}

struct InstalledFileInfo {
    content_key: [u8; 16],
    file_size: u32,
    modified_time: SystemTime,
}
}

Validation

Post-Installation Verification

use std::error::Error;
use std::fs;
use std::path::Path;

fn verify_installation(
    install_dir: &Path,
    install_file: &InstallFile,
    selected_tags: &[&InstallTag]
) -> Result<(), Box<dyn Error>> {
    for (index, entry) in install_file.entries.iter().enumerate() {
        // Skip entries that none of the selected tags cover
        if !selected_tags.iter().any(|tag| should_install(index, tag, true)) {
            continue;
        }

        let path = resolve_install_path(install_dir, entry);

        // Verify file exists
        if !path.exists() {
            return Err(format!("Missing file: {}", path.display()).into());
        }

        // Verify file size
        let metadata = fs::metadata(&path)?;
        if metadata.len() != entry.file_size as u64 {
            return Err(format!("Size mismatch: {}", path.display()).into());
        }
    }

    Ok(())
}

Repair Process

Detect and repair corrupted installations:

#![allow(unused)]
fn main() {
fn repair_installation(
    install_file: &InstallFile,
    install_dir: &Path
) -> Vec<RepairAction> {
    let mut actions = Vec::new();

    for entry in &install_file.entries {
        let path = install_dir.join(&entry.path);

        if !path.exists() {
            actions.push(RepairAction::Download(entry.content_key));
        } else if !verify_file(&path, entry) {
            actions.push(RepairAction::Redownload(entry.content_key));
        }
    }

    actions
}
}

Common Issues

  1. Tag conflicts: Multiple tags may include same file
  2. Path separators: Handle platform-specific separators
  3. Case sensitivity: File systems vary in case handling
  4. Symlink support: Some platforms don’t support symlinks
  5. Permission issues: Installation may require elevation

Special Considerations

Shared Files

Files used by multiple products:

#![allow(unused)]
fn main() {
struct SharedFile {
    content_key: [u8; 16],
    products: Vec<String>,
    ref_count: u32,
}
}

Uninstall Tracking

Track files for clean uninstall:

#![allow(unused)]
fn main() {
struct UninstallManifest {
    files: Vec<PathBuf>,
    directories: Vec<PathBuf>,
    registry_keys: Vec<String>,  // Windows only
}
}

Parser Implementation Status

Python Parser (cascette-py)

Status: Complete

Capabilities:

  • Version 1 header parsing with IN magic detection

  • Tag extraction with big-endian (MSB-first) bit ordering

  • Platform/architecture/locale tag type classification

  • File entry parsing with path, content key, and size

  • Tag-to-file association via bitmask resolution

  • BLTE decompression for compressed manifests

Verified Against:

  • WoW 11.0.5.57689 (242 entries, 28 tags)

  • Multiple WoW Classic builds

  • Cross-platform tag validation (Windows, OSX, mobile)

Known Issues: None

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

Version History

The Install manifest format has two versions:

Version 1

  • Header Size: 10 bytes
  • Magic: “IN” (0x494E)
  • Entry Size: Derived as ckey_length + 4
  • Features:
    • File path to content key mapping
    • Tag-based selective installation
    • Platform/architecture/locale filtering
    • Bit mask system for tag associations
    • Big-endian (MSB-first) bit ordering in tag masks
    • Tag type classification (17 types from Platform through Option)

Version 2

  • Header Size: 16 bytes (10 base + 6 additional)
  • Added Fields: content_key_size (1 byte), entry_count_v2 (4 bytes BE), unknown (1 byte)
  • Features: All version 1 features plus explicit content key size

Version Detection

The version field is at offset 2 in the header. The agent accepts versions 1 and 2 (validates non-zero and <= 2).

Implementation Status

  • cascette-formats: Full support for versions 1 and 2 with validation
  • cascette-py: Complete parsing for version 1 with tag extraction


Download Manifest Format

The Download manifest manages content streaming and prioritization during game installation and updates. It defines which files are essential for gameplay and their download order.

Overview

The Download manifest assigns a priority to each file entry so the client can download essential content first (enabling play before full download) and stream remaining content in the background. Tag bitmaps enable per-platform and per-locale filtering. File sizes in entries support progress estimation.

File Structure

The Download manifest is BLTE-encoded and contains:

[BLTE Container]
  [Header]
  [File Entries]
  [Tag Section]

Binary Format

Header

struct DownloadHeader {
    char     magic[2];           // "DL" (0x44, 0x4C)
    uint8_t  version;            // Version (1, 2, or 3)
    uint8_t  ekey_size;          // Encoding key size in bytes (16)
    uint8_t  has_checksum;       // Checksum presence flag
    uint32_t entry_count;        // Number of entries (big-endian)
    uint16_t tag_count;          // Number of tags (big-endian)

    // Version 2+ fields (header grows to 12 bytes)
    uint8_t  flag_size;          // Number of flag bytes per entry (max 4)

    // Version 3+ fields (header grows to 16 bytes)
    int8_t   base_priority;      // Base priority offset
    uint8_t  _reserved[3];       // Reserved (agent does not validate these)
};
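
The version-dependent header size (11, 12, or 16 bytes) can be sketched as below. The Rust struct is a flattened view of the C layout above; the function name and error strings are hypothetical, and fields absent from an older version default to zero in this sketch.

```rust
#[derive(Debug, PartialEq)]
struct DownloadHeader {
    version: u8,
    ekey_size: u8,
    has_checksum: bool,
    entry_count: u32,
    tag_count: u16,
    flag_size: u8,     // 0 for version 1
    base_priority: i8, // 0 for versions 1 and 2
}

/// Parses the header and returns it with its size in bytes.
fn parse_download_header(data: &[u8]) -> Result<(DownloadHeader, usize), String> {
    if data.len() < 11 {
        return Err(format!("truncated header: {} bytes", data.len()));
    }
    if &data[0..2] != b"DL" {
        return Err("bad magic, expected 'DL'".to_string());
    }
    let version = data[2];
    let header_size = match version {
        1 => 11,
        2 => 12,
        3 => 16,
        other => return Err(format!("unsupported version {}", other)),
    };
    if data.len() < header_size {
        return Err(format!("truncated v{} header", version));
    }
    let header = DownloadHeader {
        version,
        ekey_size: data[3],
        has_checksum: data[4] != 0,
        entry_count: u32::from_be_bytes([data[5], data[6], data[7], data[8]]),
        tag_count: u16::from_be_bytes([data[9], data[10]]),
        flag_size: if version >= 2 { data[11] } else { 0 },
        // Reinterpret the byte as signed; the 3 reserved bytes are skipped
        base_priority: if version >= 3 { data[12] as i8 } else { 0 },
    };
    Ok((header, header_size))
}
```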

Entry Order

The download manifest stores data in this order:

  1. Header
  2. All file entries
  3. All tags (appear after entries)

File Entry

struct DownloadEntry {
    uint8_t  ekey[16];           // Encoding key (variable size from header)
    uint8_t  file_size[5];       // 40-bit file size (big-endian)
    int8_t   priority;           // Download priority (adjusted by base_priority)

    // Optional fields
    uint32_t checksum;           // If has_checksum is true (big-endian)
    uint8_t  flags[N];           // If version >= 2, N = flag_size
};
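
Two small helpers follow from this layout: decoding the 40-bit size, and computing the per-entry byte length from the header fields. Both function names are hypothetical.

```rust
/// Decodes a 40-bit big-endian integer into a u64.
fn read_u40_be(bytes: &[u8; 5]) -> u64 {
    bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
}

/// Byte length of one entry: key + 5-byte size + 1-byte priority,
/// plus the optional 4-byte checksum and the per-entry flag bytes.
fn download_entry_size(ekey_size: u8, has_checksum: bool, flag_size: u8) -> usize {
    let checksum_len = if has_checksum { 4 } else { 0 };
    usize::from(ekey_size) + 5 + 1 + checksum_len + usize::from(flag_size)
}
```

Since every entry has the same length for a given header, the start of the tag section can be computed as the header size plus entry_count times this value.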

Tag Entry

Tags appear after all file entries in the manifest:

struct DownloadTag {
    char     name[];             // Null-terminated tag name
    uint16_t type;               // Tag type (big-endian)
    uint8_t  bitmap[];           // Bit mask ((entry_count + 7) / 8 bytes)
};

Each bit in the bitmap corresponds to a file entry index. If bit N is set, entry N has this tag.

Priority System

Priority Calculation

In version 3+, priorities are adjusted:

final_priority = entry.priority - header.base_priority

Priority Levels

Lower values indicate higher priority:

| Priority | Category | Typical Content |
|---|---|---|
| < 0 | Critical | Must download before game starts |
| 0 | Essential | Required for basic gameplay |
| 1-2 | High | Important for full experience |
| 3-5 | Normal | Standard content |
| > 5 | Low | Optional/deferred content |
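
The adjustment and the level table can be sketched together. The function names are hypothetical; the subtraction is done in `i16` because the difference of two `i8` values can fall outside the `i8` range.

```rust
/// final_priority = entry.priority - header.base_priority, widened to i16.
fn final_priority(entry_priority: i8, base_priority: i8) -> i16 {
    i16::from(entry_priority) - i16::from(base_priority)
}

/// Maps a final priority to the category names used in the table.
fn priority_category(priority: i16) -> &'static str {
    match priority {
        p if p < 0 => "Critical",
        0 => "Essential",
        1..=2 => "High",
        3..=5 => "Normal",
        _ => "Low",
    }
}
```

For example, an entry priority of 2 with a base priority of -1 yields a final priority of 3, which falls in the Normal range.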

Priority-Based Download

#![allow(unused)]
fn main() {
fn get_download_order(entries: &[DownloadFileEntry]) -> Vec<&DownloadFileEntry>
{
    let mut sorted = entries.iter().collect::<Vec<_>>();
    sorted.sort_by_key(|e| (e.priority, e.file_size));
    sorted
}
}

Streaming Strategy

Minimum Playable Set

Calculate minimum download for gameplay:

#![allow(unused)]
fn main() {
fn get_minimum_download(
    download_file: &DownloadFile
) -> (Vec<DownloadFileEntry>, u64) {
    let essential: Vec<_> = download_file.entries
        .iter()
        .filter(|e| e.priority <= 0)  // Critical (< 0) + Essential (0)
        .cloned()
        .collect();

    let total_size = essential.iter()
        .map(|e| e.file_size as u64)
        .sum();

    (essential, total_size)
}
}

Progressive Download

Download in priority order while game runs:

use std::collections::{HashSet, VecDeque};

struct DownloadManager {
    queue: VecDeque<DownloadItem>,
    active: Vec<DownloadTask>,
    completed: HashSet<[u8; 16]>,
}

impl DownloadManager {
    pub fn start_progressive_download(&mut self) {
        // Sort by priority (a VecDeque must be made contiguous before sorting)
        self.queue.make_contiguous().sort_by_key(|item| item.priority);

        // Start downloading highest priority; stop once the queue is empty,
        // otherwise the loop would never terminate
        while self.active.len() < MAX_CONCURRENT {
            match self.queue.pop_front() {
                Some(item) => self.start_download(item),
                None => break,
            }
        }
    }
}

Tag-Based Filtering

Platform-Specific Downloads

Tags are stored separately from entries. Each tag contains a bitmap indicating which entries it applies to. To filter by tag, find the tag by name and check its bitmap:

#![allow(unused)]
fn main() {
fn filter_by_tag<'a>(
    manifest: &'a DownloadManifest,
    tag_name: &str,
) -> Vec<(usize, &'a DownloadFileEntry)> {
    let tag = match manifest.tags.iter().find(|t| t.name == tag_name) {
        Some(t) => t,
        None => return Vec::new(),
    };

    manifest.entries.iter().enumerate()
        .filter(|(index, _)| tag.has_file(*index))
        .collect()
}
}

Language Packs

#![allow(unused)]
fn main() {
fn get_language_pack<'a>(
    manifest: &'a DownloadManifest,
    locale: &str,
) -> Vec<&'a DownloadFileEntry> {
    let tag = match manifest.tags.iter().find(|t| t.name == locale) {
        Some(t) => t,
        None => return Vec::new(),
    };

    manifest.entries.iter().enumerate()
        .filter(|(index, _)| tag.has_file(*index))
        .map(|(_, entry)| entry)
        .collect()
}
}

Download Optimization

Bandwidth Management

#![allow(unused)]
fn main() {
struct BandwidthManager {
    max_bandwidth: u64,      // Bytes per second
    current_usage: u64,
    priority_limits: Vec<u64>, // Per-priority limits
}

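impl BandwidthManager {
    pub fn allocate_bandwidth(&mut self, priority: u8) -> u64 {
        // Priority levels without a configured limit get no allowance
        let priority_limit = self.priority_limits
            .get(priority as usize)
            .copied()
            .unwrap_or(0);
        // saturating_sub avoids underflow if usage exceeds the cap
        let available = self.max_bandwidth.saturating_sub(self.current_usage);

        std::cmp::min(priority_limit, available)
    }
}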
}

Chunk-Based Downloads

For large files, download in chunks:

#![allow(unused)]
fn main() {
struct ChunkedDownload {
    encoding_key: [u8; 16],
    total_size: u64,
    chunk_size: u64,
    chunks_completed: Vec<bool>,
}

impl ChunkedDownload {
    pub fn get_next_chunk(&self) -> Option<(u64, u64)> {
        for (idx, &completed) in self.chunks_completed.iter().enumerate() {
            if !completed {
                let offset = idx as u64 * self.chunk_size;
                let size = std::cmp::min(
                    self.chunk_size,
                    self.total_size - offset
                );
                return Some((offset, size));
            }
        }
        None
    }
}
}

Progress Tracking

Download Statistics

#![allow(unused)]
fn main() {
struct DownloadProgress {
    total_files: u32,
    completed_files: u32,
    total_bytes: u64,
    downloaded_bytes: u64,
    current_speed: f64,
    eta_seconds: u64,
}

impl DownloadProgress {
    pub fn update(&mut self, bytes_downloaded: u64) {
        self.downloaded_bytes += bytes_downloaded;
        self.current_speed = self.calculate_speed();
        self.eta_seconds = self.calculate_eta();
    }

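    pub fn completion_percentage(&self) -> f32 {
        // Report 0 rather than NaN when no bytes are scheduled
        if self.total_bytes == 0 {
            return 0.0;
        }
        // Divide in f64: large byte counts lose precision in f32
        (self.downloaded_bytes as f64 / self.total_bytes as f64 * 100.0) as f32
    }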
}
}

Implementation Example

struct DownloadFile {
    header: DownloadHeader,
    priorities: Vec<DownloadPriority>,
    tags: Vec<DownloadTag>,
    entries: Vec<DownloadFileEntry>,
}

impl DownloadFile {
    pub fn get_download_plan(
        &self,
        tag_names: &[String],
        max_priority: i8
    ) -> DownloadPlan {
        // Resolve requested tag names; unknown names are skipped
        let selected: Vec<&DownloadTag> = tag_names.iter()
            .filter_map(|name| self.tags.iter().find(|t| t.name == *name))
            .collect();

        let files: Vec<_> = self.entries
            .iter()
            .enumerate()
            .filter(|(_, e)| e.priority <= max_priority)
            // Tags are stored as per-tag bitmaps, not on the entry itself.
            // Simplification: keep entries covered by any selected tag.
            .filter(|(index, _)| selected.iter().any(|t| t.has_file(*index)))
            .map(|(_, e)| e.clone())
            .collect();

        let total_size = files.iter()
            .map(|f| f.file_size as u64)
            .sum();

        DownloadPlan {
            files,
            total_size,
            estimated_time: self.estimate_time(total_size),
        }
    }
}

On-Demand Streaming

Asset Request Handling

#![allow(unused)]
fn main() {
struct OnDemandManager {
    download_file: DownloadFile,
    cache: LruCache<[u8; 16], Vec<u8>>,
}

impl OnDemandManager {
    pub async fn get_asset(&mut self, encoding_key: &[u8; 16]) -> Result<Vec<u8>> {
        // Check cache first
        if let Some(data) = self.cache.get(encoding_key) {
            return Ok(data.clone());
        }

        // Find in download manifest
        if let Some(entry) = self.find_entry(encoding_key) {
            // Download with high priority
            let data = self.download_immediate(entry).await?;
            self.cache.put(*encoding_key, data.clone());
            return Ok(data);
        }

        Err("Asset not found")
    }
}
}

Verification

Checksum Validation

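fn verify_download(
    data: &[u8],
    entry: &DownloadFileEntry,
    has_checksum: bool,
    compute_checksum: impl Fn(&[u8]) -> u32
) -> bool {
    // The entry checksum is a 4-byte field that is only present when the
    // header's has_checksum flag is set. The checksum algorithm is not
    // specified here, so the caller supplies it.
    if has_checksum {
        compute_checksum(data) == entry.checksum
    } else {
        true // No checksum to verify
    }
}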

Common Issues

  1. Priority conflicts: Multiple systems requesting same file
  2. Bandwidth throttling: ISP or network limitations
  3. Incomplete downloads: Handle partial file recovery
  4. Cache corruption: Verify cached files periodically
  5. Tag mismatches: Platform detection errors

Special Features

Differential Downloads

Download only changed portions:

#![allow(unused)]
fn main() {
struct DifferentialDownload {
    old_version: [u8; 16],
    new_version: [u8; 16],
    patches: Vec<PatchInfo>,
}
}

Peer-to-Peer Support

Share downloaded content locally:

#![allow(unused)]
fn main() {
struct P2PManager {
    local_peers: Vec<PeerInfo>,
    shared_files: HashSet<[u8; 16]>,
}
}

Parser Implementation Status

Python Parser (cascette-py)

Status: Complete

Capabilities:

  • Version 1-3 header parsing with DL magic detection

  • 40-bit big-endian compressed size parsing

  • Priority system with base priority adjustment (v3)

  • Tag parsing with bitmap support (tags stored after all entries)

  • Platform/architecture tag identification with type classification

  • Sample entry display (first 100 entries)

  • Format evolution tracking across versions

  • BLTE decompression for compressed manifests

  • Correct entry/tag ordering (entries first, then tags)

Verified Against:

  • WoW 11.0.5.57689 (2.4M entries, 28 tags)

  • WoW 9.0.2.37176 (Shadowlands)

  • WoW 7.3.5.25848 (Legion)

  • WoW Classic builds

Known Issues: None

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

Version History

The Download manifest format has evolved through 3 versions:

Version 1 (Initial)

  • Header Size: 11 bytes
  • Features: Basic download prioritization with encoding keys, file sizes, optional checksums
  • Fields: magic, version, ekey_size, has_checksum, entry_count, tag_count

Version 2 (Flag Support)

  • Header Size: 12 bytes
  • Added Features: Entry-level flags for additional metadata
  • New Fields: flag_size (number of flag bytes per entry, max 4)
  • Use Cases: Platform-specific flags, content type markers

Version 3 (Priority System)

  • Header Size: 16 bytes
  • Added Features: Base priority adjustment for dynamic prioritization
  • New Fields: base_priority (signed adjustment), reserved (3 bytes)
  • Priority Calculation: final_priority = entry.priority - header.base_priority

Version Detection

Parsers detect version by reading the version field at offset 2 in the header. All versions use the same “DL” magic bytes and big-endian encoding.

Implementation Status

  • cascette-formats: Full support for versions 1-3 with version-aware parsing
  • cascette-py: Complete parsing for versions 1-3 with validation


Size Manifest Format

The Size manifest maps encoding keys to estimated file sizes (eSize). It is used when compressed size (cSize) is unavailable, allowing the agent to estimate disk space requirements and report download progress for content that has not yet been downloaded.

Overview

The Size manifest provides:

  • Estimated file sizes for pre-download space allocation

  • Progress bar calculations during installation

  • Disk space requirement checks

  • Fallback sizing when compressed size is unknown

The agent log message “Loose files will estimate using eSize instead of cSize” indicates when this manifest is active.

Build Configuration Reference

The Size manifest is referenced by the size key in build configuration files:

size = d1d9e612a645cc7a7e4b42628bde21ce 0d5704735f4985e555907a7e7647099a
size-size = 3637629 3076687

The first hash is the content key, the second is the encoding key used for CDN fetch. The size-size field contains the unencoded and encoded sizes. Like other manifests, the Size manifest is BLTE-encoded on CDN.
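
A sketch of extracting the two hashes from the `size` line. The function name is hypothetical, and the check that each hash is 32 hexadecimal characters is an assumption based on the example above.

```rust
/// Parses a `size = <ckey> <ekey>` line into (content key, encoding key).
/// Returns None for other keys (such as `size-size`) or malformed values.
fn parse_size_key(line: &str) -> Option<(String, String)> {
    let (key, value) = line.split_once('=')?;
    if key.trim() != "size" {
        return None;
    }
    let mut parts = value.split_whitespace();
    let ckey = parts.next()?;
    let ekey = parts.next()?;

    // Assumption: both keys are 16-byte hashes written as 32 hex characters
    let is_hash = |s: &str| s.len() == 32 && s.bytes().all(|b| b.is_ascii_hexdigit());
    if is_hash(ckey) && is_hash(ekey) {
        Some((ckey.to_string(), ekey.to_string()))
    } else {
        None
    }
}
```

The second value (the encoding key) is the one to use when requesting the manifest from a CDN.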

The config key .tact:size_manifest also references this manifest in the agent’s internal configuration.

Community Documentation

This format is documented on wowdev.wiki as the “Download Size” manifest. The wiki documents version 1 from an older Agent build (6700). The TACT 3.13.3 agent binary supports versions 1 and 2. The wiki’s “EKey Size” byte at offset 3 corresponds to the flags field described below. The version 2 format with its 40-bit total size field is not documented on the wiki.

File Structure

The Size manifest is BLTE-encoded and contains:

[BLTE Container]
  [Header]
  [Entries]

Binary Format

All multi-byte integers are big-endian.

Header

struct SizeManifestHeader {
    char     magic[2];           // "DS" (0x44, 0x53)
    uint8_t  version;            // Version (1 or 2)
    uint8_t  flags;              // Flags byte
    uint32_t entry_count;        // Number of entries (big-endian)
    uint16_t key_size_bits;      // Key size in bits (big-endian)

    // Version-specific fields follow
};

Version 1 Header Extension (offset 10)

struct SizeManifestHeaderV1 {
    // ... base header fields above ...
    uint64_t total_size;         // Total size across all entries (big-endian)
    uint8_t  esize_bytes;        // Byte width of eSize per entry (1-8)
};
// Total header size: 19 bytes (0x13)

The esize_bytes field determines how many bytes each entry’s size value occupies. Valid values are 1 through 8. Invalid values produce: “Invalid eSize byte count ‘%u’ in size manifest header.”

Version 2 Header Extension (offset 10)

struct SizeManifestHeaderV2 {
    // ... base header fields above ...
    uint8_t  total_size[5];      // Total size as 40-bit big-endian integer
};
// Total header size: 15 bytes (0x0F)

Version 2 fixes esize_bytes at 4 (32-bit sizes per entry). The total size uses a 40-bit integer (5 bytes), reducing header size compared to version 1.

Minimum Size Validation

The parser validates two minimum sizes:

  1. 15 bytes (0x0F) – enough to read magic, version, entry_count, and key_size_bits
  2. 19 bytes (0x13) – full version 1 header (version 2 headers are shorter and pass this check)

If the data is too small: “Detected truncated size manifest. Only got %u bytes, but minimum header size is %u bytes.”
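The header layouts above can be sketched as a small parser. This is an illustration under stated assumptions, not the cascette-formats implementation: the names `SizeHeader` and `parse_size_header` are hypothetical, the error strings are shortened, and the two minimum-size checks are read as "15 bytes for any header, 19 bytes once version 1 is detected".

```rust
/// Header fields shared by both versions, plus the version-specific values.
#[derive(Debug, PartialEq)]
pub struct SizeHeader {
    pub version: u8,
    pub flags: u8,
    pub entry_count: u32,
    pub key_size_bits: u16,
    pub total_size: u64,
    pub esize_bytes: u8,
    /// Header bytes consumed: 19 for version 1, 15 for version 2.
    pub header_len: usize,
}

pub fn parse_size_header(data: &[u8]) -> Result<SizeHeader, String> {
    if data.len() < 15 {
        return Err(format!("truncated header: got {} bytes, need 15", data.len()));
    }
    if &data[0..2] != b"DS" {
        return Err("invalid magic".to_string());
    }
    let (version, flags) = (data[2], data[3]);
    let entry_count = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);
    let key_size_bits = u16::from_be_bytes([data[8], data[9]]);
    if key_size_bits == 0 {
        return Err("key_size_bits must be > 0".to_string());
    }
    // Both total_size encodings are big-endian; right-align them in 8 bytes.
    let mut raw = [0u8; 8];
    let (esize_bytes, header_len) = match version {
        1 => {
            if data.len() < 19 {
                return Err(format!("truncated header: got {} bytes, need 19", data.len()));
            }
            raw.copy_from_slice(&data[10..18]); // 64-bit total_size
            (data[18], 19)
        }
        2 => {
            raw[3..].copy_from_slice(&data[10..15]); // 40-bit total_size
            (4, 15) // version 2 fixes the eSize width at 4 bytes
        }
        other => return Err(format!("unsupported version {}", other)),
    };
    if !(1..=8).contains(&esize_bytes) {
        return Err(format!("invalid eSize byte count '{}'", esize_bytes));
    }
    Ok(SizeHeader {
        version,
        flags,
        entry_count,
        key_size_bits,
        total_size: u64::from_be_bytes(raw),
        esize_bytes,
        header_len,
    })
}
```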

Entry Format

Entries are stored sequentially after the header:

struct SizeManifestEntry {
    uint8_t  key[];              // Encoding key, null-terminated
    uint16_t key_hash;           // 16-bit hash/identifier (big-endian)
    uint8_t  esize[];            // Estimated size (esize_bytes width, big-endian)
};

The key field length in bytes is (key_size_bits + 7) / 8, which rounds the bit count up to the nearest byte. The key is stored as a null-terminated byte string within this field.

Key Hash Validation

The 2-byte key_hash field after the key is validated. Values 0x0000 and 0xFFFF are treated as invalid sentinel values and cause the parser to reject the entry.

Entry Size Field

The esize field width depends on the version:

| Version | esize width | Source |
|---------|-------------|--------|
| 1 | esize_bytes from header (1-8) | Variable |
| 2 | 4 bytes (fixed) | Hardcoded |
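A minimal sketch of reading one entry, assuming the key field occupies exactly (key_size_bits + 7) / 8 bytes with no extra terminator byte after it. The names `SizeEntry` and `parse_size_entry` are hypothetical; the sentinel check follows the Key Hash Validation rule above.

```rust
#[derive(Debug, PartialEq)]
pub struct SizeEntry {
    pub key: Vec<u8>,
    pub key_hash: u16,
    pub esize: u64,
}

/// Reads one entry from the front of `data`.
/// Returns the entry and the number of bytes consumed, or `None` on bad input.
pub fn parse_size_entry(
    data: &[u8],
    key_size_bits: u16,
    esize_bytes: usize,
) -> Option<(SizeEntry, usize)> {
    if key_size_bits == 0 || !(1..=8).contains(&esize_bytes) {
        return None;
    }
    // Round the bit count up to whole bytes.
    let key_len = (key_size_bits as usize + 7) / 8;
    let total = key_len + 2 + esize_bytes;
    if data.len() < total {
        return None;
    }
    let key = data[..key_len].to_vec();
    let key_hash = u16::from_be_bytes([data[key_len], data[key_len + 1]]);
    // 0x0000 and 0xFFFF are sentinel values and invalidate the entry.
    if key_hash == 0x0000 || key_hash == 0xFFFF {
        return None;
    }
    // Fold the variable-width big-endian size into a u64.
    let esize = data[key_len + 2..total]
        .iter()
        .fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    Some((SizeEntry { key, key_hash, esize }, total))
}
```

For version 2 manifests the caller would pass `esize_bytes = 4`; for version 1 it would pass the header's esize_bytes value.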

Version History

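| Version | Header size | esize width | total_size width | Notes |
|---------|-------------|-------------|------------------|-------|
| 1 | 19 bytes | Variable (1-8) | 64-bit | Original format, documented on wowdev.wiki |
| 2 | 15 bytes | Fixed (4) | 40-bit | Compact header, undocumented on wiki |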

Relationship to Other Manifests

The Size manifest is one of six manifest types in TACT:

| Config key | Magic | Format |
|------------|-------|--------|
| encoding | EN | Content key to encoding key mapping |
| root | (varies) | Path to content key mapping |
| install | IN | Install manifest with file tags |
| download | DL | Download manifest with priorities |
| patch | PA | Patch manifest for delta updates |
| size | DS | Size manifest (this format) |

Validation

The parser validates manifests at parse time and via an explicit validate() method:

  • Entry count matches the header’s entry_count field
  • Sum of all entry esize values matches the header’s total_size field
  • key_size_bits must be > 0
  • Key hash sentinel values (0x0000, 0xFFFF) are rejected
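The two aggregate checks in this list can be sketched as follows. The function name `validate_totals` is hypothetical and the real validate() method may be structured differently; the message wording follows the Error Messages table in this section.

```rust
/// Checks the entry count and the eSize sum against the header values.
pub fn validate_totals(
    entry_count: u32,
    total_size: u64,
    esizes: &[u64],
) -> Result<(), String> {
    if esizes.len() as u64 != u64::from(entry_count) {
        return Err(format!(
            "Entry count mismatch: header says {}, found {}",
            entry_count,
            esizes.len()
        ));
    }
    // Use checked addition so a corrupt manifest cannot overflow the sum.
    let sum = esizes
        .iter()
        .try_fold(0u64, |acc, &s| acc.checked_add(s))
        .ok_or_else(|| "eSize sum overflows u64".to_string())?;
    if sum != total_size {
        return Err(format!(
            "Total size mismatch: header says {}, sum of esizes is {}",
            total_size, sum
        ));
    }
    Ok(())
}
```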

Error Messages

| Condition | Message |
|-----------|---------|
| Truncated data | “Detected truncated size manifest. Only got %u bytes, but minimum header size is %u bytes.” |
| Bad magic | “Invalid magic string in size manifest.” |
| Bad version | “Unsupported size manifest version: %u. This client only supports non-zero versions <= %u” |
| Bad esize width | “Invalid eSize byte count ‘%u’ in size manifest header.” |
| Zero key size | “Invalid key size: key_size_bits must be > 0” |
| Bad key hash | “Invalid key hash sentinel value: 0x{value:04X}” |
| Entry count mismatch | “Entry count mismatch: header says {expected}, found {actual}” |
| Total size mismatch | “Total size mismatch: header says {expected}, sum of esizes is {actual}” |

Implementation Status

Implemented in cascette-formats crate (crates/cascette-formats/src/size/).

The implementation provides:

  • Parser and builder for both version 1 and version 2 formats
  • Manual BinRead/BinWrite implementations for headers and entries
  • Variable-width esize field support (1-8 bytes for V1, fixed 4 bytes for V2)
  • 40-bit total_size handling for V2 headers
  • Key hash sentinel validation (rejects 0x0000 and 0xFFFF)
  • CascFormat trait implementation for round-trip support
  • Builder pattern for constructing manifests

Archive Files and Indices

CASC/TACT archives are container files that store game content in a packed format. They work with index files to enable efficient content retrieval without unpacking entire archives. The system uses different formats for network (TACT) and local storage (CASC).

Overview

The archive system provides:

  • Bulk storage of game assets in .archive files

  • Index files for fast content location

  • Support for partial downloads via HTTP range requests

  • Deduplication through content addressing

Archive Files

CDN Archives vs Local Archives

CDN Archives (TACT - served over HTTP):

  • Named using 32-character hash keys (e.g., 86b6b0daf3d8ef68271b15567c37300c)

  • Accessed via URL path: /tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}

  • Paired with Archive Index files (.index) for content location

  • Single BLTE-encoded container format

  • Part of TACT (Tooling for Archive Content Transfer) protocol

Local Client Archives (CASC - stored on disk):

  • Named with numeric indices: data.001, data.002, etc.

  • Use IDX Journal files (.idx) for local content access

  • Multiple BLTE files concatenated together

  • Part of CASC (Content Addressable Storage Container) system

  • Optimized for memory-mapped access

CDN Archive Structure

CDN archives are single BLTE-encoded containers, while local archives contain multiple BLTE files:

CDN Archive Format (TACT):          Local Archive Format (CASC):
┌──────────────────┐                ┌──────────────────┐
│ BLTE Container   │                │ BLTE File 1      │
├──────────────────┤                ├──────────────────┤
│ Header & Blocks  │                │ BLTE File 2      │
├──────────────────┤                ├──────────────────┤
│ Content Blocks   │                │ BLTE File 3      │
│ (concatenated)   │                │      ...         │
└──────────────────┘                └──────────────────┘

Verified Archive Characteristics

Based on examination of sample archives:

  • File sizes: Range from ~7MB to 268MB when compressed

  • Compression ratios: 4.9x to 190x compression achieved via BLTE

  • Content types: WDB Cache files (WDC3), textures, models, and other game assets

  • Decompressed content: Much smaller than archive size (1-2MB typical)

  • Access pattern: Content addressed via hash keys in index files

CRITICAL: Two Completely Different Index Systems

⚠️ CDN Archive Index (.index) vs Local Storage Index (.idx)

NEVER CONFUSE THESE TWO FORMATS - THEY ARE COMPLETELY DIFFERENT:

  1. CDN Archive Index Files (.index): TACT format with 28-byte footer, variable-length encoding keys
  2. Local Storage Index Files (.idx): CASC format with header, fixed 9-byte content key buckets

These systems serve different purposes and use entirely different formats, key types, and data structures.

CDN Archive Index Format (TACT Protocol)

File Extension: .index
Location: Downloaded from CDN
Purpose: Maps variable-length encoding keys to CDN archive locations
Key Type: Encoding keys (from Encoding file)
Key Length: Variable, as specified in footer’s ekey_length field (typically 16 bytes, sometimes 9)
Implementation: cascette-formats/src/archive/index.rs

Archive Index Files (.index) - TACT Protocol

Based on analysis of actual CDN index files from various WoW builds.

CDN archive indexes use a chunk-based format with footer metadata:

Archive Index Structure

Index File Layout:
┌────────────────┐
│ Data Chunks    │ <- 4KB chunks containing entries
│ (4096 bytes)   │
├────────────────┤
│ ...            │
├────────────────┤
│ Last Chunk     │ <- Table of contents + entries
├────────────────┤
│ Footer         │ <- Metadata (variable length)
└────────────────┘

CDN Index Entry Format (Variable Length)

struct CDNArchiveIndexEntry {
    uint8_t  ekey[ekey_length];  // Encoding key (variable length from footer)
    uint32_t encoded_size;       // BLTE encoded size (big-endian)
    uint32_t archive_offset;     // Offset in archive (big-endian)
};

Entry Size: Variable = ekey_length + size_bytes + offset_bytes (from footer)

Typical Sizes:

  • With 16-byte keys: 16 + 4 + 4 = 24 bytes per entry
  • With 9-byte keys: 9 + 4 + 4 = 17 bytes per entry

Key Properties:

  • Encoding key length specified in footer’s ekey_length field
  • All multi-byte fields use big-endian encoding
  • NEVER assume fixed 9-byte keys - always read from footer

Archive Index files use a 28-byte footer at the end of the file:

struct ArchiveIndexFooter {  // 28 bytes total
    uint8_t  toc_hash[8];     // MD5(toc_keys || block_hashes)[:footer_hash_bytes]
    uint8_t  version;         // Must be 0 or 1
    uint8_t  reserved[2];     // Must be [0, 0]
    uint8_t  page_size_kb;    // Must be 4 (4KB pages)
    uint8_t  offset_bytes;    // Archive offset field size (4, 5, or 6)
    uint8_t  size_bytes;      // Compressed size field size (always 4)
    uint8_t  ekey_length;     // EKey length in bytes (16 for full MD5)
    uint8_t  footer_hash_bytes; // Footer hash length (always 8)
    uint32_t element_count;   // Number of entries (little-endian - special case!)
    uint8_t  footer_hash[8];  // MD5 footer validation (first 8 bytes)
};

Verified Footer Properties:

  • Standard values: offset_bytes=4, size_bytes=4, ekey_length=16 (1-16 valid)

  • offset_bytes can be 4 (regular archives), 5 (archives >4GB), or 6 (archive-groups: 2-byte archive index + 4-byte offset)

  • Page/chunk size consistently 4096 bytes

  • Item length consistently 24 bytes (0x18)

  • Archive filename = MD5 hash of the footer

  • Footer validation uses MD5 hashing (first 8 bytes of hash)

  • Mixed endianness: element_count field is little-endian while all other multi-byte fields are big-endian

  • TOC hash field is present but not validated in practice. No known reference implementation (CascLib, TACT.Net, rustycasc) validates this field. Testing against real files shows the stored values do not match any standard hash algorithm applied to the TOC data

Implementation Notes:

  • Extended Block Offsets: The agent logs “Archive w/ Extended Block Offset Found” for archive index entries that use larger-than-4-byte offsets (for archives exceeding 4GB)

  • Archive Count Limit: The agent has a casc_supports_1023_archives configuration flag, indicating a maximum of 1023 archives per CASC storage
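The footer layout and the value constraints listed above can be sketched as a parser that reads the last 28 bytes of an index file. This is an illustration only: `IndexFooter` and `parse_footer` are hypothetical names, the sketch assumes the fixed 28-byte footer (footer_hash_bytes = 8), and it does not verify either hash.

```rust
#[derive(Debug, PartialEq)]
pub struct IndexFooter {
    pub toc_hash: [u8; 8],
    pub version: u8,
    pub page_size_kb: u8,
    pub offset_bytes: u8,
    pub size_bytes: u8,
    pub ekey_length: u8,
    pub footer_hash_bytes: u8,
    pub element_count: u32,
    pub footer_hash: [u8; 8],
}

/// Parses the trailing 28 bytes of a CDN archive index.
pub fn parse_footer(file: &[u8]) -> Option<IndexFooter> {
    let start = file.len().checked_sub(28)?;
    let f = &file[start..];
    let mut toc_hash = [0u8; 8];
    toc_hash.copy_from_slice(&f[0..8]);
    let mut footer_hash = [0u8; 8];
    footer_hash.copy_from_slice(&f[20..28]);
    let footer = IndexFooter {
        toc_hash,
        version: f[8],
        page_size_kb: f[11],
        offset_bytes: f[12],
        size_bytes: f[13],
        ekey_length: f[14],
        footer_hash_bytes: f[15],
        // The one little-endian field in the footer.
        element_count: u32::from_le_bytes([f[16], f[17], f[18], f[19]]),
        footer_hash,
    };
    // Reject values that the struct comments above rule out.
    let valid = footer.version <= 1
        && f[9] == 0
        && f[10] == 0
        && footer.page_size_kb == 4
        && matches!(footer.offset_bytes, 4 | 5 | 6)
        && footer.size_bytes == 4
        && (1..=16).contains(&footer.ekey_length)
        && footer.footer_hash_bytes == 8;
    if valid { Some(footer) } else { None }
}
```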

Sample Analysis Results

File Sizes Observed:

  • Small indexes: ~8KB (few hundred entries)

  • Medium indexes: ~50-200KB (thousands of entries)

  • Large indexes: ~300KB+ (tens of thousands of entries)

Index Distribution (from sample builds):

  • WoW retail: 400-1400+ archives per build

  • WoW Classic: 1000-1400+ archives per build

  • Beta builds: 400-800 archives per build

Chunk Structure:

  • All indexes use 4KB chunks. Max entries per chunk = 4096 / (ekey_length + offset_bytes + size_bytes). With default 16+4+4 fields: 170 entries per chunk.

  • Table of contents (TOC) is stored after data chunks and contains two sections:

    1. Last encoding key of each data chunk (for binary search)
    2. Per-block MD5 hash of each data chunk (truncated to footer_hash_bytes)
  • TOC hash = MD5(toc_keys || block_hashes)[:footer_hash_bytes]

  • Chunk structure enables streaming and memory-efficient processing

  • Chunks are padded with zeros to maintain 4KB alignment
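The per-chunk capacity formula above is plain integer division. A minimal sketch, with the hypothetical helper name `entries_per_chunk` and the 4096-byte chunk size taken from this section:

```rust
/// Maximum number of whole entries that fit in one 4096-byte chunk.
/// Returns `None` when all three field widths are zero.
pub fn entries_per_chunk(ekey_length: u8, offset_bytes: u8, size_bytes: u8) -> Option<usize> {
    let entry_size =
        usize::from(ekey_length) + usize::from(offset_bytes) + usize::from(size_bytes);
    if entry_size == 0 {
        None
    } else {
        // Integer division drops the partial entry; the remainder is zero padding.
        Some(4096 / entry_size)
    }
}
```

With the default 16 + 4 + 4 layout this gives 170 entries and 16 bytes of padding per chunk.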

Archive Index Access Pattern

CDN URL Format:

https://cdn.domain.com/tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}.index

Lookup Process:

  1. Get archive content key from CDN configuration
  2. Append ‘.index’ to form index URL
  3. Fetch and parse index file
  4. Search entries for target EKey
  5. Use offset/size to retrieve from corresponding .archive file
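Steps 1 and 2 can be sketched as a path builder. The helper name `index_path` is hypothetical, and the `/tpr/wow` prefix is the one used in the examples in this section; other products use a different path prefix.

```rust
/// Builds the CDN path of an archive index from its 32-character hex hash.
/// Returns `None` if the hash is not 32 hexadecimal characters.
pub fn index_path(hash: &str) -> Option<String> {
    if hash.len() != 32 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // The first two byte pairs of the hash become directory components.
    Some(format!(
        "/tpr/wow/data/{}/{}/{}.index",
        &hash[0..2],
        &hash[2..4],
        hash
    ))
}
```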

Self-Referential Naming:

The archive index filename (hash) is the MD5 of its own footer structure, providing a unique identifier that validates the index contents.

Local Storage Index Format (.idx files)

File Extension: .idx
Location: Client-side storage directory (Data/data/)
Purpose: Maps content keys to local data file locations using bucket algorithm
Key Type: Content keys (MD5 hashes from Root file)
Key Length: ALWAYS 9 bytes (truncated for space efficiency in local storage)
Implementation: cascette-client-storage/src/index.rs

See the comparison table at the end of this document for a full side-by-side comparison.

IDX Journal Files (.idx) - CASC Local Storage

Local CASC storage uses IDX Journal files for indexing:

IDX Journal Structure

struct IDXJournalHeader {  // 16 bytes + block table
    uint32_t data_size;       // Size of header data
    uint32_t data_hash;       // Jenkins hash validation
    uint16_t version;         // Journal version
    uint8_t  bucket;          // Bucket ID (0x00-0xFF)
    uint8_t  unused;          // Padding
    uint8_t  length_size;     // Size field bytes
    uint8_t  location_size;   // Location field bytes (5 = 1 archive + 4 offset)
    uint8_t  key_size;        // Key field bytes (9 or 16)
    uint8_t  segment_bits;    // Segment size bits
    // Followed by block table entries
};

Key Differences from Archive Indexes:

  • Bucket-based structure (256 buckets, 00-FF)

  • Jenkins hash validation instead of footer hash

  • Fixed key size (9-byte truncated keys) instead of a length read from a footer

  • Header at start instead of footer at end

  • One journal file per bucket

Loose Files Index

For files not in archives:

struct LooseFilesIndex {
    uint32_t magic;              // 'LIDX'
    uint32_t version;
    uint32_t entry_count;

    struct Entry {
        uint8_t  encoding_key[16];
        uint32_t file_size;
        uint8_t  file_hash[16];  // For verification
    } entries[];
};

Archive Lookup Process

  1. Get encoding key: From encoding file lookup
  2. Check indices: Search all index files for key
  3. Locate in archive: Extract offset and size
  4. Retrieve data: Read from archive at offset
  5. Decompress: Process BLTE container

Implementation Example

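// Minimal illustrative types; the real crate defines richer versions.
struct ArchiveIndexHeader {
    key_size: u8, // Key length in bytes used by this index
}

struct ArchiveIndexEntry {
    key: Vec<u8>, // Exactly `key_size` bytes
    offset: u64,  // Offset of the BLTE data within the archive
    size: u32,    // Encoded (compressed) size
}

struct ArchiveIndex {
    header: ArchiveIndexHeader,
    entries: Vec<ArchiveIndexEntry>, // Must be sorted by key
}

impl ArchiveIndex {
    /// Returns `(offset, size)` for the key, or `None` if it is absent.
    pub fn find_file(&self, encoding_key: &[u8]) -> Option<(u64, u32)> {
        // Truncate search key to index key size (None if the key is too short)
        let search_key = encoding_key.get(..self.header.key_size as usize)?;

        // Binary search entries (sorted by key)
        let idx = self
            .entries
            .binary_search_by(|e| e.key.as_slice().cmp(search_key))
            .ok()?;

        let entry = &self.entries[idx];
        Some((entry.offset, entry.size))
    }
}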

HTTP Range Requests

For CDN retrieval without downloading entire archives:

GET /data/5e/16/5e16b6ff530b1816c7b32296e0875ed4 HTTP/1.1
Host: cdn.example.com
Range: bytes=1048576-2097151

Response:

HTTP/1.1 206 Partial Content
Content-Range: bytes 1048576-2097151/134217728
Content-Length: 1048576
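The Range header value follows directly from an index entry's offset and size. A minimal sketch, with the hypothetical helper name `range_header`; note that HTTP byte ranges are inclusive, so the last byte is offset + size - 1.

```rust
/// Formats an HTTP Range header value for `size` bytes starting at `offset`.
/// Returns `None` for an empty range or if the end position overflows.
pub fn range_header(offset: u64, size: u32) -> Option<String> {
    if size == 0 {
        return None;
    }
    // Byte ranges are inclusive on both ends.
    let last = offset.checked_add(u64::from(size) - 1)?;
    Some(format!("bytes={}-{}", offset, last))
}
```

For the request shown above, an entry at offset 1048576 with size 1048576 yields `bytes=1048576-2097151`.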

Archive Creation

When building archives:

  1. Group related files: Minimize seeks during loading
  2. Align boundaries: 4KB alignment for efficient I/O
  3. Order by access: Frequently accessed files first
  4. Compress individually: Each file is BLTE-encoded
  5. Update indices: Generate index entries

Optimization Strategies

Memory Mapping

For local archives:

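use memmap2::Mmap;

struct ArchiveReader {
    mmap: Mmap,
}

impl ArchiveReader {
    /// Returns the requested byte range, or `None` if it falls outside the mapping.
    pub fn read_file(&self, offset: u64, size: u32) -> Option<&[u8]> {
        let start = offset as usize;
        let end = start.checked_add(size as usize)?;
        // `get` avoids a panic when an index entry points past the end of the file
        self.mmap.get(start..end)
    }
}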

Index Caching

Keep frequently used indices in memory:

use std::collections::HashMap;
use std::sync::Arc;

use lru::LruCache; // Assumes the `lru` crate

struct IndexCache {
    indices: HashMap<String, Arc<ArchiveIndex>>,
    lru: LruCache<String, ()>,
}

Archive Validation

Checksum Verification

When checksums are present:

fn verify_file(data: &[u8], expected_checksum: &[u8; 16]) -> bool {
    let computed = md5::compute(data);
    computed.0 == *expected_checksum
}

Size Validation

Always verify extracted size matches expected:

fn check_size(decompressed: &[u8], expected_size: u32) -> Result<(), &'static str> {
    if decompressed.len() != expected_size as usize {
        return Err("Size mismatch");
    }
    Ok(())
}

Common Issues

  1. Key collisions: Truncated keys may collide (handle gracefully)
  2. Archive corruption: Verify checksums when available
  3. Missing indices: Some files may only exist as loose files
  4. Version mismatches: Handle different index versions
  5. Alignment padding: Account for alignment bytes

Archive Groups

Archive Groups are client-generated mega-indices that combine multiple CDN archive indices into a single lookup structure, reducing search time from scanning hundreds of individual .index files to a single binary search. They use 6-byte offset fields (2-byte archive index + 4-byte offset) and are identified by archive-group and patch-archive-group fields in CDN config.

See Archive-Groups for the full format specification.

File Organization

Typical CASC repository structure:

data/
├── config/           # Configuration files
├── data/            # Archive files
│   ├── 00/
│   │   ├── 00/{hash}.archive
│   │   └── ...
│   └── ff/
│       └── ff/{hash}.archive
├── indices/         # Index files
│   ├── {hash}.index
│   └── ...
└── patch/           # Patch archives

Version History

CDN Archive Index Format (.index files)

The CDN Archive Index format currently has only one version:

Version 1 (Current)

  • Footer Size: 28 bytes
  • Location: End of file
  • Features:
    • Variable-length encoding keys (footer’s ekey_length field)
    • 4KB chunk-based structure with table of contents
    • MD5 hash validation (footer hash and TOC hash)
    • Self-referential naming (filename = MD5 of footer)
    • Mixed endianness (element_count is little-endian, others big-endian)
    • Typical entry size: 24 bytes (16-byte key + 4-byte size + 4-byte offset)

Version Detection

The version field is at offset 8 in the 28-byte footer. All known CDN archive indices use version 1.

Implementation Status

  • cascette-formats: Full support for version 1 with parser
  • Archive-groups: Client-side mega-indices combine multiple CDN indices (6-byte offset variant)

Local Storage Index Format (.idx files)

The Local Storage Index (IDX Journal) format currently has only one version:

Version 7 (Current - IDX Journal v7)

  • Header Size: 16 bytes
  • Location: Start of file
  • Features:
    • Fixed 9-byte truncated content keys (space optimization)
    • 18-byte entries (9-byte key + 5-byte location + 4-byte size)
    • 256 bucket-based organization (0x00-0xFF)
    • Packed 5-byte location field (10-bit archive ID + 30-bit offset)
    • Jenkins hash validation
    • Mixed endianness (header little-endian, entries mixed)
    • Bucket algorithm: XOR first 9 bytes, then XOR nibbles
    • Filename format: {bucket:02x}{version:06x}.idx
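The packed 5-byte location field can be split with shifts and masks. This sketch assumes the five bytes are read as a single big-endian 40-bit value with the archive ID in the top 10 bits; that byte order is an assumption here, since this section only states that entry endianness is mixed. The name `unpack_location` is hypothetical.

```rust
/// Splits a packed 5-byte location into (archive ID, offset).
/// Assumes one big-endian 40-bit value: 10-bit archive ID, then 30-bit offset.
pub fn unpack_location(raw: [u8; 5]) -> (u16, u32) {
    // Widen the five bytes into a u64.
    let value = raw.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    let archive_id = (value >> 30) as u16; // top 10 bits
    let offset = (value & 0x3FFF_FFFF) as u32; // low 30 bits
    (archive_id, offset)
}
```

A 10-bit archive ID has a maximum value of 1023, and a 30-bit offset addresses up to 1 GiB within one data file.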

Version Detection

The version field is at offset 8 in the header (16-bit little-endian). The implementation validates version equals 7 and warns on unexpected versions.

Implementation Status

  • cascette-client-storage: Full support for version 7 with parser and builder
  • No earlier versions documented (version 7 is standard for modern CASC)

Key Differences Between Index Systems

| Feature | CDN Index (.index) | Local Index (.idx) |
|---------|--------------------|--------------------|
| Version | 1 (footer-based) | 7 (header-based) |
| Protocol | TACT (network) | CASC (local storage) |
| Key Type | Encoding keys | Content keys |
| Key Length | Variable (16 typical) | Fixed 9-byte truncated |
| Structure | Sequential chunks | Bucket algorithm |
| Validation | MD5 hash | Jenkins hash |
| Endianness | Mixed (mostly big) | Mixed (header little) |
| Entry Size | Variable (24 typical) | Fixed 18 bytes |
| Location | CDN download | Client Data/ directory |
| Crate | cascette-formats | cascette-client-storage |


Archive-Groups

Archive-groups are locally generated mega-indices that combine multiple CDN archive indices into a single unified lookup structure. They are created client-side by merging downloaded archive index files, never downloaded directly from the CDN. They are essential for Battle.net client compatibility and enable efficient content resolution.

Format Specification

Archive-groups use the same binary format as regular CDN archive indices with one critical difference:

| Field | Regular Index | Archive-Group |
|-------|---------------|---------------|
| Encoding Key | Variable (9-16 bytes) | 16 bytes |
| Offset | 4 bytes | 6 bytes |
| Size | 4 bytes | 4 bytes |

The 6-byte offset field contains:

  • Bytes 0-1: Archive index (big-endian u16)
  • Bytes 2-5: Offset within archive (big-endian u32)

Critical Findings - SOLVED

Archive Index Mapping Uses Hash-Based Assignment

CONFIRMED: ALL archive-groups use the full u16 range (0-65535) for archive indices:

archive_index = hash(encoding_key) % 65536

This explains why:

  • All archive-groups use indices 0-65535 despite only ~606 CDN archives existing
  • Archive 0 consistently receives 6-8% of entries (hash distribution)
  • The pattern is universal across all Battle.net installations
  • Archive-groups are generated locally using this deterministic hash-based assignment algorithm

CDN Configuration

Archive-groups are referenced in CDN config files by their hash:

archive-group = 6d08c5f69f6a2cf70a50cd40efdcd2fb
patch-archive-group = a5fb3ed088333348d93983d7e8693956

These hashes identify the locally generated archive-group files stored in Data/indices/. The client generates these files locally and stores them using the computed hash as the filename.

Size Characteristics

Archive-groups are significantly larger than regular indices:

  • Regular CDN indices: 4KB - 2MB
  • Archive-groups: 50MB - 150MB
  • Entry count: 2-5 million entries

Growth over time (WoW Classic):

  • Version 1.13.2: 54MB, 2.1M entries
  • Version 1.14.0: 73MB, 2.8M entries
  • Version 1.15.2: 126MB, 5.0M entries

Archive Index Distribution

Due to hash-based assignment, archive indices follow a predictable distribution:

  1. Archive 0: ~6-8% of entries (150K-350K entries)
  2. Archive 1: ~0.6% of entries (13K entries)
  3. Archive 2-65535: Distributed based on hash function

This distribution is consistent across all Battle.net installations.

Implementation Requirements

Detection

fn is_archive_group(data: &[u8]) -> bool {
    if data.len() < 28 {
        return false;
    }
    // Check offset_bytes field at position -16 from end (footer offset 12)
    data[data.len() - 16] == 6
}

Parsing

// For archive-groups with 6-byte offsets
// (`data` is the index buffer, `pos` the start of one entry's offset field)
let archive_index = u16::from_be_bytes([data[pos], data[pos + 1]]);
let offset = u32::from_be_bytes([data[pos + 2], data[pos + 3], data[pos + 4], data[pos + 5]]);

Content Resolution

When resolving content in a Battle.net-compatible installation:

  1. Look up encoding key in archive-group
  2. Extract 2-byte archive index from entry
  3. Map archive index to actual CDN archive (requires mapping table)
  4. Read content from archive at specified offset

Implementation Strategy for Cascette

To achieve binary-identical Battle.net installations:

Required Actions

  1. Generate Archive-Groups Locally

    • Parse CDN config to find all individual archive index hashes
    • Download all individual .index files from CDN
    • Merge them locally into unified archive-group structures
    • Store generated archive-groups in Data/indices/ using computed hash as filename
  2. Implement Hash-Based Archive Assignment

    • Use deterministic algorithm: archive_index = hash(encoding_key) % 65536
    • Ensure identical results to Battle.net client generation
    • Apply to all entries during archive-group creation
  3. Implement Archive Index Mapping

    • Create mapping table: archive_group_index -> actual_cdn_archive_hash
    • The 65536 virtual indices map to ~606 actual CDN archives
    • Use for content resolution when accessing actual archive data
  4. Support Both Types

    • Generate regular archive-group for main content from base archive indices
    • Generate patch-archive-group for patch content from patch archive indices
    • Both use same local generation process with 6-byte offsets

Why Binary-Identical Matters

For cascette to be a trustworthy Battle.net replacement:

  1. Trust: Users need confidence we produce EXACTLY what Battle.net would
  2. Compatibility: Some third-party tools may depend on exact format
  3. Verification: Binary matching allows easy validation
  4. Completeness: Understanding the full algorithm proves our analysis

Archive-groups are identified by the offset_bytes field in the footer:

Footer (28 bytes):
  [0:8]   TOC hash: MD5(toc_keys || block_hashes)[:footer_hash_bytes]
  [8]     Version (always 1)
  [9:11]  Reserved
  [11]    Page size in KB
  [12]    Offset bytes (4 for regular, 6 for archive-group)
  [13]    Size bytes (always 4)
  [14]    Key bytes (16 for archive-groups)
  [15]    Footer hash bytes
  [16:20] Entry count (little-endian u32)
  [20:28] Footer hash

Example Archive-Group Entry

Entry from 6d08c5f69f6a2cf70a50cd40efdcd2fb.index:
  Key: 000003bafc39011c91accae47b94fb2d (16 bytes)
  Archive: 0 (from first 2 bytes of offset field)
  Offset: 0x5dfd00d7 (from last 4 bytes of offset field)
  Size: 92,211,754 bytes

This entry indicates:

  • Content is in archive index 0
  • Starts at offset 0x5dfd00d7 in that archive
  • Compressed size is 92,211,754 bytes

Validation

Archive-groups contain entries for all game content:

  • Every encoding key should be findable
  • Archive indices use full u16 range (0-65535)
  • Entries are sorted by encoding key for binary search
  • Total entries match the entry_count in footer

Battle.net Client Behavior

The Battle.net client:

  1. Downloads individual archive index files during installation
  2. Generates archive-group locally by merging multiple archive indices
  3. Stores generated archive-group in Data/indices/{hash}.index
  4. Uses hash-based assignment algorithm for consistent archive index mapping
  5. Uses archive-group for all subsequent content lookups

Common Issues

Incorrect Detection

  • Checking file size alone is insufficient
  • Must verify offset_bytes == 6 in footer
  • Some patch archives are large but not archive-groups

Index Mapping Confusion

  • Archive index in archive-group ≠ CDN archive position
  • Indices 0-65535 map to ~600 actual archives
  • Mapping requires modulo or lookup table

Parser Assumptions

  • Never hardcode 9-byte keys for archive-groups
  • Archive-groups always use 16-byte keys
  • Respect the key length field in the footer (ekey_length, byte 14)


TVFS (TACT Virtual File System)

TVFS is the virtual file system introduced in WoW 8.2 (CASC v3), providing a unified interface for managing content across multiple products and build configurations. It replaces direct file path mappings with a more flexible namespace-based system.

How TVFS is Accessed

TVFS manifests are referenced through vfs-* fields in BuildConfig files:

  1. BuildConfig contains vfs-root and numbered vfs-1 through vfs-N fields
  2. Each VFS field contains two hashes: content key and encoding key
  3. The encoding key (second hash) is used to fetch the TVFS manifest from CDN
  4. The manifest is BLTE-encoded and must be decompressed
  5. Once decoded, the manifest describes the virtual file system structure

Example from BuildConfig:

vfs-root = fd2ea24073fcf282cc2a5410c1d0baef 14d8c981bb49ed169e8558c1c4a9b5e5
vfs-root-size = 50071 33487

Modern builds contain 1,500+ VFS entries for different product/region/platform combinations.

Overview

TVFS organizes content into namespaces rather than per-build file trees. This allows multiple products and regions to share common assets through a single content-addressed storage layer, with deduplication across products.

Architecture

Namespace Hierarchy

TVFS Root
├── Product Namespace (e.g., "wow")
│   ├── Build Namespace (e.g., "1.15.7.61582")
│   │   ├── Root Files
│   │   └── Content Trees
│   └── Shared Namespace
│       └── Common Assets
└── Global Namespace
    └── Cross-Product Assets

File Structure

TVFS manifest is BLTE-encoded:

[BLTE Container]
  [Header]
  [Namespace Definitions]
  [Directory Entries]
  [File Entries]
  [Content Mappings]

Binary Format

Based on analysis of 5 TVFS samples from WoW builds 11.0.2.56313 through 11.2.0.62748.

TVFS Header

struct TvfsHeader {  // 38 bytes minimum, 46 with EST table
    uint8_t  magic[4];           // "TVFS" (0x54564653)
    uint8_t  format_version;     // Format version (1; agent accepts <= 1)
    uint8_t  header_size;        // Header size (not read by agent parser)
    uint8_t  ekey_size;          // EKey size (always 9)
    uint8_t  pkey_size;          // PKey size (always 9)
    uint32_t flags;              // Format flags (big-endian)
    uint32_t path_table_offset;  // Offset to path table (big-endian)
    uint32_t path_table_size;    // Size of path table (big-endian)
    uint32_t vfs_table_offset;   // Offset to VFS table (big-endian)
    uint32_t vfs_table_size;     // Size of VFS table (big-endian)
    uint32_t cft_table_offset;   // Offset to container file table (big-endian)
    uint32_t cft_table_size;     // Size of container file table (big-endian)
    uint16_t max_depth;          // Maximum path depth
    // Optional EST fields (only present if TVFS_FLAG_ENCODING_SPEC is set)
    uint32_t est_table_offset;   // Encoding spec table offset
    uint32_t est_table_size;     // Encoding spec table size
};

Verified Header Properties:

  • Magic bytes: Always “TVFS” (0x54564653) in ASCII

  • Format version: Always 1 across all samples

  • Header size: 38 bytes minimum, 46 with EST table

  • EKey size: 9 bytes (TACT standard)

  • PKey size: 9 bytes (TACT standard)

  • All multi-byte integer fields are big-endian (NGDP standard)

Format Flags (Implementation Details):

// TVFS format flags
const TVFS_FLAG_INCLUDE_CKEY: u32 = 0x01;      // Include content keys
const TVFS_FLAG_ENCODING_SPEC: u32 = 0x02;     // Encoding spec table (EST) present
const TVFS_FLAG_PATCH_SUPPORT: u32 = 0x04;     // Patch support enabled

  • Value 7 (0x7): Include C-key + Encoding spec + Patch support (all features)

  • EST Table Present: When bit 1 (0x02) is set. The agent checks flags & 2 for encoding specifier presence.

  • Header Size: 38 bytes minimum (without EST), 46 bytes with EST table fields
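The header layout and the EST flag check can be sketched as a parser. This is an illustration under stated assumptions and not the cascette-formats API: the names `TvfsHeader`, `be32`, and `parse_tvfs_header` are hypothetical, table locations are stored as (offset, size) tuples for brevity, and the header_size byte is kept but not used, mirroring the note that the agent does not read it.

```rust
#[derive(Debug, PartialEq)]
pub struct TvfsHeader {
    pub format_version: u8,
    pub header_size: u8,
    pub ekey_size: u8,
    pub pkey_size: u8,
    pub flags: u32,
    pub path_table: (u32, u32), // (offset, size)
    pub vfs_table: (u32, u32),
    pub cft_table: (u32, u32),
    pub max_depth: u16,
    pub est_table: Option<(u32, u32)>, // Present only when flag 0x02 is set
}

/// Reads a big-endian u32 at byte position `at`.
fn be32(d: &[u8], at: usize) -> u32 {
    u32::from_be_bytes([d[at], d[at + 1], d[at + 2], d[at + 3]])
}

pub fn parse_tvfs_header(d: &[u8]) -> Option<TvfsHeader> {
    // 38 bytes is the minimum header; versions above 1 are rejected.
    if d.len() < 38 || &d[0..4] != b"TVFS" || d[4] > 1 {
        return None;
    }
    let flags = be32(d, 8);
    let est_table = if flags & 0x02 != 0 {
        if d.len() < 46 {
            return None;
        }
        Some((be32(d, 38), be32(d, 42)))
    } else {
        None
    };
    Some(TvfsHeader {
        format_version: d[4],
        header_size: d[5],
        ekey_size: d[6],
        pkey_size: d[7],
        flags,
        path_table: (be32(d, 12), be32(d, 16)),
        vfs_table: (be32(d, 20), be32(d, 24)),
        cft_table: (be32(d, 28), be32(d, 32)),
        max_depth: u16::from_be_bytes([d[36], d[37]]),
        est_table,
    })
}
```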

Sample Analysis Results:

  • File sizes: 49,896 - 50,844 bytes (decompressed)

  • All files use identical header format

  • Table offsets and sizes are consistent with file structure

  • Two retail builds (11.2.0.62706 and 11.2.0.62748) are byte-identical

Table Structure

Path Table (PathTableOffset + PathTableSize):

Recursive prefix tree (trie) encoding file paths. Each entry has:

  • Optional 0x00 path separator bytes (before/after name fragments)
  • Length-prefixed name fragment (1-byte length + N bytes)
  • 0xFF marker followed by 4-byte big-endian NodeValue:
    • Bit 31 set: folder node, lower 31 bits = folder data length (includes the 4-byte NodeValue). Children are inline within that byte range.
    • Bit 31 clear: file node, value = byte offset into the VFS table.

Maximum depth is tracked in the header.
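The NodeValue rule above (bit 31 selects folder or file) can be sketched as a small decoder. The enum and function names are hypothetical; the sketch only interprets the 4 bytes that follow a 0xFF marker and does not walk the trie.

```rust
#[derive(Debug, PartialEq)]
pub enum NodeValue {
    /// Folder node: byte length of the folder data, including the 4-byte NodeValue.
    Folder { data_len: u32 },
    /// File node: byte offset into the VFS table.
    File { vfs_offset: u32 },
}

/// Decodes the 4-byte big-endian value that follows a 0xFF marker.
pub fn decode_node_value(bytes: [u8; 4]) -> NodeValue {
    let raw = u32::from_be_bytes(bytes);
    if raw & 0x8000_0000 != 0 {
        NodeValue::Folder { data_len: raw & 0x7FFF_FFFF }
    } else {
        NodeValue::File { vfs_offset: raw }
    }
}
```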

VFS Table (VfsTableOffset + VfsTableSize):

Span-based entries addressed by byte offset from path table NodeValues. Each entry has:

  • span_count (1 byte): 1-224 = file entry, 225-254 = other, 255 = deleted
  • Per span (repeated span_count times):
    • file_offset (4 bytes BE): offset within the referenced content
    • span_length (4 bytes BE): content size of this span
    • cft_offset (CftOffsSize bytes BE): byte offset into the CFT

CftOffsSize is computed from cft_table_size using GetOffsetFieldSize: >0xFFFFFF = 4 bytes, >0xFFFF = 3 bytes, >0xFF = 2 bytes, else 1 byte.
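The width rule can be written out directly. This sketch mirrors the thresholds stated above; the Rust names `offset_field_size` and `vfs_span_size` are hypothetical stand-ins, not functions from cascette-rs.

```rust
/// Number of bytes needed for an offset into a table of `table_size` bytes,
/// following the thresholds described for GetOffsetFieldSize.
pub fn offset_field_size(table_size: u32) -> usize {
    if table_size > 0xFF_FFFF {
        4
    } else if table_size > 0xFFFF {
        3
    } else if table_size > 0xFF {
        2
    } else {
        1
    }
}

/// Size in bytes of one VFS span: file_offset + span_length + cft_offset.
pub fn vfs_span_size(cft_table_size: u32) -> usize {
    4 + 4 + offset_field_size(cft_table_size)
}
```

For the sample container table size of 29,645 bytes, the CFT offset is 2 bytes wide, so each span occupies 10 bytes.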

Container File Table (CftTableOffset + CftTableSize):

Fixed-stride entries addressed by byte offset from VFS span cft_offset values. Entry layout depends on header flags:

  • EKey (ekey_size bytes): encoding key
  • EncodedSize (4 bytes BE): encoded (compressed) size
  • CKey (pkey_size bytes): content key (if TVFS_FLAG_INCLUDE_CKEY)
  • est_index (EstOffsSize bytes BE): EST entry index (if TVFS_FLAG_ENCODING_SPEC)
  • patch_offset (CftOffsSize bytes BE): patch entry offset (if TVFS_FLAG_PATCH_SUPPORT)

EstOffsSize is computed from est_table_size using the same GetOffsetFieldSize function as CftOffsSize.
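Because the CFT uses a fixed stride, the entry size can be computed once from the header. The following is a sketch under stated assumptions: `CftLayout` is a hypothetical helper type, and the value 0x01 for TVFS_FLAG_INCLUDE_CKEY is inferred from flag value 7 meaning "all features" rather than taken from the flag listing.

```rust
const TVFS_FLAG_INCLUDE_CKEY: u32 = 0x01; // assumed: inferred from 7 = 0x01 | 0x02 | 0x04
const TVFS_FLAG_ENCODING_SPEC: u32 = 0x02;
const TVFS_FLAG_PATCH_SUPPORT: u32 = 0x04;

/// Header-derived values that determine the CFT entry layout (hypothetical type).
pub struct CftLayout {
    pub flags: u32,
    pub ekey_size: usize,
    pub pkey_size: usize,
    /// Width of est_index, from GetOffsetFieldSize(est_table_size).
    pub est_offs_size: usize,
    /// Width of patch_offset, from GetOffsetFieldSize(cft_table_size).
    pub cft_offs_size: usize,
}

impl CftLayout {
    /// Size in bytes of one CFT entry.
    pub fn entry_size(&self) -> usize {
        // EKey and the 4-byte EncodedSize are always present.
        let mut size = self.ekey_size + 4;
        if self.flags & TVFS_FLAG_INCLUDE_CKEY != 0 {
            size += self.pkey_size;
        }
        if self.flags & TVFS_FLAG_ENCODING_SPEC != 0 {
            size += self.est_offs_size;
        }
        if self.flags & TVFS_FLAG_PATCH_SUPPORT != 0 {
            size += self.cft_offs_size;
        }
        size
    }
}
```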

Encoding Specifier Table (EST) (Optional, if encoding spec flag is set):

  • Contains null-terminated encoding spec strings (same format as the ESpec table in the encoding file)

  • Only present if flag bit 1 (0x02) is set

  • Required for writing files to underlying storage

  • Parsed from est_table_offset for est_table_size bytes

Sample Table Sizes (Build 11.2.0.62748):

Path Table:      Offset 46,     Size 11,814 bytes
VFS Table:       Offset 41,527, Size 9,317 bytes
Container Table: Offset 11,882, Size 29,645 bytes

Format Analysis Status

Verified against CascLib and CDN data (WoW Retail, Classic, Classic Era):

  • Header format, magic bytes, flags, and table offsets
  • Path table recursive prefix tree with 0xFF NodeValue markers
  • VFS span-based entries with variable-width CFT offsets
  • CFT fixed-stride entries with flag-dependent fields
  • EST null-terminated encoding spec strings
  • Round-trip parse/build produces structurally equivalent output

Usage

Parsing a TVFS Manifest

use cascette_formats::tvfs::TvfsFile;

// From decompressed data
let tvfs = TvfsFile::parse(&data)?;

// From BLTE-encoded CDN data
let tvfs = TvfsFile::load_from_blte(&blte_data)?;

Enumerating Files

// All file entries from the path table
for file in &tvfs.path_table.files {
    println!("{} -> VFS offset {}", file.path, file.vfs_offset);
}

// With VFS entry details
for (file, vfs_entry) in tvfs.enumerate_files() {
    if let Some(entry) = vfs_entry {
        for span in &entry.spans {
            println!("{}: offset={}, length={}, cft_offset={}",
                file.path, span.file_offset, span.span_length, span.cft_offset);
        }
    }
}

Resolving a Path

// Resolve path -> VFS entry -> CFT entry (EKey)
if let Some(container_entry) = tvfs.resolve_path("path/to/file") {
    println!("EKey: {}", container_entry.ekey_hex());
    if let Some(ckey) = container_entry.content_key_hex() {
        println!("CKey: {}", ckey);
    }
}

Building a TVFS Manifest

use cascette_formats::tvfs::TvfsBuilder;

let mut builder = TvfsBuilder::with_flags(0x07); // CKEY + EST + PATCH
builder.add_est_spec("b:256K*=z".to_string());
builder.add_file(
    "path/to/file".to_string(),
    [0x01; 9],        // ekey
    1024,             // encoded_size
    2048,             // content_size
    Some([0x02; 16]), // content_key
);
let data = builder.build()?;

NGDP Configuration File Formats

This document describes the configuration file formats used in NGDP for managing product versions, CDN endpoints, and content distribution.

Overview

NGDP uses five configuration file types:

  1. Build Configuration - Defines build metadata and system file references
  2. CDN Configuration - Lists CDN servers and available archives
  3. Patch Configuration - Contains delta update information
  4. Keyring Configuration - Encryption keys for Salsa20 decryption
  5. Product Configuration - Client installation and platform metadata

Configuration File Access

Configuration files are accessed through CDN endpoints using content-addressed paths derived from hashes returned by the Ribbit API.

Path Structure

Configuration files use a two-level directory structure for efficient CDN distribution:

http://<cdn-host>/<path>/<type>/<hash[0:2]>/<hash[2:4]>/<full-hash>

Where:

  • <cdn-host>: CDN server hostname

  • <path>: Base path from CDN response (e.g., tpr/wow)

  • <type>: Content type (config, data, patch)

  • <hash[0:2]>: First 2 characters of hash

  • <hash[2:4]>: Characters 3-4 of hash (positions 2-3 in 0-indexed)

  • <full-hash>: Complete hash value

Example:

# Build config for wow_classic_era 1.15.7.61582
# Hash: ae66faee0ac786fdd7d8b4cf90a8d5b9
# Note: hash[0:2] = "ae", hash[2:4] = "66"
http://cdn.arctium.tools/tpr/wow/config/ae/66/ae66faee0ac786fdd7d8b4cf90a8d5b9
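The path structure can be sketched as a small helper. The function name `cdn_url` is hypothetical, and lowercasing the hash is an assumption made here for safety, not something the format description states.

```rust
/// Builds a CDN URL of the form
/// http://<cdn-host>/<path>/<type>/<hash[0:2]>/<hash[2:4]>/<full-hash>.
/// Returns `None` if `hash` is not a hex string of at least 4 characters.
pub fn cdn_url(host: &str, base_path: &str, kind: &str, hash: &str) -> Option<String> {
    // Assumption: CDN paths use lowercase hex.
    let hash = hash.to_ascii_lowercase();
    if hash.len() < 4 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(format!(
        "http://{}/{}/{}/{}/{}/{}",
        host,
        base_path.trim_matches('/'),
        kind,
        &hash[0..2],
        &hash[2..4],
        hash
    ))
}
```

Calling it with `cdn.arctium.tools`, `tpr/wow`, `config`, and the hash from the example reproduces the URL shown above.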

Build Configuration

Build configurations define build-specific metadata and reference all system files required for a build.

Format

Key-value pairs, one per line, with = delimiter (space-equals-space). Values may contain multiple space-separated tokens (e.g., content key + encoding key).

Common Keys

| Key | Description | Example |
|-----|-------------|---------|
| root | Root file content key (NOT for direct CDN fetch) | ea8aefdebdbd6429da905c8c6a2b1813 |
| install | Install manifest: content key + encoding key | 54c189d60033f93f42e7b91165e7de1c a9dcee49ab3f952d69441eb3fd91c159 |
| encoding | Encoding file: content key + encoding key (use 2nd for CDN) | b07b881f4527bda7cf8a1a2f99e8622e bbf06e7476382cfaa396cff0049d356b |
| encoding-size | Sizes for encoding file versions | 14004322 14003043 |
| download | Download manifest: content key + encoding key | 42a7bb33cd1e9a7b72bef6ee14719b58 53ba96f0965adc306d2d0cf3b457949c |
| size | Size manifest: content key + encoding key | d1d9e612a645cc7a7e4b42628bde21ce 0d5704735f4985e555907a7e7647099a |
| patch | Patch file content key | 658506593cf1f98a1d9300c418ee5355 |
| patch-config | Patch configuration hash (fetch separately) | 17f5bbcb7eae2fc8fb3ea545c65f74d4 |
| patch-index | Patch index files | 3806f4c7b1f179ce976d7685f9354025 eb5758bd78805f0aabac15cf44ea767c |
| patch-size | Size of patch file | 22837 |
| build-name | Human-readable build identifier | WOW-55646patch1.15.3_ClassicRetail |
| build-uid | Unique build identifier | wow_classic_era |
| build-product | Product identifier | WoW |
| build-playbuild-installer | Installer build number | ngdp:wow_classic_era:55646 |
| build-partial-priority | Partial download priorities | Space-separated list |
| build-num | Build number | 61582 |
| build-num-retail | Retail build number | 61582 |
| build-attributes | Build attribute metadata | Attribute string |
| build-file-db | File database for containerless builds | Hash value |
| build-file-db-size | Size of file database | Size in bytes |
| client-version | Client version string | Version string |
| feature-placeholder | Feature placeholder flag | true or absent |
| feature-use-hardlinks | Enable hard link support | true or absent |
| no-frame-encoding | Disable frame encoding (sets v3.0.0) | true or absent |
| vfs-root-espec | ESpec for VFS root manifest | ESpec string |
| install-high-ver | High-version install manifest hash | Hash value |
| install-high-ver-size | Size of high-version install | Size in bytes |
| key-layout-index-bits | Static key layout index bits | Numeric value |

VFS (Virtual File System) Keys

Modern WoW builds (8.2+) include VFS fields that reference TVFS (TACT Virtual File System) manifests:

| Key | Description | Example |
|-----|-------------|---------|
| vfs-root | Main TVFS manifest: content key + encoding key | fd2ea24073fcf282cc2a5410c1d0baef 14d8c981bb49ed169e8558c1c4a9b5e5 |
| vfs-root-size | Sizes for TVFS root manifest | 50071 33487 |
| vfs-1 through vfs-N | Additional TVFS manifests for different products/regions | Same format as vfs-root |
| vfs-N-size | Size for corresponding VFS manifest | Same format as vfs-root-size |
| vfs-N-espec | Encoding spec for corresponding VFS manifest | ESpec string |

Important: Each vfs-* field points to a TVFS manifest file that contains the virtual file system structure. These manifests are BLTE-encoded and fetched using the encoding key (second hash). See TVFS documentation for manifest format details.

Modern builds can have 1,500+ VFS fields representing different:

  • Product variants (retail, PTR, beta)

  • Language/region combinations

  • Platform-specific configurations

  • Feature flags and optional content

Example

# Build Configuration for wow_classic_era 1.15.7.61582
# URL: http://cdn.arctium.tools/tpr/wow/config/ae/66/ae66faee0ac786fdd7d8b4cf90a8d5b9
root = ea8aefdebdbd6429da905c8c6a2b1813
install = 54c189d60033f93f42e7b91165e7de1c a9dcee49ab3f952d69441eb3fd91c159
install-size = 23038 22281
download = 42a7bb33cd1e9a7b72bef6ee14719b58 53ba96f0965adc306d2d0cf3b457949c
download-size = 5606744 4818287
size = d1d9e612a645cc7a7e4b42628bde21ce 0d5704735f4985e555907a7e7647099a
size-size = 3637629 3076687
encoding = b07b881f4527bda7cf8a1a2f99e8622e bbf06e7476382cfaa396cff0049d356b
encoding-size = 14004322 14003043
patch-index = 5472ee24b5b9d148acfd2a436fc514be 76ce88ecb704dc93849def9fb489a6fb
patch-index-size = 16783 6591
patch = 4f185b4a837d4a363b2490432aaef092
patch-size = 11017
patch-config = 474b9630df5b46df5d98ec27c5f78d07
build-name = WOW-61582patch1.15.7_ClassicRetail
build-uid = wow_classic_era
build-product = WoW
build-playbuild-installer = ngdptool_casc2
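A config in this format can be read with a few lines of string handling. The sketch below is illustrative only: `parse_config` is a hypothetical name, later duplicates of a key overwrite earlier ones, and the backslash line continuations used for display in this document are not handled.

```rust
use std::collections::HashMap;

/// Parses "key = value value ..." lines into a map of key -> tokens.
/// Blank lines and lines starting with '#' are skipped.
pub fn parse_config(text: &str) -> HashMap<String, Vec<String>> {
    let mut map = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // The delimiter is space-equals-space.
        if let Some(idx) = line.find(" = ") {
            let key = line[..idx].trim().to_string();
            let values: Vec<String> = line[idx + 3..]
                .split_whitespace()
                .map(String::from)
                .collect();
            map.insert(key, values);
        }
    }
    map
}
```

With the example above, `parse_config(text)["encoding"]` would hold two tokens: the content key followed by the encoding key.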

Critical Implementation Note

ENCODING KEY VS CONTENT KEY:

  • Most build config entries have TWO hashes: <content-key> <encoding-key>

  • The content key (first hash) is the unencoded file identifier

  • The encoding key (second hash) is what you use for CDN fetches

  • EXCEPTION: The encoding file itself can be fetched directly using its encoding key

File Fetch Process:

  1. Fetch encoding file using its encoding key: bbf06e7476382cfaa396cff0049d356b
  2. Parse encoding file to find encoding keys for other files
  3. Use those encoding keys to fetch files from CDN
  4. The root file CANNOT be fetched using ea8aefdebdbd6429da905c8c6a2b1813 directly
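The two-hash rule can be sketched as a helper that decides how a manifest entry such as root, install, download, or encoding should be fetched. `FetchKey` and `fetch_key` are hypothetical names used only for this illustration.

```rust
/// How a build-config value can be turned into a CDN fetch.
#[derive(Debug, PartialEq, Eq)]
pub enum FetchKey<'a> {
    /// A second hash is present: fetch from the CDN with this encoding key.
    Encoding(&'a str),
    /// Only a content key is present: resolve it through the encoding file first.
    NeedsLookup(&'a str),
}

/// Classifies a build-config value of the form "<content-key> [<encoding-key>]".
pub fn fetch_key<'a>(value: &'a str) -> Option<FetchKey<'a>> {
    let mut parts = value.split_whitespace();
    let ckey = parts.next()?;
    Some(match parts.next() {
        Some(ekey) => FetchKey::Encoding(ekey),
        None => FetchKey::NeedsLookup(ckey),
    })
}
```

For the example build, the encoding entry yields `Encoding("bbf06e...")`, while the root entry yields `NeedsLookup("ea8aef...")`.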

Notes

  • Paired *-size entries list the content (decoded) size followed by the encoded size, in the same order as the key pair

  • Patch-config reference enables delta updates between builds

  • Build-partial-priority lists files for streaming installation

Static Key Layouts

Build configs can contain key-layout-<number> entries that define static data layout schemes. Each key layout has sub-fields:

  • Chunk Bits: Number of bits for chunk addressing
  • Archive Bits: Number of bits for archive addressing
  • Offset Bits: Number of bits for offset addressing
  • Alignment: Data alignment requirement

The key-layout-index-bits field in the build config specifies the number of index bits for the static key layout system.

Chunk System

Build configs can reference chunk-<number> entries. Chunks are associated with archives and use a bits-based addressing system. The agent validates that chunk identifiers follow the chunk-<number> naming pattern.

Build configs can contain hard link entries. The agent validates the format of these entries and uses them for storage optimization on file systems that support hard links.

Manifest Validation

The agent validates that each manifest type (download, install, size, encoding) has matching C-Key/C-Size and E-Key/E-Size pairs. If a size field is specified, the corresponding key must also be present.
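A rough equivalent of this check, assuming the config has already been parsed into a map of key to whitespace-separated tokens, might look like the following. The function name and error strings are hypothetical, and the agent's actual validation may be stricter.

```rust
use std::collections::HashMap;

/// Checks that every size listed for a manifest has a corresponding key
/// and that every size is a valid unsigned integer.
pub fn validate_manifest_sizes(cfg: &HashMap<String, Vec<String>>) -> Result<(), String> {
    for name in ["download", "install", "size", "encoding"].iter() {
        let keys = cfg.get(*name).map(Vec::len).unwrap_or(0);
        let size_field = format!("{}-size", name);
        let sizes = match cfg.get(&size_field) {
            Some(sizes) => sizes,
            None => continue,
        };
        // A size without a matching key (C-Size without C-Key, or E-Size
        // without E-Key) is rejected.
        if sizes.len() > keys {
            return Err(format!("{} has more entries than {}", size_field, name));
        }
        if sizes.iter().any(|s| s.parse::<u64>().is_err()) {
            return Err(format!("{} contains a non-numeric value", size_field));
        }
    }
    Ok(())
}
```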

CDN Configuration

CDN configurations list available CDN servers and archive files.

CDN Configuration Format

Key-value pairs with special handling for multi-value keys.

Keys

| Key | Description | Format |
|-----|-------------|--------|
| archives | List of archive hashes | Space-separated |
| archive-group | Group identifier for archives | Single hash |
| patch-archives | List of patch archive hashes | Space-separated |
| patch-archive-group | Group identifier for patch archives | Single hash |
| file-index | File index hash | Single hash |
| file-index-size | Size of file index | Integer |
| patch-file-index | Patch file index hash | Single hash |
| patch-file-index-size | Size of patch file index | Integer |
| archives-index-size | Sizes of archive index files | Space-separated integers |
| archive-group-index-size | Size of archive group index | Integer |
| patch-archives-index-size | Sizes of patch archive index files | Space-separated integers |
| patch-archive-group-index-size | Size of patch archive group index | Integer |
| builds | Reference to builds using this CDN config | Space-separated |

CDN Configuration Example

# CDN Configuration for wow_classic_era 1.15.7.61582
# URL: http://cdn.arctium.tools/tpr/wow/config/63/ee/63eee50d456a6ddf3b630957c024dda0
# (Showing first 10 archives of 1000+)
archives = 0017a402f556fbece46c38dc431a2c9b 003b147730a109e3a480d32a54280955 \
  00b79cc0eebdd26437c7e92e57ac7f5c 00e43d6a55fe497ebaecece75c464913 \
  00f71443fef647344027dd37beda651f 0105f03cb8b8faceda8ea099c2f2f476 \
  0128ec2c42df9e7ac7b58a54ad902147 01794f476dce0d0adeb975eaff4ff850 \
  01df479cca2ad2a8991bac020db5287e 01f0908f6ece2f26d918d1665f919222
archive-group = 58a3c9e02c964b0ec9dd6c085df99a77
patch-archives = 01c87e5f5e87ffc088c3fe20a7e332ce 0239bc973b31a4e52e8c96652a14b9e0 \
  034e2e6e0e5cdecb0f0bc07e87f0e074 04f8e6c8cbfbd6e9fd3e9ccbcd95e53a \
  0662e1cf69dbd0c6c10e7e3e6303b8cf 0bffd45f01e8ad33731f973bb96f3db1 \
  0d17c61fa98e6db91e14e0b24c8bc9f9 0d47f019c36e88c00fc43b3fe973f3d1 \
  101e4f7b592c12bf3c436d3b95e38b8f 1027ab37f63c039a8a3dd8a039e43e81
patch-archive-group = de09c9cd5f93c4e4f6f1f0f4a8edb9c0
file-index = fb37bc7303bae99d6c57e96a079e2c77
file-index-size = 34236152
patch-file-index = eb99f93d5c8dbdbb652f1d71da9c7de6
patch-file-index-size = 5015068
builds = ae66faee0ac786fdd7d8b4cf90a8d5b9

Archive Management

  • Archives are immutable once created

  • New content creates new archives

  • Archive-group combines multiple archives for efficient access

  • File-index provides fast lookups across all archives

Patch Configuration

Patch configurations define delta updates between builds. They are referenced within build configurations using the patch-config field and contain detailed patch entry definitions.

Access Pattern

Patch configs are accessed through:

  1. Fetch build config
  2. Extract patch-config hash from build config
  3. Fetch patch config using standard config path structure

Patch Configuration Format

Text format with metadata and multiple patch-entry lines.

Patch Entry Format

patch-entry = <type> <content-key> <size> <encoding-key> <encoded-size> \
  [compression-info] [additional-keys...]

Fields

| Field | Description |
|-------|-------------|
| type | File type (encoding, install, download, size, vfs:*) |
| content-key | Target content key |
| size | Target file size |
| encoding-key | Encoded version key |
| encoded-size | Encoded file size |
| compression-info | Compression blocks (e.g., b:{11=n,4813402=n,793331=z}) |
| additional-keys | Alternative encoding keys and sizes |

The agent validates patch-entry fields including target ESpec validation. Patch config parsing uses structured per-entry validation.

Patch Configuration Example

# Patch Configuration for wow_classic 1.13.7.38631
# URL: http://cdn.arctium.tools/tpr/wow/config/17/f5/17f5bbcb7eae2fc8fb3ea545c65f74d4
# (Showing metadata and sample entries)

# Patch Configuration

patch = 658506593cf1f98a1d9300c418ee5355
patch-size = 22837

patch-entry = download 6d616efdfd334916898276805f043927 6113132 \
  64332f9899b6d42a939fa3e02080bf33 5528795 b:{16=n,5524659=n,588457=z} \
  0a45352357be8ddca09749ec421bbb48 6112126 50ac209d796a11818da1429d6cb69c60 12502
patch-entry = encoding fcf166e21580ee48497b4d85e433b900 13084283 \
  716906f960db61ea62f07f7e9697127d 13082541 \
  b:{22=n,2574=z,61216=n,7835648=n,40192=n,5144576=n,*=z} \
  5905362dbda48cebbea7c80d05ef6c60 13084283 ce2c3294ca7e37aa3be1f227bdc9072a 89156
patch-entry = install 179088c6b3495b1a9dec3715e77834e1 15565 \
  a75d4aa7e38dff6a1ddc59bd80c2ad3c 15197 b:{610=z,14955=n} \
  f66d038c20f580be307f4645c7b5d3f2 15633 072a9339d594a00c884ffea987381883 486
patch-entry = size 5841844a1a1ad48eaeb756c716869bf5 3248493 \
  d06fc7a7e4b5d8fb138a2ee27f54674f 2878957 b:{15=n,588457=z,64K*=n} \
  2061f6427c842d01d9445d1bcc58d65b 3247949 daccd8bf9f2719ea9dbbb57991a03ed7 452303

Compression Info Format

The b:{...} notation describes block compression:

  • n = uncompressed block

  • z = zlib compressed block

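  • Numbers are block sizes in bytes, measured on the decoded content. In the download entry above they sum to the content size: 16 + 5524659 + 588457 = 6113132

  • * = all remaining data

  • 64K* = repeated 64 KB blocks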
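The notation can be parsed with a short routine. The sketch below covers only the n and z methods in the b: form shown here, not the full ESpec grammar. The type and function names are hypothetical, and treating K and M as multiples of 1024 is an assumption.

```rust
/// Size part of one block in a b:{...} spec.
#[derive(Debug, PartialEq, Eq)]
pub enum BlockSize {
    /// "588457": one block of exactly this many bytes.
    Fixed(u64),
    /// "64K*": blocks of this size, repeated.
    Repeated(u64),
    /// "*": everything that remains.
    Remainder,
}

/// Parses specs such as "b:{16=n,5524659=n,588457=z}" or "b:256K*=z"
/// into (size, method) pairs, where method is 'n' or 'z'.
pub fn parse_block_spec(spec: &str) -> Option<Vec<(BlockSize, char)>> {
    let body = spec.strip_prefix("b:")?;
    // Braces are optional when there is a single block.
    let body = body
        .strip_prefix('{')
        .and_then(|b| b.strip_suffix('}'))
        .unwrap_or(body);
    body.split(',').map(parse_block).collect()
}

fn parse_block(part: &str) -> Option<(BlockSize, char)> {
    let idx = part.find('=')?;
    let method = match &part[idx + 1..] {
        "n" => 'n',
        "z" => 'z',
        _ => return None,
    };
    let size = &part[..idx];
    let size = if size == "*" {
        BlockSize::Remainder
    } else if let Some(n) = size.strip_suffix('*') {
        BlockSize::Repeated(parse_size(n)?)
    } else {
        BlockSize::Fixed(parse_size(size)?)
    };
    Some((size, method))
}

fn parse_size(s: &str) -> Option<u64> {
    // Assumption: K = 1024 bytes, M = 1024 * 1024 bytes.
    let (digits, mult) = if let Some(d) = s.strip_suffix('K') {
        (d, 1024)
    } else if let Some(d) = s.strip_suffix('M') {
        (d, 1024 * 1024)
    } else {
        (s, 1)
    };
    digits.parse::<u64>().ok().map(|n| n * mult)
}
```

Summing the `Fixed` sizes of a spec without wildcards gives the content size of the entry, which is a cheap consistency check when reading patch configs.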

Entry Types

Patch configs commonly include:

  • System files: download, encoding, install, size, patch-index

  • VFS entries: vfs:* with hexadecimal identifiers (e.g., vfs:000000040000::)

  • Metadata: patch and patch-size fields for the patch file itself

Availability

Patch configs are found in:

  • Classic WoW builds (1.13.x through 5.5.x)

  • Older retail builds (pre-8.0)

  • Rarely in modern builds (mostly replaced by direct patching)

Keyring Configuration

Keyring configurations contain encryption keys for decrypting protected CASC content. Each entry maps an 8-byte key ID to a 16-byte Salsa20 encryption key.

Discovery

Keyring config hashes are in the Ribbit versions response KeyRing column, NOT in build configs. The config is fetched from CDN using the standard config path structure.

Format

Same key-value format as other configs, with = delimiter. Each entry uses the key- prefix followed by a hex-encoded key ID.

key-{KEY_ID_HEX} = {KEY_VALUE_HEX}

Where:

  • KEY_ID_HEX: 16 hex characters (8 bytes) identifying the encryption key
  • KEY_VALUE_HEX: 32 hex characters (16 bytes) Salsa20 encryption key

Example

key-4eb4869f95f23b53 = c9316739348dcc033aa8112f9a3acf5d

Validation

Agent.exe (tact::ConfigReader::ValidateKeyringConfig at 0x6e7020) requires at least one key entry. Duplicate key IDs with different values produce a warning and the duplicate is ignored (first entry wins).

Usage

Keys are loaded into a hash map by tact::KeyGetter::LoadKeyring. During BLTE decryption, the 8-byte key ID from the encrypted block header is used to look up the 16-byte Salsa20 decryption key.
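The format, validation, and lookup rules above can be sketched as follows. This is not the agent's code: `parse_keyring` is a hypothetical name, malformed lines are silently skipped, and key IDs are stored in the byte order of the hex string, which is an assumption.

```rust
use std::collections::HashMap;

/// Decodes a hex string into bytes; returns `None` on odd length or non-hex input.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Parses "key-{KEY_ID_HEX} = {KEY_VALUE_HEX}" lines into a lookup table
/// from 8-byte key ID to 16-byte Salsa20 key.
pub fn parse_keyring(text: &str) -> HashMap<[u8; 8], [u8; 16]> {
    let mut keys = HashMap::new();
    for line in text.lines() {
        let rest = match line.trim().strip_prefix("key-") {
            Some(rest) => rest,
            None => continue,
        };
        let idx = match rest.find(" = ") {
            Some(idx) => idx,
            None => continue,
        };
        let (id, key) = match (decode_hex(&rest[..idx]), decode_hex(rest[idx + 3..].trim())) {
            (Some(id), Some(key)) if id.len() == 8 && key.len() == 16 => (id, key),
            _ => continue,
        };
        let mut id_arr = [0u8; 8];
        id_arr.copy_from_slice(&id);
        let mut key_arr = [0u8; 16];
        key_arr.copy_from_slice(&key);
        // First entry wins: a later duplicate key ID is ignored.
        keys.entry(id_arr).or_insert(key_arr);
    }
    keys
}
```

A caller that mirrors the agent's validation would additionally reject a keyring for which the returned map is empty.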

Distribution

Keyring sizes vary by product:

  • WoW Retail: 1 key entry
  • Call of Duty (Odin): 1 key entry
  • Overwatch 2: 63 key entries (largest observed)
  • WoW Classic products: empty keyrings (no KeyRing column in versions response)

Product Configuration

Product configurations contain Battle.net client metadata for installation and platform requirements.

Note: Product config hashes are present in Ribbit/Wago data, and the actual config files are accessible via CDN using the /tpr/configs/data/ path structure as demonstrated in the examples below.

Product Configuration Format

JSON object with nested configuration sections.

Structure

{
  "all": {
    "config": {
      // Global configuration
    }
  },
  "platform": {
    "win": { /* Windows-specific */ },
    "mac": { /* macOS-specific */ }
  },
  "<locale>": {
    "config": {
      // Locale-specific configuration
    }
  }
}

Product Configuration Example

// Product Configuration for WoW 11.2.0.62748
// URL: http://cdn.arctium.tools/tpr/configs/data/53/02/53020d32e1a25648c8e1eafd5771935f
{
  "all": {
    "config": {
      "product": "WoW",
      "update_method": "ngdp",
      "data_dir": "Data/",
      "supports_multibox": true,
      "supports_offline": false,
      "supported_locales": ["enUS", "esMX", "ptBR", "deDE", "esES", "frFR"],
      "display_locales": ["enUS", "esMX", "ptBR", "frFR", "deDE", "esES"],
      "shared_container_default_subfolder": "_retail_",
      "enable_block_copy_patch": true
    }
  },
  "platform": {
    "win": {
      "config": {
        "binaries": {
          "game": {
            "relative_path": "WoW.exe",
            "relative_path_arm64": "Wow-ARM64.exe"
          }
        },
        "min_spec": {
          "default_required_cpu_speed": 2600,
          "default_required_ram": 2048,
          "default_requires_64_bit": true
        }
      }
    },
    "mac": {
      "config": {
        "binaries": {
          "game": {
            "relative_path": "World of Warcraft.app"
          }
        },
        "min_spec": {
          "default_required_cpu_speed": 2200,
          "default_required_ram": 2048
        }
      }
    }
  },
  "enus": {
    "config": {
      "install": [{
        "start_menu_shortcut": {
          "link": "%commonstartmenu%World of Warcraft/World of Warcraft.lnk",
          "target": "%shortcutpath%",
          "description": "Click here to play World of Warcraft."
        }
      }]
    }
  }
  // ... additional locales ...
}

Global Configuration Keys

| Key | Description | Type |
|-----|-------------|------|
| product | Product identifier | String |
| update_method | Update protocol | “ngdp” |
| data_dir | Data directory path | String |
| supported_locales | Available languages | Array |
| display_locales | UI languages | Array |
| launch_arguments | Default launch args | Array |
| supports_multibox | Multiple instances | Boolean |
| supports_offline | Offline play | Boolean |
| enable_block_copy_patch | Block-level patching | Boolean |
| shared_container_default_subfolder | Shared data path | String |

Platform Configuration

{
  "platform": {
    "win": {
      "config": {
        "binaries": {
          "game": {
            "relative_path": "WoWClassic.exe",
            "relative_path_arm64": "WowClassic-arm64.exe",
            "launch_arguments": []
          }
        },
        "min_spec": {
          "default_required_cpu_cores": 1,
          "default_required_cpu_speed": 2600,
          "default_required_ram": 2048,
          "default_requires_64_bit": true,
          "required_osspecs": {
            "6.1": { "required_subversion": 0 }
          }
        },
        "form": {
          "game_dir": {
            "default": "Program Files",
            "required_space": 11500000000,
            "space_per_extra_language": 2000000000
          }
        }
      }
    }
  }
}

Locale Configuration

{
  "enus": {
    "config": {
      "install": [{
        "desktop_shortcut": {
          "link": "%desktoppreference%World of Warcraft Classic.lnk",
          "target": "%shortcutpath%",
          "description": "Click here to play World of Warcraft.",
          "args": "--productcode=wow_classic_era"
        }
      }]
    }
  }
}

Installation Variables

Product configs use variables resolved by Battle.net:

| Variable | Description |
|----------|-------------|
| %installpath% | Game installation directory |
| %binarypath% | Executable path |
| %shortcutpath% | Launcher path |
| %desktoppreference% | User desktop path |
| %commonstartmenu% | Start menu path |
| %titlepath% | Product root directory |
| %game% | Game data directory |
| %locale% | Current locale |
| %uid% | Unique installation ID |

Parser Implementation Status

Python Parser (cascette-py)

Status: Complete

Capabilities:

  • Fetches patch configs from build config references

  • Parses patch entry format with compression info

  • Analyzes entry types (system files, VFS entries)

  • Supports both patch and product config examination

  • Handles standard CDN path structure

Verified Against:

  • WoW Classic 1.13.7.38631 patch config

  • WoW Classic 4.4.2.60142 patch config (205 entries)

  • WoW Classic 5.5.0.62655 patch config

Known Issues:

  • None identified - both product and patch configs successfully fetched

  • Requires fetching build config first to get patch-config hash

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

Product Configuration Status Summary

ProductConfig contains product-specific metadata and installation parameters. These are referenced in Ribbit responses and are accessible via CDN.

Status: Available via CDN using /tpr/configs/data/ path structure

Format: JSON

Purpose: Product metadata, platform settings, feature flags

Known Fields (from Ribbit)

  • Product configuration hash (16 bytes hex)

  • Associated with specific product versions

  • May be embedded in client or launcher

Configuration Discovery Flow

  1. Ribbit Query: Get version and CDN information
  2. Version Lookup: Find build configuration hash and keyring hash
  3. Build Config: Fetch build metadata and system files
  4. CDN Config: Get archive lists and CDN servers
  5. Keyring Config: Fetch encryption keys (if KeyRing column present)
  6. Patch Config: Retrieve update paths (rarely available)
  7. Product Config: Client installation metadata (fetched via the /tpr/configs/data/ path)

Implementation Considerations

Parsing

  • Build/CDN/Patch/Keyring configs: Simple key-value parser

  • Product config: JSON parser

  • Handle comments (lines starting with #)

  • Support multi-value fields (comma or space separated)

Caching

  • Configuration files are immutable (content-addressed)

  • Cache indefinitely once fetched

  • Validate using content hash

Error Handling

  • Retry failed fetches with exponential backoff

  • Fall back to alternate CDN servers

  • Validate configuration completeness
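The first two points can be combined into one retry loop. The sketch below is deliberately transport-agnostic: the caller supplies the `fetch` and `wait` closures, and the delay constants are placeholder values, not values used by any Blizzard or cascette-rs component.

```rust
use std::time::Duration;

// Hypothetical tuning values.
const BASE_DELAY: Duration = Duration::from_millis(250);
const MAX_DELAY: Duration = Duration::from_secs(8);

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Tries each host in order, retrying `attempts_per_host` times with
/// exponential backoff before moving on to the next host.
/// Returns `None` if every attempt on every host failed.
pub fn fetch_with_fallback<T, E>(
    hosts: &[&str],
    attempts_per_host: u32,
    mut fetch: impl FnMut(&str) -> Result<T, E>,
    mut wait: impl FnMut(Duration),
) -> Option<T> {
    for &host in hosts {
        for attempt in 0..attempts_per_host {
            match fetch(host) {
                Ok(value) => return Some(value),
                Err(_) => wait(backoff_delay(attempt, BASE_DELAY, MAX_DELAY)),
            }
        }
    }
    None
}
```

In production code `wait` would typically be `std::thread::sleep`; passing it as a closure keeps the loop testable without real delays.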

Security

  • Verify content hashes match expected values

  • Use HTTPS when available

  • Validate file sizes before download

NGDP/TACT Patch System

The NGDP patch system enables incremental updates between game versions using differential patches.

Patch System Architecture

The patch system uses a multi-tier structure:

  1. Patch Manifests (PA files in /patch/): Index files listing patches between builds

  2. Patch Archives (ZBSDIFF files in /patch/): Actual differential patch data

  3. Intermediate Results (in /data/): Results of applying patches in a chain

Patch File Locations

According to wowdev.wiki, the directories are:

  • /config/: Build configs, CDN configs, and Patch configs

  • /data/: Archives, indexes, and unarchived files (binaries, media, root, install, download)

  • /patch/: Patch manifests, patch files, patch archives, patch indexes

Specifically:

  • Patch Manifests: https://cdn.host/tpr/wow/patch/{hash[:2]}/{hash[2:4]}/{hash}
    • PA (Patch Archive) format files containing patch entry indices
    • Referenced by patch field in build configs
  • Patch Archives: https://cdn.host/tpr/wow/patch/{hash[:2]}/{hash[2:4]}/{hash}
    • ZBSDIFF1 format differential patch files stored in archives
    • Found in patch-entry lines (the patch_hash values)
    • Stored in archives just like regular data files
  • Patch Archive Indices: https://cdn.host/tpr/wow/patch/{hash[:2]}/{hash[2:4]}/{hash}.index
    • Index files for patch archives using the same format as data archive indices
    • Map content hashes to locations within patch archives
    • Referenced by patch-archives-index field in CDN configs
    • Use IndexType::Patch (offset_bytes = 0) in the footer
  • Patch Results: https://cdn.host/tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}
    • Intermediate or final results of applying patches
    • BLTE-encoded files with DL/EN/IN signatures for manifest types
  • Patch Configurations: https://cdn.host/tpr/wow/config/{hash[:2]}/{hash[2:4]}/{hash}
    • Text configs with patch-entry lines describing patch chains
    • Referenced by patch-config field in build configs

Patch Manifest Format

Patch manifests use the PA (Patch Archive) format. All numeric fields are big-endian throughout (header, block table, and block data).

Header Structure (10 bytes)

struct PatchArchiveHeader {  // 10 bytes, big-endian
    uint8_t  magic[2];         // "PA" (0x5041)
    uint8_t  version;          // Format version (1-2)
    uint8_t  file_key_size;    // Target file CKey size (1-16, typically 16)
    uint8_t  old_key_size;     // Source file EKey size (1-16, typically 16)
    uint8_t  patch_key_size;   // Patch EKey size (1-16, typically 16)
    uint8_t  block_size_bits;  // Block size as power of 2 (range [12, 24])
    uint16_t block_count;      // Number of blocks (big-endian)
    uint8_t  flags;            // Format flags (see below)
};

Flags:

  • Bit 0 (0x01): Plain data mode (informational, Agent.exe logs but does not reject)
  • Bit 1 (0x02): Extended header present with encoding info. All known CDN patch manifests have this flag set.
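As a sketch, the 10-byte header can be read and range-checked as shown below. The Rust struct mirrors the C layout above; the function name and error strings are hypothetical, and rejecting out-of-range values is a choice made for this example rather than a statement about what Agent.exe does in every case.

```rust
/// Parsed PA header (the 2-byte magic is checked and then dropped).
#[derive(Debug, PartialEq, Eq)]
pub struct PatchArchiveHeader {
    pub version: u8,
    pub file_key_size: u8,
    pub old_key_size: u8,
    pub patch_key_size: u8,
    pub block_size_bits: u8,
    pub block_count: u16,
    pub flags: u8,
}

/// Parses and validates the fixed 10-byte PA header.
pub fn parse_pa_header(data: &[u8]) -> Result<PatchArchiveHeader, &'static str> {
    if data.len() < 10 {
        return Err("header shorter than 10 bytes");
    }
    if &data[0..2] != b"PA" {
        return Err("bad magic");
    }
    let header = PatchArchiveHeader {
        version: data[2],
        file_key_size: data[3],
        old_key_size: data[4],
        patch_key_size: data[5],
        block_size_bits: data[6],
        block_count: u16::from_be_bytes([data[7], data[8]]),
        flags: data[9],
    };
    if !(1..=2).contains(&header.version) {
        return Err("unsupported version");
    }
    let key_ok = |n: u8| (1..=16).contains(&n);
    if !(key_ok(header.file_key_size) && key_ok(header.old_key_size) && key_ok(header.patch_key_size)) {
        return Err("key size out of range");
    }
    if !(12..=24).contains(&header.block_size_bits) {
        return Err("block_size_bits out of range");
    }
    Ok(header)
}
```

The block size in bytes is `1 << block_size_bits`, and `flags & 0x02` tells the caller whether the extended header follows.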

Extended Header (when flags & 0x02)

Present immediately after the 10-byte header. Contains encoding file metadata for the patch manifest:

struct PatchArchiveEncodingInfo {
    uint8_t  encoding_ckey[file_key_size];  // Encoding file CKey
    uint8_t  encoding_ekey[file_key_size];  // Encoding file EKey
    uint32_t decoded_size;                  // Decoded size (big-endian)
    uint32_t encoded_size;                  // Encoded size (big-endian)
    uint8_t  espec_length;                  // Length of ESpec string
    uint8_t  espec[espec_length];           // ESpec (length-prefixed, NOT null-terminated)
};

Block Table

Follows the header (or extended header if present). Each entry has a fixed size of file_key_size + 20 bytes:

struct BlockTableEntry {  // file_key_size + 20 bytes per entry
    uint8_t  last_file_ckey[file_key_size];  // Last (highest) CKey in this block
    uint8_t  block_md5[16];                  // MD5 hash of block data
    uint32_t block_offset;                   // Absolute byte offset (big-endian)
};

The block table is sorted by last_file_ckey. Agent.exe validates sort order using _memcmp during parsing.
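Because the table is sorted by the last CKey of each block, a reader can locate the block for a target CKey with a binary search. The sketch below assumes that lookup strategy (the first block whose last_file_ckey is not less than the target); the type and function names are hypothetical.

```rust
/// One block table entry, borrowing its key and hash from the input buffer.
pub struct BlockTableEntry<'a> {
    pub last_file_ckey: &'a [u8],
    pub block_md5: &'a [u8],
    pub block_offset: u32,
}

/// Splits `data` (starting at the block table) into `block_count` entries
/// of `file_key_size + 20` bytes each.
pub fn parse_block_table<'a>(
    data: &'a [u8],
    file_key_size: usize,
    block_count: usize,
) -> Option<Vec<BlockTableEntry<'a>>> {
    let stride = file_key_size + 20;
    let table = data.get(..stride.checked_mul(block_count)?)?;
    Some(
        table
            .chunks_exact(stride)
            .map(|e| {
                let off = &e[file_key_size + 16..];
                BlockTableEntry {
                    last_file_ckey: &e[..file_key_size],
                    block_md5: &e[file_key_size..file_key_size + 16],
                    block_offset: u32::from_be_bytes([off[0], off[1], off[2], off[3]]),
                }
            })
            .collect(),
    )
}

/// Returns the block that may contain `ckey`, or `None` if `ckey` sorts
/// after the last key of the final block.
pub fn find_block<'a, 'b>(
    table: &'b [BlockTableEntry<'a>],
    ckey: &[u8],
) -> Option<&'b BlockTableEntry<'a>> {
    // Byte slices compare lexicographically, matching a memcmp-style ordering.
    let idx = table.partition_point(|e| e.last_file_ckey < ckey);
    table.get(idx)
}
```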

Block Data

At each block_offset, file entries are stored as variable-length records terminated by a 0x00 sentinel byte:

// Repeat until num_patches == 0:
struct FileEntry {
    uint8_t  num_patches;                    // 0 = end of block
    uint8_t  target_ckey[file_key_size];     // Target file CKey
    uint8_t  decoded_size[5];                // uint40, big-endian
    // Followed by num_patches patch records:
    struct {
        uint8_t  source_ekey[old_key_size];  // Source file EKey
        uint8_t  source_decoded_size[5];     // uint40, big-endian
        uint8_t  patch_ekey[patch_key_size]; // Patch data EKey
        uint32_t patch_size;                 // Patch data size (big-endian)
        uint8_t  patch_index;                // Ordering hint
    } patches[num_patches];
};
uint8_t end_marker = 0;  // Sentinel byte

Decoded sizes use uint40 (5-byte big-endian) to support files up to ~1 TB.
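Reading a uint40 and a single patch record could look like this. It is a sketch of the layout above with hypothetical names; iterating over a whole block (file entries until num_patches == 0) is left out.

```rust
/// Reads a 5-byte big-endian unsigned integer from the start of `bytes`.
pub fn read_u40_be(bytes: &[u8]) -> Option<u64> {
    let b = bytes.get(..5)?;
    Some(b.iter().fold(0u64, |acc, &x| (acc << 8) | u64::from(x)))
}

/// One patch record inside a file entry.
pub struct PatchRecord<'a> {
    pub source_ekey: &'a [u8],
    pub source_decoded_size: u64,
    pub patch_ekey: &'a [u8],
    pub patch_size: u32,
    pub patch_index: u8,
}

/// Parses one patch record and returns it with the number of bytes consumed.
pub fn read_patch_record<'a>(
    data: &'a [u8],
    old_key_size: usize,
    patch_key_size: usize,
) -> Option<(PatchRecord<'a>, usize)> {
    // source_ekey + uint40 size + patch_ekey + u32 size + u8 index
    let len = old_key_size + 5 + patch_key_size + 4 + 1;
    let rec = data.get(..len)?;
    let (source_ekey, rest) = rec.split_at(old_key_size);
    let source_decoded_size = read_u40_be(rest)?;
    let (patch_ekey, rest) = rest[5..].split_at(patch_key_size);
    let record = PatchRecord {
        source_ekey,
        source_decoded_size,
        patch_ekey,
        patch_size: u32::from_be_bytes([rest[0], rest[1], rest[2], rest[3]]),
        patch_index: rest[4],
    };
    Some((record, len))
}
```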

Compression Info Format

The compression info string lists consecutive blocks and the compression applied to each:

  • Format: {size=method,size=method,...,*=default}, where each size is the length of one block and * covers the remaining data

  • Methods: n (none), z (zlib)

  • Example: {22=n,10044521=z,734880=n,*=z}

Build Config References

Build configurations reference patches through:

  • patch: Main patch manifest hash

  • patch-size: Size of patch manifest

  • patch-index: Patch index files

  • patch-config: Patch configuration hash

Patch Configuration

Patch configs contain patch-entry lines describing patch chains between file versions.

Patch Entry Format

patch-entry = type old_hash old_size new_hash new_size compression_info \
  [result_hash result_size patch_hash patch_size]+

Components:

  • type: Manifest type (download, encoding, install, size, vfs:, etc.)

  • old_hash: MD5 of original file content

  • old_size: Size of original file

  • new_hash: MD5 of final patched content

  • new_size: Size of final file

  • compression_info: Compression specification (e.g., b:{11=n,8183230=n,1255589=z})

  • Followed by repeating groups of:

    • result_hash: MD5 of intermediate/final result (stored in /data/)
    • result_size: Size of result file
    • patch_hash: MD5 of ZBSDIFF patch file (stored in /patch/)
    • patch_size: Size of patch file

Patch Chain Example

patch-entry = download 6afd6862... 9438830 d29e5263... 8190785 b:{...} \
  557b46d1... 15384969 08c046c8... 1623773 \
  4ebf89a1... 15384925 e960d26b... 1623636

This describes a chain:

  1. Apply patch 08c046c8 to original 6afd6862 → result 557b46d1
  2. Apply patch e960d26b to result 557b46d1 → result 4ebf89a1
  3. Continue until reaching final d29e5263

ZBSDIFF1 Format (Zlib-compressed Binary Differential)

ZBSDIFF1 is the binary differential patch format used by NGDP/TACT for efficient file updates:

Header (32 bytes, little-endian)

struct ZbsdiffHeader {
    uint8_t  signature[8];       // "ZBSDIFF1"
    int64_t  control_size;       // Size of compressed control block (little-endian)
    int64_t  diff_size;          // Size of compressed diff block (little-endian)
    int64_t  output_size;        // Size of final output file (little-endian)
};
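A reader for this header might look like the sketch below. The Rust names are hypothetical, and rejecting negative sizes is a sanity check chosen for this example.

```rust
/// Parsed ZBSDIFF1 header (the 8-byte signature is checked and then dropped).
#[derive(Debug, PartialEq, Eq)]
pub struct ZbsdiffHeader {
    pub control_size: i64,
    pub diff_size: i64,
    pub output_size: i64,
}

/// Parses the 32-byte little-endian ZBSDIFF1 header.
pub fn parse_zbsdiff_header(data: &[u8]) -> Result<ZbsdiffHeader, &'static str> {
    if data.len() < 32 {
        return Err("header shorter than 32 bytes");
    }
    if &data[..8] != b"ZBSDIFF1" {
        return Err("bad signature");
    }
    // Reads one little-endian int64 at byte offset `off`.
    let read = |off: usize| {
        let mut b = [0u8; 8];
        b.copy_from_slice(&data[off..off + 8]);
        i64::from_le_bytes(b)
    };
    let header = ZbsdiffHeader {
        control_size: read(8),
        diff_size: read(16),
        output_size: read(24),
    };
    if header.control_size < 0 || header.diff_size < 0 || header.output_size < 0 {
        return Err("negative size in header");
    }
    Ok(header)
}
```

The compressed control block starts at byte 32, the diff block follows it after control_size bytes, and the extra block takes whatever remains after diff_size more bytes.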

Three-Block Structure

  1. Control Block (zlib-compressed):

    • Triple sequences: (diff_size, extra_size, seek_offset)
    • Instructions for applying differences and inserting new data
    • All values are signed 64-bit integers
  2. Diff Block (zlib-compressed):

    • Byte differences to apply to old data
    • Applied by byte-wise addition (modulo 256), not XOR: new[i] = old[i] + diff[i]
  3. Extra Block (zlib-compressed):

    • New data to insert at specified positions
    • Copied directly to output

Streaming Application

ZBSDIFF1 supports streaming application without loading entire files:

// Streaming patch application
let mut old_pos: usize = 0;
let mut new_pos: usize = 0;
let mut control_entries = decompress_control_block(&patch.control_data)?;

// diff_size and extra_size are assumed to arrive validated as non-negative
// usize values; seek_offset stays a signed i64 because it may move backwards
while let Some((diff_size, extra_size, seek_offset)) = control_entries.next()? {
    // Add diff_size bytes of diff data to the old data (wrapping byte addition)
    copy_with_diff(&old_data[old_pos..], &diff_data, &mut new_data[new_pos..], diff_size);
    old_pos += diff_size;
    new_pos += diff_size;

    // Copy extra_size bytes of new data
    copy_extra(&extra_data, &mut new_data[new_pos..], extra_size);
    new_pos += extra_size;

    // Seek in old data, rejecting offsets that leave the valid range
    old_pos = usize::try_from(old_pos as i64 + seek_offset)?;
}
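
For comparison, the same control loop can be written as a self-contained in-memory function. This is a sketch, not the cascette-rs implementation: `apply_patch` is a hypothetical name, the three blocks are assumed to be decompressed already, the control triples are assumed to be decoded into `i64` values, and any out-of-range access is treated as an error rather than padded.

```rust
use std::convert::TryFrom;

/// Applies decoded control triples to `old`, reading from already
/// decompressed diff and extra blocks. Returns None on any
/// out-of-bounds access or negative size.
fn apply_patch(
    old: &[u8],
    controls: &[(i64, i64, i64)],
    diff: &[u8],
    extra: &[u8],
) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let (mut old_pos, mut diff_pos, mut extra_pos) = (0i64, 0usize, 0usize);

    for &(diff_size, extra_size, seek_offset) in controls {
        let diff_len = usize::try_from(diff_size).ok()?;
        let extra_len = usize::try_from(extra_size).ok()?;

        // Diff step: new byte = old byte + diff byte (wrapping).
        let old_start = usize::try_from(old_pos).ok()?;
        let old_part = old.get(old_start..old_start.checked_add(diff_len)?)?;
        let diff_part = diff.get(diff_pos..diff_pos.checked_add(diff_len)?)?;
        out.extend(old_part.iter().zip(diff_part).map(|(o, d)| o.wrapping_add(*d)));
        diff_pos += diff_len;
        old_pos += diff_size;

        // Extra step: copy new bytes verbatim.
        let extra_part = extra.get(extra_pos..extra_pos.checked_add(extra_len)?)?;
        out.extend_from_slice(extra_part);
        extra_pos += extra_len;

        // Seek step: signed, so the old cursor may move backwards.
        old_pos = old_pos.checked_add(seek_offset)?;
    }
    Some(out)
}
```

A caller would still compare the output length against output_size from the header and verify the result hash afterwards.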

Format Characteristics

  • Little-Endian Header: All header fields use little-endian byte order (verified against Agent.exe tact::BsPatch::ParseHeader at 0x6fbd1c)

  • Signed Integers: Control block uses signed 64-bit little-endian integers for sizes and offsets

  • Zlib Compression: All data blocks compressed independently

  • Memory Efficient: Can process large files with minimal RAM usage

  • Error Detection: Header validation and decompression errors detected

Patch Archive Storage

Patch data is stored in archives just like regular game data:

  1. Patch Archives: Large files containing multiple patch data blobs

    • Located in /patch/ directory on CDN
    • Contain BLTE-encoded ZBSDIFF1 patches
    • Named with content hashes like regular archives
  2. Patch Archive Indices: Map patch hashes to archive locations

    • Use the same .index format as data archives
    • Footer uses IndexType::Patch (offset_bytes = 0)
    • Allow CDN to locate specific patches within archives
  3. Patch Archive Groups: Client-side optimization structures

    • Use the same Archive Group format as data archives
    • Group related patches for efficient client caching
    • Located in client’s local CASC storage (not on CDN)
    • Referenced in .idx files with grouped archive information
  4. CDN Config References:

    • patch-archives: List of patch archive hashes
    • patch-archives-index: Corresponding index file hashes
    • patch-archives-index-size: Size of each index file

This completely mirrors the structure used for data archives:

  • archives → patch-archives

  • archives-index → patch-archives-index

  • Archive Groups → Patch Archive Groups

  • Same formats, just in /patch/ directory instead of /data/

Patch Chain Building and Validation

Patch Chain Construction

Patches can form chains from one content version to another with cycle detection:

pub fn build_patch_chain(
    &self,
    start_key: &[u8; 16],
    end_key: &[u8; 16]
) -> Option<PatchChain> {
    let mut chain = Vec::new();
    let mut current_key = *start_key;
    let mut visited = HashSet::new();

    while current_key != *end_key {
        // Cycle detection
        if visited.contains(&current_key) {
            return None; // Cycle detected
        }
        visited.insert(current_key);

        let patch_entry = self.find_patch_for_content(&current_key)?;
        current_key = patch_entry.new_content_key;
        chain.push(patch_entry.clone());

        // Safety limit: prevent infinite chains
        if chain.len() > 10 {
            return None; // Chain too long
        }
    }

    Some(PatchChain { steps: chain, start_key: *start_key, end_key: *end_key })
}

Safety Validations

  • Cycle Detection: Prevents infinite loops in patch chains

  • Chain Length Limits: Maximum 10 steps to prevent excessive processing

  • Size Validation: Output size must match header specification

  • Checksum Verification: Content keys validated after patch application

  • Stream Bounds Checking: Prevents buffer overflows during streaming
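
The first two checks can be exercised in isolation. The sketch below is an assumption-laden stand-in for the real lookup: `build_chain` is a hypothetical function, and a plain `HashMap` from old content key to new content key replaces the patch manifest.

```rust
use std::collections::{HashMap, HashSet};

type Key = [u8; 16];
const MAX_CHAIN_STEPS: usize = 10;

/// Follows `patches` (old key -> new key) from `start` to `end`.
/// Returns the keys reached after `start`, or None on a cycle,
/// a missing patch, or a chain longer than MAX_CHAIN_STEPS.
fn build_chain(patches: &HashMap<Key, Key>, start: Key, end: Key) -> Option<Vec<Key>> {
    let mut chain = Vec::new();
    let mut visited = HashSet::new();
    let mut current = start;

    while current != end {
        // insert() returns false when the key was already seen: a cycle.
        if !visited.insert(current) {
            return None;
        }
        current = *patches.get(&current)?;
        chain.push(current);
        if chain.len() > MAX_CHAIN_STEPS {
            return None;
        }
    }
    Some(chain)
}
```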

Size Limits and Memory Management

// ZBSDIFF1 size limits for safety
const MAX_PATCH_SIZE: usize = 100 * 1024 * 1024; // 100MB max patch
const MAX_OUTPUT_SIZE: usize = 1024 * 1024 * 1024; // 1GB max output
const MAX_CONTROL_ENTRIES: usize = 1_000_000; // Prevent memory exhaustion

impl ZbsdiffHeader {
    pub fn validate(&self) -> Result<(), ZbsdiffError> {
        if self.output_size > MAX_OUTPUT_SIZE as u64 {
            return Err(ZbsdiffError::OutputTooLarge(self.output_size));
        }

        if self.control_size + self.diff_size > MAX_PATCH_SIZE as u64 {
            return Err(ZbsdiffError::PatchTooLarge);
        }

        Ok(())
    }
}

Patch Application Process

  1. Fetch patch manifest from CDN using patch hash from build config
  2. Parse manifest to find patch entry for target file
  3. Validate patch chain: Check for cycles and reasonable length
  4. Look up patch in patch archive index to find archive and offset
  5. Download patch data from archive using index information
  6. Validate patch size limits before processing
  7. Decode BLTE wrapper and extract ZBSDIFF1 patch
  8. Apply patch using streaming algorithm with bounds checking
  9. Verify result size and hash match expectations

Implementation Notes

  • Patches are not BLTE-encoded at the manifest level

  • Individual patch data files may be BLTE-encoded

  • Block size is typically 64KB (2^16 bytes)

  • Version 2 is the current patch format version

  • Patches enable efficient updates without re-downloading entire files

BPSV Format Specification

BPSV (Blizzard Pipe-Separated Values) is a structured data serialization format, similar to CSV but using pipes (|) as delimiters with Blizzard-specific schemas. It’s used in Ribbit API responses, configuration files, and version manifests. BPSV is a data format, not a network protocol.

Format Structure

BPSV files contain three components:

  1. Header line (required)
  2. Sequence number line (optional)
  3. Data rows (zero or more)
graph TD
    A[BPSV File] --> B[Header Line]
    A -.-> C[Sequence Number Line]
    A --> D[Data Rows]

    B --> E["FieldName!TYPE:length|FieldName2!TYPE:length"]
    C -.-> F["seqn = {number}"]
    D --> G["value1|value2|value3"]

    style A stroke-width:4px
    style B stroke-width:3px
    style C stroke-width:2px,stroke-dasharray:5 5
    style D stroke-width:3px
    style E stroke-width:2px
    style F stroke-width:2px
    style G stroke-width:2px

Header Line Format

The header line defines field structure using pipe-separated field definitions:

FieldName!TYPE:length|FieldName2!TYPE:length|FieldName3!TYPE:length

Each field definition contains:

  • Field name (case-sensitive)

  • Exclamation mark separator

  • Field type (case-insensitive)

  • Colon separator

  • Length specification
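
A header line in this shape splits cleanly on `|`, `!`, and `:`. The following is a minimal sketch rather than the cascette-formats parser; `Field`, `FieldType`, and `parse_header` are hypothetical names, and errors are plain strings.

```rust
#[derive(Debug, PartialEq)]
enum FieldType {
    Str,
    Hex,
    Dec,
}

#[derive(Debug, PartialEq)]
struct Field {
    name: String,
    kind: FieldType,
    length: u32,
}

/// Parses `Name!TYPE:length|Name2!TYPE:length|...` into field definitions.
fn parse_header(line: &str) -> Result<Vec<Field>, String> {
    line.split('|')
        .map(|def| -> Result<Field, String> {
            let (name, spec) = def
                .split_once('!')
                .ok_or_else(|| format!("missing '!' in {def:?}"))?;
            let (kind, length) = spec
                .split_once(':')
                .ok_or_else(|| format!("missing ':' in {def:?}"))?;
            // Type names are case-insensitive; field names are kept as written.
            let kind = match kind.to_ascii_uppercase().as_str() {
                "STRING" => FieldType::Str,
                "HEX" => FieldType::Hex,
                "DEC" => FieldType::Dec,
                other => return Err(format!("unknown type {other}")),
            };
            let length = length
                .parse()
                .map_err(|_| format!("bad length in {def:?}"))?;
            Ok(Field { name: name.to_string(), kind, length })
        })
        .collect()
}
```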

Sequence Number

The optional sequence number appears on a separate line:

## seqn = 12345

Properties:

  • Always starts with ## seqn

  • Supported separators: =, :, or space

  • Contains integer value

  • Used for version tracking and cache invalidation

  • Maximum one per file

Accepted formats:

  • ## seqn = 12345 (equals with spaces)

  • ## seqn: 12345 (colon separator)

  • ## seqn 12345 (space only)

  • Extra whitespace is trimmed
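
The three accepted separators can be handled by one small function. This is a sketch: `parse_seqn` is a hypothetical name, and the value is assumed to fit in a `u64`.

```rust
/// Returns Some(n) for a `## seqn` line, None for any other line.
fn parse_seqn(line: &str) -> Option<u64> {
    let is_separator = |c: char| c == '=' || c == ':' || c.is_whitespace();

    let rest = line.trim().strip_prefix("## seqn")?;
    // The prefix must be followed by a separator, not more identifier characters.
    if !rest.starts_with(is_separator) {
        return None;
    }
    rest.trim_start_matches(is_separator).parse().ok()
}
```

A parser would call this on each line after the header and reject the document if it matches more than once.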

Field Types

BPSV supports three field types:

STRING:length

Text data with length constraints:

  • Length 0: unlimited characters

  • Length > 0: maximum character count

  • Type names: STRING, String, string (case-insensitive)

  • UTF-8 encoding

HEX:length

Binary data encoded as hexadecimal:

  • Length specifies bytes in binary form

  • Requires exactly length × 2 hexadecimal characters

  • Valid characters: 0-9, a-f, A-F

  • Empty values always valid

  • Common usage: HEX:16 for MD5 hashes (32 hex chars)

DEC:length

Decimal integers:

  • Length indicates storage size (4 = uint32, 8 = uint64)

  • Length not enforced during parsing

  • Supports full int64 range

  • Type names: DEC, Dec, dec (case-insensitive)
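
The per-type rules above translate into a single validation function. The sketch below uses hypothetical names (`FieldType`, `validate_value`) and string errors; it follows the rules as listed, including the unenforced DEC length.

```rust
#[derive(Clone, Copy)]
enum FieldType {
    Str,
    Hex,
    Dec,
}

/// Checks one raw value against its declared type and length.
fn validate_value(kind: FieldType, length: usize, value: &str) -> Result<(), String> {
    // Empty values are valid for every field type.
    if value.is_empty() {
        return Ok(());
    }
    match kind {
        // Length 0 means unlimited; otherwise it is a maximum character count.
        FieldType::Str if length > 0 && value.chars().count() > length => {
            Err(format!("string longer than {length} characters"))
        }
        FieldType::Str => Ok(()),
        FieldType::Hex => {
            if value.len() != length * 2 {
                Err(format!("expected {} hex characters, got {}", length * 2, value.len()))
            } else if !value.chars().all(|c| c.is_ascii_hexdigit()) {
                Err("non-hex character".to_string())
            } else {
                Ok(())
            }
        }
        // The declared length is not enforced; the value only has to fit in an i64.
        FieldType::Dec => value.parse::<i64>().map(|_| ()).map_err(|e| e.to_string()),
    }
}
```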

Data Rows

Data rows contain pipe-separated values matching header field definitions:

  • Column count must match header field count

  • Empty values allowed for all field types

  • Values parsed according to field type specifications

Parsing Flow

flowchart TD
    A[Start Parsing] --> B[Read First Line]
    B --> C[Parse Header Fields]
    C --> D[Read Next Line]
    D --> E{"Line starts with '## seqn'?"}
    E -->|Yes| F[Parse Sequence Number]
    E -->|No| G[Parse as Data Row]
    F --> I{More Lines?}
    G --> J[Validate Column Count]
    J --> K[Parse Field Values by Type]
    K --> L[Store Data Row]
    L --> I
    I -->|Yes| M[Read Next Line]
    I -->|No| N[Parsing Complete]
    M --> G

    style A stroke-width:4px
    style N stroke-width:4px
    style C stroke-width:3px
    style E stroke-width:3px,stroke-dasharray:5 5
    style I stroke-width:2px,stroke-dasharray:5 5
    style F stroke-width:2px
    style J stroke-width:2px
    style K stroke-width:2px

Usage Context

BPSV is a data serialization format used in multiple contexts:

  • Ribbit API Responses: Structured data returned by Ribbit protocol

  • Product Configuration Files: .product files with version information

  • Version Manifests: Build and CDN configuration references

  • CDN Configuration: Server URLs and path mappings

  • Background Downloads: Download priority information

Note: BPSV is the data format; Ribbit is the protocol that transmits BPSV data.

Implementation Requirements

Type Validation

Parsers must validate field values according to type specifications:

  • STRING fields accept any UTF-8 text

  • HEX fields require valid hexadecimal characters and exact length

  • DEC fields must parse as valid integers

  • Empty values are valid for all field types

Parsing Architecture

Implementations may use different parsing strategies:

  • Zero-copy parsing: Borrow from original string for efficiency

  • Owned parsing: Copy data for serialization/storage

  • Lazy parsing: Keep raw strings until typed values requested

  • Schema validation: Enforce field uniqueness and type compatibility

Error Handling

Common parsing errors:

  • Column count mismatch between header and data rows

  • Invalid characters in HEX fields

  • Incorrect HEX field length (must be exactly length × 2 chars)

  • Non-numeric values in DEC fields

  • Multiple sequence number lines

  • Duplicate field names in schema

Performance Considerations

  • Typical file size: < 10MB

  • Typical row count: < 10,000

  • UTF-8 encoding recommended

  • Both Unix (\n) and Windows (\r\n) line endings accepted

Format Examples

Basic Product Configuration

Region!STRING:4|BuildConfig!HEX:16|CDNConfig!HEX:16
## seqn = 98765
us|a1b2c3d4e5f67890123456789abcdef0|f1e2d3c4b5a69870123456789abcdef0
eu|b2c3d4e5f67890123456789abcdef0a1|e2d3c4b5a69870123456789abcdef0f1

CDN Server List

Name!STRING:0|Path!STRING:0|Hosts!STRING:0
## seqn = 54321
us|tpr/wow|us.patch.battle.net level3.blizzard.com
eu|tpr/wow|eu.patch.battle.net level3.blizzard.com

Version Information

Product!STRING:10|Seqn!DEC:4|Flags!HEX:2
wow|12345|0001
wowt|12346|0002

Type Casing Examples

Field types accept case variations:

# All valid type specifications
Name!STRING:50|ID!DEC:4|Hash!HEX:16
Name!String:50|ID!Dec:4|Hash!Hex:16
Name!string:50|ID!dec:4|Hash!hex:16

Empty Value Handling

Empty values preserve semantic meaning:

Product!STRING:10|Version!STRING:10|Hash!HEX:16
wow|8.3.0|a1b2c3d4e5f67890123456789abcdef0
wowt||b2c3d4e5f67890123456789abcdef0a1

The second row contains an empty version field, which differs from a missing field.

Implementation Status

Rust Implementation (cascette-formats)

BPSV parser and builder:

  • Schema parsing - Field name, type, and size validation (complete)

  • Document parsing - Multi-row data with sequence numbers (complete)

  • Type support - STRING, HEX, and DEC field types (complete)

  • Round-trip validation - parse(build(data)) == data guarantee (complete)

  • Case-insensitive types - Accepts STRING, String, string variations (complete)

  • Builder support - Programmatic BPSV file creation (complete)

Validation Status:

  • Byte-for-byte round-trip validation

  • Integration tests with real Ribbit API responses

  • Handles empty values, comments, and sequence numbers

  • Validated against real Battle.net BPSV files

Analysis and Usage

BPSV format is used throughout the NGDP system for configuration and version data.

NGDP/CASC Format Transitions

This document summarizes verified format transitions discovered through systematic analysis of WoW builds from 2014-2025, starting with CASC’s introduction in Warlords of Draenor (6.0.x) which replaced the MPQ system.

Verification Methodology

Format transitions were identified through:

  1. Strategic Build Analysis: Examining key builds across WoW versions using tools/examine_build.py
  2. Chronological Comparison: Tracking format changes between adjacent builds
  3. Cross-Product Validation: Comparing wow, wow_classic, wow_classic_era, wow_classic_titan, and wow_anniversary
  4. Automated Verification: Using Python scripts to validate format assumptions

Discovered Format Transitions

Root File Format Evolution

The Root file format has evolved since CASC’s introduction in Warlords of Draenor:

Version 1 (Early CASC, 2014-2021)

  • Magic: None initially, later MFST (big-endian)

  • First Seen: Warlords of Draenor (6.0.x) - CASC introduction

  • Structure: Basic content key mapping with file flags

  • Features:

    • FileDataID to content key mapping
    • Basic content/locale flags (32-bit)
    • Jenkins96 hash for named files
  • Note: This is the first CASC Root format, replacing the MPQ system

Version 2 (Transitional CASC, 2021)

  • Magic: TSFM (little-endian)

  • First Seen: Shadowlands (9.0.2)

  • Structure: Added size fields and magic signature

  • Features:

    • TSFM magic signature introduction
    • Size fields for validation
    • Maintained v1 data structures

Version 3 (Modern CASC, 2021-Present)

  • Magic: TSFM (little-endian standard)

  • First Seen: Shadowlands late patches

  • Structure: Enhanced metadata and extended flags

  • Features:

    • Extended content flags (40-bit total)
    • Improved compression efficiency
    • Better locale targeting

Version 4 (Current CASC, 2023-Present)

  • Magic: TSFM

  • First Seen: Dragonflight (10.x)

  • Structure: Further optimizations

  • Features:

    • Additional metadata fields
    • VFS integration improvements

Verified Transition Points

Based on build examination across retail and Classic:

WoW Retail (wow) Format Evolution:

| Version | Build Date | Root Version | Magic | Config Fields | Key Changes |
|---|---|---|---|---|---|
| 6.0.1.18125 | 2014-06-20 | 1 | None | 13 | CASC introduction, replacing MPQ |
| 7.3.5.25848 | 2018-01-16 | 1 | None | 15 | Still using v1 format |
| 9.0.2.37176 | 2021-01-13 | 2 | TSFM | 17 | Major transition: TSFM magic, size fields added |
| 10.1.5.51130 | 2023-08-31 | 3 | TSFM | 1,623 | VFS expansion: 1,600+ virtual file system fields added |
| 11.2.0.62748 | 2025-08-22 | 3 | TSFM | 1,716 | Current retail standard with extended features |

WoW Classic (wow_classic) Format Evolution:

| Version | Build Date | Root Version | Magic | Config Fields | Key Changes |
|---|---|---|---|---|---|
| 1.13.0.28211 | 2018-10-23 | 1 | None | 13 | Classic launch using CASC v1 |
| 2.5.2.39926 | 2021-08-31 | 1 | None | 16 | Patch fields added |
| 3.4.2.50063 | 2023-06-20 | 1 | None | 756 | VFS adoption: 740+ VFS fields |
| 3.4.4.61075 | 2025-05-28 | 3 | TSFM | 758 | Format jump: Skipped v2, went directly to v3 |
| 5.5.0.62655 | 2025-08-19 | 3 | TSFM | 905 | Current Classic standard |

Classic Format Lag Pattern

Classic follows retail with significant delays:

  • Root v1→v2/v3: Retail (2021) → Classic (2025) = 4 years behind

  • VFS Introduction: Retail (2023) → Classic (2023) = same year in the sampled builds

  • TSFM Magic: Retail (2021) → Classic (2025) = 4 years behind

Classic skipped Root v2 entirely, jumping directly from v1 to v3, demonstrating selective adoption of retail improvements.

Parser Compatibility Matrix

Based on verified transitions, parsers must support:

| Product | Supported Root Versions | Magic Detection | VFS Support | Timeframe |
|---|---|---|---|---|
| wow_classic_era | v3 only | TSFM | Modern | 2021+ (uses retail backend) |
| wow_classic | v1, v3 | None, TSFM | Legacy → Modern | 2018-2025 |
| wow_classic_titan | v3 only | TSFM | Modern | 2025+ (CN only, WotLK 3.80.x) |
| wow_anniversary | v3 only | TSFM | Modern | 2025+ (TBC 2.5.x) |
| wow | v1, v2, v3 | None, TSFM | Legacy → Modern | 2018-2025 |

Implementation Recommendation: Always attempt v3 parsing first with TSFM magic detection, then fall back to v1 legacy format. Root v2 is retail-specific and uncommon.
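
The magic check behind this recommendation is a four-byte comparison. The sketch below is illustrative only: `RootFormat` and `detect_root_format` are hypothetical names, and it assumes the magic is stored as the ASCII bytes "TSFM" at offset 0.

```rust
#[derive(Debug, PartialEq)]
enum RootFormat {
    /// Starts with the TSFM magic: try the modern layout first.
    Tsfm,
    /// No magic: fall back to the legacy v1 layout.
    Legacy,
}

fn detect_root_format(data: &[u8]) -> RootFormat {
    // Assumption: the magic is the four ASCII bytes "TSFM" at offset 0.
    if data.starts_with(b"TSFM") {
        RootFormat::Tsfm
    } else {
        RootFormat::Legacy
    }
}
```

Because v1 files carry no signature, anything that does not start with the magic is handed to the legacy parser, which then has to validate the structure itself.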

Build Configuration Evolution

Build configurations have evolved to support new file types and compression methods:

Early CASC (6.0.x)

root = <content_key>
encoding = <content_key> <encoding_key>
install = <content_key> <encoding_key>
download = <content_key> <encoding_key>

Modern CASC (11.x)

root = <content_key>
encoding = <content_key> <encoding_key>
install = <content_key> <encoding_key>
download = <content_key> <encoding_key>
patch = <patch_key>
size = <content_key> <encoding_key>

Evolution Pattern:

  • Root field simplified to single content key

  • New fields added (patch, size) for enhanced functionality

  • Encoding/install/download maintain dual-key format
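
Since both eras use the same `key = value [value]` line shape, one parser can cover them and simply report which keys are present. This is a sketch with a hypothetical function name; skipping `#` comment lines is an assumption, not something stated above.

```rust
use std::collections::HashMap;

/// Parses `key = value [value ...]` lines into a map of value lists.
/// Blank lines and `#` comment lines (an assumption) are skipped.
fn parse_build_config(text: &str) -> HashMap<String, Vec<String>> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| {
            let values = value.split_whitespace().map(str::to_string).collect();
            (key.trim().to_string(), values)
        })
        .collect()
}
```

Dual-key fields such as encoding come back as two-element lists, and the absence of patch or size marks an early-style configuration.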

BLTE Format Evolution

BLTE (Block Table Encoded) compression has remained stable but usage patterns evolved:

Compression Type Usage by Era

| Era | None (N) | ZLIB (Z) | Encrypted (E) | Frame (F) |
|---|---|---|---|---|
| Early CASC | 20% | 75% | 0% | 5% |
| Modern CASC | 15% | 60% | 5% | 20% |

Key Changes:

  • Increased use of Frame compression for nested compression

  • Introduction of encrypted blocks for sensitive data

  • ZLIB remains primary compression method

Block Structure Evolution

  • Single Block: Simpler files, configuration data

  • Multi Block: Large files, game assets

  • Trend: Growing use of multi-block for better streaming

Verification Scripts

Format verification tools have been moved to the cascette-py project: https://github.com/wowemulation-dev/cascette-py

The Python implementation includes:

  • Cache management for downloaded files
  • Root file version detection testing
  • Build configuration evolution tracking
  • BLTE compression pattern analysis
  • Complete format verification suite

See the cascette-py documentation for setup and usage instructions.

Implementation Impact

For Rust Implementation

Based on verified format evolution across retail and Classic:

  1. Root File Parser:

    • Primary Support: Root v1 (legacy) and v3 (modern) formats
    • Limited Support: Root v2 (retail-only transition format)
    • Magic Detection: TSFM (little-endian) and None (legacy)
    • Version Strategy: Try v3+TSFM first, fall back to v1+None
  2. Configuration Parser:

    • Early Builds: 13-17 fields (simple key=value)
    • VFS Era: 756-1,716 fields (massive vfs-* expansion)
    • Feature Support: Handle feature-placeholder and VFS fields
    • Backwards Compatibility: Support both v1 (legacy) and v3 (modern) formats
  3. Product-Specific Logic:

    • wow_classic_era: Always modern format (v3, TSFM)
    • wow_classic: Dual format support with clear transition point (2025)
    • wow_classic_titan: Modern format only (v3, TSFM), 368 VFS entries, CN region only
    • wow_anniversary: Modern format only (v3, TSFM), 325 VFS entries, all regions
    • wow retail: Full format evolution support (2018-2025)
  4. BLTE Decoder: All compression types (N, Z, E, F) with consistent usage patterns across all product lines

Key Architectural Decisions

  1. Version Detection Strategy:

    // Recommended parsing order
    if has_tsfm_magic() {
        try_root_v3_format()
    } else {
        try_root_v1_format()
    }
  2. Configuration Parsing:

    • VFS Detection: Fields starting with vfs- indicate modern builds
    • Feature Detection: feature-placeholder indicates latest builds
    • Backwards Compatibility: Always support minimal 13-field format
  3. Product Detection:

    • Use Wago.tools build database for version context
    • Classic Era assumes modern format post-2021
    • Classic has explicit v1→v3 transition in May 2025
  4. Testing Strategy: Verify against all transition points with real build data

Future Analysis

Formats not yet tracked for transitions:

  • Encoding file table structure changes
  • Install/Download tag system evolution
  • Archive index format stability
  • Patch file introduction timeline


Last Updated: 2025-08-23

Verification Status: Automated verification scripts created and tested

Next Review: After implementing Rust parsers based on verified formats

BLTE (Block Table Encoded) Format

BLTE is NGDP’s container format for compressed and optionally encrypted content. It provides block-based compression, encryption support, and efficient streaming capabilities for game data delivery.

Overview

BLTE files wrap game content with:

  • Optional multi-block structure for large files

  • Per-block compression (none, zlib, or LZ4)

  • Optional encryption (Salsa20 or ARC4)

  • MD5 checksums for integrity verification

Binary Format

File Structure

BLTE File Layout:
┌─────────────────────────┐
│ BLTE Header (8 bytes)   │
├─────────────────────────┤
│ Extended Header         │ (optional, if header_size > 0)
│ - Flags (1 byte)        │
│ - Chunk Count (3 bytes) │
├─────────────────────────┤
│ Chunk Info Table        │ (24 bytes per chunk)
│ - Compressed Size       │
│ - Decompressed Size     │
│ - MD5 Checksum          │
├─────────────────────────┤
│ Data Block 1            │
│ - Encoding Type (1 byte)│
│ - Compressed Data       │
├─────────────────────────┤
│ Data Block 2            │
│ ...                     │
└─────────────────────────┘

Header Format

// Primary BLTE header (always 8 bytes)
struct BlteHeader {
    magic: [u8; 4],        // "BLTE" (0x424C5445 in big-endian)
    header_size: u32,      // Big-endian, total header size including these 8 bytes
}

Header Size Values

  • header_size == 0: Single chunk file, no extended header

  • header_size > 0: Multi-chunk file with extended header

Extended Header

Present only when header_size > 0:

struct ExtendedHeader {
    flags: u8,             // 0x0F = standard, 0x10 = extended
    chunk_count: [u8; 3],  // 24-bit big-endian chunk count
}

Chunk Information Table

Standard Format (flags = 0x0F)

Each chunk has a 24-byte entry:

struct ChunkInfo {
    compressed_size: u32,      // Big-endian
    decompressed_size: u32,    // Big-endian
    checksum: [u8; 16],        // MD5 of compressed chunk data
}

Extended Format (flags = 0x10)

Each chunk has a 40-byte entry:

struct ExtendedChunkInfo {
    compressed_size: u32,      // Big-endian
    decompressed_size: u32,    // Big-endian
    checksum: [u8; 16],        // MD5 of compressed chunk data
    decompressed_checksum: [u8; 16], // MD5 of decompressed chunk data
}

This extended format provides additional integrity checking with MD5 checksums of both compressed and decompressed data.

Formula Validation

For standard chunks (flags = 0x0F):

header_size = 12 + (chunk_count * 24)

For extended chunks (flags = 0x10):

header_size = 12 + (chunk_count * 40)

Where:

  • 12 = 8 (BLTE header) + 1 (flags) + 3 (chunk count)

  • 24 = size of standard ChunkInfo entry

  • 40 = size of extended ChunkInfo entry

The header_size field includes the 8-byte BLTE header (“BLTE” magic + header_size u32). Data starts at offset header_size from the beginning of the file.
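
Both formulas fit in one helper that a parser can compare against the header_size it read. A minimal sketch, with `expected_header_size` as a hypothetical name:

```rust
/// Expected `header_size` for a chunk table, or None for unknown
/// flags or arithmetic overflow.
fn expected_header_size(flags: u8, chunk_count: u32) -> Option<u32> {
    let entry_size: u32 = match flags {
        0x0F => 24, // compressed size + decompressed size + MD5
        0x10 => 40, // adds an MD5 of the decompressed data
        _ => return None,
    };
    // 12 = 8 (BLTE header) + 1 (flags) + 3 (chunk count)
    chunk_count.checked_mul(entry_size)?.checked_add(12)
}
```

For example, seven standard chunks give 12 + 7 * 24 = 180, and seven extended chunks give 12 + 7 * 40 = 292.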

Encoding Types

Each data block starts with a single-byte encoding type:

| Byte | Character | Type | Description |
|---|---|---|---|
| 0x4E | ‘N’ | None | Uncompressed data |
| 0x5A | ‘Z’ | ZLib | ZLib compressed (deflate) |
| 0x34 | ‘4’ | LZ4 | LZ4HC high compression |
| 0x45 | ‘E’ | Encrypted | Encrypted data block |
| 0x46 | ‘F’ | Frame | Recursive BLTE (deprecated) |

Compression Formats

None (0x4E)

Uncompressed data follows immediately after the encoding byte:

[0x4E] [raw data...]

ZLib (0x5A)

Standard zlib compression:

[0x5A] [2-byte zlib header] [deflate stream...]

Important: Most implementations skip the zlib header and use raw deflate.

LZ4 (0x34)

LZ4HC (high compression) format:

[0x34] [decompressed_size:8] [compressed_lz4_data...]
  • decompressed_size: 64-bit little-endian size

  • Data following the prefix is a single LZ4 block (no sub-blocks)

  • Provides ~200-300 MB/s decompression speed

Format discrepancy: The WoWDev wiki describes a different LZ4 format with headerVersion (1 byte), 64-bit big-endian size, blockShift (1 byte, range 5-16), and multiple sub-blocks of 1 << blockShift bytes each. Agent.exe 3.13.3 uses the 8-byte LE prefix + single block format documented above. tact::Codec::DecodeLZ4 at 0x6f5fdb is a stub in Agent.exe 3.13.3 (returns error 5), so the LZ4 format cannot be fully verified from this binary version. cascette-rs matches the Agent.exe format. The wiki format may apply to a newer protocol version or a different product.

Encryption Format

Encrypted Block Structure

[0x45] [key_name_size:1] [key_name:8] [iv_size:1] [iv:4] [type:1]
[encrypted_data...]

Fields:

  • key_name_size: Usually 8

  • key_name: 64-bit key identifier

  • iv_size: Usually 4

  • iv: Initialization vector

  • type: 0x53 (‘S’) for Salsa20, 0x41 (‘A’) for ARC4 (legacy, not used in TACT 3.13.3+)
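
Because both the key name and the IV are length-prefixed, the header is best read field by field rather than at fixed offsets. The sketch below uses hypothetical names (`EncryptedHeader`, `parse_encrypted_header`) and expects the bytes that follow the 0x45 encoding byte; it does not decrypt anything.

```rust
#[derive(Debug, PartialEq)]
struct EncryptedHeader<'a> {
    key_name: &'a [u8],
    iv: &'a [u8],
    cipher: u8,        // b'S' = Salsa20, b'A' = ARC4
    payload: &'a [u8], // still encrypted
}

/// Parses the bytes that follow the 0x45 encoding byte.
/// Returns None if any length prefix runs past the end of the data.
fn parse_encrypted_header(data: &[u8]) -> Option<EncryptedHeader<'_>> {
    let (&key_len, rest) = data.split_first()?;
    if rest.len() < key_len as usize {
        return None;
    }
    let (key_name, rest) = rest.split_at(key_len as usize);

    let (&iv_len, rest) = rest.split_first()?;
    if rest.len() < iv_len as usize {
        return None;
    }
    let (iv, rest) = rest.split_at(iv_len as usize);

    let (&cipher, payload) = rest.split_first()?;
    Some(EncryptedHeader { key_name, iv, cipher, payload })
}
```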

IV Extension and Modification for Chunks

The IV (typically 4 bytes) is zero-padded to 8 bytes for the Salsa20 nonce:

let mut nonce = [0u8; 8];  // zero-initialized
nonce[..iv_size].copy_from_slice(&iv);
// Remaining bytes stay zero (NOT duplicated)

For multi-chunk files, the IV is XORed with the chunk index before extension:

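fn modify_iv(iv: &mut [u8], chunk_index: usize) {
    // XOR the low 32 bits of the chunk index into the IV, least significant
    // byte first. take(4) also keeps an IV shorter than 4 bytes from panicking.
    for (i, byte) in iv.iter_mut().take(4).enumerate() {
        *byte ^= ((chunk_index >> (i * 8)) & 0xFF) as u8;
    }
}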
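
The two steps (XOR with the chunk index, then zero-padding) can be combined into one function that returns the final nonce. This is a sketch with a hypothetical name, `chunk_nonce`; it assumes the IV is at most 8 bytes and performs no encryption itself.

```rust
/// Builds the 8-byte Salsa20 nonce for one chunk: the IV with the
/// chunk index XORed into its first bytes, zero-padded to 8 bytes.
fn chunk_nonce(iv: &[u8], chunk_index: u32) -> Option<[u8; 8]> {
    if iv.len() > 8 {
        return None;
    }
    let mut nonce = [0u8; 8];
    nonce[..iv.len()].copy_from_slice(iv);

    // Only bytes that belong to the IV are modified; the zero padding
    // stays zero because the XOR happens before extension.
    for (byte, index_byte) in nonce
        .iter_mut()
        .take(iv.len())
        .zip(chunk_index.to_le_bytes())
    {
        *byte ^= index_byte;
    }
    Some(nonce)
}
```

With IV 01 02 03 04 and chunk index 0x0105, the index bytes 05 01 00 00 are XORed in, giving the nonce 04 03 03 04 00 00 00 00.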

Parsing Algorithm

Step 1: Read BLTE Header

let magic = read_u32_be();  // Must be 0x424C5445 ("BLTE")
let header_size = read_u32_be();

Step 2: Determine Structure

if header_size == 0 {
    // Single chunk file
    // Data starts at offset 8
    // Chunk size = file_size - 8 - 1 (encoding byte)
} else {
    // Multi-chunk file
    // Read extended header and chunk table
    // Data starts at offset header_size (which includes the 8-byte BLTE header)
}

The data offset for multi-chunk files is always header_size from the start of the file. The header_size field includes the 8-byte BLTE header.

Step 3: Read Extended Header (if present)

let flags = read_u8();  // 0x0F for standard, 0x10 for extended
let chunk_count = read_u24_be() as usize;  // 24-bit big-endian

// Read chunk information table (standard 24-byte entries shown;
// with flags == 0x10 each entry carries a further 16-byte MD5)
let mut chunks = Vec::with_capacity(chunk_count);
for _ in 0..chunk_count {
    chunks.push(ChunkInfo {
        compressed_size: read_u32_be(),
        decompressed_size: read_u32_be(),
        checksum: read_bytes(16),
    });
}
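
A slice-based version that handles both entry sizes might look like this. It is a sketch, not the cascette-formats parser: `parse_chunk_table` is a hypothetical name, the input is assumed to start right after the 8-byte BLTE header, and the standard and extended entries share one struct with an optional second checksum.

```rust
#[derive(Debug, PartialEq)]
struct ChunkInfo {
    compressed_size: u32,
    decompressed_size: u32,
    checksum: [u8; 16],
    decompressed_checksum: Option<[u8; 16]>, // present only when flags == 0x10
}

/// Parses the extended header and chunk table that follow the
/// 8-byte BLTE header. Returns None on unknown flags or short input.
fn parse_chunk_table(data: &[u8]) -> Option<Vec<ChunkInfo>> {
    let flags = *data.first()?;
    let entry_size: usize = match flags {
        0x0F => 24,
        0x10 => 40,
        _ => return None,
    };
    // 24-bit big-endian chunk count, widened to 32 bits.
    let count = data.get(1..4)?;
    let chunk_count = u32::from_be_bytes([0, count[0], count[1], count[2]]) as usize;
    let table = data.get(4..4 + chunk_count * entry_size)?;

    // Copies one 16-byte MD5 out of an entry.
    let md5_at = |entry: &[u8], offset: usize| -> [u8; 16] {
        let mut out = [0u8; 16];
        out.copy_from_slice(&entry[offset..offset + 16]);
        out
    };
    Some(
        table
            .chunks_exact(entry_size)
            .map(|entry| ChunkInfo {
                compressed_size: u32::from_be_bytes([entry[0], entry[1], entry[2], entry[3]]),
                decompressed_size: u32::from_be_bytes([entry[4], entry[5], entry[6], entry[7]]),
                checksum: md5_at(entry, 8),
                decompressed_checksum: (flags == 0x10).then(|| md5_at(entry, 24)),
            })
            .collect(),
    )
}
```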

Step 4: Process Data Blocks

let mut output = Vec::new();
let mut offset = header_size as usize;

for chunk_info in chunks {
    // Read chunk data (sizes are u32 in the chunk table, so widen for indexing)
    let size = chunk_info.compressed_size as usize;
    let chunk_data = &data[offset..offset + size];

    // Optionally verify MD5 checksum (not done automatically during parsing)
    // let hash = md5::compute(chunk_data);
    // assert_eq!(hash.0, chunk_info.checksum);

    // Decompress based on encoding type
    let decompressed = decompress_chunk(chunk_data)?;
    output.extend_from_slice(&decompressed);

    offset += size;
}

Decompression Implementation

fn decompress_chunk(data: &[u8]) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
    // The first byte selects the encoding; the rest is the payload
    let (&encoding, payload) = data.split_first().ok_or("Empty chunk")?;

    match encoding {
        0x4E => {
            // None - return raw data
            Ok(payload.to_vec())
        },
        0x5A => {
            // ZLib - decompress using deflate
            // Skip the 2-byte zlib header (commonly 78 9C)
            let deflate_data = payload.get(2..).ok_or("Truncated zlib chunk")?;
            decompress_deflate(deflate_data)
        },
        0x34 => {
            // LZ4 - 8-byte little-endian decompressed size, then one LZ4 block
            let size_bytes = payload.get(..8).ok_or("Truncated LZ4 chunk")?;
            let decompressed_size = u64::from_le_bytes(size_bytes.try_into()?);
            decompress_lz4(&payload[8..], decompressed_size as usize)
        },
        0x45 => {
            // Encrypted - requires key
            decrypt_chunk(payload)
        },
        0x46 => {
            // Frame - recursive BLTE
            parse_blte(payload)
        },
        _ => Err("Unknown encoding type".into()),
    }
}

Real-World Example

Let’s examine the encoding file we fetched earlier:

00000000: 424c 5445 0000 00b4 0f00 0007 0000 0017  BLTE............
          ^^^^^^^^^ ^^^^^^^^^ ^^ ^^^^^^^ ^^^^^^^^^
          Magic     Hdr Size  F  Count   CompSize

Breaking down the header:

  • Magic: 0x424C5445 = "BLTE"

  • Header Size: 0x000000B4 = 180 bytes

  • Flags: 0x0F (standard 24-byte chunk entries)

  • Chunk Count: 0x000007 = 7 chunks

  • First Chunk Compressed Size: 0x00000017 = 23 bytes

This indicates:

  • Multi-chunk file (header_size > 0)

  • 7 chunks total

  • Total header size = 12 + (7 * 24) = 180 bytes, matching the header_size field

Performance Characteristics

Compression Mode Comparison

| Mode | Compression Speed | Decompression Speed | Compression Ratio | Memory Usage |
|---|---|---|---|---|
| None | ~500 MB/s | ~500 MB/s | 1.0x | Minimal |
| LZ4 | ~200 MB/s | ~300 MB/s | 2-4x | ~64 KB |
| ZLib | ~50-150 MB/s | ~100-200 MB/s | 3-8x | ~256 KB |

Data Type Recommendations

| Data Type | Recommended Mode | Reasoning |
|---|---|---|
| Text/Config | ZLib (level 6-9) | High compressibility, access infrequent |
| Textures | LZ4 or None | Often pre-compressed, need fast access |
| Audio | None or LZ4 | Poor compressibility, streaming required |
| Models | ZLib (level 3-6) | Structured data compresses well |
| Temporary | None | Speed critical, short-lived |

Special Cases

Headerless Files

When header_size == 0:

  • Single chunk only

  • No chunk information table

  • Data starts immediately at offset 8

  • Entire remaining file is one compressed block

Empty Chunks

Some chunks may have:

  • compressed_size == 0

  • decompressed_size == 0

  • Usually placeholders or removed content

Large Files

Multi-chunk structure enables parallel decompression and partial/resumable downloads, allowing streaming installation of large files.

Error Handling

Critical checks:

  1. Verify BLTE magic number
  2. Validate that flags is 0x0F (standard) or 0x10 (extended) when an extended header is present
  3. Check chunk count > 0 when header_size > 0
  4. MD5 checksums are available via verify_checksum() on each chunk (not verified automatically during parsing)
  5. Handle unknown encoding types gracefully
  6. Ensure decompressed size matches expected
  7. Enforce maximum decompression size (1 GB) to prevent decompression bombs

Implementation Considerations

  • Process chunks incrementally rather than loading entire files into memory
  • Decompress chunks in parallel where possible
  • Checksum verification is a separate step from parsing (call verify_checksum() on chunk data)
  • Maximum decompression size is 1 GB (MAX_DECOMPRESSION_SIZE). Chunks claiming a larger decompressed size are rejected

Integration with NGDP

BLTE files in NGDP context:

  1. Fetched using encoding keys from CDN
  2. May be stored in archives or as loose files
  3. Encoding file maps content keys to BLTE-encoded versions
  4. Archive indices point to BLTE data within archives

Debugging Tips

Identifying BLTE Files

# Check for BLTE magic
xxd -l 4 file.bin
# Should show: 424c 5445 (BLTE)

# Check header size
xxd -s 4 -l 4 file.bin
# Bytes print in file order, so they read directly as the big-endian u32 (-e would show them little-endian)

Common Issues

  1. Wrong endianness: BLTE uses big-endian, not little-endian
  2. Skipping zlib header: Most implementations skip bytes 1-2 after 0x5A
  3. IV modification: Remember to XOR IV with chunk index for encryption
  4. Checksum validation: Use MD5 of compressed data, not decompressed

Implementation Status

Rust Implementation (cascette-formats)

BLTE parser and builder:

  • None (N) - Uncompressed passthrough (complete)

  • ZLib (Z) - Deflate compression using flate2 (complete)

  • LZ4 (4) - LZ4 compression with proper size headers (complete)

  • Encrypted (E) - Salsa20 and ARC4 encryption with multi-chunk support (complete)

  • Frame (F) - Recursive BLTE support (not implemented, deprecated format)

  • Extended Format - Full support for 0x10 format with dual checksums (complete)

Validation Status:

  • Byte-for-byte round-trip validation with real WoW files

  • Successfully processes encoding, root, install, and download files

  • Integration tests with WoW Classic Era production data

  • Builder support for creating valid BLTE files programmatically

  • Both standard (0x0F) and extended (0x10) chunk formats supported

Python Tools (cascette-py)

The analysis and decompression tool supports:

  • None (N), ZLib (Z), Frame (F) modes

  • LZ4 (4) - Analysis only, decompression requires Rust implementation

  • Encrypted (E) - Detection and metadata extraction

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

ESpec (Encoding Specification) Documentation

Overview

ESpec is a domain-specific language used throughout NGDP for specifying BLTE encoding instructions. It defines how content blocks are compressed, encrypted, and structured within BLTE containers. ESpec appears in patch configurations, encoding files, and BLTE block headers.

Grammar Components

ESpec uses single-character identifiers for encoding operations:

Basic Encodings

  • n: Plain/uncompressed data

  • z: Zlib compression

  • e: Encryption

  • b: Block-based encoding

  • c: BCPack compression

  • g: GDeflate compression

Encoding Combinations

ESpec supports nested and sequential encoding operations through composition.

Block Syntax

Size Specifications

Block sizes support unit suffixes:

  • K: Kilobytes (1024 bytes)

  • M: Megabytes (1024 * 1024 bytes)

  • No suffix: Bytes

Count Specifications

Block counts can be:

  • Exact number: Specific block count (e.g., 3)

  • Variable: Asterisk (*) for variable block count

  • Dynamic sizing: Block count of zero with an average block size. Block boundaries are determined dynamically based on content. Distinct from variable (*) block count.

Block Format

b:{size[*count]=encoding}

Components:

  • size: Block size with optional unit suffix

  • count: Block count (optional, defaults to 1)

  • encoding: Encoding specification for blocks
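
A minimal sketch of splitting one block spec into these three components. The type and function names are hypothetical, the encoding is returned unparsed, and the dynamic-sizing form is not handled.

```rust
#[derive(Debug, PartialEq)]
enum Count {
    Exact(u32),
    Variable,
}

/// Parse a size such as "495", "16K" or "1M" into bytes.
fn parse_size(s: &str) -> Option<u64> {
    let (digits, multiplier) = match s.as_bytes().last().copied()? {
        b'K' => (&s[..s.len() - 1], 1024u64),
        b'M' => (&s[..s.len() - 1], 1024 * 1024),
        _ => (s, 1),
    };
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse::<u64>().ok()?.checked_mul(multiplier)
}

/// Split `size[*count]=encoding` into its parts.
fn parse_block_spec(spec: &str) -> Option<(u64, Count, &str)> {
    let (lhs, encoding) = spec.split_once('=')?;
    let (size, count) = match lhs.split_once('*') {
        // No asterisk: the count defaults to 1
        None => (parse_size(lhs)?, Count::Exact(1)),
        // Asterisk with nothing after it: variable block count
        Some((size, "")) => (parse_size(size)?, Count::Variable),
        Some((size, n)) => (parse_size(size)?, Count::Exact(n.parse().ok()?)),
    };
    Some((size, count, encoding))
}
```

For example, `parse_block_spec("16K*=z")` yields a 16384-byte size, a variable count, and the encoding string `z`.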

Grammar Reference

Simple Encodings

plain := "n"
zlib := "z" [ ":" ( level | "{" zlib_params "}" ) ]
zlib_params := ( level | variant ) [ "," ( variant | window_bits ) ] [ "," window_bits ]
encryption := "e" ":" "{" key "," iv "," content_encoding "}"

Zlib supports multiple syntax forms: z, z:9, z:{9}, z:{9,mpq}, z:{9,15}, z:{9,mpq,15}, z:{mpq}, z:{mpq,15}. The second parameter can be either a variant name or a numeric window_bits value.
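
The eight forms can be handled by a small hand-written parser. The sketch below is an illustration only: `ZlibParams`, `parse_zlib`, and the helpers are hypothetical names, and it applies the level range 1-9, the variant names mpq, zlib, and lz4hc, and the window-bits range 8-15 stated elsewhere in this document. It does not handle a repeated window-bits value.

```rust
#[derive(Debug, Default, PartialEq)]
struct ZlibParams {
    level: Option<u8>,
    variant: Option<String>,
    window_bits: Option<u8>,
}

fn is_number(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())
}

fn parse_in_range(s: &str, min: u8, max: u8) -> Option<u8> {
    if !is_number(s) {
        return None;
    }
    let n: u8 = s.parse().ok()?;
    if (min..=max).contains(&n) { Some(n) } else { None }
}

fn parse_zlib(spec: &str) -> Option<ZlibParams> {
    let rest = spec.strip_prefix('z')?;
    let mut params = ZlibParams::default();
    if rest.is_empty() {
        return Some(params); // bare `z`
    }
    let body = rest.strip_prefix(':')?;
    let inner = match body.strip_prefix('{') {
        Some(braced) => braced.strip_suffix('}')?,
        None => {
            // Unbraced form accepts a level only, e.g. `z:9`
            params.level = Some(parse_in_range(body, 1, 9)?);
            return Some(params);
        }
    };
    let fields: Vec<&str> = inner.split(',').collect();
    if fields.len() > 3 {
        return None;
    }
    for (i, field) in fields.iter().enumerate() {
        if is_number(field) {
            if i == 0 {
                // A number in first position is the level
                params.level = Some(parse_in_range(field, 1, 9)?);
            } else if params.window_bits.is_none() {
                // A later number is the window-bits value
                params.window_bits = Some(parse_in_range(field, 8, 15)?);
            } else {
                return None;
            }
        } else if params.variant.is_none()
            && params.window_bits.is_none()
            && matches!(*field, "mpq" | "zlib" | "lz4hc")
        {
            // A variant may appear once, and only before window bits
            params.variant = Some(field.to_string());
        } else {
            return None;
        }
    }
    Some(params)
}
```

Deciding each field's meaning by whether it is numeric is what lets `z:{9,15}` and `z:{9,mpq}` share the same second position.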

Block Encoding

block := "b" ":" ( "{" block_spec { "," block_spec } "}" | encoding )
block_spec := size [ "*" [ count ] ] "=" encoding
size := number [ unit ]
unit := "K" | "M"
count := number

A block table can omit braces when it contains a single encoding with no size specification: b:z is equivalent to a single block with no explicit size.

Complex Encodings

encoding := plain | zlib | encryption | block | bcpack | gdeflate
bcpack := "c" [ ":" "{" bcn "}" ]
gdeflate := "g" [ ":" "{" level "}" ]

Examples

Simple Block Encoding

b:{495=z,9673=n}

This specifies:

  • First block: 495 bytes, zlib compressed

  • Second block: 9673 bytes, uncompressed

Variable Block Sizes

b:{16K*=z}

This specifies:

  • Variable number of 16KB blocks

  • All blocks use zlib compression

Encrypted Blocks

b:{256K*=e:{key,iv,z}}

This specifies:

  • Variable number of 256KB blocks

  • Each block is encrypted with specified key and IV

  • Content is zlib compressed before encryption

Compression Levels

b:{16K*=z:{6,mpq}}

This specifies:

  • Variable number of 16KB blocks

  • Zlib compression level 6

  • MPQ-compatible compression settings

Mixed Block Types

b:{1K=n,4K*=z,2K=n}

This specifies:

  • First block: 1KB uncompressed

  • Variable number of 4KB zlib-compressed blocks

  • Final block: 2KB uncompressed

Zlib Compression Levels

Level Specification

Zlib compression supports level, variant, and window bits parameters:

z:{level}
z:{level,window_bits}
z:{level,variant}
z:{level,variant,window_bits}

Standard Levels

Valid levels are 1-9:

  • 1: Fastest compression

  • 6: Default compression (balance of speed/size)

  • 9: Maximum compression

Level 0 is not accepted.

Variant Specifications

  • mpq: MPQ-compatible compression settings

  • zlib: Standard zlib settings

  • lz4hc: LZ4HC-compatible compression settings

Window Bits

Zlib window bit count can be specified in range [8, 15]. Two values can be provided (must match). Default is 15.

Compression Examples

z:{1}           # Fast compression
z:{9}           # Maximum compression
z:{6,mpq}       # MPQ-compatible level 6
z:{6,zlib,15}   # Zlib variant with explicit window bits

Encryption Specification

Format

e:{key,iv,content_encoding}

Components

  • key: Encryption key identifier or value

  • iv: Initialization vector

  • content_encoding: Encoding applied before encryption

Key Format

Keys must be exactly 16 hex characters (8 bytes):

e:{0123456789abcdef,fedcba98,z}

This specifies:

  • Encryption key: 0123456789abcdef (16 hex chars, 8 bytes)

  • IV: fedcba98 (8 hex chars, 4 bytes)

  • Content: zlib compressed before encryption

The parser rejects keys that are not exactly 16 hex characters. The IV must be exactly 8 hex characters (4 bytes).
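
A sketch of these length rules, assuming the `e:{key,iv,content_encoding}` layout shown above. The function names are hypothetical, the nested encoding is returned unparsed, and the bytes are returned in the order written; how they map to a numeric key name is not decided here.

```rust
/// Decode `hex` into `out`, requiring exactly two hex digits per output byte.
fn decode_hex(hex: &str, out: &mut [u8]) -> Option<()> {
    if hex.len() != out.len() * 2 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[i * 2..i * 2 + 2], 16).ok()?;
    }
    Some(())
}

/// Parse `e:{key,iv,encoding}` into 8 key bytes, 4 IV bytes, and the
/// unparsed nested encoding.
fn parse_encryption(spec: &str) -> Option<([u8; 8], [u8; 4], &str)> {
    let inner = spec.strip_prefix("e:{")?.strip_suffix('}')?;
    // Split into at most three parts so commas in the nested encoding survive
    let mut parts = inner.splitn(3, ',');
    let key = parts.next()?;
    let iv = parts.next()?;
    let content = parts.next()?;

    let mut key_bytes = [0u8; 8];
    let mut iv_bytes = [0u8; 4];
    decode_hex(key, &mut key_bytes)?; // rejects anything but 16 hex chars
    decode_hex(iv, &mut iv_bytes)?; // rejects anything but 8 hex chars
    Some((key_bytes, iv_bytes, content))
}
```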

BCPack Compression

BCPack Usage

c
c:{3}

BCPack compression uses a proprietary algorithm optimized for specific content types. An optional BCn (block compression number) parameter selects the mode, in range [1, 7]:

bcpack := "c" [ ":" "{" bcn "}" ]

Block-Based BCPack

b:{64K*=c}
b:{64K*=c:{5}}

Variable 64KB blocks using BCPack compression.

GDeflate Compression

GDeflate Usage

g
g:{6}

GDeflate is a GPU-accelerated deflate variant designed for DirectStorage. An optional compression level parameter can be specified in range [1, 12]:

gdeflate := "g" [ ":" "{" level "}" ]

Block-Based GDeflate

b:{32K*=g}
b:{32K*=g:{8}}

Variable 32KB blocks using GDeflate compression.
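
BCPack and GDeflate share the same shape: a bare identifier or an identifier with one braced number. A minimal sketch with hypothetical function names, using the ranges [1, 7] and [1, 12] given above:

```rust
/// Parse `<ident>` or `<ident>:{n}` and check `n` against an inclusive range.
/// Returns Some(None) for the bare form.
fn parse_optional_param(spec: &str, ident: char, min: u8, max: u8) -> Option<Option<u8>> {
    let rest = spec.strip_prefix(ident)?;
    if rest.is_empty() {
        return Some(None);
    }
    let digits = rest.strip_prefix(":{")?.strip_suffix('}')?;
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let n: u8 = digits.parse().ok()?;
    if (min..=max).contains(&n) {
        Some(Some(n))
    } else {
        None
    }
}

/// BCPack: optional BCn value in [1, 7].
fn parse_bcpack(spec: &str) -> Option<Option<u8>> {
    parse_optional_param(spec, 'c', 1, 7)
}

/// GDeflate: optional level in [1, 12].
fn parse_gdeflate(spec: &str) -> Option<Option<u8>> {
    parse_optional_param(spec, 'g', 1, 12)
}
```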

Usage Contexts

PatchConfig Files

ESpec appears in patch-entry lines:

patch-entry = source_hash target_hash size espec

Example:

patch-entry = 1234567890abcdef abcdef1234567890 524288 b:{16K*=z}
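
A line in this layout can be split on whitespace, because an ESpec string contains no spaces. The sketch below assumes exactly the four-field layout shown above; `PatchEntry` and `parse_patch_entry` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
struct PatchEntry<'a> {
    source_hash: &'a str,
    target_hash: &'a str,
    size: u64,
    espec: &'a str,
}

fn parse_patch_entry(line: &str) -> Option<PatchEntry<'_>> {
    // Split at the first '=' only; the ESpec itself may contain '='
    let (key, value) = line.split_once('=')?;
    if key.trim() != "patch-entry" {
        return None;
    }
    let mut fields = value.split_whitespace();
    Some(PatchEntry {
        source_hash: fields.next()?,
        target_hash: fields.next()?,
        size: fields.next()?.parse().ok()?,
        espec: fields.next()?,
    })
}
```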

Encoding Files

Encoding files use ESpec for content encoding specifications:

content_key encoded_key size espec

BLTE Data Blocks

BLTE headers contain ESpec for block processing instructions:

graph TD
    A[BLTE Header] --> B[Block Count]
    A --> C[ESpec]
    C --> D[Block 1 Processing]
    C --> E[Block 2 Processing]
    C --> F[Block N Processing]

Parser Implementation

Tokenization

ESpec parsing requires tokenization of:

  1. Identifiers: Single characters (n, z, e, b, c, g)
  2. Numbers: Decimal integers
  3. Units: Size suffixes (K, M)
  4. Delimiters: Braces, colons, commas, equals, asterisks
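
A minimal tokenizer for the four token classes above. This is a sketch under stated assumptions, not the cascette-formats lexer: the `Token` type is hypothetical, the `Word` token for multi-character names such as mpq is an addition, and hex key and IV fields inside `e:{...}` are out of scope because they need context-dependent lexing.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Ident(char),  // n, z, e, b, c, g
    Number(u64),  // decimal integer
    Unit(char),   // K or M
    Word(String), // hypothetical: multi-character names such as "mpq"
    Delim(char),  // { } : , = *
}

fn tokenize(input: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        match c {
            '{' | '}' | ':' | ',' | '=' | '*' => {
                tokens.push(Token::Delim(c));
                i += 1;
            }
            'K' | 'M' => {
                tokens.push(Token::Unit(c));
                i += 1;
            }
            '0'..='9' => {
                let start = i;
                while i < chars.len() && chars[i].is_ascii_digit() {
                    i += 1;
                }
                let text: String = chars[start..i].iter().collect();
                let value = text
                    .parse::<u64>()
                    .map_err(|_| format!("number too large: {}", text))?;
                tokens.push(Token::Number(value));
            }
            'a'..='z' => {
                // Consume a run of lowercase letters and digits ("lz4hc")
                let start = i;
                while i < chars.len()
                    && (chars[i].is_ascii_lowercase() || chars[i].is_ascii_digit())
                {
                    i += 1;
                }
                let text: String = chars[start..i].iter().collect();
                if text.len() == 1 && "nzebcg".contains(c) {
                    tokens.push(Token::Ident(c));
                } else {
                    tokens.push(Token::Word(text));
                }
            }
            _ => return Err(format!("unexpected character {:?} at offset {}", c, i)),
        }
    }
    Ok(tokens)
}
```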

Grammar Rules

#![allow(unused)]
fn main() {
// Example parser structure
enum ESpec {
    Plain,
    Zlib { level: Option<u8>, variant: Option<String> },
    Encryption { key: String, iv: String, content: Box<ESpec> },
    Block { specs: Vec<BlockSpec> },
    BCPack,
    GDeflate,
}

struct BlockSpec {
    size: u64,
    count: BlockCount,
    encoding: ESpec,
}

enum BlockCount {
    Exact(u32),
    Variable,
}
}

Error Handling

Common parsing errors:

  • Invalid identifier characters

  • Malformed block specifications

  • Missing required parameters

  • Invalid size or count values

  • Unbalanced braces

Validation Rules

Size Constraints

  • Block sizes must be positive integers

  • Maximum block size typically limited to several MB

  • Minimum block size typically 1 byte

Count Constraints

  • Block counts must be positive integers when specified

  • Variable count (*) requires size specification

  • Total content size must be consistent

Encoding Constraints

  • Encryption requires valid key and IV lengths

  • Compression levels must be within algorithm-specific ranges

  • Nested encodings must be logically valid

Performance Considerations

Block Size Selection

Block sizes depend on usage:

  • Small blocks (1-4KB): Better for streaming, higher overhead

  • Medium blocks (16-64KB): Balanced performance

  • Large blocks (256KB+): Better compression ratios, higher memory usage

Compression Algorithm Selection

Algorithm characteristics:

  • zlib: Universal compatibility, good compression

  • BCPack: Optimized for specific content types

  • GDeflate: Fast compression with good ratios

  • None (n): Maximum speed, no space savings

Memory Usage

#![allow(unused)]
fn main() {
// Example memory-efficient processing
fn process_blocks(espec: &ESpec, data: &[u8]) -> Result<Vec<u8>> {
    match espec {
        ESpec::Block { specs } => {
            let mut output = Vec::new();
            let mut offset = 0;

            for spec in specs {
                let block_data = &data[offset..offset + spec.size as usize];
                let processed = process_encoding(&spec.encoding, block_data)?;
                output.extend(processed);
                offset += spec.size as usize;
            }

            Ok(output)
        }
        // Other encoding types are processed as a single unit
        other => process_encoding(other, data),
    }
}
}

Common Patterns

Streaming-Optimized

b:{16K*=z}

Small, consistent block sizes for streaming applications.

Storage-Optimized

b:{1M*=z:{9}}

Large blocks with maximum compression for storage efficiency.

Mixed Content

b:{4K=n,64K*=z,4K=n}

Headers and footers uncompressed, bulk content compressed.

Encrypted Streaming

b:{32K*=e:{key,iv,z:{6}}}

Moderate block sizes with encryption and balanced compression.

Debugging and Validation

ESpec Validation

#![allow(unused)]
fn main() {
fn validate_espec(espec: &str) -> Result<ESpec, ESpecError> {
    let parsed = parse_espec(espec)?;
    validate_constraints(&parsed)?;
    Ok(parsed)
}

fn validate_constraints(espec: &ESpec) -> Result<(), ESpecError> {
    match espec {
        // Valid levels are 1-9; level 0 is rejected as well
        ESpec::Zlib { level: Some(level), .. } if *level == 0 || *level > 9 => {
            Err(ESpecError::InvalidCompressionLevel(*level))
        }
        ESpec::Block { specs } if specs.is_empty() => {
            Err(ESpecError::EmptyBlockSpec)
        }
        // Additional validation rules...
        _ => Ok(())
    }
}
}

Round-Trip Testing

#![allow(unused)]
fn main() {
#[test]
fn test_espec_round_trip() {
    let original = "b:{16K*=z:{6}}";
    let parsed = parse_espec(original).unwrap();
    let serialized = serialize_espec(&parsed);
    assert_eq!(original, serialized);
}
}

Integration Examples

BLTE Block Processing

#![allow(unused)]
fn main() {
fn process_blte_block(espec: &ESpec, input: &[u8]) -> Result<Vec<u8>> {
    match espec {
        ESpec::Plain => Ok(input.to_vec()),
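        // The compression level only matters when encoding, so it is ignored here
        ESpec::Zlib { .. } => decompress_zlib(input),
        ESpec::Encryption { key, iv, content } => {
            let decrypted = decrypt(input, key, iv)?;
            process_blte_block(content, &decrypted)
        }
        ESpec::Block { specs } => process_block_specs(specs, input),
        // Not covered by this example
        ESpec::BCPack | ESpec::GDeflate => unimplemented!("BCPack and GDeflate decoding"),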
    }
}
}

Patch Application

#![allow(unused)]
fn main() {
fn apply_patch_with_espec(
    source: &[u8],
    patch: &[u8],
    espec: &ESpec
) -> Result<Vec<u8>> {
    let processed_patch = process_blte_block(espec, patch)?;
    apply_binary_patch(source, &processed_patch)
}
}

Reference Implementation

Complete Parser

#![allow(unused)]
fn main() {
// Uses the function-call combinator style of nom 7 and earlier
use nom::{
    branch::alt,
    character::complete::{alphanumeric1, char, digit1},
    combinator::{map, opt},
    sequence::{delimited, preceded, separated_pair, tuple},
    IResult,
};

// parse_encryption, parse_block, parse_bcpack, and parse_gdeflate follow the
// same pattern as the two parsers shown below and are omitted here.
pub fn parse_espec(input: &str) -> IResult<&str, ESpec> {
    alt((
        parse_plain,
        parse_zlib,
        parse_encryption,
        parse_block,
        parse_bcpack,
        parse_gdeflate,
    ))(input)
}

fn parse_plain(input: &str) -> IResult<&str, ESpec> {
    map(char('n'), |_| ESpec::Plain)(input)
}

fn parse_zlib(input: &str) -> IResult<&str, ESpec> {
    map(
        tuple((
            char('z'),
            opt(preceded(
                char(':'),
                delimited(
                    char('{'),
                    separated_pair(
                        digit1,
                        opt(char(',')),
                        opt(alphanumeric1)
                    ),
                    char('}')
                )
            ))
        )),
        |(_, params)| match params {
            Some((level, variant)) => ESpec::Zlib {
                level: level.parse().ok(),
                variant: variant.map(|s| s.to_string()),
            },
            None => ESpec::Zlib { level: None, variant: None },
        }
    )(input)
}
}

Implementation Status

Rust Implementation (cascette-formats)

ESpec parser:

  • Plain (n) - Uncompressed content
  • ZLib compression (z) - Level [1,9], variant (mpq/zlib/lz4hc), window bits [8,15]; all optional, 3-param syntax supported
  • Encryption (e) - Key, IV, and nested content encoding
  • Block-based (b) - Variable and fixed block specifications
  • BCPack (c) - Optional BCn version [1,7]; bare c accepted
  • GDeflate (g) - Optional level [1,12]; bare g accepted

Parser Features:

  • Safe integer casting with try_from to prevent truncation
  • Display trait implementation for round-trip string conversion
  • Test suite covering production ESpec patterns and edge cases
  • Integration with BLTE and Encoding file processing

Analysis and Validation

ESpec patterns are validated across all CASC formats to ensure correct parsing and processing of compression and encryption specifications.

Salsa20 Encryption in CASC

Salsa20 is the primary stream cipher used for encrypting sensitive content in CASC archives. It provides fast, secure encryption for game assets while maintaining streaming capabilities.

Overview

CASC uses Salsa20 with 128-bit (16-byte) keys and the tau (“expand 16-byte k”) constants. Each encrypted BLTE block specifies a 64-bit key name for key store lookup and a 4-byte IV that is extended to 8 bytes by zero-padding.

Algorithm Details

Salsa20 Core

Salsa20 is a stream cipher designed by Daniel J. Bernstein:

  • Key size: 128 bits (16 bytes) in CASC; 256 bits (32 bytes) in standard Salsa20

  • Nonce/IV size: 64 bits (8 bytes)

  • Block size: 512 bits (64 bytes)

  • Rounds: 20 (reduced variants use 8 or 12)

Core Function

#![allow(unused)]
fn main() {
fn salsa20_core(input: &[u32; 16]) -> [u32; 16] {
    let mut x = *input;

    // 20 rounds (10 double-rounds)
    for _ in 0..10 {
        // Column round
        quarter_round(&mut x, 0, 4, 8, 12);
        quarter_round(&mut x, 5, 9, 13, 1);
        quarter_round(&mut x, 10, 14, 2, 6);
        quarter_round(&mut x, 15, 3, 7, 11);

        // Row round
        quarter_round(&mut x, 0, 1, 2, 3);
        quarter_round(&mut x, 5, 6, 7, 4);
        quarter_round(&mut x, 10, 11, 8, 9);
        quarter_round(&mut x, 15, 12, 13, 14);
    }

    // Add input to output
    for i in 0..16 {
        x[i] = x[i].wrapping_add(input[i]);
    }

    x
}

fn quarter_round(x: &mut [u32; 16], a: usize, b: usize, c: usize, d: usize) {
    x[b] ^= (x[a].wrapping_add(x[d])).rotate_left(7);
    x[c] ^= (x[b].wrapping_add(x[a])).rotate_left(9);
    x[d] ^= (x[c].wrapping_add(x[b])).rotate_left(13);
    x[a] ^= (x[d].wrapping_add(x[c])).rotate_left(18);
}
}

CASC Implementation

BLTE Encryption Block

In BLTE files, encrypted blocks use the following format:

[0x45] [key_name_size:1] [key_name:8] [iv_size:1] [iv:4] [type:1]
[encrypted_data...]

Where:

  • 0x45: ‘E’ marker for encrypted block

  • key_name: 64-bit key identifier

  • iv: Initialization vector (1-8 bytes, typically 4)

  • type: 0x53 (‘S’) for Salsa20. 0x41 (‘A’) for ARC4 in legacy CASC versions (not used in TACT 3.13.3+)
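
The layout above can be parsed without any unchecked indexing. The following is a sketch only: `EncryptedHeader` and `parse_encrypted_header` are hypothetical names, the input is assumed to start at the 0x45 marker, and the IV length is limited to the 1-8 byte range given in the list.

```rust
#[derive(Debug, PartialEq)]
struct EncryptedHeader<'a> {
    key_name: &'a [u8],
    iv: &'a [u8],
    cipher_type: u8, // 0x53 ('S') or 0x41 ('A')
    ciphertext: &'a [u8],
}

/// Parse an encrypted block that starts at the 0x45 ('E') marker.
/// Returns None on a wrong marker or truncated input instead of panicking.
fn parse_encrypted_header(block: &[u8]) -> Option<EncryptedHeader<'_>> {
    let (&marker, rest) = block.split_first()?;
    if marker != 0x45 {
        return None;
    }
    let (&key_name_size, rest) = rest.split_first()?;
    if rest.len() < key_name_size as usize {
        return None;
    }
    let (key_name, rest) = rest.split_at(key_name_size as usize);

    let (&iv_size, rest) = rest.split_first()?;
    if !(1..=8).contains(&iv_size) || rest.len() < iv_size as usize {
        return None;
    }
    let (iv, rest) = rest.split_at(iv_size as usize);

    // One type byte, then everything else is ciphertext
    let (&cipher_type, ciphertext) = rest.split_first()?;
    Some(EncryptedHeader { key_name, iv, cipher_type, ciphertext })
}
```

Returning borrowed slices avoids copying the ciphertext before it is decrypted.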

Key Lookup

CASC uses a 64-bit key name to look up the 16-byte encryption key from a key store. The agent calls a key getter callback with the key name; there is no key derivation in the encryption path.

#![allow(unused)]
fn main() {
struct CASCKeyManager {
    keys: HashMap<u64, [u8; 16]>,  // key_name -> 16-byte key
}

impl CASCKeyManager {
    pub fn get_key(&self, key_name: u64) -> Option<[u8; 16]> {
        self.keys.get(&key_name).copied()
    }
}
}

IV Modification for Chunks

For multi-chunk BLTE files, the IV is modified per chunk:

#![allow(unused)]
fn main() {
fn modify_iv_for_chunk(base_iv: u32, chunk_index: usize) -> u32 {
    let mut iv_bytes = base_iv.to_le_bytes();

    // XOR with chunk index
    for i in 0..4 {
        iv_bytes[i] ^= ((chunk_index >> (i * 8)) & 0xFF) as u8;
    }

    u32::from_le_bytes(iv_bytes)
}
}
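
Combining the chunk-index XOR with the zero-padding described in the Overview gives the 8-byte Salsa20 nonce. The sketch below assumes a 4-byte IV and a chunk index that fits in 32 bits; `nonce_for_chunk` is a hypothetical name.

```rust
/// XOR the little-endian chunk index into a 4-byte IV, then zero-pad the
/// result to the 8-byte Salsa20 nonce.
fn nonce_for_chunk(iv: [u8; 4], chunk_index: u32) -> [u8; 8] {
    let index_bytes = chunk_index.to_le_bytes();
    let mut nonce = [0u8; 8]; // upper four bytes stay zero
    for i in 0..4 {
        nonce[i] = iv[i] ^ index_bytes[i];
    }
    nonce
}
```

For chunk 0 the IV passes through unchanged, so single-chunk files use the IV from the header as is.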

Salsa20 State Setup

State Initialization

#![allow(unused)]
fn main() {
struct Salsa20State {
    state: [u32; 16],
    counter: u64,
}

impl Salsa20State {
    pub fn new(key: &[u8; 16], nonce: &[u8; 8]) -> Self {
        let mut state = [0u32; 16];

        // Tau constants "expand 16-byte k" (CASC uses 16-byte keys)
        state[0]  = 0x61707865; // "expa"
        state[5]  = 0x3120646e; // "nd 1"
        state[10] = 0x79622d36; // "6-by"
        state[15] = 0x6b206574; // "te k"

        // 16-byte key placed at positions 1-4 and duplicated at 11-14
        for i in 0..4 {
            let word = u32::from_le_bytes([
                key[i * 4],
                key[i * 4 + 1],
                key[i * 4 + 2],
                key[i * 4 + 3],
            ]);
            state[1 + i] = word;
            state[11 + i] = word;  // Duplicate for 16-byte key mode
        }

        // Counter (initially 0)
        state[8] = 0;
        state[9] = 0;

        // Nonce
        state[6] = u32::from_le_bytes([nonce[0], nonce[1], nonce[2], nonce[3]]);
        state[7] = u32::from_le_bytes([nonce[4], nonce[5], nonce[6], nonce[7]]);

        Salsa20State { state, counter: 0 }
    }
}
}

Encryption/Decryption

Stream Generation

#![allow(unused)]
fn main() {
impl Salsa20State {
    pub fn generate_keystream(&mut self, output: &mut [u8]) {
        let mut pos = 0;

        while pos < output.len() {
            // Generate next block
            let block = salsa20_core(&self.state);

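            // Serialize each word as little-endian, as Salsa20 specifies.
            // This avoids unsafe code and is correct on big-endian hosts too.
            let mut block_bytes = [0u8; 64];
            for (i, word) in block.iter().enumerate() {
                block_bytes[i * 4..i * 4 + 4].copy_from_slice(&word.to_le_bytes());
            }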

            // Copy to output
            let copy_len = std::cmp::min(64, output.len() - pos);
            output[pos..pos + copy_len]
                .copy_from_slice(&block_bytes[..copy_len]);

            // Increment counter
            self.increment_counter();
            pos += copy_len;
        }
    }

    fn increment_counter(&mut self) {
        self.counter += 1;
        self.state[8] = (self.counter & 0xFFFFFFFF) as u32;
        self.state[9] = (self.counter >> 32) as u32;
    }
}
}

Decryption Process

#![allow(unused)]
fn main() {
pub fn decrypt_salsa20(
    ciphertext: &[u8],
    key: &[u8; 16],  // CASC uses 16-byte keys, matching Salsa20State::new
    nonce: &[u8; 8]
) -> Vec<u8> {
    let mut state = Salsa20State::new(key, nonce);
    let mut keystream = vec![0u8; ciphertext.len()];
    state.generate_keystream(&mut keystream);

    // XOR ciphertext with keystream
    let mut plaintext = Vec::with_capacity(ciphertext.len());
    for i in 0..ciphertext.len() {
        plaintext.push(ciphertext[i] ^ keystream[i]);
    }

    plaintext
}
}

CASC-Specific Usage

BLTE Decryption

#![allow(unused)]
fn main() {
fn decrypt_blte_chunk(
    chunk_data: &[u8],
    chunk_index: usize,
    key_manager: &CASCKeyManager
) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
    // Parse encryption header. `chunk_data` starts after the 'E' marker byte.
    // Indexing panics on truncated input; production code should check lengths first.
    let key_name_size = chunk_data[0] as usize;
    let key_name = u64::from_le_bytes(
        chunk_data[1..1 + key_name_size].try_into()?
    );

    let iv_offset = 1 + key_name_size;
    let iv_size = chunk_data[iv_offset] as usize;
    // try_into fails unless iv_size is 4, the only IV length this example handles
    let base_iv = u32::from_le_bytes(
        chunk_data[iv_offset + 1..iv_offset + 1 + iv_size].try_into()?
    );

    let cipher_type = chunk_data[iv_offset + 1 + iv_size];

    if cipher_type != 0x53 {  // 'S' for Salsa20
        return Err("Not Salsa20 encrypted".into());
    }

    // Get encryption key
    let key = key_manager.get_key(key_name)
        .ok_or("Key not found")?;

    // Modify IV for chunk
    let iv = modify_iv_for_chunk(base_iv, chunk_index);
    let mut nonce = [0u8; 8];
    nonce[..4].copy_from_slice(&iv.to_le_bytes());

    // Decrypt data
    let encrypted_offset = iv_offset + 1 + iv_size + 1;
    let ciphertext = &chunk_data[encrypted_offset..];

    Ok(decrypt_salsa20(ciphertext, &key, &nonce))
}
}

Known Keys

CASC uses various encryption keys for different content:

#![allow(unused)]
fn main() {
// Example key names (actual keys not included for legal reasons)
const CINEMATIC_KEY: u64 = 0xFAC5C7F366D20C85;
const ACHIEVEMENT_KEY: u64 = 0x0123456789ABCDEF;
const PVP_KEY: u64 = 0xDEADBEEFCAFEBABE;
}

Performance Optimization

SIMD Implementation

Using SIMD for parallel processing:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

unsafe fn salsa20_core_simd(input: &[u32; 16]) -> [u32; 16] {
    // Load state into SIMD registers
    let mut row0 = _mm_loadu_si128(input[0..4].as_ptr() as *const __m128i);
    let mut row1 = _mm_loadu_si128(input[4..8].as_ptr() as *const __m128i);
    let mut row2 = _mm_loadu_si128(input[8..12].as_ptr() as *const __m128i);
    let mut row3 = _mm_loadu_si128(input[12..16].as_ptr() as *const __m128i);

    // Perform rounds using SIMD operations
    // ... (implementation details)

    // Store results
    let mut output = [0u32; 16];
    _mm_storeu_si128(output[0..4].as_mut_ptr() as *mut __m128i, row0);
    _mm_storeu_si128(output[4..8].as_mut_ptr() as *mut __m128i, row1);
    _mm_storeu_si128(output[8..12].as_mut_ptr() as *mut __m128i, row2);
    _mm_storeu_si128(output[12..16].as_mut_ptr() as *mut __m128i, row3);

    output
}
}

Buffered Decryption

For large files:

#![allow(unused)]
fn main() {
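use std::io::{Read, Result, Write};

struct BufferedSalsa20 {
    state: Salsa20State,
    buffer: [u8; 4096],
    buffer_pos: usize,
}

impl BufferedSalsa20 {
    pub fn decrypt_stream<R: Read, W: Write>(
        &mut self,
        input: &mut R,
        output: &mut W
    ) -> Result<()> {
        let mut cipher_buffer = [0u8; 4096];

        loop {
            // Fill the buffer completely unless EOF is reached. generate_keystream
            // advances the counter once per 64-byte block, so a short read in
            // mid-stream would otherwise desynchronize the keystream.
            let mut filled = 0;
            while filled < cipher_buffer.len() {
                let bytes_read = input.read(&mut cipher_buffer[filled..])?;
                if bytes_read == 0 {
                    break;
                }
                filled += bytes_read;
            }
            if filled == 0 {
                break;
            }

            self.state.generate_keystream(&mut self.buffer[..filled]);

            for i in 0..filled {
                self.buffer[i] ^= cipher_buffer[i];
            }

            output.write_all(&self.buffer[..filled])?;
        }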

        Ok(())
    }
}
}

Security Considerations

  1. IV Uniqueness: IVs must not be reused with the same key (CASC handles this via chunk index XOR)
  2. Side Channels: Use constant-time operations for key comparison
  3. Key Storage: CASC encryption keys are static and community-maintained; the TactKeyStore keeps them in memory with redacted debug output

Testing

Test Vectors

#![allow(unused)]
fn main() {
#[cfg(test)]
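mod tests {
    use super::*;

    #[test]
    fn test_salsa20_round_trip() {
        // CASC uses 16-byte keys
        let key = [0u8; 16];
        let nonce = [0u8; 8];
        let plaintext = b"Hello, World!";

        // Salsa20 is a stream cipher: encryption and decryption are the same
        // XOR with the keystream, so decrypt_salsa20 serves both directions.
        let ciphertext = decrypt_salsa20(plaintext, &key, &nonce);
        let decrypted = decrypt_salsa20(&ciphertext, &key, &nonce);

        assert_eq!(plaintext, &decrypted[..]);
    }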
}
}

cascette-crypto API

The cascette-crypto crate provides a CASC-specific Salsa20 implementation.

Basic Decryption

#![allow(unused)]
fn main() {
use cascette_crypto::salsa20::{decrypt_salsa20, Salsa20Cipher};

// CASC uses 16-byte keys and 4-byte IVs
let key: [u8; 16] = [0x01; 16];
let iv: [u8; 4] = [0x02, 0x03, 0x04, 0x05];
let block_index = 0; // First block in BLTE file

let ciphertext = &[/* encrypted data */];
let plaintext = decrypt_salsa20(ciphertext, &key, &iv, block_index)
    .expect("decryption failed");
}

In-Place Processing

#![allow(unused)]
fn main() {
use cascette_crypto::Salsa20Cipher;

let key: [u8; 16] = [0x42; 16];
let iv: [u8; 4] = [0x11, 0x22, 0x33, 0x44];

let mut cipher = Salsa20Cipher::new(&key, &iv, 0)
    .expect("cipher creation failed");

let mut data = vec![0u8; 1024];
cipher.apply_keystream(&mut data);
}

TACT Key Management

#![allow(unused)]
fn main() {
use cascette_crypto::{TactKeyStore, TactKey};

// Create store with hardcoded WoW keys
let store = TactKeyStore::new();

// Look up key by ID
let key_id = 0xFA505078126ACB3E_u64;
if let Some(key) = store.get(key_id) {
    // Use key for decryption
    println!("Found key: {:02X?}", key);
}

// Add custom key
let mut store = TactKeyStore::empty();
let key = TactKey::from_hex(
    0x1234567890ABCDEF,
    "0123456789ABCDEF0123456789ABCDEF"
).expect("invalid key hex");
store.add(key);

// Load keys from string content (file I/O is caller's responsibility)
let csv_content = "FA505078126ACB3E,BDC51862ABED79B2DE48C8E7E66C6200";
store.load_from_csv(csv_content);

let txt_content = "FA505078126ACB3E BDC51862ABED79B2DE48C8E7E66C6200";
store.load_from_txt(txt_content);
}

Custom Storage Backends

The TactKeyProvider trait allows implementing custom key storage:

#![allow(unused)]
fn main() {
use cascette_crypto::{TactKeyProvider, TactKey, CryptoError};

// Implement for keyring, database, encrypted files, etc.
struct MyKeyStore { /* ... */ }

impl TactKeyProvider for MyKeyStore {
    fn get_key(&self, id: u64) -> Result<Option<[u8; 16]>, CryptoError> {
        // Look up key from your storage backend
        todo!()
    }

    fn add_key(&mut self, key: TactKey) -> Result<(), CryptoError> {
        // Store key in your backend
        todo!()
    }

    // ... other trait methods
}
}

ARC4 (Legacy)

#![allow(unused)]
fn main() {
use cascette_crypto::Arc4Cipher;

// ARC4 used in older BLTE encrypted blocks
let key = b"encryption_key";
let mut cipher = Arc4Cipher::new(key)
    .expect("cipher creation failed");

let encrypted = cipher.encrypt(b"plaintext");

// Decrypt requires fresh cipher instance
let mut cipher = Arc4Cipher::new(key)
    .expect("cipher creation failed");
let decrypted = cipher.decrypt(&encrypted);
}

Implementation Details

CASC-Specific Differences

The CASC Salsa20 variant differs from standard Salsa20:

Aspect          Standard Salsa20      CASC Salsa20
Key size        32 bytes              16 bytes (duplicated internally)
IV/Nonce size   8 bytes               4 bytes (extended internally)
Constants       “expand 32-byte k”    “expand 16-byte k”
Block index     Counter-based         XORed with IV

Key Duplication

CASC uses 16-byte keys with the “expand 16-byte k” (tau) constants:

#![allow(unused)]
fn main() {
// Tau constants for 16-byte keys
state[0]  = 0x61707865; // "expa"
state[5]  = 0x3120646e; // "nd 1"
state[10] = 0x79622d36; // "6-by"
state[15] = 0x6b206574; // "te k"

// Key bytes 0-15 placed at positions 1-4
// Key bytes 0-15 repeated at positions 11-14
}

IV Extension

The IV modification and zero-padding algorithm is documented in the CASC Implementation section above.

Validation Status

  • Integration tests with real WoW encryption keys

  • Test suite validates against known BLTE ‘E’ mode samples

  • Zero-allocation keystream generation for performance

Note: CascLib duplicates the IV instead of zero-padding it. cascette-rs had the same bug before it was fixed. The correct behavior is zero-padding.

TACT Key Coverage

The cascette-crypto crate includes hardcoded TACT keys for major WoW expansions:

  • Battle for Azeroth, Shadowlands, The War Within, Classic Era

Keys are stored with redacted debug output to prevent accidental logging.

CDN Architecture Documentation

Overview

NGDP uses a Content Delivery Network (CDN) architecture for distributing game content. The system provides geographical distribution of content through HTTP/HTTPS endpoints, with automatic failover and load balancing capabilities.

Note: Code examples in this document illustrate concepts. For working implementations, see the cascette CLI or the cascette-protocol crate.

Discovery and Access Flow

Product Discovery

Product discovery begins with a v1/summary query to the Ribbit TCP service:

sequenceDiagram
    participant Client
    participant Ribbit
    participant CDN

    Client->>Ribbit: v1/summary (TCP)
    Ribbit-->>Client: Available products

    Client->>Ribbit: v2/versions/{product}
    Ribbit-->>Client: Version manifests

    Client->>Ribbit: v2/cdns/{product}
    Ribbit-->>Client: CDN configurations

    Client->>CDN: HTTP GET config files
    CDN-->>Client: BuildConfig, CDNConfig

    Client->>CDN: HTTP GET content files
    CDN-->>Client: Game data

Region Selection

NGDP supports the following regions:

  • us: United States

  • eu: Europe

  • kr: Korea

  • tw: Taiwan

  • cn: China (restricted access)

  • sg: Singapore

HTTPS v2 Endpoints

The v2 API provides three primary endpoints:

  • versions: Product version information and build manifests

  • cdns: CDN server configurations and endpoints

  • bgdl: Background download configurations

Configuration Retrieval Process

  1. Query product versions to get current build information
  2. Retrieve CDN configurations to get the correct Path value
  3. Download BuildConfig and CDNConfig files using the Path from step 2
  4. Parse configuration to locate content files
  5. Begin content download from CDN servers

CRITICAL: Always extract the Path field from CDN responses. Never assume paths based on product names. For example, all WoW products (wow, wow_classic, wow_classic_era, wow_classic_titan, wow_anniversary) use tpr/wow despite having different product codes.

Content Download Workflow

flowchart TD
    A[Get Product Versions] --> B[Select Build]
    B --> C[Get CDN Config]
    C --> D[Download BuildConfig]
    D --> E[Download CDNConfig]
    E --> F[Parse Archive Lists]
    F --> G[Download Content Files]
    G --> H[Verify Content Hashes]

    style A stroke-width:3px
    style H stroke-width:3px
    style C stroke-width:2px,stroke-dasharray:5 5
    style E stroke-width:2px,stroke-dasharray:5 5

CDN URL Construction

URL Pattern

http(s)://{cdn_server}/{cdn_path}/{type}/{hash[0:2]}/{hash[2:4]}/{full_hash}

Component Breakdown

  • cdn_server: CDN hostname from the Hosts field (e.g., level3.blizzard.com)

  • cdn_path: Path from the Path field - MUST be extracted from CDN response

  • type: Content type (config, data, patch)

  • hash[0:2]: First two characters of content hash

  • hash[2:4]: Next two characters of content hash

  • full_hash: Complete content hash
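
A minimal sketch of assembling a URL from these components. The function name is hypothetical, the scheme is fixed to http for brevity, and the caller is expected to pass the Path value taken from the CDN response.

```rust
/// Build `http://{host}/{cdn_path}/{kind}/{hash[0:2]}/{hash[2:4]}/{hash}`.
/// Returns None if the hash is too short or contains non-hex characters.
fn cdn_url(host: &str, cdn_path: &str, kind: &str, hash: &str) -> Option<String> {
    if hash.len() < 4 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(format!(
        "http://{}/{}/{}/{}/{}/{}",
        host,
        cdn_path.trim_matches('/'),
        kind, // "config", "data", or "patch"
        &hash[0..2],
        &hash[2..4],
        hash
    ))
}
```

Calling `cdn_url("cdn.arctium.tools", "tpr/wow", "config", "ae66faee0ac786fdd7d8b4cf90a8d5b9")` reproduces the build configuration URL listed under Real-World Examples.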

Path vs ProductPath Distinction

IMPORTANT: The CDN response contains two path fields that serve different purposes:

  • Path (e.g., tpr/wow): Used for ALL game content including:

    • Build configuration files (/config/)
    • CDN configuration files (/config/)
    • Encoding files (/data/)
    • Root files (/data/)
    • Archive files (/data/)
    • Patch files (/patch/)
    • All other game data
  • ProductPath (e.g., tpr/configs): Used ONLY for:

    • Product configuration files that Battle.net agent/launcher use
    • These are JSON files containing product metadata and settings
    • Example: http://cdn.arctium.tools/tpr/configs/data/{hash[0:2]}/{hash[2:4]}/{hash}

Common mistake: Do NOT use ProductPath for build configs, CDN configs, or any game data files. ProductPath is exclusively for Battle.net launcher product configuration.

Directory Sharding

The two-level directory structure (hash[0:2]/hash[2:4]) distributes files across 65,536 directories, keeping per-directory file counts low for filesystem and CDN edge server performance.

Example URLs

# Configuration file
http://level3.blizzard.com/tpr/wow/config/12/34/1234567890abcdef1234567890abcdef

# Game data file
http://level3.blizzard.com/tpr/wow/data/ab/cd/abcdef1234567890abcdef1234567890

# Patch data
http://level3.blizzard.com/tpr/wow/patch/56/78/567890abcdef1234567890abcdef1234

Real-World Examples

Examples from wow_classic_era version 1.15.7.61582 (archived on Arctium CDN):

# Build configuration (hash: ae66faee0ac786fdd7d8b4cf90a8d5b9)
http://cdn.arctium.tools/tpr/wow/config/ae/66/ae66faee0ac786fdd7d8b4cf90a8d5b9

# CDN configuration (hash: 63eee50d456a6ddf3b630957c024dda0)
http://cdn.arctium.tools/tpr/wow/config/63/ee/63eee50d456a6ddf3b630957c024dda0

# Patch configuration (hash: 474b9630df5b46df5d98ec27c5f78d07)
http://cdn.arctium.tools/tpr/wow/config/47/4b/474b9630df5b46df5d98ec27c5f78d07

# Product configuration (different path structure)
http://cdn.arctium.tools/tpr/configs/data/c9/93/c9934edfc8f217a2e01c47e4deae8454

# Encoding file (using encoding key, not content key!)
# From build config: encoding = b07b881f4527bda7cf8a1a2f99e8622e bbf06e7476382cfaa396cff0049d356b
# Must use the SECOND hash (encoding key): bbf06e7476382cfaa396cff0049d356b
http://cdn.arctium.tools/tpr/wow/data/bb/f0/bbf06e7476382cfaa396cff0049d356b

# Root file: Cannot be fetched directly!
# The root file's encoding key must be looked up in the encoding file first.
# The hash ea8aefdebdbd6429da905c8c6a2b1813 is the content key, not the encoding key.

Note the different path structures:

  • Most files use /tpr/wow/{type}/

  • Product configurations use /tpr/configs/data/

  • Patch files would be under /tpr/wow/patch/

Configuration Files

BuildConfig, CDNConfig, PatchConfig

See Configuration File Formats for the authoritative documentation of BuildConfig, CDNConfig, and PatchConfig fields, formats, and examples.

The key point for CDN access: most BuildConfig fields contain <content-key> <encoding-key> pairs. Use the encoding key (second hash) for CDN fetches. The encoding file must be fetched first to resolve encoding keys for other files.
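As a rough sketch of that rule, the helper below reads a `name = value` line from BuildConfig text and returns the second hash when one is present. The function names are hypothetical, and the `name = value` line format is assumed from the `encoding = ...` example earlier in this chapter.

```rust
/// Splits a BuildConfig value into (content_key, optional encoding_key).
pub fn split_key_pair(value: &str) -> Option<(&str, Option<&str>)> {
    let mut parts = value.split_whitespace();
    let ckey = parts.next()?;
    let ekey = parts.next();
    Some((ckey, ekey))
}

/// Returns the encoding key (second hash) for `name`, if the line has one.
/// A `None` result means the key must be resolved through the encoding file.
pub fn cdn_fetch_key<'a>(config: &'a str, name: &str) -> Option<&'a str> {
    for line in config.lines() {
        let line = line.trim();
        if line.starts_with('#') {
            continue; // comment line
        }
        let Some((key, value)) = line.split_once('=') else {
            continue;
        };
        if key.trim() == name {
            let (_ckey, ekey) = split_key_pair(value)?;
            return ekey;
        }
    }
    None
}
```

With the example values above, `cdn_fetch_key(config, "encoding")` yields `bbf06e7476382cfaa396cff0049d356b`, while a `root` line that carries only a content key yields `None`.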

CDN Response Structure

Field Definitions

  • Name: CDN configuration identifier

  • Path: Base path for content requests

  • Hosts: List of CDN hostnames

  • Servers: Full server URLs with optional parameters (legacy format)

  • ConfigPath: Base path for product configuration files (e.g., tpr/configs/data)

Special Parameters

  • maxhosts: Maximum number of hosts to use simultaneously

  • fallback: Fallback CDN configuration
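A minimal sketch of reading these parameters from the Servers field, assuming they appear as URL query parameters (as in `https://cdn.arctium.tools/?maxhosts=4`). `ServerEntry` and `parse_servers` are hypothetical names, and treating `fallback=1` as a boolean flag is an assumption rather than documented behavior.

```rust
/// One entry of the space-separated Servers field (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct ServerEntry {
    pub base_url: String,
    pub max_hosts: Option<u32>,
    pub fallback: bool,
}

/// Parses entries such as `http://host/?maxhosts=4&fallback=1`.
pub fn parse_servers(field: &str) -> Vec<ServerEntry> {
    field
        .split_whitespace()
        .map(|entry| {
            let (base, query) = entry.split_once('?').unwrap_or((entry, ""));
            let mut max_hosts = None;
            let mut fallback = false;
            for pair in query.split('&').filter(|p| !p.is_empty()) {
                let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
                match key {
                    "maxhosts" => max_hosts = value.parse::<u32>().ok(),
                    // Assumption: `fallback=1` marks a fallback server.
                    "fallback" => fallback = value == "1",
                    _ => {} // ignore unknown parameters
                }
            }
            ServerEntry {
                base_url: base.to_string(),
                max_hosts,
                fallback,
            }
        })
        .collect()
}
```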

Example CDN Response

Name!STRING:0|Path!STRING:0|Hosts!STRING:0|Servers!STRING:0|ConfigPath!STRING:0
us|tpr/wow|level3.blizzard.com edgecast.blizzard.com|http://level3.blizzard.com/ http://edgecast.blizzard.com/|tpr/configs/data
eu|tpr/wow|eu.cdn.blizzard.com|http://eu.cdn.blizzard.com/|tpr/configs/data

Path Types

Content Types

  • config: Configuration files (BuildConfig, CDNConfig, etc.)

  • data: Game content files and archives

  • patch: Differential patch data

Usage Patterns

# Configuration files
/{cdn_path}/config/{hash_dirs}/{hash}

# Game data
/{cdn_path}/data/{hash_dirs}/{hash}

# Patch data
/{cdn_path}/patch/{hash_dirs}/{hash}

Implementation Requirements

Mandatory Components

Both BuildConfig AND CDNConfig are required for proper NGDP operation:

  • BuildConfig provides system file references

  • CDNConfig specifies content storage locations

  • Missing either file prevents content access

CDN Path Resolution

Extract the Path field from CDN responses as described in the Configuration Retrieval Process section. Cache the path per product for the session duration.

Fallback Logic

Implement fallback mechanisms:

  1. CDN Rotation: Cycle through available CDN servers
  2. Region Fallback: Fall back to alternate regions if available
  3. Protocol Fallback: HTTPS preferred, HTTP as fallback
  4. Retry Logic: Exponential backoff for failed requests
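One way to combine CDN rotation with protocol fallback is to precompute the ordered list of URLs to try. The sketch below is illustrative only: `candidate_urls` is a hypothetical helper, and the ordering (every host over HTTPS first, then every host over HTTP) is an assumption about how "HTTPS preferred" should be applied.

```rust
/// Returns the URLs to try, in order: all hosts over HTTPS, then all over HTTP.
pub fn candidate_urls(hosts: &[&str], path: &str) -> Vec<String> {
    let path = path.trim_start_matches('/');
    let mut urls = Vec::with_capacity(hosts.len() * 2);
    for scheme in ["https", "http"] {
        for host in hosts {
            urls.push(format!("{scheme}://{host}/{path}"));
        }
    }
    urls
}
```

A caller would walk this list in order, applying retry with backoff per URL before moving on to the next one.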

Rate Limiting

Implement client-side rate limiting:

  • Respect CDN server limitations

  • Implement connection pooling

  • Use appropriate request timeouts

  • Avoid overwhelming CDN infrastructure

Regional Restrictions

The China (cn) region needs special handling:

  • Limited CDN access

  • Different server infrastructure

  • Potential connectivity restrictions

  • Requires region-specific handling

Backup Servers

Community Mirrors

Several community-maintained mirrors provide NGDP content:

cdn.arctium.tools

  • Protocol: HTTP only

  • Status: Active

  • Coverage: Full NGDP content mirror

casc.wago.tools

  • Protocol: HTTP with HTTPS redirects

  • Status: Active

  • Coverage: Full NGDP mirror

archive.wow.tools

  • Protocol: HTTPS

  • Status: Active

  • Coverage: Historical NGDP content archive

Mirror Usage

# Primary CDN (preferred)
curl http://level3.blizzard.com/tpr/wow/data/12/34/1234567890abcdef1234567890abcdef

# Backup mirror
curl http://cdn.arctium.tools/tpr/wow/data/12/34/1234567890abcdef1234567890abcdef

File Types

Core Manifests

System files that define content structure:

  • root: Maps file paths to content keys

  • encoding: Maps content keys to encoded storage keys

  • install: Defines installation requirements and file tags

  • download: Specifies download priorities for streaming

  • size: Contains file size information

Storage Files

Content storage and indexing:

  • archives: Bulk content storage containers

  • indexes: Index files for locating content within archives

Encryption Files

Content protection and key management:

  • KeyRing: Encryption key storage format for protected content

File Type Usage

graph TD
    A[BuildConfig] --> B[Root File]
    A --> C[Encoding File]
    A --> D[Install Manifest]
    A --> E[Download Manifest]

    B --> F[Game Files]
    C --> G[Archive Content]
    D --> H[Installation Tags]
    E --> I[Download Priorities]

    J[CDNConfig] --> K[Archive Files]
    K --> L[Archive Indices]

    M[KeyRing] --> N[Encryption Keys]
    N --> O[Protected Content]

    style A stroke-width:4px
    style J stroke-width:4px
    style M stroke-width:3px,stroke-dasharray:5 5
    style B stroke-width:2px
    style C stroke-width:2px
    style D stroke-width:2px
    style E stroke-width:2px

Error Handling

HTTP Status Codes

  • 200: Successful content retrieval

  • 404: Content not found (may require fallback)

  • 416: Range not satisfiable (check request headers)

  • 503: Service unavailable (implement retry with backoff)
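The status handling above could be expressed as a small decision function. This is a sketch with hypothetical names (`FetchAction`, `action_for_status`); accepting 206 for range requests is an addition based on standard HTTP semantics and is not stated in the list above.

```rust
/// What the client should do next for a given HTTP status (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum FetchAction {
    Accept,
    TryNextSource,
    FixRequest,
    RetryWithBackoff,
    Fail,
}

pub fn action_for_status(status: u16) -> FetchAction {
    match status {
        // 206 Partial Content is the normal reply to a Range request.
        200 | 206 => FetchAction::Accept,
        404 => FetchAction::TryNextSource,
        416 => FetchAction::FixRequest,
        503 => FetchAction::RetryWithBackoff,
        _ => FetchAction::Fail,
    }
}
```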

Retry Strategies

use std::time::Duration;

// Example retry logic. `download` and the `Result` alias are assumed to be
// defined elsewhere in the crate.
async fn download_with_retry(url: &str, max_retries: u32) -> Result<Vec<u8>> {
    let mut attempts = 0;

    loop {
        match download(url).await {
            Ok(data) => return Ok(data),
            Err(_) if attempts < max_retries => {
                attempts += 1;
                // Exponential backoff: 2s, 4s, 8s, ...
                let delay = Duration::from_secs(2_u64.pow(attempts));
                tokio::time::sleep(delay).await;
            }
            // Retries exhausted: surface the last error
            Err(e) => return Err(e),
        }
    }
}

Content Verification

Always verify downloaded content:

  1. Check HTTP response status
  2. Verify content length if provided
  3. Validate content hash against expected value
  4. Retry from alternate CDN on mismatch

Streaming Architecture Implementation

Connection Pooling Architecture

use std::time::Duration;

// `CdnClient`, `ArchiveResult`, and `ArchiveError` are defined elsewhere in the crate.

/// Connection-pooled CDN client with retry logic
pub struct PooledCdnClient {
    /// Inner CDN client
    inner: CdnClient,
    /// Maximum concurrent connections
    max_connections: usize,
    /// Maximum retry attempts
    max_retries: usize,
    /// Initial retry delay
    retry_delay: Duration,
}

impl PooledCdnClient {
    /// Fetch range with exponential backoff retry logic
    pub async fn fetch_range_with_retry(
        &self,
        archive_hash: &str,
        offset: u64,
        size: u64,
    ) -> ArchiveResult<Vec<u8>> {
        let mut last_error = None;

        for attempt in 0..=self.max_retries {
            match self.inner.fetch_range(archive_hash, offset, size).await {
                Ok(data) => return Ok(data),
                Err(e) if attempt < self.max_retries && e.is_retryable() => {
                    // Exponential backoff; with a 100ms retry_delay this
                    // waits 100ms, 200ms, 400ms, 800ms...
                    let delay = self.retry_delay * (1u32 << attempt);
                    tokio::time::sleep(delay).await;
                    last_error = Some(e);
                }
                // Non-retryable error, or retries exhausted
                Err(e) => return Err(e),
            }
        }

        Err(last_error.unwrap_or_else(||
            ArchiveError::NetworkError("All retries exhausted".to_string())))
    }
}

CDN Failover Mechanisms

use std::sync::atomic::{AtomicUsize, Ordering};

/// Resilient archive resolver with fallback support
pub struct ResilientArchiveResolver {
    /// Primary resolver
    primary: CdnArchiveResolver,
    /// Fallback resolvers
    fallbacks: Vec<CdnArchiveResolver>,
    /// Error threshold before switching to fallback
    error_threshold: usize,
    /// Current error count (atomic for thread safety)
    error_count: AtomicUsize,
}

impl ResilientArchiveResolver {
    /// Fetch content with automatic fallback
    pub async fn fetch_content_resilient(&self, encoding_key: &[u8; 16]) -> ArchiveResult<Vec<u8>> {
        // Try primary resolver first
        match self.primary.fetch_content(encoding_key).await {
            Ok(content) => {
                // Reset error count on success
                self.error_count.store(0, Ordering::Relaxed);
                return Ok(content);
            }
            // Permanent errors will not succeed on a fallback either
            Err(e) if e.is_permanent() => return Err(e),
            Err(e) => {
                // fetch_add returns the previous value, so add 1 for the
                // current count instead of re-reading the atomic
                let errors = self.error_count.fetch_add(1, Ordering::Relaxed) + 1;

                // Try fallback resolvers once the error threshold is reached
                if errors >= self.error_threshold {
                    for fallback in &self.fallbacks {
                        if let Ok(content) = fallback.fetch_content(encoding_key).await {
                            return Ok(content);
                        }
                    }
                }

                // All fallbacks failed (or threshold not reached): report the primary error
                Err(e)
            }
        }
    }
}

Range Request Coalescing

use std::sync::Arc;

/// Streaming archive reader for network content
pub struct StreamingArchiveReader {
    /// CDN client for network operations
    client: Arc<PooledCdnClient>,
    /// Current archive being read
    archive_hash: String,
    /// Current offset in archive
    current_offset: u64,
    /// Remaining size to read
    remaining_size: u64,
    /// Chunk size for streaming reads (default 64KB)
    chunk_size: u64,
}

impl StreamingArchiveReader {
    /// Read the next chunk (at most `chunk_size` bytes)
    pub async fn read_chunk(&mut self) -> ArchiveResult<Option<Vec<u8>>> {
        if self.remaining_size == 0 {
            return Ok(None);
        }

        let chunk_size = self.chunk_size.min(self.remaining_size);

        let data = self
            .client
            .fetch_range_with_retry(&self.archive_hash, self.current_offset, chunk_size)
            .await?;

        // Verify response size matches request
        if data.len() as u64 != chunk_size {
            return Err(ArchiveError::IncompleteRangeResponse {
                requested: chunk_size,
                received: data.len() as u64,
            });
        }

        self.current_offset += chunk_size;
        self.remaining_size -= chunk_size;

        Ok(Some(data))
    }

    /// Read all remaining data in one request (coalescing)
    pub async fn read_all(&mut self) -> ArchiveResult<Vec<u8>> {
        if self.remaining_size == 0 {
            return Ok(Vec::new());
        }

        let data = self
            .client
            .fetch_range_with_retry(&self.archive_hash, self.current_offset, self.remaining_size)
            .await?;

        // Apply the same size check as read_chunk before advancing
        if data.len() as u64 != self.remaining_size {
            return Err(ArchiveError::IncompleteRangeResponse {
                requested: self.remaining_size,
                received: data.len() as u64,
            });
        }

        self.current_offset += self.remaining_size;
        self.remaining_size = 0;

        Ok(data)
    }
}

Circuit Breaker Pattern

use std::future::Future;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Circuit breaker states for CDN resilience
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CircuitState {
    Closed,    // Normal operation
    Open,      // Failing fast, not attempting requests
    HalfOpen,  // Testing if service recovered
}

/// Error returned by the breaker (illustrative wrapper type): either the
/// circuit rejected the call or the request itself failed
#[derive(Debug)]
pub enum CircuitError<E> {
    Open,
    Request(E),
}

/// Circuit breaker for CDN endpoints
pub struct CdnCircuitBreaker {
    state: Arc<Mutex<CircuitState>>,
    failure_count: Arc<AtomicUsize>,
    failure_threshold: usize,
    timeout: Duration,
    last_failure: Arc<Mutex<Option<Instant>>>,
}

impl CdnCircuitBreaker {
    /// Execute request with circuit breaker protection
    pub async fn execute<F, T, E>(&self, request: F) -> Result<T, CircuitError<E>>
    where
        F: Future<Output = Result<T, E>>,
        E: std::fmt::Debug,
    {
        // Check circuit state. The guard is taken once and dropped at the end
        // of this block: locking `state` again while a guard is alive would
        // deadlock, and a std guard must not be held across an await.
        {
            let mut state = self.state.lock().unwrap();
            if *state == CircuitState::Open {
                let last_failure = *self.last_failure.lock().unwrap();
                match last_failure {
                    // Timeout period has passed: transition to half-open
                    // and allow one test request
                    Some(at) if at.elapsed() > self.timeout => {
                        *state = CircuitState::HalfOpen;
                    }
                    // Still inside the timeout window: fail fast
                    Some(_) => return Err(CircuitError::Open),
                    None => {}
                }
            }
        }

        // Execute request
        match request.await {
            Ok(result) => {
                // Success - reset failure count and close circuit
                self.failure_count.store(0, Ordering::Relaxed);
                *self.state.lock().unwrap() = CircuitState::Closed;
                Ok(result)
            }
            Err(error) => {
                // Failure - increment count and possibly open circuit
                let failures = self.failure_count.fetch_add(1, Ordering::Relaxed) + 1;
                if failures >= self.failure_threshold {
                    *self.state.lock().unwrap() = CircuitState::Open;
                    *self.last_failure.lock().unwrap() = Some(Instant::now());
                }
                Err(CircuitError::Request(error))
            }
        }
    }
}

Caching Strategy

Implement efficient caching:

  • Cache configuration files with appropriate TTL

  • Use content-addressed storage for game files

  • Implement cache invalidation for updated content

  • Support offline operation with cached content

Security Considerations

  • Transport: Use HTTPS with certificate validation for all CDN requests
  • Content Integrity: Verify MD5 content hashes after download; reject mismatches and retry from an alternate CDN
  • Encryption Keys: CASC uses static community-maintained keys; see Salsa20 Encryption for key management details

Ribbit Protocol

Ribbit is a TCP-based protocol operating on port 1119 that serves as the discovery mechanism for NGDP. It provides version information, CDN endpoints, and configuration data for Blizzard products.

Protocol Variants

Ribbit has three access methods:

TCP Ribbit

Direct TCP connection to tcp://{region}.version.battle.net:1119

  • V1 Protocol: MIME-formatted responses with ASN.1 signatures and SHA-256 checksums

  • V2 Protocol: Raw BPSV responses without metadata

  • Endpoints: summary, products, certificates, and OCSP

HTTP TACT v1

HTTP wrapper at http://{region}.patch.battle.net:1119

  • Endpoints: /{product}/versions, /{product}/cdns, /{product}/bgdl

  • Response format: BPSV directly without MIME wrapping

  • No authentication: Public access

  • Connection pooling: Reusable HTTP connections

HTTPS TACT v2

Secure wrapper at https://{region}.version.battle.net (standard HTTPS port 443)

  • Same endpoints as HTTP TACT v1

  • TLS encryption: Standard HTTPS security

  • HTTP/2 support: Multiplexing for concurrent requests

  • Response format: BPSV directly

Protocol Flow

sequenceDiagram
    participant Client
    participant Ribbit as Ribbit Server
    participant Cache as Local Cache

    Client->>Cache: Check cached sequence
    Cache-->>Client: Return cached seqn

    Client->>Ribbit: TCP Connect (port 1119)
    Client->>Ribbit: Send command + \n
    Ribbit->>Ribbit: Process request
    Ribbit-->>Client: Send response
    Ribbit->>Client: Close connection

    Client->>Client: Parse response
    Client->>Client: Extract sequence number

    alt Sequence changed
        Client->>Cache: Update cache
        Client->>Client: Process new data
    else Sequence unchanged
        Client->>Client: Use cached data
    end

Endpoints

Endpoint Comparison

| Endpoint | TCP Ribbit | HTTP TACT v1 | HTTPS TACT v2 |
|----------|------------|--------------|---------------|
| Summary | v1/summary | | |
| Product versions | v1/products/{product}/versions | /{product}/versions | /{product}/versions |
| CDN config | v1/products/{product}/cdns | /{product}/cdns | /{product}/cdns |
| Background download | v1/products/{product}/bgdl | /{product}/bgdl | /{product}/bgdl |
| Certificates | v1/certs/{id} | | |
| OCSP | v1/ocsp/{id} | | |
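The endpoint mapping lends itself to a single enum that renders either a TCP command or an HTTP path. The sketch below uses a hypothetical `Endpoint` type and only the V1 command strings from the comparison above.

```rust
/// A Ribbit/TACT query (hypothetical type for illustration).
pub enum Endpoint<'a> {
    Summary,
    Versions(&'a str),
    Cdns(&'a str),
    Bgdl(&'a str),
    Cert(&'a str),
    Ocsp(&'a str),
}

impl Endpoint<'_> {
    /// Command string for TCP Ribbit V1 (sent followed by a newline).
    pub fn tcp_command(&self) -> String {
        match self {
            Endpoint::Summary => "v1/summary".to_string(),
            Endpoint::Versions(p) => format!("v1/products/{p}/versions"),
            Endpoint::Cdns(p) => format!("v1/products/{p}/cdns"),
            Endpoint::Bgdl(p) => format!("v1/products/{p}/bgdl"),
            Endpoint::Cert(id) => format!("v1/certs/{id}"),
            Endpoint::Ocsp(id) => format!("v1/ocsp/{id}"),
        }
    }

    /// Request path for HTTP/HTTPS TACT, or `None` for TCP-only endpoints.
    pub fn http_path(&self) -> Option<String> {
        match self {
            Endpoint::Versions(p) => Some(format!("/{p}/versions")),
            Endpoint::Cdns(p) => Some(format!("/{p}/cdns")),
            Endpoint::Bgdl(p) => Some(format!("/{p}/bgdl")),
            Endpoint::Summary | Endpoint::Cert(_) | Endpoint::Ocsp(_) => None,
        }
    }
}
```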

Response Format Comparison

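| Protocol | Response Format | Signature | Checksum |
|----------|-----------------|-----------|----------|
| TCP Ribbit V1 | MIME multipart with BPSV | PKCS#7/CMS | SHA-256 |
| TCP Ribbit V2 | Raw BPSV | None | None |
| HTTP TACT v1 | Raw BPSV | None | None |
| HTTPS TACT v2 | Raw BPSV | None | None |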

Note: The certificate and OCSP endpoints were part of Blizzard’s custom PKI infrastructure, now replaced by system trust stores.

Certificate and Signature Verification

V1 Signature Structure

V1 responses include PKCS#7/CMS signatures for authenticity:

SignedData Structure

  • Content Type: PKCS#7 SignedData (OID: 1.2.840.113549.1.7.2)

  • Signer Identification: IssuerAndSerialNumber or SubjectKeyIdentifier

  • Certificates: Embedded in CertificateSet or fetched via SKI

  • Signed Attributes: Optional, DER-encoded as SET for verification

Supported Algorithms

Digest Algorithms:

  • SHA-256 (OID: 2.16.840.1.101.3.4.2.1)

  • SHA-384 (OID: 2.16.840.1.101.3.4.2.2)

  • SHA-512 (OID: 2.16.840.1.101.3.4.2.3)

Signature Algorithms:

  • RSA with SHA-256 (OID: 1.2.840.113549.1.1.11)

  • RSA with SHA-384 (OID: 1.2.840.113549.1.1.12)

  • RSA with SHA-512 (OID: 1.2.840.113549.1.1.13)

Verification Process

Basic Flow

  1. Extract Signature: From MIME part with Content-Disposition: signature
  2. Parse PKCS#7 Structure: Extract SignedData from ContentInfo
  3. Identify Signer: Match via IssuerAndSerialNumber or SubjectKeyIdentifier
  4. Extract Public Key: From embedded certificate or fetch via endpoint
  5. Verify Signature: Process depends on signed attributes presence
  6. Validate Checksum: SHA-256 of content matches epilogue

Signed Attributes Processing

When signed attributes are present (typical case):

  1. Re-encode as DER SET:

    • Convert from implicit [0] to SET OF (tag 0x31)
    • Sort attributes in DER canonical order
    • Apply proper DER length encoding
  2. Verify Against SET:

    • Signature verifies the DER-encoded SET
    • Message digest attribute must match content hash
When signed attributes are absent:

  • The signature directly verifies the message content
  • RSA verification is performed against the content hash itself
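The re-tagging step is small enough to show directly. In this sketch, `signed_attrs_for_verification` is a hypothetical helper; it assumes the raw bytes were copied from SignerInfo with their implicit `[0]` tag (0xA0) and are already in DER canonical order, so only the tag byte changes and the length bytes stay as they are.

```rust
/// Returns the byte string that the signature covers when signed
/// attributes are present: the same encoding with the implicit `[0]`
/// tag (0xA0) replaced by the universal SET OF tag (0x31).
pub fn signed_attrs_for_verification(raw: &[u8]) -> Option<Vec<u8>> {
    // Anything that does not start with the [0] tag is not a
    // signedAttrs field as stored in SignerInfo.
    if raw.first() != Some(&0xA0) {
        return None;
    }
    let mut out = raw.to_vec();
    out[0] = 0x31;
    Some(out)
}
```

The returned bytes are what gets hashed and checked against the RSA signature; the message digest attribute inside them is compared with the content hash separately.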

RSA Verification Details

  • Padding Scheme: PKCS#1 v1.5

  • Key Format: Parse SubjectPublicKeyInfo to extract RSA public key

  • Signature Format: Raw signature bytes converted to RSA signature object

  • Hash Algorithms: SHA-256, SHA-384, or SHA-512 based on OID

Certificate Fetching

When certificates are not embedded:

  • Extract Subject Key Identifier from signer info

  • Request certificate via /v1/certs/{ski} endpoint

  • Validate SKI matches between signature and certificate

  • Extract public key for verification

Implementation Strategies

Parsing Approaches:

  • Primary: Use ASN.1/CMS parsing libraries

  • Fallback: Pattern-based manual parsing for compatibility

  • Handle both embedded and detached signatures

Key Extraction:

  • Parse SubjectPublicKeyInfo structure

  • Extract RSA public key in PKCS#1 format

  • Determine key size from modulus length

Critical Implementation Details:

  • SET Encoding: Signed attributes MUST be re-encoded as DER SET for verification

  • Canonical Ordering: Attributes sorted for DER canonical form

  • Dual Verification Paths: Different handling for signed vs unsigned attributes

  • Base64 Detection: Signatures may be binary or base64-encoded in MIME

Error Handling:

  • Invalid ASN.1 structures

  • Missing or mismatched certificates

  • Unsupported algorithms

  • Signature verification failures

  • DER encoding errors

Regional Servers

Available regions for {region}.version.battle.net:

  • us - United States

  • eu - Europe

  • kr - Korea

  • tw - Taiwan

  • sg - Singapore

  • cn - China (restricted to China-only access)

BPSV Format

Blizzard Pipe-Separated Values (BPSV) is the data format for responses:

Structure

  1. Header line: Column names with type annotations
  2. Data lines: Pipe-separated values
  3. Sequence line: ## seqn = {number} (exact format with spaces required)

Data Types

  • STRING:0 - Variable-length string

  • HEX:16 - 16-byte hexadecimal value (MD5 hash)

  • DEC:4 - 4-byte decimal integer

Example

Region!STRING:0|BuildConfig!HEX:16|CDNConfig!HEX:16|BuildId!DEC:4|VersionsName!String:0
us|be2bb98dc28aee05bbee519393696cdb|fac77b9ca52c84ac28ad83a7dbe1c829|61491|11.1.7.61491
eu|be2bb98dc28aee05bbee519393696cdb|fac77b9ca52c84ac28ad83a7dbe1c829|61491|11.1.7.61491
## seqn = 2241282
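A minimal parser for this structure might look like the following. It is a sketch with hypothetical types (`Column`, `Bpsv`), not the cascette-rs parser; it normalizes type names to uppercase so that `String:0` and `STRING:0` are treated alike, and it rejects rows whose field count differs from the header.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct Column {
    pub name: String,
    pub kind: String, // uppercased: STRING, HEX, DEC
    pub width: u32,
}

#[derive(Debug, Default)]
pub struct Bpsv {
    pub columns: Vec<Column>,
    pub rows: Vec<Vec<String>>,
    pub seqn: Option<u64>,
}

pub fn parse_bpsv(text: &str) -> Result<Bpsv, String> {
    let mut doc = Bpsv::default();
    for line in text.lines().map(str::trim_end).filter(|l| !l.is_empty()) {
        if let Some(rest) = line.strip_prefix("## seqn = ") {
            // Sequence line: exact prefix, then a decimal number.
            let seqn = rest.trim().parse::<u64>().map_err(|e| format!("bad seqn: {e}"))?;
            doc.seqn = Some(seqn);
        } else if line.starts_with('#') {
            continue; // any other comment line
        } else if doc.columns.is_empty() {
            // Header line: Name!TYPE:width fields separated by pipes.
            for field in line.split('|') {
                let (name, ty) = field
                    .split_once('!')
                    .ok_or_else(|| format!("bad header field: {field}"))?;
                let (kind, width) = ty
                    .split_once(':')
                    .ok_or_else(|| format!("bad type annotation: {ty}"))?;
                let width = width.parse::<u32>().map_err(|e| format!("bad width: {e}"))?;
                doc.columns.push(Column {
                    name: name.to_string(),
                    kind: kind.to_ascii_uppercase(),
                    width,
                });
            }
        } else {
            // Data line: values are not escaped, so a plain split is enough.
            let row: Vec<String> = line.split('|').map(str::to_string).collect();
            if row.len() != doc.columns.len() {
                return Err(format!(
                    "expected {} fields, got {}",
                    doc.columns.len(),
                    row.len()
                ));
            }
            doc.rows.push(row);
        }
    }
    Ok(doc)
}
```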

V1 MIME Response Structure

TCP Ribbit V1 responses use MIME multipart format:

MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="{boundary}"

--{boundary}
Content-Type: text/plain
Content-Disposition: data

[BPSV data here]

--{boundary}
Content-Type: application/octet-stream
Content-Disposition: signature

[ASN.1 signature data]

--{boundary}--
Checksum: {64-character SHA-256 hash}

The checksum validation process:

  • Search for the "Checksum: " pattern at the end of the response

  • Extract 64-character hexadecimal checksum

  • Compute SHA-256 hash of content before checksum line

  • Compare with provided checksum (case-insensitive)
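The extraction and comparison steps can be sketched without a hashing dependency. Below, `split_checksum` and `checksum_matches` are hypothetical helpers; the caller is expected to compute the SHA-256 of the returned content with whatever hashing library it uses. Treating every byte before the `Checksum: ` marker as the hashed content is an assumption drawn from the steps above.

```rust
const MARKER: &[u8] = b"Checksum: ";

/// Splits a V1 response into (content before the checksum line, checksum hex).
/// Works on bytes because the signature part may be binary.
pub fn split_checksum(response: &[u8]) -> Option<(&[u8], &str)> {
    // Find the last occurrence of the marker.
    let idx = response
        .windows(MARKER.len())
        .rposition(|window| window == MARKER)?;
    let (content, tail) = response.split_at(idx);
    let hex = std::str::from_utf8(&tail[MARKER.len()..]).ok()?.trim();
    if hex.len() == 64 && hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        Some((content, hex))
    } else {
        None
    }
}

/// Case-insensitive comparison of two hex digests.
pub fn checksum_matches(computed_hex: &str, provided_hex: &str) -> bool {
    computed_hex.eq_ignore_ascii_case(provided_hex)
}
```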

Connection Handling

TCP Ribbit Connection Flow

graph TD
    A[Create TCP Socket] --> B[Connect to server:1119]
    B --> C[Send command + \n]
    C --> D[Read response until EOF]
    D --> E[Server closes connection]
    E --> F{Response type?}
    F -->|V1| G[Parse MIME]
    F -->|V2| H[Parse BPSV]
    G --> I[Validate checksum]
    I --> J[Extract BPSV data]
    H --> K[Process data]
    J --> K

    style E stroke-width:4px
    style A stroke-width:3px
    style F stroke-width:3px,stroke-dasharray:5 5
    style I stroke-width:2px
    style K stroke-width:2px

HTTP/HTTPS TACT Connection Flow

graph TD
    A[Connection Pool] --> B{Existing connection?}
    B -->|Yes| C[Reuse connection]
    B -->|No| D[Create new HTTP connection]
    D --> C
    C --> E[Send HTTP request]
    E --> F[Receive response]
    F --> G[Parse BPSV directly]
    G --> H[Return connection to pool]
    H --> I[Process data]

    style H stroke-width:4px,stroke-dasharray:5 5
    style A stroke-width:3px
    style B stroke-width:3px,stroke-dasharray:5 5
    style G stroke-width:2px
    style I stroke-width:2px

Key differences:

  • TCP: New connection per request, server closes after response

  • HTTP/HTTPS: Connection pooling, keep-alive, multiple requests per connection

Unified Client Architecture

Protocol Abstraction

A unified client should abstract protocol differences:

graph TD
    A[Unified NGDP Client] --> B{Protocol}
    B -->|TCP| C[Ribbit TCP Client]
    B -->|HTTP| D[TACT HTTP Client]
    B -->|HTTPS| E[TACT HTTPS Client]

    C --> F[BPSV Parser]
    D --> F
    E --> F

    C --> G[Response Types]
    D --> G
    E --> G

    style A stroke-width:4px
    style B stroke-width:3px,stroke-dasharray:5 5
    style F stroke-width:3px
    style G stroke-width:2px
    style C stroke-width:2px
    style D stroke-width:2px
    style E stroke-width:2px

Common Interface

All protocol variants share common operations:

  • Get product versions

  • Get CDN configurations

  • Get background download info

  • Parse BPSV responses

Protocol-Specific Features

TCP Ribbit Only:

  • Summary endpoint

  • Certificate/OCSP endpoints

  • MIME response parsing

  • Signature verification

HTTP/HTTPS TACT Only:

  • Connection pooling

  • HTTP/2 multiplexing

  • Standard HTTP features

Configuration Requirements

Host Configuration:

  • Default hosts: {region}.version.battle.net or {region}.patch.battle.net

  • Custom hosts: Support for private servers or testing

  • Port configuration: 1119 for TCP/HTTP, 443 for HTTPS

Connection Settings:

  • Timeout configuration (connect, read, total)

  • Retry logic (count, backoff, jitter)

  • Pool settings (max connections, idle timeout)

  • HTTP/2 settings (multiplexing, window size)

Implementation Requirements

TCP Client

  • Create new connection per request (no pooling)

  • Send ASCII command terminated with \n

  • Read response until server closes connection

  • Default connection timeout: 10 seconds
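A blocking client that follows these four requirements fits in a few lines of standard-library code. This is a sketch, not the cascette-protocol client: `exchange` and `ribbit_request` are hypothetical names, and applying the same 10-second value as a read timeout is an assumption.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Sends one command and reads until the peer closes the stream.
/// Generic over the stream so it can be exercised without a network.
pub fn exchange<S: Read + Write>(stream: &mut S, command: &str) -> io::Result<Vec<u8>> {
    stream.write_all(command.as_bytes())?;
    stream.write_all(b"\n")?;
    stream.flush()?;
    let mut response = Vec::new();
    stream.read_to_end(&mut response)?; // returns at EOF (server close)
    Ok(response)
}

/// Opens a fresh connection to `{host}:1119` for a single request.
pub fn ribbit_request(host: &str, command: &str) -> io::Result<Vec<u8>> {
    let timeout = Duration::from_secs(10);
    let addr = (host, 1119)
        .to_socket_addrs()?
        .next()
        .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "host did not resolve"))?;
    let mut stream = TcpStream::connect_timeout(&addr, timeout)?;
    // Assumption: reuse the connect timeout for reads as well.
    stream.set_read_timeout(Some(timeout))?;
    exchange(&mut stream, command)
}
```

For example, `ribbit_request("us.version.battle.net", "v1/products/wow/versions")` would return the raw MIME response bytes for parsing.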

Retry Logic

Production implementations should include retry logic:

  • Default: 0 retries for backward compatibility

  • Exponential backoff: 100ms initial, 10s maximum, 2x multiplier

  • Jitter: 10% randomization to prevent thundering herd

  • Retryable: Connection, timeout, network failures

  • Non-retryable: Parse errors, validation failures
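The backoff parameters above translate into a short delay calculation. In this sketch, `backoff_delay` is a hypothetical function, and the random source is passed in as `jitter_unit` (a value in 0.0..=1.0) so the function stays deterministic; mapping that value onto a symmetric ±10% range is one possible reading of "10% randomization".

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based):
/// 100ms initial, 2x multiplier, capped at 10s, with ±10% jitter.
pub fn backoff_delay(attempt: u32, jitter_unit: f64) -> Duration {
    const INITIAL_MS: u64 = 100;
    const MAX_MS: u64 = 10_000;
    const JITTER: f64 = 0.10;

    // 100ms * 2^attempt, capped at the maximum. The shift is clamped so
    // large attempt numbers cannot overflow.
    let base_ms = INITIAL_MS
        .saturating_mul(1u64 << attempt.min(20))
        .min(MAX_MS);

    // Map jitter_unit from [0, 1] onto a factor in [0.9, 1.1].
    let factor = 1.0 + JITTER * (2.0 * jitter_unit.clamp(0.0, 1.0) - 1.0);
    Duration::from_millis((base_ms as f64 * factor).round() as u64)
}
```

In production code `jitter_unit` would come from a random number generator; passing 0.5 disables the jitter.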

DNS Caching

Implementations may cache DNS lookups:

  • TTL: 300 seconds (5 minutes) typical

  • Multiple IPs: Try all resolved addresses sequentially

  • Thread-safe: Concurrent access protection required

Response Parsing

  • V1: Parse MIME structure, validate SHA-256 checksum

  • V2/HTTP/HTTPS: Parse BPSV directly

  • Handle empty responses (headers without data rows)

  • Parse typed column headers correctly

Caching

  • Cache responses with key: {endpoint}-{arguments}-{sequence_number}

  • Check sequence numbers to detect updates

  • Sequence numbers only increase (never decrease)

  • Skip re-downloading if sequence unchanged
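The cache key format and the sequence check can be sketched as two small functions. Both names are hypothetical, and the example assumes the sequence number has already been parsed from the `## seqn = ` line.

```rust
/// Builds the cache key `{endpoint}-{arguments}-{sequence_number}`.
pub fn cache_key(endpoint: &str, arguments: &str, seqn: u64) -> String {
    format!("{endpoint}-{arguments}-{seqn}")
}

/// Returns true when the server reports a newer sequence number than
/// the cached one, or when nothing is cached yet.
pub fn needs_refresh(cached_seqn: Option<u64>, server_seqn: u64) -> bool {
    match cached_seqn {
        None => true,
        // Sequence numbers only increase, so a larger value means new data.
        Some(cached) => server_seqn > cached,
    }
}
```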

Product Identifiers

Common product identifiers used with Ribbit:

World of Warcraft

  • wow - Retail

  • wow_beta - Beta

  • wow_classic - Classic

  • wow_classic_era - Classic Era

  • wow_classic_ptr - Classic PTR

  • wow_classic_titan - Classic Titan (CN region only, WotLK 3.80.x with upgraded Classic/TBC raids)

  • wow_anniversary - Classic Anniversary (TBC 2.5.x, progression through Classic branches on a shortened timeline)

  • wowt - Public Test Realm

  • wowz - Internal/Development

Other Products

  • agent - Battle.net Agent

  • bna - Battle.net Application

Version Response Fields

| Field | Type | Description |
|-------|------|-------------|
| Region | STRING:0 | Region identifier |
| BuildConfig | HEX:16 | Build configuration hash |
| CDNConfig | HEX:16 | CDN configuration hash |
| KeyRing | HEX:16 | Encryption keys hash |
| BuildId | DEC:4 | Build number |
| VersionsName | String:0 | Version string |
| ProductConfig | HEX:16 | Product configuration hash |

CDN Response Fields

| Field | Type | Description |
|-------|------|-------------|
| Name | STRING:0 | CDN name |
| Path | STRING:0 | Base path for content |
| Hosts | STRING:0 | Space-separated host list |
| Servers | STRING:0 | Full URLs with parameters |
| ConfigPath | STRING:0 | Path to configuration files |

Error Handling

Connection Errors

  • Connection timeout: Implement 10-30 second timeout (not automatic)

  • CN region: Only accessible from within China (will timeout from elsewhere)

  • Network failures: TCP connection may fail or drop

Response Errors

  • Empty responses: Some endpoints return headers only (especially bgdl)

  • 404 errors: Not all products have all endpoints

  • Malformed MIME: V1 responses may have invalid structure

  • Invalid checksum: V1 checksum validation may fail

  • Unbounded responses: There is no standard response size limit, so enforce a cap when buffering

Parsing Errors

  • Type inconsistency: Handle String:0 vs STRING:0 in BPSV

  • Column mismatch: Data rows may not match header count

  • Invalid sequence format: Must match ## seqn = exactly (with space after equals)

  • Escaped characters: Pipe characters inside values are not escaped, so a value containing | cannot be distinguished from a field separator

Implementation Notes

Buffer Management

  • Use appropriate buffer sizes for TCP reads (typically 4KB-8KB)

  • Stream responses to avoid loading entire response in memory

  • No standard maximum response size - implement limits as needed

MIME Parsing Complexity

  • V1 MIME parsing requires multipart message handling

  • Consider using established MIME libraries

  • First chunk typically contains BPSV data

  • Signature chunk identified by Content-Disposition header

Ribbit Server

cascette-ribbit implements a Ribbit protocol server that serves BPSV-formatted game version and CDN configuration data over HTTP and TCP.

For protocol specification details, see Ribbit Protocol.

Architecture

graph TD
    A[cascette-ribbit binary] --> B[Server]
    B --> C[HTTP Server - axum]
    B --> D[TCP Server - tokio]
    C --> E[AppState]
    D --> E
    E --> F[BuildDatabase]
    E --> G[CdnConfig]

    C --> H[HTTP Handlers]
    H --> I[BpsvResponse]

    D --> J{Protocol Version}
    J -->|v1| K[MIME Wrapper + SHA-256]
    J -->|v2| L[Raw BPSV]
    K --> I
    L --> I

Components

| Component | File | Purpose |
|-----------|------|---------|
| ServerConfig | config.rs | CLI arguments, env vars, TLS paths |
| CdnConfig | config.rs | CDN host/path resolution per region |
| BuildDatabase | database.rs | JSON build record storage with product indexing |
| BuildRecord | database.rs | Single build entry with MD5 hash validation |
| AppState | server.rs | Shared state (database, CDN config, timestamps) |
| Server | server.rs | Orchestrates HTTP + TCP listeners |
| BpsvResponse | responses/bpsv.rs | BPSV response builder (versions, cdns, summary) |
| HTTP handlers | http/handlers.rs | axum route handlers for /{product}/{endpoint} |
| TCP handlers | tcp/handlers.rs | Command routing for v1/ and v2/ prefixes |
| V1 wrapper | tcp/v1.rs | RFC 2046 MIME wrapping with SHA-256 checksums |
| V2 handler | tcp/v2.rs | Raw BPSV TCP responses |

Configuration

CLI Arguments and Environment Variables

| Flag | Env Var | Default | Description |
|------|---------|---------|-------------|
| --http-bind | CASCETTE_RIBBIT_HTTP_BIND | 0.0.0.0:8080 | HTTP listen address |
| --tcp-bind | CASCETTE_RIBBIT_TCP_BIND | 0.0.0.0:1119 | TCP listen address |
| --builds | CASCETTE_RIBBIT_BUILDS | ./builds.json | Path to build database JSON |
| --cdn-hosts | CASCETTE_RIBBIT_CDN_HOSTS | cdn.arctium.tools | CDN host(s) |
| --cdn-path | CASCETTE_RIBBIT_CDN_PATH | tpr/wow | CDN base path |
| --tls-cert | CASCETTE_RIBBIT_TLS_CERT | none | TLS certificate path (enables HTTPS) |
| --tls-key | CASCETTE_RIBBIT_TLS_KEY | none | TLS private key path |

Build Database Format

The server reads build records from a JSON file. Each record represents a product build:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| id | u64 | yes | Unique build identifier |
| product | string | yes | Product code (e.g., wow, wowt) |
| version | string | yes | Version string (e.g., 1.14.2.42597) |
| build | string | yes | Build number |
| build_config | string | yes | 32-char hex MD5 hash |
| cdn_config | string | yes | 32-char hex MD5 hash |
| keyring | string | no | 32-char hex MD5 hash |
| product_config | string | no | 32-char hex MD5 hash |
| build_time | string | yes | ISO 8601 timestamp |
| encoding_ekey | string | yes | 32-char hex encoding key |
| root_ekey | string | yes | 32-char hex root key |
| install_ekey | string | yes | 32-char hex install key |
| download_ekey | string | yes | 32-char hex download key |

MD5 hash fields are validated to be exactly 32 lowercase hexadecimal characters.
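The validation rule is simple to state in code. The function below is a sketch of the check as described, with a hypothetical name; it is not taken from database.rs.

```rust
/// Returns true if `s` is exactly 32 lowercase hexadecimal characters.
pub fn is_valid_md5_hex(s: &str) -> bool {
    s.len() == 32 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}
```

Uppercase digits are rejected on purpose, which keeps stored hashes directly comparable with the lowercase hashes used in CDN paths.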

HTTP Endpoints

The HTTP server uses axum with gzip compression and CORS support.

Routes

| Route | Handler | Response |
|-------|---------|----------|
| GET /{product}/versions | handle_versions | BPSV versions table |
| GET /{product}/cdns | handle_cdns | BPSV CDN configuration |
| GET /{product}/bgdl | handle_bgdl | BPSV background download (same as versions) |

All responses use Content-Type: text/plain; charset=utf-8.

Returns HTTP 404 if the product is not found in the database.

TCP Protocol

The TCP server accepts one command per connection. After sending the response, the server closes the connection. A 10-second read timeout applies.

V2 Commands (Raw BPSV)

  • v2/products/{product}/versions
  • v2/products/{product}/cdns
  • v2/products/{product}/bgdl

V1 Commands (MIME-wrapped)

  • v1/products/{product}/versions
  • v1/products/{product}/cdns
  • v1/products/{product}/bgdl
  • v1/summary

V1 responses wrap BPSV data in RFC 2046 MIME multipart format with a SHA-256 checksum epilogue. The server does not include PKCS#7 signatures (unlike Blizzard’s production servers).
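Command routing for the two prefixes could look roughly like this. The types and the `parse_command` function are hypothetical and do not mirror tcp/handlers.rs; the sketch accepts only the commands listed above and treats `summary` as V1-only.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Version {
    V1,
    V2,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Command<'a> {
    Summary,
    Versions(&'a str),
    Cdns(&'a str),
    Bgdl(&'a str),
}

/// Parses one command line such as `v2/products/wow/versions`.
pub fn parse_command(line: &str) -> Option<(Version, Command<'_>)> {
    let line = line.trim();
    let (version, rest) = if let Some(rest) = line.strip_prefix("v1/") {
        (Version::V1, rest)
    } else if let Some(rest) = line.strip_prefix("v2/") {
        (Version::V2, rest)
    } else {
        return None;
    };

    let mut parts = rest.split('/');
    let command = match (parts.next(), parts.next(), parts.next(), parts.next()) {
        // The summary endpoint exists for V1 only.
        (Some("summary"), None, None, None) if version == Version::V1 => Command::Summary,
        (Some("products"), Some(product), Some(endpoint), None) if !product.is_empty() => {
            match endpoint {
                "versions" => Command::Versions(product),
                "cdns" => Command::Cdns(product),
                "bgdl" => Command::Bgdl(product),
                _ => return None,
            }
        }
        _ => return None,
    };
    Some((version, command))
}
```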

BPSV Response Format

Versions Response

7 rows, one per region (us, eu, cn, kr, tw, sg, xx):

Region!STRING:0|BuildConfig!HEX:16|CDNConfig!HEX:16|KeyRing!HEX:16|BuildId!DEC:4|VersionsName!STRING:0|ProductConfig!HEX:16
us|0123456789abcdef...|fedcba9876543210...|<keyring>|42597|1.14.2.42597|<product_config>
eu|...|...|...|...|...|...
...
## seqn = 1730534400

CDNs Response

5 rows, one per CDN region (us, eu, kr, tw, cn):

Name!STRING:0|Path!STRING:0|Hosts!STRING:0|Servers!STRING:0|ConfigPath!STRING:0
us|tpr/wow|cdn.arctium.tools|https://cdn.arctium.tools/?maxhosts=4|tpr/wow/config
...
## seqn = 1730534400

Summary Response (TCP v1 only)

One row per product:

Product!STRING:0|Seqn!DEC:4
wow|1730534400
wowt|1730534400
## seqn = 1730534400
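
The three response shapes above share one layout: a typed header line, pipe-separated rows, and a `## seqn = N` trailer. A minimal parser sketch follows, assuming that layout only. The `Column`, `Bpsv`, and `parse_bpsv` names are hypothetical and are not the cascette-rs API.

```rust
/// One column from a BPSV header field such as `Region!STRING:0`.
#[derive(Debug, PartialEq)]
struct Column {
    name: String,
    kind: String,
    width: u32,
}

/// Parsed document: header columns, data rows, optional sequence number.
#[derive(Debug, Default)]
struct Bpsv {
    columns: Vec<Column>,
    rows: Vec<Vec<String>>,
    seqn: Option<u64>,
}

/// Hypothetical parser for the layout shown above; not the crate's parser.
fn parse_bpsv(text: &str) -> Result<Bpsv, String> {
    let mut doc = Bpsv::default();
    for line in text.lines().map(str::trim).filter(|l| !l.is_empty()) {
        if let Some(comment) = line.strip_prefix("##") {
            // Trailer line, e.g. `## seqn = 1730534400`.
            if let Some((key, value)) = comment.split_once('=') {
                if key.trim() == "seqn" {
                    let n = value.trim().parse::<u64>();
                    doc.seqn = Some(n.map_err(|e| format!("bad seqn: {e}"))?);
                }
            }
        } else if doc.columns.is_empty() {
            // The first non-comment line is the typed header.
            for field in line.split('|') {
                let (name, spec) = field
                    .split_once('!')
                    .ok_or_else(|| format!("bad column: {field}"))?;
                let (kind, width) = spec
                    .split_once(':')
                    .ok_or_else(|| format!("bad type: {spec}"))?;
                let width = width.parse::<u32>();
                doc.columns.push(Column {
                    name: name.to_string(),
                    kind: kind.to_string(),
                    width: width.map_err(|e| format!("bad width: {e}"))?,
                });
            }
        } else {
            // Every data row must have one field per header column.
            let row: Vec<String> = line.split('|').map(str::to_string).collect();
            if row.len() != doc.columns.len() {
                return Err(format!(
                    "expected {} fields, got {}",
                    doc.columns.len(),
                    row.len()
                ));
            }
            doc.rows.push(row);
        }
    }
    Ok(doc)
}
```

Parsing the Summary example above would yield two columns, two rows, and `seqn == Some(1730534400)`. The sketch does not validate field values against the declared `HEX`/`DEC` types.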

Running

Binary

cargo run --bin cascette-ribbit -- --builds ./builds.json

Library

use cascette_ribbit::{Server, ServerConfig};

// Assumes the tokio runtime with the `macros` feature enabled.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = ServerConfig {
        http_bind: "127.0.0.1:8080".parse()?,
        tcp_bind: "127.0.0.1:1119".parse()?,
        builds: "./builds.json".into(),
        cdn_hosts: "cdn.arctium.tools".to_string(),
        cdn_path: "tpr/wow".to_string(),
        tls_cert: None,
        tls_key: None,
    };

    // Reject invalid settings before binding any sockets
    config.validate()?;
    let server = Server::new(config)?;
    server.run().await?;
    Ok(())
}

Example

cargo run --example simple_server

Then test with:

# HTTP
curl http://localhost:8080/wow/versions
curl http://localhost:8080/wow/cdns

# TCP v2
echo "v2/products/wow/versions" | nc localhost 1119

# TCP v1
echo "v1/products/wow/versions" | nc localhost 1119
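
The same TCP exchange can be sketched with the Rust standard library: one command per connection, then read until the server closes the socket. The helper names are hypothetical. The CRLF terminator is an assumption carried over from the Ribbit examples that query Blizzard's servers; the `nc` commands above send a bare LF.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Builds a product command line, e.g. `v2/products/wow/versions`.
fn ribbit_command(version: u8, product: &str, endpoint: &str) -> String {
    format!("v{version}/products/{product}/{endpoint}")
}

/// Sends one command and reads the whole response. The server is expected
/// to close the connection after replying, which ends `read_to_string`.
fn ribbit_query(addr: &str, command: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    // Client-side timeout chosen to mirror the server's 10-second read timeout.
    stream.set_read_timeout(Some(Duration::from_secs(10)))?;
    stream.write_all(command.as_bytes())?;
    // Assumption: the server accepts a CRLF-terminated command line.
    stream.write_all(b"\r\n")?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

With the example server running, `ribbit_query("127.0.0.1:1119", &ribbit_command(2, "wow", "versions"))` would be expected to return the raw BPSV versions table.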

Testing

The crate has four test suites:

| Suite | File | Coverage |
|-------|------|----------|
| HTTP integration | tests/http_test.rs | HTTP endpoints, status codes, BPSV format |
| TCP v1 integration | tests/tcp_v1_test.rs | MIME wrapping, checksums, summary |
| TCP v2 integration | tests/tcp_v2_test.rs | Raw BPSV over TCP, connection lifecycle |
| Contract tests | tests/contract_test.rs | cascette-protocol client against server |

Contract tests verify that cascette-protocol’s RibbitTactClient can query the server and parse responses correctly. This ensures wire-level compatibility between client and server implementations.

cargo test -p cascette-ribbit
cargo bench -p cascette-ribbit

TLS Support

Enable TLS with the tls feature flag:

cargo run --bin cascette-ribbit --features tls -- \
  --tls-cert /path/to/cert.pem \
  --tls-key /path/to/key.pem

When TLS is enabled, the HTTP server serves HTTPS. The TCP server is not affected (Ribbit TCP does not use TLS).

Battle.net Agent

The Battle.net Agent is a local HTTP service that manages game installations and updates. It runs on port 1120 and provides an API for downloading, installing, and managing Blizzard products.

Overview

The agent serves as the bridge between Blizzard’s CDN infrastructure and the local CASC storage. It handles:

  • Product installation and updates
  • Download management and prioritization
  • Local CASC storage maintenance
  • Installation verification and repair

HTTP API

The agent exposes a REST API on http://127.0.0.1:1120.

Endpoints

Documentation of the agent’s HTTP endpoints is pending.

Installation Flow

When installing a product, the agent:

  1. Queries Ribbit for product version information
  2. Downloads build and CDN configuration
  3. Fetches encoding and root manifests
  4. Downloads required archives from CDN
  5. Writes data to local CASC storage
  6. Updates local indices

cascette-agent

cascette-agent is a replacement implementation of the Battle.net Agent. It provides the same HTTP API on port 1120 and can be used as a drop-in replacement. It supports:

  • Downloading products from official Blizzard CDNs
  • Falling back to community archive mirrors (cdn.arctium.tools)
  • Managing local CASC installations

Differences from Official Agent

  • Open source implementation
  • Supports community CDN mirrors
  • Cross-platform (Linux, macOS, Windows)
  • No Battle.net account required for public content

CASC Local Storage

Local CASC storage is the on-disk format used by the Battle.net client to store game data. Unlike CDN archives which are content-addressed, local storage uses optimized indices for fast file lookups.

Directory Structure

A typical CASC installation has the following structure:

<install-dir>/
├── .build.info               # Build configuration (BPSV format)
├── Data/
│   ├── data/
│   │   ├── 0000000001.idx    # Local index files (16 buckets)
│   │   ├── 0100000001.idx
│   │   ├── ...
│   │   ├── 0f00000001.idx
│   │   ├── data.000          # Combined archive data
│   │   ├── data.001
│   │   ├── ...
│   │   └── *.shmem           # Shared memory control file (temp)
│   ├── indices/
│   │   └── ...               # CDN index files (not local storage)
│   ├── residency/            # Download state tracking tokens
│   ├── ecache/               # Encoding cache
│   └── hardlink/             # Hard link trie directory
└── Cache/
    └── ADB/                  # Hotfix database cache
        └── *.bin

Local .idx index files and data.NNN archive files both reside in Data/data/. The Data/indices/ directory holds CDN index files, which are a separate concern from local storage.

Container Types

CASC manages four container types for local storage:

| Type | Size | Purpose |
|------|------|---------|
| Dynamic | 0x3c bytes | Read/write CASC archives (data.NNN files) |
| Static | | Read-only archives (shared installations) |
| Residency | 0x30 bytes | File state tracking (.residency tokens) |
| Hard Link | 0x30 bytes | Filesystem hard links (trie directory) |

The Dynamic container is the primary read-write storage. It manages archive segments, key state tracking, and shared memory coordination. Access modes: 0=none, 1=read-only, 2=read-write, 3=exclusive.

Index Files (.idx)

Local indices use IDX Journal v7 format with little-endian headers (unlike most NGDP formats which use big-endian).

  • Key size: 9 bytes (truncated encoding keys)
  • Location size: 5 bytes (1 byte archive high + 4 bytes packed)
  • Entry size: 18 bytes (9 key + 5 location + 4 size)
  • Bucket distribution: 16 index buckets (0x00-0x0F)

The 9-byte key truncation saves space while maintaining sufficient uniqueness for local lookups. Keys are encoding keys, not content keys.

Index File Format

Each .idx file contains guarded blocks with Jenkins hash validation:

[GuardedBlockHeader]  (8 bytes: size + Jenkins hash)
[IndexHeaderV2]       (16 bytes: version, bucket, field sizes, segment_size)
[padding]             (8 bytes: hash/alignment)
[GuardedBlockHeader]  (8 bytes: entry block size + Jenkins hash)
[IndexEntry[]]        (N * 18 bytes: sorted by key)

Index Filename Format

{bucket:02x}{version:08x}.idx

Example: 0a00000003.idx = bucket 0x0A, version 3. Total filename length is 14 characters (10 hex digits + .idx).
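
The filename pattern can be sketched as a formatter and its inverse. Both function names are hypothetical and are not taken from cascette-rs.

```rust
/// Formats an index filename as `{bucket:02x}{version:08x}.idx`.
fn idx_filename(bucket: u8, version: u32) -> String {
    format!("{bucket:02x}{version:08x}.idx")
}

/// Splits an index filename into (bucket, version); None if malformed.
fn parse_idx_filename(name: &str) -> Option<(u8, u32)> {
    let hex = name.strip_suffix(".idx")?;
    // Exactly 10 hex digits: 2 for the bucket, 8 for the version.
    if hex.len() != 10 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let bucket = u8::from_str_radix(&hex[..2], 16).ok()?;
    let version = u32::from_str_radix(&hex[2..], 16).ok()?;
    Some((bucket, version))
}
```

`idx_filename(0x0a, 3)` produces `0a00000003.idx`, matching the example above. The parser accepts uppercase hex digits as well, which is a leniency of this sketch rather than documented behavior.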

Bucket Assignment

Files are assigned to index buckets using the XOR-fold algorithm on the first 9 bytes of the encoding key:

hash = key[0] ^ key[1] ^ key[2] ^ key[3] ^ key[4] ^ key[5] ^ key[6] ^ key[7] ^ key[8]
bucket = (hash & 0x0F) ^ (hash >> 4)
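
As a minimal sketch, the XOR-fold above translates to Rust as follows. The function name `bucket_index` is hypothetical; returning `None` for keys shorter than 9 bytes is a choice made for this example.

```rust
/// XOR-folds the first 9 key bytes into a bucket number in 0x00..=0x0F.
fn bucket_index(ekey: &[u8]) -> Option<u8> {
    if ekey.len() < 9 {
        return None;
    }
    // XOR the first nine bytes together; later bytes are ignored.
    let hash = ekey[..9].iter().fold(0u8, |acc, b| acc ^ b);
    // Fold the high nibble onto the low nibble.
    Some((hash & 0x0F) ^ (hash >> 4))
}
```

For a key whose first byte is 0x12 and whose remaining bytes are zero, the fold gives 0x2 ^ 0x1 = bucket 3.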

Agent uses a flush-and-bind pattern with 3-retry atomic commits when writing index files.

Key Mapping Table (KMT)

Below the index files, CASC maintains a Key Mapping Table (KMT) as the primary on-disk structure for key-to-location resolution:

  • Two-tier LSM-tree: sorted section (0x12-byte entries) + update section (0x200-byte pages)
  • Jenkins lookup3 hashes for bucket distribution
  • 9-byte EKey prefix binary search within sorted sections
  • Update section uses 0x200-byte (512-byte) pages with 0x15 (21) entries per page (minimum 0x7800 bytes)

Data Files (data.NNN)

Data files contain BLTE-encoded content. Each entry has a 30-byte (0x1E) local header before the BLTE data:

Offset  Size  Field
0x00    16    Encoding key (reversed byte order)
0x10    4     Size including header (big-endian)
0x14    2     Flags
0x16    4     ChecksumA
0x1A    4     ChecksumB
0x1E    ...   BLTE data
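
The header layout above can be sketched as a parser. The struct and function names are hypothetical. Only the size field has a stated byte order, so the flags and checksums are kept as raw bytes here rather than guessing their endianness.

```rust
/// Length of the local header that precedes the BLTE data.
const LOCAL_HEADER_LEN: usize = 0x1E;

#[derive(Debug, PartialEq)]
struct LocalHeader {
    /// Encoding key, restored from its reversed on-disk order.
    ekey: [u8; 16],
    /// Entry size including this header (big-endian on disk).
    size_with_header: u32,
    /// Raw bytes; byte order is not specified in the table above.
    flags: [u8; 2],
    checksum_a: [u8; 4],
    checksum_b: [u8; 4],
}

/// Parses the 30-byte header; None if the buffer is too short.
fn parse_local_header(buf: &[u8]) -> Option<LocalHeader> {
    if buf.len() < LOCAL_HEADER_LEN {
        return None;
    }
    let mut ekey = [0u8; 16];
    ekey.copy_from_slice(&buf[0x00..0x10]);
    ekey.reverse();
    let mut size = [0u8; 4];
    size.copy_from_slice(&buf[0x10..0x14]);
    Some(LocalHeader {
        ekey,
        size_with_header: u32::from_be_bytes(size),
        flags: [buf[0x14], buf[0x15]],
        checksum_a: [buf[0x16], buf[0x17], buf[0x18], buf[0x19]],
        checksum_b: [buf[0x1A], buf[0x1B], buf[0x1C], buf[0x1D]],
    })
}
```

The BLTE payload then starts at `buf[LOCAL_HEADER_LEN..]`. Checksum verification is out of scope for this sketch.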

Archive Location Packing

The 5-byte archive location in index entries encodes both archive ID and offset:

Byte 0:      archive_id >> 2 (high 8 bits)
Bytes 1-4:   (archive_id_low << 30) | (offset & 0x3FFFFFFF) (big-endian)

This gives 10-bit archive IDs (max 1023) and 30-bit offsets (max ~1 GiB).
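
A round-trip sketch of this packing, assuming `archive_id_low` means the low 2 bits of the archive ID (which follows from the 8 high bits plus a 10-bit total). The function names are hypothetical.

```rust
/// Packs a 10-bit archive ID and 30-bit offset into 5 bytes.
/// Returns None when either value exceeds its bit width.
fn pack_location(archive_id: u16, offset: u32) -> Option<[u8; 5]> {
    if archive_id > 0x3FF || offset > 0x3FFF_FFFF {
        return None;
    }
    let high = (archive_id >> 2) as u8;
    // Low 2 bits of the archive ID occupy the top 2 bits of the u32.
    let packed = ((archive_id as u32 & 0x3) << 30) | offset;
    let p = packed.to_be_bytes();
    Some([high, p[0], p[1], p[2], p[3]])
}

/// Reverses `pack_location`, returning (archive_id, offset).
fn unpack_location(loc: [u8; 5]) -> (u16, u32) {
    let packed = u32::from_be_bytes([loc[1], loc[2], loc[3], loc[4]]);
    let archive_id = ((loc[0] as u16) << 2) | (packed >> 30) as u16;
    (archive_id, packed & 0x3FFF_FFFF)
}
```

For example, archive 5 at offset 0x1234 packs to `[0x01, 0x40, 0x00, 0x12, 0x34]`.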

Container Index

Agent maintains a ContainerIndex with 16 segments and supports frozen/thawed archive management:

  • Segments can be frozen (read-only) or thawed (writable)
  • 0x1E-byte reconstruction headers per archive entry
  • Segment limit configurable up to 0x3FF (1023)
  • Per-segment tracking: 0x40 (64) bytes per segment in compactor state

Shared Memory (shmem)

The shmem file provides memory-mapped coordination between the Agent process and game clients:

  • Protocol versions 4 (base) and 5 (exclusive access flag at DWORD index 0x54)
  • Free space table format identifier at DWORD index 0x42 (value 0x2AB8)
  • V5 PID tracking: slot array with PID (u32) and mode (u32) per slot
  • Writer lock: named global mutex with Global\ prefix
  • DACL: D:(A;;GA;;;WD)(A;;GA;;;AN) (grant all to Everyone + Anonymous)
  • Retry logic: 10 attempts with Sleep(0) between failures
  • .lock file with 10-second backoff for coordination

LRU Cache

Agent maintains an LRU cache in shared memory:

  • Linked-list table structure
  • Generation-based checkpoints for eviction
  • 20-character hex filenames with .lru extension

.build.info

The .build.info file contains installation metadata in BPSV format:

  • Product code and region
  • Active build configuration hash
  • CDN configuration hash
  • Installation tags and flags

Residency Tracking

The Residency container tracks which content keys are fully downloaded:

  • .residency token files mark valid containers
  • Byte-span tracking for partial downloads (header and data residency)
  • Reserve, mark-resident, remove, query operations
  • Scanner API for enumeration
  • Drive type check prevents unsupported storage media

The Hard Link container uses a TrieDirectory for content sharing:

  • Hard links allow multiple keys to reference the same physical file
  • 32-character hex filename validation
  • Unlinked key collection (link count <= 1)
  • Recursive compaction
  • LRU file descriptor cache with two open modes (handle vs async IO)
  • 3-retry delete before hard link creation
  • Falls back to residency when hard links are unsupported

Maintenance Operations

Compaction

Two-phase process: archive merge then extract-compact.

  • Defrag algorithm: removes gaps between files, reorganizes positions
  • Fillholes algorithm: estimates free space without moving data
  • Merge threshold: float in [0.0, 0.4]
  • Async read/write pipeline with 128 KB minimum buffer
  • Per-segment span validation with overlap detection

Garbage Collection

4-stage pipeline:

  1. Remove unreferenced keys from dynamic container
  2. Remove obsolete config files
  3. Remove CDN index files
  4. Clean up empty directories recursively

Build Repair

Multi-stage pipeline using marker files for crash recovery:

  • RepairMarker.psv (pipe-separated, writable keys)
  • CASCRepair.mrk (V2 marker format)
  • Stages: read config, init CDN index, repair containers (data/ecache/hardlink sequentially), data repair, post-repair cleanup

Differences from CDN Storage

| Aspect | CDN | Local |
|--------|-----|-------|
| Key size | 16 bytes | 9 bytes (truncated) |
| Key type | Content keys | Encoding keys |
| Organization | Per-archive indices | 16-bucket index files |
| Entry header | None | 30-byte local header |
| Index format | CDN index footer | IDX Journal v7 with guarded blocks |
| Mutability | Immutable | Updated during patches |
| Containers | Single type | 4 types (dynamic/static/residency/hardlink) |

CDN Content Caching

The cascette-cache crate provides multi-layer caching for NGDP/CDN content. It optimizes network bandwidth and latency by caching frequently accessed data at multiple levels.

Architecture

graph TD
    A[Application] --> B[Multi-Layer Cache]
    B --> C[L1: Memory Cache]
    B --> D[L2: Disk Cache]
    D --> E[CDN]
    C --> E

    subgraph "Cache Layers"
        C
        D
    end

L1: Memory Cache

Fast in-memory cache with LRU eviction:

  • Immediate access for hot data
  • Size-based eviction when memory limit reached
  • TTL-based expiration for stale data
  • Zero-copy data sharing with bytes::Bytes
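
To illustrate how size-based LRU eviction and TTL expiry interact, here is a deliberately simple sketch built on standard collections. It is not the cascette-cache implementation: the `MemoryCache` type is hypothetical, recency updates are O(n), and values are cloned `Vec<u8>` rather than zero-copy `bytes::Bytes`.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Hypothetical byte-budgeted LRU cache with a per-entry TTL.
struct MemoryCache {
    max_bytes: usize,
    ttl: Duration,
    used_bytes: usize,
    entries: HashMap<String, (Vec<u8>, Instant)>,
    order: VecDeque<String>, // front = least recently used
}

impl MemoryCache {
    fn new(max_bytes: usize, ttl: Duration) -> Self {
        MemoryCache { max_bytes, ttl, used_bytes: 0, entries: HashMap::new(), order: VecDeque::new() }
    }

    fn remove(&mut self, key: &str) {
        if let Some((data, _)) = self.entries.remove(key) {
            self.used_bytes -= data.len();
            self.order.retain(|k| k.as_str() != key);
        }
    }

    fn insert(&mut self, key: &str, data: Vec<u8>) {
        self.remove(key);
        if data.len() > self.max_bytes {
            return; // larger than the whole budget: never cached
        }
        // Evict least recently used entries until the new value fits.
        while self.used_bytes + data.len() > self.max_bytes {
            match self.order.pop_front() {
                Some(oldest) => {
                    if let Some((old, _)) = self.entries.remove(&oldest) {
                        self.used_bytes -= old.len();
                    }
                }
                None => break,
            }
        }
        self.used_bytes += data.len();
        self.entries.insert(key.to_string(), (data, Instant::now()));
        self.order.push_back(key.to_string());
    }

    fn get(&mut self, key: &str) -> Option<Vec<u8>> {
        let expired = match self.entries.get(key) {
            Some((_, stored_at)) => stored_at.elapsed() > self.ttl,
            None => return None,
        };
        if expired {
            self.remove(key); // stale entries are dropped on access
            return None;
        }
        // Mark the key as most recently used.
        self.order.retain(|k| k.as_str() != key);
        self.order.push_back(key.to_string());
        self.entries.get(key).map(|(data, _)| data.clone())
    }
}
```

A production cache would typically use a linked structure for O(1) recency updates and evict in batches, as the `eviction_batch_size` setting in the configuration section suggests.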

L2: Disk Cache

Persistent disk cache for larger datasets:

  • Survives application restarts
  • Atomic writes with fsync for durability
  • Configurable storage limits
  • Asynchronous I/O with tokio

NGDP-Specific Caches

The crate provides specialized caches for NGDP content types:

Resolution Cache

Caches the NGDP resolution chain:

Root File → Content Key
Content Key → Encoding Key
Encoding Key → CDN Location
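
The chain above amounts to three lookups in sequence. The sketch below models each stage as a map. The `ResolutionCache` and `CdnLocation` types and their fields are hypothetical stand-ins, not the crate's types.

```rust
use std::collections::HashMap;

/// Hypothetical CDN location: which archive, and where inside it.
#[derive(Debug, Clone, PartialEq)]
struct CdnLocation {
    archive: String,
    offset: u64,
    size: u64,
}

/// One map per stage of the resolution chain.
#[derive(Default)]
struct ResolutionCache {
    root: HashMap<String, [u8; 16]>,           // file path -> content key
    encoding: HashMap<[u8; 16], [u8; 16]>,     // content key -> encoding key
    locations: HashMap<[u8; 16], CdnLocation>, // encoding key -> CDN location
}

impl ResolutionCache {
    /// Walks all three stages; None if any stage is not cached.
    fn resolve(&self, path: &str) -> Option<&CdnLocation> {
        let ckey = self.root.get(path)?;
        let ekey = self.encoding.get(ckey)?;
        self.locations.get(ekey)
    }
}
```

A miss at any stage returns `None`, which is the point where a caller would fall back to parsing the corresponding manifest.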

Content-Addressed Cache

Stores content by its MD5 hash (ContentKey):

  • Automatic validation on retrieval
  • Deduplication across builds
  • Supports partial content access

BLTE Block Cache

Caches individual BLTE blocks for large files:

  • Enables partial file access without full download
  • Block-level validation
  • Decompressed and raw block storage

Archive Range Cache

Caches byte ranges from CDN archives:

  • Coalesces nearby requests into larger ranges
  • Reduces CDN round-trips
  • Supports range request optimization
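
Request coalescing can be sketched as a sort-and-merge over half-open byte ranges. The function name and the `max_gap` parameter are hypothetical; the crate's actual thresholds are not documented here.

```rust
/// Merges half-open `[start, end)` ranges that overlap or lie within
/// `max_gap` bytes of each other. Empty ranges are dropped.
fn coalesce_ranges(mut ranges: Vec<(u64, u64)>, max_gap: u64) -> Vec<(u64, u64)> {
    ranges.retain(|&(start, end)| end > start);
    ranges.sort_unstable();
    let mut merged: Vec<(u64, u64)> = Vec::new();
    for (start, end) in ranges {
        if let Some(last) = merged.last_mut() {
            // Close enough to the previous range: extend it instead.
            if start <= last.1.saturating_add(max_gap) {
                last.1 = last.1.max(end);
                continue;
            }
        }
        merged.push((start, end));
    }
    merged
}
```

With `max_gap = 4`, the ranges `(0, 10)`, `(12, 20)`, and `(100, 110)` become `(0, 20)` and `(100, 110)`: one request instead of two for the nearby pair, at the cost of fetching 2 unneeded bytes.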

Memory Pooling

NGDP files have predictable size distributions. The memory pool uses size classes optimized for these patterns:

| Size Class | Range | Typical Content |
|------------|-------|-----------------|
| Small | < 16 KB | Config files, small assets |
| Medium | < 256 KB | Most game files |
| Large | < 8 MB | Textures, models |
| Huge | > 8 MB | Large archives, cinematics |

Benefits:

  • Reduced allocation overhead
  • Better memory locality
  • Thread-local pools for zero-contention
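
Selecting a pool from the size classes in the table above can be sketched as a simple threshold check. The `SizeClass` enum and `size_class` function are hypothetical. The table leaves exactly 8 MB unassigned, so this sketch assumes it falls into Huge.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum SizeClass {
    Small,
    Medium,
    Large,
    Huge,
}

const KB: usize = 1024;
const MB: usize = 1024 * KB;

/// Maps an allocation size in bytes to a pool size class.
fn size_class(len: usize) -> SizeClass {
    if len < 16 * KB {
        SizeClass::Small
    } else if len < 256 * KB {
        SizeClass::Medium
    } else if len < 8 * MB {
        SizeClass::Large
    } else {
        // Assumption: exactly 8 MB is treated as Huge.
        SizeClass::Huge
    }
}
```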

Content Validation

All cached content is validated on retrieval:

MD5 Validation

Content keys are MD5 hashes of the data:

let content_key = ContentKey::from_data(&data);
// Cache validates: MD5(data) == content_key

Jenkins96 Validation

Archive indices use Jenkins96 for fast hashing:

let hash = Jenkins96::hash(path.as_bytes());
// Validates archive index lookups

TACT Key Validation

Encrypted content requires TACT key verification before decryption.

SIMD Optimizations

Hash operations use SIMD acceleration when available:

| Instruction Set | Vector Width | Speedup |
|-----------------|--------------|---------|
| SSE2 | 128-bit | 2x |
| SSE4.1 | 128-bit | 2x |
| AVX2 | 256-bit | 4x |
| AVX-512 | 512-bit | 8x |

Runtime CPU detection selects the best available implementation.
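
Runtime detection of this kind can be sketched with the standard library's `is_x86_feature_detected!` macro. The function name is hypothetical, and `avx512f` is used as the representative AVX-512 feature, which is an assumption about which subset matters. On non-x86 targets the sketch falls back to a scalar path.

```rust
/// Returns the name of the widest instruction set detected at runtime.
fn best_simd_level() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // Check from widest to narrowest so the best match wins.
        if is_x86_feature_detected!("avx512f") {
            return "AVX-512";
        }
        if is_x86_feature_detected!("avx2") {
            return "AVX2";
        }
        if is_x86_feature_detected!("sse4.1") {
            return "SSE4.1";
        }
        if is_x86_feature_detected!("sse2") {
            return "SSE2";
        }
    }
    "scalar"
}
```

A dispatcher would call this once at startup and store a function pointer to the matching implementation, so the detection cost is not paid per hash.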

Configuration

Memory Cache

MemoryCacheConfig {
    max_size: 256 * 1024 * 1024,  // 256 MB limit
    ttl: Duration::from_secs(3600), // 1 hour TTL
    eviction_batch_size: 100,      // Evict 100 items at a time
}

Disk Cache

DiskCacheConfig {
    cache_dir: PathBuf::from("/var/cache/cascette"),
    max_size: 10 * 1024 * 1024 * 1024, // 10 GB limit
    sync_writes: true,                  // fsync after writes
}

Multi-Layer

MultiLayerConfig {
    l1: MemoryCacheConfig::default(),
    l2: DiskCacheConfig::default(),
    write_through: true,  // Write to both layers
    promote_on_hit: true, // Copy L2 hits to L1
}

CDN Integration

The cache integrates with CDN clients for miss handling:

sequenceDiagram
    participant App
    participant L1 as Memory Cache
    participant L2 as Disk Cache
    participant CDN

    App->>L1: get(key)
    alt L1 Hit
        L1-->>App: data
    else L1 Miss
        L1->>L2: get(key)
        alt L2 Hit
            L2-->>L1: data
            L1-->>App: data
        else L2 Miss
            L2->>CDN: fetch(key)
            CDN-->>L2: data
            L2-->>L1: data
            L1-->>App: data
        end
    end

Features:

  • Automatic CDN fallback on cache miss
  • Retry logic with exponential backoff
  • Multiple CDN endpoint failover
  • Range request support for partial content
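
The lookup order in the sequence diagram can be sketched with two in-memory maps standing in for the layers and a closure standing in for the CDN client. All names are hypothetical; the real L2 is a disk cache, and retry and failover are omitted.

```rust
use std::collections::HashMap;

/// Hypothetical two-layer cache; both layers are plain maps here.
struct MultiLayer {
    l1: HashMap<String, Vec<u8>>,
    l2: HashMap<String, Vec<u8>>,
    promote_on_hit: bool,
}

impl MultiLayer {
    /// Checks L1, then L2, then calls `fetch` (the CDN stand-in).
    fn get_or_fetch<F>(&mut self, key: &str, fetch: F) -> Result<Vec<u8>, String>
    where
        F: FnOnce(&str) -> Result<Vec<u8>, String>,
    {
        if let Some(data) = self.l1.get(key) {
            return Ok(data.clone());
        }
        if let Some(data) = self.l2.get(key).cloned() {
            if self.promote_on_hit {
                // Copy the L2 hit into L1 so the next read is faster.
                self.l1.insert(key.to_string(), data.clone());
            }
            return Ok(data);
        }
        // Miss in both layers: fetch, then populate L2 and L1.
        let data = fetch(key)?;
        self.l2.insert(key.to_string(), data.clone());
        self.l1.insert(key.to_string(), data.clone());
        Ok(data)
    }
}
```

A failed fetch leaves both layers untouched, so a later call retries the CDN instead of caching the error.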

Streaming

Large files are processed in chunks to avoid memory exhaustion:

StreamingConfig {
    chunk_size: 64 * 1024,      // 64 KB chunks
    max_buffered_chunks: 16,    // 1 MB max buffer
    validate_chunks: true,      // Validate each chunk
}

Streaming enables:

  • Processing files larger than available memory
  • Progressive validation during download
  • Early error detection
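
Chunked processing can be sketched with `std::io::Read` and a fixed buffer, so memory use stays at one chunk regardless of file size. The function name and callback shape are hypothetical; the crate's streaming API is asynchronous and is not shown here.

```rust
use std::io::{self, Read};

/// Reads `reader` in pieces of at most `chunk_size` bytes, passing each
/// piece to `on_chunk`. Returns the total number of bytes processed.
fn process_in_chunks<R, F>(mut reader: R, chunk_size: usize, mut on_chunk: F) -> io::Result<u64>
where
    R: Read,
    F: FnMut(&[u8]),
{
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of stream
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // A read may return fewer bytes than the buffer holds.
        on_chunk(&buf[..n]);
        total += n as u64;
    }
    Ok(total)
}
```

The callback is where per-chunk validation or hashing would go. Because `read` may return short reads on sockets, callers that need fixed-size chunks would have to buffer inside the callback.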

Metrics

The cache tracks performance metrics:

  • Hit rate (L1, L2, overall)
  • Miss rate and CDN fallback frequency
  • Eviction counts and reasons
  • Memory and disk usage
  • Validation success/failure rates

CDN Mirroring and Archival Strategy

Overview

This document outlines strategies for mirroring Blizzard’s CDN content for WoW using NGDP/CASC.

Note: Python code examples in this document are conceptual pseudocode illustrating mirroring workflows. For working code, see the cascette mirror CLI command or reference implementations in References.

Rationale for Mirroring

Blizzard removes older builds from CDN within days to weeks of new patches (see Archival Urgency below). Mirroring preserves builds that would otherwise be lost, enabling:

  • Preservation: Maintain access to historical builds after CDN removal
  • Development: Test CASC implementations against known data offline
  • Performance: Local access avoids CDN latency and bandwidth limits

Target Products

Focus on World of Warcraft products:

| Product Code | Description | Update Frequency |
|--------------|-------------|------------------|
| wow | Retail/Live | Weekly patches |
| wowt | Public Test Realm | Frequent updates |
| wow_beta | Beta servers | Daily during beta |
| wow_classic | Classic (Wrath/Cata) | Bi-weekly |
| wow_classic_era | Classic Era (Vanilla) | Rare updates |
| wow_classic_ptr | Classic PTR | During test cycles |
| wow_classic_titan | Classic Titan (CN only, WotLK 3.80.x) | Unknown |
| wow_anniversary | Classic Anniversary (TBC 2.5.x) | Unknown |

Archival Urgency

Based on testing CDN retention windows:

| Product | Retention Window | Archival Priority |
|---------|------------------|-------------------|
| wow (Retail) | 14-15 days | High - Daily checks |
| wow_classic | 2-4 weeks | Medium - Weekly checks |
| wow_classic_era | ~3 months | Low - Monthly checks |
| wow_beta | 7-10 days | Critical - Continuous |
| wowt (PTR) | 10-14 days | High - Every 2-3 days |

Critical Finding: Retail builds disappear from the CDN roughly two weeks (14-15 days) after a new patch ships.

Build Discovery

Track new builds via Ribbit protocol:

Sequence Number Monitoring

# Query summary endpoint
echo -e "v1/summary\r\n" | nc us.version.battle.net 1119

# Response includes sequence numbers
## seqn = 2241282

Monitor sequence number changes:

async def check_for_updates():
    summary = await ribbit_client.get_summary()

    for product in summary.products:
        stored_seqn = database.get_sequence(product.name)

        if product.seqn > stored_seqn:
            # New build detected!
            await process_new_build(product)
            database.update_sequence(product.name, product.seqn)
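
The comparison step in the pseudocode above can be sketched in Rust as a pure function over the parsed summary and the stored sequence numbers. The function name is hypothetical, and treating a product with no stored sequence number as updated is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Returns the products whose sequence number is newer than the stored
/// one. Products that were never seen before are reported as updated.
fn updated_products(summary: &[(String, u64)], stored: &HashMap<String, u64>) -> Vec<String> {
    summary
        .iter()
        .filter(|(name, seqn)| stored.get(name).map_or(true, |old| seqn > old))
        .map(|(name, _)| name.clone())
        .collect()
}
```

Keeping the comparison free of I/O makes it easy to test. The caller would persist the new sequence numbers only after the corresponding build has been processed.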

Version Information

# Get specific product versions
echo -e "v1/products/wow/versions\r\n" | nc us.version.battle.net 1119

CDN Path Discovery

Critical: Always Extract CDN Paths

# Get CDN information - NEVER hardcode paths!
echo -e "v1/products/wow/cdns\r\n" | nc us.version.battle.net 1119

Example response (abbreviated; a live response names the first column Name and also carries a Servers column):

Region!STRING:0|Hosts!STRING:0|Path!STRING:0|ConfigPath!STRING:0
us|level3.blizzard.com edgecast.blizzard.com|tpr/wow|tpr/configs/data
eu|level3.blizzard.com edgecast.blizzard.com|tpr/wow|tpr/configs/data

CRITICAL: The Path field (tpr/wow) must be used for URL construction:

# CORRECT - Uses path from CDN response
cdn_url = f"http://{host}/{path}/data/{hash[:2]}/{hash[2:4]}/{hash}"

# WRONG - Hardcoded path
cdn_url = f"http://{host}/tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}"

All WoW products use tpr/wow regardless of product code:

  • wow, wow_classic, wow_classic_era, wow_classic_titan, wow_anniversary all use tpr/wow

  • Never assume paths based on product names
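
The URL construction rule can be sketched in Rust with the path passed in from the CDN response rather than hardcoded. The function name and the `kind` parameter (for example `data` or `config`) are hypothetical.

```rust
/// Builds a CDN URL of the form
/// `http://{host}/{path}/{kind}/{hash[0:2]}/{hash[2:4]}/{hash}`.
/// Returns None unless `hash` is 32 lowercase hex characters.
fn cdn_url(host: &str, path: &str, kind: &str, hash: &str) -> Option<String> {
    let valid = hash.len() == 32
        && hash.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if !valid {
        return None;
    }
    // `path` must come from the CDN response; it is never hardcoded here.
    Some(format!(
        "http://{host}/{path}/{kind}/{}/{}/{hash}",
        &hash[..2],
        &hash[2..4]
    ))
}
```

Validating the hash first also makes the two slice operations safe, since the string is known to be 32 ASCII bytes.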

Essential Files

Priority order for archival:

1. Configuration Files (Critical)

  • BuildConfig: Build-specific settings

  • CDNConfig: CDN and archive information

  • ProductConfig: Product metadata

2. System Files (Required)

  • Encoding: Content key mappings (~500MB-2GB)

  • Root: File manifest

  • Install: Installation manifest

  • Download: Download priority

3. Indices (Important)

  • Archive indices (.index files)

  • Patch indices for updates

4. Data Archives (Bulk)

  • Archive files (data.###)

  • Largest storage requirement

  • Can be fetched on-demand

Mirroring Architecture

Storage Structure

/mirror
├── configs/
│   └── data/
│       ├── {hash[0:2]}/
│       │   └── {hash[2:4]}/
│       │       └── {hash}
├── data/
│   ├── {hash[0:2]}/
│   │   └── {hash[2:4]}/
│   │       └── {hash}
├── indices/
│   └── *.index
└── metadata.db

Database Schema

CREATE TABLE builds (
    id SERIAL PRIMARY KEY,
    product VARCHAR(50),
    build_config VARCHAR(32),
    cdn_config VARCHAR(32),
    build_name VARCHAR(100),
    detected_at TIMESTAMP,
    archived BOOLEAN DEFAULT FALSE
);

CREATE TABLE files (
    hash VARCHAR(32) PRIMARY KEY,
    size BIGINT,
    type VARCHAR(20),
    downloaded_at TIMESTAMP
);

Download Strategy

Priority-Based Downloading

class MirrorStrategy:
    def __init__(self, full_mirror=False):
        # Set full_mirror=True to also fetch the bulk data archives
        self.full_mirror = full_mirror
        self.priorities = {
            'configs': 1,      # Highest priority
            'encoding': 2,
            'root': 3,
            'install': 4,
            'indices': 5,
            'data': 10        # Lowest priority
        }

    async def mirror_build(self, build_info):
        # 1. Download configs first
        await self.download_configs(build_info)

        # 2. Get encoding file
        encoding = await self.download_encoding(build_info)

        # 3. Download indices
        indices = await self.download_indices(build_info)

        # 4. Optional: Download data archives
        if self.full_mirror:
            await self.download_archives(indices)

Bandwidth Management

  • Concurrent downloads: 4-8 connections

  • Rate limiting: Respect CDN limits

  • Retry logic: Handle transient failures

  • Resume support: Continue interrupted downloads

Incremental Updates

Track changes efficiently:

async def incremental_update(product):
    current_build = await get_current_build(product)
    stored_build = database.get_latest_build(product)

    if current_build != stored_build:
        # Download only new/changed files
        new_files = await diff_builds(current_build, stored_build)
        await download_files(new_files)

        database.update_build(product, current_build)
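
The `diff_builds` step above reduces to a set difference over file hashes. A minimal Rust sketch follows, assuming each build has already been resolved to a set of hash strings. The function name is hypothetical.

```rust
use std::collections::HashSet;

/// Returns the hashes present in `current` but not in `stored`,
/// sorted so the download order is deterministic.
fn new_files<'a>(current: &'a HashSet<String>, stored: &'a HashSet<String>) -> Vec<&'a str> {
    let mut missing: Vec<&str> = current.difference(stored).map(String::as_str).collect();
    missing.sort_unstable();
    missing
}
```

Because storage is content-addressed, files removed from the new build need no action: they simply stay in the mirror for older builds.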

Verification

Ensure data integrity:

Hash Verification

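import hashlib

def verify_file(filepath, expected_hash):
    # Hash the file contents and compare against the expected hex digest
    with open(filepath, "rb") as f:
        actual_hash = hashlib.md5(f.read()).hexdigest()
    if actual_hash != expected_hash.lower():
        raise ValueError(f"Hash mismatch: {filepath}")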

Archive Integrity

  • Verify BLTE headers

  • Check chunk checksums

  • Validate encoding entries

Storage Optimization

Deduplication

Content-addressed storage automatically deduplicates:

def store_file(content, hash):
    path = get_path_from_hash(hash)
    if not os.path.exists(path):
        # Only store if not already present
        write_file(path, content)
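
The same store-if-absent idea can be sketched in Rust against the `data/{hash[0:2]}/{hash[2:4]}/{hash}` layout from the storage structure. The function names are hypothetical. Writing to a temporary file and renaming it is an addition of this sketch, meant to keep a crash from leaving a partial file under the final name.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Maps a hash to `<root>/data/<hash[0:2]>/<hash[2:4]>/<hash>`.
/// Panics if `hash` is shorter than 4 ASCII characters.
fn mirror_path(root: &Path, hash: &str) -> PathBuf {
    root.join("data").join(&hash[0..2]).join(&hash[2..4]).join(hash)
}

/// Writes `content` only when no file exists for `hash`.
/// Returns Ok(true) if a file was written, Ok(false) if already present.
fn store_if_absent(root: &Path, hash: &str, content: &[u8]) -> io::Result<bool> {
    if hash.len() < 4 || !hash.is_ascii() {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "invalid hash"));
    }
    let path = mirror_path(root, hash);
    if path.exists() {
        return Ok(false); // already stored: deduplicated
    }
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    // Write to a temporary name first, then rename into place.
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, content)?;
    fs::rename(&tmp, &path)?;
    Ok(true)
}
```

The existence check and the write are not atomic together, so two concurrent writers could both write the same hash. With content-addressed data both would write identical bytes, which makes that race harmless in practice.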

Compression

  • Keep BLTE files compressed

  • Use filesystem compression for configs

  • Consider archive formats for old builds

Historical Build Recovery

Using External Sources

  1. Community Archives:

    • Shared build collections
    • Private archives
  2. Wayback Machine:

    • Historical Ribbit responses
    • Cached configuration files
  3. Torrent archives:

    • Community-shared build collections
    • Distributed preservation efforts

Reconstruction

Rebuild missing content:

flowchart TD
    A[Partial Build] --> B[Identify Missing]
    B --> C[Search Mirrors]

    C --> D{Found?}
    D -->|Yes| E[Download Missing]
    D -->|No| F[Check Archives]

    F --> G{In Archive?}
    G -->|Yes| H[Extract Content]
    G -->|No| I[Search Community]

    E --> J[Verify Hashes]
    H --> J

    I --> K{Available?}
    K -->|Yes| L[Request Copy]
    K -->|No| M[Document Gap]

    L --> J
    J --> N[Update Archive]
    M --> O[Gap Report]

    style A stroke-width:4px
    style N stroke-width:4px
    style O stroke-width:3px,stroke-dasharray:5 5
    style D stroke-width:3px,stroke-dasharray:5 5
    style G stroke-width:3px,stroke-dasharray:5 5
    style K stroke-width:3px,stroke-dasharray:5 5
    style J stroke-width:2px
    style B stroke-width:2px

Fair Use

Archival under fair use principles:

  • Research: Academic study of game development

  • Education: Teaching game architecture

  • Preservation: Cultural heritage of gaming

  • Non-commercial: No monetization of archives

Best Practices

  • Respect intellectual property

  • Don’t distribute copyrighted content

  • Use for personal/research purposes

  • Cooperate with takedown requests

Reference Implementations

For detailed analysis of NGDP/CASC reference implementations, see references.md.

Key implementations examined:

  • CascLib: Complete C++ library with 10+ years of development

  • TACT.Net: C# architecture with modular design

  • rustycasc: Rust implementation with type safety

  • BlizzTrack: Production monitoring with database persistence

  • blizztools: Rust CLI for NGDP operations

  • blizzget: C++ downloader with custom version support

  • tactmon: Advanced C++ monitoring with template ORM

  • TACTSharp: .NET extraction library with memory-mapped files

These implementations informed cascette-rs design for CDN interaction and content resolution.

Implementation Examples

Build Tracker

class BuildTracker:
    def __init__(self, products):
        self.products = products
        self.check_interval = 300  # 5 minutes

    async def run(self):
        while True:
            for product in self.products:
                await self.check_product(product)
            await asyncio.sleep(self.check_interval)

    async def check_product(self, product):
        try:
            versions = await ribbit.get_versions(product)
            cdns = await ribbit.get_cdns(product)

            for region in versions.regions:
                build_config = region.build_config
                if not self.is_archived(build_config):
                    await self.archive_build(product, region, cdns)
        except Exception as e:
            logger.error(f"Failed to check {product}: {e}")

Archive Manager

class ArchiveManager:
    def __init__(self, storage_path):
        self.storage = storage_path
        self.cdn_client = CDNClient()

    async def archive_build(self, build_info):
        # Create build directory
        build_dir = self.storage / build_info.product / build_info.build_config
        build_dir.mkdir(parents=True, exist_ok=True)

        # Download in priority order
        await self.download_configs(build_info)
        await self.download_encoding(build_info)
        await self.download_root(build_info)

        # Mark as archived
        self.mark_archived(build_info)

Monitoring and Alerts

Health Checks

class MirrorHealth:
    async def check_health(self):
        return {
            'disk_space': self.check_disk_space(),
            'cdn_connectivity': await self.check_cdn(),
            'database': self.check_database(),
            'last_check': datetime.now()
        }

    def check_disk_space(self):
        usage = shutil.disk_usage(self.storage_path)
        return {
            'used': usage.used,
            'free': usage.free,
            'percent': (usage.used / usage.total) * 100
        }

Disaster Recovery

Backup Strategy

  1. Primary Mirror: Fast SSD storage
  2. Secondary Backup: HDD archive
  3. Cloud Backup: Critical configs only
  4. Community Sharing: Torrent distribution

Recovery Procedures

# Restore from backup
rsync -av /backup/mirror/ /primary/mirror/

# Verify integrity
find /mirror -type f -name "*.index" | xargs -I {} md5sum {}

# Rebuild database
python rebuild_metadata.py /mirror

Community Coordination

Shared Resources

  • Mirror status: Track who has what builds

  • Gap identification: Find missing builds

  • Bandwidth sharing: Distribute download load

  • Verification: Cross-check integrity

Future Considerations

  • Automated build discovery with predictive downloading before CDN removal
  • Differential compression between builds to reduce storage
  • Geographic replication for redundancy

Tools and Resources

Existing Tools

  • CASCExplorer: Browse CASC archives

  • WoW.tools: Online CASC viewer

  • TACTSharp: .NET extraction library

  • CascLib: C++ CASC library

Monitoring Services

  • BlizzTrack: Real-time build tracking

  • Wago.tools: API for build information

Community

  • Discord servers: Coordinate archival efforts

  • GitHub repos: Share tools and scripts

  • Forums: Technical discussions

The 14-15 day retention window for retail WoW makes automated monitoring and archival essential.

Reference Implementations

This document lists NGDP/CASC implementations useful for understanding the system. These projects have informed cascette-rs development and serve as references for format details and edge cases.

C++ Implementations

ladislav-zezula/CascLib

The original C++ CASC library by the author of StormLib (MPQ library).

heksesang/CascLib

C++17 header-only library from the WoW 6.0 era.

  • Repository: https://github.com/heksesang/CascLib
  • Use for: Simplified CASC reading, header-only integration
  • Note: Early implementation, lacks modern features (LZMA, LZ4, Zstd, encryption)

C# Implementations

Marlamin/CascLib

C# fork with WoW-specific enhancements, used by wow.tools.

  • Repository: https://github.com/Marlamin/CascLib
  • Use for: Encryption keys, root handlers, CDN index parsing, BLTE decoding
  • Features: Game-specific root handlers for 20+ Blizzard titles

wowdev/TACTSharp

Memory-mapped C# implementation focused on performance.

wowdev/TACT.Net

C# library for TACT extraction operations.

  • Repository: https://github.com/wowdev/TACT.Net
  • Use for: Extraction patterns, multiple input/output formats
  • Features: EKey, CKey, FileDataID, and filename-based extraction

WowDevTools/CASCHost

Server-side CASC hosting for modding.

  • Repository: https://github.com/WowDevTools/CASCHost
  • Use for: CASC building, CDN structure generation, content serving
  • Note: Server-focused (produces content), opposite of cascette-rs (consumes content)

danielsreichenbach/BuildBackup

C# CDN backup tool (maintained fork of TACTAdder).

Rust Implementations

ferronn-dev/rustycasc

Rust CASC types and FrameXML extractor.

ohchase/blizztools

Rust CLI for NGDP/TACT operations.

  • Repository: https://github.com/ohchase/blizztools
  • Use for: Ribbit protocol, install manifest parsing, async download patterns
  • Features: Version queries, manifest parsing, file downloads

Other Tools

Warpten/tactmon

C++ CDN tracker with Ribbit monitoring.

  • Repository: https://github.com/Warpten/tactmon
  • Use for: Ribbit protocol implementation, CDN monitoring, product tracking
  • Features: Template-based ORM, database persistence, production monitoring

funjoker/blizzget

Windows GUI CDN downloader.

Kruithne/wow.export

Node.js/TypeScript export toolkit.

Marlamin/wow.tools.local

Local wow.tools implementation.

Community Resources

wowdev.wiki

Community wiki documenting WoW file formats and systems.

wago.tools

Build database with 1,900+ WoW builds.

Community CDN Mirrors

Community-operated mirrors preserving historical WoW builds. These provide access to game data after Blizzard removes it from official CDNs.

cdn.arctium.tools

casc.wago.tools

archive.wow.tools

cascette-rs supports automatic fallback between these mirrors when official Blizzard CDNs are unavailable.

Project Setup

This page covers the requirements and setup for developing cascette-rs.

Requirements

Rust Toolchain

  • Minimum Supported Rust Version (MSRV): 1.92.0
  • Edition: Rust 2024

Install the required toolchain:

rustup install 1.92.0
rustup default 1.92.0

Required components:

rustup component add rustfmt clippy

For WASM development:

rustup target add wasm32-unknown-unknown

Development Tools

| Tool | Purpose | Installation |
|------|---------|--------------|
| cargo-deny | Dependency auditing | cargo install cargo-deny |
| cargo-nextest | Test runner | cargo install cargo-nextest |
| cargo-llvm-cov | Code coverage | cargo install cargo-llvm-cov |
| mdbook | Documentation | cargo install mdbook or via mise install |

Optional Tools

| Tool | Purpose | Installation |
|------|---------|--------------|
| ripgrep | Code search | cargo install ripgrep or system package |
| hyperfine | Benchmarking | cargo install hyperfine |
| cargo-watch | Auto-rebuild | cargo install cargo-watch |

Repository Structure

cascette-rs/
├── crates/                    # Workspace members
│   ├── cascette-crypto/       # Cryptographic primitives
│   ├── cascette-formats/      # Binary format parsers
│   └── ...
├── docs/                      # mdBook documentation
│   ├── src/                   # Documentation source
│   └── book.toml              # mdBook configuration
├── deny.toml                  # cargo-deny configuration
├── Cargo.toml                 # Workspace manifest
└── AGENTS.md                  # AI assistant guidance

First-Time Setup

  1. Clone the repository:

    git clone https://github.com/wowemulation-dev/cascette-rs.git
    cd cascette-rs
    
  2. Verify the toolchain:

    rustc --version  # Should be 1.92.0 or later
    cargo --version
    
  3. Build the workspace:

    cargo build --workspace
    
  4. Run tests:

    cargo nextest run --workspace
    
  5. Verify lints pass:

    cargo fmt --all -- --check
    cargo clippy --workspace --all-targets
    

IDE Configuration

VS Code

Recommended extensions:

  • rust-analyzer - Rust language support
  • Even Better TOML - TOML file support
  • crates - Dependency version management

Settings (.vscode/settings.json):

{
  "rust-analyzer.check.command": "clippy",
  "rust-analyzer.check.allTargets": true,
  "editor.formatOnSave": true,
  "[rust]": {
    "editor.defaultFormatter": "rust-lang.rust-analyzer"
  }
}

JetBrains (RustRover/IntelliJ)

  • Install the Rust plugin
  • Enable “Run rustfmt on save”
  • Configure clippy as the external linter

Quality Gate

All changes must pass the CI workflow before merging. Run these checks locally:

# Full CI check (run before committing)
cargo fmt --all -- --check && \
cargo clippy --workspace --all-targets && \
cargo nextest run --profile ci --workspace && \
cargo doc --workspace --no-deps

Individual checks:

| Command | Purpose |
|---------|---------|
| cargo fmt --all -- --check | Format verification |
| cargo clippy --workspace --all-targets | Lint checks |
| cargo nextest run --profile ci --workspace | Unit and integration tests |
| cargo doc --workspace --no-deps | Documentation build |
| cargo deny check | Dependency audit |

WASM Compatibility

Core libraries must compile to WASM:

cargo check --target wasm32-unknown-unknown -p cascette-crypto
cargo check --target wasm32-unknown-unknown -p cascette-formats

Documentation

Build and serve the documentation locally:

# Build HTML documentation
mdbook build docs

# Serve locally with auto-reload
mdbook serve docs --open

The documentation will be available at http://localhost:3000.

Workspace Configuration

The workspace uses strict linting. Key settings from Cargo.toml:

[workspace.lints.clippy]
# Lint groups
all = { level = "warn", priority = -1 }
pedantic = { level = "warn", priority = -1 }
nursery = { level = "warn", priority = -1 }
cargo = { level = "warn", priority = -1 }

# Safety lints (higher priority)
unwrap_used = { level = "warn", priority = 2 }
panic = { level = "warn", priority = 2 }
todo = { level = "warn", priority = 2 }
unimplemented = { level = "warn", priority = 2 }
expect_used = { level = "warn", priority = 2 }

Library code should avoid unwrap(), expect(), and panic!(). Use Result types and proper error handling instead.

Testing Guidelines

This page covers testing conventions and practices for cascette-rs.

Test Organization

Module Structure

Tests live in the same file as the code they test, using a #[cfg(test)] module:

pub fn parse_header(data: &[u8]) -> Result<Header, ParseError> {
    // Implementation
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_header_with_valid_data_returns_header() {
        // Test implementation
    }
}

Nested Modules for Large Files

For files with many tests, use nested modules to group related tests:

#[cfg(test)]
mod tests {
    use super::*;

    mod parsing {
        use super::*;

        #[test]
        fn test_parse_entry_from_valid_bytes() { ... }

        #[test]
        fn test_parse_entry_from_truncated_bytes_returns_error() { ... }
    }

    mod building {
        use super::*;

        #[test]
        fn test_builder_with_entries_produces_sorted_output() { ... }
    }

    mod edge_cases {
        use super::*;

        #[test]
        fn test_edge_empty_input_returns_empty_result() { ... }
    }
}

Test Naming Convention

Pattern

Use this naming pattern for test functions:

test_<subject>_<condition>_<expected_outcome>

Components:

| Part | Description | Example |
|------|-------------|---------|
| subject | What is being tested | parser, builder, entry |
| condition | The scenario or input | with_valid_data, from_empty_input |
| expected_outcome | What should happen | returns_struct, returns_error |

Examples

Parsing tests:

// Good - specific and descriptive
fn test_parse_header_with_valid_magic_returns_header() { ... }
fn test_parse_header_with_invalid_magic_returns_error() { ... }
fn test_parse_entry_from_truncated_data_returns_incomplete_error() { ... }

// Bad - too vague
fn test_parse() { ... }
fn test_header() { ... }
fn test_error() { ... }

Building tests:

// Good
fn test_builder_with_single_entry_creates_valid_output() { ... }
fn test_builder_with_unsorted_entries_sorts_before_writing() { ... }

// Bad
fn test_builder() { ... }
fn test_build() { ... }

Round-trip tests:

// Good - suffix with _round_trip
fn test_index_entry_round_trip_preserves_all_fields() { ... }
fn test_blte_compression_round_trip_matches_original() { ... }

// Bad
fn test_round_trip() { ... }  // Round trip of what?

Category Prefixes

Use consistent prefixes for special test categories:

| Prefix | Use Case | Example |
|--------|----------|---------|
| test_edge_* | Edge cases and boundary conditions | test_edge_empty_input_handled |
| test_error_* | Error path validation | test_error_invalid_checksum_detected |
| *_round_trip | Serialization/deserialization | test_config_round_trip |

Edge case examples:

fn test_edge_empty_index_builds_successfully() { ... }
fn test_edge_single_entry_is_searchable() { ... }
fn test_edge_max_u32_offset_handled() { ... }
fn test_edge_zero_length_data_returns_empty() { ... }

Error handling examples:

fn test_error_truncated_footer_returns_parse_error() { ... }
fn test_error_invalid_checksum_returns_mismatch() { ... }
fn test_error_unsorted_entries_rejected() { ... }

Test Types

Unit Tests

Test individual functions in isolation:

#[test]
fn test_jenkins96_hash_with_known_input_produces_expected_output() {
    let result = Jenkins96::hash(b"test");
    // 0x12345678 is a placeholder; substitute a verified test vector
    assert_eq!(result.hash32, 0x12345678);
}

Integration Tests

Place integration tests in the crate's tests/ directory, where they can exercise only the public API:

crates/cascette-formats/
├── src/
│   └── lib.rs
└── tests/
    └── archive_integration.rs

Property-Based Tests

Use proptest for testing invariants across many inputs:

#[cfg(test)]
mod proptest_tests {
    use super::*;  // Brings build, parse, and Entry into scope
    use proptest::prelude::*;

    proptest! {
        #[test]
        fn round_trip_preserves_entries(entries in prop::collection::vec(any::<Entry>(), 0..100)) {
            let built = build(&entries);
            let parsed = parse(&built)?;
            prop_assert_eq!(entries, parsed);
        }
    }
}

Property test naming (inside proptest! macro):

  • No test_ prefix needed (the #[test] attribute inside the macro already marks the function as a test; proptest does not rename it)
  • Describe the property being verified
  • Examples: round_trip_preserves_entries, checksum_detects_corruption

Assertions

Use pretty_assertions

Import pretty_assertions for better diff output on failures:

#[cfg(test)]
mod tests {
    use pretty_assertions::assert_eq;

    #[test]
    fn test_something() {
        assert_eq!(expected, actual);  // Shows colored diff on failure
    }
}

Common Assertions

| Assertion | Use Case |
|-----------|----------|
| assert_eq!(expected, actual) | Value equality |
| assert_ne!(a, b) | Values differ |
| assert!(condition) | Boolean conditions |
| assert!(result.is_ok()) | Success check |
| assert!(result.is_err()) | Error check |
| assert!(matches!(value, pattern)) | Pattern matching |

Error Assertions

Test specific error types:

#[test]
fn test_parse_with_invalid_data_returns_checksum_error() {
    let result = parse(invalid_data);

    assert!(matches!(
        result,
        Err(ParseError::ChecksumMismatch { .. })
    ));
}

Running Tests

This project uses cargo-nextest for faster, parallel test execution with better output formatting.

Basic Commands

# Run all tests with nextest (recommended)
cargo nextest run --workspace

# Run tests with CI profile (stricter timeouts, immediate output on failures)
cargo nextest run --profile ci --workspace

# Run tests for a specific crate
cargo nextest run -p cascette-formats
cargo nextest run --profile ci -p cascette-formats

# Run tests matching a pattern
cargo nextest run --workspace edge_          # All edge case tests
cargo nextest run --workspace error_         # All error tests
cargo nextest run --workspace round_trip     # All round-trip tests

# Run a specific test
cargo nextest run -p cascette-formats test_parse_header_with_valid_data

Feature Combinations

Test with different feature combinations:

# Default features
cargo test --workspace

# No default features (minimal build)
cargo test --workspace --no-default-features

# All features
cargo test --workspace --all-features

Code Coverage

Generate coverage reports:

# Generate LCOV report
cargo llvm-cov --workspace --lcov --output-path lcov.info

# Generate HTML report
cargo llvm-cov --workspace --html

# Open HTML report
open target/llvm-cov/html/index.html

Test Data

Embedded Test Data

For small test cases, embed data directly in tests:

#[test]
fn test_parse_minimal_header() {
    let data = [
        0x42, 0x4C, 0x54, 0x45,  // Magic: "BLTE"
        0x00, 0x00, 0x00, 0x10,  // Header size: 16
    ];

    let header = parse_header(&data).expect("should parse");
    assert_eq!(header.magic, b"BLTE");
}

Test Fixtures

For larger test files, use the include_bytes! macro or test fixtures:

const TEST_INDEX: &[u8] = include_bytes!("fixtures/sample.index");

#[test]
fn test_parse_real_index_file() {
    let index = ArchiveIndex::parse(TEST_INDEX).expect("should parse");
    assert!(!index.entries.is_empty());
}

Property Test Strategies

Define reusable strategies for property tests:

fn valid_entry_strategy() -> impl Strategy<Value = IndexEntry> {
    (
        prop::array::uniform16(any::<u8>()),  // 16-byte key
        0u32..u32::MAX,                        // offset
        1u32..1_000_000,                       // size
    ).prop_map(|(key, offset, size)| {
        IndexEntry { key: key.to_vec(), offset, size, archive_index: None }
    })
}

CI Integration

Tests run automatically on every pull request using cargo-nextest. The CI workflow:

  1. Runs cargo nextest run --profile ci --workspace with default features
  2. Runs tests with --no-default-features on changed crates
  3. Tests each changed crate individually on stable Rust
  4. Collects code coverage using cargo llvm-cov --nextest and uploads to Codecov

See .github/workflows/ci.yml for the full configuration.

Nextest Profiles

The project uses three nextest profiles configured in .config/nextest.toml:

| Profile | Description | Use Case |
|---------|-------------|----------|
| default | Standard timeouts, final output on completion | Local development |
| ci | Stricter timeouts, immediate output on failures | CI, PR checks |
| release | Release build with optimizations | Performance testing |

Cargo Aliases

Convenient cargo aliases are defined in .cargo/config.toml:

cargo nextest-all          # All tests with default profile
cargo nextest-lib          # Library tests only
cargo nextest-ci           # All tests with CI profile
cargo nextest-release      # All tests with release profile
cargo nextest-unit         # Unit tests only
cargo nextest-integration  # Integration tests only

Performance Profiling

Flamegraphs

The project supports flamegraph generation using cargo-flamegraph. Flamegraphs help visualize CPU time spent in different functions during execution.

Generating Flamegraphs Locally

# Generate flamegraph for benchmarks
cargo flamegraph --bench throughput -- --bench

# Generate flamegraph for a binary
cargo flamegraph --bin cascette-ribbit -- --help

# Generate flamegraph for tests
cargo flamegraph --test integration

# Specify output location (flamegraph.svg is created in working directory by default)
cargo flamegraph --output target/flamegraphs/flamegraph.svg --bench throughput -- --bench

Flamegraph outputs are stored in target/flamegraphs/ and ignored by git.

CI Flamegraph Generation

The .github/workflows/profiling.yml workflow generates flamegraphs automatically:

  • Trigger: Manual via workflow_dispatch or commits with [perf] in the message
  • Targets: bench (default), test, binary
  • Output: Uploaded as artifacts and posted to PR comments

To trigger a flamegraph run:

git commit -m "Add performance optimization [perf]"
git push

Or manually trigger via GitHub Actions UI with a target selector.

Benchmarking

The project uses criterion for benchmarking.

# Run all benchmarks
cargo bench

# Run specific benchmark
cargo bench --bench throughput

# View the HTML report (Criterion writes it during the run when its html_reports feature is enabled)
cargo bench --bench throughput
open target/criterion/report/index.html

Benchmark Regression Detection

The profiling workflow automatically detects performance regressions:

  • Runs on main branch pushes
  • Uses benchmark-action/github-action-benchmark to store results
  • Alerts when performance degrades by >200%
  • Posts comments to commits with regression alerts

Benchmark data is stored in GitHub Actions cache for historical comparison.

Coding Standards

This page covers coding conventions and style guidelines for cascette-rs.

Formatting

All code must be formatted with rustfmt. Run before committing:

cargo fmt --all

The workspace uses default rustfmt settings. No custom configuration is needed.

Linting

The workspace enables strict clippy lints. All warnings must be resolved:

cargo clippy --workspace --all-targets

Lint Configuration

From Cargo.toml:

[workspace.lints.clippy]
# Lint groups at low priority
all = { level = "warn", priority = -1 }
pedantic = { level = "warn", priority = -1 }
nursery = { level = "warn", priority = -1 }
cargo = { level = "warn", priority = -1 }

# Safety lints at higher priority
unwrap_used = { level = "warn", priority = 2 }
panic = { level = "warn", priority = 2 }
todo = { level = "warn", priority = 2 }
unimplemented = { level = "warn", priority = 2 }
expect_used = { level = "warn", priority = 2 }

Error Handling

Library Code

Library crates must use proper error handling:

// Good - returns Result
pub fn parse(data: &[u8]) -> Result<Header, ParseError> {
    if data.len() < HEADER_SIZE {
        return Err(ParseError::InsufficientData {
            expected: HEADER_SIZE,
            actual: data.len(),
        });
    }
    // ...
}

// Bad - panics
pub fn parse(data: &[u8]) -> Header {
    assert!(data.len() >= HEADER_SIZE);  // Don't do this
    // ...
}

Error Types

Use thiserror for error definitions:

use thiserror::Error;

#[derive(Debug, Error)]
pub enum ParseError {
    #[error("insufficient data: expected {expected} bytes, got {actual}")]
    InsufficientData { expected: usize, actual: usize },

    #[error("invalid magic: expected {expected:?}, got {actual:?}")]
    InvalidMagic { expected: [u8; 4], actual: [u8; 4] },

    #[error("checksum mismatch")]
    ChecksumMismatch { expected: [u8; 8], actual: [u8; 8] },
}
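For readers new to thiserror, it may help to see roughly what such a derive stands for. The following std-only sketch writes the Display and Error implementations by hand for two of the variants. It approximates the behaviour and is not the macro's literal expansion.

```rust
use std::fmt;

#[derive(Debug)]
pub enum ParseError {
    InsufficientData { expected: usize, actual: usize },
    ChecksumMismatch { expected: [u8; 8], actual: [u8; 8] },
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Mirrors the #[error("...")] message with its named fields.
            Self::InsufficientData { expected, actual } => write!(
                f,
                "insufficient data: expected {expected} bytes, got {actual}"
            ),
            // Fields that the message does not mention are simply ignored.
            Self::ChecksumMismatch { .. } => write!(f, "checksum mismatch"),
        }
    }
}

// Marks the type as an error so it works with `?` and `Box<dyn Error>`.
impl std::error::Error for ParseError {}
```

The derive removes this boilerplate and keeps each message next to its variant, which is the main reason to prefer it in library crates.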

Avoiding unwrap() and expect()

Library code should avoid unwrap() and expect(). Use these alternatives:

// Instead of unwrap(), propagate errors
let value = map.get(&key).ok_or(Error::KeyNotFound)?;

// Instead of expect(), use ok_or_else() with context
let value = map.get(&key)
    .ok_or_else(|| Error::KeyNotFound { key: key.clone() })?;

// For truly impossible cases, use unreachable!() with comment
match validated_enum {
    Known::Variant => { /* ... */ }
    // Validation already rejected every other variant
    _ => unreachable!("validated_enum was checked during validation"),
}
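The fragments above rely on surrounding context. A self-contained version of the ok_or_else pattern might look like the sketch below, where `LookupError` and `lookup_size` are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum LookupError {
    KeyNotFound { key: String },
}

/// Looks up a size by key, returning a descriptive error instead of panicking.
pub fn lookup_size(index: &HashMap<String, u32>, key: &str) -> Result<u32, LookupError> {
    let size = index
        .get(key)
        // The closure only runs on the error path, so the String
        // allocation is skipped whenever the key is present.
        .ok_or_else(|| LookupError::KeyNotFound { key: key.to_owned() })?;
    Ok(*size)
}
```

Using ok_or_else rather than ok_or matters only when building the error has a cost, as it does here.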

When expect() is unavoidable (e.g., in binrw map functions), add a file-level allow with documentation:

//! Module description
//!
//! Uses expect in binrw map functions where Result types cannot be used.
#![allow(clippy::expect_used)]

Test Code

Test code may use unwrap() and expect() with the allow attribute:

#[cfg(test)]
#[allow(clippy::unwrap_used, clippy::expect_used, clippy::panic)]
mod tests {
    // Tests can use unwrap/expect/panic freely
}

Binary Format Parsing

Use binrw

All binary formats use the binrw crate for parsing and building:

use binrw::{BinRead, BinWrite};

#[derive(Debug, BinRead, BinWrite)]
#[brw(big)]  // NGDP uses big-endian
pub struct Header {
    #[brw(magic = b"BLTE")]
    pub magic: (),

    pub header_size: u32,
    pub flags: u8,
}

Big-Endian Default

NGDP/CASC formats use big-endian byte order. Always specify:

#[derive(BinRead, BinWrite)]
#[brw(big)]  // Required for NGDP formats
pub struct Entry {
    pub offset: u32,
    pub size: u32,
}
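To see why the annotation matters, here is a small std-only sketch (independent of binrw) that decodes the same four bytes in both byte orders. The helper name `read_u32_be` is hypothetical.

```rust
/// Reads a big-endian `u32` from the first four bytes of `data`.
/// Returns `None` if fewer than four bytes are available.
pub fn read_u32_be(data: &[u8]) -> Option<u32> {
    match data {
        [a, b, c, d, ..] => Some(u32::from_be_bytes([*a, *b, *c, *d])),
        _ => None,
    }
}
```

The bytes `00 00 00 10` decode to 16 as big-endian, but to 0x1000_0000 when misread as little-endian, so a missing or wrong endianness attribute tends to show up as wildly large sizes and offsets.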

If a field uses little-endian (rare), annotate explicitly:

#[derive(BinRead, BinWrite)]
#[brw(big)]
pub struct MixedEntry {
    pub big_endian_field: u32,

    #[brw(little)]
    pub little_endian_field: u32,  // Exception - document why
}

Round-Trip Testing

Every format must have round-trip tests:

use std::io::Cursor;

use binrw::{BinRead, BinWrite};

#[test]
fn test_header_round_trip_preserves_all_fields() {
    // Assumes Header also derives PartialEq so assert_eq! can compare it
    let original = Header {
        magic: (),
        header_size: 16,
        flags: 0x01,
    };

    let mut buffer = Vec::new();
    original.write(&mut Cursor::new(&mut buffer)).unwrap();

    let parsed = Header::read(&mut Cursor::new(&buffer)).unwrap();

    assert_eq!(original, parsed);
}

Documentation

Public API Documentation

All public items require documentation:

/// Parses a BLTE header from the given data.
///
/// # Arguments
///
/// * `data` - Raw bytes containing the BLTE header
///
/// # Returns
///
/// The parsed header on success, or an error if parsing fails.
///
/// # Errors
///
/// Returns `ParseError::InsufficientData` if the data is too short.
/// Returns `ParseError::InvalidMagic` if the magic bytes don't match.
///
/// # Examples
///
/// ```
/// use cascette_formats::blte::parse_header;
///
/// let data = include_bytes!("../fixtures/sample.blte");
/// let header = parse_header(data)?;
/// println!("Header size: {}", header.header_size);
/// # Ok::<(), cascette_formats::blte::ParseError>(())
/// ```
pub fn parse_header(data: &[u8]) -> Result<Header, ParseError> {
    // ...
}

Binary Format Documentation

Document binary formats with exact byte layouts:

/// Archive index entry.
///
/// ## Binary Layout
///
/// | Offset | Size | Field | Description |
/// |--------|------|-------|-------------|
/// | 0x00 | 16 | key | Encoding key (MD5 hash) |
/// | 0x10 | 4 | size | Compressed size in bytes |
/// | 0x14 | 4 | offset | Offset into archive file |
///
/// Total size: 24 bytes (0x18)
///
/// All multi-byte fields are big-endian.
#[derive(Debug, BinRead, BinWrite)]
#[brw(big)]
pub struct IndexEntry {
    pub key: [u8; 16],
    pub size: u32,
    pub offset: u32,
}
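A layout table like the one in the doc comment can be checked against a hand-written parser. The sketch below reads the same 24-byte layout using only std. It is an illustration of the documented offsets, not the project's parser, and `parse_entry` and `ENTRY_SIZE` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct IndexEntry {
    pub key: [u8; 16],
    pub size: u32,
    pub offset: u32,
}

/// Total entry size from the layout table: 24 bytes.
pub const ENTRY_SIZE: usize = 0x18;

/// Parses one entry, following the documented offsets field by field.
pub fn parse_entry(data: &[u8]) -> Option<IndexEntry> {
    if data.len() < ENTRY_SIZE {
        return None;
    }
    // 0x00..0x10: 16-byte key
    let mut key = [0u8; 16];
    key.copy_from_slice(&data[0x00..0x10]);
    // 0x10..0x14: size, big-endian
    let size = u32::from_be_bytes([data[0x10], data[0x11], data[0x12], data[0x13]]);
    // 0x14..0x18: offset, big-endian
    let offset = u32::from_be_bytes([data[0x14], data[0x15], data[0x16], data[0x17]]);
    Some(IndexEntry { key, size, offset })
}
```

Keeping the offsets in the comments identical to the table makes a mismatch between documentation and code easy to spot in review.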

Naming Conventions

Types and Traits

| Item | Convention | Example |
|------|------------|---------|
| Structs | PascalCase | ArchiveIndex, BlteHeader |
| Enums | PascalCase | CompressionType, ParseError |
| Traits | PascalCase | CascFormat, KeyStore |
| Type aliases | PascalCase | ContentKey, EncodingKey |

Functions and Methods

| Item | Convention | Example |
|------|------------|---------|
| Functions | snake_case | parse_header, build_index |
| Methods | snake_case | self.get_entry(), self.is_valid() |
| Constructors | new or from_* | Header::new(), Key::from_hex() |
| Conversions | to_* or into_* | to_bytes(), into_vec() |
| Getters | no prefix | fn size(&self) not fn get_size(&self) |
| Boolean getters | is_* or has_* | is_empty(), has_entries() |
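As an illustration, a single type that follows each row of the table might look like this. The `Key` type below is a hypothetical stand-in and is not taken from the cascette-rs source.

```rust
pub struct Key {
    bytes: Vec<u8>,
}

impl Key {
    /// Constructor: `new`.
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { bytes }
    }

    /// Constructor from another representation: `from_*`.
    pub fn from_hex(hex: &str) -> Option<Self> {
        // Reject odd lengths and non-hex characters up front, which also
        // guarantees the input is ASCII and safe to slice by byte index.
        if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        (0..hex.len())
            .step_by(2)
            .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
            .collect::<Option<Vec<u8>>>()
            .map(|bytes| Self { bytes })
    }

    /// Getter: no `get_` prefix.
    pub fn size(&self) -> usize {
        self.bytes.len()
    }

    /// Boolean getter: `is_*`.
    pub fn is_empty(&self) -> bool {
        self.bytes.is_empty()
    }

    /// Borrowing conversion that produces a new value: `to_*`.
    pub fn to_bytes(&self) -> Vec<u8> {
        self.bytes.clone()
    }

    /// Consuming conversion: `into_*`.
    pub fn into_vec(self) -> Vec<u8> {
        self.bytes
    }
}
```

The to_*/into_* split follows the usual Rust convention: to_* borrows and may allocate, while into_* takes ownership and is typically free.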

Constants and Statics

// Constants: SCREAMING_SNAKE_CASE
pub const HEADER_SIZE: usize = 16;
pub const MAGIC_BYTES: [u8; 4] = *b"BLTE";

// Statics (rare): SCREAMING_SNAKE_CASE
static GLOBAL_CONFIG: Lazy<Config> = Lazy::new(Config::default);

Modules

Module names use snake_case:

mod archive;
mod blte;
mod encoding;
mod root;

File structure mirrors module structure:

src/
├── archive/
│   ├── mod.rs
│   ├── index.rs
│   └── builder.rs
├── blte/
│   ├── mod.rs
│   ├── header.rs
│   └── compression.rs
└── lib.rs

Memory and Performance

Zero-Copy When Possible

Prefer borrowing over copying:

// Good - borrows data
pub fn parse<'a>(data: &'a [u8]) -> Result<Entry<'a>, Error> {
    Ok(Entry {
        key: &data[0..16],
        // ...
    })
}

// Less efficient - copies data
pub fn parse(data: &[u8]) -> Result<Entry, Error> {
    Ok(Entry {
        key: data[0..16].to_vec(),
        // ...
    })
}
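A complete, compilable version of the borrowing variant could look like the following sketch. The `payload` field and the `parse_borrowed` name are assumptions added for illustration; only the 16-byte key comes from the example above.

```rust
/// An entry whose fields point into the caller's buffer.
#[derive(Debug, PartialEq, Eq)]
pub struct Entry<'a> {
    pub key: &'a [u8],
    pub payload: &'a [u8],
}

/// Splits `data` into a 16-byte key and the remaining payload without copying.
pub fn parse_borrowed<'a>(data: &'a [u8]) -> Option<Entry<'a>> {
    if data.len() < 16 {
        return None;
    }
    // split_at only produces two sub-slices of the same allocation.
    let (key, payload) = data.split_at(16);
    Some(Entry { key, payload })
}
```

The lifetime parameter ties the entry to the input buffer, so the compiler rejects any use of the entry after the buffer is dropped.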

Avoid Loading Large Files Into Memory

Stream large files instead of loading entirely:

// Good - streams data
// Assumes read_entry returns Ok(None) at end of input
pub fn process_archive<R: Read + Seek>(reader: &mut R) -> Result<(), Error> {
    while let Some(entry) = read_entry(reader)? {
        process_entry(&entry)?;
    }
    Ok(())
}

// Bad - loads everything
pub fn process_archive(data: &[u8]) -> Result<(), Error> {
    let archive = parse_entire_archive(data)?;  // Out of memory for large files
    // ...
}
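The streaming pattern can be shown end to end with a toy format. In this sketch the "archive" is just a sequence of 4-byte big-endian records. `read_record`, `sum_records`, and `RECORD_SIZE` are hypothetical names, and real archive entries are of course more involved.

```rust
use std::io::{self, ErrorKind, Read};

pub const RECORD_SIZE: usize = 4;

/// Reads one record, returning `Ok(None)` at a clean end of input and an
/// `UnexpectedEof` error if the input stops in the middle of a record.
fn read_record<R: Read>(reader: &mut R) -> io::Result<Option<[u8; RECORD_SIZE]>> {
    let mut buf = [0u8; RECORD_SIZE];
    let mut filled = 0;
    while filled < RECORD_SIZE {
        match reader.read(&mut buf[filled..]) {
            Ok(0) if filled == 0 => return Ok(None),
            Ok(0) => return Err(io::Error::new(ErrorKind::UnexpectedEof, "truncated record")),
            Ok(n) => filled += n,
            // A read interrupted by a signal is safe to retry.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(Some(buf))
}

/// Processes records one at a time; memory use stays constant
/// regardless of how large the input is.
pub fn sum_records<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut total = 0u64;
    while let Some(record) = read_record(reader)? {
        total += u64::from(u32::from_be_bytes(record));
    }
    Ok(total)
}
```

Distinguishing a clean end of input from a truncated record is what lets the loop terminate with Ok instead of treating every end of file as an error.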

Use Appropriate Collection Types

| Use Case | Type |
|----------|------|
| Ordered, indexed access | Vec<T> |
| Key-value lookup | HashMap<K, V> or BTreeMap<K, V> |
| Unique values | HashSet<T> or BTreeSet<T> |
| Small fixed-size | [T; N] or ArrayVec<T, N> |
| Bytes | Bytes (from bytes crate) for shared ownership |
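The table does not say when to pick BTreeMap over HashMap. One common rule of thumb, offered here as an assumption rather than project policy, is to use BTreeMap whenever iteration order must be deterministic, for example when entries have to be written out sorted. The helper `sorted_keys` below is hypothetical.

```rust
use std::collections::BTreeMap;

/// Collects entries into a BTreeMap and returns the keys in ascending order.
/// Duplicate keys collapse to a single entry (the last value wins).
pub fn sorted_keys(entries: &[(u32, &str)]) -> Vec<u32> {
    let map: BTreeMap<u32, &str> = entries.iter().copied().collect();
    // BTreeMap iterates in key order, so no explicit sort is needed.
    map.keys().copied().collect()
}
```

A HashMap would serve the same lookups, often faster, but its iteration order is unspecified and can differ between runs.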

Unsafe Code

Unsafe code requires explicit documentation:

/// # Safety
///
/// Caller must ensure:
/// - `ptr` is valid for reads of `len` bytes
/// - `ptr` is properly aligned for `T`
/// - The memory is not mutated during this call
pub unsafe fn read_from_ptr<T>(ptr: *const u8, len: usize) -> T {
    // ...
}

Prefer safe abstractions when possible. Use unsafe only when necessary for performance or FFI.

WASM Compatibility

Core libraries must compile to WASM. Avoid:

  • C dependencies (use pure Rust implementations)
  • File system access in library code
  • Platform-specific code without #[cfg] guards
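One way to satisfy the file-system and #[cfg] points together is to keep the core logic byte-oriented and gate any file-based convenience wrapper. The sketch below uses hypothetical function names and a deliberately trivial operation.

```rust
/// Core logic: works on bytes the caller already holds, so it compiles
/// on every target, including wasm32-unknown-unknown.
pub fn count_lines(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == b'\n').count()
}

/// Convenience wrapper that touches the file system.
/// Compiled out entirely when targeting wasm32.
#[cfg(not(target_arch = "wasm32"))]
pub fn count_lines_in_file(path: &std::path::Path) -> std::io::Result<usize> {
    Ok(count_lines(&std::fs::read(path)?))
}
```

With this split, a browser host can fetch the bytes however it likes and call the core function directly, while native callers keep the ergonomic path-based entry point.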

Test WASM compilation:

cargo check --target wasm32-unknown-unknown -p cascette-crypto
cargo check --target wasm32-unknown-unknown -p cascette-formats