cascette-rs Documentation

About This Project

cascette-rs is part of the wowemulation-dev initiative to build open source tooling for World of Warcraft emulation. The project focuses on modern WoW Classic clients (1.13+, 2.5+, 3.4+) which use Blizzard’s NGDP content distribution system.

Why Modern Clients?

The WoW emulation and modding community has historically focused on the 3.3.5a client from 2008. While functional, this approach has limitations:

Outdated technology: MPQ archives, no content addressing, manual patching
Fragmented tooling: Many tools exist only as abandoned Windows binaries
Limited modding: Technical constraints restrict what can be modified

Modern Classic clients differ from 3.3.5a:

Active development: Blizzard continues updating these clients
Better architecture: NGDP/CASC enables content addressing and streaming
Cross-platform: Same content format works on Windows, macOS, and Linux
Preservation: Community CDN mirrors ensure historical builds remain available

What You Can Do with cascette-rs

For Emulator Developers

Download specific WoW Classic builds for server development
Extract game data files (DBCs, maps, models) for server-side use
Verify client installations match expected versions
Serve game content to clients via the Agent API

For Archivists

Mirror complete WoW builds from Blizzard’s CDN
Preserve historical game versions before they disappear
Access builds from community CDN mirrors when Blizzard removes them
Track version history across all WoW products

For Modders

Extract assets from any WoW build for modification
Understand file relationships through encoding and root manifests
Work with modern file formats instead of legacy MPQ tools
Build custom content distribution for modified clients

For Tool Developers

Parse all NGDP/CASC binary formats with the cascette-formats library
Build applications on top of cascette’s CDN and protocol layers
Integrate CASC reading into existing toolchains
Create cross-platform tools that work on Linux, macOS, and Windows

What is NGDP?

NGDP (Next Generation Distribution Pipeline) is Blizzard’s content distribution system. It replaced MPQ/P2P/Torrent distribution with World of Warcraft 6.0 in 2014.

For technical details, see NGDP on wowdev.wiki.

System Overview

NGDP consists of three components:

Ribbit API: Provides product versions, CDN endpoints, and configuration data
CDN Distribution: Delivers game content through HTTP/HTTPS
Agent: Local HTTP service (port 1120) that manages downloads and installations

Key Differences from MPQ

Distribution Method: CDN-based delivery instead of P2P/Torrent
Content Addressing: Files identified by content hashes rather than names
Update Mechanism: Incremental updates through partial file downloads
Archive Format: CASC (Content Addressable Storage Container) replaces MPQ archives
Content Protection: Encryption support for secure pre-release distribution

Benefits of NGDP

For Distribution

Reduced Server Load: CDN infrastructure handles content delivery
Faster Downloads: Users connect to nearest CDN nodes
Incremental Updates: Only changed content needs downloading
Parallel Downloads: Multiple files retrieved simultaneously
Pre-release Distribution: Encrypted content can be distributed before launch

For Development

Content Deduplication: Identical files stored once across versions
Stream Installation: Games playable before download completes
Platform Independence: Same content system across operating systems

Core Concepts

Content Addressing

Files are identified by MD5 hashes of their content. Identical content produces identical hashes, enabling deduplication, integrity verification, and cache efficiency.

System Files

NGDP uses metadata files to manage content:

Root File: Maps game files to content keys
Encoding File: Maps content keys to encoding keys (the hashes of the encoded BLTE data)
Install Manifest: Defines installation requirements
Download Manifest: Sets download priorities

BLTE Format

BLTE (Block Table Encoded) is the container format for game data. It supports:

Block-based compression
Multiple compression algorithms
Encryption per block
Chunked processing

Content Encryption

NGDP supports Salsa20 encryption for distributing content before its release date. Files can be pre-positioned on CDN while remaining inaccessible until decryption keys are provided.

Technical Specifications

Byte Order: Big-endian (network byte order)
Hash Algorithm: MD5 for content identification
Key Size: 128-bit (16 bytes)
Compression: zlib, lz4, and other algorithms per block
Encryption: Salsa20 stream cipher for content protection

Format Organization

NGDP/CASC formats are organized by their storage location and usage context:

1. CDN Formats (Network/Remote)

Formats served by Blizzard CDN servers via HTTP/HTTPS.

2. CASC Formats (Local/Client)

Formats created and managed by the Battle.net client on local storage.

3. Shared Formats

Formats used in both CDN and local contexts.

Component Documentation

Service Discovery

Service discovery components handle version information, CDN endpoint discovery, and product configuration metadata:

Ribbit Protocol - TCP-based discovery and version information API
BPSV Format - Blizzard Pipe-Separated Values format for API responses

CDN Formats

Configuration Files (Text)

Build Config - Build-specific settings (/config/{hash})
CDN Config - CDN server and archive lists (/config/{hash})
Product Config - Product settings and versions (/config/{hash})
Patch Config - Differential patch information (/config/{hash})

Content Files (Binary)

Immutable, content-addressed files served from CDN:

CDN Archives - BLTE containers with game content (/data/{prefix}/{hash}.archive)
CDN Indices - Maps keys to archive locations (/data/{prefix}/{hash}.index)
Encoding File - Maps content to encoding keys (/data/{prefix}/{hash})
Root File - Maps files to content keys (/data/{prefix}/{hash})
Install Manifest - Installation requirements (/data/{prefix}/{hash})
Download Manifest - Download priorities (/data/{prefix}/{hash})
Patch Archives - Delta patches (/patch/{prefix}/{hash}.archive)
Patch Indices - Patch archive index (/patch/{prefix}/{hash}.index)

Modern Additions (WoW 8.2+)

TVFS - Virtual file system manifest (via vfs-* fields in BuildConfig)

CASC Local Formats

Client-side storage structures created and managed by Battle.net:

Local Indices

IDX Journal - Bucket-based local index (Data/indices/{bucket}.idx)
Archive Groups - Combined archive index (client-generated optimization)
Shadow Memory - Memory-mapped cache (Data/shmem)

Local Archives

data.### - Combined CDN archives (Data/data/data.###)
patch.### - Combined patch archives (Data/patch/patch.###)

Local Configuration

.build.info - Local build configuration (root directory), BPSV-formatted
DBCache - Hotfix database cache (Cache/ADB/*.bin)

Shared Formats

Container Formats

BLTE Format - Block compression/encryption (all content storage)
ESpec Format - Encoding specifications (compression definitions)

Cryptographic

MD5 Keys - Content addressing (all key references)
Salsa20 Encryption - Stream cipher (content protection)
TACT Keys - Key management (decryption keys)

Supporting Systems

CDN Architecture - Content distribution network structure
CDN Mirroring - Historical preservation strategies
FileDataId - Persistent file identification across builds

Format Relationships

CDN Download Flow

flowchart TB
    subgraph Discovery
        Ribbit["Ribbit (BPSV)"]
        ProductConfig["Product Config"]
        CDNConfig["CDN Config"]
        BuildConfig["Build Config"]
    end

    subgraph Content
        Archives["CDN Archives + Indices"]
        Encoding["Encoding File"]
        Root["Root File"]
        Manifests["Install/Download Manifests"]
    end

    Ribbit --> ProductConfig --> CDNConfig --> BuildConfig
    BuildConfig --> Archives
    Archives --> Encoding --> Root
    Root --> Manifests

Content Resolution

flowchart LR
    subgraph Input
        File["Filename/FileDataId"]
    end

    subgraph Lookup
        Root["Root File"]
        CKey["Content Key"]
        Encoding["Encoding File"]
        EKey["Encoding Key + ESpec"]
    end

    subgraph Retrieval
        Index["CDN Index"]
        Location["Archive Location"]
        Archive["CDN Archive"]
        BLTE["BLTE Data"]
    end

    subgraph Output
        Decompress["Decompression"]
        Raw["Raw Content"]
    end

    File --> Root --> CKey
    CKey --> Encoding --> EKey
    EKey --> Index --> Location
    Location --> Archive --> BLTE
    BLTE --> Decompress --> Raw
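
The resolution chain above can be sketched with three lookup tables. This is a minimal illustration, not the cascette-rs API: `Resolver`, `ArchiveLocation`, and the use of in-memory `HashMap`s are assumptions made for the example, and only the first encoding key per content key is kept.

```rust
use std::collections::HashMap;

type Key = [u8; 16];

/// Where an encoded file lives inside a CDN archive (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ArchiveLocation {
    pub archive: usize, // position in the CDN config's archive list
    pub offset: u32,
    pub size: u32,
}

/// Hypothetical in-memory tables, one per stage of the flow.
#[derive(Default)]
pub struct Resolver {
    pub root: HashMap<u32, Key>,              // FileDataID -> CKey
    pub encoding: HashMap<Key, Key>,          // CKey -> first EKey
    pub index: HashMap<Key, ArchiveLocation>, // EKey -> archive location
}

impl Resolver {
    /// Walk Root -> Encoding -> CDN index; `None` if any stage misses.
    pub fn resolve(&self, file_data_id: u32) -> Option<ArchiveLocation> {
        let ckey = self.root.get(&file_data_id)?;
        let ekey = self.encoding.get(ckey)?;
        self.index.get(ekey).copied()
    }
}
```

The returned location identifies the BLTE data to fetch; decoding it is a separate step.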

Glossary

Key terms used throughout this documentation. If you’re coming from the MPQ/3.3.5a modding scene, pay attention to the “MPQ Equivalent” notes.

Content Identification

Content Key (CKey)

MD5 hash of a file’s uncompressed content. Used to identify files regardless of how they’re compressed or stored.

Size: 16 bytes (128 bits)
MPQ Equivalent: Similar to how MPQ uses filenames, but content-based
Example: a1b2c3d4e5f6... (32 hex characters)

Encoding Key (EKey)

MD5 hash of a file’s compressed/encoded BLTE data. Used to locate files on CDN and in archives.

Size: 16 bytes (128 bits)
Relationship: CKey → Encoding File → EKey
Example: Files with identical content share a CKey but may have different EKeys

FileDataID (FDID)

Numeric identifier for a file, persistent across game versions. Replaced filename-based lookups in WoW 8.0+.

Size: 4 bytes (32-bit integer)
Range: 0 to ~4 million (as of 2024)
MPQ Equivalent: None - MPQ used filenames exclusively
Example: 1234567 refers to a specific texture, model, or data file

Name Hash

Jenkins96 hash of a file’s path. Used in older builds (pre-8.0) to look up files by name.

Algorithm: Jenkins96 (lookup3)
MPQ Equivalent: Similar to MPQ’s hash table for filename lookup
Note: Deprecated in favor of FileDataID in modern builds

File Formats

BLTE (Block Table Encoded)

Container format that wraps all CASC content. Provides compression and optional encryption.

MPQ Equivalent: Similar to MPQ’s sector-based compression
Key difference: BLTE supports multiple compression algorithms per file
Compression: None, zlib, LZMA, LZ4, Zstd
Encryption: Salsa20, ARC4 (older builds)

Encoding File

Maps CKeys to EKeys. The central lookup table for content resolution.

Purpose: Find where a file’s compressed data lives
MPQ Equivalent: None - MPQ stored files directly by name

Root File

Maps FileDataIDs (or name hashes) to CKeys. The entry point for file lookup.

Purpose: Find what content hash a file has
MPQ Equivalent: Combines MPQ’s hash table and block table functions
Contains: FileDataID, locale flags, content flags, CKey

Install Manifest

Lists files required for a minimal installation (enough to launch the game).

Purpose: Prioritize essential files for streaming installs
MPQ Equivalent: None - MPQ required full downloads

Download Manifest

Prioritizes files for background downloading after initial install.

Purpose: Order non-essential downloads by importance
MPQ Equivalent: None

Storage Concepts

Archive Index

Maps EKeys to offsets within an archive file.

CDN index: .index file paired with each archive
Local index: .idx files in Data/indices/
MPQ Equivalent: Similar to MPQ’s block table

Archive Group

Combined index covering multiple archives. Optimization for faster lookups.

Location: Generated locally by the client from downloaded archive indices
Purpose: Single lookup instead of checking each archive index
Note: Never downloaded from CDN - always client-generated

CASC (Content Addressable Storage Container)

The local storage system. Everything is identified by content hash.

MPQ Equivalent: Replaces MPQ archives entirely
Key difference: Files found by hash, not by name

Network Concepts

CDN (Content Delivery Network)

Servers that host game content. Blizzard uses Akamai, Level3, and others.

Structure: https://{cdn}/{product}/{type}/{hash[:2]}/{hash[2:4]}/{hash}
Types: config, data, patch
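
As a rough sketch of the URL layout above, the helper below assembles a path from its parts. The function name `cdn_url`, the host `cdn.example.com`, and the choice to lowercase the hash are assumptions for illustration only.

```rust
/// Build a URL following
/// https://{cdn}/{product}/{type}/{hash[:2]}/{hash[2:4]}/{hash}.
/// Returns `None` unless `hash` is exactly 32 hex characters.
pub fn cdn_url(cdn: &str, product: &str, kind: &str, hash: &str) -> Option<String> {
    if hash.len() != 32 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Assumption: paths use lowercase hex.
    let hash = hash.to_ascii_lowercase();
    Some(format!(
        "https://{cdn}/{product}/{kind}/{}/{}/{hash}",
        &hash[..2],
        &hash[2..4]
    ))
}
```

For example, `cdn_url("cdn.example.com", "tpr/wow", "data", "bbf06e7476382cfaa396cff0049d356b")` produces a path ending in `data/bb/f0/bbf06e7476382cfaa396cff0049d356b`.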

Ribbit

Protocol for querying product versions and CDN information.

Port: 1119 (TCP) or HTTP
Purpose: Discover what versions exist and where to download them
MPQ Equivalent: None - MPQ versions were distributed manually

Agent

Local HTTP service (port 1120) that manages downloads and installations.

Purpose: Background downloading, installation management
MPQ Equivalent: None - MPQ required manual patching

Configuration

Build Config

Per-build settings including root/encoding file hashes and encryption keys.

Location: CDN /config/{hash}
Contains: Root CKey, encoding CKey, patch info, VFS info

CDN Config

Lists available CDN servers and archive hashes.

Location: CDN /config/{hash}
Contains: Archive list, server URLs, file groups

Product Config

Product-wide settings spanning multiple builds.

Location: CDN /config/{hash}
Contains: Decryption keys, feature flags

Encryption

TACT Key

Encryption key for protected content. Named keys are published, unnamed are secret.

Size: 16 bytes
Algorithm: Used with Salsa20 stream cipher
Source: Community-maintained key databases

Salsa20

Stream cipher used for content encryption in modern builds.

Key size: 128 bits (16-byte TACT key); the nonce is derived from the IV stored in each encrypted BLTE block
Replaces: ARC4 (used in older builds)

MPQ to CASC Quick Reference

MPQ Concept	CASC Equivalent
.mpq file	Archive (data.xxx)
Filename	FileDataID or CKey
Hash table	Root file
Block table	Archive index
Sector compression	BLTE blocks
Patch MPQ	Patch archives + encoding
listfile.txt	Community listfiles
Manual patching	Agent + CDN

Encoding File Format

The encoding file is the gateway to all CASC content. It maps content keys (unencoded file hashes) to encoding keys (encoded/compressed file hashes) and provides essential metadata for content resolution.

Overview

The encoding file serves multiple critical functions:

Content Resolution: Maps content keys to encoding keys for CDN retrieval
Compression Metadata: Specifies ESpec encoding for each file
Size Information: Tracks both compressed and decompressed sizes
Multi-Version Support: Handles multiple encoding keys per content key

File Structure

The encoding file is BLTE-encoded and consists of:

[BLTE Container]
  [Header]           (22 bytes)
  [ESpec Table]      (variable)
  [CKey Page Index]  (variable)
  [CKey Pages]       (variable)
  [EKey Page Index]  (variable)
  [EKey Pages]       (variable)
  [File ESpec]       (variable) - The encoding file's own ESpec

Binary Format

Header (22 bytes)

struct EncodingHeader {
    uint16_t magic;           // 0x00: 'EN' (0x454E)
    uint8_t  version;         // 0x02: Version (1)
    uint8_t  ckey_size;       // 0x03: Content key size (16)
    uint8_t  ekey_size;       // 0x04: Encoding key size (16)
    uint16_t ckey_page_size;  // 0x05: CKey page size in KB (BE)
    uint16_t ekey_page_size;  // 0x07: EKey page size in KB (BE)
    uint32_t ckey_page_count; // 0x09: Number of CKey pages (BE)
    uint32_t ekey_page_count; // 0x0D: Number of EKey pages (BE)
    uint8_t  flags;            // 0x11: Flags (must be 0)
    uint32_t espec_size;      // 0x12: ESpec table size (BE)
};
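
A minimal sketch of reading this header in Rust, using the offsets from the struct above. The names `EncodingHeader` and `parse_header` are illustrative and are not claimed to match the cascette-formats types.

```rust
#[derive(Debug, PartialEq)]
pub struct EncodingHeader {
    pub version: u8,
    pub ckey_size: u8,
    pub ekey_size: u8,
    pub ckey_page_size_kb: u16,
    pub ekey_page_size_kb: u16,
    pub ckey_page_count: u32,
    pub ekey_page_count: u32,
    pub flags: u8,
    pub espec_size: u32,
}

/// Parse the fixed 22-byte header; `None` on short input or wrong magic.
pub fn parse_header(data: &[u8]) -> Option<EncodingHeader> {
    let h: &[u8; 22] = data.get(..22)?.try_into().ok()?;
    if &h[0..2] != b"EN" {
        return None;
    }
    // All multi-byte fields are big-endian.
    let be16 = |o: usize| u16::from_be_bytes([h[o], h[o + 1]]);
    let be32 = |o: usize| u32::from_be_bytes([h[o], h[o + 1], h[o + 2], h[o + 3]]);
    Some(EncodingHeader {
        version: h[0x02],
        ckey_size: h[0x03],
        ekey_size: h[0x04],
        ckey_page_size_kb: be16(0x05),
        ekey_page_size_kb: be16(0x07),
        ckey_page_count: be32(0x09),
        ekey_page_count: be32(0x0D),
        flags: h[0x11],
        espec_size: be32(0x12),
    })
}
```

The fields are packed without alignment padding, which is why the sketch reads by explicit offset instead of casting to a struct.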

ESpec String Table

Immediately follows the header. Contains null-terminated strings referenced by entries. An illustrative table:

"n\0z\0b:{1024=n,*=z}\0...\0"

Common ESpec patterns:

z - ZLib compression
n - No compression
b:{size=spec,...} - Block table: each chunk gives a size and the spec applied to it, with * for the remainder (see ESpec)
Empty string for uncompressed files

Page Index Tables

CKey Page Index

For each CKey page:

struct PageIndex {
    uint8_t first_key[ckey_size];  // First key in the page
    uint8_t page_hash[16];         // MD5 of the page data
};

EKey Page Index

Similar structure but uses ekey_size for the first key.

Content Key (CKey) Pages

Pages are sorted by content key for binary search. Each page contains multiple entries:

struct CKeyEntry {
    uint8_t  ekey_count;                    // Number of encoding keys
    uint8_t  file_size[5];                  // Decompressed size (40-bit BE)
    uint8_t  ckey[ckey_size];               // Content key
    uint8_t  ekeys[ekey_size * ekey_count]; // Encoding keys
};

Entry layout (sizes from header):

[count:1] [size:5] [ckey:ckey_size] [ekey1:ekey_size] [ekey2:ekey_size] ...

Multiple EKeys: A single content key can map to multiple encoding keys, allowing:

Different compression algorithms for the same content
Regional variations with different encryption
Platform-specific optimizations
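
The entry layout above can be walked as follows. This is a sketch under stated assumptions: `parse_ckey_page` and its owned `CKeyEntry` are hypothetical names, and a zero `ekey_count` is treated as the start of page padding.

```rust
#[derive(Debug, PartialEq)]
pub struct CKeyEntry {
    pub file_size: u64,
    pub ckey: Vec<u8>,
    pub ekeys: Vec<Vec<u8>>,
}

/// Read a 40-bit big-endian integer from the first 5 bytes.
fn read_u40_be(b: &[u8]) -> u64 {
    b.iter().take(5).fold(0u64, |acc, &x| (acc << 8) | u64::from(x))
}

/// Parse [count:1][size:5][ckey][ekey * count] entries from one page.
pub fn parse_ckey_page(page: &[u8], ckey_size: usize, ekey_size: usize) -> Vec<CKeyEntry> {
    let mut entries = Vec::new();
    let mut pos = 0;
    while let Some(&count) = page.get(pos) {
        let ekey_count = usize::from(count);
        let keys_start = pos + 6;
        let ekeys_start = keys_start + ckey_size;
        let end = ekeys_start + ekey_size * ekey_count;
        // Stop at padding, at a degenerate key size, or at a truncated entry.
        if ekey_count == 0 || ekey_size == 0 || end > page.len() {
            break;
        }
        entries.push(CKeyEntry {
            file_size: read_u40_be(&page[pos + 1..keys_start]),
            ckey: page[keys_start..ekeys_start].to_vec(),
            ekeys: page[ekeys_start..end]
                .chunks_exact(ekey_size)
                .map(|k| k.to_vec())
                .collect(),
        });
        pos = end;
    }
    entries
}
```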

Encoding Key (EKey) Pages

Maps encoding keys to ESpec entries:

struct EKeyEntry {
    uint8_t  ekey[ekey_size];     // Encoding key
    uint32_t espec_index;          // Index into ESpec table (BE)
    uint8_t  file_size[5];         // Encoded file size (40-bit BE)
};

Padding Detection: EKey pages may contain padding entries that must be skipped. Two sentinel patterns indicate padding:

espec_index == 0xFFFFFFFF (Agent.exe sentinel)
espec_index == 0 with all key bytes 0x00 (zero-fill padding)
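
A sketch of how the two padding sentinels might be filtered while reading an EKey page. `parse_ekey_page` and the owned `EKeyEntry` are illustrative names; the sketch skips padding entries rather than stopping at the first one, which is an assumption about how defensively a parser wants to behave.

```rust
#[derive(Debug, PartialEq)]
pub struct EKeyEntry {
    pub ekey: Vec<u8>,
    pub espec_index: u32,
    pub file_size: u64,
}

/// Parse fixed-size [ekey][espec_index:4][size:5] entries, skipping padding.
pub fn parse_ekey_page(page: &[u8], ekey_size: usize) -> Vec<EKeyEntry> {
    let entry_len = ekey_size + 4 + 5;
    page.chunks_exact(entry_len)
        .filter_map(|raw| {
            let (ekey, rest) = raw.split_at(ekey_size);
            let espec_index = u32::from_be_bytes([rest[0], rest[1], rest[2], rest[3]]);
            // Sentinel 1: espec_index of 0xFFFFFFFF.
            let sentinel = espec_index == 0xFFFF_FFFF;
            // Sentinel 2: zero index together with an all-zero key.
            let zero_fill = espec_index == 0 && ekey.iter().all(|&b| b == 0);
            if sentinel || zero_fill {
                return None;
            }
            // 40-bit big-endian encoded size.
            let file_size = rest[4..9]
                .iter()
                .fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
            Some(EKeyEntry { ekey: ekey.to_vec(), espec_index, file_size })
        })
        .collect()
}
```

Note that an entry with `espec_index == 0` and a non-zero key is valid and is kept.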

Content Resolution Process

Find CKey Entry:
- Binary search CKey page index for target page
- Linear search within page for content key
- Extract encoding key(s) and decompressed size
Find EKey Entry (optional):
- Binary search EKey page index
- Locate entry to get ESpec index and compressed size
Parse ESpec:
- Index into ESpec string table
- Parse encoding specification for compression details
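
The page-index step can be sketched with `partition_point` from the standard library. The function name `find_page` and the fixed 16-byte keys are assumptions; the index is expected to be sorted by first key, as described above.

```rust
/// Return the index of the only page that could contain `key`,
/// given each page's first key in ascending order.
pub fn find_page(first_keys: &[[u8; 16]], key: &[u8; 16]) -> Option<usize> {
    // Count the pages whose first key is <= the target key.
    let candidates = first_keys.partition_point(|first| first <= key);
    // The last such page is the candidate; none means the key sorts
    // before the first page and cannot be present.
    candidates.checked_sub(1)
}
```

Once the page is known, its entries are scanned linearly for an exact key match.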

Usage

Parsing

use cascette_formats::encoding::EncodingFile;

// From decompressed data
let encoding = EncodingFile::parse(&data)?;

// From BLTE-encoded CDN data
let encoding = EncodingFile::parse_blte(&blte_data)?;

Content Key Lookup

use cascette_crypto::ContentKey;

// Single lookup (binary search on page index, linear within page)
if let Some(ekey) = encoding.find_encoding(&content_key) {
    println!("Encoding key: {:?}", ekey);
}

// Get all encoding keys for a content key
let ekeys = encoding.find_all_encodings(&content_key);

// Batch lookup (sort-merge across pages)
let results = encoding.batch_find_encodings(&content_keys);

EKey to ESpec Lookup

use cascette_crypto::EncodingKey;

if let Some(espec) = encoding.find_espec(&encoding_key) {
    println!("Compression spec: {}", espec);
}

Building

use cascette_formats::encoding::{EncodingBuilder, CKeyEntryData, EKeyEntryData};

let mut builder = EncodingBuilder::new(); // 4KB pages
builder.add_ckey_entry(CKeyEntryData {
    content_key,
    file_size: 524_288,
    encoding_keys: vec![encoding_key],
});
builder.add_ekey_entry(EKeyEntryData {
    encoding_key,
    espec: "z".to_string(),
    file_size: 187_234,
});
let encoding_file = builder.build()?;

Page Structure

All pages are loaded eagerly. Each page preserves its original binary data for byte-exact round-trip reconstruction:

// Page<T> holds parsed entries and raw bytes
pub struct Page<T> {
    pub entries: Vec<T>,
    pub original_data: Vec<u8>,
}

// IndexEntry holds first key + MD5 checksum for integrity
pub struct IndexEntry {
    pub first_key: [u8; 16],
    pub checksum: [u8; 16],
}

All multi-byte header and page fields are big-endian.

ESpec Integration

The ESpec strings define how files are encoded:

Common Patterns

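Uncompressed: Empty string or n
ZLib: z
Partial compression: b:{1000=z,500=n}
- First 1000 bytes: ZLib compressed
- Next 500 bytes: Uncompressed

Parsing ESpec

#[derive(Debug, PartialEq)]
enum ESpecOp {
    None,
    ZLib,
    // One `size=spec` chunk of a block table; `size` is None for `*`
    Block { size: Option<u64>, op: Box<ESpecOp> },
}

// Simplified parser: handles "", "n", "z", and flat "b:{size=spec,...}"
// tables. Size suffixes, repeat counts, zlib parameters, and encryption
// specs are not handled.
fn parse_espec(spec: &str) -> Vec<ESpecOp> {
    let Some(body) = spec.strip_prefix("b:{").and_then(|s| s.strip_suffix('}')) else {
        return vec![parse_simple(spec)];
    };

    body.split(',')
        .filter_map(|chunk| chunk.split_once('='))
        .map(|(size, op)| ESpecOp::Block {
            // "*" (or any non-numeric size) becomes None
            size: size.parse().ok(),
            op: Box::new(parse_simple(op)),
        })
        .collect()
}

fn parse_simple(spec: &str) -> ESpecOp {
    match spec {
        "z" => ESpecOp::ZLib,
        // "", "n", and anything unrecognized
        _ => ESpecOp::None,
    }
}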

Multi-Version Support

Files can have multiple encoding keys (different compression/encryption):

struct CKeyEntry {
    ekey_count: u8,        // Usually 1, can be 2+
    file_size: u64,        // Same for all versions
    ckey: [u8; 16],        // Content key
    ekeys: Vec<[u8; 16]>,  // Multiple encoding keys
}

Use cases include different regional encryption and progressive quality levels.

Performance Considerations

Memory-Mapped Access

For large encoding files (100MB+):

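use std::{fs::File, io, path::Path};

use memmap2::{Mmap, MmapOptions};

struct EncodingFile {
    mmap: Mmap,
    header: EncodingHeader,
}

impl EncodingFile {
    fn open(path: &Path) -> io::Result<Self> {
        let file = File::open(path)?;
        // SAFETY: the mapping is only sound while no other process
        // truncates or rewrites the file.
        let mmap = unsafe { MmapOptions::new().map(&file)? };

        // Parse the fixed 22-byte header from the start of the mapping.
        // EncodingHeader::read is assumed to return io::Result<EncodingHeader>.
        let bytes = mmap.get(..22).ok_or(io::ErrorKind::UnexpectedEof)?;
        let header = EncodingHeader::read(bytes)?;

        Ok(Self { mmap, header })
    }
}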

Page Caching

Cache frequently accessed pages:

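use std::sync::Arc;

use lru::LruCache; // assumes the `lru` crate

struct PageCache {
    // Page number -> parsed page; least recently used pages are evicted
    entries: LruCache<u32, Arc<CKeyPage>>,
}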

Validation

Checksums

Each page has an MD5 checksum in the index:

fn validate_page(index: &PageIndex, data: &[u8]) -> bool {
    let computed = md5::compute(data);
    computed.0 == index.page_hash
}

Size Constraints

Page sizes must be > 0 (no power-of-2 requirement enforced)
Key sizes in range [1, 16] bytes
Page counts must be > 0
ESpec size must be > 0
File sizes use 40-bit integers (up to 1TB)

File’s Own ESpec

After all the data structures, the encoding file contains its own ESpec string describing how it’s compressed. This self-referential metadata is an intentional, documented feature of the NGDP format.

Community Documentation

The wowdev.wiki TACT specification explicitly lists this as the 5th component:

Header
Encoding specification data (ESpec)
Content key → encoding key table
Encoding key → encoding spec table
“Encoding specification data for the encoding file itself”

Reference Implementation

TACT.Net explicitly handles this in EncodingFile.cs:

Line 151: // remainder is an ESpec block for the file itself
Implements GetFileESpec() method to generate this when writing

Real-World Examples

wow_classic 5.5.0.62655 (60 bytes):

b:{22=n,76025=z,223424=n,28598272=n,146656=n,18771968=n,*=z}

wow_classic_era 1.15.7.61582 (55 bytes):

b:{22=n,2069=z,65536=n,8388608=n,43008=n,5505024=n,*=z}

Meaning:

22=n: Header (22 bytes) uncompressed
76025=z: ESpec table compressed with ZLib
223424=n: CKey index uncompressed
28598272=n: CKey pages uncompressed
146656=n: EKey index uncompressed
18771968=n: EKey pages uncompressed
*=z: Remainder (the file’s own ESpec) compressed

This self-referential design allows files to describe their own compression structure using the same ESpec format as all other files.
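
The chunk sizes in both examples line up with the header fields: each index is `page_count × (key_size + 16)` bytes and each page region is `page_count × page_size` bytes. The sketch below reproduces the two strings from inferred page counts. The function `own_espec` is hypothetical, the page counts are back-calculated from the chunk sizes assuming 4 KB pages and 16-byte keys, and the n/z pattern is simply copied from the examples.

```rust
/// Rebuild the encoding file's own ESpec string from header-level sizes.
pub fn own_espec(
    espec_size: u64,
    ckey_pages: u64,
    ekey_pages: u64,
    page_size: u64,
    key_size: u64,
) -> String {
    // Each page index entry is a first key plus a 16-byte MD5 page hash.
    let index_entry = key_size + 16;
    format!(
        "b:{{22=n,{}=z,{}=n,{}=n,{}=n,{}=n,*=z}}",
        espec_size,
        ckey_pages * index_entry,
        ckey_pages * page_size,
        ekey_pages * index_entry,
        ekey_pages * page_size,
    )
}
```

With these assumptions, `own_espec(2069, 2048, 1344, 4096, 16)` yields the wow_classic_era string and `own_espec(76025, 6982, 4583, 4096, 16)` yields the wow_classic one.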

Common Issues

Page Boundary Errors: Entries never span pages; unused space at the end of a page is padding
Endianness: All multi-byte values are big-endian
ESpec Index: Zero-based into string table
CKey Padding: Entries with ekey_count = 0 indicate end of page data
EKey Padding: Entries with espec_index = 0xFFFFFFFF or all-zero keys indicate padding (see Padding Detection above)
File Size: Remember to account for the file’s own ESpec at the end

Real-World Example

Using wow_classic_era 1.15.7.61582:

Encoding file: bbf06e7476382cfaa396cff0049d356b

Header:
  Magic: 0x454E ('EN')
  Version: 1
  CKey/EKey size: 16 bytes each
  CKey pages: 4KB × 2,048 pages
  EKey pages: 4KB × 1,344 pages
  ESpec table: 2,069 bytes

Example CKey entry:
  Content Key: 3ce96e7a9e3b6f5c9d99c8b4e0a4f3d2
  EKey count: 1
  File size: 524,288 bytes (512KB)
  Encoding Key: 7f8a9b3c4d5e6f7081929a3b4c5d6e7f

Corresponding EKey entry:
  Encoding Key: 7f8a9b3c4d5e6f7081929a3b4c5d6e7f
  ESpec index: 1 (points to "z" - ZLib)
  Compressed size: 187,234 bytes

This shows a typical game asset compressed from 512KB to 183KB using ZLib.

Implementation Flow

use cascette_formats::encoding::EncodingFile;
use cascette_crypto::ContentKey;

// 1. Parse encoding file from BLTE-encoded CDN data
let encoding = EncodingFile::parse_blte(&cdn_data)?;

// 2. Look up content by content key
let ekey = encoding.find_encoding(&content_key)
    .ok_or("content key not found")?;

// 3. Optionally get the compression spec
let espec = encoding.find_espec(&ekey);

// 4. Fetch actual file from CDN using encoding key, then decompress

Version History

The Encoding file format currently has only one version:

Version 1 (Current)

Header Size: 22 bytes
Magic: “EN” (0x454E)
Features:
- Content key to encoding key mapping
- Dual page index system (CKey and EKey pages)
- ESpec string table for compression metadata
- 40-bit file sizes (up to 1TB per file)
- Multiple encoding keys per content key support
- Page-based binary search
- MD5 page checksums for integrity

Version Detection

All known encoding files use version 1. The version field is at offset 2 in the header. If future versions are introduced, parsers should check this field after validating the “EN” magic bytes.

References

See ESpec Documentation for encoding specifications
See BLTE Format for container structure
See CDN Architecture for retrieval patterns
See Format Transitions for format evolution tracking

Root File Format

The Root file is the primary catalog of all files stored in CASC archives. It maps file paths or FileDataIDs to content keys, enabling game clients to locate and retrieve specific assets.

Overview

The Root file serves as the master index for all game content:

Maps FileDataIDs to content keys
Supports multiple locales and content flags
Groups files into blocks for efficient lookup
Handles both named and unnamed entries

File Structure

The Root file is BLTE-encoded and organized into blocks:

[BLTE Container]
  [Header]
  [Block 1]
  [Block 2]
  ...
  [Block N]

Binary Format

Version Detection

The Root file format has evolved significantly:

Pre-30080: No MFST magic, raw block data
Build 30080+ (v2): MFST magic with file counts
Build 50893+ (v3): Added header_size/version fields
Build 58221+ (v4): Extended content flags to 40 bits

Header Structures

Version 2 (Build 30080+)

struct RootHeaderV2 {
    uint32_t magic;              // 'MFST' (0x4D465354) or 'TSFM' (0x5453464D)
    uint32_t total_file_count;   // Total number of files
    uint32_t named_file_count;   // Number of named entries
};

Note: Some builds use ‘TSFM’ magic instead of ‘MFST’. This appears to be a little-endian representation. Both should be accepted as valid.

Version 3 (Build 50893+)

struct RootHeaderV3 {
    uint32_t magic;              // 'MFST' (0x4D465354) or 'TSFM' (0x5453464D)
    uint32_t header_size;        // Size of header (20 bytes)
    uint32_t version;            // Version (1)
    uint32_t total_file_count;   // Total number of files
    uint32_t named_file_count;   // Number of named entries
    uint32_t padding;            // Padding (0)
};

Note: Version 3 also uses TSFM magic in observed builds, maintaining consistency with Version 2.

Version Detection Heuristic: After reading the magic, check the next two u32 values. If the first value (header_size) is in range [16, 100) and the second value (version) is less than 10, the file is v3+. Otherwise treat the first value as total_file_count (v2). Version 1 maps to V2 block format.
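
The heuristic might be coded as below. This is a sketch: `RootLayout` and `detect_root_layout` are hypothetical names, and the header integers are assumed to be little-endian, which the text above does not state explicitly.

```rust
#[derive(Debug, PartialEq)]
pub enum RootLayout {
    /// No MFST/TSFM magic: raw block data (pre-30080).
    Legacy,
    /// Magic followed directly by the file counts.
    V2,
    /// Magic followed by header_size and version.
    V3Plus { header_size: u32, version: u32 },
}

pub fn detect_root_layout(data: &[u8]) -> Option<RootLayout> {
    let magic = data.get(..4)?;
    if magic != b"MFST" && magic != b"TSFM" {
        return Some(RootLayout::Legacy);
    }
    // Assumption: little-endian u32 fields.
    let le32 = |o: usize| -> Option<u32> {
        Some(u32::from_le_bytes(data.get(o..o + 4)?.try_into().ok()?))
    };
    let first = le32(4)?;
    let second = le32(8)?;
    if (16..100).contains(&first) && second < 10 {
        Some(RootLayout::V3Plus { header_size: first, version: second })
    } else {
        Some(RootLayout::V2)
    }
}
```

Because this is a heuristic, a V2 file with fewer than 100 total files and fewer than 10 named files would be misclassified; that seems unlikely for real game builds but matters for small test fixtures.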

Block Structure

Each block contains file entries for specific locale and content flag combinations. Important: The block header format changed significantly between V1 and V2+.

V1 Block Header (Pre-30080, 12 bytes)

V1 files have no MFST/TSFM magic and use a 12-byte block header with interleaved record format:

struct RootBlockHeaderV1 {
    uint32_t num_records;        // Number of records in block
    uint32_t content_flags;      // Content flags (32-bit)
    uint32_t locale_flags;       // Locale flags (language/region)

    // FileDataID deltas (delta-encoded)
    int32_t fileDataIDDeltas[num_records];

    // Interleaved record data (content_key + name_hash per record)
    RootRecordInterleaved records[num_records];
};

V2+ Block Header (Build 30080+, 17 bytes)

V2 and later versions have MFST/TSFM magic and use a 17-byte block header with separated arrays. Per wowdev.wiki documentation for Version 2 (11.1.0+):

#pragma pack(push, 1)
struct RootBlockHeaderV2 {
    uint32_t num_records;        // Number of records in block
    uint32_t locale_flags;       // Locale flags (MOVED - was third in V1!)
    uint32_t content_flags;      // Content flags (was second in V1)
    uint32_t unk2;               // Unknown field 2
    uint8_t  unk3;               // Unknown field 3 (flags via bit-shift)

    // FileDataID deltas (delta-encoded)
    int32_t fileDataIDDeltas[num_records];

    // Separated arrays (all content_keys, then all name_hashes)
    uint8_t content_keys[num_records][16];
    uint8_t name_hashes[num_records][8];  // Optional based on flags
};
#pragma pack(pop)

Critical Implementation Note: The field order change from V1 to V2+ is a common source of parsing bugs. In V1, the order is num_records, content_flags, locale_flags. In V2+, the order is num_records, locale_flags, content_flags, unk2, unk3.
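
To make the field-order difference concrete, here is a sketch that reads only the fixed part of a block header in either layout. `BlockHeader` and `parse_block_header` are illustrative names, fields are assumed little-endian, and the 18-byte V4 layout is not covered.

```rust
#[derive(Debug, PartialEq)]
pub struct BlockHeader {
    pub num_records: u32,
    pub content_flags: u32,
    pub locale_flags: u32,
}

/// Returns the parsed header and the number of header bytes consumed.
/// `has_magic` is true for files that start with MFST/TSFM (V2+).
pub fn parse_block_header(data: &[u8], has_magic: bool) -> Option<(BlockHeader, usize)> {
    // Read the i-th little-endian u32 (assumed byte order).
    let word = |i: usize| -> Option<u32> {
        let bytes = data.get(i * 4..i * 4 + 4)?;
        Some(u32::from_le_bytes(bytes.try_into().ok()?))
    };
    let num_records = word(0)?;
    if has_magic {
        // V2+: locale, then content, then unk2 (u32) and unk3 (u8).
        data.get(..17)?;
        let header = BlockHeader {
            num_records,
            locale_flags: word(1)?,
            content_flags: word(2)?,
        };
        Some((header, 17))
    } else {
        // V1: content, then locale.
        let header = BlockHeader {
            num_records,
            content_flags: word(1)?,
            locale_flags: word(2)?,
        };
        Some((header, 12))
    }
}
```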

V4 Extended Content Flags

V4 (Build 58221+) extends content flags to 40 bits, increasing the block header to 18 bytes (the content_flags field grows from 4 to 5 bytes). The 40-bit value is read as a u32 (4 bytes) plus a u8 (1 byte):

uint32_t content_flags_low;   // Bits 0-31
uint8_t  content_flags_high;  // Bits 32-39
// Combined: content_flags = (uint64_t)content_flags_low | ((uint64_t)content_flags_high << 32)

Record Formats

Old Format (Interleaved)

struct RootRecordOld {
    uint8_t content_key[16];     // MD5 content key
    uint8_t name_hash[8];        // Jenkins96 name hash (optional)
};

New Format (Separated)

struct RootRecordNew {
    // Arrays stored separately
    uint8_t content_keys[num_records][16];
    uint8_t name_hashes[num_records][8];  // Optional
};

Content Flags

Content flags specify platform, architecture, and file attributes:

32-bit Flags (v2-v3)

Values match CascLib (CascLib.h), TACTSharp, and WoWDev wiki:

Value	Flag	Description
0x00000004	Install	Install manifest entry
0x00000008	LoadOnWindows	Windows platform
0x00000010	LoadOnMacOS	macOS platform
0x00000020	x86_32	32-bit x86 architecture
0x00000040	x86_64	64-bit x86 architecture
0x00000080	LowViolence	Censored content
0x00000100	DoNotLoad	Skip file
0x00000800	UpdatePlugin	Launcher plugin
0x00008000	Arm64	ARM64 architecture
0x08000000	Encrypted	Encrypted content
0x10000000	NoNameHash	No name hash in block
0x20000000	UncommonResolution	Non-standard resolution
0x40000000	Bundle	Bundled content
0x80000000	NoCompression	Uncompressed

40-bit Flags (v4+)

Build 58221+ extends to 40 bits, stored as u32 + u8:

Bits 0-31: Standard content flags (same as v2/v3)
Bits 32-39: Extended flags (single byte, shifted left by 32)

Common combinations:

0x00000000: All platforms, default
0x00000008: Windows only
0x00000010: macOS only
0x08000000: Encrypted content
0x10000000: No name hash present
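
A small sketch of working with these flags as a single 64-bit value. The constant and function names are made up for the example; the "no platform bit means all platforms" rule follows the `0x00000000` entry in the list above.

```rust
pub const LOAD_ON_WINDOWS: u64 = 0x0000_0008;
pub const LOAD_ON_MACOS: u64 = 0x0000_0010;
pub const ENCRYPTED: u64 = 0x0800_0000;
pub const NO_NAME_HASH: u64 = 0x1000_0000;

/// Combine the u32 + u8 halves of a 40-bit flag field.
/// Widening before the shift avoids overflowing a 32-bit value.
pub fn combine_content_flags(low: u32, high: u8) -> u64 {
    u64::from(low) | (u64::from(high) << 32)
}

/// A block with neither platform bit set applies to every platform.
pub fn loads_on_windows(flags: u64) -> bool {
    let platform = flags & (LOAD_ON_WINDOWS | LOAD_ON_MACOS);
    platform == 0 || platform & LOAD_ON_WINDOWS != 0
}
```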

Locale Flags

32-bit field representing language/region:

Value	Locale	Description
0x00000002	enUS	English (US)
0x00000004	koKR	Korean
0x00000010	frFR	French
0x00000020	deDE	German
0x00000040	zhCN	Chinese (Simplified)
0x00000080	esES	Spanish (Spain)
0x00000100	zhTW	Chinese (Traditional)
0x00000200	enGB	English (UK)
0x00000400	enCN	English (China)
0x00000800	enTW	English (Taiwan)
0x00001000	esMX	Spanish (Mexico)
0x00002000	ruRU	Russian
0x00004000	ptBR	Portuguese (Brazil)
0x00008000	itIT	Italian
0x00010000	ptPT	Portuguese (Portugal)
0xFFFFFFFF	All	All locales

FileDataID Delta Encoding

FileDataIDs use delta encoding for compression:

fn decode_file_data_ids(deltas: &[i32]) -> Vec<u32> {
    let mut ids = Vec::with_capacity(deltas.len());
    let mut current_id = 0u32;

    for &delta in deltas {
        // current_id starts at 0, so the first delta acts as an absolute ID;
        // wrapping arithmetic avoids overflow panics on malformed input
        current_id = current_id.wrapping_add_signed(delta);
        ids.push(current_id);

        // Important: Increment for next iteration, so a delta of 0
        // means "the next sequential ID"
        current_id = current_id.wrapping_add(1);
    }

    ids
}

Note: After emitting each entry, the algorithm increments current_id by 1 before applying the next delta, so a run of consecutive FileDataIDs is stored as zero deltas.
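
To make the increment rule concrete, here is a sketch of the inverse operation next to a compact decoder. The names `encode_deltas` and `decode_deltas` are hypothetical; the encoder assumes the IDs are already sorted in ascending order.

```rust
/// Encode ascending FileDataIDs as deltas (inverse of the decoding rule above).
fn encode_deltas(ids: &[u32]) -> Vec<i32> {
    let mut deltas = Vec::with_capacity(ids.len());
    // ID the decoder would produce for a zero delta
    let mut expected = 0u32;
    for (i, &id) in ids.iter().enumerate() {
        if i == 0 {
            deltas.push(id as i32); // first entry is stored directly
        } else {
            deltas.push(id.wrapping_sub(expected) as i32);
        }
        expected = id.wrapping_add(1);
    }
    deltas
}

/// Decode deltas back into FileDataIDs.
fn decode_deltas(deltas: &[i32]) -> Vec<u32> {
    let mut ids = Vec::with_capacity(deltas.len());
    let mut current = 0u32;
    for (i, &delta) in deltas.iter().enumerate() {
        current = if i == 0 {
            delta as u32
        } else {
            current.wrapping_add_signed(delta)
        };
        ids.push(current);
        current = current.wrapping_add(1);
    }
    ids
}
```

For the IDs 100, 101, 102, 110 the stored deltas are 100, 0, 0, 7: the two consecutive IDs cost a zero each, and the gap of 8 is stored as 7 because of the implicit increment.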

Lookup Process

Parse Root file: Decompress BLTE, read header and blocks
Filter by flags: Select blocks matching desired locale/content
Find FileDataID: Binary search or iterate through blocks
Extract content key: Retrieve corresponding MD5 hash
Resolve via encoding: Use content key to find encoding key

Name Hash Calculation

For named files, Jenkins96 hash (hashlittle2) is used:

#![allow(unused)]
fn main() {
fn jenkins96_hash(filename: &str) -> u64 {
    // Normalize path: uppercase with backslashes (matching CascLib's
    // NormalizeFileName_UpperBkSlash)
    let normalized = filename.to_uppercase().replace('/', "\\");
    let bytes = normalized.as_bytes();

    // Jenkins hashlittle2 with pc=0, pb=0
    let hash = Jenkins96::hash(bytes);

    // Return (pc << 32) | pb directly (no word swap)
    // Matches CascLib's CalcNormNameHash
    hash.hash64
}
}

Important Jenkins96 Details:

Paths are normalized to uppercase with backslashes (not forward slashes)
The hash is 64-bit (8 bytes) not 96-bit despite the name
Some blocks have NoNameHash flag, omitting name hashes entirely
Uses Bob Jenkins’ lookup3.c algorithm (hashlittle2 function)
Processes data in 12-byte chunks with little-endian byte order
The 0xDEADBEEF constant is added during initialization
Python validation tool available in cascette-py project: https://github.com/wowemulation-dev/cascette-py

Example Hashes:

Empty string: 0xDEADBEEFDEADBEEF
Interface\Icons\INV_Misc_QuestionMark.blp: 0x9EB59E3C76124837
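
The details above can be sketched as a self-contained port of hashlittle2 from Bob Jenkins' lookup3.c. This is an illustrative sketch, not the cascette-rs implementation; the function names are assumptions, and only the empty-string value listed above is checked here.

```rust
fn rot(x: u32, k: u32) -> u32 {
    x.rotate_left(k)
}

fn le_word(bytes: &[u8]) -> u32 {
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}

/// hashlittle2 from lookup3.c; returns (pc, pb).
fn hashlittle2(data: &[u8], pc: u32, pb: u32) -> (u32, u32) {
    let mut a = 0xdead_beef_u32.wrapping_add(data.len() as u32).wrapping_add(pc);
    let mut b = a;
    let mut c = a.wrapping_add(pb);

    // Consume 12-byte chunks while more than 12 bytes remain
    let mut rest = data;
    while rest.len() > 12 {
        a = a.wrapping_add(le_word(&rest[0..4]));
        b = b.wrapping_add(le_word(&rest[4..8]));
        c = c.wrapping_add(le_word(&rest[8..12]));
        // mix(a, b, c)
        a = a.wrapping_sub(c); a ^= rot(c, 4);  c = c.wrapping_add(b);
        b = b.wrapping_sub(a); b ^= rot(a, 6);  a = a.wrapping_add(c);
        c = c.wrapping_sub(b); c ^= rot(b, 8);  b = b.wrapping_add(a);
        a = a.wrapping_sub(c); a ^= rot(c, 16); c = c.wrapping_add(b);
        b = b.wrapping_sub(a); b ^= rot(a, 19); a = a.wrapping_add(c);
        c = c.wrapping_sub(b); c ^= rot(b, 4);  b = b.wrapping_add(a);
        rest = &rest[12..];
    }

    // Zero-length tail: lookup3 returns without the final mix
    if rest.is_empty() {
        return (c, b);
    }

    // Remaining 1..=12 bytes, zero-padded to three little-endian words
    let mut tail = [0u8; 12];
    tail[..rest.len()].copy_from_slice(rest);
    a = a.wrapping_add(le_word(&tail[0..4]));
    b = b.wrapping_add(le_word(&tail[4..8]));
    c = c.wrapping_add(le_word(&tail[8..12]));

    // final(a, b, c)
    c ^= b; c = c.wrapping_sub(rot(b, 14));
    a ^= c; a = a.wrapping_sub(rot(c, 11));
    b ^= a; b = b.wrapping_sub(rot(a, 25));
    c ^= b; c = c.wrapping_sub(rot(b, 16));
    a ^= c; a = a.wrapping_sub(rot(c, 4));
    b ^= a; b = b.wrapping_sub(rot(a, 14));
    c ^= b; c = c.wrapping_sub(rot(b, 24));
    (c, b)
}

/// Normalize the path, then return (pc << 32) | pb.
fn name_hash(path: &str) -> u64 {
    let normalized = path.to_uppercase().replace('/', "\\");
    let (pc, pb) = hashlittle2(normalized.as_bytes(), 0, 0);
    (u64::from(pc) << 32) | u64::from(pb)
}
```

Because the path is normalized first, `interface/icons/a.blp` and `INTERFACE\ICONS\A.BLP` hash to the same value.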

Implementation Example

#![allow(unused)]
fn main() {
struct RootFile {
    header: RootHeader,
    blocks: Vec<RootBlock>,
}

impl RootFile {
    pub fn find_file(&self, file_data_id: u32) -> Option<MD5Hash> {
        for block in &self.blocks {
            // Check if block matches desired flags
            if !self.matches_flags(block) {
                continue;
            }

            // Search for FileDataID
            if let Some(idx) = block.find_file_index(file_data_id) {
                return Some(block.records[idx].content_key);
            }
        }
        None
    }
}
}

Version History

Build 18125 (6.0.1): Initial CASC Root format (V1)
- No magic header
- 12-byte block header: num_records, content_flags, locale_flags
- Interleaved record format: (ckey, name_hash) per record
Build 30080 (8.2.0): Added MFST magic signature (V2)
- MFST/TSFM magic header with file counts
- 17-byte block header: num_records, locale_flags, content_flags_1, content_flags_2, content_flags_3
- Field order changed: locale_flags moved before content_flags
- Combined content flags: content_flags_1 | content_flags_2 | (content_flags_3 << 17)
- Separated array format: all ckeys, then all name_hashes
Build 50893 (10.1.7): Added header_size/version fields (V3)
- Extended header with header_size, version, padding fields
- Same 17-byte block header format as V2
Build 58221 (11.1.0): Extended content flags to 40 bits (V4)
- 18-byte block header (content_flags grows from 4 to 5 bytes)
- 40-bit content flags stored as u32 + u8
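
As a sketch of the V2/V3 layout listed above, the function below reads the 17-byte block header and combines the three content flag fields. It assumes little-endian integers (the version detection code reads the file header that way); the struct and function names are hypothetical.

```rust
#[derive(Debug, PartialEq)]
struct BlockHeaderV2 {
    num_records: u32,
    locale_flags: u32,
    /// content_flags_1 | content_flags_2 | (content_flags_3 << 17)
    content_flags: u32,
}

/// Parse a 17-byte V2/V3 block header. Little-endian is an assumption here.
fn parse_block_header_v2(data: &[u8]) -> Option<BlockHeaderV2> {
    if data.len() < 17 {
        return None;
    }
    let u32_at = |offset: usize| {
        u32::from_le_bytes([
            data[offset],
            data[offset + 1],
            data[offset + 2],
            data[offset + 3],
        ])
    };
    let flags_1 = u32_at(8);
    let flags_2 = u32_at(12);
    let flags_3 = u32::from(data[16]);
    Some(BlockHeaderV2 {
        num_records: u32_at(0),
        locale_flags: u32_at(4), // locale_flags comes before content flags in V2+
        content_flags: flags_1 | flags_2 | (flags_3 << 17),
    })
}
```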

Version Detection Code

#![allow(unused)]
fn main() {
fn detect_root_version(data: &[u8]) -> RootVersion {
    if data.len() < 4 {
        return RootVersion::Invalid;
    }

    // Check for MFST or TSFM magic
    let magic = &data[0..4];
    if magic != b"MFST" && magic != b"TSFM" {
        return RootVersion::V1; // Pre-30080, no magic
    }

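    // MFST/TSFM files need at least 12 bytes for the two u32 values after magic
    if data.len() < 12 {
        return RootVersion::Invalid;
    }

    // Read the two u32 values after magic
    let value1 = u32::from_le_bytes(data[4..8].try_into().unwrap());
    let value2 = u32::from_le_bytes(data[8..12].try_into().unwrap());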

    // Heuristic: header_size in [16, 100) and version < 10
    // indicates v3+ with explicit header_size/version fields
    if (16..100).contains(&value1) && value2 < 10 {
        match value2 {
            4.. => RootVersion::V4,
            _ => RootVersion::V3, // version 1-3 all use V2/V3 block format
        }
    } else {
        RootVersion::V2 // 30080+, value1 is total_file_count
    }
}
}

Parser Implementation Status

The Python parser (cascette-py) currently supports:

Version detection (MFST/TSFM magic)
Version 1-3 parsing
Block-based extraction
Content key retrieval
Delta encoding detection (identifies but doesn’t decode)

The parser can extract FileDataID to content key mappings from all current WoW root file versions.

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

Common Issues

V2 block header size: V2+ uses a 17-byte block header, not 12 bytes like V1. Using the wrong header size causes all subsequent parsing to fail with garbage FileDataIDs and content keys.
V2 field order change: V2+ swapped the positions of locale_flags and content_flags. In V1: num_records, content_flags, locale_flags. In V2+: num_records, locale_flags, content_flags_1, content_flags_2, content_flags_3.
Multiple matches: Same file may exist in multiple blocks with different locales
Missing entries: Not all FileDataIDs have corresponding entries
Flag interpretation: Game-specific flag meanings vary
Delta overflow: Large gaps in FileDataIDs can cause integer overflow

Implementation Notes

Version Detection Heuristic

The version detection uses value2 < 10 to identify extended headers, which is broader than the strict matches!(value2, 1..=4) check. Version 1 is accepted and maps to V2 block format (17-byte header, locale_flags first). This matches CascLib and TACTSharp behavior. The heuristic may need tightening if future versions use values in the 5-9 range for non-version purposes.

Block Header Dispatch

The current dispatch is verified correct:

Plain V1 files (no MFST/TSFM magic) use the 12-byte header (content_flags first)
All MFST/TSFM files (including Classic Era) use the 17-byte header (locale_flags first)
V4 files use the 18-byte header (40-bit content flags)

The V2 17-byte format applies to all MFST/TSFM files regardless of the header version field value. The 12-byte format is only used for pre-magic V1 files.

References

See Encoding Documentation for content key resolution
See BLTE Format for container structure
See CDN Architecture for file retrieval
wowdev.wiki TACT documentation - Authoritative source for CASC/TACT format specifications including Root file structure

Install Manifest Format

The Install manifest tracks which game files should be installed on disk and manages file tags for selective installation based on system requirements and user preferences.

Overview

The Install manifest maps content keys to installation paths and uses a tag bitmap system for selective installation based on platform, architecture, and locale. File sizes in entries support installation size estimation.

File Structure

The Install manifest is BLTE-encoded and contains:

[BLTE Container]
  [Header]
  [Tag Section]
  [File Entries]

Binary Format

struct InstallHeader {
    uint16_t magic;              // 'IN' (0x494E)
    uint8_t  version;            // Version (1 or 2)
    uint8_t  ckey_length;        // Content key length in bytes (16)
    uint16_t tag_count;          // Number of tags (big-endian)
    uint32_t entry_count;        // Number of file entries (big-endian)

    // Version 2+ fields (6 additional bytes, total 16 bytes)
    uint8_t  content_key_size;   // Content key size (Agent.exe) / loose file type (CascLib)
    uint32_t entry_count_v2;     // Additional entry count (big-endian)
    uint8_t  unknown;            // Unknown byte
};

For version 1, the content key size is derived as ckey_length + 4 (content key + 4-byte file size). Version 2 specifies content_key_size explicitly.
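
A minimal sketch of reading this header follows. The field offsets come from the struct above; the Rust type, the function name, and the error strings are illustrative assumptions.

```rust
#[derive(Debug, PartialEq)]
struct InstallHeader {
    version: u8,
    ckey_length: u8,
    tag_count: u16,
    entry_count: u32,
    /// Present only in version 2 headers.
    content_key_size: Option<u8>,
}

/// Parse the 10-byte (v1) or 16-byte (v2) header. Hypothetical helper.
fn parse_install_header(data: &[u8]) -> Result<InstallHeader, String> {
    if data.len() < 10 {
        return Err(format!("truncated header: {} bytes", data.len()));
    }
    if &data[0..2] != b"IN" {
        return Err("bad magic".to_string());
    }
    let version = data[2];
    if version == 0 || version > 2 {
        return Err(format!("unsupported version {version}"));
    }
    let content_key_size = if version == 2 {
        if data.len() < 16 {
            return Err("truncated version 2 header".to_string());
        }
        Some(data[10])
    } else {
        None
    };
    Ok(InstallHeader {
        version,
        ckey_length: data[3],
        tag_count: u16::from_be_bytes([data[4], data[5]]),
        entry_count: u32::from_be_bytes([data[6], data[7], data[8], data[9]]),
        content_key_size,
    })
}
```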

Tag Section

Tags categorize files for selective installation. Each tag consists of:

struct InstallTag {
    char     name[];             // Null-terminated tag name
    uint16_t type;               // Tag type (big-endian)
    uint8_t  bit_mask[];         // Bit mask ((entry_count + 7) / 8 bytes)
};

Important: The bit mask uses big-endian (MSB-first) bit ordering within each byte:

Bit 7 (MSB) corresponds to file index byte_index * 8 + 0
Bit 0 (LSB) corresponds to file index byte_index * 8 + 7
The mask for a given file index is 0x80 >> (file_index % 8)
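
The tag layout and bit ordering above can be sketched as follows. The Rust struct and function names are hypothetical; the parser returns the tag together with the number of bytes consumed so that tags can be read back to back.

```rust
struct InstallTag {
    name: String,
    tag_type: u16,
    bit_mask: Vec<u8>,
}

/// Parse one tag: null-terminated name, big-endian u16 type, then the bit mask.
fn parse_install_tag(data: &[u8], entry_count: usize) -> Option<(InstallTag, usize)> {
    let name_len = data.iter().position(|&b| b == 0)?;
    let type_start = name_len + 1; // skip the null terminator
    let mask_start = type_start + 2;
    let mask_end = mask_start + (entry_count + 7) / 8;
    if data.len() < mask_end {
        return None;
    }
    let tag = InstallTag {
        name: String::from_utf8_lossy(&data[..name_len]).into_owned(),
        tag_type: u16::from_be_bytes([data[type_start], data[type_start + 1]]),
        bit_mask: data[mask_start..mask_end].to_vec(),
    };
    Some((tag, mask_end))
}

/// MSB-first lookup: file index 0 maps to bit 7 of byte 0.
fn tag_has_file(tag: &InstallTag, file_index: usize) -> bool {
    tag.bit_mask
        .get(file_index / 8)
        .map_or(false, |&byte| (byte & (0x80u8 >> (file_index % 8))) != 0)
}
```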

File Entry

File entries follow the tag section:

struct InstallFileEntry {
    char     path[];             // Null-terminated file path
    uint8_t  content_key[16];    // MD5 content key
    uint32_t file_size;          // File size (big-endian)
};

Tag associations are determined by bit positions in each tag’s bit mask.

Tag System

Tag Types

Type	Value	Description	Examples
Platform	0x0001	Operating system tags	Windows, OSX, Android, IOS
Architecture	0x0002	CPU architecture tags	x86_32, x86_64, arm64
Locale	0x0003	Language/region tags	enUS, deDE, frFR
Category	0x0004	Content category tags	speech, text
Unknown	0x0005	Unknown tag type	(seen in manifests)
Component	0x0010	Component tags	game, launcher
Version	0x0020	Version tags	live, ptr, beta
Optimization	0x0040	Optimization tags	retail, debug
Region	0x0080	Region tags	US, EU, KR
Device	0x0100	Device tags	desktop, mobile
Mode	0x0200	Mode tags	online, offline
Branch	0x0400	Branch tags	main, experimental
Content	0x0800	Content tags	cinematics, audio
Feature	0x1000	Feature tags	graphics, physics
Expansion	0x2000	Expansion tags	base, expansion1
Alternate	0x4000	Alternate content	Alternate, HighRes
Option	0x8000	Option tags	(optional features)

Common Tags

Platform Tags:

- Windows, OSX, Android, IOS, Web

Architecture Tags:

- x86_32, x86_64, arm64

Locale Tags:

- enUS, enGB, deDE, frFR, esES, esMX, itIT,
  ruRU, koKR, zhTW, zhCN, ptBR, ptPT

Category Tags:

- speech, text

Alternate Tags:

- Alternate, HighRes

Tag Mask Usage

Tags use bit masks to indicate which files they apply to:

#![allow(unused)]
fn main() {
fn should_install(
    file_index: usize,
    tag: &InstallTag,
    selected: bool
) -> bool {
    let byte_index = file_index / 8;
    let bit_offset = file_index % 8;

    if byte_index >= tag.bit_mask.len() {
        return false;
    }

    // Big-endian (MSB-first) bit ordering within bytes: file index 0 maps to bit 7 (MSB)
    let has_tag = (tag.bit_mask[byte_index] & (0x80 >> bit_offset)) != 0;
    has_tag && selected
}
}

Installation Planning

Size Calculation

Calculate installation size for selected tags:

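fn calculate_install_size(
    entries: &[InstallFileEntry],
    selected_tags: &[&InstallTag]
) -> u64 {
    entries.iter().enumerate()
        // Simplified selection: keep an entry if any selected tag's bit mask covers its index
        .filter(|(index, _)| selected_tags.iter().any(|tag| should_install(*index, tag, true)))
        .map(|(_, e)| u64::from(e.file_size))
        .sum()
}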

Path Resolution

Convert relative paths to absolute:

use std::path::{Path, PathBuf};

fn resolve_install_path(
    base_dir: &Path,
    entry: &InstallFileEntry
) -> PathBuf {
    // Paths are stored as raw bytes; replace invalid UTF-8 instead of panicking
    let relative_path = String::from_utf8_lossy(&entry.path);
    base_dir.join(&*relative_path)
}

File Categories

Essential Files

Files with tag mask 0x0000 or 0xFFFF:

Core executables
Essential libraries
Base configuration
Critical game data

Optional Content

Files with specific tag requirements:

High-resolution textures (HighResTextures tag)
Cinematics (Cinematics tag)
Additional languages (locale tags)
Developer tools (DevTools tag)

Implementation Example

struct InstallFile {
    header: InstallHeader,
    tags: Vec<InstallTag>,
    entries: Vec<InstallFileEntry>,
}

impl InstallFile {
    pub fn get_install_list(&self, tags: &[String]) -> Vec<InstallItem> {
        let selected = self.select_tags(tags);

        self.entries.iter().enumerate()
            // Simplified selection: keep entries covered by at least one selected tag
            .filter(|(index, _)| selected.iter().any(|tag| should_install(*index, tag, true)))
            .map(|(_, e)| InstallItem {
                content_key: e.content_key,
                install_path: String::from_utf8_lossy(&e.path).to_string(),
                file_size: e.file_size,
            })
            .collect()
    }

    // Resolve tag names to tag records; unknown names are ignored.
    // Manifests can carry more than 16 tags, so the selection is kept as
    // a list of references rather than a u16 bit mask.
    fn select_tags(&self, tag_names: &[String]) -> Vec<&InstallTag> {
        self.tags.iter()
            .filter(|t| tag_names.iter().any(|name| t.name == *name))
            .collect()
    }
}

Selective Installation

Platform-Specific

Install only files for current platform:

#![allow(unused)]
fn main() {
fn get_platform_tags() -> Vec<String> {
    let mut tags = vec!["Base".to_string()];

    #[cfg(target_os = "windows")]
    tags.push("Windows".to_string());

    #[cfg(target_arch = "x86_64")]
    tags.push("x86_64".to_string());

    tags
}
}

Language Selection

Install specific language assets:

#![allow(unused)]
fn main() {
fn get_locale_tags(selected_locale: &str) -> Vec<String> {
    vec![
        "Base".to_string(),
        selected_locale.to_string(),
    ]
}
}

Optimization Strategies

Parallel Installation

Install multiple files concurrently:

#![allow(unused)]
fn main() {
use rayon::prelude::*;

fn install_files(items: Vec<InstallItem>) {
    items.par_iter()
        .for_each(|item| {
            download_and_install(item);
        });
}
}

Incremental Updates

Track installed files for patching:

#![allow(unused)]
fn main() {
struct InstalledFiles {
    entries: HashMap<PathBuf, InstalledFileInfo>,
}

struct InstalledFileInfo {
    content_key: [u8; 16],
    file_size: u32,
    modified_time: SystemTime,
}
}

Validation

Post-Installation Verification

use std::{error::Error, fs, path::Path};

fn verify_installation(
    install_dir: &Path,
    install_file: &InstallFile,
    selected_tags: &[&InstallTag]
) -> Result<(), Box<dyn Error>> {
    for (index, entry) in install_file.entries.iter().enumerate() {
        // Skip entries that no selected tag covers
        if !selected_tags.iter().any(|tag| should_install(index, tag, true)) {
            continue;
        }

        let path = install_dir.join(&*String::from_utf8_lossy(&entry.path));

        // Verify file exists
        if !path.exists() {
            return Err(format!("Missing file: {}", path.display()).into());
        }

        // Verify file size
        let metadata = fs::metadata(&path)?;
        if metadata.len() != u64::from(entry.file_size) {
            return Err(format!("Size mismatch: {}", path.display()).into());
        }
    }

    Ok(())
}

Repair Process

Detect and repair corrupted installations:

#![allow(unused)]
fn main() {
fn repair_installation(
    install_file: &InstallFile,
    install_dir: &Path
) -> Vec<RepairAction> {
    let mut actions = Vec::new();

    for entry in &install_file.entries {
        // Paths are stored as raw bytes; convert before joining
        let path = install_dir.join(&*String::from_utf8_lossy(&entry.path));

        if !path.exists() {
            actions.push(RepairAction::Download(entry.content_key));
        } else if !verify_file(&path, entry) {
            actions.push(RepairAction::Redownload(entry.content_key));
        }
    }

    actions
}
}

Common Issues

Tag conflicts: Multiple tags may include same file
Path separators: Handle platform-specific separators
Case sensitivity: File systems vary in case handling
Symlink support: Some platforms don’t support symlinks
Permission issues: Installation may require elevation

Special Considerations

Shared Files

Files used by multiple products:

#![allow(unused)]
fn main() {
struct SharedFile {
    content_key: [u8; 16],
    products: Vec<String>,
    ref_count: u32,
}
}

Uninstall Tracking

Track files for clean uninstall:

#![allow(unused)]
fn main() {
struct UninstallManifest {
    files: Vec<PathBuf>,
    directories: Vec<PathBuf>,
    registry_keys: Vec<String>,  // Windows only
}
}

Parser Implementation Status

Python Parser (cascette-py)

Status: Complete

Capabilities:

Version 1 header parsing with IN magic detection
Tag extraction with big-endian (MSB-first) bit ordering
Platform/architecture/locale tag type classification
File entry parsing with path, content key, and size
Tag-to-file association via bitmask resolution
BLTE decompression for compressed manifests

Verified Against:

WoW 11.0.5.57689 (242 entries, 28 tags)
Multiple WoW Classic builds
Cross-platform tag validation (Windows, OSX, mobile)

Known Issues: None

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

Version History

The Install manifest format has two versions:

Version 1

Header Size: 10 bytes
Magic: “IN” (0x494E)
Entry Size: Derived as ckey_length + 4
Features:
- File path to content key mapping
- Tag-based selective installation
- Platform/architecture/locale filtering
- Bit mask system for tag associations
- Big-endian (MSB-first) bit ordering in tag masks
- Tag type classification (17 types from Platform through Option)

Version 2

Header Size: 16 bytes (10 base + 6 additional)
Added Fields: content_key_size (1 byte), entry_count_v2 (4 bytes BE), unknown (1 byte)
Features: All version 1 features plus explicit content key size

Version Detection

The version field is at offset 2 in the header. The agent accepts versions 1 and 2 (validates non-zero and <= 2).

Implementation Status

cascette-formats: Full support for versions 1 and 2 with validation
cascette-py: Complete parsing for version 1 with tag extraction

References

See Root File for file catalog
See Download Manifest for download prioritization
See Encoding Documentation for content resolution
See Format Transitions for format evolution tracking

Download Manifest Format

The Download manifest manages content streaming and prioritization during game installation and updates. It defines which files are essential for gameplay and their download order.

Overview

The Download manifest assigns a priority to each file entry so the client can download essential content first (enabling play before full download) and stream remaining content in the background. Tag bitmaps enable per-platform and per-locale filtering. File sizes in entries support progress estimation.

File Structure

The Download manifest is BLTE-encoded and contains:

[BLTE Container]
  [Header]
  [File Entries]
  [Tag Section]

Binary Format

Header

struct DownloadHeader {
    char     magic[2];           // "DL" (0x44, 0x4C)
    uint8_t  version;            // Version (1, 2, or 3)
    uint8_t  ekey_size;          // Encoding key size in bytes (16)
    uint8_t  has_checksum;       // Checksum presence flag
    uint32_t entry_count;        // Number of entries (big-endian)
    uint16_t tag_count;          // Number of tags (big-endian)

    // Version 2+ fields (header grows to 12 bytes)
    uint8_t  flag_size;          // Number of flag bytes per entry (max 4)

    // Version 3+ fields (header grows to 16 bytes)
    int8_t   base_priority;      // Base priority offset
    uint8_t  _reserved[3];       // Reserved (agent does not validate these)
};
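
A minimal sketch of a version-aware reader for this header follows. Offsets follow the struct above; the Rust type and function names are hypothetical, and fields that a given version does not carry default to zero.

```rust
#[derive(Debug, PartialEq)]
struct DownloadHeader {
    version: u8,
    ekey_size: u8,
    has_checksum: bool,
    entry_count: u32,
    tag_count: u16,
    flag_size: u8,     // 0 for version 1
    base_priority: i8, // 0 for versions 1 and 2
}

/// Header size in bytes for each known version.
fn download_header_size(version: u8) -> Option<usize> {
    match version {
        1 => Some(11),
        2 => Some(12),
        3 => Some(16),
        _ => None,
    }
}

fn parse_download_header(data: &[u8]) -> Option<DownloadHeader> {
    if data.len() < 11 || &data[0..2] != b"DL" {
        return None;
    }
    let version = data[2];
    if data.len() < download_header_size(version)? {
        return None;
    }
    Some(DownloadHeader {
        version,
        ekey_size: data[3],
        has_checksum: data[4] != 0,
        entry_count: u32::from_be_bytes([data[5], data[6], data[7], data[8]]),
        tag_count: u16::from_be_bytes([data[9], data[10]]),
        flag_size: if version >= 2 { data[11] } else { 0 },
        base_priority: if version >= 3 { data[12] as i8 } else { 0 },
    })
}
```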

Entry Order

The download manifest stores data in this order:

Header
All file entries
All tags (appear after entries)

File Entry

struct DownloadEntry {
    uint8_t  ekey[16];           // Encoding key (variable size from header)
    uint8_t  file_size[5];       // 40-bit file size (big-endian)
    int8_t   priority;           // Download priority (adjusted by base_priority)

    // Optional fields
    uint32_t checksum;           // If has_checksum is true (big-endian)
    uint8_t  flags[N];           // If version >= 2, N = flag_size
};
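
The 40-bit size field and the variable entry width can be sketched like this. The helper names are illustrative; the entry size is simply the sum of the fields listed in the struct above.

```rust
/// Read a 40-bit big-endian integer into a u64.
fn read_u40_be(bytes: [u8; 5]) -> u64 {
    bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
}

/// Bytes occupied by one entry: key + 5-byte size + priority byte
/// + optional 4-byte checksum + flag bytes.
fn download_entry_size(ekey_size: u8, has_checksum: bool, flag_size: u8) -> usize {
    let checksum_len = if has_checksum { 4 } else { 0 };
    usize::from(ekey_size) + 5 + 1 + checksum_len + usize::from(flag_size)
}
```

With a 16-byte key, no checksum, and no flags, each entry is 22 bytes, so the tag section starts at `header_size + 22 * entry_count`.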

Tag Entry

Tags appear after all file entries in the manifest:

struct DownloadTag {
    char     name[];             // Null-terminated tag name
    uint16_t type;               // Tag type (big-endian)
    uint8_t  bitmap[];           // Bit mask ((entry_count + 7) / 8 bytes)
};

Each bit in the bitmap corresponds to a file entry index. If bit N is set, entry N has this tag.

Priority System

Priority Calculation

In version 3+, priorities are adjusted:

final_priority = entry.priority - header.base_priority
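
A short sketch of this adjustment: both values are signed 8-bit, so the subtraction is done in a wider type to avoid overflow. The function name is illustrative.

```rust
/// final_priority = entry.priority - header.base_priority, widened to i16.
fn final_priority(entry_priority: i8, base_priority: i8) -> i16 {
    i16::from(entry_priority) - i16::from(base_priority)
}
```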

Priority Levels

Lower values indicate higher priority:

Priority	Category	Typical Content
< 0	Critical	Must download before game starts
0	Essential	Required for basic gameplay
1-2	High	Important for full experience
3-5	Normal	Standard content
> 5	Low	Optional/deferred content

Priority-Based Download

#![allow(unused)]
fn main() {
fn get_download_order(entries: &[DownloadFileEntry]) -> Vec<&DownloadFileEntry>
{
    let mut sorted = entries.iter().collect::<Vec<_>>();
    sorted.sort_by_key(|e| (e.priority, e.file_size));
    sorted
}
}

Streaming Strategy

Minimum Playable Set

Calculate minimum download for gameplay:

#![allow(unused)]
fn main() {
fn get_minimum_download(
    download_file: &DownloadFile
) -> (Vec<DownloadFileEntry>, u64) {
    let essential: Vec<_> = download_file.entries
        .iter()
        .filter(|e| e.priority <= 0)  // Critical (< 0) + Essential (0)
        .cloned()
        .collect();

    let total_size = essential.iter()
        .map(|e| e.file_size as u64)
        .sum();

    (essential, total_size)
}
}

Progressive Download

Download in priority order while game runs:

#![allow(unused)]
fn main() {
struct DownloadManager {
    queue: VecDeque<DownloadItem>,
    active: Vec<DownloadTask>,
    completed: HashSet<[u8; 16]>,
}

impl DownloadManager {
    pub fn start_progressive_download(&mut self) {
        // Sort by priority (a VecDeque must be made contiguous before sorting)
        self.queue.make_contiguous().sort_by_key(|item| item.priority);

        // Start downloading highest priority; stop once the queue is empty
        while self.active.len() < MAX_CONCURRENT {
            match self.queue.pop_front() {
                Some(item) => {
                    self.start_download(item);
                }
                None => break,
            }
        }
    }
}
}

Tag-Based Filtering

Platform-Specific Downloads

Tags are stored separately from entries. Each tag contains a bitmap indicating which entries it applies to. To filter by tag, find the tag by name and check its bitmap:

#![allow(unused)]
fn main() {
fn filter_by_tag<'a>(
    manifest: &'a DownloadManifest,
    tag_name: &str,
) -> Vec<(usize, &'a DownloadFileEntry)> {
    let tag = match manifest.tags.iter().find(|t| t.name == tag_name) {
        Some(t) => t,
        None => return Vec::new(),
    };

    manifest.entries.iter().enumerate()
        .filter(|(index, _)| tag.has_file(*index))
        .collect()
}
}

Language Packs

#![allow(unused)]
fn main() {
fn get_language_pack<'a>(
    manifest: &'a DownloadManifest,
    locale: &str,
) -> Vec<&'a DownloadFileEntry> {
    let tag = match manifest.tags.iter().find(|t| t.name == locale) {
        Some(t) => t,
        None => return Vec::new(),
    };

    manifest.entries.iter().enumerate()
        .filter(|(index, _)| tag.has_file(*index))
        .map(|(_, entry)| entry)
        .collect()
}
}

Download Optimization

Bandwidth Management

#![allow(unused)]
fn main() {
struct BandwidthManager {
    max_bandwidth: u64,      // Bytes per second
    current_usage: u64,
    priority_limits: Vec<u64>, // Per-priority limits
}

impl BandwidthManager {
    pub fn allocate_bandwidth(&mut self, priority: u8) -> u64 {
        let priority_limit = self.priority_limits[priority as usize];
        // saturating_sub avoids underflow if usage temporarily exceeds the maximum
        let available = self.max_bandwidth.saturating_sub(self.current_usage);

        std::cmp::min(priority_limit, available)
    }
}
}

Chunk-Based Downloads

For large files, download in chunks:

#![allow(unused)]
fn main() {
struct ChunkedDownload {
    encoding_key: [u8; 16],
    total_size: u64,
    chunk_size: u64,
    chunks_completed: Vec<bool>,
}

impl ChunkedDownload {
    pub fn get_next_chunk(&self) -> Option<(u64, u64)> {
        for (idx, &completed) in self.chunks_completed.iter().enumerate() {
            if !completed {
                let offset = idx as u64 * self.chunk_size;
                let size = std::cmp::min(
                    self.chunk_size,
                    self.total_size - offset
                );
                return Some((offset, size));
            }
        }
        None
    }
}
}

Progress Tracking

Download Statistics

#![allow(unused)]
fn main() {
struct DownloadProgress {
    total_files: u32,
    completed_files: u32,
    total_bytes: u64,
    downloaded_bytes: u64,
    current_speed: f64,
    eta_seconds: u64,
}

impl DownloadProgress {
    pub fn update(&mut self, bytes_downloaded: u64) {
        self.downloaded_bytes += bytes_downloaded;
        self.current_speed = self.calculate_speed();
        self.eta_seconds = self.calculate_eta();
    }

    pub fn completion_percentage(&self) -> f32 {
        if self.total_bytes == 0 {
            return 0.0; // Avoid NaN from 0/0 before the total size is known
        }
        (self.downloaded_bytes as f32 / self.total_bytes as f32) * 100.0
    }
}
}

Implementation Example

struct DownloadFile {
    header: DownloadHeader,
    priorities: Vec<DownloadPriority>,
    tags: Vec<DownloadTag>,
    entries: Vec<DownloadFileEntry>,
}

impl DownloadFile {
    pub fn get_download_plan(
        &self,
        tags: &[String],
        max_priority: i8
    ) -> DownloadPlan {
        // Tags are stored separately from entries, so resolve names to tag records first
        let selected: Vec<&DownloadTag> = self.tags
            .iter()
            .filter(|t| tags.iter().any(|name| t.name == *name))
            .collect();

        let files: Vec<_> = self.entries
            .iter()
            .enumerate()
            .filter(|(_, e)| e.priority <= max_priority)
            // Simplified selection: keep entries set in at least one selected tag bitmap
            .filter(|(index, _)| selected.iter().any(|t| t.has_file(*index)))
            .map(|(_, e)| e.clone())
            .collect();

        let total_size = files.iter()
            .map(|f| f.file_size as u64)
            .sum();

        DownloadPlan {
            files,
            total_size,
            estimated_time: self.estimate_time(total_size),
        }
    }
}

On-Demand Streaming

Asset Request Handling

#![allow(unused)]
fn main() {
struct OnDemandManager {
    download_file: DownloadFile,
    cache: LruCache<[u8; 16], Vec<u8>>,
}

impl OnDemandManager {
    pub async fn get_asset(&mut self, encoding_key: &[u8; 16]) -> Result<Vec<u8>> {
        // Check cache first
        if let Some(data) = self.cache.get(encoding_key) {
            return Ok(data.clone());
        }

        // Find in download manifest
        if let Some(entry) = self.find_entry(encoding_key) {
            // Download with high priority
            let data = self.download_immediate(entry).await?;
            self.cache.put(*encoding_key, data.clone());
            return Ok(data);
        }

        Err("Asset not found")
    }
}
}

Verification

Checksum Validation

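fn verify_download(
    data: &[u8],
    entry: &DownloadFileEntry,
    compute_checksum: impl Fn(&[u8]) -> u32
) -> bool {
    // Entries carry a 32-bit checksum only when has_checksum is set in the
    // header. The checksum algorithm is not specified in this document, so
    // the caller supplies it.
    match entry.checksum {
        Some(expected) => compute_checksum(data) == expected,
        None => true, // No checksum to verify
    }
}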

Common Issues

Priority conflicts: Multiple systems requesting same file
Bandwidth throttling: ISP or network limitations
Incomplete downloads: Handle partial file recovery
Cache corruption: Verify cached files periodically
Tag mismatches: Platform detection errors

Special Features

Differential Downloads

Download only changed portions:

#![allow(unused)]
fn main() {
struct DifferentialDownload {
    old_version: [u8; 16],
    new_version: [u8; 16],
    patches: Vec<PatchInfo>,
}
}

Peer-to-Peer Support

Share downloaded content locally:

#![allow(unused)]
fn main() {
struct P2PManager {
    local_peers: Vec<PeerInfo>,
    shared_files: HashSet<[u8; 16]>,
}
}

Parser Implementation Status

Python Parser (cascette-py)

Status: Complete

Capabilities:

Version 1-3 header parsing with DL magic detection
40-bit big-endian compressed size parsing
Priority system with base priority adjustment (v3)
Tag parsing with bitmap support (tags stored after all entries)
Platform/architecture tag identification with type classification
Sample entry display (first 100 entries)
Format evolution tracking across versions
BLTE decompression for compressed manifests
Correct entry/tag ordering (entries first, then tags)

Verified Against:

WoW 11.0.5.57689 (2.4M entries, 28 tags)
WoW 9.0.2.37176 (Shadowlands)
WoW 7.3.5.25848 (Legion)
WoW Classic builds

Known Issues: None

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

Version History

The Download manifest format has evolved through 3 versions:

Version 1 (Initial)

Header Size: 11 bytes
Features: Basic download prioritization with encoding keys, file sizes, optional checksums
Fields: magic, version, ekey_size, has_checksum, entry_count, tag_count

Version 2 (Flag Support)

Header Size: 12 bytes
Added Features: Entry-level flags for additional metadata
New Fields: flag_size (number of flag bytes per entry, max 4)
Use Cases: Platform-specific flags, content type markers

Version 3 (Priority System)

Header Size: 16 bytes
Added Features: Base priority adjustment for dynamic prioritization
New Fields: base_priority (signed adjustment), reserved (3 bytes)
Priority Calculation: final_priority = entry.priority - header.base_priority

Version Detection

Parsers detect version by reading the version field at offset 2 in the header. All versions use the same “DL” magic bytes and big-endian encoding.

Implementation Status

cascette-formats: Full support for versions 1-3 with version-aware parsing
cascette-py: Complete parsing for versions 1-3 with validation

References

See Install Manifest for installation management
See Encoding Documentation for key resolution
See CDN Architecture for download sources
See Format Transitions for version evolution timeline

Size Manifest Format

The Size manifest maps encoding keys to estimated file sizes (eSize). It is used when compressed size (cSize) is unavailable, allowing the agent to estimate disk space requirements and report download progress for content that has not yet been downloaded.

Overview

The Size manifest provides:

Estimated file sizes for pre-download space allocation
Progress bar calculations during installation
Disk space requirement checks
Fallback sizing when compressed size is unknown

The agent log message “Loose files will estimate using eSize instead of cSize” indicates when this manifest is active.

Build Configuration Reference

The Size manifest is referenced by the size key in build configuration files:

size = d1d9e612a645cc7a7e4b42628bde21ce 0d5704735f4985e555907a7e7647099a
size-size = 3637629 3076687

The first hash is the content key, the second is the encoding key used for CDN fetch. The size-size field contains the unencoded and encoded sizes. Like other manifests, the Size manifest is BLTE-encoded on CDN.
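
As a sketch, the size line can be split into its two keys as shown below. The function name is hypothetical, and the parsing assumes the `key = value value` layout shown in the example above.

```rust
/// Parse a `size = <content key> <encoding key>` line from a build config.
/// Returns (content_key, encoding_key) as hex strings.
fn parse_size_key(line: &str) -> Option<(String, String)> {
    let (key, value) = line.split_once('=')?;
    if key.trim() != "size" {
        return None;
    }
    let mut parts = value.split_whitespace();
    let content_key = parts.next()?;
    let encoding_key = parts.next()?;
    Some((content_key.to_string(), encoding_key.to_string()))
}
```

The second value (the encoding key) is the one used to build the CDN request.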

The config key .tact:size_manifest also references this manifest in the agent’s internal configuration.

Community Documentation

This format is documented on wowdev.wiki as the “Download Size” manifest. The wiki documents version 1 from an older Agent build (6700). The TACT 3.13.3 agent binary supports versions 1 and 2. The wiki’s “EKey Size” byte at offset 3 corresponds to the flags field described below. The version 2 format with its 40-bit total size field is not documented on the wiki.

File Structure

The Size manifest is BLTE-encoded and contains:

[BLTE Container]
  [Header]
  [Entries]

Binary Format

All multi-byte integers are big-endian.

Header

struct SizeManifestHeader {
    char     magic[2];           // "DS" (0x44, 0x53)
    uint8_t  version;            // Version (1 or 2)
    uint8_t  flags;              // Flags byte
    uint32_t entry_count;        // Number of entries (big-endian)
    uint16_t key_size_bits;      // Key size in bits (big-endian)

    // Version-specific fields follow
};

Version 1 Header Extension (offset 10)

struct SizeManifestHeaderV1 {
    // ... base header fields above ...
    uint64_t total_size;         // Total size across all entries (big-endian)
    uint8_t  esize_bytes;        // Byte width of eSize per entry (1-8)
};
// Total header size: 19 bytes (0x13)

The esize_bytes field determines how many bytes each entry’s size value occupies. Valid values are 1 through 8. Invalid values produce: “Invalid eSize byte count ‘%u’ in size manifest header.”

Version 2 Header Extension (offset 10)

struct SizeManifestHeaderV2 {
    // ... base header fields above ...
    uint8_t  total_size[5];      // Total size as 40-bit big-endian integer
};
// Total header size: 15 bytes (0x0F)

Version 2 fixes esize_bytes at 4 (32-bit sizes per entry). The total size uses a 40-bit integer (5 bytes), reducing header size compared to version 1.
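
The two header layouts can be sketched in one reader. The offsets follow the structs above; the Rust type, the function name, and the error strings are illustrative assumptions rather than the agent's or cascette-rs's actual code.

```rust
#[derive(Debug, PartialEq)]
struct SizeManifestHeader {
    version: u8,
    flags: u8,
    entry_count: u32,
    key_size_bits: u16,
    total_size: u64,
    esize_bytes: u8,
    header_size: usize,
}

fn parse_size_header(data: &[u8]) -> Result<SizeManifestHeader, String> {
    if data.len() < 15 {
        return Err(format!("truncated size manifest: {} bytes", data.len()));
    }
    if &data[0..2] != b"DS" {
        return Err("bad magic".to_string());
    }
    // Big-endian integer of arbitrary width (up to 8 bytes)
    let be = |bytes: &[u8]| bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    let version = data[2];
    let (total_size, esize_bytes, header_size) = match version {
        1 => {
            if data.len() < 19 {
                return Err("truncated version 1 header".to_string());
            }
            let esize_bytes = data[18];
            if !(1..=8).contains(&esize_bytes) {
                return Err(format!("invalid eSize byte count {esize_bytes}"));
            }
            (be(&data[10..18]), esize_bytes, 19)
        }
        // Version 2: 40-bit total size, eSize width fixed at 4 bytes
        2 => (be(&data[10..15]), 4, 15),
        other => return Err(format!("unsupported version {other}")),
    };
    Ok(SizeManifestHeader {
        version,
        flags: data[3],
        entry_count: u32::from_be_bytes([data[4], data[5], data[6], data[7]]),
        key_size_bits: u16::from_be_bytes([data[8], data[9]]),
        total_size,
        esize_bytes,
        header_size,
    })
}
```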

Minimum Size Validation

The parser validates two minimum sizes:

15 bytes (0x0F) – enough to read magic, version, entry_count, and key_size_bits
19 bytes (0x13) – full version 1 header (version 2 headers are shorter and pass this check)

If the data is too small: “Detected truncated size manifest. Only got %u bytes, but minimum header size is %u bytes.”
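The two header layouts can be sketched as a single parser. This is a minimal illustration that assumes the input is already BLTE-decoded; the `SizeHeader` type, the `parse_size_header` function, and the error strings are hypothetical names for this example, not the cascette-formats API.

```rust
/// Parsed Size manifest header; field names follow the structs above.
/// (Hypothetical type for illustration only.)
#[derive(Debug, PartialEq)]
pub struct SizeHeader {
    pub version: u8,
    pub flags: u8,
    pub entry_count: u32,
    pub key_size_bits: u16,
    pub total_size: u64,
    pub esize_bytes: u8,
    /// Number of header bytes consumed (19 for v1, 15 for v2).
    pub header_len: usize,
}

/// Parse a version 1 or 2 header from already BLTE-decoded bytes.
pub fn parse_size_header(data: &[u8]) -> Result<SizeHeader, String> {
    if data.len() < 15 {
        return Err(format!("truncated: got {} bytes, need at least 15", data.len()));
    }
    if &data[0..2] != b"DS" {
        return Err("invalid magic".to_string());
    }
    let version = data[2];
    let flags = data[3];
    let entry_count = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);
    let key_size_bits = u16::from_be_bytes([data[8], data[9]]);
    if key_size_bits == 0 {
        return Err("key_size_bits must be > 0".to_string());
    }

    let (total_size, esize_bytes, header_len) = match version {
        1 => {
            if data.len() < 19 {
                return Err("truncated version 1 header".to_string());
            }
            // 64-bit big-endian total, then the per-entry esize width.
            let mut buf = [0u8; 8];
            buf.copy_from_slice(&data[10..18]);
            let esize_bytes = data[18];
            if !(1..=8).contains(&esize_bytes) {
                return Err(format!("invalid eSize byte count '{}'", esize_bytes));
            }
            (u64::from_be_bytes(buf), esize_bytes, 19)
        }
        2 => {
            // 40-bit big-endian total: left-pad with three zero bytes.
            let mut buf = [0u8; 8];
            buf[3..8].copy_from_slice(&data[10..15]);
            (u64::from_be_bytes(buf), 4, 15)
        }
        v => return Err(format!("unsupported version: {}", v)),
    };

    Ok(SizeHeader { version, flags, entry_count, key_size_bits, total_size, esize_bytes, header_len })
}
```

The returned `header_len` tells the caller where the first entry starts, so the same entry loop can serve both versions.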

Entry Format

Entries are stored sequentially after the header:

struct SizeManifestEntry {
    uint8_t  key[];              // Encoding key, null-terminated
    uint16_t key_hash;           // 16-bit hash/identifier (big-endian)
    uint8_t  esize[];            // Estimated size (esize_bytes width, big-endian)
};

The key field length in bytes is (key_size_bits + 7) / 8, which rounds the bit count up to the nearest byte. The key is stored as a null-terminated byte string within this field.

Key Hash Validation

The 2-byte key_hash field after the key is validated. Values 0x0000 and 0xFFFF are treated as invalid sentinel values and cause the parser to reject the entry.

Entry Size Field

The esize field width depends on the version:

Version	esize width	Source
1	`esize_bytes` from header (1-8)	Variable
2	4 bytes (fixed)	Hardcoded
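As a rough sketch of the entry layout, the function below reads one entry given `key_size_bits` and the esize width. It assumes the key occupies exactly `(key_size_bits + 7) / 8` bytes and does not interpret the null termination mentioned above; `SizeEntry`, `key_len_bytes`, and `parse_size_entry` are hypothetical names for this example.

```rust
/// One Size manifest entry (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub struct SizeEntry {
    pub key: Vec<u8>,
    pub key_hash: u16,
    pub esize: u64,
}

/// Round the key size in bits up to whole bytes.
pub fn key_len_bytes(key_size_bits: u16) -> usize {
    (key_size_bits as usize + 7) / 8
}

/// Parse one entry from the start of `data`.
/// Returns the entry and the number of bytes consumed, or None when the
/// data is too short, the width is out of range, or the key hash is a sentinel.
pub fn parse_size_entry(
    data: &[u8],
    key_size_bits: u16,
    esize_bytes: usize,
) -> Option<(SizeEntry, usize)> {
    if !(1..=8).contains(&esize_bytes) {
        return None;
    }
    let key_len = key_len_bytes(key_size_bits);
    let total = key_len + 2 + esize_bytes;
    let raw = data.get(..total)?;

    let key = raw[..key_len].to_vec();
    let key_hash = u16::from_be_bytes([raw[key_len], raw[key_len + 1]]);
    if key_hash == 0x0000 || key_hash == 0xFFFF {
        return None; // sentinel values are rejected
    }

    // Variable-width big-endian integer (1 to 8 bytes).
    let esize = raw[key_len + 2..]
        .iter()
        .fold(0u64, |acc, &b| (acc << 8) | u64::from(b));

    Some((SizeEntry { key, key_hash, esize }, total))
}
```

For version 2 manifests the caller would pass `esize_bytes = 4`; for version 1 it would pass the value read from the header.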

Version History

Version	Header size	esize width	total_size width	Notes
1	19 bytes	Variable (1-8)	64-bit	Original format, documented on wowdev.wiki
2	15 bytes	Fixed (4)	40-bit	Compact header, undocumented on wiki

Relationship to Other Manifests

The Size manifest is one of six manifest types in TACT:

Config key	Magic	Format
`encoding`	`EN`	Content key to encoding key mapping
`root`	(varies)	Path to content key mapping
`install`	`IN`	Install manifest with file tags
`download`	`DL`	Download manifest with priorities
`patch`	`PA`	Patch manifest for delta updates
`size`	`DS`	Size manifest (this format)

Validation

The parser validates manifests at parse time and via an explicit validate() method:

Entry count matches the header’s entry_count field
Sum of all entry esize values matches the header’s total_size field
key_size_bits must be > 0
Key hash sentinel values (0x0000, 0xFFFF) are rejected
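The count and total checks could look roughly like the following. This is a sketch, not the crate's `validate()` method: the function name `validate_totals` and the overflow message are assumptions, while the two mismatch messages mirror the wording used in this document's error table.

```rust
/// Check that the parsed entries agree with the header's entry_count and total_size.
/// `esizes` holds the esize value of every parsed entry, in order.
pub fn validate_totals(entry_count: u32, total_size: u64, esizes: &[u64]) -> Result<(), String> {
    if esizes.len() as u64 != u64::from(entry_count) {
        return Err(format!(
            "Entry count mismatch: header says {}, found {}",
            entry_count,
            esizes.len()
        ));
    }

    // Use checked addition so a corrupt manifest cannot overflow the sum.
    let sum = esizes
        .iter()
        .try_fold(0u64, |acc, &s| acc.checked_add(s))
        .ok_or_else(|| "esize sum overflows u64".to_string())?;

    if sum != total_size {
        return Err(format!(
            "Total size mismatch: header says {}, sum of esizes is {}",
            total_size, sum
        ));
    }
    Ok(())
}
```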

Error Messages

Condition	Message
Truncated data	“Detected truncated size manifest. Only got %u bytes, but minimum header size is %u bytes.”
Bad magic	“Invalid magic string in size manifest.”
Bad version	“Unsupported size manifest version: %u. This client only supports non-zero versions <= %u”
Bad esize width	“Invalid eSize byte count ‘%u’ in size manifest header.”
Zero key size	“Invalid key size: key_size_bits must be > 0”
Bad key hash	“Invalid key hash sentinel value: 0x{value:04X}”
Entry count mismatch	“Entry count mismatch: header says {expected}, found {actual}”
Total size mismatch	“Total size mismatch: header says {expected}, sum of esizes is {actual}”

Implementation Status

Implemented in cascette-formats crate (crates/cascette-formats/src/size/).

The implementation provides:

Parser and builder for both version 1 and version 2 formats
Manual BinRead/BinWrite implementations for headers and entries
Variable-width esize field support (1-8 bytes for V1, fixed 4 bytes for V2)
40-bit total_size handling for V2 headers
Key hash sentinel validation (rejects 0x0000 and 0xFFFF)
CascFormat trait implementation for round-trip support
Builder pattern for constructing manifests

Archive Files and Indices

CASC/TACT archives are container files that store game content in a packed format. They work with index files to enable efficient content retrieval without unpacking entire archives. The system uses different formats for network (TACT) and local storage (CASC).

Overview

The archive system provides:

Bulk storage of game assets in .archive files
Index files for fast content location
Support for partial downloads via HTTP range requests
Deduplication through content addressing

Archive Files

CDN Archives vs Local Archives

CDN Archives (TACT - served over HTTP):

Named using 32-character hash keys (e.g., 86b6b0daf3d8ef68271b15567c37300c)
Accessed via URL path: /tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}
Paired with Archive Index files (.index) for content location
Single BLTE-encoded container format
Part of TACT (Tooling for Archive Content Transfer) protocol

Local Client Archives (CASC - stored on disk):

Named with numeric indices: data.001, data.002, etc.
Use IDX Journal files (.idx) for local content access
Multiple BLTE files concatenated together
Part of CASC (Content Addressable Storage Container) system
Optimized for memory-mapped access

CDN Archive Structure

CDN archives are single BLTE-encoded containers, while local archives contain multiple BLTE files:

CDN Archive Format (TACT):          Local Archive Format (CASC):
┌──────────────────┐                ┌──────────────────┐
│ BLTE Container   │                │ BLTE File 1      │
├──────────────────┤                ├──────────────────┤
│ Header & Blocks  │                │ BLTE File 2      │
├──────────────────┤                ├──────────────────┤
│ Content Blocks   │                │ BLTE File 3      │
│ (concatenated)   │                │      ...         │
└──────────────────┘                └──────────────────┘

Verified Archive Characteristics

Based on examination of sample archives:

File sizes: Range from ~7MB to 268MB when compressed
Compression ratios: 4.9x to 190x compression achieved via BLTE
Content types: WDB Cache files (WDC3), textures, models, and other game assets
Decompressed content: Much smaller than archive size (1-2MB typical)
Access pattern: Content addressed via hash keys in index files

CRITICAL: Two Completely Different Index Systems

⚠️ CDN Archive Index (.index) vs Local Storage Index (.idx)

NEVER CONFUSE THESE TWO FORMATS - THEY ARE COMPLETELY DIFFERENT:

CDN Archive Index Files (.index): TACT format with 28-byte footer, variable-length encoding keys
Local Storage Index Files (.idx): CASC format with header, fixed 9-byte content key buckets

These systems serve different purposes and use entirely different formats, key types, and data structures.

CDN Archive Index Format (TACT Protocol)

File Extension: .index
Location: Downloaded from CDN
Purpose: Maps variable-length encoding keys to CDN archive locations
Key Type: Encoding keys (from Encoding file)
Key Length: Variable, as specified in footer’s ekey_length field (typically 16 bytes, sometimes 9)
Implementation: cascette-formats/src/archive/index.rs

Archive Index Files (.index) - TACT Protocol

Based on analysis of actual CDN index files from various WoW builds.

CDN archive indexes use a chunk-based format with footer metadata:

Archive Index Structure

Index File Layout:
┌────────────────┐
│ Data Chunks    │ <- 4KB chunks containing entries
│ (4096 bytes)   │
├────────────────┤
│ ...            │
├────────────────┤
│ Last Chunk     │ <- Table of contents + entries
├────────────────┤
│ Footer         │ <- Metadata (variable length)
└────────────────┘

CDN Index Entry Format (Variable Length)

struct CDNArchiveIndexEntry {
    uint8_t  ekey[ekey_length];  // Encoding key (variable length from footer)
    uint32_t encoded_size;       // BLTE encoded size (big-endian)
    uint32_t archive_offset;     // Offset in archive (big-endian)
};

Entry Size: Variable = ekey_length + size_bytes + offset_bytes (from footer)

Typical Sizes:

With 16-byte keys: 16 + 4 + 4 = 24 bytes per entry
With 9-byte keys: 9 + 4 + 4 = 17 bytes per entry

Key Properties:

Encoding key length specified in footer’s ekey_length field
All multi-byte fields use big-endian encoding
NEVER assume fixed 9-byte keys - always read from footer

Archive Index files use a 28-byte footer at the end of the file:

struct ArchiveIndexFooter {  // 28 bytes total
    uint8_t  toc_hash[8];     // MD5(toc_keys || block_hashes)[:footer_hash_bytes]
    uint8_t  version;         // Must be 0 or 1
    uint8_t  reserved[2];     // Must be [0, 0]
    uint8_t  page_size_kb;    // Must be 4 (4KB pages)
    uint8_t  offset_bytes;    // Archive offset field size (4, 5, or 6)
    uint8_t  size_bytes;      // Compressed size field size (always 4)
    uint8_t  ekey_length;     // EKey length in bytes (16 for full MD5)
    uint8_t  footer_hash_bytes; // Footer hash length (always 8)
    uint32_t element_count;   // Number of entries (little-endian - special case!)
    uint8_t  footer_hash[8];  // MD5 footer validation (first 8 bytes)
};

Verified Footer Properties:

Standard values: offset_bytes=4, size_bytes=4, ekey_length=16 (1-16 valid)
offset_bytes can be 4 (regular archives), 5 (archives >4GB), or 6 (archive-groups: 2-byte archive index + 4-byte offset)
Page/chunk size consistently 4096 bytes
Item length consistently 24 bytes (0x18)
Archive filename = MD5 hash of the footer
Footer validation uses MD5 hashing (first 8 bytes of hash)
Mixed endianness: element_count field is little-endian while all other multi-byte fields are big-endian
TOC hash field is present but not validated in practice. No known reference implementation (CascLib, TACT.Net, rustycasc) validates this field. Testing against real files shows the stored values do not match any standard hash algorithm applied to the TOC data
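A minimal sketch of reading the 28-byte footer follows, using the field offsets from the struct above. The `IndexFooter` type and `parse_index_footer` function are hypothetical names, and the range checks simply restate the "must be" notes from the struct comments. Hash verification is left out.

```rust
/// Footer of a CDN archive index (hypothetical type for illustration).
#[derive(Debug)]
pub struct IndexFooter {
    pub toc_hash: [u8; 8],
    pub version: u8,
    pub page_size_kb: u8,
    pub offset_bytes: u8,
    pub size_bytes: u8,
    pub ekey_length: u8,
    pub footer_hash_bytes: u8,
    pub element_count: u32,
    pub footer_hash: [u8; 8],
}

/// Read the footer from the last 28 bytes of an index file.
pub fn parse_index_footer(file: &[u8]) -> Option<IndexFooter> {
    let start = file.len().checked_sub(28)?;
    let f = &file[start..];

    let version = f[8];
    let reserved_ok = f[9] == 0 && f[10] == 0;
    let (page_size_kb, offset_bytes, size_bytes) = (f[11], f[12], f[13]);
    let (ekey_length, footer_hash_bytes) = (f[14], f[15]);

    if version > 1 || !reserved_ok || page_size_kb != 4 {
        return None;
    }
    if !(4..=6).contains(&offset_bytes) || size_bytes != 4 {
        return None;
    }
    if !(1..=16).contains(&ekey_length) || footer_hash_bytes != 8 {
        return None;
    }

    // element_count is the one little-endian field in the footer.
    let element_count = u32::from_le_bytes([f[16], f[17], f[18], f[19]]);

    let mut toc_hash = [0u8; 8];
    toc_hash.copy_from_slice(&f[0..8]);
    let mut footer_hash = [0u8; 8];
    footer_hash.copy_from_slice(&f[20..28]);

    Some(IndexFooter {
        toc_hash,
        version,
        page_size_kb,
        offset_bytes,
        size_bytes,
        ekey_length,
        footer_hash_bytes,
        element_count,
        footer_hash,
    })
}
```

Reading the footer first gives the key, size, and offset widths needed to interpret every entry in the data chunks.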

Implementation Notes:

Extended Block Offsets: The agent logs “Archive w/ Extended Block Offset Found” for archive index entries that use larger-than-4-byte offsets (for archives exceeding 4GB)
Archive Count Limit: The agent has a casc_supports_1023_archives configuration flag, indicating a maximum of 1023 archives per CASC storage

Sample Analysis Results

File Sizes Observed:

Small indexes: ~8KB (few hundred entries)
Medium indexes: ~50-200KB (thousands of entries)
Large indexes: ~300KB+ (tens of thousands of entries)

Index Distribution (from sample builds):

WoW retail: 400-1400+ archives per build
WoW Classic: 1000-1400+ archives per build
Beta builds: 400-800 archives per build

Chunk Structure:

All indexes use 4KB chunks. Max entries per chunk = 4096 / (ekey_length + offset_bytes + size_bytes). With default 16+4+4 fields: 170 entries per chunk.
Table of contents (TOC) is stored after data chunks and contains two sections:
1. Last encoding key of each data chunk (for binary search)
2. Per-block MD5 hash of each data chunk (truncated to footer_hash_bytes)
TOC hash = MD5(toc_keys || block_hashes)[:footer_hash_bytes]
Chunk structure enables streaming and memory-efficient processing
Chunks are padded with zeros to maintain 4KB alignment
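The per-chunk capacity formula can be checked with a few lines of arithmetic. The function names below are illustrative; the 4096-byte chunk size and the field widths come from the text above.

```rust
/// Size of one data chunk in bytes.
pub const CHUNK_SIZE: usize = 4096;

/// Maximum number of whole entries that fit in one 4KB chunk.
/// Returns None if all three widths are zero.
pub fn entries_per_chunk(ekey_length: u8, size_bytes: u8, offset_bytes: u8) -> Option<usize> {
    let entry_size = ekey_length as usize + size_bytes as usize + offset_bytes as usize;
    if entry_size == 0 {
        return None;
    }
    Some(CHUNK_SIZE / entry_size)
}

/// Zero padding left at the end of a completely filled chunk.
pub fn full_chunk_padding(ekey_length: u8, size_bytes: u8, offset_bytes: u8) -> Option<usize> {
    let entry_size = ekey_length as usize + size_bytes as usize + offset_bytes as usize;
    let count = entries_per_chunk(ekey_length, size_bytes, offset_bytes)?;
    Some(CHUNK_SIZE - count * entry_size)
}
```

With the default 16 + 4 + 4 layout this yields 170 entries and 16 bytes of padding per full chunk; archive-groups with 6-byte offsets fit 157 entries per chunk.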

Archive Index Access Pattern

CDN URL Format:

https://cdn.domain.com/tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}.index

Lookup Process:

Get archive content key from CDN configuration
Append ‘.index’ to form index URL
Fetch and parse index file
Search entries for target EKey
Use offset/size to retrieve from corresponding .archive file
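The first two lookup steps amount to string formatting. The sketch below builds the index URL from a host, a CDN path such as `tpr/wow`, and the archive hash; the function name and the choice to reject anything other than 32 hex characters are assumptions made for this example.

```rust
/// Build the URL of an archive index from its 32-character hex hash.
/// `host` and `cdn_path` come from the CDN configuration (for example "tpr/wow").
pub fn cdn_index_url(host: &str, cdn_path: &str, hash: &str) -> Option<String> {
    if hash.len() != 32 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let h = hash.to_ascii_lowercase();
    // The first two byte pairs of the hash form the directory fan-out.
    Some(format!(
        "https://{}/{}/data/{}/{}/{}.index",
        host,
        cdn_path,
        &h[0..2],
        &h[2..4],
        h
    ))
}
```

Dropping the `.index` suffix from the result gives the URL of the matching archive itself.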

Self-Referential Naming:

The archive index filename (hash) is the MD5 of its own footer structure, providing a unique identifier that validates the index contents.

Local Storage Index Format (.idx files)

File Extension: .idx
Location: Client-side storage directory (Data/data/)
Purpose: Maps content keys to local data file locations using bucket algorithm
Key Type: Content keys (MD5 hashes from Root file)
Key Length: ALWAYS 9 bytes (truncated for space efficiency in local storage)
Implementation: cascette-client-storage/src/index.rs

See the comparison table at the end of this document for a full side-by-side comparison.

IDX Journal Files (.idx) - CASC Local Storage

Local CASC storage uses IDX Journal files for indexing:

IDX Journal Structure

struct IDXJournalHeader {  // 16 bytes + block table
    uint32_t data_size;       // Size of header data
    uint32_t data_hash;       // Jenkins hash validation
    uint16_t version;         // Journal version
    uint8_t  bucket;          // Bucket ID (0x00-0xFF)
    uint8_t  unused;          // Padding
    uint8_t  length_size;     // Size field bytes
    uint8_t  location_size;   // Location field bytes (5 = 1 archive + 4 offset)
    uint8_t  key_size;        // Key field bytes (9 or 16)
    uint8_t  segment_bits;    // Segment size bits
    // Followed by block table entries
};

Key Differences from Archive Indexes:

Bucket-based structure (256 buckets, 00-FF)
Jenkins hash validation instead of footer hash
Fixed key size (9-byte truncated keys) instead of a footer-specified key length
Header at start instead of footer at end
One journal file per bucket

Loose Files Index

For files not in archives:

struct LooseFilesIndex {
    uint32_t magic;              // 'LIDX'
    uint32_t version;
    uint32_t entry_count;

    struct Entry {
        uint8_t  encoding_key[16];
        uint32_t file_size;
        uint8_t  file_hash[16];  // For verification
    } entries[];
};

Archive Lookup Process

Get encoding key: From encoding file lookup
Check indices: Search all index files for key
Locate in archive: Extract offset and size
Retrieve data: Read from archive at offset
Decompress: Process BLTE container

Implementation Example

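/// Footer-derived parameters needed for lookups (illustrative subset).
struct ArchiveIndexHeader {
    key_size: u8,
}

/// One index entry; entries are kept sorted by key.
struct ArchiveIndexEntry {
    key: Vec<u8>,
    offset: u64,
    size: u32,
}

struct ArchiveIndex {
    header: ArchiveIndexHeader,
    entries: Vec<ArchiveIndexEntry>,
}

impl ArchiveIndex {
    /// Returns (offset, size) for the entry matching `encoding_key`, if any.
    pub fn find_file(&self, encoding_key: &[u8]) -> Option<(u64, u32)> {
        // Truncate search key to index key size; give up if the key is too short
        let search_key = encoding_key.get(..self.header.key_size as usize)?;

        // Binary search entries (sorted by key)
        let idx = self
            .entries
            .binary_search_by(|e| e.key.as_slice().cmp(search_key))
            .ok()?;

        let entry = &self.entries[idx];
        Some((entry.offset, entry.size))
    }
}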

HTTP Range Requests

For CDN retrieval without downloading entire archives:

GET /data/5e/16/5e16b6ff530b1816c7b32296e0875ed4 HTTP/1.1
Host: cdn.example.com
Range: bytes=1048576-2097151

Response:

HTTP/1.1 206 Partial Content
Content-Range: bytes 1048576-2097151/134217728
Content-Length: 1048576
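The Range header value follows directly from an index entry's offset and encoded size. A small sketch, with a hypothetical function name:

```rust
/// Build an HTTP Range header value for `size` bytes starting at `offset`.
/// Range bounds are inclusive, so the last byte is offset + size - 1.
pub fn range_header(offset: u64, size: u32) -> Option<String> {
    if size == 0 {
        return None; // an empty range cannot be expressed
    }
    let last = offset.checked_add(u64::from(size) - 1)?;
    Some(format!("bytes={}-{}", offset, last))
}
```

For the request shown above, an entry at offset 1048576 with size 1048576 produces `bytes=1048576-2097151`.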

Archive Creation

When building archives:

Group related files: Minimize seeks during loading
Align boundaries: 4KB alignment for efficient I/O
Order by access: Frequently accessed files first
Compress individually: Each file is BLTE-encoded
Update indices: Generate index entries

Optimization Strategies

Memory Mapping

For local archives:

use memmap2::Mmap;

struct ArchiveReader {
    mmap: Mmap,
}

impl ArchiveReader {
    /// Returns the bytes for one file, or None if the range falls outside the mapping.
    pub fn read_file(&self, offset: u64, size: u32) -> Option<&[u8]> {
        let start = usize::try_from(offset).ok()?;
        let end = start.checked_add(size as usize)?;
        self.mmap.get(start..end)
    }
}

Index Caching

Keep frequently used indices in memory:

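use std::collections::HashMap;
use std::sync::Arc;

use lru::LruCache; // assumed to be the `lru` crate's cache type

struct IndexCache {
    // Parsed indices keyed by archive hash
    indices: HashMap<String, Arc<ArchiveIndex>>,
    // Tracks recency so the least recently used index can be evicted
    lru: LruCache<String, ()>,
}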

Archive Validation

Checksum Verification

When checksums are present:

fn verify_file(data: &[u8], expected_checksum: &[u8; 16]) -> bool {
    let computed = md5::compute(data);
    computed.0 == *expected_checksum
}

Size Validation

Always verify extracted size matches expected:

if decompressed.len() != expected_size as usize {
    return Err("Size mismatch");
}

Common Issues

Key collisions: Truncated keys may collide (handle gracefully)
Archive corruption: Verify checksums when available
Missing indices: Some files may only exist as loose files
Version mismatches: Handle different index versions
Alignment padding: Account for alignment bytes

Archive Groups

Archive Groups are client-generated mega-indices that combine multiple CDN archive indices into a single lookup structure, reducing search time from scanning hundreds of individual .index files to a single binary search. They use 6-byte offset fields (2-byte archive index + 4-byte offset) and are identified by archive-group and patch-archive-group fields in CDN config.

See Archive-Groups for the full format specification.

File Organization

Typical CASC repository structure:

data/
├── config/           # Configuration files
├── data/            # Archive files
│   ├── 00/
│   │   ├── 00/{hash}.archive
│   │   └── ...
│   └── ff/
│       └── ff/{hash}.archive
├── indices/         # Index files
│   ├── {hash}.index
│   └── ...
└── patch/           # Patch archives

Version History

CDN Archive Index Format (.index files)

The CDN Archive Index format currently has only one version:

Version 1 (Current)

Footer Size: 28 bytes
Location: End of file
Features:
- Variable-length encoding keys (footer’s ekey_length field)
- 4KB chunk-based structure with table of contents
- MD5 hash validation (footer hash and TOC hash)
- Self-referential naming (filename = MD5 of footer)
- Mixed endianness (element_count is little-endian, others big-endian)
- Typical entry size: 24 bytes (16-byte key + 4-byte size + 4-byte offset)

Version Detection

The version field is at offset 8 in the 28-byte footer. All known CDN archive indices use version 1.

Implementation Status

cascette-formats: Full support for version 1 with parser
Archive-groups: Client-side mega-indices combine multiple CDN indices (6-byte offset variant)

Local Storage Index Format (.idx files)

The Local Storage Index (IDX Journal) format currently has only one version:

Version 7 (Current - IDX Journal v7)

Header Size: 16 bytes
Location: Start of file
Features:
- Fixed 9-byte truncated content keys (space optimization)
- 18-byte entries (9-byte key + 5-byte location + 4-byte size)
- 256 bucket-based organization (0x00-0xFF)
- Packed 5-byte location field (10-bit archive ID + 30-bit offset)
- Jenkins hash validation
- Mixed endianness (header little-endian, entries mixed)
- Bucket algorithm: XOR first 9 bytes, then XOR nibbles
- Filename format: {bucket:02x}{version:06x}.idx
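The packed location field and the filename pattern can be illustrated as follows. This sketch assumes the 5 location bytes are read as one 40-bit big-endian value whose upper 10 bits are the archive ID and lower 30 bits the offset; the document only says entry endianness is mixed, so treat the byte order as an assumption. Function names are hypothetical.

```rust
/// Split a packed 5-byte location into (archive ID, offset).
/// ASSUMPTION: the field is treated as a 40-bit big-endian integer.
pub fn unpack_location(loc: [u8; 5]) -> (u16, u32) {
    let v = loc.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    let archive_id = (v >> 30) as u16; // upper 10 bits
    let offset = (v & 0x3FFF_FFFF) as u32; // lower 30 bits
    (archive_id, offset)
}

/// Format an .idx filename as {bucket:02x}{version:06x}.idx.
pub fn idx_file_name(bucket: u8, version: u32) -> String {
    format!("{:02x}{:06x}.idx", bucket, version)
}
```

A 10-bit archive ID tops out at 1023, which lines up with the `casc_supports_1023_archives` flag mentioned earlier.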

Version Detection

The version field is at offset 8 in the header (16-bit little-endian). The implementation validates version equals 7 and warns on unexpected versions.

Implementation Status

cascette-client-storage: Full support for version 7 with parser and builder
No earlier versions documented (version 7 is standard for modern CASC)

Key Differences Between Index Systems

Feature	CDN Index (.index)	Local Index (.idx)
Version	1 (footer-based)	7 (header-based)
Protocol	TACT (network)	CASC (local storage)
Key Type	Encoding keys	Content keys
Key Length	Variable (16 typical)	Fixed 9-byte truncated
Structure	Sequential chunks	Bucket algorithm
Validation	MD5 hash	Jenkins hash
Endianness	Mixed (mostly big)	Mixed (header little)
Entry Size	Variable (24 typical)	Fixed 18 bytes
Location	CDN download	Client Data/ directory
Crate	cascette-formats	cascette-client-storage

References

See Encoding Documentation for key lookup
See BLTE Format for archive content structure
See CDN Architecture for remote retrieval
See Format Transitions for format evolution tracking

Archive-Groups

Archive-groups are locally generated mega-indices that combine multiple CDN archive indices into a single unified lookup structure. They are created client-side by merging downloaded archive index files, never downloaded directly from the CDN. They are essential for Battle.net client compatibility and enable efficient content resolution.

Format Specification

Archive-groups use the same binary format as regular CDN archive indices with one critical difference:

Field	Regular Index	Archive-Group
Encoding Key	Variable (9-16 bytes)	16 bytes
Offset	4 bytes	6 bytes
Size	4 bytes	4 bytes

The 6-byte offset field contains:

Bytes 0-1: Archive index (big-endian u16)
Bytes 2-5: Offset within archive (big-endian u32)

Critical Findings - SOLVED

Archive Index Mapping Uses Hash-Based Assignment

CONFIRMED: ALL archive-groups use the full u16 range (0-65535) for archive indices:

archive_index = hash(encoding_key) % 65536

This explains why:

All archive-groups use indices 0-65535 despite only ~606 CDN archives existing
Archive 0 consistently receives 6-8% of entries (hash distribution)
The pattern is universal across all Battle.net installations
Archive-groups are generated locally using this deterministic hash-based assignment algorithm

CDN Configuration

Archive-groups are referenced in CDN config files by their hash:

archive-group = 6d08c5f69f6a2cf70a50cd40efdcd2fb
patch-archive-group = a5fb3ed088333348d93983d7e8693956

These hashes identify the locally generated archive-group files stored in Data/indices/. The client generates these files locally and stores them using the computed hash as the filename.

Size Characteristics

Archive-groups are significantly larger than regular indices:

Regular CDN indices: 4KB - 2MB
Archive-groups: 50MB - 150MB
Entry count: 2-5 million entries

Growth over time (WoW Classic):

Version 1.13.2: 54MB, 2.1M entries
Version 1.14.0: 73MB, 2.8M entries
Version 1.15.2: 126MB, 5.0M entries

Archive Index Distribution

Due to hash-based assignment, archive indices follow a predictable distribution:

Archive 0: ~6-8% of entries (150K-350K entries)
Archive 1: ~0.6% of entries (13K entries)
Archive 2-65535: Distributed based on hash function

This distribution is consistent across all Battle.net installations.

Implementation Requirements

Detection

fn is_archive_group(data: &[u8]) -> bool {
    if data.len() < 28 {
        return false;
    }
    // Check offset_bytes field at position -16 from end
    data[data.len() - 16] == 6
}

Parsing

// For archive-groups with 6-byte offsets
let archive_index = u16::from_be_bytes([data[pos], data[pos + 1]]);
let offset = u32::from_be_bytes([data[pos + 2], data[pos + 3], data[pos + 4], data[pos + 5]]);

Content Resolution

When resolving content in a Battle.net-compatible installation:

Look up encoding key in archive-group
Extract 2-byte archive index from entry
Map archive index to actual CDN archive (requires mapping table)
Read content from archive at specified offset

Implementation Strategy for Cascette

To achieve binary-identical Battle.net installations:

Required Actions

Generate Archive-Groups Locally
- Parse CDN config to find all individual archive index hashes
- Download all individual .index files from CDN
- Merge them locally into unified archive-group structures
- Store generated archive-groups in Data/indices/ using computed hash as filename
Implement Hash-Based Archive Assignment
- Use deterministic algorithm: archive_index = hash(encoding_key) % 65536
- Ensure identical results to Battle.net client generation
- Apply to all entries during archive-group creation
Implement Archive Index Mapping
- Create mapping table: archive_group_index -> actual_cdn_archive_hash
- The 65536 virtual indices map to ~606 actual CDN archives
- Use for content resolution when accessing actual archive data
Support Both Types
- Generate regular archive-group for main content from base archive indices
- Generate patch-archive-group for patch content from patch archive indices
- Both use same local generation process with 6-byte offsets

Why Binary-Identical Matters

For cascette to be a trustworthy Battle.net replacement:

Trust: Users need confidence we produce EXACTLY what Battle.net would
Compatibility: Some third-party tools may depend on exact format
Verification: Binary matching allows easy validation
Completeness: Understanding the full algorithm proves our analysis

Archive-groups are identified by the offset_bytes field in the footer:

Footer (28 bytes):
  [0:8]   TOC hash: MD5(toc_keys || block_hashes)[:footer_hash_bytes]
  [8]     Version (always 1)
  [9:11]  Reserved
  [11]    Page size in KB
  [12]    Offset bytes (4 for regular, 6 for archive-group)
  [13]    Size bytes (always 4)
  [14]    Key bytes (16 for archive-groups)
  [15]    Footer hash bytes
  [16:20] Entry count (little-endian u32)
  [20:28] Footer hash

Example Archive-Group Entry

Entry from 6d08c5f69f6a2cf70a50cd40efdcd2fb.index:
  Key: 000003bafc39011c91accae47b94fb2d (16 bytes)
  Archive: 0 (from first 2 bytes of offset field)
  Offset: 0x5dfd00d7 (from last 4 bytes of offset field)
  Size: 92,211,754 bytes

This entry indicates:

Content is in archive index 0
Starts at offset 0x5dfd00d7 in that archive
Compressed size is 92,211,754 bytes

Validation

Archive-groups contain entries for all game content:

Every encoding key should be findable
Archive indices use full u16 range (0-65535)
Entries are sorted by encoding key for binary search
Total entries match the entry_count in footer

Battle.net Client Behavior

The Battle.net client:

Downloads individual archive index files during installation
Generates archive-group locally by merging multiple archive indices
Stores generated archive-group in Data/indices/{hash}.index
Uses hash-based assignment algorithm for consistent archive index mapping
Uses archive-group for all subsequent content lookups

Common Issues

Incorrect Detection

Checking file size alone is insufficient
Must verify offset_bytes == 6 in footer
Some patch archives are large but not archive-groups

Index Mapping Confusion

Archive index in archive-group ≠ CDN archive position
Indices 0-65535 map to ~600 actual archives
Mapping requires modulo or lookup table

Parser Assumptions

Never hardcode 9-byte keys for archive-groups
Archive-groups always use 16-byte keys
Respect the key_bytes field in footer

References

Analysis of WoW Classic installations (1.13.2 through 1.15.2)
wowdev.wiki Archive documentation
Empirical testing with cascette-py parser

TVFS (TACT Virtual File System)

TVFS is the virtual file system introduced in WoW 8.2 (CASC v3), providing a unified interface for managing content across multiple products and build configurations. It replaces direct file path mappings with a more flexible namespace-based system.

How TVFS is Accessed

TVFS manifests are referenced through vfs-* fields in BuildConfig files:

BuildConfig contains vfs-root and numbered vfs-1 through vfs-N fields
Each VFS field contains two hashes: content key and encoding key
The encoding key (second hash) is used to fetch the TVFS manifest from CDN
The manifest is BLTE-encoded and must be decompressed
Once decoded, the manifest describes the virtual file system structure

Example from BuildConfig:

vfs-root = fd2ea24073fcf282cc2a5410c1d0baef 14d8c981bb49ed169e8558c1c4a9b5e5
vfs-root-size = 50071 33487

Modern builds contain 1,500+ VFS entries for different product/region/platform combinations.
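Splitting a `vfs-*` line into its two hashes is straightforward. The helper below is a sketch with a hypothetical name; it assumes the `name = ckey ekey` layout shown in the example and does not validate the hex digits.

```rust
/// Split a BuildConfig line of the form `name = <content key> <encoding key>`.
/// Returns (field name, content key, encoding key).
pub fn parse_key_pair(line: &str) -> Option<(&str, &str, &str)> {
    let (name, value) = line.split_once('=')?;
    let mut parts = value.split_whitespace();
    let ckey = parts.next()?;
    let ekey = parts.next()?;
    if parts.next().is_some() {
        return None; // more than two values: not a key pair
    }
    Some((name.trim(), ckey, ekey))
}
```

The third element of the returned tuple is the encoding key, which is the hash used to fetch the manifest from the CDN.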

Overview

TVFS organizes content into namespaces rather than per-build file trees. This allows multiple products and regions to share common assets through a single content-addressed storage layer, with deduplication across products.

Architecture

Namespace Hierarchy

TVFS Root
├── Product Namespace (e.g., "wow")
│   ├── Build Namespace (e.g., "1.15.7.61582")
│   │   ├── Root Files
│   │   └── Content Trees
│   └── Shared Namespace
│       └── Common Assets
└── Global Namespace
    └── Cross-Product Assets

File Structure

TVFS manifest is BLTE-encoded:

[BLTE Container]
  [Header]
  [Namespace Definitions]
  [Directory Entries]
  [File Entries]
  [Content Mappings]

Binary Format

Based on analysis of 5 TVFS samples from WoW builds 11.0.2.56313 through 11.2.0.62748.

TVFS Header

struct TvfsHeader {  // 38 bytes minimum, 46 with EST table
    uint8_t  magic[4];           // "TVFS" (0x54564653)
    uint8_t  format_version;     // Format version (1; agent accepts <= 1)
    uint8_t  header_size;        // Header size (not read by agent parser)
    uint8_t  ekey_size;          // EKey size (always 9)
    uint8_t  pkey_size;          // PKey size (always 9)
    uint32_t flags;              // Format flags (big-endian)
    uint32_t path_table_offset;  // Offset to path table (big-endian)
    uint32_t path_table_size;    // Size of path table (big-endian)
    uint32_t vfs_table_offset;   // Offset to VFS table (big-endian)
    uint32_t vfs_table_size;     // Size of VFS table (big-endian)
    uint32_t cft_table_offset;   // Offset to container file table (big-endian)
    uint32_t cft_table_size;     // Size of container file table (big-endian)
    uint16_t max_depth;          // Maximum path depth
    // Optional EST fields (only present if TVFS_FLAG_ENCODING_SPEC is set)
    uint32_t est_table_offset;   // Encoding spec table offset
    uint32_t est_table_size;     // Encoding spec table size
};

Verified Header Properties:

Magic bytes: Always “TVFS” (0x54564653) in ASCII
Format version: Always 1 across all samples
Header size: 38 bytes minimum, 46 with EST table
EKey size: 9 bytes (TACT standard)
PKey size: 9 bytes (TACT standard)
All multi-byte integer fields are big-endian (NGDP standard)

Format Flags (Implementation Details):

// TVFS format flags
const TVFS_FLAG_INCLUDE_CKEY: u32 = 0x01;      // Include content keys
const TVFS_FLAG_ENCODING_SPEC: u32 = 0x02;     // Encoding spec table (EST) present
const TVFS_FLAG_PATCH_SUPPORT: u32 = 0x04;     // Patch support enabled

Value 7 (0x7): Include C-key + Encoding spec + Patch support (all features)
EST Table Present: When bit 1 (0x02) is set. The agent checks flags & 2 for encoding specifier presence.
Header Size: 38 bytes minimum (without EST), 46 bytes with EST table fields
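To make the 38 versus 46 byte distinction concrete, here is a sketch of a header reader that only consumes the EST fields when the flag is set. The `TvfsHeader` type and `parse_tvfs_header` function are hypothetical and do not reflect the cascette-formats API; offsets follow the struct above.

```rust
pub const TVFS_FLAG_ENCODING_SPEC: u32 = 0x02;

/// Decoded TVFS header; each table is an (offset, size) pair.
#[derive(Debug)]
pub struct TvfsHeader {
    pub format_version: u8,
    pub ekey_size: u8,
    pub pkey_size: u8,
    pub flags: u32,
    pub path_table: (u32, u32),
    pub vfs_table: (u32, u32),
    pub cft_table: (u32, u32),
    pub max_depth: u16,
    pub est_table: Option<(u32, u32)>,
}

/// Read a big-endian u32 at `at`, or None if out of bounds.
fn be32(d: &[u8], at: usize) -> Option<u32> {
    let b = d.get(at..at + 4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

pub fn parse_tvfs_header(d: &[u8]) -> Option<TvfsHeader> {
    if d.len() < 38 || &d[0..4] != b"TVFS" {
        return None;
    }
    let format_version = d[4];
    if format_version > 1 {
        return None; // the agent accepts versions <= 1
    }
    let flags = be32(d, 8)?;

    // The EST fields exist only when the encoding spec flag is set,
    // which extends the header from 38 to 46 bytes.
    let est_table = if flags & TVFS_FLAG_ENCODING_SPEC != 0 {
        Some((be32(d, 38)?, be32(d, 42)?))
    } else {
        None
    };

    Some(TvfsHeader {
        format_version,
        ekey_size: d[6],
        pkey_size: d[7],
        flags,
        path_table: (be32(d, 12)?, be32(d, 16)?),
        vfs_table: (be32(d, 20)?, be32(d, 24)?),
        cft_table: (be32(d, 28)?, be32(d, 32)?),
        max_depth: u16::from_be_bytes([d[36], d[37]]),
        est_table,
    })
}
```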

Sample Analysis Results:

File sizes: 49,896 - 50,844 bytes (decompressed)
All files use identical header format
Table offsets and sizes are consistent with file structure
Two retail builds (11.2.0.62706 and 11.2.0.62748) are byte-identical

Table Structure

Path Table (PathTableOffset + PathTableSize):

Recursive prefix tree (trie) encoding file paths. Each entry has:

Optional 0x00 path separator bytes (before/after name fragments)
Length-prefixed name fragment (1-byte length + N bytes)
0xFF marker followed by 4-byte big-endian NodeValue:
- Bit 31 set: folder node, lower 31 bits = folder data length (includes the 4-byte NodeValue). Children are inline within that byte range.
- Bit 31 clear: file node, value = byte offset into the VFS table.

Maximum depth is tracked in the header.
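The NodeValue encoding described above can be decoded with a single bit test. The enum and function names in this sketch are illustrative.

```rust
/// Meaning of the 4-byte value that follows a 0xFF marker in the path table.
#[derive(Debug, PartialEq)]
pub enum NodeValue {
    /// Folder: length of the folder data, including the 4-byte NodeValue itself.
    Folder { data_len: u32 },
    /// File: byte offset into the VFS table.
    File { vfs_offset: u32 },
}

/// Interpret a raw NodeValue: bit 31 distinguishes folders from files.
pub fn decode_node_value(raw: u32) -> NodeValue {
    if raw & 0x8000_0000 != 0 {
        NodeValue::Folder { data_len: raw & 0x7FFF_FFFF }
    } else {
        NodeValue::File { vfs_offset: raw }
    }
}

/// Read a 0xFF marker plus big-endian NodeValue from the start of `bytes`.
pub fn read_node_value(bytes: &[u8]) -> Option<NodeValue> {
    let b = bytes.get(..5)?;
    if b[0] != 0xFF {
        return None;
    }
    Some(decode_node_value(u32::from_be_bytes([b[1], b[2], b[3], b[4]])))
}
```

A full path table walker would recurse into the byte range given by `data_len` for folders and record the accumulated path for files.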

VFS Table (VfsTableOffset + VfsTableSize):

Span-based entries addressed by byte offset from path table NodeValues. Each entry has:

span_count (1 byte): 1-224 = file entry, 225-254 = other, 255 = deleted
Per span (repeated span_count times):
- file_offset (4 bytes BE): offset within the referenced content
- span_length (4 bytes BE): content size of this span
- cft_offset (CftOffsSize bytes BE): byte offset into the CFT

CftOffsSize is computed from cft_table_size using GetOffsetFieldSize: >0xFFFFFF = 4 bytes, >0xFFFF = 3 bytes, >0xFF = 2 bytes, else 1 byte.
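The width rule and the span layout can be sketched together. `offset_field_size` mirrors the thresholds given for GetOffsetFieldSize; `VfsSpan` and `parse_vfs_entry` are hypothetical names, and the sketch only handles file entries (span_count 1 to 224).

```rust
/// Number of bytes needed for offsets into a table of `table_size` bytes.
pub fn offset_field_size(table_size: u32) -> usize {
    if table_size > 0xFF_FFFF {
        4
    } else if table_size > 0xFFFF {
        3
    } else if table_size > 0xFF {
        2
    } else {
        1
    }
}

/// One span of a VFS file entry (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub struct VfsSpan {
    pub file_offset: u32,
    pub span_length: u32,
    pub cft_offset: u32,
}

/// Read a big-endian integer of 1 to 4 bytes.
fn read_be(bytes: &[u8]) -> u32 {
    bytes.iter().fold(0u32, |acc, &b| (acc << 8) | u32::from(b))
}

/// Parse the file entry at byte offset `at` in the VFS table.
/// Returns None for non-file entries or truncated data.
pub fn parse_vfs_entry(vfs_table: &[u8], at: usize, cft_table_size: u32) -> Option<Vec<VfsSpan>> {
    let count = *vfs_table.get(at)? as usize;
    if !(1..=224).contains(&count) {
        return None;
    }
    let span_len = 8 + offset_field_size(cft_table_size);
    let mut spans = Vec::with_capacity(count);
    let mut pos = at + 1;
    for _ in 0..count {
        let raw = vfs_table.get(pos..pos + span_len)?;
        spans.push(VfsSpan {
            file_offset: read_be(&raw[0..4]),
            span_length: read_be(&raw[4..8]),
            cft_offset: read_be(&raw[8..]),
        });
        pos += span_len;
    }
    Some(spans)
}
```

With a container file table of 29,645 bytes, as in the sample build listed in this document, each cft_offset occupies 2 bytes and each span 10 bytes.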

Container File Table (CftTableOffset + CftTableSize):

Fixed-stride entries addressed by byte offset from VFS span cft_offset values. Entry layout depends on header flags:

EKey (ekey_size bytes): encoding key
EncodedSize (4 bytes BE): encoded (compressed) size
CKey (pkey_size bytes): content key (if TVFS_FLAG_INCLUDE_CKEY)
est_index (EstOffsSize bytes BE): EST entry index (if TVFS_FLAG_ENCODING_SPEC)
patch_offset (CftOffsSize bytes BE): patch entry offset (if TVFS_FLAG_PATCH_SUPPORT)

EstOffsSize is computed from est_table_size using the same GetOffsetFieldSize function as CftOffsSize.
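Because the optional fields depend on header flags, the entry stride has to be computed before indexing into the table. A sketch, assuming the field list above is complete and using hypothetical function names:

```rust
const TVFS_FLAG_INCLUDE_CKEY: u32 = 0x01;
const TVFS_FLAG_ENCODING_SPEC: u32 = 0x02;
const TVFS_FLAG_PATCH_SUPPORT: u32 = 0x04;

/// Same thresholds as GetOffsetFieldSize described above.
fn offset_field_size(table_size: u32) -> usize {
    match table_size {
        s if s > 0xFF_FFFF => 4,
        s if s > 0xFFFF => 3,
        s if s > 0xFF => 2,
        _ => 1,
    }
}

/// Size in bytes of one container file table entry for the given header values.
pub fn cft_entry_size(
    flags: u32,
    ekey_size: u8,
    pkey_size: u8,
    cft_table_size: u32,
    est_table_size: u32,
) -> usize {
    // EKey and the 4-byte encoded size are always present.
    let mut size = ekey_size as usize + 4;
    if flags & TVFS_FLAG_INCLUDE_CKEY != 0 {
        size += pkey_size as usize;
    }
    if flags & TVFS_FLAG_ENCODING_SPEC != 0 {
        size += offset_field_size(est_table_size);
    }
    if flags & TVFS_FLAG_PATCH_SUPPORT != 0 {
        size += offset_field_size(cft_table_size);
    }
    size
}
```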

Encoding Specifier Table (EST) (Optional, if encoding spec flag is set):

Contains null-terminated encoding spec strings (same format as the ESpec table in the encoding file)
Only present if flag bit 1 (0x02) is set
Required for writing files to underlying storage
Parsed from est_table_offset for est_table_size bytes
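Since the EST is a run of null-terminated strings, splitting on zero bytes recovers the specs. This sketch assumes the strings are ASCII/UTF-8 and skips empty segments; the function name is hypothetical.

```rust
/// Split an encoding specifier table into its null-terminated strings.
/// `table` is the slice at est_table_offset of length est_table_size.
pub fn parse_est(table: &[u8]) -> Vec<String> {
    table
        .split(|&b| b == 0)
        .filter(|s| !s.is_empty())
        .map(|s| String::from_utf8_lossy(s).into_owned())
        .collect()
}
```

The position of a string in the returned vector would then correspond to its order in the table, which is how an `est_index` from a CFT entry could be resolved under this assumption.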

Sample Table Sizes (Build 11.2.0.62748):

Path Table:      Offset 46,     Size 11,814 bytes
VFS Table:       Offset 41,527, Size 9,317 bytes
Container Table: Offset 11,882, Size 29,645 bytes

Format Analysis Status

Verified against CascLib and CDN data (WoW Retail, Classic, Classic Era):

Header format, magic bytes, flags, and table offsets
Path table recursive prefix tree with 0xFF NodeValue markers
VFS span-based entries with variable-width CFT offsets
CFT fixed-stride entries with flag-dependent fields
EST null-terminated encoding spec strings
Round-trip parse/build produces structurally equivalent output

Usage

Parsing a TVFS Manifest

use cascette_formats::tvfs::TvfsFile;

// From decompressed data
let tvfs = TvfsFile::parse(&data)?;

// From BLTE-encoded CDN data
let tvfs = TvfsFile::load_from_blte(&blte_data)?;

Enumerating Files

#![allow(unused)]
fn main() {
// All file entries from the path table
for file in &tvfs.path_table.files {
    println!("{} -> VFS offset {}", file.path, file.vfs_offset);
}

// With VFS entry details
for (file, vfs_entry) in tvfs.enumerate_files() {
    if let Some(entry) = vfs_entry {
        for span in &entry.spans {
            println!("{}: offset={}, length={}, cft_offset={}",
                file.path, span.file_offset, span.span_length, span.cft_offset);
        }
    }
}
}

Resolving a Path

#![allow(unused)]
fn main() {
// Resolve path -> VFS entry -> CFT entry (EKey)
if let Some(container_entry) = tvfs.resolve_path("path/to/file") {
    println!("EKey: {}", container_entry.ekey_hex());
    if let Some(ckey) = container_entry.content_key_hex() {
        println!("CKey: {}", ckey);
    }
}
}

Building a TVFS Manifest

#![allow(unused)]
fn main() {
use cascette_formats::tvfs::TvfsBuilder;

let mut builder = TvfsBuilder::with_flags(0x07); // CKEY + EST + PATCH
builder.add_est_spec("b:256K*=z".to_string());
builder.add_file(
    "path/to/file".to_string(),
    [0x01; 9],   // ekey
    1024,         // encoded_size
    2048,         // content_size
    Some([0x02; 16]), // content_key
);
let data = builder.build()?;
}

References

See Root File for legacy file mapping
See Encoding Documentation for content resolution
See Archives for storage details

NGDP Configuration File Formats

This document describes the configuration file formats used in NGDP for managing product versions, CDN endpoints, and content distribution.

Overview

NGDP uses five configuration file types:

Build Configuration - Defines build metadata and system file references
CDN Configuration - Lists CDN servers and available archives
Patch Configuration - Contains delta update information
Keyring Configuration - Encryption keys for Salsa20 decryption
Product Configuration - Client installation and platform metadata

Configuration File Access

Configuration files are accessed through CDN endpoints using content-addressed paths derived from hashes returned by the Ribbit API.

Path Structure

Configuration files use a two-level directory structure for efficient CDN distribution:

http://<cdn-host>/<path>/<type>/<hash[0:2]>/<hash[2:4]>/<full-hash>

Where:

<cdn-host>: CDN server hostname
<path>: Base path from CDN response (e.g., tpr/wow)
<type>: Content type (config, data, patch)
<hash[0:2]>: First 2 characters of hash
<hash[2:4]>: Characters 3-4 of hash (positions 2-3 in 0-indexed)
<full-hash>: Complete hash value

Example:

# Build config for wow_classic_era 1.15.7.61582
# Hash: ae66faee0ac786fdd7d8b4cf90a8d5b9
# Note: hash[0:2] = "ae", hash[2:4] = "66"
http://cdn.arctium.tools/tpr/wow/config/ae/66/ae66faee0ac786fdd7d8b4cf90a8d5b9
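
The path structure can be sketched as a small helper. The function name `cdn_url` is illustrative; the host, base path, and hash in the usage note are the ones from the example above.

```rust
/// Builds a CDN URL of the form
/// http://<cdn-host>/<path>/<type>/<hash[0:2]>/<hash[2:4]>/<full-hash>.
/// Returns None if the hash is too short or contains non-hex characters.
fn cdn_url(host: &str, path: &str, kind: &str, hash: &str) -> Option<String> {
    if hash.len() < 4 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(format!(
        "http://{}/{}/{}/{}/{}/{}",
        host,
        path,
        kind,
        &hash[0..2],
        &hash[2..4],
        hash
    ))
}
```

Calling `cdn_url("cdn.arctium.tools", "tpr/wow", "config", "ae66faee0ac786fdd7d8b4cf90a8d5b9")` reproduces the build config URL shown in the example.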

Build Configuration

Build configurations define build-specific metadata and reference all system files required for a build.

Format

Key-value pairs, one per line, with = delimiter (space-equals-space). Values may contain multiple space-separated tokens (e.g., content key + encoding key).
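
A minimal sketch of a parser for this format is shown below. It is not the cascette-formats implementation; the function name `parse_config` and the choice to store every value as a list of tokens are assumptions made for illustration. It splits at the first `=` only, so ESpec values that themselves contain `=` stay intact.

```rust
use std::collections::HashMap;

/// Parses "key = value" lines into a map of key -> space-separated tokens.
/// Blank lines and lines starting with '#' are skipped.
fn parse_config(text: &str) -> HashMap<String, Vec<String>> {
    let mut map = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split at the first '=' so later '=' characters remain part of the value.
        if let Some((key, value)) = line.split_once('=') {
            let tokens = value.split_whitespace().map(str::to_string).collect();
            map.insert(key.trim().to_string(), tokens);
        }
    }
    map
}
```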

Common Keys

Key	Description	Example
`root`	Root file content key (NOT for direct CDN fetch)	`ea8aefdebdbd6429da905c8c6a2b1813`
`install`	Install manifest: content key + encoding key	`54c189d60033f93f42e7b91165e7de1c a9dcee49ab3f952d69441eb3fd91c159`
`encoding`	Encoding file: content key + encoding key (use 2nd for CDN)	`b07b881f4527bda7cf8a1a2f99e8622e bbf06e7476382cfaa396cff0049d356b`
`encoding-size`	Sizes for encoding file versions	`14004322 14003043`
`download`	Download manifest: content key + encoding key	`42a7bb33cd1e9a7b72bef6ee14719b58 53ba96f0965adc306d2d0cf3b457949c`
`size`	Size manifest: content key + encoding key	`d1d9e612a645cc7a7e4b42628bde21ce 0d5704735f4985e555907a7e7647099a`
`patch`	Patch file content key	`658506593cf1f98a1d9300c418ee5355`
`patch-config`	Patch configuration hash (fetch separately)	`17f5bbcb7eae2fc8fb3ea545c65f74d4`
`patch-index`	Patch index files	`3806f4c7b1f179ce976d7685f9354025 eb5758bd78805f0aabac15cf44ea767c`
`patch-size`	Size of patch file	`22837`
`build-name`	Human-readable build identifier	`WOW-55646patch1.15.3_ClassicRetail`
`build-uid`	Unique build identifier	`wow_classic_era`
`build-product`	Product identifier	`WoW`
`build-playbuild-installer`	Installer build number	`ngdp:wow_classic_era:55646`
`build-partial-priority`	Partial download priorities	Space-separated list
`build-num`	Build number	`61582`
`build-num-retail`	Retail build number	`61582`
`build-attributes`	Build attribute metadata	Attribute string
`build-file-db`	File database for containerless builds	Hash value
`build-file-db-size`	Size of file database	Size in bytes
`client-version`	Client version string	Version string
`feature-placeholder`	Feature placeholder flag	`true` or absent
`feature-use-hardlinks`	Enable hard link support	`true` or absent
`no-frame-encoding`	Disable frame encoding (sets v3.0.0)	`true` or absent
`vfs-root-espec`	ESpec for VFS root manifest	ESpec string
`install-high-ver`	High-version install manifest hash	Hash value
`install-high-ver-size`	Size of high-version install	Size in bytes
`key-layout-index-bits`	Static key layout index bits	Numeric value

VFS (Virtual File System) Keys

Modern WoW builds (8.2+) include VFS fields that reference TVFS (TACT Virtual File System) manifests:

Key	Description	Example
`vfs-root`	Main TVFS manifest: content key + encoding key	`fd2ea24073fcf282cc2a5410c1d0baef 14d8c981bb49ed169e8558c1c4a9b5e5`
`vfs-root-size`	Sizes for TVFS root manifest	`50071 33487`
`vfs-1` through `vfs-N`	Additional TVFS manifests for different products/regions	Same format as vfs-root
`vfs-N-size`	Size for corresponding VFS manifest	Same format as vfs-root-size
`vfs-N-espec`	Encoding spec for corresponding VFS manifest	ESpec string

Important: Each vfs-* field points to a TVFS manifest file that contains the virtual file system structure. These manifests are BLTE-encoded and fetched using the encoding key (second hash). See TVFS documentation for manifest format details.

Modern builds can have 1,500+ VFS fields representing different:

Product variants (retail, PTR, beta)
Language/region combinations
Platform-specific configurations
Feature flags and optional content

Example

# Build Configuration for wow_classic_era 1.15.7.61582
# URL: http://cdn.arctium.tools/tpr/wow/config/ae/66/ae66faee0ac786fdd7d8b4cf90a8d5b9
root = ea8aefdebdbd6429da905c8c6a2b1813
install = 54c189d60033f93f42e7b91165e7de1c a9dcee49ab3f952d69441eb3fd91c159
install-size = 23038 22281
download = 42a7bb33cd1e9a7b72bef6ee14719b58 53ba96f0965adc306d2d0cf3b457949c
download-size = 5606744 4818287
size = d1d9e612a645cc7a7e4b42628bde21ce 0d5704735f4985e555907a7e7647099a
size-size = 3637629 3076687
encoding = b07b881f4527bda7cf8a1a2f99e8622e bbf06e7476382cfaa396cff0049d356b
encoding-size = 14004322 14003043
patch-index = 5472ee24b5b9d148acfd2a436fc514be 76ce88ecb704dc93849def9fb489a6fb
patch-index-size = 16783 6591
patch = 4f185b4a837d4a363b2490432aaef092
patch-size = 11017
patch-config = 474b9630df5b46df5d98ec27c5f78d07
build-name = WOW-61582patch1.15.7_ClassicRetail
build-uid = wow_classic_era
build-product = WoW
build-playbuild-installer = ngdptool_casc2

Critical Implementation Note

ENCODING KEY VS CONTENT KEY:

Most build config entries have TWO hashes: <content-key> <encoding-key>
The content key (first hash) is the unencoded file identifier
The encoding key (second hash) is what you use for CDN fetches
EXCEPTION: The encoding file itself can be fetched directly using its encoding key

File Fetch Process:

Fetch encoding file using its encoding key: bbf06e7476382cfaa396cff0049d356b
Parse encoding file to find encoding keys for other files
Use those encoding keys to fetch files from CDN
The root file CANNOT be fetched using ea8aefdebdbd6429da905c8c6a2b1813 directly
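
The distinction can be made explicit when reading a build config value. The sketch below uses hypothetical names (`FetchKey`, `fetch_key`) and models only the rule described here: a second hash is an encoding key usable for a CDN fetch, while a lone content key such as `root` must first be resolved through the encoding file. Other single-hash fields (for example `patch`) are addressed differently and are out of scope for this sketch.

```rust
#[derive(Debug, PartialEq)]
enum FetchKey<'a> {
    /// Second hash present: usable directly for a CDN fetch.
    Encoding(&'a str),
    /// Only a content key: resolve it through the encoding file first.
    NeedsLookup(&'a str),
}

/// Classifies a build config value such as "<content-key> <encoding-key>".
fn fetch_key(value: &str) -> Option<FetchKey<'_>> {
    let mut tokens = value.split_whitespace();
    let ckey = tokens.next()?;
    Some(match tokens.next() {
        Some(ekey) => FetchKey::Encoding(ekey),
        None => FetchKey::NeedsLookup(ckey),
    })
}
```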

Notes

Paired values in the *-size fields (e.g. encoding-size) give the content (decoded) size followed by the encoded size, matching the content key + encoding key pair
Patch-config reference enables delta updates between builds
Build-partial-priority lists files for streaming installation

Static Key Layouts

Build configs can contain key-layout-<number> entries that define static data layout schemes. Each key layout has sub-fields:

Chunk Bits: Number of bits for chunk addressing
Archive Bits: Number of bits for archive addressing
Offset Bits: Number of bits for offset addressing
Alignment: Data alignment requirement

The key-layout-index-bits field in the build config specifies the number of index bits for the static key layout system.

Chunk System

Build configs can reference chunk-<number> entries. Chunks are associated with archives and use a bits-based addressing system. The agent validates that chunk identifiers follow the chunk-<number> naming pattern.

Hard Link Entries

Build configs can contain hard link entries. The agent validates the format of these entries and uses them for storage optimization on file systems that support hard links.

Manifest Validation

The agent validates that each manifest type (download, install, size, encoding) has matching C-Key/C-Size and E-Key/E-Size pairs. If a size field is specified, the corresponding key must also be present.
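
A rough sketch of this check is shown below, assuming the config has already been parsed into a map from key to space-separated tokens. The function name and the error strings are hypothetical; the agent's exact rules are not reproduced here.

```rust
use std::collections::HashMap;

/// Checks that every "<name>-size" field has a matching "<name>" field
/// with the same number of values (C-Key/C-Size and E-Key/E-Size pairs).
fn validate_manifest_sizes(cfg: &HashMap<String, Vec<String>>) -> Result<(), String> {
    for name in ["download", "install", "size", "encoding"] {
        let size_key = format!("{}-size", name);
        if let Some(sizes) = cfg.get(&size_key) {
            let keys = cfg
                .get(name)
                .ok_or_else(|| format!("{} is present but {} is missing", size_key, name))?;
            if keys.len() != sizes.len() {
                return Err(format!(
                    "{} has {} keys but {} has {} sizes",
                    name,
                    keys.len(),
                    size_key,
                    sizes.len()
                ));
            }
        }
    }
    Ok(())
}
```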

CDN Configuration

CDN configurations list available CDN servers and archive files.

CDN Configuration Format

Key-value pairs with special handling for multi-value keys.

Keys

Key	Description	Format
`archives`	List of archive hashes	Space-separated
`archive-group`	Group identifier for archives	Single hash
`patch-archives`	List of patch archive hashes	Space-separated
`patch-archive-group`	Group identifier for patch archives	Single hash
`file-index`	File index hash	Single hash
`file-index-size`	Size of file index	Integer
`patch-file-index`	Patch file index hash	Single hash
`patch-file-index-size`	Size of patch file index	Integer
`archives-index-size`	Sizes of archive index files	Space-separated integers
`archive-group-index-size`	Size of archive group index	Integer
`patch-archives-index-size`	Sizes of patch archive index files	Space-separated integers
`patch-archive-group-index-size`	Size of patch archive group index	Integer
`builds`	Reference to builds using this CDN config	Space-separated

CDN Configuration Example

# CDN Configuration for wow_classic_era 1.15.7.61582
# URL: http://cdn.arctium.tools/tpr/wow/config/63/ee/63eee50d456a6ddf3b630957c024dda0
# (Showing first 10 archives of 1000+)
archives = 0017a402f556fbece46c38dc431a2c9b 003b147730a109e3a480d32a54280955 \
  00b79cc0eebdd26437c7e92e57ac7f5c 00e43d6a55fe497ebaecece75c464913 \
  00f71443fef647344027dd37beda651f 0105f03cb8b8faceda8ea099c2f2f476 \
  0128ec2c42df9e7ac7b58a54ad902147 01794f476dce0d0adeb975eaff4ff850 \
  01df479cca2ad2a8991bac020db5287e 01f0908f6ece2f26d918d1665f919222
archive-group = 58a3c9e02c964b0ec9dd6c085df99a77
patch-archives = 01c87e5f5e87ffc088c3fe20a7e332ce 0239bc973b31a4e52e8c96652a14b9e0 \
  034e2e6e0e5cdecb0f0bc07e87f0e074 04f8e6c8cbfbd6e9fd3e9ccbcd95e53a \
  0662e1cf69dbd0c6c10e7e3e6303b8cf 0bffd45f01e8ad33731f973bb96f3db1 \
  0d17c61fa98e6db91e14e0b24c8bc9f9 0d47f019c36e88c00fc43b3fe973f3d1 \
  101e4f7b592c12bf3c436d3b95e38b8f 1027ab37f63c039a8a3dd8a039e43e81
patch-archive-group = de09c9cd5f93c4e4f6f1f0f4a8edb9c0
file-index = fb37bc7303bae99d6c57e96a079e2c77
file-index-size = 34236152
patch-file-index = eb99f93d5c8dbdbb652f1d71da9c7de6
patch-file-index-size = 5015068
builds = ae66faee0ac786fdd7d8b4cf90a8d5b9

Archive Management

Archives are immutable once created
New content creates new archives
Archive-group combines multiple archives for efficient access
File-index provides fast lookups across all archives

Patch Configuration

Patch configurations define delta updates between builds. They are referenced within build configurations using the patch-config field and contain detailed patch entry definitions.

Access Pattern

Patch configs are accessed through:

Fetch build config
Extract patch-config hash from build config
Fetch patch config using standard config path structure

Patch Configuration Format

Text format with metadata and multiple patch-entry lines.

Patch Entry Format

patch-entry = <type> <content-key> <size> <encoding-key> <encoded-size> \
  [compression-info] [additional-keys...]

Fields

Field	Description
`type`	File type (encoding, install, download, size, vfs:*)
`content-key`	Target content key
`size`	Target file size
`encoding-key`	Encoded version key
`encoded-size`	Encoded file size
`compression-info`	Compression blocks (e.g., `b:{11=n,4813402=n,793331=z}`)
`additional-keys`	Alternative encoding keys and sizes

The agent validates patch-entry fields including target ESpec validation. Patch config parsing uses structured per-entry validation.

Patch Configuration Example

# Patch Configuration for wow_classic 1.13.7.38631
# URL: http://cdn.arctium.tools/tpr/wow/config/17/f5/17f5bbcb7eae2fc8fb3ea545c65f74d4
# (Showing metadata and sample entries)

# Patch Configuration

patch = 658506593cf1f98a1d9300c418ee5355
patch-size = 22837

patch-entry = download 6d616efdfd334916898276805f043927 6113132 \
  64332f9899b6d42a939fa3e02080bf33 5528795 b:{16=n,5524659=n,588457=z} \
  0a45352357be8ddca09749ec421bbb48 6112126 50ac209d796a11818da1429d6cb69c60 \
  12502
patch-entry = encoding fcf166e21580ee48497b4d85e433b900 13084283 \
  716906f960db61ea62f07f7e9697127d 13082541 \
  b:{22=n,2574=z,61216=n,7835648=n,40192=n,5144576=n,*=z} \
  5905362dbda48cebbea7c80d05ef6c60 13084283 ce2c3294ca7e37aa3be1f227bdc9072a \
  89156
patch-entry = install 179088c6b3495b1a9dec3715e77834e1 15565 \
  a75d4aa7e38dff6a1ddc59bd80c2ad3c 15197 b:{610=z,14955=n} \
  f66d038c20f580be307f4645c7b5d3f2 15633 072a9339d594a00c884ffea987381883 486
patch-entry = size 5841844a1a1ad48eaeb756c716869bf5 3248493 \
  d06fc7a7e4b5d8fb138a2ee27f54674f 2878957 b:{15=n,588457=z,64K*=n} \
  2061f6427c842d01d9445d1bcc58d65b 3247949 daccd8bf9f2719ea9dbbb57991a03ed7 \
  452303

Compression Info Format

The b:{...} notation describes block compression:

n = uncompressed block
z = zlib compressed block
Numbers are block sizes in bytes (a K suffix multiplies by 1024, as in 64K)
* = all remaining blocks
64K* = 64KB blocks
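
The notation can be parsed with a small sketch like the one below. It handles only the single-letter methods `n` and `z` listed above and would reject nested or parameterized specs. All type and function names are hypothetical, and reading `64K*` as "64 KB blocks repeated to the end of the data" is an interpretation of the list above rather than a verified rule.

```rust
#[derive(Debug, PartialEq)]
enum Repeat {
    Once,
    Times(u32),
    UntilEnd,
}

#[derive(Debug, PartialEq)]
struct Block {
    size: Option<u64>, // None for a bare "*"
    repeat: Repeat,
    method: char, // 'n' or 'z'
}

/// Parses "16", "64K" or "1M" into a byte count.
fn parse_size(s: &str) -> Option<u64> {
    let (digits, multiplier) = match s.bytes().last()? {
        b'K' => (&s[..s.len() - 1], 1024u64),
        b'M' => (&s[..s.len() - 1], 1024 * 1024),
        _ => (s, 1),
    };
    digits.parse::<u64>().ok()?.checked_mul(multiplier)
}

/// Parses one "size=method" chunk, e.g. "16=n", "64K*=n" or "*=z".
fn parse_chunk(chunk: &str) -> Option<Block> {
    let (lhs, method) = chunk.split_once('=')?;
    let method = match method {
        "n" => 'n',
        "z" => 'z',
        _ => return None,
    };
    if lhs == "*" {
        return Some(Block { size: None, repeat: Repeat::UntilEnd, method });
    }
    let (size, repeat) = match lhs.split_once('*') {
        Some((size, "")) => (size, Repeat::UntilEnd),
        Some((size, count)) => (size, Repeat::Times(count.parse().ok()?)),
        None => (lhs, Repeat::Once),
    };
    Some(Block { size: Some(parse_size(size)?), repeat, method })
}

/// Parses a block spec such as "b:{15=n,588457=z,64K*=n}" or "b:256K*=z".
fn parse_block_spec(spec: &str) -> Option<Vec<Block>> {
    let inner = spec.strip_prefix("b:")?;
    let inner = inner
        .strip_prefix('{')
        .and_then(|s| s.strip_suffix('}'))
        .unwrap_or(inner);
    inner.split(',').map(parse_chunk).collect()
}
```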

Entry Types

Patch configs commonly include:

System files: download, encoding, install, size, patch-index
VFS entries: vfs:* with hexadecimal identifiers (e.g., vfs:000000040000::)
Metadata: patch and patch-size fields for the patch file itself

Availability

Patch configs are found in:

Classic WoW builds (1.13.x through 5.5.x)
Older retail builds (pre-8.0)
Rarely in modern builds (mostly replaced by direct patching)

Keyring Configuration

Keyring configurations contain encryption keys for decrypting protected CASC content. Each entry maps an 8-byte key ID to a 16-byte Salsa20 encryption key.

Discovery

Keyring config hashes are in the Ribbit versions response KeyRing column, NOT in build configs. The config is fetched from CDN using the standard config path structure.

Format

Same key-value format as other configs, with = delimiter. Each entry uses the key- prefix followed by a hex-encoded key ID.

key-{KEY_ID_HEX} = {KEY_VALUE_HEX}

Where:

KEY_ID_HEX: 16 hex characters (8 bytes) identifying the encryption key
KEY_VALUE_HEX: 32 hex characters (16 bytes) Salsa20 encryption key

Example

key-4eb4869f95f23b53 = c9316739348dcc033aa8112f9a3acf5d

Validation

Agent.exe (tact::ConfigReader::ValidateKeyringConfig at 0x6e7020) requires at least one key entry. Duplicate key IDs with different values produce a warning and the duplicate is ignored (first entry wins).
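
The format and validation rules can be sketched as follows. This is not a reconstruction of the Agent.exe code: the function names are hypothetical, the warning for conflicting duplicates is omitted, and the byte order used when turning the hex key ID into bytes is an assumption that should be checked against real BLTE headers.

```rust
use std::collections::HashMap;

/// Decodes exactly N bytes from a hex string of length 2 * N.
fn decode_hex<const N: usize>(s: &str) -> Option<[u8; N]> {
    if s.len() != N * 2 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; N];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

/// Parses "key-{KEY_ID_HEX} = {KEY_VALUE_HEX}" lines into a key ID -> key map.
/// The first entry for a key ID wins; an empty keyring is an error.
fn parse_keyring(text: &str) -> Result<HashMap<[u8; 8], [u8; 16]>, String> {
    let mut keys = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let (name, value) = match line.split_once('=') {
            Some(pair) => pair,
            None => continue,
        };
        let id_hex = match name.trim().strip_prefix("key-") {
            Some(hex) => hex,
            None => continue, // not a key entry
        };
        let id: [u8; 8] =
            decode_hex(id_hex).ok_or_else(|| format!("bad key ID: {}", id_hex))?;
        let key: [u8; 16] = decode_hex(value.trim())
            .ok_or_else(|| format!("bad key value for key-{}", id_hex))?;
        // Duplicates are ignored so that the first entry wins.
        keys.entry(id).or_insert(key);
    }
    if keys.is_empty() {
        return Err("keyring contains no key entries".to_string());
    }
    Ok(keys)
}
```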

Usage

Keys are loaded into a hash map by tact::KeyGetter::LoadKeyring. During BLTE decryption, the 8-byte key ID from the encrypted block header is used to look up the 16-byte Salsa20 decryption key.

Distribution

Keyring sizes vary by product:

WoW Retail: 1 key entry
Call of Duty (Odin): 1 key entry
Overwatch 2: 63 key entries (largest observed)
WoW Classic products: empty keyrings (no KeyRing column in versions response)

Product Configuration

Product configurations contain Battle.net client metadata for installation and platform requirements.

Note: Product config hashes are present in Ribbit/Wago data, and the actual config files are accessible via CDN using the /tpr/configs/data/ path structure as demonstrated in the examples below.

Product Configuration Format

JSON object with nested configuration sections.

Structure

{
  "all": {
    "config": {
      // Global configuration
    }
  },
  "platform": {
    "win": { /* Windows-specific */ },
    "mac": { /* macOS-specific */ }
  },
  "<locale>": {
    "config": {
      // Locale-specific configuration
    }
  }
}

Product Configuration Example

// Product Configuration for WoW 11.2.0.62748
// URL: http://cdn.arctium.tools/tpr/configs/data/53/02/53020d32e1a25648c8e1eafd5771935f
{
  "all": {
    "config": {
      "product": "WoW",
      "update_method": "ngdp",
      "data_dir": "Data/",
      "supports_multibox": true,
      "supports_offline": false,
      "supported_locales": ["enUS", "esMX", "ptBR", "deDE", "esES", "frFR"],
      "display_locales": ["enUS", "esMX", "ptBR", "frFR", "deDE", "esES"],
      "shared_container_default_subfolder": "_retail_",
      "enable_block_copy_patch": true
    }
  },
  "platform": {
    "win": {
      "config": {
        "binaries": {
          "game": {
            "relative_path": "WoW.exe",
            "relative_path_arm64": "Wow-ARM64.exe"
          }
        },
        "min_spec": {
          "default_required_cpu_speed": 2600,
          "default_required_ram": 2048,
          "default_requires_64_bit": true
        }
      }
    },
    "mac": {
      "config": {
        "binaries": {
          "game": {
            "relative_path": "World of Warcraft.app"
          }
        },
        "min_spec": {
          "default_required_cpu_speed": 2200,
          "default_required_ram": 2048
        }
      }
    }
  },
  "enus": {
    "config": {
      "install": [{
        "start_menu_shortcut": {
          "link": "%commonstartmenu%World of Warcraft/World of Warcraft.lnk",
          "target": "%shortcutpath%",
          "description": "Click here to play World of Warcraft."
        }
      }]
    }
  }
  // ... additional locales ...
}

Global Configuration Keys

Key	Description	Type
`product`	Product identifier	String
`update_method`	Update protocol	“ngdp”
`data_dir`	Data directory path	String
`supported_locales`	Available languages	Array
`display_locales`	UI languages	Array
`launch_arguments`	Default launch args	Array
`supports_multibox`	Multiple instances	Boolean
`supports_offline`	Offline play	Boolean
`enable_block_copy_patch`	Block-level patching	Boolean
`shared_container_default_subfolder`	Shared data path	String

Platform Configuration

{
  "platform": {
    "win": {
      "config": {
        "binaries": {
          "game": {
            "relative_path": "WoWClassic.exe",
            "relative_path_arm64": "WowClassic-arm64.exe",
            "launch_arguments": []
          }
        },
        "min_spec": {
          "default_required_cpu_cores": 1,
          "default_required_cpu_speed": 2600,
          "default_required_ram": 2048,
          "default_requires_64_bit": true,
          "required_osspecs": {
            "6.1": { "required_subversion": 0 }
          }
        },
        "form": {
          "game_dir": {
            "default": "Program Files",
            "required_space": 11500000000,
            "space_per_extra_language": 2000000000
          }
        }
      }
    }
  }
}

Locale Configuration

{
  "enus": {
    "config": {
      "install": [{
        "desktop_shortcut": {
          "link": "%desktoppreference%World of Warcraft Classic.lnk",
          "target": "%shortcutpath%",
          "description": "Click here to play World of Warcraft.",
          "args": "--productcode=wow_classic_era"
        }
      }]
    }
  }
}

Installation Variables

Product configs use variables resolved by Battle.net:

Variable	Description
`%installpath%`	Game installation directory
`%binarypath%`	Executable path
`%shortcutpath%`	Launcher path
`%desktoppreference%`	User desktop path
`%commonstartmenu%`	Start menu path
`%titlepath%`	Product root directory
`%game%`	Game data directory
`%locale%`	Current locale
`%uid%`	Unique installation ID
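
For tools that need to preview resolved paths, a plain substitution pass is enough. This is only a sketch: how Battle.net actually resolves these variables is not documented here, and the function name and the example value in the usage note are hypothetical.

```rust
/// Replaces each %name% occurrence with its value.
/// Variables without a supplied value are left untouched.
fn expand_variables(template: &str, vars: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (name, value) in vars {
        out = out.replace(&format!("%{}%", name), value);
    }
    out
}
```

For example, expanding `%desktoppreference%World of Warcraft Classic.lnk` with `desktoppreference` set to `C:/Users/me/Desktop/` gives `C:/Users/me/Desktop/World of Warcraft Classic.lnk`.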

Parser Implementation Status

Python Parser (cascette-py)

Status: Complete

Capabilities:

Fetches patch configs from build config references
Parses patch entry format with compression info
Analyzes entry types (system files, VFS entries)
Supports both patch and product config examination
Handles standard CDN path structure

Verified Against:

WoW Classic 1.13.7.38631 patch config
WoW Classic 4.4.2.60142 patch config (205 entries)
WoW Classic 5.5.0.62655 patch config

Known Issues:

None identified - both product and patch configs successfully fetched
Requires fetching build config first to get patch-config hash

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

Product Configuration Status Summary

ProductConfig contains product-specific metadata and installation parameters. These are referenced in Ribbit responses and are accessible via CDN.

Status: Available via CDN using /tpr/configs/data/ path structure
Format: JSON
Purpose: Product metadata, platform settings, feature flags

Known Fields (from Ribbit)

Product configuration hash (16 bytes, written as 32 hex characters)
Associated with specific product versions
May be embedded in client or launcher

Configuration Discovery Flow

Ribbit Query: Get version and CDN information
Version Lookup: Find build configuration hash and keyring hash
Build Config: Fetch build metadata and system files
CDN Config: Get archive lists and CDN servers
Keyring Config: Fetch encryption keys (if KeyRing column present)
Patch Config: Retrieve update paths (rarely available)
Product Config: Client installation metadata (fetched from CDN under /tpr/configs/data/)

Implementation Considerations

Parsing

Build/CDN/Patch/Keyring configs: Simple key-value parser
Product config: JSON parser
Handle comments (lines starting with #)
Support multi-value fields (comma or space separated)

Caching

Configuration files are immutable (content-addressed)
Cache indefinitely once fetched
Validate using content hash

Error Handling

Retry failed fetches with exponential backoff
Fall back to alternate CDN servers
Validate configuration completeness

Security

Verify content hashes match expected values
Use HTTPS when available
Validate file sizes before download

NGDP/TACT Patch System

The NGDP patch system enables incremental updates between game versions using differential patches.

Patch System Architecture

The patch system uses a multi-tier structure:

Patch Manifests (PA files in /patch/): Index files listing patches between builds
Patch Archives (ZBSDIFF files in /patch/): Actual differential patch data
Intermediate Results (in /data/): Results of applying patches in a chain

Patch File Locations

According to wowdev.wiki, the directories are:

/config/: Build configs, CDN configs, and Patch configs
/data/: Archives, indexes, and unarchived files (binaries, media, root, install, download)
/patch/: Patch manifests, patch files, patch archives, patch indexes

Specifically:

Patch Manifests: https://cdn.host/tpr/wow/patch/{hash[:2]}/{hash[2:4]}/{hash}
- PA (Patch Archive) format files containing patch entry indices
- Referenced by patch field in build configs
Patch Archives: https://cdn.host/tpr/wow/patch/{hash[:2]}/{hash[2:4]}/{hash}
- ZBSDIFF1 format differential patch files stored in archives
- Found in patch-entry lines (the patch_hash values)
- Stored in archives just like regular data files
Patch Archive Indices: https://cdn.host/tpr/wow/patch/{hash[:2]}/{hash[2:4]}/{hash}.index
- Index files for patch archives using the same format as data archive indices
- Map content hashes to locations within patch archives
- Referenced by patch-archives-index field in CDN configs
- Use IndexType::Patch (offset_bytes = 0) in the footer
Patch Results: https://cdn.host/tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}
- Intermediate or final results of applying patches
- BLTE-encoded files with DL/EN/IN signatures for manifest types
Patch Configurations: https://cdn.host/tpr/wow/config/{hash[:2]}/{hash[2:4]}/{hash}
- Text configs with patch-entry lines describing patch chains
- Referenced by patch-config field in build configs

Patch Manifest Format

Patch manifests use the PA (Patch Archive) format. All numeric fields are big-endian throughout (header, block table, and block data).

Header Structure (10 bytes)

struct PatchArchiveHeader {  // 10 bytes, big-endian
    uint8_t  magic[2];         // "PA" (0x5041)
    uint8_t  version;          // Format version (1-2)
    uint8_t  file_key_size;    // Target file CKey size (1-16, typically 16)
    uint8_t  old_key_size;     // Source file EKey size (1-16, typically 16)
    uint8_t  patch_key_size;   // Patch EKey size (1-16, typically 16)
    uint8_t  block_size_bits;  // Block size as power of 2 (range [12, 24])
    uint16_t block_count;      // Number of blocks (big-endian)
    uint8_t  flags;            // Format flags (see below)
};

Flags:

Bit 0 (0x01): Plain data mode (informational, Agent.exe logs but does not reject)
Bit 1 (0x02): Extended header present with encoding info. All known CDN patch manifests have this flag set.
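
Reading the fixed header can be sketched as below. The struct mirrors the layout above and the range checks use the limits given in the field comments. The function name and error strings are hypothetical, and whether Agent.exe rejects out-of-range values in the same way is not claimed.

```rust
#[derive(Debug, PartialEq)]
struct PatchArchiveHeader {
    version: u8,
    file_key_size: u8,
    old_key_size: u8,
    patch_key_size: u8,
    block_size_bits: u8,
    block_count: u16,
    flags: u8,
}

/// Parses the 10-byte PA header from the start of `data`.
fn parse_pa_header(data: &[u8]) -> Result<PatchArchiveHeader, String> {
    if data.len() < 10 {
        return Err("header shorter than 10 bytes".to_string());
    }
    if !data.starts_with(b"PA") {
        return Err("missing PA magic".to_string());
    }
    let header = PatchArchiveHeader {
        version: data[2],
        file_key_size: data[3],
        old_key_size: data[4],
        patch_key_size: data[5],
        block_size_bits: data[6],
        block_count: u16::from_be_bytes([data[7], data[8]]), // big-endian
        flags: data[9],
    };
    if !(1..=2).contains(&header.version) {
        return Err(format!("unsupported version {}", header.version));
    }
    let key_sizes = [header.file_key_size, header.old_key_size, header.patch_key_size];
    if key_sizes.iter().any(|size| !(1..=16).contains(size)) {
        return Err("key size outside 1-16".to_string());
    }
    if !(12..=24).contains(&header.block_size_bits) {
        return Err("block_size_bits outside [12, 24]".to_string());
    }
    Ok(header)
}
```

The block size in bytes is then `1usize << header.block_size_bits`, and `header.flags & 0x02 != 0` signals that the extended header follows.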

Extended Header (when flags & 0x02)

Present immediately after the 10-byte header. Contains encoding file metadata for the patch manifest:

struct PatchArchiveEncodingInfo {
    uint8_t  encoding_ckey[file_key_size];  // Encoding file CKey
    uint8_t  encoding_ekey[file_key_size];  // Encoding file EKey
    uint32_t decoded_size;                  // Decoded size (big-endian)
    uint32_t encoded_size;                  // Encoded size (big-endian)
    uint8_t  espec_length;                  // Length of ESpec string
    uint8_t  espec[espec_length];           // ESpec (length-prefixed, NOT null-terminated)
};

Block Table

Follows the header (or extended header if present). Each entry has a fixed size of file_key_size + 20 bytes:

struct BlockTableEntry {  // file_key_size + 20 bytes per entry
    uint8_t  last_file_ckey[file_key_size];  // Last (highest) CKey in this block
    uint8_t  block_md5[16];                  // MD5 hash of block data
    uint32_t block_offset;                   // Absolute byte offset (big-endian)
};

The block table is sorted by last_file_ckey. Agent.exe validates sort order using _memcmp during parsing.
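
Because the table is sorted by the last CKey in each block, the block that may contain a given CKey can be found by binary search. The sketch below is an illustration with hypothetical names; it relies on Rust's lexicographic byte-slice comparison, which matches memcmp ordering for keys of equal length.

```rust
struct BlockTableEntry {
    last_file_ckey: Vec<u8>,
    block_md5: [u8; 16],
    block_offset: u32,
}

/// Returns true if the table is sorted by last_file_ckey.
fn is_sorted(table: &[BlockTableEntry]) -> bool {
    table
        .windows(2)
        .all(|pair| pair[0].last_file_ckey <= pair[1].last_file_ckey)
}

/// Finds the first block whose last CKey is >= `ckey`.
/// That block is the only one that can contain an entry for `ckey`.
fn find_block<'a>(table: &'a [BlockTableEntry], ckey: &[u8]) -> Option<&'a BlockTableEntry> {
    let index = table.partition_point(|entry| entry.last_file_ckey.as_slice() < ckey);
    table.get(index)
}
```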

Block Data

At each block_offset, file entries are stored as variable-length records terminated by a 0x00 sentinel byte:

// Repeat until num_patches == 0:
struct FileEntry {
    uint8_t  num_patches;                    // 0 = end of block
    uint8_t  target_ckey[file_key_size];     // Target file CKey
    uint8_t  decoded_size[5];                // uint40, big-endian
    // Followed by num_patches patch records:
    struct {
        uint8_t  source_ekey[old_key_size];  // Source file EKey
        uint8_t  source_decoded_size[5];     // uint40, big-endian
        uint8_t  patch_ekey[patch_key_size]; // Patch data EKey
        uint32_t patch_size;                 // Patch data size (big-endian)
        uint8_t  patch_index;                // Ordering hint
    } patches[num_patches];
};
uint8_t end_marker = 0;  // Sentinel byte

Decoded sizes use uint40 (5-byte big-endian) to support files up to ~1 TB.
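
A uint40 has no native Rust type, so it is usually widened to `u64` when read. The sketch below also computes the length of one file entry from the record layout above; both function names are illustrative.

```rust
/// Reads a 5-byte big-endian unsigned integer (uint40) from the start of `buf`.
fn read_u40_be(buf: &[u8]) -> Option<u64> {
    let bytes = buf.get(..5)?;
    Some(bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b)))
}

/// Length in bytes of one file entry with `num_patches` patch records.
fn file_entry_size(
    num_patches: usize,
    file_key_size: usize,
    old_key_size: usize,
    patch_key_size: usize,
) -> usize {
    // source_ekey + uint40 size + patch_ekey + u32 patch_size + u8 patch_index
    let patch_record = old_key_size + 5 + patch_key_size + 4 + 1;
    // num_patches byte + target_ckey + uint40 decoded_size + patch records
    1 + file_key_size + 5 + num_patches * patch_record
}
```

With 16-byte keys, an entry carrying a single patch record occupies 64 bytes.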

Compression Info Format

The compression info string describes byte ranges and their compression:

Format: {size=method,size=method,...,*=default}
Methods: n (none), z (zlib)
Example: {22=n,10044521=z,734880=n,*=z}

Build Config References

Build configurations reference patches through:

patch: Main patch manifest hash
patch-size: Size of patch manifest
patch-index: Patch index files
patch-config: Patch configuration hash

Patch Configuration

Patch configs contain patch-entry lines describing patch chains between file versions.

Patch Entry Format

patch-entry = type old_hash old_size new_hash new_size compression_info \
  [result_hash result_size patch_hash patch_size]+

Components:

type: Manifest type (download, encoding, install, size, vfs:, etc.)
old_hash: MD5 of original file content
old_size: Size of original file
new_hash: MD5 of final patched content
new_size: Size of final file
compression_info: Compression specification (e.g., b:{11=n,8183230=n,1255589=z})
Followed by repeating groups of:
- result_hash: MD5 of intermediate/final result (stored in /data/)
- result_size: Size of result file
- patch_hash: MD5 of ZBSDIFF patch file (stored in /patch/)
- patch_size: Size of patch file
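
The components above map onto a small parser. This is a sketch with hypothetical type and function names, using the field names from this section. It expects one logical line, so any continuation lines must be joined first.

```rust
#[derive(Debug, PartialEq)]
struct PatchStep {
    result_hash: String,
    result_size: u64,
    patch_hash: String,
    patch_size: u64,
}

#[derive(Debug, PartialEq)]
struct PatchEntry {
    kind: String,
    old_hash: String,
    old_size: u64,
    new_hash: String,
    new_size: u64,
    compression_info: String,
    steps: Vec<PatchStep>,
}

/// Parses a single "patch-entry = ..." line.
fn parse_patch_entry(line: &str) -> Option<PatchEntry> {
    // Split at the first '=' only; the compression info contains more of them.
    let (key, value) = line.split_once('=')?;
    if key.trim() != "patch-entry" {
        return None;
    }
    let tokens: Vec<&str> = value.split_whitespace().collect();
    // Six fixed fields, followed by groups of four.
    if tokens.len() < 6 || (tokens.len() - 6) % 4 != 0 {
        return None;
    }
    let steps = tokens[6..]
        .chunks(4)
        .map(|group| {
            Some(PatchStep {
                result_hash: group[0].to_string(),
                result_size: group[1].parse().ok()?,
                patch_hash: group[2].to_string(),
                patch_size: group[3].parse().ok()?,
            })
        })
        .collect::<Option<Vec<_>>>()?;
    Some(PatchEntry {
        kind: tokens[0].to_string(),
        old_hash: tokens[1].to_string(),
        old_size: tokens[2].parse().ok()?,
        new_hash: tokens[3].to_string(),
        new_size: tokens[4].parse().ok()?,
        compression_info: tokens[5].to_string(),
        steps,
    })
}
```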

Patch Chain Example

patch-entry = download 6afd6862... 9438830 d29e5263... 8190785 b:{...} \
  557b46d1... 15384969 08c046c8... 1623773 \
  4ebf89a1... 15384925 e960d26b... 1623636

This describes a chain:

Apply patch 08c046c8 to original 6afd6862 → result 557b46d1
Apply patch e960d26b to result 557b46d1 → result 4ebf89a1
Continue until reaching final d29e5263

ZBSDIFF1 Format (Zlib-compressed Binary Differential)

ZBSDIFF1 is the binary differential patch format used by NGDP/TACT for efficient file updates:

Header (32 bytes, little-endian)

struct ZbsdiffHeader {
    uint8_t  signature[8];       // "ZBSDIFF1"
    int64_t  control_size;       // Size of compressed control block (little-endian)
    int64_t  diff_size;          // Size of compressed diff block (little-endian)
    int64_t  output_size;        // Size of final output file (little-endian)
};
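
Parsing this header can be sketched as follows. The function name and error strings are hypothetical. The sketch reads the size fields as plain little-endian `i64` values, as described here, and rejects negative sizes. Classic bsdiff stores its integers in a sign-magnitude form instead, which is identical for non-negative values, so the two readings agree for any valid header.

```rust
#[derive(Debug, PartialEq)]
struct ZbsdiffHeader {
    control_size: i64,
    diff_size: i64,
    output_size: i64,
}

/// Parses the 32-byte ZBSDIFF1 header from the start of `data`.
fn parse_zbsdiff_header(data: &[u8]) -> Result<ZbsdiffHeader, String> {
    if data.len() < 32 {
        return Err("header shorter than 32 bytes".to_string());
    }
    if !data.starts_with(b"ZBSDIFF1") {
        return Err("missing ZBSDIFF1 signature".to_string());
    }
    // Reads a little-endian i64 at the given byte offset.
    let read_i64 = |offset: usize| -> i64 {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(&data[offset..offset + 8]);
        i64::from_le_bytes(bytes)
    };
    let header = ZbsdiffHeader {
        control_size: read_i64(8),
        diff_size: read_i64(16),
        output_size: read_i64(24),
    };
    if header.control_size < 0 || header.diff_size < 0 || header.output_size < 0 {
        return Err("negative size in header".to_string());
    }
    Ok(header)
}
```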

Three-Block Structure

Control Block (zlib-compressed):
- Triple sequences: (diff_size, extra_size, seek_offset)
- Instructions for applying differences and inserting new data
- All values are signed 64-bit integers
Diff Block (zlib-compressed):
- Byte differences to apply to old data
- Applied by byte-wise addition modulo 256 (not XOR): new[i] = old[i] + diff[i]
Extra Block (zlib-compressed):
- New data to insert at specified positions
- Copied directly to output
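
How the three blocks combine can be shown with an in-memory sketch. It assumes the blocks have already been zlib-decompressed and the control block decoded into `(diff_size, extra_size, seek_offset)` triples. Diff bytes are combined with the old data by wrapping byte addition, as in the standard bsdiff scheme. The function name and error strings are hypothetical.

```rust
/// Applies decoded bsdiff-style control, diff and extra blocks to `old`.
fn apply_bsdiff(
    old: &[u8],
    control: &[(i64, i64, i64)],
    diff: &[u8],
    extra: &[u8],
    output_size: usize,
) -> Result<Vec<u8>, String> {
    let mut out = Vec::with_capacity(output_size);
    let mut old_pos: i64 = 0;
    let (mut diff_pos, mut extra_pos) = (0usize, 0usize);

    for &(diff_len, extra_len, seek) in control {
        if diff_len < 0 || extra_len < 0 {
            return Err("negative size in control entry".to_string());
        }
        let (diff_len, extra_len) = (diff_len as usize, extra_len as usize);

        // Diff section: new[i] = old[i] + diff[i] (wrapping).
        let diff_chunk = diff_pos
            .checked_add(diff_len)
            .and_then(|end| diff.get(diff_pos..end))
            .ok_or("diff block too short")?;
        for (i, &d) in diff_chunk.iter().enumerate() {
            let index = old_pos + i as i64;
            // Positions outside the old file contribute zero.
            let o = if index >= 0 {
                old.get(index as usize).copied().unwrap_or(0)
            } else {
                0
            };
            out.push(o.wrapping_add(d));
        }
        old_pos += diff_len as i64;
        diff_pos += diff_len;

        // Extra section: new bytes copied verbatim.
        let extra_chunk = extra_pos
            .checked_add(extra_len)
            .and_then(|end| extra.get(extra_pos..end))
            .ok_or("extra block too short")?;
        out.extend_from_slice(extra_chunk);
        extra_pos += extra_len;

        // The seek offset is signed and may move backwards in the old data.
        old_pos += seek;
    }

    if out.len() != output_size {
        return Err(format!("expected {} output bytes, got {}", output_size, out.len()));
    }
    Ok(out)
}
```

This version holds the whole output in memory; a streaming variant would write each section to a sink instead of pushing into a `Vec`.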

Streaming Application

ZBSDIFF1 supports streaming application without loading entire files:

#![allow(unused)]
fn main() {
// Streaming patch application
let mut old_pos = 0;
let mut new_pos = 0;
let mut control_entries = decompress_control_block(&patch.control_data)?;

while let Some((diff_size, extra_size, seek_offset)) = control_entries.next()? {
    // Copy diff_size bytes with differences
    copy_with_diff(&old_data[old_pos..], &diff_data, &mut new_data[new_pos..], diff_size);
    old_pos += diff_size;
    new_pos += diff_size;

    // Copy extra_size bytes of new data
    copy_extra(&extra_data, &mut new_data[new_pos..], extra_size);
    new_pos += extra_size;

    // Seek in old data (seek_offset is signed and may move backwards)
    old_pos += seek_offset;
}
}

Format Characteristics

Little-Endian Header: All header fields use little-endian byte order (verified against Agent.exe tact::BsPatch::ParseHeader at 0x6fbd1c)
Signed Integers: Control block uses signed 64-bit little-endian integers for sizes and offsets
Zlib Compression: All data blocks compressed independently
Memory Efficient: Can process large files with minimal RAM usage
Error Detection: Header validation and decompression errors detected

Patch Archive Storage

Patch data is stored in archives just like regular game data:

Patch Archives: Large files containing multiple patch data blobs
- Located in /patch/ directory on CDN
- Contain BLTE-encoded ZBSDIFF1 patches
- Named with content hashes like regular archives
Patch Archive Indices: Map patch hashes to archive locations
- Use the same .index format as data archives
- Footer uses IndexType::Patch (offset_bytes = 0)
- Allow CDN to locate specific patches within archives
Patch Archive Groups: Client-side optimization structures
- Use the same Archive Group format as data archives
- Group related patches for efficient client caching
- Located in client’s local CASC storage (not on CDN)
- Referenced in .idx files with grouped archive information
CDN Config References:
- patch-archives: List of patch archive hashes
- patch-archives-index: Corresponding index file hashes
- patch-archives-index-size: Size of each index file

This completely mirrors the structure used for data archives:

archives → patch-archives
archives-index → patch-archives-index
Archive Groups → Patch Archive Groups
Same formats, just in /patch/ directory instead of /data/

Patch Chain Building and Validation

Patch Chain Construction

Patches can form chains from one content version to another with cycle detection:

use std::collections::HashSet;

// Method on the patch manifest type (impl block omitted)
pub fn build_patch_chain(
    &self,
    start_key: &[u8; 16],
    end_key: &[u8; 16]
) -> Option<PatchChain> {
    let mut chain = Vec::new();
    let mut current_key = *start_key;
    let mut visited = HashSet::new();

    while current_key != *end_key {
        // Cycle detection: insert() returns false if the key was already visited
        if !visited.insert(current_key) {
            return None; // Cycle detected
        }

        let patch_entry = self.find_patch_for_content(&current_key)?;
        current_key = patch_entry.new_content_key;
        chain.push(patch_entry.clone());

        // Safety limit: reject chains longer than 10 steps
        if chain.len() > 10 {
            return None; // Chain too long
        }
    }

    Some(PatchChain { steps: chain, start_key: *start_key, end_key: *end_key })
}

Safety Validations

Cycle Detection: Prevents infinite loops in patch chains
Chain Length Limits: Maximum 10 steps to prevent excessive processing
Size Validation: Output size must match header specification
Checksum Verification: Content keys validated after patch application
Stream Bounds Checking: Prevents buffer overflows during streaming

Size Limits and Memory Management

// ZBSDIFF1 size limits for safety
const MAX_PATCH_SIZE: usize = 100 * 1024 * 1024; // 100MB max patch
const MAX_OUTPUT_SIZE: usize = 1024 * 1024 * 1024; // 1GB max output
const MAX_CONTROL_ENTRIES: usize = 1_000_000; // Prevent memory exhaustion

impl ZbsdiffHeader {
    pub fn validate(&self) -> Result<(), ZbsdiffError> {
        if self.output_size > MAX_OUTPUT_SIZE as u64 {
            return Err(ZbsdiffError::OutputTooLarge(self.output_size));
        }

        // checked_add guards against u64 overflow from hostile header values
        let patch_size = self
            .control_size
            .checked_add(self.diff_size)
            .ok_or(ZbsdiffError::PatchTooLarge)?;
        if patch_size > MAX_PATCH_SIZE as u64 {
            return Err(ZbsdiffError::PatchTooLarge);
        }

        Ok(())
    }
}

Patch Application Process

Fetch patch manifest from CDN using patch hash from build config
Parse manifest to find patch entry for target file
Validate patch chain: Check for cycles and reasonable length
Look up patch in patch archive index to find archive and offset
Download patch data from archive using index information
Validate patch size limits before processing
Decode BLTE wrapper and extract ZBSDIFF1 patch
Apply patch using streaming algorithm with bounds checking
Verify result size and hash match expectations

Implementation Notes

Patches are not BLTE-encoded at the manifest level
Individual patch data files may be BLTE-encoded
Block size is typically 64KB (2^16 bytes)
Version 2 is the current patch format version
Patches enable efficient updates without re-downloading entire files

BPSV Format Specification

BPSV (Blizzard Pipe-Separated Values) is a structured data serialization format, similar to CSV but using pipes (|) as delimiters with Blizzard-specific schemas. It’s used in Ribbit API responses, configuration files, and version manifests. BPSV is a data format, not a network protocol.

Format Structure

BPSV files contain three components:

Header line (required)
Sequence number line (optional)
Data rows (zero or more)

graph TD
    A[BPSV File] --> B[Header Line]
    A -.-> C[Sequence Number Line]
    A --> D[Data Rows]

    B --> E["FieldName!TYPE:length|FieldName2!TYPE:length"]
    C -.-> F["seqn = {number}"]
    D --> G["value1|value2|value3"]

    style A stroke-width:4px
    style B stroke-width:3px
    style C stroke-width:2px,stroke-dasharray:5 5
    style D stroke-width:3px
    style E stroke-width:2px
    style F stroke-width:2px
    style G stroke-width:2px

Header Line Format

The header line defines field structure using pipe-separated field definitions:

FieldName!TYPE:length|FieldName2!TYPE:length|FieldName3!TYPE:length

Each field definition contains:

Field name (case-sensitive)
Exclamation mark separator
Field type (case-insensitive)
Colon separator
Length specification
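A minimal sketch of parsing a header line into field definitions, following the rules above. The names FieldType, Field, parse_field, and parse_header are hypothetical and are not taken from cascette-formats; the duplicate-name check reflects the schema validation mentioned elsewhere in this specification.

```rust
/// Field type with its declared length (illustrative names).
#[derive(Debug, Clone, PartialEq)]
pub enum FieldType {
    Str(u32),
    Hex(u32),
    Dec(u32),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub ty: FieldType,
}

/// Parse one `FieldName!TYPE:length` definition.
pub fn parse_field(def: &str) -> Result<Field, String> {
    let (name, spec) = def
        .split_once('!')
        .ok_or_else(|| format!("missing '!' in {:?}", def))?;
    let (type_name, length) = spec
        .split_once(':')
        .ok_or_else(|| format!("missing ':' in {:?}", def))?;
    if name.is_empty() {
        return Err(format!("empty field name in {:?}", def));
    }
    let length: u32 = length
        .trim()
        .parse()
        .map_err(|_| format!("invalid length in {:?}", def))?;
    // Type names are case-insensitive; field names are kept as written.
    let ty = match type_name.to_ascii_uppercase().as_str() {
        "STRING" => FieldType::Str(length),
        "HEX" => FieldType::Hex(length),
        "DEC" => FieldType::Dec(length),
        other => return Err(format!("unknown field type {:?}", other)),
    };
    Ok(Field { name: name.to_string(), ty })
}

/// Parse a whole header line and reject duplicate field names.
pub fn parse_header(line: &str) -> Result<Vec<Field>, String> {
    let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
    let fields = line
        .split('|')
        .map(parse_field)
        .collect::<Result<Vec<_>, _>>()?;
    for (i, field) in fields.iter().enumerate() {
        if fields[..i].iter().any(|earlier| earlier.name == field.name) {
            return Err(format!("duplicate field name {:?}", field.name));
        }
    }
    Ok(fields)
}
```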

Sequence Number

The optional sequence number appears on a separate line:

## seqn = 12345

Properties:

Always starts with ## seqn
Supported separators: =, :, or space
Contains integer value
Used for version tracking and cache invalidation
Maximum one per file

Accepted formats:

## seqn = 12345 (equals with spaces)
## seqn: 12345 (colon separator)
## seqn 12345 (space only)
Extra whitespace is trimmed
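The accepted sequence-number forms can be recognized with a small helper. This is a sketch under two assumptions: the value fits in a u64, and at most one separator character follows the `## seqn` prefix. The function name parse_seqn is hypothetical.

```rust
/// Return the sequence number if `line` is a `## seqn` line, otherwise None.
pub fn parse_seqn(line: &str) -> Option<u64> {
    let rest = line.trim().strip_prefix("## seqn")?;
    // Something must separate the prefix from the value: '=', ':' or whitespace.
    let first = rest.chars().next()?;
    if first != '=' && first != ':' && !first.is_whitespace() {
        return None;
    }
    let rest = rest.trim_start();
    // Drop a single '=' or ':' if present; a bare space separator leaves `rest` unchanged.
    let rest = rest
        .strip_prefix('=')
        .or_else(|| rest.strip_prefix(':'))
        .unwrap_or(rest);
    rest.trim().parse().ok()
}
```

Returning None for anything that is not a sequence line lets a caller fall through and treat the line as a data row.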

Field Types

BPSV supports three field types:

STRING:length

Text data with length constraints:

Length 0: unlimited characters
Length > 0: maximum character count
Type names: STRING, String, string (case-insensitive)
UTF-8 encoding

HEX:length

Binary data encoded as hexadecimal:

Length specifies bytes in binary form
Requires exactly length × 2 hexadecimal characters
Valid characters: 0-9, a-f, A-F
Empty values always valid
Common usage: HEX:16 for MD5 hashes (32 hex chars)

DEC:length

Decimal integers:

Length indicates storage size (4 = uint32, 8 = uint64)
Length not enforced during parsing
Supports full int64 range
Type names: DEC, Dec, dec (case-insensitive)

Data Rows

Data rows contain pipe-separated values matching header field definitions:

Column count must match header field count
Empty values allowed for all field types
Values parsed according to field type specifications
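The row rules above amount to a split on `|` plus a column-count check. The sketch below borrows from the input line rather than copying it; split_row is a hypothetical name, and the expected column count is assumed to come from an already parsed header.

```rust
/// Split one data row into borrowed values and enforce the header's column count.
pub fn split_row<'a>(line: &'a str, expected_columns: usize) -> Result<Vec<&'a str>, String> {
    // Accept both Unix and Windows line endings.
    let line = line.strip_suffix('\n').unwrap_or(line);
    let line = line.strip_suffix('\r').unwrap_or(line);

    // Empty values are preserved as empty strings, not dropped.
    let values: Vec<&str> = line.split('|').collect();
    if values.len() != expected_columns {
        return Err(format!(
            "expected {} columns, found {}",
            expected_columns,
            values.len()
        ));
    }
    Ok(values)
}
```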

Parsing Flow

flowchart TD
    A[Start Parsing] --> B[Read First Line]
    B --> C[Parse Header Fields]
    C --> D[Read Next Line]
    D --> E{"Line starts with '## seqn'?"}
    E -->|Yes| F[Parse Sequence Number]
    E -->|No| G[Parse as Data Row]
    F --> I{More Lines?}
    G --> J[Validate Column Count]
    J --> K[Parse Field Values by Type]
    K --> L[Store Data Row]
    L --> I
    I -->|Yes| M[Read Next Line]
    I -->|No| N[Parsing Complete]
    M --> G

    style A stroke-width:4px
    style N stroke-width:4px
    style C stroke-width:3px
    style E stroke-width:3px,stroke-dasharray:5 5
    style I stroke-width:2px,stroke-dasharray:5 5
    style F stroke-width:2px
    style J stroke-width:2px
    style K stroke-width:2px

Usage Context

BPSV is a data serialization format used in multiple contexts:

Ribbit API Responses: Structured data returned by Ribbit protocol
Product Configuration Files: .product files with version information
Version Manifests: Build and CDN configuration references
CDN Configuration: Server URLs and path mappings
Background Downloads: Download priority information

Note: BPSV is the data format; Ribbit is the protocol that transmits BPSV data.

Implementation Requirements

Type Validation

Parsers must validate field values according to type specifications:

STRING fields accept any UTF-8 text
HEX fields require valid hexadecimal characters and exact length
DEC fields must parse as valid integers
Empty values are valid for all field types
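One way to express these validation rules in code is shown below. It is a sketch, not the cascette-formats implementation: FieldType and validate_value are hypothetical names, DEC values are assumed to fit in an i64, and the STRING length is treated as a maximum character count as described in the Field Types section.

```rust
/// Field type with its declared length (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FieldType {
    Str(usize),
    Hex(usize),
    Dec(usize),
}

/// Check a single value against its declared field type.
pub fn validate_value(ty: FieldType, value: &str) -> Result<(), String> {
    // Empty values are valid for every field type.
    if value.is_empty() {
        return Ok(());
    }
    match ty {
        FieldType::Str(max) => {
            // Length 0 means unlimited.
            if max > 0 && value.chars().count() > max {
                return Err(format!("string longer than {} characters", max));
            }
            Ok(())
        }
        FieldType::Hex(bytes) => {
            if value.len() != bytes * 2 {
                return Err(format!("expected {} hex characters, found {}", bytes * 2, value.len()));
            }
            if !value.bytes().all(|b| b.is_ascii_hexdigit()) {
                return Err("invalid hexadecimal character".to_string());
            }
            Ok(())
        }
        // The declared DEC length is informational and is not enforced.
        FieldType::Dec(_) => value
            .parse::<i64>()
            .map(|_| ())
            .map_err(|e| format!("invalid decimal value: {}", e)),
    }
}
```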

Parsing Architecture

Implementations may use different parsing strategies:

Zero-copy parsing: Borrow from original string for efficiency
Owned parsing: Copy data for serialization/storage
Lazy parsing: Keep raw strings until typed values requested
Schema validation: Enforce field uniqueness and type compatibility

Error Handling

Common parsing errors:

Column count mismatch between header and data rows
Invalid characters in HEX fields
Incorrect HEX field length (must be exactly length × 2 chars)
Non-numeric values in DEC fields
Multiple sequence number lines
Duplicate field names in schema

Performance Considerations

Typical file size: < 10MB
Typical row count: < 10,000
UTF-8 encoding recommended
Both Unix (\n) and Windows (\r\n) line endings accepted

Format Examples

Basic Product Configuration

Region!STRING:4|BuildConfig!HEX:16|CDNConfig!HEX:16
## seqn = 98765
us|a1b2c3d4e5f678901234567890abcdef|f1e2d3c4b5a69870123456789abcdef0
eu|b2c3d4e5f6789012345678a190abcdef|e2d3c4b5a69870123456789abcdef0f1

CDN Server List

Name!STRING:0|Path!STRING:0|Hosts!STRING:0
## seqn = 54321
us|tpr/wow|us.patch.battle.net level3.blizzard.com
eu|tpr/wow|eu.patch.battle.net level3.blizzard.com

Version Information

Product!STRING:10|Seqn!DEC:4|Flags!HEX:4
wow|12345|00000001
wowt|12346|00000002

Type Casing Examples

Field types accept case variations:

# All valid type specifications
Name!STRING:50|ID!DEC:4|Hash!HEX:16
Name!String:50|ID!Dec:4|Hash!Hex:16
Name!string:50|ID!dec:4|Hash!hex:16

Empty Value Handling

Empty values preserve semantic meaning:

Product!STRING:10|Version!STRING:10|Hash!HEX:16
wow|8.3.0|a1b2c3d4e5f678901234567890abcdef
wowt||b2c3d4e5f6789012345678a190abcdef

The second row contains an empty version field, which differs from a missing field.

Implementation Status

Rust Implementation (cascette-formats)

BPSV parser and builder:

Schema parsing - Field name, type, and size validation (complete)
Document parsing - Multi-row data with sequence numbers (complete)
Type support - STRING, HEX, and DEC field types (complete)
Round-trip validation - parse(build(data)) == data guarantee (complete)
Case-insensitive types - Accepts STRING, String, string variations (complete)
Builder support - Programmatic BPSV file creation (complete)

Validation Status:

Byte-for-byte round-trip validation
Integration tests with real Ribbit API responses
Handles empty values, comments, and sequence numbers
Validated against real Battle.net BPSV files

Analysis and Usage

BPSV format is used throughout the NGDP system for configuration and version data.

NGDP/CASC Format Transitions

This document summarizes verified format transitions discovered through systematic analysis of WoW builds from 2014-2025, starting with CASC’s introduction in Warlords of Draenor (6.0.x) which replaced the MPQ system.

Verification Methodology

Format transitions were identified through:

Strategic Build Analysis: Examining key builds across WoW versions using tools/examine_build.py
Chronological Comparison: Tracking format changes between adjacent builds
Cross-Product Validation: Comparing wow, wow_classic, wow_classic_era, wow_classic_titan, and wow_anniversary
Automated Verification: Using Python scripts to validate format assumptions

Discovered Format Transitions

Root File Format Evolution

The Root file format has evolved since CASC’s introduction in Warlords of Draenor:

Version 1 (Early CASC, 2014-2021)

Magic: None initially, later MFST (big-endian)
First Seen: Warlords of Draenor (6.0.x) - CASC introduction
Structure: Basic content key mapping with file flags
Features:
- FileDataID to content key mapping
- Basic content/locale flags (32-bit)
- Jenkins96 hash for named files
Note: This is the first CASC Root format, replacing the MPQ system

Version 2 (Transitional CASC, 2021)

Magic: TSFM (little-endian)
First Seen: Shadowlands (9.0.2)
Structure: Added size fields and magic signature
Features:
- TSFM magic signature introduction
- Size fields for validation
- Maintained v1 data structures

Version 3 (Modern CASC, 2021-Present)

Magic: TSFM (little-endian standard)
First Seen: Shadowlands late patches
Structure: Enhanced metadata and extended flags
Features:
- Extended content flags (40-bit total)
- Improved compression efficiency
- Better locale targeting

Version 4 (Current CASC, 2023-Present)

Magic: TSFM
First Seen: Dragonflight (10.x)
Structure: Further optimizations
Features:
- Additional metadata fields
- VFS integration improvements

Verified Transition Points

Based on build examination across retail and Classic:

WoW Retail (wow) Format Evolution:

Version	Build Date	Root Version	Magic	Config Fields	Key Changes
6.0.1.18125	2014-06-20	1	None	13	CASC introduction, replacing MPQ
7.3.5.25848	2018-01-16	1	None	15	Still using v1 format
9.0.2.37176	2021-01-13	2	TSFM	17	Major transition: TSFM magic, size fields added
10.1.5.51130	2023-08-31	3	TSFM	1,623	VFS expansion: 1,600+ virtual file system fields added
11.2.0.62748	2025-08-22	3	TSFM	1,716	Current retail standard with extended features

WoW Classic (wow_classic) Format Evolution:

Version	Build Date	Root Version	Magic	Config Fields	Key Changes
1.13.0.28211	2018-10-23	1	None	13	Classic launch using CASC v1
2.5.2.39926	2021-08-31	1	None	16	Patch fields added
3.4.2.50063	2023-06-20	1	None	756	VFS adoption: 740+ VFS fields
3.4.4.61075	2025-05-28	3	TSFM	758	Format jump: Skipped v2, went directly to v3
5.5.0.62655	2025-08-19	3	TSFM	905	Current Classic standard

Classic Format Lag Pattern

Classic follows retail with significant delays:

Root v1→v2/v3: Retail (2021) → Classic (2025) = 4 years behind
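VFS Introduction: Retail (2023) → Classic (2023) = same year in the sampled builds (the 756-field Classic build of 2023-06-20 predates the sampled 1,623-field retail build of 2023-08-31)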
TSFM Magic: Retail (2021) → Classic (2025) = 4 years behind

Classic skipped Root v2 entirely, jumping directly from v1 to v3, demonstrating selective adoption of retail improvements.

Parser Compatibility Matrix

Based on verified transitions, parsers must support:

Product	Supported Root Versions	Magic Detection	VFS Support	Timeframe
wow_classic_era	v3 only	TSFM	Modern	2021+ (uses retail backend)
wow_classic	v1, v3	None, TSFM	Legacy → Modern	2018-2025
wow_classic_titan	v3 only	TSFM	Modern	2025+ (CN only, WotLK 3.80.x)
wow_anniversary	v3 only	TSFM	Modern	2025+ (TBC 2.5.x)
wow	v1, v2, v3	None, TSFM	Legacy → Modern	2014-2025

Implementation Recommendation: Always attempt v3 parsing first with TSFM magic detection, then fall back to v1 legacy format. Root v2 is retail-specific and uncommon.
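The magic check behind this recommendation can be sketched as below. It assumes the on-disk magic is the four ASCII bytes TSFM at offset 0; RootLayout and detect_root_layout are hypothetical names, and telling v2 apart from v3 requires inspecting fields that this section does not specify.

```rust
/// Coarse root layout, decided only from the first four bytes.
#[derive(Debug, PartialEq)]
pub enum RootLayout {
    /// Starts with the TSFM magic: parse as v3, treating v2 as a rarer variant.
    Tsfm,
    /// No magic: fall back to the legacy v1 layout.
    Legacy,
}

/// Assumption: the magic appears on disk as the ASCII bytes "TSFM".
pub fn detect_root_layout(data: &[u8]) -> RootLayout {
    if data.starts_with(b"TSFM") {
        RootLayout::Tsfm
    } else {
        RootLayout::Legacy
    }
}
```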

Build Configuration Evolution

Build configurations have evolved to support new file types and compression methods:

Early CASC (6.0.x)

root = <content_key>
encoding = <content_key> <encoding_key>
install = <content_key> <encoding_key>
download = <content_key> <encoding_key>

Modern CASC (11.x)

root = <content_key>
encoding = <content_key> <encoding_key>
install = <content_key> <encoding_key>
download = <content_key> <encoding_key>
patch = <patch_key>
size = <content_key> <encoding_key>

Evolution Pattern:

Root field remains a single content key
New fields added (patch, size) for enhanced functionality
Encoding/install/download maintain dual-key format
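Because every entry is a `key = value` line whose value holds one or two space-separated keys, a line parser can stay small. The sketch below is illustrative: parse_config_line is a hypothetical name, and skipping lines that start with `#` is an assumption about comment handling rather than something stated in this section.

```rust
/// Split a `key = value [value ...]` line into the key and its values.
/// Returns None for blank lines, `#` lines (assumed comments), and lines without '='.
pub fn parse_config_line(line: &str) -> Option<(&str, Vec<&str>)> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    let (key, value) = line.split_once('=')?;
    // Single-key fields yield one value; dual-key fields yield two.
    Some((key.trim(), value.split_whitespace().collect()))
}
```

For a dual-key field such as encoding, the first value is the content key and the second is the encoding key, matching the layouts listed above.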

BLTE Format Evolution

BLTE (Block Table Encoded) compression has remained stable but usage patterns evolved:

Compression Type Usage by Era

Era	None (N)	ZLIB (Z)	Encrypted (E)	Frame (F)
Early CASC	20%	75%	0%	5%
Modern CASC	15%	60%	5%	20%

Key Changes:

Increased use of Frame compression for nested compression
Introduction of encrypted blocks for sensitive data
ZLIB remains primary compression method

Block Structure Evolution

Single Block: Simpler files, configuration data
Multi Block: Large files, game assets
Trend: Growing use of multi-block for better streaming

Verification Scripts

Format verification tools have been moved to the cascette-py project: https://github.com/wowemulation-dev/cascette-py

The Python implementation includes:

Cache management for downloaded files
Root file version detection testing
Build configuration evolution tracking
BLTE compression pattern analysis
Complete format verification suite

See the cascette-py documentation for setup and usage instructions.

Implementation Impact

For Rust Implementation

Based on verified format evolution across retail and Classic:

Root File Parser:
- Primary Support: Root v1 (legacy) and v3 (modern) formats
- Limited Support: Root v2 (retail-only transition format)
- Magic Detection: TSFM (little-endian) and None (legacy)
- Version Strategy: Try v3+TSFM first, fall back to v1+None
Configuration Parser:
- Early Builds: 13-17 fields (simple key=value)
- VFS Era: 756-1,716 fields (massive vfs-* expansion)
- Feature Support: Handle feature-placeholder and VFS fields
- Backwards Compatibility: Support both v1 (legacy) and v3 (modern) formats
Product-Specific Logic:
- wow_classic_era: Always modern format (v3, TSFM)
- wow_classic: Dual format support with clear transition point (2025)
- wow_classic_titan: Modern format only (v3, TSFM), 368 VFS entries, CN region only
- wow_anniversary: Modern format only (v3, TSFM), 325 VFS entries, all regions
- wow retail: Full format evolution support (2014-2025)
BLTE Decoder: All compression types (N, Z, E, F) with consistent usage patterns across all product lines

Key Architectural Decisions

Version Detection Strategy:

// Recommended parsing order
if has_tsfm_magic() {
    try_root_v3_format()
} else {
    try_root_v1_format()
}

Configuration Parsing:
- VFS Detection: Fields starting with vfs- indicate modern builds
- Feature Detection: feature-placeholder indicates latest builds
- Backwards Compatibility: Always support minimal 13-field format
Product Detection:
- Use Wago.tools build database for version context
- Classic Era assumes modern format post-2021
- Classic has explicit v1→v3 transition in May 2025
Testing Strategy: Verify against all transition points with real build data

Future Analysis

Formats not yet tracked for transitions:

Encoding file table structure changes
Install/Download tag system evolution
Archive index format stability
Patch file introduction timeline

References

Last Updated: 2025-08-23
Verification Status: Automated verification scripts created and tested
Next Review: After implementing Rust parsers based on verified formats

BLTE (Block Table Encoded) Format

BLTE is NGDP’s container format for compressed and optionally encrypted content. It provides block-based compression, encryption support, and efficient streaming capabilities for game data delivery.

Overview

BLTE files wrap game content with:

Optional multi-block structure for large files
Per-block compression (none, zlib, or LZ4)
Optional encryption (Salsa20 or ARC4)
MD5 checksums for integrity verification

Binary Format

File Structure

BLTE File Layout:
┌─────────────────────────┐
│ BLTE Header (8 bytes)   │
├─────────────────────────┤
│ Extended Header         │ (optional, if header_size > 0)
│ - Flags (1 byte)        │
│ - Chunk Count (3 bytes) │
├─────────────────────────┤
│ Chunk Info Table        │ (24 or 40 bytes per chunk)
│ - Compressed Size       │
│ - Decompressed Size     │
│ - MD5 Checksum          │
├─────────────────────────┤
│ Data Block 1            │
│ - Encoding Type (1 byte)│
│ - Compressed Data       │
├─────────────────────────┤
│ Data Block 2            │
│ ...                     │
└─────────────────────────┘

Header Format

// Primary BLTE header (always 8 bytes)
struct BlteHeader {
    magic: [u8; 4],        // "BLTE" (0x424C5445 in big-endian)
    header_size: u32,      // Big-endian, total header size including these 8 bytes
}

Header Size Values

header_size == 0: Single chunk file, no extended header
header_size > 0: Multi-chunk file with extended header

Extended Header

Present only when header_size > 0:

struct ExtendedHeader {
    flags: u8,             // 0x0F = standard, 0x10 = extended
    chunk_count: [u8; 3],  // 24-bit big-endian chunk count
}

Chunk Information Table

Standard Format (flags = 0x0F)

Each chunk has a 24-byte entry:

struct ChunkInfo {
    compressed_size: u32,      // Big-endian
    decompressed_size: u32,    // Big-endian
    checksum: [u8; 16],        // MD5 of compressed chunk data
}

Extended Format (flags = 0x10)

Each chunk has a 40-byte entry:

struct ExtendedChunkInfo {
    compressed_size: u32,      // Big-endian
    decompressed_size: u32,    // Big-endian
    checksum: [u8; 16],        // MD5 of compressed chunk data
    decompressed_checksum: [u8; 16], // MD5 of decompressed chunk data
}

This extended format provides additional integrity checking with MD5 checksums of both compressed and decompressed data.

Formula Validation

For standard chunks (flags = 0x0F):

header_size = 12 + (chunk_count * 24)

For extended chunks (flags = 0x10):

header_size = 12 + (chunk_count * 40)

Where:

12 = 8 (BLTE header) + 1 (flags) + 3 (chunk count)
24 = size of standard ChunkInfo entry
40 = size of extended ChunkInfo entry

The header_size field includes the 8-byte BLTE header (“BLTE” magic + header_size u32). Data starts at offset header_size from the beginning of the file.

Encoding Types

Each data block starts with a single-byte encoding type:

Byte	Character	Type	Description
0x4E	‘N’	None	Uncompressed data
0x5A	‘Z’	ZLib	ZLib compressed (deflate)
0x34	‘4’	LZ4	LZ4HC high compression
0x45	‘E’	Encrypted	Encrypted data block
0x46	‘F’	Frame	Recursive BLTE (deprecated)

Compression Formats

None (0x4E)

Uncompressed data follows immediately after the encoding byte:

[0x4E] [raw data...]

ZLib (0x5A)

Standard zlib compression:

[0x5A] [2-byte zlib header] [deflate stream...]

Important: Most implementations skip the zlib header and use raw deflate.

LZ4 (0x34)

LZ4HC (high compression) format:

[0x34] [decompressed_size:8] [compressed_lz4_data...]

decompressed_size: 64-bit little-endian size
Data following the prefix is a single LZ4 block (no sub-blocks)
Provides ~200-300 MB/s decompression speed

Format discrepancy: The WoWDev wiki describes a different LZ4 format with headerVersion (1 byte), 64-bit big-endian size, blockShift (1 byte, range 5-16), and multiple sub-blocks of 1 << blockShift bytes each. Agent.exe 3.13.3 uses the 8-byte LE prefix + single block format documented above. tact::Codec::DecodeLZ4 at 0x6f5fdb is a stub in Agent.exe 3.13.3 (returns error 5), so the LZ4 format cannot be fully verified from this binary version. cascette-rs matches the Agent.exe format. The wiki format may apply to a newer protocol version or a different product.

Encryption Format

Encrypted Block Structure

[0x45] [key_name_size:1] [key_name:8] [iv_size:1] [iv:4] [type:1]
[encrypted_data...]

Fields:

key_name_size: Usually 8
key_name: 64-bit key identifier
iv_size: Usually 4
iv: Initialization vector
type: 0x53 (‘S’) for Salsa20, 0x41 (‘A’) for ARC4 (legacy, not used in TACT 3.13.3+)
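The field layout above can be walked with slice splitting so that the declared sizes, rather than the usual values 8 and 4, drive the parse. This is a sketch with hypothetical names (EncryptedHeader, parse_encrypted_block); it only separates the fields and performs no decryption.

```rust
/// Borrowed view of an encrypted block's fields (illustrative names).
#[derive(Debug, PartialEq)]
pub struct EncryptedHeader<'a> {
    pub key_name: &'a [u8],
    pub iv: &'a [u8],
    pub cipher: u8,
    pub payload: &'a [u8],
}

/// Split `[0x45] [key_name_size] [key_name] [iv_size] [iv] [type] [data...]`.
pub fn parse_encrypted_block<'a>(block: &'a [u8]) -> Result<EncryptedHeader<'a>, &'static str> {
    let (&encoding, rest) = block.split_first().ok_or("empty block")?;
    if encoding != 0x45 {
        return Err("not an encrypted block");
    }

    let (&key_len, rest) = rest.split_first().ok_or("missing key_name_size")?;
    if rest.len() < key_len as usize {
        return Err("truncated key name");
    }
    let (key_name, rest) = rest.split_at(key_len as usize);

    let (&iv_len, rest) = rest.split_first().ok_or("missing iv_size")?;
    // The IV is later zero-padded to an 8-byte nonce, so it cannot exceed 8 bytes.
    if iv_len as usize > 8 || rest.len() < iv_len as usize {
        return Err("invalid or truncated iv");
    }
    let (iv, rest) = rest.split_at(iv_len as usize);

    let (&cipher, payload) = rest.split_first().ok_or("missing cipher type")?;
    if cipher != b'S' && cipher != b'A' {
        return Err("unknown cipher type");
    }
    Ok(EncryptedHeader { key_name, iv, cipher, payload })
}
```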

IV Extension and Modification for Chunks

The IV (typically 4 bytes) is zero-padded to 8 bytes for the Salsa20 nonce:

let mut nonce = [0u8; 8];  // zero-initialized
nonce[..iv_size].copy_from_slice(&iv);
// Remaining bytes stay zero (NOT duplicated)

For multi-chunk files, the IV is XORed with the chunk index before extension:

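fn modify_iv(iv: &mut [u8], chunk_index: usize) {
    // XOR the low 32 bits of the chunk index into the first four IV bytes,
    // least significant byte first; take(4) avoids indexing past a shorter IV
    for (i, byte) in iv.iter_mut().enumerate().take(4) {
        *byte ^= ((chunk_index >> (i * 8)) & 0xFF) as u8;
    }
}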
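Both steps, the chunk-index XOR and the zero-padding to 8 bytes, can be combined into one helper. This is a sketch: chunk_nonce is a hypothetical name, and rejecting IVs longer than 8 bytes is an assumption made so the copy cannot overflow the nonce.

```rust
/// Build the 8-byte Salsa20 nonce for a given chunk.
pub fn chunk_nonce(iv: &[u8], chunk_index: usize) -> Option<[u8; 8]> {
    if iv.len() > 8 {
        return None;
    }
    // Copy the IV into a zeroed buffer; bytes past the IV stay zero.
    let mut nonce = [0u8; 8];
    nonce[..iv.len()].copy_from_slice(iv);

    // XOR the chunk index, least significant byte first, into at most the
    // first four IV bytes. XOR before or after padding gives the same result
    // because the padding bytes are never touched.
    for (i, byte) in nonce.iter_mut().enumerate().take(iv.len().min(4)) {
        *byte ^= ((chunk_index >> (i * 8)) & 0xFF) as u8;
    }
    Some(nonce)
}
```

For chunk index 0 the XOR is a no-op, so single-chunk files use the padded IV unchanged.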

Parsing Algorithm

Step 1: Read BLTE Header

let magic = read_u32_be();  // Must be 0x424C5445 ("BLTE")
let header_size = read_u32_be();

Step 2: Determine Structure

if header_size == 0 {
    // Single chunk file
    // Data starts at offset 8
    // Chunk size = file_size - 8 - 1 (encoding byte)
} else {
    // Multi-chunk file
    // Read extended header and chunk table
    // Data starts at offset header_size, measured from the start of the file
}

The data offset for multi-chunk files is always header_size from the start of the file. The header_size field includes the 8-byte BLTE header.

Step 3: Read Extended Header (if present)

let flags = read_u8();  // 0x0F for standard, 0x10 for extended
let chunk_count = read_u24_be() as usize;  // 24-bit big-endian

// Read chunk information table
let mut chunks = Vec::with_capacity(chunk_count);
for _ in 0..chunk_count {
    chunks.push(ChunkInfo {
        compressed_size: read_u32_be(),
        decompressed_size: read_u32_be(),
        checksum: read_bytes(16),
    });
    // When flags == 0x10, each entry carries a further 16-byte MD5 of the
    // decompressed data, which must be read (or skipped) here
}

Step 4: Process Data Blocks

let mut output = Vec::new();
let mut offset = header_size as usize;

for chunk_info in chunks {
    // Read chunk data (sizes are u32 in the table, slice indices are usize)
    let end = offset + chunk_info.compressed_size as usize;
    let chunk_data = &data[offset..end];

    // Optionally verify MD5 checksum (not done automatically during parsing)
    // let hash = md5::compute(chunk_data);
    // assert_eq!(hash.0, chunk_info.checksum);

    // Decompress based on encoding type
    let decompressed = decompress_chunk(chunk_data)?;
    output.extend_from_slice(&decompressed);

    offset = end;
}

Decompression Implementation

fn decompress_chunk(data: &[u8]) -> Result<Vec<u8>, &'static str> {
    // First byte selects the encoding; the remainder is the payload
    let (&encoding, payload) = data.split_first().ok_or("Empty chunk")?;

    match encoding {
        0x4E => {
            // None - return raw data
            Ok(payload.to_vec())
        },
        0x5A => {
            // ZLib - decompress using deflate
            // Skip the 2-byte zlib header (typically 78 9C) that follows 0x5A
            let deflate_data = payload.get(2..).ok_or("Truncated zlib chunk")?;
            decompress_deflate(deflate_data)
        },
        0x34 => {
            // LZ4 - 8-byte little-endian decompressed size, then one LZ4 block
            if payload.len() < 8 {
                return Err("Truncated LZ4 chunk");
            }
            let mut size_bytes = [0u8; 8];
            size_bytes.copy_from_slice(&payload[..8]);
            let decompressed_size = u64::from_le_bytes(size_bytes);
            decompress_lz4(&payload[8..], decompressed_size as usize)
        },
        0x45 => {
            // Encrypted - requires key
            decrypt_chunk(payload)
        },
        0x46 => {
            // Frame - recursive BLTE
            parse_blte(payload)
        },
        _ => Err("Unknown encoding type"),
    }
}

Real-World Example

Let’s examine the encoding file we fetched earlier:

00000000: 424c 5445 0000 00b4 0f00 0007 0000 0017  BLTE............
          ^^^^^^^^^ ^^^^^^^^^ ^^ ^^^^^^^ ^^^^^^^^^
          Magic     Hdr Size  F  Count   CompSize

Breaking down the header:

- Magic: 0x424C5445 = "BLTE"

- Header Size: 0x000000B4 = 180 bytes

- Flags: 0x0F (standard 24-byte chunk entries)

- Chunk Count: 0x000007 = 7 chunks

- First Chunk Compressed Size: 0x00000017 = 23 bytes

This indicates:

Multi-chunk file (header_size > 0)
7 chunks total
Header size = 12 + (7 * 24) = 180 bytes, matching the header_size field (0xB4)
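The same breakdown can be checked mechanically. The sketch below parses the first 12 bytes of a BLTE file and verifies the header_size formula for both chunk formats; BltePrefix and parse_blte_prefix are hypothetical names, not the cascette-formats API.

```rust
/// Result of reading the fixed part of a BLTE header (illustrative names).
#[derive(Debug, PartialEq)]
pub struct BltePrefix {
    pub header_size: u32,
    /// (flags, chunk_count) when an extended header is present.
    pub table: Option<(u8, u32)>,
}

pub fn parse_blte_prefix(data: &[u8]) -> Result<BltePrefix, &'static str> {
    if data.len() < 8 || !data.starts_with(b"BLTE") {
        return Err("missing BLTE magic");
    }
    let header_size = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);
    if header_size == 0 {
        // Single-chunk file: no extended header, data starts at offset 8.
        return Ok(BltePrefix { header_size, table: None });
    }
    if data.len() < 12 {
        return Err("truncated extended header");
    }
    let flags = data[8];
    // 24-bit big-endian chunk count, widened to u32.
    let chunk_count = u32::from_be_bytes([0, data[9], data[10], data[11]]);
    let entry_size: u32 = match flags {
        0x0F => 24,
        0x10 => 40,
        _ => return Err("unknown flags"),
    };
    if chunk_count == 0 {
        return Err("chunk count must be non-zero");
    }
    // header_size = 12 + chunk_count * entry_size (cannot overflow for a 24-bit count).
    if header_size != 12 + chunk_count * entry_size {
        return Err("header_size does not match chunk table");
    }
    Ok(BltePrefix { header_size, table: Some((flags, chunk_count)) })
}
```

Feeding it the 16 bytes from the hex dump above yields header_size 180 with 7 standard chunks.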

Performance Characteristics

Compression Mode Comparison

Mode	Compression Speed	Decompression Speed	Compression Ratio	Memory Usage
None	~500 MB/s	~500 MB/s	1.0x	Minimal
LZ4	~200 MB/s	~300 MB/s	2-4x	~64 KB
ZLib	~50-150 MB/s	~100-200 MB/s	3-8x	~256 KB

Data Type Recommendations

Data Type	Recommended Mode	Reasoning
Text/Config	ZLib (level 6-9)	High compressibility, access infrequent
Textures	LZ4 or None	Often pre-compressed, need fast access
Audio	None or LZ4	Poor compressibility, streaming required
Models	ZLib (level 3-6)	Structured data compresses well
Temporary	None	Speed critical, short-lived

Special Cases

Headerless Files

When header_size == 0:

Single chunk only
No chunk information table
Data starts immediately at offset 8
Entire remaining file is one compressed block

Empty Chunks

Some chunks may have:

compressed_size == 0
decompressed_size == 0
Usually placeholders or removed content

Large Files

Multi-chunk structure enables parallel decompression and partial/resumable downloads, allowing streaming installation of large files.

Error Handling

Critical checks:

Verify BLTE magic number
Validate that flags is 0x0F (standard) or 0x10 (extended) when an extended header is present
Check chunk count > 0 when header_size > 0
MD5 checksums are available via verify_checksum() on each chunk (not verified automatically during parsing)
Handle unknown encoding types gracefully
Ensure decompressed size matches expected
Enforce maximum decompression size (1 GB) to prevent decompression bombs

Implementation Considerations

Process chunks incrementally rather than loading entire files into memory
Decompress chunks in parallel where possible
Checksum verification is a separate step from parsing (call verify_checksum() on chunk data)
Maximum decompression size is 1 GB (MAX_DECOMPRESSION_SIZE). Chunks claiming a larger decompressed size are rejected

Integration with NGDP

BLTE files in NGDP context:

Fetched using encoding keys from CDN
May be stored in archives or as loose files
Encoding file maps content keys to BLTE-encoded versions
Archive indices point to BLTE data within archives

Debugging Tips

Identifying BLTE Files

# Check for BLTE magic
xxd -l 4 file.bin
# Should show: 424c 5445 (BLTE)

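# Check header size
xxd -s 4 -l 4 file.bin
# Bytes are printed in file order, so they read directly as a big-endian u32
# (do not pass -e, which switches xxd to a little-endian dump)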

Common Issues

Wrong endianness: BLTE uses big-endian, not little-endian
Skipping zlib header: Most implementations skip bytes 1-2 after 0x5A
IV modification: Remember to XOR IV with chunk index for encryption
Checksum validation: Use MD5 of compressed data, not decompressed

Implementation Status

Rust Implementation (cascette-formats)

BLTE parser and builder:

None (N) - Uncompressed passthrough (complete)
ZLib (Z) - Deflate compression using flate2 (complete)
LZ4 (4) - LZ4 compression with proper size headers (complete)
Encrypted (E) - Salsa20 and ARC4 encryption with multi-chunk support (complete)
Frame (F) - Recursive BLTE support (not implemented, deprecated format)
Extended Format - Full support for 0x10 format with dual checksums (complete)

Validation Status:

Byte-for-byte round-trip validation with real WoW files
Successfully processes encoding, root, install, and download files
Integration tests with WoW Classic Era production data
Builder support for creating valid BLTE files programmatically
Both standard (0x0F) and extended (0x10) chunk formats supported

Python Tools (cascette-py)

Analysis and decompression tool supports:

None (N), ZLib (Z), Frame (F) modes
LZ4 (4) - Analysis only, decompression requires Rust implementation
Encrypted (E) - Detection and metadata extraction

See https://github.com/wowemulation-dev/cascette-py for the Python implementation.

References

wowdev.wiki BLTE documentation
See ESpec Format for encoding specification strings
See Salsa20 Encryption for encrypted block details

ESpec (Encoding Specification) Documentation

Overview

ESpec is a domain-specific language used throughout NGDP for specifying BLTE encoding instructions. It defines how content blocks are compressed, encrypted, and structured within BLTE containers. ESpec appears in patch configurations, encoding files, and BLTE block headers.

Grammar Components

ESpec uses single-character identifiers for encoding operations:

Basic Encodings

n: Plain/uncompressed data
z: Zlib compression
e: Encryption
b: Block-based encoding
c: BCPack compression
g: GDeflate compression

Encoding Combinations

ESpec supports nested and sequential encoding operations through composition.

Block Syntax

Size Specifications

Block sizes support unit suffixes:

K: Kilobytes (1024 bytes)
M: Megabytes (1024 * 1024 bytes)
No suffix: Bytes

Count Specifications

Block counts can be:

Exact number: Specific block count (e.g., 3)
Variable: Asterisk (*) for variable block count
Dynamic sizing: Block count of zero with an average block size. Block boundaries are determined dynamically based on content. Distinct from variable (*) block count.

Block Format

b:{size[*count]=encoding}

Components:

size: Block size with optional unit suffix
count: Block count (optional, defaults to 1)
encoding: Encoding specification for blocks
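A sketch of parsing the `size[*count]` part of a block spec, the text to the left of `=`. The names parse_size, BlockCount, and parse_block_size are hypothetical; a bare trailing `*` is read as a variable block count, matching the `16K*=z` style used in the examples in this document.

```rust
/// Parse a size such as "495", "16K" or "1M" into bytes.
pub fn parse_size(text: &str) -> Option<u64> {
    let (digits, multiplier) = if let Some(d) = text.strip_suffix('K') {
        (d, 1024)
    } else if let Some(d) = text.strip_suffix('M') {
        (d, 1024 * 1024)
    } else {
        (text, 1)
    };
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse::<u64>().ok()?.checked_mul(multiplier)
}

#[derive(Debug, PartialEq)]
pub enum BlockCount {
    Exact(u64),
    Variable,
}

/// Parse "size", "size*" or "size*count".
pub fn parse_block_size(spec: &str) -> Option<(u64, BlockCount)> {
    match spec.split_once('*') {
        // No '*': a single block of the given size.
        None => Some((parse_size(spec)?, BlockCount::Exact(1))),
        // Bare '*': as many blocks as the content needs.
        Some((size, "")) => Some((parse_size(size)?, BlockCount::Variable)),
        Some((size, count)) => {
            let count = count.parse::<u64>().ok()?;
            Some((parse_size(size)?, BlockCount::Exact(count)))
        }
    }
}
```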

Grammar Reference

Simple Encodings

plain := "n"
zlib := "z" [ ":" ( level | "{" zlib_params "}" ) ]
zlib_params := ( level | variant ) [ "," ( variant | window_bits ) ] [ "," window_bits ]
encryption := "e" ":" "{" key "," iv "," content_encoding "}"

Zlib supports multiple syntax forms: z, z:9, z:{9}, z:{9,mpq}, z:{9,15}, z:{9,mpq,15}, z:{mpq}, z:{mpq,15}. The second parameter can be either a variant name or a numeric window_bits value.

Block Encoding

block := "b" ":" ( "{" block_spec { "," block_spec } "}" | encoding )
block_spec := size [ "*" [ count ] ] "=" encoding
size := number [ unit ]
unit := "K" | "M"
count := number

A block table can omit braces when it contains a single encoding with no size specification: b:z is equivalent to a single block with no explicit size.

Complex Encodings

encoding := plain | zlib | encryption | block | bcpack | gdeflate
bcpack := "c" [ ":" "{" bcn "}" ]
gdeflate := "g" [ ":" "{" level "}" ]

Examples

Simple Block Encoding

b:{495=z,9673=n}

This specifies:

First block: 495 bytes, zlib compressed
Second block: 9673 bytes, uncompressed

Variable Block Sizes

b:{16K*=z}

This specifies:

Variable number of 16KB blocks
All blocks use zlib compression
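
To make the variable count concrete, the sketch below splits a content length into 16KB blocks. It assumes that the final block simply holds whatever remainder is left, which this document does not state explicitly; variable_block_sizes is a hypothetical helper name.

```rust
/// Split `content_len` bytes into blocks of at most `block_size` bytes.
/// Assumption: the last block carries the remainder and may be shorter.
fn variable_block_sizes(content_len: u64, block_size: u64) -> Vec<u64> {
    assert!(block_size > 0, "block size must be positive");
    let mut sizes = Vec::new();
    let mut remaining = content_len;
    while remaining > 0 {
        let take = remaining.min(block_size);
        sizes.push(take);
        remaining -= take;
    }
    sizes
}
```

Under that assumption, 40KB of content with b:{16K*=z} would be laid out as two 16KB blocks followed by one 8KB block.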

Encrypted Blocks

b:{256K*=e:{key,iv,z}}

This specifies:

Variable number of 256KB blocks
Each block is encrypted with specified key and IV
Content is zlib compressed before encryption

Compression Levels

b:{16K*=z:{6,mpq}}

This specifies:

Variable number of 16KB blocks
Zlib compression level 6
MPQ-compatible compression settings

Mixed Block Types

b:{1K=n,4K*=z,2K=n}

This specifies:

First block: 1KB uncompressed
Variable number of 4KB zlib-compressed blocks
Final block: 2KB uncompressed

Zlib Compression Levels

Level Specification

Zlib compression supports level, variant, and window bits parameters:

z:{level}
z:{level,window_bits}
z:{level,variant}
z:{level,variant,window_bits}

Standard Levels

Valid levels are 1-9:

1: Fastest compression
6: Default compression (balance of speed/size)
9: Maximum compression

Level 0 is not accepted.

Variant Specifications

mpq: MPQ-compatible compression settings
zlib: Standard zlib settings
lz4hc: LZ4HC-compatible compression settings

Window Bits

Zlib window bit count can be specified in range [8, 15]. Two values can be provided (must match). Default is 15.

Compression Examples

z:{1}           # Fast compression
z:{9}           # Maximum compression
z:{6,mpq}       # MPQ-compatible level 6
z:{6,zlib,15}   # Zlib variant with explicit window bits

Encryption Specification

Format

e:{key,iv,content_encoding}

Components

key: Encryption key name, the 64-bit identifier used to look up the key (see Key Format)
iv: Initialization vector
content_encoding: Encoding applied before encryption

Key Format

Keys must be exactly 16 hex characters (8 bytes):

e:{0123456789abcdef,fedcba98,z}

This specifies:

Encryption key: 0123456789abcdef (16 hex chars, 8 bytes)
IV: fedcba98 (8 hex chars, 4 bytes)
Content: zlib compressed before encryption

The parser rejects keys that are not exactly 16 hex characters. The IV must be exactly 8 hex characters (4 bytes).
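
The length rules can be checked with a small hex decoder. This is a sketch with hypothetical names (parse_hex_fixed, parse_key_and_iv); it keeps bytes in the order they appear in the text and makes no claim about how the real parser stores them.

```rust
/// Decode exactly N bytes from a hex string of exactly 2 * N characters.
fn parse_hex_fixed<const N: usize>(text: &str) -> Option<[u8; N]> {
    // The ASCII check also guarantees that byte-index slicing below is safe.
    if text.len() != N * 2 || !text.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; N];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&text[i * 2..i * 2 + 2], 16).ok()?;
    }
    Some(out)
}

/// Validate the key (16 hex chars) and IV (8 hex chars) of an `e:{...}` spec.
fn parse_key_and_iv(key: &str, iv: &str) -> Option<([u8; 8], [u8; 4])> {
    Some((parse_hex_fixed::<8>(key)?, parse_hex_fixed::<4>(iv)?))
}
```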

BCPack Compression

BCPack Usage

c
c:{3}

BCPack compression uses a proprietary algorithm optimized for specific content types. An optional BCn (block compression number) parameter selects the mode, in range [1, 7]:

bcpack := "c" [ ":" "{" bcn "}" ]

Block-Based BCPack

b:{64K*=c}
b:{64K*=c:{5}}

Variable 64KB blocks using BCPack compression.

GDeflate Compression

GDeflate Usage

g
g:{6}

GDeflate is a GPU-accelerated deflate variant designed for DirectStorage. An optional compression level parameter can be specified in range [1, 12]:

gdeflate := "g" [ ":" "{" level "}" ]

Block-Based GDeflate

b:{32K*=g}
b:{32K*=g:{8}}

Variable 32KB blocks using GDeflate compression.

Usage Contexts

PatchConfig Files

ESpec appears in patch-entry lines:

patch-entry = source_hash target_hash size espec

Example:

patch-entry = 1234567890abcdef abcdef1234567890 524288 b:{16K*=z}
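
A minimal sketch of splitting such a line into its fields, following the four-field layout shown above. PatchEntry and parse_patch_entry are hypothetical names, and the sketch assumes the ESpec contains no whitespace.

```rust
#[derive(Debug, PartialEq)]
struct PatchEntry<'a> {
    source_hash: &'a str,
    target_hash: &'a str,
    size: u64,
    espec: &'a str,
}

fn parse_patch_entry(line: &str) -> Option<PatchEntry<'_>> {
    // Split at the first '=' only: the ESpec itself may contain '='.
    let (key, value) = line.split_once('=')?;
    if key.trim() != "patch-entry" {
        return None;
    }
    let mut fields = value.split_whitespace();
    let entry = PatchEntry {
        source_hash: fields.next()?,
        target_hash: fields.next()?,
        size: fields.next()?.parse().ok()?,
        espec: fields.next()?,
    };
    // Reject lines with unexpected trailing fields.
    if fields.next().is_some() {
        return None;
    }
    Some(entry)
}
```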

Encoding Files

Encoding files use ESpec for content encoding specifications:

content_key encoded_key size espec

BLTE Data Blocks

BLTE headers contain ESpec for block processing instructions:

graph TD
    A[BLTE Header] --> B[Block Count]
    A --> C[ESpec]
    C --> D[Block 1 Processing]
    C --> E[Block 2 Processing]
    C --> F[Block N Processing]

Parser Implementation

Tokenization

ESpec parsing requires tokenization of:

Identifiers: Single characters (n, z, e, b, c, g)
Numbers: Decimal integers
Units: Size suffixes (K, M)
Delimiters: Braces, colons, commas, equals, asterisks

Grammar Rules

// Example parser structure (simplified: window_bits, the BCPack BCn value,
// and the GDeflate level are omitted)
enum ESpec {
    Plain,
    Zlib { level: Option<u8>, variant: Option<String> },
    Encryption { key: String, iv: String, content: Box<ESpec> },
    Block { specs: Vec<BlockSpec> },
    BCPack,
    GDeflate,
}

struct BlockSpec {
    size: u64,
    count: BlockCount,
    encoding: ESpec,
}

enum BlockCount {
    Exact(u32),
    Variable,
}

Error Handling

Common parsing errors:

Invalid identifier characters
Malformed block specifications
Missing required parameters
Invalid size or count values
Unbalanced braces

Validation Rules

Size Constraints

Block sizes must be positive integers
Maximum block size typically limited to several MB
Minimum block size typically 1 byte

Count Constraints

Exact block counts must be positive integers; a count of zero selects dynamic sizing (see Count Specifications)
Variable count (*) requires size specification
Total content size must be consistent

Encoding Constraints

Encryption requires valid key and IV lengths
Compression levels must be within algorithm-specific ranges
Nested encodings must be logically valid

Performance Considerations

Block Size Selection

Block sizes depend on usage:

Small blocks (1-4KB): Better for streaming, higher overhead
Medium blocks (16-64KB): Balanced performance
Large blocks (256KB+): Better compression ratios, higher memory usage

Compression Algorithm Selection

Algorithm characteristics:

zlib: Universal compatibility, good compression
BCPack: Optimized for specific content types
GDeflate: Fast compression with good ratios
None (n): Maximum speed, no space savings

Memory Usage

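// Example memory-efficient processing.
// Simplified: every spec is treated as one fixed-size block and `count` is ignored.
// `Result` is assumed to be a crate-level alias; `process_encoding` is defined elsewhere.
fn process_blocks(espec: &ESpec, data: &[u8]) -> Result<Vec<u8>> {
    match espec {
        ESpec::Block { specs } => {
            let mut output = Vec::new();
            let mut offset = 0;

            for spec in specs {
                // Clamp to the data length so a short final block cannot panic
                let end = data.len().min(offset.saturating_add(spec.size as usize));
                let processed = process_encoding(&spec.encoding, &data[offset..end])?;
                output.extend(processed);
                offset = end;
            }

            Ok(output)
        }
        // Non-block encodings apply to the whole buffer
        other => process_encoding(other, data),
    }
}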

Common Patterns

Streaming-Optimized

b:{16K*=z}

Small, consistent block sizes for streaming applications.

Storage-Optimized

b:{1M*=z:{9}}

Large blocks with maximum compression for storage efficiency.

Mixed Content

b:{4K=n,64K*=z,4K=n}

Headers and footers uncompressed, bulk content compressed.

Encrypted Streaming

b:{32K*=e:{key,iv,z:{6}}}

Moderate block sizes with encryption and balanced compression.

Debugging and Validation

ESpec Validation

fn validate_espec(espec: &str) -> Result<ESpec, ESpecError> {
    let parsed = parse_espec(espec)?;
    validate_constraints(&parsed)?;
    Ok(parsed)
}

fn validate_constraints(espec: &ESpec) -> Result<(), ESpecError> {
    match espec {
        // Valid zlib levels are 1-9; level 0 is rejected as well
        ESpec::Zlib { level: Some(level), .. } if !(1..=9).contains(level) => {
            Err(ESpecError::InvalidCompressionLevel(*level))
        }
        ESpec::Block { specs } if specs.is_empty() => {
            Err(ESpecError::EmptyBlockSpec)
        }
        // Additional validation rules...
        _ => Ok(())
    }
}

Round-Trip Testing

#[test]
fn test_espec_round_trip() {
    let original = "b:{16K*=z:{6}}";
    let parsed = parse_espec(original).unwrap();
    let serialized = serialize_espec(&parsed);
    assert_eq!(original, serialized);
}

Integration Examples

BLTE Block Processing

fn process_blte_block(espec: &ESpec, input: &[u8]) -> Result<Vec<u8>> {
    match espec {
        ESpec::Plain => Ok(input.to_vec()),
        // Compression parameters only matter when encoding, not when decoding
        ESpec::Zlib { .. } => decompress_zlib(input),
        ESpec::Encryption { key, iv, content } => {
            let decrypted = decrypt(input, key, iv)?;
            process_blte_block(content, &decrypted)
        }
        ESpec::Block { specs } => process_block_specs(specs, input),
        ESpec::BCPack | ESpec::GDeflate => {
            todo!("BCPack and GDeflate decoding are not covered in this example")
        }
    }
}

Patch Application

fn apply_patch_with_espec(
    source: &[u8],
    patch: &[u8],
    espec: &ESpec
) -> Result<Vec<u8>> {
    let processed_patch = process_blte_block(espec, patch)?;
    apply_binary_patch(source, &processed_patch)
}

Reference Implementation

Complete Parser

// Written against nom 7 function-style combinators.
use nom::{
    branch::alt,
    character::complete::{alphanumeric1, char, digit1},
    combinator::{map, opt},
    sequence::{delimited, pair, preceded},
    IResult,
};

pub fn parse_espec(input: &str) -> IResult<&str, ESpec> {
    alt((
        parse_plain,
        parse_zlib,
        parse_encryption,
        parse_block,
        parse_bcpack,
        parse_gdeflate,
    ))(input)
}

fn parse_plain(input: &str) -> IResult<&str, ESpec> {
    map(char('n'), |_| ESpec::Plain)(input)
}

// Handles "z", "z:{level}", and "z:{level,variant}".
// The short form "z:9" and the window_bits parameter are not covered here.
fn parse_zlib(input: &str) -> IResult<&str, ESpec> {
    map(
        preceded(
            char('z'),
            opt(preceded(
                char(':'),
                delimited(
                    char('{'),
                    // A level, optionally followed by ",variant"
                    pair(digit1, opt(preceded(char(','), alphanumeric1))),
                    char('}'),
                ),
            )),
        ),
        |params: Option<(&str, Option<&str>)>| match params {
            Some((level, variant)) => ESpec::Zlib {
                level: level.parse().ok(),
                variant: variant.map(|s| s.to_string()),
            },
            None => ESpec::Zlib { level: None, variant: None },
        },
    )(input)
}

// parse_encryption, parse_block, parse_bcpack, and parse_gdeflate follow the
// same pattern and are omitted here.

Implementation Status

Rust Implementation (cascette-formats)

ESpec parser:

Plain (n) - Uncompressed content
ZLib compression (z) - Level [1,9], variant (mpq/zlib/lz4hc), window bits [8,15]; all optional, 3-param syntax supported
Encryption (e) - Key, IV, and nested content encoding
Block-based (b) - Variable and fixed block specifications
BCPack (c) - Optional BCn version [1,7]; bare c accepted
GDeflate (g) - Optional level [1,12]; bare g accepted

Parser Features:

Safe integer casting with try_from to prevent truncation
Display trait implementation for round-trip string conversion
Test suite covering production ESpec patterns and edge cases
Integration with BLTE and Encoding file processing

Analysis and Validation

ESpec patterns are validated across all CASC formats to ensure correct parsing and processing of compression and encryption specifications.

Salsa20 Encryption in CASC

Salsa20 is the primary stream cipher used for encrypting sensitive content in CASC archives. It provides fast, secure encryption for game assets while maintaining streaming capabilities.

Overview

CASC uses Salsa20 with 128-bit (16-byte) keys and the tau (“expand 16-byte k”) constants. Each encrypted BLTE block specifies a 64-bit key name for key store lookup and a 4-byte IV that is extended to 8 bytes by zero-padding.

Algorithm Details

Salsa20 Core

Salsa20 is a stream cipher designed by Daniel J. Bernstein:

Key size: 128 bits (16 bytes) in CASC; 256 bits (32 bytes) in standard Salsa20
Nonce/IV size: 64 bits (8 bytes)
Block size: 512 bits (64 bytes)
Rounds: 20 (reduced variants use 8 or 12)

Core Function

fn salsa20_core(input: &[u32; 16]) -> [u32; 16] {
    let mut x = *input;

    // 20 rounds (10 double-rounds)
    for _ in 0..10 {
        // Column round
        quarter_round(&mut x, 0, 4, 8, 12);
        quarter_round(&mut x, 5, 9, 13, 1);
        quarter_round(&mut x, 10, 14, 2, 6);
        quarter_round(&mut x, 15, 3, 7, 11);

        // Row round
        quarter_round(&mut x, 0, 1, 2, 3);
        quarter_round(&mut x, 5, 6, 7, 4);
        quarter_round(&mut x, 10, 11, 8, 9);
        quarter_round(&mut x, 15, 12, 13, 14);
    }

    // Add input to output
    for i in 0..16 {
        x[i] = x[i].wrapping_add(input[i]);
    }

    x
}

fn quarter_round(x: &mut [u32; 16], a: usize, b: usize, c: usize, d: usize) {
    x[b] ^= (x[a].wrapping_add(x[d])).rotate_left(7);
    x[c] ^= (x[b].wrapping_add(x[a])).rotate_left(9);
    x[d] ^= (x[c].wrapping_add(x[b])).rotate_left(13);
    x[a] ^= (x[d].wrapping_add(x[c])).rotate_left(18);
}

CASC Implementation

BLTE Encryption Block

In BLTE files, encrypted blocks use the following format:

[0x45] [key_name_size:1] [key_name:8] [iv_size:1] [iv:4] [type:1]
[encrypted_data...]

Where:

0x45: ‘E’ marker for encrypted block
key_name: 64-bit key identifier
iv: Initialization vector (1-8 bytes, typically 4)
type: 0x53 (‘S’) for Salsa20. 0x41 (‘A’) for ARC4 in legacy CASC versions (not used in TACT 3.13.3+)

Key Lookup

CASC uses a 64-bit key name to look up the 16-byte encryption key from a key store. The agent calls a key getter callback with the key name; there is no key derivation in the encryption path.

use std::collections::HashMap;

struct CASCKeyManager {
    keys: HashMap<u64, [u8; 16]>,  // key_name -> 16-byte key
}

impl CASCKeyManager {
    pub fn get_key(&self, key_name: u64) -> Option<[u8; 16]> {
        self.keys.get(&key_name).copied()
    }
}

IV Modification for Chunks

For multi-chunk BLTE files, the IV is modified per chunk:

fn modify_iv_for_chunk(base_iv: u32, chunk_index: usize) -> u32 {
    let mut iv_bytes = base_iv.to_le_bytes();

    // XOR with chunk index
    for i in 0..4 {
        iv_bytes[i] ^= ((chunk_index >> (i * 8)) & 0xFF) as u8;
    }

    u32::from_le_bytes(iv_bytes)
}

Salsa20 State Setup

State Initialization

struct Salsa20State {
    state: [u32; 16],
    counter: u64,
}

impl Salsa20State {
    pub fn new(key: &[u8; 16], nonce: &[u8; 8]) -> Self {
        let mut state = [0u32; 16];

        // Tau constants "expand 16-byte k" (CASC uses 16-byte keys)
        state[0]  = 0x61707865; // "expa"
        state[5]  = 0x3120646e; // "nd 1"
        state[10] = 0x79622d36; // "6-by"
        state[15] = 0x6b206574; // "te k"

        // 16-byte key placed at positions 1-4 and duplicated at 11-14
        for i in 0..4 {
            let word = u32::from_le_bytes([
                key[i * 4],
                key[i * 4 + 1],
                key[i * 4 + 2],
                key[i * 4 + 3],
            ]);
            state[1 + i] = word;
            state[11 + i] = word;  // Duplicate for 16-byte key mode
        }

        // Counter (initially 0)
        state[8] = 0;
        state[9] = 0;

        // Nonce
        state[6] = u32::from_le_bytes([nonce[0], nonce[1], nonce[2], nonce[3]]);
        state[7] = u32::from_le_bytes([nonce[4], nonce[5], nonce[6], nonce[7]]);

        Salsa20State { state, counter: 0 }
    }
}

Encryption/Decryption

Stream Generation

impl Salsa20State {
    pub fn generate_keystream(&mut self, output: &mut [u8]) {
        let mut pos = 0;

        while pos < output.len() {
            // Generate next block
            let block = salsa20_core(&self.state);

            // Serialize the 16 output words as little-endian bytes.
            // This avoids unsafe code and gives the same result on any host.
            let mut block_bytes = [0u8; 64];
            for (chunk, word) in block_bytes.chunks_exact_mut(4).zip(block.iter()) {
                chunk.copy_from_slice(&word.to_le_bytes());
            }

            // Copy to output
            let copy_len = std::cmp::min(64, output.len() - pos);
            output[pos..pos + copy_len]
                .copy_from_slice(&block_bytes[..copy_len]);

            // Increment counter
            self.increment_counter();
            pos += copy_len;
        }
    }

    fn increment_counter(&mut self) {
        self.counter += 1;
        self.state[8] = (self.counter & 0xFFFFFFFF) as u32;
        self.state[9] = (self.counter >> 32) as u32;
    }
}

Decryption Process

pub fn decrypt_salsa20(
    ciphertext: &[u8],
    key: &[u8; 16],  // CASC uses 16-byte keys (see Salsa20State::new)
    nonce: &[u8; 8]
) -> Vec<u8> {
    let mut state = Salsa20State::new(key, nonce);
    let mut keystream = vec![0u8; ciphertext.len()];
    state.generate_keystream(&mut keystream);

    // XOR ciphertext with keystream
    ciphertext
        .iter()
        .zip(keystream.iter())
        .map(|(c, k)| c ^ k)
        .collect()
}

CASC-Specific Usage

BLTE Decryption

// `chunk_data` starts after the 0x45 ('E') marker byte.
// `Result<T>` is assumed to alias std::result::Result<T, Box<dyn std::error::Error>>.
fn decrypt_blte_chunk(
    chunk_data: &[u8],
    chunk_index: usize,
    key_manager: &CASCKeyManager
) -> Result<Vec<u8>> {
    // Parse encryption header (the conversions fail unless the key name
    // is 8 bytes and the IV is 4 bytes)
    let key_name_size = chunk_data[0] as usize;
    let key_name = u64::from_le_bytes(
        chunk_data[1..1 + key_name_size].try_into()?
    );

    let iv_offset = 1 + key_name_size;
    let iv_size = chunk_data[iv_offset] as usize;
    let base_iv = u32::from_le_bytes(
        chunk_data[iv_offset + 1..iv_offset + 1 + iv_size].try_into()?
    );

    let cipher_type = chunk_data[iv_offset + 1 + iv_size];

    if cipher_type != 0x53 {  // 'S' for Salsa20
        return Err("Not Salsa20 encrypted".into());
    }

    // Get encryption key
    let key = key_manager.get_key(key_name)
        .ok_or("Key not found")?;

    // Modify IV for chunk, then zero-pad it to the 8-byte nonce
    let iv = modify_iv_for_chunk(base_iv, chunk_index);
    let mut nonce = [0u8; 8];
    nonce[..4].copy_from_slice(&iv.to_le_bytes());

    // Decrypt data
    let encrypted_offset = iv_offset + 1 + iv_size + 1;
    let ciphertext = &chunk_data[encrypted_offset..];

    Ok(decrypt_salsa20(ciphertext, &key, &nonce))
}

Known Keys

CASC uses various encryption keys for different content:

// Example key names (actual keys not included for legal reasons)
const CINEMATIC_KEY: u64 = 0xFAC5C7F366D20C85;
const ACHIEVEMENT_KEY: u64 = 0x0123456789ABCDEF;
const PVP_KEY: u64 = 0xDEADBEEFCAFEBABE;

Performance Optimization

SIMD Implementation

Using SIMD for parallel processing:

#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

// Gated like the import above so the snippet is skipped on other architectures
#[cfg(target_arch = "x86_64")]
unsafe fn salsa20_core_simd(input: &[u32; 16]) -> [u32; 16] {
    // Load state into SIMD registers
    let mut row0 = _mm_loadu_si128(input[0..4].as_ptr() as *const __m128i);
    let mut row1 = _mm_loadu_si128(input[4..8].as_ptr() as *const __m128i);
    let mut row2 = _mm_loadu_si128(input[8..12].as_ptr() as *const __m128i);
    let mut row3 = _mm_loadu_si128(input[12..16].as_ptr() as *const __m128i);

    // Perform rounds using SIMD operations
    // ... (implementation details)

    // Store results
    let mut output = [0u32; 16];
    _mm_storeu_si128(output[0..4].as_mut_ptr() as *mut __m128i, row0);
    _mm_storeu_si128(output[4..8].as_mut_ptr() as *mut __m128i, row1);
    _mm_storeu_si128(output[8..12].as_mut_ptr() as *mut __m128i, row2);
    _mm_storeu_si128(output[12..16].as_mut_ptr() as *mut __m128i, row3);

    output
}

Buffered Decryption

For large files:

use std::io::{Read, Result, Write};

struct BufferedSalsa20 {
    state: Salsa20State,
    buffer: [u8; 4096],
    buffer_pos: usize,
}

impl BufferedSalsa20 {
    pub fn decrypt_stream<R: Read, W: Write>(
        &mut self,
        input: &mut R,
        output: &mut W
    ) -> Result<()> {
        let mut cipher_buffer = [0u8; 4096];

        loop {
            let bytes_read = input.read(&mut cipher_buffer)?;
            if bytes_read == 0 {
                break;
            }

            // Generate keystream for this chunk, then XOR the ciphertext into it
            self.state.generate_keystream(&mut self.buffer[..bytes_read]);

            for i in 0..bytes_read {
                self.buffer[i] ^= cipher_buffer[i];
            }

            output.write_all(&self.buffer[..bytes_read])?;
        }

        Ok(())
    }
}

Security Considerations

IV Uniqueness: IVs must not be reused with the same key (CASC handles this via chunk index XOR)
Side Channels: Use constant-time operations for key comparison
Key Storage: CASC encryption keys are static and community-maintained; the TactKeyStore keeps them in memory with redacted debug output

Testing

Test Vectors

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_salsa20_round_trip() {
        let key = [0u8; 16]; // CASC uses 16-byte keys
        let nonce = [0u8; 8];
        let plaintext = b"Hello, World!";

        // Salsa20 is symmetric: XORing with the keystream both encrypts and decrypts
        let ciphertext = decrypt_salsa20(plaintext, &key, &nonce);
        let decrypted = decrypt_salsa20(&ciphertext, &key, &nonce);

        assert_eq!(plaintext, &decrypted[..]);
    }
}

cascette-crypto API

The cascette-crypto crate provides a CASC-specific Salsa20 implementation.

Basic Decryption

use cascette_crypto::salsa20::{decrypt_salsa20, Salsa20Cipher};

// CASC uses 16-byte keys and 4-byte IVs
let key: [u8; 16] = [0x01; 16];
let iv: [u8; 4] = [0x02, 0x03, 0x04, 0x05];
let block_index = 0; // First block in BLTE file

let ciphertext = &[/* encrypted data */];
let plaintext = decrypt_salsa20(ciphertext, &key, &iv, block_index)
    .expect("decryption failed");

In-Place Processing

use cascette_crypto::Salsa20Cipher;

let key: [u8; 16] = [0x42; 16];
let iv: [u8; 4] = [0x11, 0x22, 0x33, 0x44];

let mut cipher = Salsa20Cipher::new(&key, &iv, 0)
    .expect("cipher creation failed");

let mut data = vec![0u8; 1024];
cipher.apply_keystream(&mut data);

TACT Key Management

use cascette_crypto::{TactKeyStore, TactKey};

// Create store with hardcoded WoW keys
let store = TactKeyStore::new();

// Look up key by ID
let key_id = 0xFA505078126ACB3E_u64;
if let Some(key) = store.get(key_id) {
    // Use key for decryption
    println!("Found key: {:02X?}", key);
}

// Add custom key
let mut store = TactKeyStore::empty();
let key = TactKey::from_hex(
    0x1234567890ABCDEF,
    "0123456789ABCDEF0123456789ABCDEF"
).expect("invalid key hex");
store.add(key);

// Load keys from string content (file I/O is caller's responsibility)
let csv_content = "FA505078126ACB3E,BDC51862ABED79B2DE48C8E7E66C6200";
store.load_from_csv(csv_content);

let txt_content = "FA505078126ACB3E BDC51862ABED79B2DE48C8E7E66C6200";
store.load_from_txt(txt_content);

Custom Storage Backends

The TactKeyProvider trait allows implementing custom key storage:

use cascette_crypto::{TactKeyProvider, TactKey, CryptoError};

// Implement for keyring, database, encrypted files, etc.
struct MyKeyStore { /* ... */ }

impl TactKeyProvider for MyKeyStore {
    fn get_key(&self, id: u64) -> Result<Option<[u8; 16]>, CryptoError> {
        // Look up key from your storage backend
        todo!()
    }

    fn add_key(&mut self, key: TactKey) -> Result<(), CryptoError> {
        // Store key in your backend
        todo!()
    }

    // ... other trait methods
}

ARC4 (Legacy)

use cascette_crypto::Arc4Cipher;

// ARC4 used in older BLTE encrypted blocks
let key = b"encryption_key";
let mut cipher = Arc4Cipher::new(key)
    .expect("cipher creation failed");

let encrypted = cipher.encrypt(b"plaintext");

// Decrypt requires fresh cipher instance
let mut cipher = Arc4Cipher::new(key)
    .expect("cipher creation failed");
let decrypted = cipher.decrypt(&encrypted);

Implementation Details

CASC-Specific Differences

The CASC Salsa20 variant differs from standard Salsa20:

Aspect	Standard Salsa20	CASC Salsa20
Key size	32 bytes	16 bytes (duplicated internally)
IV/Nonce size	8 bytes	4 bytes (extended internally)
Constants	“expand 32-byte k”	“expand 16-byte k”
Block index	Counter-based	XORed with IV

Key Duplication

CASC uses 16-byte keys with the “expand 16-byte k” (tau) constants:

// Tau constants for 16-byte keys
state[0]  = 0x61707865; // "expa"
state[5]  = 0x3120646e; // "nd 1"
state[10] = 0x79622d36; // "6-by"
state[15] = 0x6b206574; // "te k"

// Key bytes 0-15 placed at positions 1-4
// Key bytes 0-15 repeated at positions 11-14

IV Extension

The IV modification and zero-padding algorithm is documented in the CASC Implementation section above.
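
As a compact recap, the two steps (XOR with the chunk index, then zero-padding to 8 bytes) can be combined into one function. This is a sketch for the common 4-byte IV case; chunk_nonce is a hypothetical name and not the cascette-crypto API.

```rust
/// Derive the 8-byte Salsa20 nonce for one chunk from a 4-byte IV.
/// Each IV byte is XORed with the matching little-endian byte of the
/// chunk index; the upper four nonce bytes stay zero (zero-padding).
fn chunk_nonce(iv: [u8; 4], chunk_index: u32) -> [u8; 8] {
    let mut nonce = [0u8; 8];
    for (i, index_byte) in chunk_index.to_le_bytes().iter().enumerate() {
        nonce[i] = iv[i] ^ *index_byte;
    }
    nonce
}
```

For chunk index 0 the nonce is just the IV followed by four zero bytes, so single-chunk files are unaffected by the XOR step.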

Validation Status

Integration tests with real WoW encryption keys
Test suite validates against known BLTE ‘E’ mode samples
Zero-allocation keystream generation for performance

Note: CascLib duplicates the IV instead of zero-padding it; cascette-rs had the same bug before it was fixed. Zero-padding is the correct behavior.

TACT Key Coverage

The cascette-crypto crate includes hardcoded TACT keys for major WoW expansions:

Battle for Azeroth, Shadowlands, The War Within, Classic Era

Keys are stored with redacted debug output to prevent accidental logging.

References

Salsa20 Specification
See BLTE Format for encryption in BLTE blocks
See Archives for encrypted content storage

CDN Architecture Documentation

Overview

NGDP uses a Content Delivery Network (CDN) architecture for distributing game content. The system provides geographical distribution of content through HTTP/HTTPS endpoints, with automatic failover and load balancing capabilities.

Note: Code examples in this document illustrate concepts. For working implementations, see the cascette CLI or the cascette-protocol crate.

Discovery and Access Flow

Product Discovery

Product discovery begins with a v1/summary query to the Ribbit TCP service:

sequenceDiagram
    participant Client
    participant Ribbit
    participant CDN

    Client->>Ribbit: v1/summary (TCP)
    Ribbit-->>Client: Available products

    Client->>Ribbit: v2/versions/{product}
    Ribbit-->>Client: Version manifests

    Client->>Ribbit: v2/cdns/{product}
    Ribbit-->>Client: CDN configurations

    Client->>CDN: HTTP GET config files
    CDN-->>Client: BuildConfig, CDNConfig

    Client->>CDN: HTTP GET content files
    CDN-->>Client: Game data

Region Selection

NGDP supports the following regions:

us: United States
eu: Europe
kr: Korea
tw: Taiwan
cn: China (restricted access)
sg: Singapore

HTTPS v2 Endpoints

The v2 API provides three primary endpoints:

versions: Product version information and build manifests
cdns: CDN server configurations and endpoints
bgdl: Background download configurations

Configuration Retrieval Process

Query product versions to get current build information
Retrieve CDN configurations to get the correct Path value
Download BuildConfig and CDNConfig files using the Path from step 2
Parse configuration to locate content files
Begin content download from CDN servers

CRITICAL: Always extract the Path field from CDN responses. Never assume paths based on product names. For example, all WoW products (wow, wow_classic, wow_classic_era, wow_classic_titan, wow_anniversary) use tpr/wow despite having different product codes.

Content Download Workflow

flowchart TD
    A[Get Product Versions] --> B[Select Build]
    B --> C[Get CDN Config]
    C --> D[Download BuildConfig]
    D --> E[Download CDNConfig]
    E --> F[Parse Archive Lists]
    F --> G[Download Content Files]
    G --> H[Verify Content Hashes]

    style A stroke-width:3px
    style H stroke-width:3px
    style C stroke-width:2px,stroke-dasharray:5 5
    style E stroke-width:2px,stroke-dasharray:5 5

CDN URL Construction

URL Pattern

http(s)://{cdn_server}/{cdn_path}/{type}/{hash[0:2]}/{hash[2:4]}/{full_hash}

Component Breakdown

cdn_server: CDN hostname from the Hosts field (e.g., level3.blizzard.com)
cdn_path: Path from the Path field - MUST be extracted from CDN response
type: Content type (config, data, patch)
hash[0:2]: First two characters of content hash
hash[2:4]: Next two characters of content hash
full_hash: Complete content hash
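
The pattern can be sketched as a small URL builder. The function name cdn_url is hypothetical, and the sketch hardcodes the http scheme for brevity.

```rust
/// Build a CDN URL from the host, the Path value, the content type
/// ("config", "data", or "patch"), and a hex hash.
/// Returns None if the hash is too short or is not hexadecimal.
fn cdn_url(host: &str, cdn_path: &str, kind: &str, hash: &str) -> Option<String> {
    // The ASCII hex check also makes the byte-index slicing below safe.
    if hash.len() < 4 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(format!(
        "http://{}/{}/{}/{}/{}/{}",
        host,
        cdn_path.trim_matches('/'),
        kind,
        &hash[0..2],
        &hash[2..4],
        hash
    ))
}
```

The cdn_path argument should always come from the Path field of the CDN response rather than being derived from the product name.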

Path vs ProductPath Distinction

IMPORTANT: The CDN response contains two path fields that serve different purposes:

Path (e.g., tpr/wow): Used for ALL game content including:
- Build configuration files (/config/)
- CDN configuration files (/config/)
- Encoding files (/data/)
- Root files (/data/)
- Archive files (/data/)
- Patch files (/patch/)
- All other game data
ProductPath (e.g., tpr/configs): Used ONLY for:
- Product configuration files that Battle.net agent/launcher use
- These are JSON files containing product metadata and settings
- Example: http://cdn.arctium.tools/tpr/configs/data/{hash[0:2]}/{hash[2:4]}/{hash}

Common mistake: Do NOT use ProductPath for build configs, CDN configs, or any game data files. ProductPath is exclusively for Battle.net launcher product configuration.

Directory Sharding

The two-level directory structure (hash[0:2]/hash[2:4]) distributes files across 65,536 directories, keeping per-directory file counts low for filesystem and CDN edge server performance.

Example URLs

# Configuration file
http://level3.blizzard.com/tpr/wow/config/12/34/1234567890abcdef1234567890abcdef

# Game data file
http://level3.blizzard.com/tpr/wow/data/ab/cd/abcdef1234567890abcdef1234567890

# Patch data
http://level3.blizzard.com/tpr/wow/patch/56/78/567890abcdef1234567890abcdef1234

Real-World Examples

Examples from wow_classic_era version 1.15.7.61582 (archived on Arctium CDN):

# Build configuration (hash: ae66faee0ac786fdd7d8b4cf90a8d5b9)
http://cdn.arctium.tools/tpr/wow/config/ae/66/ae66faee0ac786fdd7d8b4cf90a8d5b9

# CDN configuration (hash: 63eee50d456a6ddf3b630957c024dda0)
http://cdn.arctium.tools/tpr/wow/config/63/ee/63eee50d456a6ddf3b630957c024dda0

# Patch configuration (hash: 474b9630df5b46df5d98ec27c5f78d07)
http://cdn.arctium.tools/tpr/wow/config/47/4b/474b9630df5b46df5d98ec27c5f78d07

# Product configuration (different path structure)
http://cdn.arctium.tools/tpr/configs/data/c9/93/c9934edfc8f217a2e01c47e4deae8454

# Encoding file (using encoding key, not content key!)
# From build config: encoding = b07b881f4527bda7cf8a1a2f99e8622e bbf06e7476382cfaa396cff0049d356b
# Must use the SECOND hash (encoding key): bbf06e7476382cfaa396cff0049d356b
http://cdn.arctium.tools/tpr/wow/data/bb/f0/bbf06e7476382cfaa396cff0049d356b

# Root file: Cannot be fetched directly!
# The root file's encoding key must be looked up in the encoding file first.
# The hash ea8aefdebdbd6429da905c8c6a2b1813 is the content key, not the encoding key.

Note the different path structures:

Most files use /tpr/wow/{type}/
Product configurations use /tpr/configs/data/
Patch files would be under /tpr/wow/patch/

Configuration Files

BuildConfig, CDNConfig, PatchConfig

See Configuration File Formats for the authoritative documentation of BuildConfig, CDNConfig, and PatchConfig fields, formats, and examples.

The key point for CDN access: most BuildConfig fields contain <content-key> <encoding-key> pairs. Use the encoding key (second hash) for CDN fetches. The encoding file must be fetched first to resolve encoding keys for other files.
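
A minimal sketch of picking the encoding key out of one BuildConfig line, assuming the `name = <content-key> <encoding-key>` layout described here. The function name encoding_key_for is hypothetical.

```rust
/// Return the encoding key (second hash) from a BuildConfig line such as
/// "encoding = <content-key> <encoding-key>", or None if the line is for a
/// different field or does not carry two hashes.
fn encoding_key_for<'a>(line: &'a str, field: &str) -> Option<&'a str> {
    let (name, value) = line.split_once('=')?;
    if name.trim() != field {
        return None;
    }
    let mut hashes = value.split_whitespace();
    let _content_key = hashes.next()?;
    hashes.next()
}
```

The returned key is the one to use for the CDN fetch; fields that list only a content key yield None and have to be resolved through the encoding file first.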

CDN Response Structure

Field Definitions

Name: CDN configuration identifier
Path: Base path for content requests
Hosts: List of CDN hostnames
Servers: Legacy server configuration
ConfigPath: Path to product configuration files (tpr/configs/data in the example below; compare ProductPath above)

Special Parameters

maxhosts: Maximum number of hosts to use simultaneously
fallback: Fallback CDN configuration

Example CDN Response

Name!STRING:0|Path!STRING:0|Hosts!STRING:0|Servers!STRING:0|ConfigPath!STRING:0
us|tpr/wow|level3.blizzard.com edgecast.blizzard.com|http://level3.blizzard.com/ http://edgecast.blizzard.com/|tpr/configs/data
eu|tpr/wow|eu.cdn.blizzard.com|http://eu.cdn.blizzard.com/|tpr/configs/data
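
A response in this shape can be read by locating the columns by header name and then splitting each row on the pipe character. The sketch below is illustrative only; CdnEntry and parse_cdns are hypothetical names and not the cascette-protocol API.

```rust
#[derive(Debug)]
struct CdnEntry {
    name: String,
    path: String,
    hosts: Vec<String>,
}

/// Parse a pipe-separated CDN response into entries.
/// Columns are found by header name, so column order does not matter.
fn parse_cdns(response: &str) -> Vec<CdnEntry> {
    let mut lines = response.lines().filter(|l| !l.trim().is_empty());
    let header = match lines.next() {
        Some(h) => h,
        None => return Vec::new(),
    };
    // Header cells look like "Path!STRING:0"; keep the part before '!'.
    let columns: Vec<&str> = header
        .split('|')
        .map(|cell| cell.split('!').next().unwrap_or(cell))
        .collect();
    let find = |wanted: &str| columns.iter().position(|c| *c == wanted);
    let (name_i, path_i, hosts_i) = match (find("Name"), find("Path"), find("Hosts")) {
        (Some(n), Some(p), Some(h)) => (n, p, h),
        _ => return Vec::new(),
    };
    lines
        .filter_map(|line| {
            let cells: Vec<&str> = line.split('|').collect();
            Some(CdnEntry {
                name: cells.get(name_i)?.to_string(),
                path: cells.get(path_i)?.to_string(),
                // Hosts are separated by spaces within the cell.
                hosts: cells.get(hosts_i)?.split_whitespace().map(str::to_string).collect(),
            })
        })
        .collect()
}
```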

Path Types

Content Types

config: Configuration files (BuildConfig, CDNConfig, etc.)
data: Game content files and archives
patch: Differential patch data

Usage Patterns

# Configuration files
/{cdn_path}/config/{hash_dirs}/{hash}

# Game data
/{cdn_path}/data/{hash_dirs}/{hash}

# Patch data
/{cdn_path}/patch/{hash_dirs}/{hash}

Implementation Requirements

Mandatory Components

Both BuildConfig AND CDNConfig are required for proper NGDP operation:

BuildConfig provides system file references
CDNConfig specifies content storage locations
Missing either file prevents content access

CDN Path Resolution

Extract the Path field from CDN responses as described in the Configuration Retrieval Process section. Cache the path per product for the session duration.

Fallback Logic

Implement fallback mechanisms:

CDN Rotation: Cycle through available CDN servers
Region Fallback: Fall back to alternate regions if available
Protocol Fallback: HTTPS preferred, HTTP as fallback
Retry Logic: Exponential backoff for failed requests
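
Two of these mechanisms can be sketched as pure functions: a capped exponential backoff delay and an ordered list of candidate URLs that tries HTTPS before HTTP for each host. Both function names are hypothetical, and the per-host ordering is one possible reading of the rotation and protocol rules above.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt numbers.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}

/// Candidate URLs in the order they would be tried: for each host,
/// HTTPS first, then HTTP as the fallback.
fn candidate_urls(hosts: &[&str], suffix: &str) -> Vec<String> {
    let mut urls = Vec::new();
    for host in hosts {
        for scheme in ["https", "http"] {
            urls.push(format!("{}://{}/{}", scheme, host, suffix));
        }
    }
    urls
}
```

A download loop would walk candidate_urls in order and sleep for backoff_delay(attempt, ...) between full passes over the list.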

Rate Limiting

Implement client-side rate limiting:

Respect CDN server limitations
Implement connection pooling
Use appropriate request timeouts
Avoid overwhelming CDN infrastructure

Regional Restrictions

China (cn) region has special considerations:

Limited CDN access
Different server infrastructure
Potential connectivity restrictions
Requires region-specific handling

Backup Servers

Community Mirrors

Several community-maintained mirrors provide NGDP content:

cdn.arctium.tools

Protocol: HTTP only
Status: Active
Coverage: Full NGDP content mirror

casc.wago.tools

Protocol: HTTP with HTTPS redirects
Status: Active
Coverage: Full NGDP mirror

archive.wow.tools

Protocol: HTTPS
Status: Active
Coverage: Historical NGDP content archive

Mirror Usage

# Primary CDN (preferred)
curl http://level3.blizzard.com/tpr/wow/data/12/34/1234567890abcdef

# Backup mirror
curl http://cdn.arctium.tools/tpr/wow/data/12/34/1234567890abcdef

File Types

Core Manifests

System files that define content structure:

root: Maps file paths to content keys
encoding: Maps content keys to encoded storage keys
install: Defines installation requirements and file tags
download: Specifies download priorities for streaming
size: Contains file size information

Storage Files

Content storage and indexing:

archives: Bulk content storage containers
indexes: Index files for locating content within archives

Encryption Files

Content protection and key management:

KeyRing: Encryption key storage format for protected content

File Type Usage

graph TD
    A[BuildConfig] --> B[Root File]
    A --> C[Encoding File]
    A --> D[Install Manifest]
    A --> E[Download Manifest]

    B --> F[Game Files]
    C --> G[Archive Content]
    D --> H[Installation Tags]
    E --> I[Download Priorities]

    J[CDNConfig] --> K[Archive Files]
    K --> L[Archive Indices]

    M[KeyRing] --> N[Encryption Keys]
    N --> O[Protected Content]

    style A stroke-width:4px
    style J stroke-width:4px
    style M stroke-width:3px,stroke-dasharray:5 5
    style B stroke-width:2px
    style C stroke-width:2px
    style D stroke-width:2px
    style E stroke-width:2px

Error Handling

HTTP Status Codes

200: Successful content retrieval
404: Content not found (may require fallback)
416: Range not satisfiable (check request headers)
503: Service unavailable (implement retry with backoff)
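One way to turn the status list above into client behavior is a small mapping function. The sketch below is illustrative only: `FetchAction` and `action_for_status` are hypothetical names, and accepting 206 is an addition here (it is the standard reply to a successful Range request, which the list does not mention).

```rust
/// What a client might do next for a given CDN status code (illustrative).
#[derive(Debug, PartialEq, Eq)]
pub enum FetchAction {
    /// Use the body, after content verification
    Accept,
    /// Try another CDN host or mirror
    TryFallback,
    /// The Range header is wrong; retrying unchanged will not help
    FixRequest,
    /// Retry the same host after a backoff delay
    RetryWithBackoff,
    /// Surface the error to the caller
    Fail,
}

pub fn action_for_status(status: u16) -> FetchAction {
    match status {
        // 206 Partial Content is the normal success code for range requests
        200 | 206 => FetchAction::Accept,
        404 => FetchAction::TryFallback,
        416 => FetchAction::FixRequest,
        503 => FetchAction::RetryWithBackoff,
        _ => FetchAction::Fail,
    }
}
```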

Retry Strategies

#![allow(unused)]
fn main() {
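use std::time::Duration;

// Example retry logic. `download` and the `Result` alias stand in for the
// caller's own fetch function and error type.
async fn download_with_retry(url: &str, max_retries: u32) -> Result<Vec<u8>> {
    let mut attempts = 0;

    loop {
        match download(url).await {
            Ok(data) => return Ok(data),
            Err(_) if attempts < max_retries => {
                attempts += 1;
                // Exponential backoff: 2s, 4s, 8s, ...
                let delay = Duration::from_secs(2_u64.pow(attempts));
                tokio::time::sleep(delay).await;
            }
            // Retries exhausted: surface the last error
            Err(e) => return Err(e),
        }
    }
}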
}

Content Verification

Always verify downloaded content:

Check HTTP response status
Verify content length if provided
Validate content hash against expected value
Retry from alternate CDN on mismatch

Streaming Architecture Implementation

Connection Pooling Architecture

#![allow(unused)]
fn main() {
/// Connection-pooled CDN client with retry logic
pub struct PooledCdnClient {
    /// Inner CDN client
    inner: CdnClient,
    /// Maximum concurrent connections
    max_connections: usize,
    /// Maximum retry attempts
    max_retries: usize,
    /// Initial retry delay
    retry_delay: Duration,
}

impl PooledCdnClient {
    /// Fetch range with exponential backoff retry logic
    pub async fn fetch_range_with_retry(
        &self,
        archive_hash: &str,
        offset: u64,
        size: u64,
    ) -> ArchiveResult<Vec<u8>> {
        let mut last_error = None;

        for attempt in 0..=self.max_retries {
            match self.inner.fetch_range(archive_hash, offset, size).await {
                Ok(data) => return Ok(data),
                Err(e) if attempt < self.max_retries && e.is_retryable() => {
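                    // Exponential backoff: retry_delay x1, x2, x4, x8...
                    // (100ms, 200ms, 400ms, 800ms with a 100ms initial delay).
                    // Saturating arithmetic avoids overflow for large attempt counts.
                    let delay = self
                        .retry_delay
                        .saturating_mul(2u32.saturating_pow(attempt as u32));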
                    tokio::time::sleep(delay).await;
                    last_error = Some(e);
                }
                Err(e) => return Err(e),
            }
        }

        Err(last_error.unwrap_or_else(||
            ArchiveError::NetworkError("All retries exhausted".to_string())))
    }
}
}

CDN Failover Mechanisms

#![allow(unused)]
fn main() {
/// Resilient archive resolver with fallback support
pub struct ResilientArchiveResolver {
    /// Primary resolver
    primary: CdnArchiveResolver,
    /// Fallback resolvers
    fallbacks: Vec<CdnArchiveResolver>,
    /// Error threshold before switching to fallback
    error_threshold: usize,
    /// Current error count (atomic for thread safety)
    error_count: AtomicUsize,
}

impl ResilientArchiveResolver {
    /// Fetch content with automatic fallback
    pub async fn fetch_content_resilient(&self, encoding_key: &[u8; 16]) -> ArchiveResult<Vec<u8>> {
        // Try primary resolver first
        match self.primary.fetch_content(encoding_key).await {
            Ok(content) => {
                // Reset error count on success
                self.error_count.store(0, Ordering::Relaxed);
                return Ok(content);
            }
            Err(e) if e.is_permanent() => return Err(e),
            Err(e) => {
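                // fetch_add returns the previous value; add 1 for the current count.
                // Using the returned value avoids a second, racy load.
                let errors = self.error_count.fetch_add(1, Ordering::Relaxed) + 1;

                // Try fallback resolvers once the error threshold is reached
                if errors >= self.error_threshold {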
                    for fallback in &self.fallbacks {
                        if let Ok(content) = fallback.fetch_content(encoding_key).await {
                            return Ok(content);
                        }
                    }
                }

                Err(e)
            }
        }
    }
}
}

Range Request Coalescing

#![allow(unused)]
fn main() {
/// Streaming archive reader for network content
pub struct StreamingArchiveReader {
    /// CDN client for network operations
    client: Arc<PooledCdnClient>,
    /// Current archive being read
    archive_hash: String,
    /// Current offset in archive
    current_offset: u64,
    /// Remaining size to read
    remaining_size: u64,
    /// Chunk size for streaming reads (default 64KB)
    chunk_size: u64,
}

impl StreamingArchiveReader {
    /// Read the next chunk (at most `chunk_size` bytes) in a single range request
    pub async fn read_chunk(&mut self) -> ArchiveResult<Option<Vec<u8>>> {
        if self.remaining_size == 0 {
            return Ok(None);
        }

        let chunk_size = self.chunk_size.min(self.remaining_size);

        let data = self
            .client
            .fetch_range_with_retry(&self.archive_hash, self.current_offset, chunk_size)
            .await?;

        // Verify response size matches request
        if data.len() as u64 != chunk_size {
            return Err(ArchiveError::IncompleteRangeResponse {
                requested: chunk_size,
                received: data.len() as u64,
            });
        }

        self.current_offset += chunk_size;
        self.remaining_size -= chunk_size;

        Ok(Some(data))
    }

    /// Read all remaining data in one request (coalescing)
    pub async fn read_all(&mut self) -> ArchiveResult<Vec<u8>> {
        if self.remaining_size == 0 {
            return Ok(Vec::new());
        }

        let data = self
            .client
            .fetch_range_with_retry(&self.archive_hash, self.current_offset, self.remaining_size)
            .await?;

        self.current_offset += self.remaining_size;
        self.remaining_size = 0;

        Ok(data)
    }
}
}

Circuit Breaker Pattern

#![allow(unused)]
fn main() {
/// Circuit breaker states for CDN resilience
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CircuitState {
    Closed,    // Normal operation
    Open,      // Failing fast, not attempting requests
    HalfOpen,  // Testing if service recovered
}

/// Circuit breaker for CDN endpoints
pub struct CdnCircuitBreaker {
    state: Arc<Mutex<CircuitState>>,
    failure_count: Arc<AtomicUsize>,
    failure_threshold: usize,
    timeout: Duration,
    last_failure: Arc<Mutex<Option<Instant>>>,
}

/// Error returned by `execute`. This wrapper is illustrative: it gives the
/// open-circuit case a concrete value without constraining the request's error type.
#[derive(Debug)]
pub enum CircuitError<E> {
    /// Circuit is open; the request was not attempted
    Open,
    /// The request ran and failed
    Request(E),
}

impl CdnCircuitBreaker {
    /// Execute request with circuit breaker protection
    pub async fn execute<F, T, E>(&self, request: F) -> Result<T, CircuitError<E>>
    where
        F: Future<Output = Result<T, E>>,
        E: std::fmt::Debug,
    {
        // Copy the state out so the guard is dropped before any lock is retaken.
        // Matching on the locked guard and locking again inside would deadlock.
        let state = *self.state.lock().unwrap();
        if state == CircuitState::Open {
            let last_failure = *self.last_failure.lock().unwrap();
            match last_failure {
                // Timeout period has passed: transition to half-open
                Some(at) if at.elapsed() > self.timeout => {
                    *self.state.lock().unwrap() = CircuitState::HalfOpen;
                }
                // Still within the timeout: fail fast
                Some(_) => return Err(CircuitError::Open),
                None => {}
            }
        }
        // HalfOpen allows a test request; Closed is normal operation

        // Execute request
        match request.await {
            Ok(result) => {
                // Success - reset failure count and close circuit
                self.failure_count.store(0, Ordering::Relaxed);
                *self.state.lock().unwrap() = CircuitState::Closed;
                Ok(result)
            }
            Err(error) => {
                // Failure - increment count and possibly open circuit
                let failures = self.failure_count.fetch_add(1, Ordering::Relaxed) + 1;
                if failures >= self.failure_threshold {
                    *self.state.lock().unwrap() = CircuitState::Open;
                    *self.last_failure.lock().unwrap() = Some(Instant::now());
                }
                Err(CircuitError::Request(error))
            }
        }
    }
}
}

Caching Strategy

Implement efficient caching:

Cache configuration files with appropriate TTL
Use content-addressed storage for game files
Implement cache invalidation for updated content
Support offline operation with cached content

Security Considerations

Transport: Prefer HTTPS with certificate validation for CDN requests; when a host is HTTP-only, content hash verification is the only integrity check
Content Integrity: Verify MD5 content hashes after download; reject mismatches and retry from an alternate CDN
Encryption Keys: CASC uses static community-maintained keys; see Salsa20 Encryption for key management details

Ribbit Protocol

Ribbit is a TCP-based protocol operating on port 1119 that serves as the discovery mechanism for NGDP. It provides version information, CDN endpoints, and configuration data for Blizzard products.

Protocol Variants

Ribbit has three access methods:

TCP Ribbit

Direct TCP connection to tcp://{region}.version.battle.net:1119

V1 Protocol: MIME-formatted responses with ASN.1 signatures and SHA-256 checksums
V2 Protocol: Raw BPSV responses without metadata
Endpoints: summary, products, certificates, and OCSP

HTTP TACT v1

HTTP wrapper at http://{region}.patch.battle.net:1119

Endpoints: /{product}/versions, /{product}/cdns, /{product}/bgdl
Response format: BPSV directly without MIME wrapping
No authentication: Public access
Connection pooling: Reusable HTTP connections

HTTPS TACT v2

Secure wrapper at https://{region}.version.battle.net (standard HTTPS port 443)

Same endpoints as HTTP TACT v1
TLS encryption: Standard HTTPS security
HTTP/2 support: Multiplexing for concurrent requests
Response format: BPSV directly

Protocol Flow

sequenceDiagram
    participant Client
    participant Ribbit as Ribbit Server
    participant Cache as Local Cache

    Client->>Cache: Check cached sequence
    Cache-->>Client: Return cached seqn

    Client->>Ribbit: TCP Connect (port 1119)
    Client->>Ribbit: Send command + \n
    Ribbit->>Ribbit: Process request
    Ribbit-->>Client: Send response
    Ribbit->>Client: Close connection

    Client->>Client: Parse response
    Client->>Client: Extract sequence number

    alt Sequence changed
        Client->>Cache: Update cache
        Client->>Client: Process new data
    else Sequence unchanged
        Client->>Client: Use cached data
    end

Endpoints

Endpoint Comparison

Endpoint	TCP Ribbit	HTTP TACT v1	HTTPS TACT v2
Summary	`v1/summary`	✗	✗
Product versions	`v1/products/{product}/versions`	`/{product}/versions`	`/{product}/versions`
CDN config	`v1/products/{product}/cdns`	`/{product}/cdns`	`/{product}/cdns`
Background download	`v1/products/{product}/bgdl`	`/{product}/bgdl`	`/{product}/bgdl`
Certificates	`v1/certs/{id}`	✗	✗
OCSP	`v1/ocsp/{id}`	✗	✗

Response Format Comparison

Protocol	Response Format	Signature	Checksum
TCP Ribbit V1	MIME multipart with BPSV	PKCS#7/CMS	SHA-256
TCP Ribbit V2	Raw BPSV	None	None
HTTP TACT v1	Raw BPSV	None	None
HTTPS TACT v2	Raw BPSV	None	None

Note: The certificate and OCSP endpoints were part of Blizzard’s custom PKI infrastructure, now replaced by system trust stores.

Certificate and Signature Verification

V1 Signature Structure

V1 responses include PKCS#7/CMS signatures for authenticity:

SignedData Structure

Content Type: PKCS#7 SignedData (OID: 1.2.840.113549.1.7.2)
Signer Identification: IssuerAndSerialNumber or SubjectKeyIdentifier
Certificates: Embedded in CertificateSet or fetched via SKI
Signed Attributes: Optional, DER-encoded as SET for verification

Supported Algorithms

Digest Algorithms:

SHA-256 (OID: 2.16.840.1.101.3.4.2.1)
SHA-384 (OID: 2.16.840.1.101.3.4.2.2)
SHA-512 (OID: 2.16.840.1.101.3.4.2.3)

Signature Algorithms:

RSA with SHA-256 (OID: 1.2.840.113549.1.1.11)
RSA with SHA-384 (OID: 1.2.840.113549.1.1.12)
RSA with SHA-512 (OID: 1.2.840.113549.1.1.13)

Verification Process

Basic Flow

Extract Signature: From MIME part with Content-Disposition: signature
Parse PKCS#7 Structure: Extract SignedData from ContentInfo
Identify Signer: Match via IssuerAndSerialNumber or SubjectKeyIdentifier
Extract Public Key: From embedded certificate or fetch via endpoint
Verify Signature: Process depends on signed attributes presence
Validate Checksum: SHA-256 of content matches epilogue

Signed Attributes Processing

When signed attributes are present (typical case):

Re-encode as DER SET:
- Convert from implicit [0] to SET OF (tag 0x31)
- Sort attributes in DER canonical order
- Apply proper DER length encoding
Verify Against SET:
- Signature verifies the DER-encoded SET
- Message digest attribute must match content hash
Without Signed Attributes:
- Signature directly verifies message content
- Direct RSA verification of content hash
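The tag conversion step can be sketched as follows. This minimal example assumes the signer already emitted the attributes in DER form, so ordering and length encoding are already canonical and only the leading tag byte changes; a BER-encoded input would need a full re-encode with an ASN.1 library. `signed_attrs_to_set` is a hypothetical helper name.

```rust
/// Tag of the IMPLICIT [0] signedAttrs field as it appears inside SignerInfo.
const IMPLICIT_CONTEXT_0: u8 = 0xA0;
/// Universal tag for a constructed SET OF.
const SET_OF: u8 = 0x31;

/// Return the bytes the signature is verified against: the signedAttrs
/// field with its leading tag rewritten from [0] to SET OF.
/// Returns None if the input does not start with the [0] tag.
/// Assumption: the input is already DER, so no re-sorting is performed here.
pub fn signed_attrs_to_set(raw: &[u8]) -> Option<Vec<u8>> {
    match raw.first() {
        Some(&IMPLICIT_CONTEXT_0) => {
            let mut out = raw.to_vec();
            out[0] = SET_OF;
            Some(out)
        }
        _ => None,
    }
}
```

The returned bytes are what gets hashed and checked with the RSA public key; the message digest attribute inside them is compared separately against the hash of the content.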

RSA Verification Details

Padding Scheme: PKCS#1 v1.5
Key Format: Parse SubjectPublicKeyInfo to extract RSA public key
Signature Format: Raw signature bytes converted to RSA signature object
Hash Algorithms: SHA-256, SHA-384, or SHA-512 based on OID

Certificate Fetching

When certificates are not embedded:

Extract Subject Key Identifier from signer info
Request certificate via /v1/certs/{ski} endpoint
Validate SKI matches between signature and certificate
Extract public key for verification

Implementation Strategies

Parsing Approaches:

Primary: Use ASN.1/CMS parsing libraries
Fallback: Pattern-based manual parsing for compatibility
Handle both embedded and detached signatures

Key Extraction:

Parse SubjectPublicKeyInfo structure
Extract RSA public key in PKCS#1 format
Determine key size from modulus length

Critical Implementation Details:

SET Encoding: Signed attributes MUST be re-encoded as DER SET for verification
Canonical Ordering: Attributes sorted for DER canonical form
Dual Verification Paths: Different handling for signed vs unsigned attributes
Base64 Detection: Signatures may be binary or base64-encoded in MIME

Error Handling:

Invalid ASN.1 structures
Missing or mismatched certificates
Unsupported algorithms
Signature verification failures
DER encoding errors

Regional Servers

Available regions for {region}.version.battle.net:

us - United States
eu - Europe
kr - Korea
tw - Taiwan
sg - Singapore
cn - China (restricted to China-only access)

BPSV Format

Blizzard Pipe-Separated Values (BPSV) is the data format for responses:

Structure

Header line: Column names with type annotations
Data lines: Pipe-separated values
Sequence line: ## seqn = {number} (exact format with spaces required)

Data Types

STRING:0 - Variable-length string
HEX:16 - 16-byte hexadecimal value (MD5 hash)
DEC:4 - 4-byte decimal integer

Example

Region!STRING:0|BuildConfig!HEX:16|CDNConfig!HEX:16|BuildId!DEC:4|VersionsName!String:0
us|be2bb98dc28aee05bbee519393696cdb|fac77b9ca52c84ac28ad83a7dbe1c829|61491|11.1.7.61491
eu|be2bb98dc28aee05bbee519393696cdb|fac77b9ca52c84ac28ad83a7dbe1c829|61491|11.1.7.61491
## seqn = 2241282
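A minimal parser for this layout might look like the sketch below. The type and function names (`Column`, `BpsvDocument`, `parse_bpsv`) are illustrative rather than the cascette-formats API. Type names are upper-cased on read so that `String:0` and `STRING:0` are treated alike, and other lines starting with `#` are skipped as an assumption.

```rust
/// One typed column from the header line, e.g. `BuildConfig!HEX:16`.
#[derive(Debug, PartialEq, Eq)]
pub struct Column {
    pub name: String,
    /// Upper-cased type name: STRING, HEX, or DEC
    pub kind: String,
    pub width: u32,
}

#[derive(Debug, Default)]
pub struct BpsvDocument {
    pub columns: Vec<Column>,
    pub rows: Vec<Vec<String>>,
    pub seqn: Option<u64>,
}

pub fn parse_bpsv(text: &str) -> Result<BpsvDocument, String> {
    let mut doc = BpsvDocument::default();
    for line in text.lines().map(str::trim_end).filter(|l| !l.is_empty()) {
        if let Some(rest) = line.strip_prefix("## seqn = ") {
            // Sequence line: exact prefix, including the spaces
            let seqn = rest.trim().parse::<u64>().map_err(|e| format!("bad seqn: {e}"))?;
            doc.seqn = Some(seqn);
        } else if line.starts_with('#') {
            // Assumption: any other '#' line is a comment and can be skipped
            continue;
        } else if doc.columns.is_empty() {
            // Header line: name!TYPE:width fields separated by pipes
            for field in line.split('|') {
                let (name, ty) = field
                    .split_once('!')
                    .ok_or_else(|| format!("untyped column: {field}"))?;
                let (kind, width) = ty
                    .split_once(':')
                    .ok_or_else(|| format!("bad type: {ty}"))?;
                let width = width.parse::<u32>().map_err(|e| format!("bad width: {e}"))?;
                doc.columns.push(Column {
                    name: name.to_string(),
                    kind: kind.to_ascii_uppercase(),
                    width,
                });
            }
        } else {
            // Data line: field count must match the header
            let row: Vec<String> = line.split('|').map(str::to_string).collect();
            if row.len() != doc.columns.len() {
                return Err(format!(
                    "expected {} fields, got {}",
                    doc.columns.len(),
                    row.len()
                ));
            }
            doc.rows.push(row);
        }
    }
    if doc.columns.is_empty() {
        return Err("missing header line".to_string());
    }
    Ok(doc)
}
```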

V1 MIME Response Structure

TCP Ribbit V1 responses use MIME multipart format:

MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="{boundary}"

--{boundary}
Content-Type: text/plain
Content-Disposition: data

[BPSV data here]

--{boundary}
Content-Type: application/octet-stream
Content-Disposition: signature

[ASN.1 signature data]

--{boundary}--
Checksum: {64-character SHA-256 hash}

The checksum validation process:

Search for “Checksum: ” pattern at end of response
Extract 64-character hexadecimal checksum
Compute SHA-256 hash of content before checksum line
Compare with provided checksum (case-insensitive)
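The steps above can be sketched without committing to a particular hash library. In this example the SHA-256 function is passed in by the caller (for instance a thin wrapper over a hashing crate), since the standard library has none. `split_checksum` and `verify_checksum` are hypothetical names, and treating every byte before the `Checksum: ` marker as the hashed content is an assumption.

```rust
/// Split a V1 response into (content before the checksum line, lower-cased checksum hex).
pub fn split_checksum(response: &[u8]) -> Option<(&[u8], String)> {
    const MARKER: &[u8] = b"Checksum: ";
    // Search from the end: the checksum line is the MIME epilogue
    let start = response
        .windows(MARKER.len())
        .rposition(|window| window == MARKER)?;
    let hex_start = start + MARKER.len();
    let hex = response.get(hex_start..hex_start + 64)?;
    if !hex.iter().all(u8::is_ascii_hexdigit) {
        return None;
    }
    // Lower-case so the later comparison is case-insensitive
    let checksum = String::from_utf8(hex.to_ascii_lowercase()).ok()?;
    Some((&response[..start], checksum))
}

/// Check the epilogue checksum using a caller-supplied SHA-256 hex function.
pub fn verify_checksum(response: &[u8], sha256_hex: impl Fn(&[u8]) -> String) -> bool {
    match split_checksum(response) {
        Some((content, expected)) => sha256_hex(content).eq_ignore_ascii_case(&expected),
        None => false,
    }
}
```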

Connection Handling

TCP Ribbit Connection Flow

graph TD
    A[Create TCP Socket] --> B[Connect to server:1119]
    B --> C[Send command + \n]
    C --> D[Read response until EOF]
    D --> E[Server closes connection]
    E --> F{Response type?}
    F -->|V1| G[Parse MIME]
    F -->|V2| H[Parse BPSV]
    G --> I[Validate checksum]
    I --> J[Extract BPSV data]
    H --> K[Process data]
    J --> K

    style E stroke-width:4px
    style A stroke-width:3px
    style F stroke-width:3px,stroke-dasharray:5 5
    style I stroke-width:2px
    style K stroke-width:2px

HTTP/HTTPS TACT Connection Flow

graph TD
    A[Connection Pool] --> B{Existing connection?}
    B -->|Yes| C[Reuse connection]
    B -->|No| D[Create new HTTP connection]
    D --> C
    C --> E[Send HTTP request]
    E --> F[Receive response]
    F --> G[Parse BPSV directly]
    G --> H[Return connection to pool]
    H --> I[Process data]

    style H stroke-width:4px,stroke-dasharray:5 5
    style A stroke-width:3px
    style B stroke-width:3px,stroke-dasharray:5 5
    style G stroke-width:2px
    style I stroke-width:2px

Key differences:

TCP: New connection per request, server closes after response
HTTP/HTTPS: Connection pooling, keep-alive, multiple requests per connection

Unified Client Architecture

Protocol Abstraction

A unified client should abstract protocol differences:

graph TD
    A[Unified NGDP Client] --> B{Protocol}
    B -->|TCP| C[Ribbit TCP Client]
    B -->|HTTP| D[TACT HTTP Client]
    B -->|HTTPS| E[TACT HTTPS Client]

    C --> F[BPSV Parser]
    D --> F
    E --> F

    C --> G[Response Types]
    D --> G
    E --> G

    style A stroke-width:4px
    style B stroke-width:3px,stroke-dasharray:5 5
    style F stroke-width:3px
    style G stroke-width:2px
    style C stroke-width:2px
    style D stroke-width:2px
    style E stroke-width:2px

Common Interface

All protocol variants share common operations:

Get product versions
Get CDN configurations
Get background download info
Parse BPSV responses

Protocol-Specific Features

TCP Ribbit Only:

Summary endpoint
Certificate/OCSP endpoints
MIME response parsing
Signature verification

HTTP/HTTPS TACT Only:

Connection pooling
HTTP/2 multiplexing
Standard HTTP features

Configuration Requirements

Host Configuration:

Default hosts: {region}.version.battle.net or {region}.patch.battle.net
Custom hosts: Support for private servers or testing
Port configuration: 1119 for TCP/HTTP, 443 for HTTPS

Connection Settings:

Timeout configuration (connect, read, total)
Retry logic (count, backoff, jitter)
Pool settings (max connections, idle timeout)
HTTP/2 settings (multiplexing, window size)

Implementation Requirements

TCP Client

Create new connection per request (no pooling)
Send ASCII command terminated with \n
Read response until server closes connection
Default connection timeout: 10 seconds
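The four points above fit in a few lines of blocking `std::net` code. This is a sketch, not the cascette-protocol client: `ribbit_request` is a hypothetical helper, and applying the same 10-second value as a read timeout is an assumption.

```rust
use std::io::{Read, Write};
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Send one Ribbit command over a fresh connection and return the raw response.
/// `addr` is a `host:port` string such as "us.version.battle.net:1119".
pub fn ribbit_request(addr: &str, command: &str) -> std::io::Result<Vec<u8>> {
    let timeout = Duration::from_secs(10);
    let mut last_err =
        std::io::Error::new(std::io::ErrorKind::NotFound, "no addresses resolved");

    // Try each resolved address in turn until one accepts the connection
    for sock_addr in addr.to_socket_addrs()? {
        match TcpStream::connect_timeout(&sock_addr, timeout) {
            Ok(mut stream) => {
                // Assumption: reuse the connect timeout for reads
                stream.set_read_timeout(Some(timeout))?;
                // ASCII command terminated with \n
                stream.write_all(command.as_bytes())?;
                stream.write_all(b"\n")?;
                // The server closes the connection after responding, so EOF ends the read
                let mut response = Vec::new();
                stream.read_to_end(&mut response)?;
                return Ok(response);
            }
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}
```

A caller would pass something like `ribbit_request("us.version.battle.net:1119", "v2/products/wow/versions")` and hand the bytes to the BPSV or MIME parser. Production code should also cap the response size, since there is no standard limit.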

Retry Logic

Production implementations should include retry logic:

Default: 0 retries for backward compatibility
Exponential backoff: 100ms initial, 10s maximum, 2x multiplier
Jitter: 10% randomization to prevent thundering herd
Retryable: Connection, timeout, network failures
Non-retryable: Parse errors, validation failures
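The backoff parameters above translate into a short delay calculation. In this sketch the random input is supplied by the caller so the function stays deterministic and dependency-free; `Backoff` and `delay` are illustrative names, and capping after jitter (so a delay never exceeds the maximum) is a design assumption.

```rust
use std::time::Duration;

/// Backoff parameters; `Default` uses the values listed above.
pub struct Backoff {
    pub initial: Duration,
    pub max: Duration,
    pub multiplier: f64,
    /// Fraction of randomization, e.g. 0.10 for +/- 10%
    pub jitter: f64,
}

impl Default for Backoff {
    fn default() -> Self {
        Backoff {
            initial: Duration::from_millis(100),
            max: Duration::from_secs(10),
            multiplier: 2.0,
            jitter: 0.10,
        }
    }
}

impl Backoff {
    /// Delay before retry number `attempt` (0-based).
    /// `unit_random` must be in [0, 1); the caller draws it from its own RNG.
    pub fn delay(&self, attempt: u32, unit_random: f64) -> Duration {
        // Clamp the exponent so the cast to i32 cannot wrap
        let exponent = attempt.min(64) as i32;
        let base = self.initial.as_secs_f64() * self.multiplier.powi(exponent);
        // Map unit_random to a factor in [1 - jitter, 1 + jitter)
        let factor = 1.0 + self.jitter * (2.0 * unit_random - 1.0);
        let jittered = (base * factor).min(self.max.as_secs_f64());
        Duration::from_secs_f64(jittered)
    }
}
```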

DNS Caching

Implementations may cache DNS lookups:

TTL: 300 seconds (5 minutes) typical
Multiple IPs: Try all resolved addresses sequentially
Thread-safe: Concurrent access protection required

Response Parsing

V1: Parse MIME structure, validate SHA-256 checksum
V2/HTTP/HTTPS: Parse BPSV directly
Handle empty responses (headers without data rows)
Parse typed column headers correctly

Caching

Cache responses with key: {endpoint}-{arguments}-{sequence_number}
Check sequence numbers to detect updates
Sequence numbers only increase (never decrease)
Skip re-downloading if sequence unchanged
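A small sketch of the cache key and the sequence check follows. `cache_key`, `CacheAction`, and `cache_action` are hypothetical names; treating a remote sequence number lower than the cached one as "use cached" relies on the statement above that sequence numbers never decrease.

```rust
/// Build the cache key `{endpoint}-{arguments}-{sequence_number}`.
pub fn cache_key(endpoint: &str, arguments: &str, seqn: u64) -> String {
    format!("{endpoint}-{arguments}-{seqn}")
}

/// What to do after reading the sequence number of a fresh response.
#[derive(Debug, PartialEq, Eq)]
pub enum CacheAction {
    /// Sequence unchanged: keep using cached data
    UseCached,
    /// Nothing cached yet, or the sequence advanced: store and process the new data
    Refresh,
}

pub fn cache_action(cached_seqn: Option<u64>, remote_seqn: u64) -> CacheAction {
    match cached_seqn {
        Some(cached) if remote_seqn <= cached => CacheAction::UseCached,
        _ => CacheAction::Refresh,
    }
}
```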

Product Identifiers

Common product identifiers used with Ribbit:

World of Warcraft

wow - Retail
wow_beta - Beta
wow_classic - Classic
wow_classic_era - Classic Era
wow_classic_ptr - Classic PTR
wow_classic_titan - Classic Titan (CN region only, WotLK 3.80.x with upgraded Classic/TBC raids)
wow_anniversary - Classic Anniversary (TBC 2.5.x, progression through Classic branches on a shortened timeline)
wowt - Public Test Realm
wowz - Internal/Development

Other Products

agent - Battle.net Agent
bna - Battle.net Application

Version Response Fields

Field	Type	Description
Region	STRING:0	Region identifier
BuildConfig	HEX:16	Build configuration hash
CDNConfig	HEX:16	CDN configuration hash
KeyRing	HEX:16	Encryption keys hash
BuildId	DEC:4	Build number
VersionsName	String:0	Version string
ProductConfig	HEX:16	Product configuration hash

CDN Response Fields

Field	Type	Description
Name	STRING:0	CDN name
Path	STRING:0	Base path for content
Hosts	STRING:0	Space-separated host list
Servers	STRING:0	Full URLs with parameters
ConfigPath	STRING:0	Path to configuration files

Error Handling

Connection Errors

Connection timeout: Implement 10-30 second timeout (not automatic)
CN region: Only accessible from within China (will timeout from elsewhere)
Network failures: TCP connection may fail or drop

Response Errors

Empty responses: Some endpoints return headers only (especially bgdl)
404 errors: Not all products have all endpoints
Malformed MIME: V1 responses may have invalid structure
Invalid checksum: V1 checksum validation may fail
Buffer overflow: No standard response size limit, so enforce a maximum size when reading

Parsing Errors

Type inconsistency: Handle String:0 vs STRING:0 in BPSV
Column mismatch: Data rows may not match header count
Invalid sequence format: Must match ## seqn = exactly (with space after equals)
Escaped characters: Pipe characters in values are not escaped

Implementation Notes

Buffer Management

Use appropriate buffer sizes for TCP reads (typically 4KB-8KB)
Stream responses to avoid loading entire response in memory
No standard maximum response size - implement limits as needed

MIME Parsing Complexity

V1 MIME parsing requires multipart message handling
Consider using established MIME libraries
First chunk typically contains BPSV data
Signature chunk identified by Content-Disposition header

Ribbit Server

cascette-ribbit implements a Ribbit protocol server that serves BPSV-formatted game version and CDN configuration data over HTTP and TCP.

For protocol specification details, see Ribbit Protocol.

Architecture

graph TD
    A[cascette-ribbit binary] --> B[Server]
    B --> C[HTTP Server - axum]
    B --> D[TCP Server - tokio]
    C --> E[AppState]
    D --> E
    E --> F[BuildDatabase]
    E --> G[CdnConfig]

    C --> H[HTTP Handlers]
    H --> I[BpsvResponse]

    D --> J{Protocol Version}
    J -->|v1| K[MIME Wrapper + SHA-256]
    J -->|v2| L[Raw BPSV]
    K --> I
    L --> I

Components

Component	File	Purpose
`ServerConfig`	`config.rs`	CLI arguments, env vars, TLS paths
`CdnConfig`	`config.rs`	CDN host/path resolution per region
`BuildDatabase`	`database.rs`	JSON build record storage with product indexing
`BuildRecord`	`database.rs`	Single build entry with MD5 hash validation
`AppState`	`server.rs`	Shared state (database, CDN config, timestamps)
`Server`	`server.rs`	Orchestrates HTTP + TCP listeners
`BpsvResponse`	`responses/bpsv.rs`	BPSV response builder (versions, cdns, summary)
HTTP handlers	`http/handlers.rs`	axum route handlers for /{product}/{endpoint}
TCP handlers	`tcp/handlers.rs`	Command routing for v1/ and v2/ prefixes
V1 wrapper	`tcp/v1.rs`	RFC 2046 MIME wrapping with SHA-256 checksums
V2 handler	`tcp/v2.rs`	Raw BPSV TCP responses

Configuration

CLI Arguments and Environment Variables

Flag	Env Var	Default	Description
`--http-bind`	`CASCETTE_RIBBIT_HTTP_BIND`	`0.0.0.0:8080`	HTTP listen address
`--tcp-bind`	`CASCETTE_RIBBIT_TCP_BIND`	`0.0.0.0:1119`	TCP listen address
`--builds`	`CASCETTE_RIBBIT_BUILDS`	`./builds.json`	Path to build database JSON
`--cdn-hosts`	`CASCETTE_RIBBIT_CDN_HOSTS`	`cdn.arctium.tools`	CDN host(s)
`--cdn-path`	`CASCETTE_RIBBIT_CDN_PATH`	`tpr/wow`	CDN base path
`--tls-cert`	`CASCETTE_RIBBIT_TLS_CERT`	none	TLS certificate path (enables HTTPS)
`--tls-key`	`CASCETTE_RIBBIT_TLS_KEY`	none	TLS private key path

Build Database Format

The server reads build records from a JSON file. Each record represents a product build:

Field	Type	Required	Description
`id`	u64	yes	Unique build identifier
`product`	string	yes	Product code (e.g., `wow`, `wowt`)
`version`	string	yes	Version string (e.g., `1.14.2.42597`)
`build`	string	yes	Build number
`build_config`	string	yes	32-char hex MD5 hash
`cdn_config`	string	yes	32-char hex MD5 hash
`keyring`	string	no	32-char hex MD5 hash
`product_config`	string	no	32-char hex MD5 hash
`build_time`	string	yes	ISO 8601 timestamp
`encoding_ekey`	string	yes	32-char hex encoding key
`root_ekey`	string	yes	32-char hex root key
`install_ekey`	string	yes	32-char hex install key
`download_ekey`	string	yes	32-char hex download key

MD5 hash fields are validated to be exactly 32 lowercase hexadecimal characters.
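The validation rule can be expressed as a one-line check. This is a sketch of the rule as stated, with `is_valid_md5_hex` as a hypothetical name rather than the function `BuildRecord` actually uses.

```rust
/// True if `value` is exactly 32 lowercase hexadecimal characters.
pub fn is_valid_md5_hex(value: &str) -> bool {
    value.len() == 32
        && value
            .bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}
```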

HTTP Endpoints

The HTTP server uses axum with gzip compression and CORS support.

Routes

Route	Handler	Response
`GET /{product}/versions`	`handle_versions`	BPSV versions table
`GET /{product}/cdns`	`handle_cdns`	BPSV CDN configuration
`GET /{product}/bgdl`	`handle_bgdl`	BPSV background download (same as versions)

All responses use Content-Type: text/plain; charset=utf-8.

The server returns HTTP 404 if the product is not found in the database.

TCP Protocol

The TCP server accepts one command per connection. After sending the response, the server closes the connection. A 10-second read timeout applies.

V2 Commands (Raw BPSV)

v2/products/{product}/versions
v2/products/{product}/cdns
v2/products/{product}/bgdl

V1 Commands (MIME-wrapped)

v1/products/{product}/versions
v1/products/{product}/cdns
v1/products/{product}/bgdl
v1/summary

V1 responses wrap BPSV data in RFC 2046 MIME multipart format with a SHA-256 checksum epilogue. The server does not include PKCS#7 signatures (unlike Blizzard’s production servers).
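Command routing for the two prefixes might be sketched as below. The enum and function names are illustrative and not taken from `tcp/handlers.rs`; rejecting `v2/summary` follows the command lists above, where summary appears only under V1.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ProtocolVersion {
    V1,
    V2,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    Summary,
    Versions(String),
    Cdns(String),
    Bgdl(String),
}

/// Parse one command line such as `v1/products/wow/versions`.
/// Returns None for unknown prefixes, endpoints, or extra path segments.
pub fn parse_command(line: &str) -> Option<(ProtocolVersion, Command)> {
    let mut parts = line.trim().split('/');
    let version = match parts.next()? {
        "v1" => ProtocolVersion::V1,
        "v2" => ProtocolVersion::V2,
        _ => return None,
    };
    // Read up to four more segments; the last must be absent
    let command = match (parts.next()?, parts.next(), parts.next(), parts.next()) {
        ("summary", None, None, None) if version == ProtocolVersion::V1 => Command::Summary,
        ("products", Some(product), Some(endpoint), None) if !product.is_empty() => {
            match endpoint {
                "versions" => Command::Versions(product.to_string()),
                "cdns" => Command::Cdns(product.to_string()),
                "bgdl" => Command::Bgdl(product.to_string()),
                _ => return None,
            }
        }
        _ => return None,
    };
    Some((version, command))
}
```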

BPSV Response Format

Versions Response

7 rows, one per region (us, eu, cn, kr, tw, sg, xx):

Region!STRING:0|BuildConfig!HEX:16|CDNConfig!HEX:16|KeyRing!HEX:16|BuildId!DEC:4|VersionsName!STRING:0|ProductConfig!HEX:16
us|0123456789abcdef...|fedcba9876543210...|<keyring>|42597|1.14.2.42597|<product_config>
eu|...|...|...|...|...|...
...
## seqn = 1730534400

CDNs Response

5 rows, one per CDN region (us, eu, kr, tw, cn):

Name!STRING:0|Path!STRING:0|Hosts!STRING:0|Servers!STRING:0|ConfigPath!STRING:0
us|tpr/wow|cdn.arctium.tools|https://cdn.arctium.tools/?maxhosts=4|tpr/wow/config
...
## seqn = 1730534400

Summary Response (TCP v1 only)

One row per product:

Product!STRING:0|Seqn!DEC:4
wow|1730534400
wowt|1730534400
## seqn = 1730534400

Running

Binary

cargo run --bin cascette-ribbit -- --builds ./builds.json

Library

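use cascette_ribbit::{Server, ServerConfig};

// Sketch of an entry point. Assumes a tokio runtime and that the crate's
// error types can be boxed as `dyn std::error::Error`.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = ServerConfig {
        http_bind: "127.0.0.1:8080".parse()?,
        tcp_bind: "127.0.0.1:1119".parse()?,
        builds: "./builds.json".into(),
        cdn_hosts: "cdn.arctium.tools".to_string(),
        cdn_path: "tpr/wow".to_string(),
        tls_cert: None,
        tls_key: None,
    };

    // Validate before binding any listeners
    config.validate()?;
    let server = Server::new(config)?;
    server.run().await?;
    Ok(())
}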

Example

cargo run --example simple_server

Then test with:

# HTTP
curl http://localhost:8080/wow/versions
curl http://localhost:8080/wow/cdns

# TCP v2
echo "v2/products/wow/versions" | nc localhost 1119

# TCP v1
echo "v1/products/wow/versions" | nc localhost 1119

Testing

The crate has four test suites:

Suite	File	Coverage
HTTP integration	`tests/http_test.rs`	HTTP endpoints, status codes, BPSV format
TCP v1 integration	`tests/tcp_v1_test.rs`	MIME wrapping, checksums, summary
TCP v2 integration	`tests/tcp_v2_test.rs`	Raw BPSV over TCP, connection lifecycle
Contract tests	`tests/contract_test.rs`	cascette-protocol client against server

Contract tests verify that cascette-protocol’s RibbitTactClient can query the server and parse responses correctly. This ensures wire-level compatibility between client and server implementations.

cargo test -p cascette-ribbit
cargo bench -p cascette-ribbit

TLS Support

Enable TLS with the tls feature flag:

cargo run --bin cascette-ribbit --features tls -- \
  --tls-cert /path/to/cert.pem \
  --tls-key /path/to/key.pem

When TLS is enabled, the HTTP server serves HTTPS. The TCP server is not affected (Ribbit TCP does not use TLS).

Battle.net Agent

The Battle.net Agent is a local HTTP service that manages game installations and updates. It runs on port 1120 and provides an API for downloading, installing, and managing Blizzard products.

Overview

The agent serves as the bridge between Blizzard’s CDN infrastructure and the local CASC storage. It handles:

Product installation and updates
Download management and prioritization
Local CASC storage maintenance
Installation verification and repair

HTTP API

The agent exposes a REST API on http://127.0.0.1:1120.

Endpoints

Documentation of the agent’s HTTP endpoints is pending.

Installation Flow

When installing a product, the agent:

Queries Ribbit for product version information
Downloads build and CDN configuration
Fetches encoding and root manifests
Downloads required archives from CDN
Writes data to local CASC storage
Updates local indices

cascette-agent

cascette-agent is a replacement implementation of the Battle.net Agent. It provides the same HTTP API on port 1120 and can be used as a drop-in replacement for:

Downloading products from official Blizzard CDNs
Falling back to community archive mirrors (cdn.arctium.tools)
Managing local CASC installations

Differences from Official Agent

Open source implementation
Supports community CDN mirrors
Cross-platform (Linux, macOS, Windows)
No Battle.net account required for public content

CASC Local Storage

Local CASC storage is the on-disk format used by the Battle.net client to store game data. Unlike CDN archives which are content-addressed, local storage uses optimized indices for fast file lookups.

Directory Structure

A typical CASC installation has the following structure:

<install-dir>/
├── .build.info               # Build configuration (BPSV format)
├── Data/
│   ├── data/
│   │   ├── 0000000001.idx    # Local index files (16 buckets)
│   │   ├── 0100000001.idx
│   │   ├── ...
│   │   ├── 0f00000001.idx
│   │   ├── data.000          # Combined archive data
│   │   ├── data.001
│   │   ├── ...
│   │   └── *.shmem           # Shared memory control file (temp)
│   ├── indices/
│   │   └── ...               # CDN index files (not local storage)
│   ├── residency/            # Download state tracking tokens
│   ├── ecache/               # Encoding cache
│   └── hardlink/             # Hard link trie directory
└── Cache/
    └── ADB/                  # Hotfix database cache
        └── *.bin

Local .idx index files and data.NNN archive files both reside in Data/data/. The Data/indices/ directory holds CDN index files, which are a separate concern from local storage.

Container Types

CASC manages four container types for local storage:

Type	Size	Purpose
Dynamic	0x3c bytes	Read/write CASC archives (.data files)
Static	–	Read-only archives (shared installations)
Residency	0x30 bytes	File state tracking (.residency tokens)
Hard Link	0x30 bytes	Filesystem hard links (trie directory)

The Dynamic container is the primary read-write storage. It manages archive segments, key state tracking, and shared memory coordination. Access modes: 0=none, 1=read-only, 2=read-write, 3=exclusive.

Index Files (.idx)

Local indices use IDX Journal v7 format with little-endian headers (unlike most NGDP formats which use big-endian).

Key size: 9 bytes (truncated encoding keys)
Location size: 5 bytes (1 byte archive high + 4 bytes packed)
Entry size: 18 bytes (9 key + 5 location + 4 size)
Bucket distribution: 16 index buckets (0x00-0x0F)

The 9-byte key truncation saves space while maintaining sufficient uniqueness for local lookups. Keys are encoding keys, not content keys.

Index File Format

Each .idx file contains guarded blocks with Jenkins hash validation:

[GuardedBlockHeader]  (8 bytes: size + Jenkins hash)
[IndexHeaderV2]       (16 bytes: version, bucket, field sizes, segment_size)
[padding]             (8 bytes: hash/alignment)
[GuardedBlockHeader]  (8 bytes: entry block size + Jenkins hash)
[IndexEntry[]]        (N * 18 bytes: sorted by key)

Index Filename Format

{bucket:02x}{version:08x}.idx

Example: 0a00000003.idx = bucket 0x0A, version 3. Total filename length is 14 characters (10 hex digits + .idx).
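The filename pattern can be sketched as a pair of helpers. The function names `index_filename` and `parse_index_filename` are illustrative assumptions, not cascette-rs API, and the parser does not check that the bucket is below 16.

```rust
/// Formats a local index filename as `{bucket:02x}{version:08x}.idx`.
pub fn index_filename(bucket: u8, version: u32) -> String {
    format!("{:02x}{:08x}.idx", bucket, version)
}

/// Parses a local index filename back into `(bucket, version)`.
/// Returns `None` when the name is not 10 hex digits followed by `.idx`.
pub fn parse_index_filename(name: &str) -> Option<(u8, u32)> {
    let stem = name.strip_suffix(".idx")?;
    if stem.len() != 10 || !stem.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // The stem is pure ASCII here, so byte-offset slicing is safe.
    let bucket = u8::from_str_radix(&stem[..2], 16).ok()?;
    let version = u32::from_str_radix(&stem[2..], 16).ok()?;
    Some((bucket, version))
}
```

Formatting bucket 0x0A with version 3 yields 0a00000003.idx, and parsing that name returns the same pair.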

Bucket Assignment

Files are assigned to index buckets using the XOR-fold algorithm on the first 9 bytes of the encoding key:

hash = key[0] ^ key[1] ^ key[2] ^ key[3] ^ key[4] ^ key[5] ^ key[6] ^ key[7] ^ key[8]
bucket = (hash & 0x0F) ^ (hash >> 4)
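As a minimal Rust sketch of the same XOR-fold, assuming the key is passed as a byte slice (the function name `bucket_for_key` is hypothetical):

```rust
/// Computes the index bucket (0x00-0x0F) for an encoding key by XOR-folding
/// its first 9 bytes. Returns `None` if the key is shorter than 9 bytes.
pub fn bucket_for_key(ekey: &[u8]) -> Option<u8> {
    let prefix = ekey.get(..9)?;
    // XOR all nine bytes together into a single byte.
    let hash = prefix.iter().fold(0u8, |acc, b| acc ^ b);
    // Fold the high nibble onto the low nibble to get a 4-bit bucket.
    Some((hash & 0x0F) ^ (hash >> 4))
}
```

Bytes beyond the ninth do not influence the result, so a full 16-byte key and its 9-byte truncation land in the same bucket.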

Agent uses a flush-and-bind pattern with 3-retry atomic commits when writing index files.

Key Mapping Table (KMT)

Below the index files, CASC maintains a Key Mapping Table (KMT) as the primary on-disk structure for key-to-location resolution:

Two-tier LSM-tree: sorted section (0x12-byte entries) + update section (0x200-byte pages)
Jenkins lookup3 hashes for bucket distribution
9-byte EKey prefix binary search within sorted sections
Update section uses 0x200-byte (512-byte) pages with 0x15 (21) entries per page (minimum 0x7800 bytes)
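The sorted-section lookup can be illustrated with a plain binary search. This sketch assumes each 0x12-byte entry begins with its 9-byte EKey prefix and that entries are sorted ascending by that prefix; `find_entry` is a hypothetical name, and the update section is not modeled.

```rust
use std::cmp::Ordering;

const ENTRY_SIZE: usize = 0x12; // 18-byte sorted-section entries
const KEY_SIZE: usize = 9; // 9-byte EKey prefix

/// Binary-searches a sorted section for a 9-byte EKey prefix and returns the
/// matching 18-byte entry, if any. Trailing bytes that do not form a whole
/// entry are ignored.
pub fn find_entry<'a>(section: &'a [u8], ekey_prefix: &[u8; KEY_SIZE]) -> Option<&'a [u8]> {
    let count = section.len() / ENTRY_SIZE;
    let (mut lo, mut hi) = (0usize, count);
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        let entry = &section[mid * ENTRY_SIZE..(mid + 1) * ENTRY_SIZE];
        // Byte slices compare lexicographically, matching the sort order.
        match entry[..KEY_SIZE].cmp(&ekey_prefix[..]) {
            Ordering::Equal => return Some(entry),
            Ordering::Less => lo = mid + 1,
            Ordering::Greater => hi = mid,
        }
    }
    None
}
```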

Data Files (data.NNN)

Data files contain BLTE-encoded content. Each entry has a 30-byte (0x1E) local header before the BLTE data:

Offset  Size  Field
0x00    16    Encoding key (reversed byte order)
0x10    4     Size including header (big-endian)
0x14    2     Flags
0x16    4     ChecksumA
0x1A    4     ChecksumB
0x1E    ...   BLTE data
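A hedged sketch of reading this header follows. The struct and function names are assumptions for illustration. Flags and checksums are kept as raw bytes because the layout above does not state their byte order.

```rust
/// Size of the local header that precedes each BLTE payload in `data.NNN`.
pub const LOCAL_HEADER_SIZE: usize = 0x1E;

/// Fields of the 30-byte local header, following the layout listed above.
#[derive(Debug, PartialEq, Eq)]
pub struct LocalHeader {
    /// Encoding key restored to normal byte order (stored reversed on disk).
    pub encoding_key: [u8; 16],
    /// Entry size including this header, read as big-endian per the layout.
    pub size: u32,
    pub flags: [u8; 2],
    pub checksum_a: [u8; 4],
    pub checksum_b: [u8; 4],
}

/// Parses a local header from the start of `data`, or returns `None` if
/// fewer than 30 bytes are available.
pub fn parse_local_header(data: &[u8]) -> Option<LocalHeader> {
    let raw = data.get(..LOCAL_HEADER_SIZE)?;

    let mut encoding_key = [0u8; 16];
    encoding_key.copy_from_slice(&raw[0x00..0x10]);
    encoding_key.reverse();

    let mut size = [0u8; 4];
    size.copy_from_slice(&raw[0x10..0x14]);
    let mut flags = [0u8; 2];
    flags.copy_from_slice(&raw[0x14..0x16]);
    let mut checksum_a = [0u8; 4];
    checksum_a.copy_from_slice(&raw[0x16..0x1A]);
    let mut checksum_b = [0u8; 4];
    checksum_b.copy_from_slice(&raw[0x1A..0x1E]);

    Some(LocalHeader {
        encoding_key,
        size: u32::from_be_bytes(size),
        flags,
        checksum_a,
        checksum_b,
    })
}
```

The BLTE payload then starts at offset `LOCAL_HEADER_SIZE` within the entry.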

Archive Location Packing

The 5-byte archive location in index entries encodes both archive ID and offset:

Byte 0:      archive_id >> 2 (high 8 bits)
Bytes 1-4:   (archive_id_low << 30) | (offset & 0x3FFFFFFF) (big-endian)

This gives 10-bit archive IDs (max 1023) and 30-bit offsets (max ~1 GiB).
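The packing can be expressed as a round-trip pair. This is a sketch under the layout described above; `pack_location` and `unpack_location` are hypothetical names.

```rust
/// Maximum values implied by the 10-bit archive ID and 30-bit offset.
pub const MAX_ARCHIVE_ID: u16 = 0x3FF;
pub const MAX_OFFSET: u32 = 0x3FFF_FFFF;

/// Packs an archive ID and offset into the 5-byte location field.
/// Returns `None` if either value exceeds its bit width.
pub fn pack_location(archive_id: u16, offset: u32) -> Option<[u8; 5]> {
    if archive_id > MAX_ARCHIVE_ID || offset > MAX_OFFSET {
        return None;
    }
    let high = (archive_id >> 2) as u8; // upper 8 bits of the archive ID
    let low = u32::from(archive_id & 0x3); // lower 2 bits of the archive ID
    let packed = ((low << 30) | offset).to_be_bytes();
    Some([high, packed[0], packed[1], packed[2], packed[3]])
}

/// Unpacks a 5-byte location field into `(archive_id, offset)`.
pub fn unpack_location(location: [u8; 5]) -> (u16, u32) {
    let packed = u32::from_be_bytes([location[1], location[2], location[3], location[4]]);
    let archive_id = (u16::from(location[0]) << 2) | (packed >> 30) as u16;
    (archive_id, packed & MAX_OFFSET)
}
```

For example, archive 5 at offset 0x1234 packs to the bytes 01 40 00 12 34: the high byte holds 5 >> 2 = 1, and the low two bits (01) occupy the top of the 32-bit word.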

Container Index

Agent maintains a ContainerIndex with 16 segments and supports frozen/thawed archive management:

Segments can be frozen (read-only) or thawed (writable)
0x1E-byte reconstruction headers per archive entry
Segment limit configurable up to 0x3FF (1023)
Per-segment tracking: 0x40 (64) bytes per segment in compactor state

Shared Memory (shmem)

The shmem file provides memory-mapped coordination between the Agent process and game clients:

Protocol versions 4 (base) and 5 (exclusive access flag at DWORD index 0x54)
Free space table format identifier at DWORD index 0x42 (value 0x2AB8)
V5 PID tracking: slot array with PID (u32) and mode (u32) per slot
Writer lock: named global mutex with Global\ prefix
DACL: D:(A;;GA;;;WD)(A;;GA;;;AN) (grant all to Everyone + Anonymous)
Retry logic: 10 attempts with Sleep(0) between failures
.lock file with 10-second backoff for coordination

LRU Cache

Agent maintains an LRU cache in shared memory:

Linked-list table structure
Generation-based checkpoints for eviction
20-character hex filenames with .lru extension

.build.info

The .build.info file contains installation metadata in BPSV format:

Product code and region
Active build configuration hash
CDN configuration hash
Installation tags and flags

Residency Tracking

The Residency container tracks which content keys are fully downloaded:

.residency token files mark valid containers
Byte-span tracking for partial downloads (header and data residency)
Reserve, mark-resident, remove, query operations
Scanner API for enumeration
Drive type check prevents unsupported storage media

Hard Link Storage

The Hard Link container uses a TrieDirectory for content sharing:

Hard links allow multiple keys to reference the same physical file
32-character hex filename validation
Unlinked key collection (link count <= 1)
Recursive compaction
LRU file descriptor cache with two open modes (handle vs async IO)
3-retry delete before hard link creation
Falls back to residency when hard links are unsupported

Maintenance Operations

Compaction

Two-phase process: archive merge then extract-compact.

Defrag algorithm: removes gaps between files, reorganizes positions
Fillholes algorithm: estimates free space without moving data
Merge threshold: float in [0.0, 0.4]
Async read/write pipeline with 128 KB minimum buffer
Per-segment span validation with overlap detection

Garbage Collection

4-stage pipeline:

Remove unreferenced keys from dynamic container
Remove obsolete config files
Remove CDN index files
Clean up empty directories recursively

Build Repair

Multi-stage pipeline using marker files for crash recovery:

RepairMarker.psv (pipe-separated, writable keys)
CASCRepair.mrk (V2 marker format)
Stages: read config, init CDN index, repair containers (data/ecache/hardlink sequentially), data repair, post-repair cleanup

Differences from CDN Storage

Aspect	CDN	Local
Key size	16 bytes	9 bytes (truncated)
Key type	Content keys	Encoding keys
Organization	Per-archive indices	16-bucket index files
Entry header	None	30-byte local header
Index format	CDN index footer	IDX Journal v7 with guarded blocks
Mutability	Immutable	Updated during patches
Containers	Single type	4 types (dynamic/static/residency/hardlink)

References

Archives
Archive Groups
BLTE Container
Agent Comparison

CDN Content Caching

The cascette-cache crate provides multi-layer caching for NGDP/CDN content. It optimizes network bandwidth and latency by caching frequently accessed data at multiple levels.

Architecture

graph TD
    A[Application] --> B[Multi-Layer Cache]
    B --> C[L1: Memory Cache]
    B --> D[L2: Disk Cache]
    D --> E[CDN]
    C --> E

    subgraph "Cache Layers"
        C
        D
    end

L1: Memory Cache

Fast in-memory cache with LRU eviction:

Immediate access for hot data
Size-based eviction when memory limit reached
TTL-based expiration for stale data
Zero-copy data sharing with bytes::Bytes

L2: Disk Cache

Persistent disk cache for larger datasets:

Survives application restarts
Atomic writes with fsync for durability
Configurable storage limits
Asynchronous I/O with tokio

NGDP-Specific Caches

The crate provides specialized caches for NGDP content types:

Resolution Cache

Caches the NGDP resolution chain:

Root File → Content Key
Content Key → Encoding Key
Encoding Key → CDN Location

Content-Addressed Cache

Stores content by its MD5 hash (ContentKey):

Automatic validation on retrieval
Deduplication across builds
Supports partial content access

BLTE Block Cache

Caches individual BLTE blocks for large files:

Enables partial file access without full download
Block-level validation
Decompressed and raw block storage

Archive Range Cache

Caches byte ranges from CDN archives:

Coalesces nearby requests into larger ranges
Reduces CDN round-trips
Supports range request optimization
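Request coalescing can be illustrated with a small merge routine. This is a sketch, not the cascette-cache implementation: the function name and the `max_gap` parameter are assumptions, and ranges are treated as half-open `(start, end)` byte offsets.

```rust
/// Merges byte ranges whose gap is at most `max_gap` bytes, so several nearby
/// reads can be served by one larger request.
pub fn coalesce_ranges(mut ranges: Vec<(u64, u64)>, max_gap: u64) -> Vec<(u64, u64)> {
    ranges.retain(|&(start, end)| start < end); // drop empty or inverted ranges
    ranges.sort_unstable();

    let mut merged: Vec<(u64, u64)> = Vec::with_capacity(ranges.len());
    for (start, end) in ranges {
        if let Some(prev) = merged.last_mut() {
            // Close enough to the previous range: extend it instead of
            // starting a new request.
            if start <= prev.1.saturating_add(max_gap) {
                prev.1 = prev.1.max(end);
                continue;
            }
        }
        merged.push((start, end));
    }
    merged
}
```

A larger `max_gap` trades some wasted bytes for fewer round-trips; a value of 0 merges only overlapping or adjacent ranges.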

Memory Pooling

NGDP files have predictable size distributions. The memory pool uses size classes optimized for these patterns:

Size Class	Range	Typical Content
Small	< 16 KB	Config files, small assets
Medium	< 256 KB	Most game files
Large	< 8 MB	Textures, models
Huge	≥ 8 MB	Large archives, cinematics

Benefits:

Reduced allocation overhead
Better memory locality
Thread-local pools for zero-contention
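Size-class selection can be sketched as a simple threshold check. The enum and function names are illustrative assumptions. Each bound is treated as exclusive, so an allocation of 8 MB or more falls into the largest class.

```rust
/// Size classes matching the ranges listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SizeClass {
    Small,
    Medium,
    Large,
    Huge,
}

const KB: usize = 1024;
const MB: usize = 1024 * KB;

/// Picks the size class for an allocation of `len` bytes.
pub fn size_class(len: usize) -> SizeClass {
    if len < 16 * KB {
        SizeClass::Small
    } else if len < 256 * KB {
        SizeClass::Medium
    } else if len < 8 * MB {
        SizeClass::Large
    } else {
        SizeClass::Huge
    }
}
```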

Content Validation

All cached content is validated on retrieval:

MD5 Validation

Content keys are MD5 hashes of the data:

let content_key = ContentKey::from_data(&data);
// Cache validates: MD5(data) == content_key

Jenkins96 Validation

Archive indices use Jenkins96 for fast hashing:

let hash = Jenkins96::hash(path.as_bytes());
// Validates archive index lookups

TACT Key Validation

Encrypted content requires TACT key verification before decryption.

SIMD Optimizations

Hash operations use SIMD acceleration when available:

Instruction Set	Vector Width	Speedup
SSE2	128-bit	2x
SSE4.1	128-bit	2x
AVX2	256-bit	4x
AVX-512	512-bit	8x

Runtime CPU detection selects the best available implementation.

Configuration

Memory Cache

MemoryCacheConfig {
    max_size: 256 * 1024 * 1024,  // 256 MB limit
    ttl: Duration::from_secs(3600), // 1 hour TTL
    eviction_batch_size: 100,      // Evict 100 items at a time
}

Disk Cache

DiskCacheConfig {
    cache_dir: PathBuf::from("/var/cache/cascette"),
    max_size: 10 * 1024 * 1024 * 1024, // 10 GB limit
    sync_writes: true,                  // fsync after writes
}

Multi-Layer

MultiLayerConfig {
    l1: MemoryCacheConfig::default(),
    l2: DiskCacheConfig::default(),
    write_through: true,  // Write to both layers
    promote_on_hit: true, // Copy L2 hits to L1
}

CDN Integration

The cache integrates with CDN clients for miss handling:

sequenceDiagram
    participant App
    participant L1 as Memory Cache
    participant L2 as Disk Cache
    participant CDN

    App->>L1: get(key)
    alt L1 Hit
        L1-->>App: data
    else L1 Miss
        L1->>L2: get(key)
        alt L2 Hit
            L2-->>L1: data
            L1-->>App: data
        else L2 Miss
            L2->>CDN: fetch(key)
            CDN-->>L2: data
            L2-->>L1: data
            L1-->>App: data
        end
    end
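The lookup order in the diagram can be sketched with two in-memory maps standing in for the layers. This is a toy model, not the cascette-cache API: `TwoLayerCache`, `Source`, and the `fetch` callback (which plays the role of the CDN) are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Where a lookup was satisfied.
#[derive(Debug, PartialEq, Eq)]
pub enum Source {
    L1,
    L2,
    Origin,
}

/// Toy two-layer cache: `HashMap`s stand in for the memory and disk layers.
pub struct TwoLayerCache {
    l1: HashMap<String, Vec<u8>>,
    l2: HashMap<String, Vec<u8>>,
}

impl TwoLayerCache {
    pub fn new() -> Self {
        Self { l1: HashMap::new(), l2: HashMap::new() }
    }

    /// Drops an entry from L1 only, as an eviction would.
    pub fn evict_l1(&mut self, key: &str) {
        self.l1.remove(key);
    }

    /// Looks up `key` in L1, then L2, then falls back to `fetch`.
    /// L2 hits are promoted to L1; origin fetches fill both layers.
    pub fn get<F>(&mut self, key: &str, fetch: F) -> Option<(Vec<u8>, Source)>
    where
        F: FnOnce(&str) -> Option<Vec<u8>>,
    {
        if let Some(data) = self.l1.get(key) {
            return Some((data.clone(), Source::L1));
        }
        if let Some(data) = self.l2.get(key).cloned() {
            self.l1.insert(key.to_owned(), data.clone()); // promote on hit
            return Some((data, Source::L2));
        }
        let data = fetch(key)?;
        self.l2.insert(key.to_owned(), data.clone());
        self.l1.insert(key.to_owned(), data.clone());
        Some((data, Source::Origin))
    }
}
```

A real implementation would share buffers instead of cloning them and would apply the eviction and TTL policies described earlier.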

Features:

Automatic CDN fallback on cache miss
Retry logic with exponential backoff
Multiple CDN endpoint failover
Range request support for partial content

Streaming

Large files are processed in chunks to avoid memory exhaustion:

StreamingConfig {
    chunk_size: 64 * 1024,      // 64 KB chunks
    max_buffered_chunks: 16,    // 1 MB max buffer
    validate_chunks: true,      // Validate each chunk
}

Streaming enables:

Processing files larger than available memory
Progressive validation during download
Early error detection
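Chunked processing can be sketched with the standard `Read` trait. The function name and the per-chunk callback are assumptions for illustration. The callback is where a validator or hasher would plug in, and returning an error from it stops the stream early.

```rust
use std::io::{self, Read};

/// Reads `reader` in chunks of at most `chunk_size` bytes and passes each
/// chunk to `on_chunk`. Only one chunk is held in memory at a time.
/// Returns the total number of bytes processed.
pub fn process_in_chunks<R, F>(mut reader: R, chunk_size: usize, mut on_chunk: F) -> io::Result<u64>
where
    R: Read,
    F: FnMut(&[u8]) -> io::Result<()>,
{
    if chunk_size == 0 {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "chunk_size must be non-zero"));
    }
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;
    loop {
        // Fill the buffer as far as possible so every chunk except the last
        // has exactly `chunk_size` bytes.
        let mut filled = 0;
        while filled < buf.len() {
            match reader.read(&mut buf[filled..]) {
                Ok(0) => break, // end of input
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        if filled == 0 {
            return Ok(total);
        }
        on_chunk(&buf[..filled])?;
        total += filled as u64;
        if filled < buf.len() {
            return Ok(total); // short chunk means the input is exhausted
        }
    }
}
```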

Metrics

The cache tracks performance metrics:

Hit rate (L1, L2, overall)
Miss rate and CDN fallback frequency
Eviction counts and reasons
Memory and disk usage
Validation success/failure rates

References

CDN Architecture
BLTE Container
Archives
Local Storage

CDN Mirroring and Archival Strategy

Overview

This document outlines strategies for mirroring Blizzard’s CDN content for WoW using NGDP/CASC.

Note: Python code examples in this document are conceptual pseudocode illustrating mirroring workflows. For working code, see the cascette mirror CLI command or reference implementations in References.

Rationale for Mirroring

Blizzard removes older builds from CDN within days to weeks of new patches (see Archival Urgency below). Mirroring preserves builds that would otherwise be lost, enabling:

Preservation: Maintain access to historical builds after CDN removal
Development: Test CASC implementations against known data offline
Performance: Local access avoids CDN latency and bandwidth limits

Target Products

Focus on World of Warcraft products:

Product Code	Description	Update Frequency
wow	Retail/Live	Weekly patches
wowt	Public Test Realm	Frequent updates
wow_beta	Beta servers	Daily during beta
wow_classic	Classic (Wrath/Cata)	Bi-weekly
wow_classic_era	Classic Era (Vanilla)	Rare updates
wow_classic_ptr	Classic PTR	During test cycles
wow_classic_titan	Classic Titan (CN only, WotLK 3.80.x)	Unknown
wow_anniversary	Classic Anniversary (TBC 2.5.x)	Unknown

Archival Urgency

Based on testing CDN retention windows:

Product	Retention Window	Archival Priority
wow (Retail)	14-15 days	High - Daily checks
wow_classic	2-4 weeks	Medium - Weekly checks
wow_classic_era	~3 months	Low - Monthly checks
wow_beta	7-10 days	Critical - Continuous
wowt (PTR)	10-14 days	High - Every 2-3 days

Critical Finding: Retail builds disappear about two weeks (14-15 days) after a new patch.

Build Discovery

Track new builds via Ribbit protocol:

Sequence Number Monitoring

# Query summary endpoint
echo -e "v1/summary\r\n" | nc us.version.battle.net 1119

# Response includes sequence numbers
## seqn = 2241282

Monitor sequence number changes:

async def check_for_updates():
    summary = await ribbit_client.get_summary()

    for product in summary.products:
        # Assumed to return None for a product that has not been seen before
        stored_seqn = database.get_sequence(product.name)

        if stored_seqn is None or product.seqn > stored_seqn:
            # New build detected!
            await process_new_build(product)
            database.update_sequence(product.name, product.seqn)
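For readers working in Rust, the sequence-number check might look like the following sketch. It assumes the response carries a `## seqn = N` line as in the example above. Both function names are hypothetical, and no network access is involved.

```rust
/// Extracts the sequence number from a Ribbit response by scanning for a
/// `## seqn = N` line.
pub fn parse_seqn(response: &str) -> Option<u64> {
    response.lines().find_map(|line| {
        let rest = line.trim().strip_prefix("## seqn =")?;
        rest.trim().parse().ok()
    })
}

/// Returns `true` when `current` is newer than the stored value, or when
/// nothing has been stored yet for the product.
pub fn is_new_build(stored: Option<u64>, current: u64) -> bool {
    stored.map_or(true, |s| current > s)
}
```

Treating a missing stored value as "new" means the first run archives whatever is currently live.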

Version Information

# Get specific product versions
echo -e "v1/products/wow/versions\r\n" | nc us.version.battle.net 1119

CDN Path Discovery

Critical: Always Extract CDN Paths

# Get CDN information - NEVER hardcode paths!
echo -e "v1/products/wow/cdns\r\n" | nc us.version.battle.net 1119

Example response:

Region!STRING:0|Hosts!STRING:0|Path!STRING:0|ConfigPath!STRING:0
us|level3.blizzard.com edgecast.blizzard.com|tpr/wow|tpr/configs/data
eu|level3.blizzard.com edgecast.blizzard.com|tpr/wow|tpr/configs/data
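A response in this shape can be parsed by looking columns up by header name rather than by position. The sketch below assumes the four column names shown in the example. `CdnEntry` and `parse_cdns` are illustrative names, not cascette-rs API.

```rust
/// One row of a `cdns` response, using the columns shown in the example.
#[derive(Debug, PartialEq, Eq)]
pub struct CdnEntry {
    pub region: String,
    pub hosts: Vec<String>,
    pub path: String,
    pub config_path: String,
}

/// Parses a pipe-separated `cdns` response. Lines starting with `#` and rows
/// with missing fields are skipped. Returns an empty list if a required
/// column is absent from the header.
pub fn parse_cdns(response: &str) -> Vec<CdnEntry> {
    let mut lines = response
        .lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !l.starts_with('#'));

    // Header cells look like `Name!TYPE:len`; keep only the name.
    let header: Vec<&str> = match lines.next() {
        Some(h) => h.split('|').map(|c| c.split('!').next().unwrap_or(c)).collect(),
        None => return Vec::new(),
    };
    let column = |name: &str| header.iter().position(|c| *c == name);
    let (region, hosts, path, config) =
        match (column("Region"), column("Hosts"), column("Path"), column("ConfigPath")) {
            (Some(r), Some(h), Some(p), Some(c)) => (r, h, p, c),
            _ => return Vec::new(),
        };

    lines
        .filter_map(|line| {
            let fields: Vec<&str> = line.split('|').collect();
            Some(CdnEntry {
                region: fields.get(region)?.to_string(),
                hosts: fields.get(hosts)?.split_whitespace().map(String::from).collect(),
                path: fields.get(path)?.to_string(),
                config_path: fields.get(config)?.to_string(),
            })
        })
        .collect()
}
```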

CRITICAL: The Path field (tpr/wow) must be used for URL construction:

# CORRECT - Uses path from CDN response
cdn_url = f"http://{host}/{path}/data/{hash[:2]}/{hash[2:4]}/{hash}"

# WRONG - Hardcoded path
cdn_url = f"http://{host}/tpr/wow/data/{hash[:2]}/{hash[2:4]}/{hash}"
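The same construction in Rust, as a sketch: the function name is hypothetical, and the 32-hex-character check assumes MD5-sized hashes.

```rust
/// Builds a CDN data URL from a host and the `Path` value returned by the
/// `cdns` endpoint. Returns `None` unless `hash` is 32 hexadecimal characters.
pub fn cdn_data_url(host: &str, path: &str, hash: &str) -> Option<String> {
    if hash.len() != 32 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(format!(
        "http://{}/{}/data/{}/{}/{}",
        host,
        path.trim_matches('/'), // tolerate stray slashes around the path
        &hash[..2],
        &hash[2..4],
        hash
    ))
}
```

The `path` argument should always come from the `cdns` response rather than from a constant in the caller.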

All WoW products use tpr/wow regardless of product code:

wow, wow_classic, wow_classic_era, wow_classic_titan, wow_anniversary all use tpr/wow
Never assume paths based on product names

Essential Files

Priority order for archival:

1. Configuration Files (Critical)

BuildConfig: Build-specific settings
CDNConfig: CDN and archive information
ProductConfig: Product metadata

2. System Files (Required)

Encoding: Content key mappings (~500MB-2GB)
Root: File manifest
Install: Installation manifest
Download: Download priority

3. Indices (Important)

Archive indices (.index files)
Patch indices for updates

4. Data Archives (Bulk)

Archive files (data.###)
Largest storage requirement
Can be fetched on-demand

Mirroring Architecture

Storage Structure

/mirror
├── configs/
│   └── data/
│       ├── {hash[0:2]}/
│       │   └── {hash[2:4]}/
│       │       └── {hash}
├── data/
│   ├── {hash[0:2]}/
│   │   └── {hash[2:4]}/
│   │       └── {hash}
├── indices/
│   └── *.index
└── metadata.db

Database Schema

CREATE TABLE builds (
    id SERIAL PRIMARY KEY,
    product VARCHAR(50),
    build_config VARCHAR(32),
    cdn_config VARCHAR(32),
    build_name VARCHAR(100),
    detected_at TIMESTAMP,
    archived BOOLEAN DEFAULT FALSE
);

CREATE TABLE files (
    hash VARCHAR(32) PRIMARY KEY,
    size BIGINT,
    type VARCHAR(20),
    downloaded_at TIMESTAMP
);

Download Strategy

Priority-Based Downloading

class MirrorStrategy:
    def __init__(self, full_mirror=False):
        self.full_mirror = full_mirror  # Also download bulk data archives
        self.priorities = {
            'configs': 1,      # Highest priority
            'encoding': 2,
            'root': 3,
            'install': 4,
            'indices': 5,
            'data': 10        # Lowest priority
        }

    async def mirror_build(self, build_info):
        # 1. Download configs first
        await self.download_configs(build_info)

        # 2. Get encoding file
        encoding = await self.download_encoding(build_info)

        # 3. Download indices
        indices = await self.download_indices(build_info)

        # 4. Optional: Download data archives
        if self.full_mirror:
            await self.download_archives(indices)

Bandwidth Management

Concurrent downloads: 4-8 connections
Rate limiting: Respect CDN limits
Retry logic: Handle transient failures
Resume support: Continue interrupted downloads

Incremental Updates

Track changes efficiently:

async def incremental_update(product):
    current_build = await get_current_build(product)
    stored_build = database.get_latest_build(product)

    if current_build != stored_build:
        # Download only new/changed files
        new_files = await diff_builds(current_build, stored_build)
        await download_files(new_files)

        database.update_build(product, current_build)

Verification

Ensure data integrity:

Hash Verification

def verify_file(filepath, expected_hash):
    actual_hash = calculate_md5(filepath)
    if actual_hash != expected_hash:
        raise IntegrityError(f"Hash mismatch: {filepath}")

Archive Integrity

Verify BLTE headers
Check chunk checksums
Validate encoding entries

Storage Optimization

Deduplication

Content-addressed storage automatically deduplicates:

def store_file(content, hash):
    path = get_path_from_hash(hash)
    if not os.path.exists(path):
        # Only store if not already present
        write_file(path, content)
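A Rust sketch of the same idea, using the `data/{hash[0:2]}/{hash[2:4]}/{hash}` layout from the storage structure above. The function names are assumptions, and the existence check followed by a write is not atomic, so a production mirror would want a write-to-temp-then-rename step.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Maps a hex hash to `<root>/data/<hash[0:2]>/<hash[2:4]>/<hash>`.
/// Returns `None` for hashes shorter than 4 characters or with non-hex characters.
pub fn path_for_hash(root: &Path, hash: &str) -> Option<PathBuf> {
    if hash.len() < 4 || !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(root.join("data").join(&hash[..2]).join(&hash[2..4]).join(hash))
}

/// Writes `content` only if no file for `hash` exists yet.
/// Returns `true` when a new file was written, `false` when it was a duplicate.
pub fn store_file(root: &Path, hash: &str, content: &[u8]) -> io::Result<bool> {
    let path = path_for_hash(root, hash)
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "invalid hash"))?;
    if path.exists() {
        return Ok(false); // already stored: content addressing deduplicates
    }
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(&path, content)?;
    Ok(true)
}
```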

Compression

Keep BLTE files compressed
Use filesystem compression for configs
Consider archive formats for old builds

Historical Build Recovery

Using External Sources

Community archives: Shared build collections, private archives
Wayback Machine: Historical Ribbit responses, cached configuration files
Torrent archives: Community-shared build collections, distributed preservation efforts

Reconstruction

Rebuild missing content:

flowchart TD
    A[Partial Build] --> B[Identify Missing]
    B --> C[Search Mirrors]

    C --> D{Found?}
    D -->|Yes| E[Download Missing]
    D -->|No| F[Check Archives]

    F --> G{In Archive?}
    G -->|Yes| H[Extract Content]
    G -->|No| I[Search Community]

    E --> J[Verify Hashes]
    H --> J

    I --> K{Available?}
    K -->|Yes| L[Request Copy]
    K -->|No| M[Document Gap]

    L --> J
    J --> N[Update Archive]
    M --> O[Gap Report]

    style A stroke-width:4px
    style N stroke-width:4px
    style O stroke-width:3px,stroke-dasharray:5 5
    style D stroke-width:3px,stroke-dasharray:5 5
    style G stroke-width:3px,stroke-dasharray:5 5
    style K stroke-width:3px,stroke-dasharray:5 5
    style J stroke-width:2px
    style B stroke-width:2px

Legal Considerations

Fair Use

Archival under fair use principles:

Research: Academic study of game development
Education: Teaching game architecture
Preservation: Cultural heritage of gaming
Non-commercial: No monetization of archives

Best Practices

Respect intellectual property
Don’t distribute copyrighted content
Use for personal/research purposes
Cooperate with takedown requests

Reference Implementations

For detailed analysis of NGDP/CASC reference implementations, see references.md.

Key implementations examined:

CascLib: Complete C++ library with 10+ years of development
TACT.Net: C# architecture with modular design
rustycasc: Rust implementation with type safety
BlizzTrack: Production monitoring with database persistence
blizztools: Rust CLI for NGDP operations
blizzget: C++ downloader with custom version support
tactmon: Advanced C++ monitoring with template ORM
TACTSharp: .NET extraction library with memory-mapped files

These implementations informed cascette-rs design for CDN interaction and content resolution.

Implementation Examples

Build Tracker

class BuildTracker:
    def __init__(self, products):
        self.products = products
        self.check_interval = 300  # 5 minutes

    async def run(self):
        while True:
            for product in self.products:
                await self.check_product(product)
            await asyncio.sleep(self.check_interval)

    async def check_product(self, product):
        try:
            versions = await ribbit.get_versions(product)
            cdns = await ribbit.get_cdns(product)

            for region in versions.regions:
                build_config = region.build_config
                if not self.is_archived(build_config):
                    await self.archive_build(product, region, cdns)
        except Exception as e:
            logger.error(f"Failed to check {product}: {e}")

Archive Manager

from pathlib import Path

class ArchiveManager:
    def __init__(self, storage_path):
        self.storage = Path(storage_path)  # Path enables the "/" joins below
        self.cdn_client = CDNClient()

    async def archive_build(self, build_info):
        # Create build directory
        build_dir = self.storage / build_info.product / build_info.build_config
        build_dir.mkdir(parents=True, exist_ok=True)

        # Download in priority order
        await self.download_configs(build_info)
        await self.download_encoding(build_info)
        await self.download_root(build_info)

        # Mark as archived
        self.mark_archived(build_info)

Monitoring and Alerts

Health Checks

class MirrorHealth:
    async def check_health(self):
        return {
            'disk_space': self.check_disk_space(),
            'cdn_connectivity': await self.check_cdn(),
            'database': self.check_database(),
            'last_check': datetime.now()
        }

    def check_disk_space(self):
        usage = shutil.disk_usage(self.storage_path)
        return {
            'used': usage.used,
            'free': usage.free,
            'percent': (usage.used / usage.total) * 100
        }

Disaster Recovery

Backup Strategy

Primary Mirror: Fast SSD storage
Secondary Backup: HDD archive
Cloud Backup: Critical configs only
Community Sharing: Torrent distribution

Recovery Procedures

# Restore from backup
rsync -av /backup/mirror/ /primary/mirror/

# Verify integrity
find /mirror -type f -name "*.index" | xargs -I {} md5sum {}

# Rebuild database
python rebuild_metadata.py /mirror

Community Coordination

Shared Resources

Mirror status: Track who has what builds
Gap identification: Find missing builds
Bandwidth sharing: Distribute download load
Verification: Cross-check integrity

Future Considerations

Automated build discovery with predictive downloading before CDN removal
Differential compression between builds to reduce storage
Geographic replication for redundancy

Tools and Resources

Existing Tools

CASCExplorer: Browse CASC archives
WoW.tools: Online CASC viewer
TACTSharp: .NET extraction library
CascLib: C++ CASC library

Monitoring Services

BlizzTrack: Real-time build tracking
Wago.tools: API for build information

Community

Discord servers: Coordinate archival efforts
GitHub repos: Share tools and scripts
Forums: Technical discussions

The 14-15 day retention window for retail WoW makes automated monitoring and archival essential.

Reference Implementations

This document lists NGDP/CASC implementations useful for understanding the system. These projects have informed cascette-rs development and serve as references for format details and edge cases.

C++ Implementations

ladislav-zezula/CascLib

The original C++ CASC library by the author of StormLib (MPQ library).

Repository: https://github.com/ladislav-zezula/CascLib
Use for: Binary format details, algorithm verification, edge cases
Features: Complete CASC support, local and online archives, multiple games

heksesang/CascLib

C++17 header-only library from the WoW 6.0 era.

Repository: https://github.com/heksesang/CascLib
Use for: Simplified CASC reading, header-only integration
Note: Early implementation, lacks modern features (LZMA, LZ4, Zstd, encryption)

C# Implementations

Marlamin/CascLib

C# fork with WoW-specific enhancements, used by wow.tools.

Repository: https://github.com/Marlamin/CascLib
Use for: Encryption keys, root handlers, CDN index parsing, BLTE decoding
Features: Game-specific root handlers for 20+ Blizzard titles

wowdev/TACTSharp

Memory-mapped C# implementation focused on performance.

Repository: https://github.com/wowdev/TACTSharp
Use for: Performance patterns, zero-copy techniques, CDN optimization
Features: Efficient handling of large encoding files

wowdev/TACT.Net

C# library for TACT extraction operations.

Repository: https://github.com/wowdev/TACT.Net
Use for: Extraction patterns, multiple input/output formats
Features: EKey, CKey, FileDataID, and filename-based extraction

WowDevTools/CASCHost

Server-side CASC hosting for modding.

Repository: https://github.com/WowDevTools/CASCHost
Use for: CASC building, CDN structure generation, content serving
Note: Server-focused (produces content), opposite of cascette-rs (consumes content)

danielsreichenbach/BuildBackup

C# CDN backup tool (maintained fork of TACTAdder).

Repository: https://github.com/danielsreichenbach/BuildBackup
Use for: Mirror command reference, CDN failover, parallel downloads
Features: Archive size caching, resume support, multi-product mirroring

Rust Implementations

ferronn-dev/rustycasc

Rust CASC types and FrameXML extractor.

Repository: https://github.com/ferronn-dev/rustycasc
Use for: Rust type definitions, archive index parsing
Note: Hardcodes 4-byte offsets (doesn’t handle archive-groups)

ohchase/blizztools

Rust CLI for NGDP/TACT operations.

Repository: https://github.com/ohchase/blizztools
Use for: Ribbit protocol, install manifest parsing, async download patterns
Features: Version queries, manifest parsing, file downloads

Other Tools

Warpten/tactmon

C++ CDN tracker with Ribbit monitoring.

Repository: https://github.com/Warpten/tactmon
Use for: Ribbit protocol implementation, CDN monitoring, product tracking
Features: Template-based ORM, database persistence, production monitoring

funjoker/blizzget

Windows GUI CDN downloader.

Repository: https://github.com/funjoker/blizzget
Use for: Download workflow, custom version configs, tag selection
Note: GUI-focused, Windows-only

Kruithne/wow.export

Node.js/TypeScript export toolkit.

Repository: https://github.com/Kruithne/wow.export
Use for: File extraction patterns, M2/WMO handling, BLP conversion
Features: Visual export interface, multiple format support

Marlamin/wow.tools.local

Local wow.tools implementation.

Repository: https://github.com/Marlamin/wow.tools.local
Use for: File history tracking, DB2 diffing, hotfix management
Features: Web-based content browser, model viewer, database browser

Community Resources

wowdev.wiki

Community wiki documenting WoW file formats and systems.

URL: https://wowdev.wiki
Key pages: NGDP, CASC, TACT

wago.tools

Build database with 1,900+ WoW builds.

URL: https://wago.tools/builds
Use for: Build history, version information, product tracking

Community CDN Mirrors

Community-operated mirrors preserving historical WoW builds. These provide access to game data after Blizzard removes it from official CDNs.

cdn.arctium.tools

URL: https://cdn.arctium.tools
Coverage: WoW 6.x onwards (2014+)
Products: World of Warcraft (all variants)

casc.wago.tools

URL: https://casc.wago.tools
Coverage: Recent WoW builds
Products: World of Warcraft

archive.wow.tools

URL: https://archive.wow.tools
Coverage: Various WoW builds
Products: World of Warcraft, historical data

cascette-rs supports automatic fallback between these mirrors when official Blizzard CDNs are unavailable.

Project Setup

This page covers the requirements and setup for developing cascette-rs.

Requirements

Rust Toolchain

Minimum Supported Rust Version (MSRV): 1.92.0
Edition: Rust 2024

Install the required toolchain:

rustup install 1.92.0
rustup default 1.92.0

Required components:

rustup component add rustfmt clippy

For WASM development:

rustup target add wasm32-unknown-unknown

Development Tools

Tool	Purpose	Installation
`cargo-deny`	Dependency auditing	`cargo install cargo-deny`
`cargo-nextest`	Test runner	`cargo install cargo-nextest`
`cargo-llvm-cov`	Code coverage	`cargo install cargo-llvm-cov`
`mdbook`	Documentation	`cargo install mdbook` or via `mise install`

Optional Tools

Tool	Purpose	Installation
`ripgrep`	Code search	`cargo install ripgrep` or system package
`hyperfine`	Benchmarking	`cargo install hyperfine`
`cargo-watch`	Auto-rebuild	`cargo install cargo-watch`

Repository Structure

cascette-rs/
├── crates/                    # Workspace members
│   ├── cascette-crypto/       # Cryptographic primitives
│   ├── cascette-formats/      # Binary format parsers
│   └── ...
├── docs/                      # mdBook documentation
│   ├── src/                   # Documentation source
│   └── book.toml              # mdBook configuration
├── deny.toml                  # cargo-deny configuration
├── Cargo.toml                 # Workspace manifest
└── AGENTS.md                  # AI assistant guidance

First-Time Setup

Clone the repository:

git clone https://github.com/wowemulation-dev/cascette-rs.git
cd cascette-rs

Verify the toolchain:

rustc --version  # Should be 1.92.0 or later
cargo --version

Build the workspace:

cargo build --workspace

Run tests:

cargo nextest run --workspace

Verify lints pass:

cargo fmt --all -- --check
cargo clippy --workspace --all-targets

IDE Configuration

VS Code

Recommended extensions:

rust-analyzer - Rust language support
Even Better TOML - TOML file support
crates - Dependency version management

Settings (.vscode/settings.json):

{
  "rust-analyzer.check.command": "clippy",
  "rust-analyzer.check.allTargets": true,
  "editor.formatOnSave": true,
  "[rust]": {
    "editor.defaultFormatter": "rust-lang.rust-analyzer"
  }
}

JetBrains (RustRover/IntelliJ)

Install the Rust plugin
Enable “Run rustfmt on save”
Configure clippy as the external linter

Quality Gate

All changes must pass the CI workflow before merging. Run these checks locally:

# Full CI check (run before committing)
cargo fmt --all -- --check && \
cargo clippy --workspace --all-targets && \
cargo nextest run --profile ci --workspace && \
cargo doc --workspace --no-deps

Individual checks:

Command	Purpose
`cargo fmt --all -- --check`	Format verification
`cargo clippy --workspace --all-targets`	Lint checks
`cargo nextest run --profile ci --workspace`	Unit and integration tests
`cargo doc --workspace --no-deps`	Documentation build
`cargo deny check`	Dependency audit

WASM Compatibility

Core libraries must compile to WASM:

cargo check --target wasm32-unknown-unknown -p cascette-crypto
cargo check --target wasm32-unknown-unknown -p cascette-formats

Documentation

Build and serve the documentation locally:

# Build HTML documentation
mdbook build docs

# Serve locally with auto-reload
mdbook serve docs --open

The documentation will be available at http://localhost:3000.

Workspace Configuration

The workspace uses strict linting. Key settings from Cargo.toml:

[workspace.lints.clippy]
# Lint groups
all = { level = "warn", priority = -1 }
pedantic = { level = "warn", priority = -1 }
nursery = { level = "warn", priority = -1 }
cargo = { level = "warn", priority = -1 }

# Safety lints (higher priority)
unwrap_used = { level = "warn", priority = 2 }
panic = { level = "warn", priority = 2 }
todo = { level = "warn", priority = 2 }
unimplemented = { level = "warn", priority = 2 }
expect_used = { level = "warn", priority = 2 }

Library code should avoid unwrap(), expect(), and panic!(). Use Result types and proper error handling instead.

Testing Guidelines

This page covers testing conventions and practices for cascette-rs.

Test Organization

Module Structure

Tests live in the same file as the code they test, using a #[cfg(test)] module:

pub fn parse_header(data: &[u8]) -> Result<Header, ParseError> {
    // Implementation
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_header_with_valid_data_returns_header() {
        // Test implementation
    }
}

Nested Modules for Large Files

For files with many tests, use nested modules to group related tests:

#[cfg(test)]
mod tests {
    use super::*;

    mod parsing {
        use super::*;

        #[test]
        fn test_parse_entry_from_valid_bytes() { ... }

        #[test]
        fn test_parse_entry_from_truncated_bytes_returns_error() { ... }
    }

    mod building {
        use super::*;

        #[test]
        fn test_builder_with_entries_produces_sorted_output() { ... }
    }

    mod edge_cases {
        use super::*;

        #[test]
        fn test_edge_empty_input_returns_empty_result() { ... }
    }
}

Test Naming Convention

Pattern

Use this naming pattern for test functions:

test_<subject>_<condition>_<expected_outcome>

Components:

Part	Description	Example
`subject`	What is being tested	`parser`, `builder`, `entry`
`condition`	The scenario or input	`with_valid_data`, `from_empty_input`
`expected_outcome`	What should happen	`returns_struct`, `returns_error`

Examples

Parsing tests:

// Good - specific and descriptive
fn test_parse_header_with_valid_magic_returns_header() { ... }
fn test_parse_header_with_invalid_magic_returns_error() { ... }
fn test_parse_entry_from_truncated_data_returns_incomplete_error() { ... }

// Bad - too vague
fn test_parse() { ... }
fn test_header() { ... }
fn test_error() { ... }

Building tests:

// Good
fn test_builder_with_single_entry_creates_valid_output() { ... }
fn test_builder_with_unsorted_entries_sorts_before_writing() { ... }

// Bad
fn test_builder() { ... }
fn test_build() { ... }

Round-trip tests:

// Good - suffix with _round_trip
fn test_index_entry_round_trip_preserves_all_fields() { ... }
fn test_blte_compression_round_trip_matches_original() { ... }

// Bad
fn test_round_trip() { ... }  // Round trip of what?

Category Prefixes

Use consistent prefixes for special test categories:

Prefix	Use Case	Example
`test_edge_*`	Edge cases and boundary conditions	`test_edge_empty_input_handled`
`test_error_*`	Error path validation	`test_error_invalid_checksum_detected`
`*_round_trip`	Serialization/deserialization	`test_config_round_trip`

Edge case examples:

fn test_edge_empty_index_builds_successfully() { ... }
fn test_edge_single_entry_is_searchable() { ... }
fn test_edge_max_u32_offset_handled() { ... }
fn test_edge_zero_length_data_returns_empty() { ... }

Error handling examples:

fn test_error_truncated_footer_returns_parse_error() { ... }
fn test_error_invalid_checksum_returns_mismatch() { ... }
fn test_error_unsorted_entries_rejected() { ... }

Test Types

Unit Tests

Test individual functions in isolation:

#[test]
fn test_jenkins96_hash_with_known_input_produces_expected_output() {
    let result = Jenkins96::hash(b"test");
    // Placeholder value: replace with one verified against a reference implementation
    assert_eq!(result.hash32, 0x12345678);
}

Integration Tests

Place in tests/ directory for testing public APIs:

crates/cascette-formats/
├── src/
│   └── lib.rs
└── tests/
    └── archive_integration.rs

Property-Based Tests

Use proptest for testing invariants across many inputs:

#[cfg(test)]
mod proptest_tests {
    use proptest::prelude::*;

    proptest! {
        #[test]
        fn round_trip_preserves_entries(entries in prop::collection::vec(any::<Entry>(), 0..100)) {
            let built = build(&entries);
            let parsed = parse(&built)?;
            prop_assert_eq!(entries, parsed);
        }
    }
}

Property test naming (inside proptest! macro):

No test_ prefix is used (the macro keeps the function name exactly as written)
Describe the property being verified
Examples: round_trip_preserves_entries, checksum_detects_corruption

Assertions

Use `pretty_assertions`

Import pretty_assertions for better diff output on failures:

#[cfg(test)]
mod tests {
    use pretty_assertions::assert_eq;

    #[test]
    fn test_something() {
        assert_eq!(expected, actual);  // Shows colored diff on failure
    }
}

Common Assertions

Assertion	Use Case
`assert_eq!(expected, actual)`	Value equality
`assert_ne!(a, b)`	Values differ
`assert!(condition)`	Boolean conditions
`assert!(result.is_ok())`	Success check
`assert!(result.is_err())`	Error check
`assert!(matches!(value, pattern))`	Pattern matching

Error Assertions

Test specific error types:

#[test]
fn test_parse_with_invalid_data_returns_checksum_error() {
    let result = parse(invalid_data);

    assert!(matches!(
        result,
        Err(ParseError::ChecksumMismatch { .. })
    ));
}
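The pattern above can be tried end to end with a small standard-library sketch. The `ParseError` variants and the toy `parse` function (payload bytes followed by a one-byte wrapping-sum checksum) are assumptions made for illustration; they are not the cascette-formats API.

```rust
/// Hypothetical error type, for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    InsufficientData { expected: usize, actual: usize },
    ChecksumMismatch { expected: u8, actual: u8 },
}

/// Toy format: payload bytes followed by one checksum byte
/// (the wrapping sum of the payload). Returns the payload on success.
pub fn parse(data: &[u8]) -> Result<&[u8], ParseError> {
    let Some((&stored, payload)) = data.split_last() else {
        return Err(ParseError::InsufficientData { expected: 1, actual: 0 });
    };
    let computed = payload.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    if computed != stored {
        return Err(ParseError::ChecksumMismatch { expected: stored, actual: computed });
    }
    Ok(payload)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_error_invalid_checksum_returns_mismatch() {
        // 1 + 2 = 3, but the trailing checksum byte says 9.
        let result = parse(&[1, 2, 9]);
        // Match on the variant only; the field values are not part of this test.
        assert!(matches!(result, Err(ParseError::ChecksumMismatch { .. })));
    }
}
```

Matching on the variant with `{ .. }` keeps the test stable if fields are added later, while a full `assert_eq!` on the error value pins the exact payload.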

Running Tests

This project uses cargo-nextest for faster, parallel test execution with better output formatting.

Basic Commands

# Run all tests with nextest (recommended)
cargo nextest run --workspace

# Run tests with CI profile (stricter timeouts, immediate output on failures)
cargo nextest run --profile ci --workspace

# Run tests for a specific crate
cargo nextest run -p cascette-formats
cargo nextest run --profile ci -p cascette-formats

# Run tests matching a pattern
cargo nextest run --workspace edge_          # All edge case tests
cargo nextest run --workspace error_         # All error tests
cargo nextest run --workspace round_trip     # All round-trip tests

# Run a specific test
cargo nextest run -p cascette-formats test_parse_header_with_valid_data

Feature Combinations

Test with different feature combinations:

# Default features
cargo test --workspace

# No default features (minimal build)
cargo test --workspace --no-default-features

# All features
cargo test --workspace --all-features

Code Coverage

Generate coverage reports:

# Generate LCOV report
cargo llvm-cov --workspace --lcov --output-path lcov.info

# Generate HTML report
cargo llvm-cov --workspace --html

# Open HTML report (macOS; use xdg-open on Linux)
open target/llvm-cov/html/index.html

Test Data

Embedded Test Data

For small test cases, embed data directly in tests:

#[test]
fn test_parse_minimal_header() {
    let data = [
        0x42, 0x4C, 0x54, 0x45,  // Magic: "BLTE"
        0x00, 0x00, 0x00, 0x10,  // Header size: 16
    ];

    let header = parse_header(&data).expect("should parse");
    assert_eq!(header.magic, b"BLTE");
}
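For reference, a parser that accepts the eight embedded bytes above might look like the following sketch. It uses only the standard library; the `Header` fields, the `ParseError` variants, and the `MIN_HEADER_LEN` constant are assumptions for this example rather than the real cascette-formats types.

```rust
/// Hypothetical header type; field names are assumptions for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct Header {
    pub magic: [u8; 4],
    pub header_size: u32,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    InsufficientData { expected: usize, actual: usize },
    InvalidMagic { expected: [u8; 4], actual: [u8; 4] },
}

pub const MAGIC_BYTES: [u8; 4] = *b"BLTE";
/// Four magic bytes plus a four-byte size field.
pub const MIN_HEADER_LEN: usize = 8;

pub fn parse_header(data: &[u8]) -> Result<Header, ParseError> {
    // Check the length first so the indexing below cannot panic.
    if data.len() < MIN_HEADER_LEN {
        return Err(ParseError::InsufficientData {
            expected: MIN_HEADER_LEN,
            actual: data.len(),
        });
    }
    let magic = [data[0], data[1], data[2], data[3]];
    if magic != MAGIC_BYTES {
        return Err(ParseError::InvalidMagic { expected: MAGIC_BYTES, actual: magic });
    }
    // The size field is stored big-endian.
    let header_size = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);
    Ok(Header { magic, header_size })
}
```

Keeping the input as a literal byte array next to the assertions makes the expected layout visible in the test itself.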

Test Fixtures

For larger test files, use the include_bytes! macro or test fixtures:

const TEST_INDEX: &[u8] = include_bytes!("fixtures/sample.index");

#[test]
fn test_parse_real_index_file() {
    let index = ArchiveIndex::parse(TEST_INDEX).expect("should parse");
    assert!(!index.entries.is_empty());
}

Property Test Strategies

Define reusable strategies for property tests:

fn valid_entry_strategy() -> impl Strategy<Value = IndexEntry> {
    (
        prop::array::uniform16(any::<u8>()),  // 16-byte key
        0u32..u32::MAX,                        // offset
        1u32..1_000_000,                       // size
    ).prop_map(|(key, offset, size)| {
        IndexEntry { key: key.to_vec(), offset, size, archive_index: None }
    })
}

CI Integration

Tests run automatically on every pull request using cargo-nextest. The CI workflow:

Runs cargo nextest run --profile ci --workspace with default features
Runs tests with --no-default-features on changed crates
Tests each changed crate individually on stable Rust
Collects code coverage using cargo llvm-cov --nextest and uploads to Codecov

See .github/workflows/ci.yml for the full configuration.

Nextest Profiles

The project uses three nextest profiles configured in .config/nextest.toml:

Profile	Description	Use Case
`default`	Standard timeouts, final output on completion	Local development
`ci`	Stricter timeouts, immediate output on failures	CI, PR checks
`release`	Release build with optimizations	Performance testing

Cargo Aliases

Convenient cargo aliases are defined in .cargo/config.toml:

cargo nextest-all          # All tests with default profile
cargo nextest-lib          # Library tests only
cargo nextest-ci           # All tests with CI profile
cargo nextest-release      # All tests with release profile
cargo nextest-unit         # Unit tests only
cargo nextest-integration  # Integration tests only

# Generate flamegraph for benchmarks
cargo flamegraph --bench throughput -- --bench

# Generate flamegraph for a binary
cargo flamegraph --bin cascette-ribbit -- --help

# Generate flamegraph for tests
cargo flamegraph --test integration

# Specify output location (flamegraph.svg is created in working directory by default)
cargo flamegraph --output target/flamegraphs/flamegraph.svg --bench throughput -- --bench

Flamegraph outputs are stored in target/flamegraphs/ and ignored by git.

CI Flamegraph Generation

The .github/workflows/profiling.yml workflow generates flamegraphs automatically:

Trigger: Manual via workflow_dispatch or commits with [perf] in the message
Targets: bench (default), test, binary
Output: Uploaded as artifacts and posted to PR comments

To trigger a flamegraph run:

git commit -m "Add performance optimization [perf]"
git push

Or manually trigger via GitHub Actions UI with a target selector.

Benchmarking

The project uses criterion for benchmarking.

# Run all benchmarks
cargo bench

# Run specific benchmark
cargo bench --bench throughput

# Open the HTML report (written during the run when Criterion's html_reports feature is enabled)
open target/criterion/report/index.html

Benchmark Regression Detection

The profiling workflow automatically detects performance regressions:

Runs on main branch pushes
Uses benchmark-action/github-action-benchmark to store results
Alerts when a benchmark result exceeds 200% of the previous value (at least twice as slow)
Posts comments to commits with regression alerts

Benchmark data is stored in GitHub Actions cache for historical comparison.

Coding Standards

This page covers coding conventions and style guidelines for cascette-rs.

Formatting

All code must be formatted with rustfmt. Run before committing:

cargo fmt --all

The workspace uses default rustfmt settings. No custom configuration is needed.

Linting

The workspace enables strict clippy lints. All warnings must be resolved:

cargo clippy --workspace --all-targets

Lint Configuration

From Cargo.toml:

[workspace.lints.clippy]
# Lint groups at low priority
all = { level = "warn", priority = -1 }
pedantic = { level = "warn", priority = -1 }
nursery = { level = "warn", priority = -1 }
cargo = { level = "warn", priority = -1 }

# Safety lints at higher priority
unwrap_used = { level = "warn", priority = 2 }
panic = { level = "warn", priority = 2 }
todo = { level = "warn", priority = 2 }
unimplemented = { level = "warn", priority = 2 }
expect_used = { level = "warn", priority = 2 }

Error Handling

Library Code

Library crates must use proper error handling:

// Good - returns Result
pub fn parse(data: &[u8]) -> Result<Header, ParseError> {
    if data.len() < HEADER_SIZE {
        return Err(ParseError::InsufficientData {
            expected: HEADER_SIZE,
            actual: data.len(),
        });
    }
    // ...
}

// Bad - panics
pub fn parse(data: &[u8]) -> Header {
    assert!(data.len() >= HEADER_SIZE);  // Don't do this
    // ...
}

Error Types

Use thiserror for error definitions:

use thiserror::Error;

#[derive(Debug, Error)]
pub enum ParseError {
    #[error("insufficient data: expected {expected} bytes, got {actual}")]
    InsufficientData { expected: usize, actual: usize },

    #[error("invalid magic: expected {expected:?}, got {actual:?}")]
    InvalidMagic { expected: [u8; 4], actual: [u8; 4] },

    #[error("checksum mismatch")]
    ChecksumMismatch { expected: [u8; 8], actual: [u8; 8] },
}
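To make clear what the `#[error(...)]` attributes provide, here is a hand-written, standard-library-only sketch that behaves the same way for two of the variants above: a `Display` implementation plus an `std::error::Error` implementation. It is an approximation for illustration, not the exact code the derive macro emits.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
pub enum ParseError {
    InsufficientData { expected: usize, actual: usize },
    InvalidMagic { expected: [u8; 4], actual: [u8; 4] },
}

// Each #[error("...")] attribute corresponds to one arm of Display::fmt.
impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::InsufficientData { expected, actual } => {
                write!(f, "insufficient data: expected {expected} bytes, got {actual}")
            }
            Self::InvalidMagic { expected, actual } => {
                write!(f, "invalid magic: expected {expected:?}, got {actual:?}")
            }
        }
    }
}

// Implementing Error lets the type work with `?` and `Box<dyn Error>`.
impl Error for ParseError {}
```

With the derive, adding a variant means adding one attribute instead of another match arm, which is the main reason to prefer it in library crates.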

Avoiding `unwrap()` and `expect()`

Library code should avoid unwrap() and expect(). Use these alternatives:

// Instead of unwrap(), propagate errors
let value = map.get(&key).ok_or(Error::KeyNotFound)?;

// Instead of expect(), use ok_or_else() with context
let value = map.get(&key)
    .ok_or_else(|| Error::KeyNotFound { key: key.clone() })?;

// For truly impossible cases, use unreachable!() with a comment
match validated_enum {
    Known::Variant => { /* ... */ }
    // Validation already rejected every other variant
    _ => unreachable!("validated_enum was checked earlier"),
}
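A compilable version of the `ok_or_else()` alternative, as a minimal sketch: `LookupError` and `lookup_size` are hypothetical names introduced for this example.

```rust
use std::collections::HashMap;

/// Hypothetical error type carrying the key that was not found.
#[derive(Debug, PartialEq, Eq)]
pub enum LookupError {
    KeyNotFound { key: String },
}

/// Looks up a size by name without unwrap() or expect().
pub fn lookup_size(sizes: &HashMap<String, u32>, key: &str) -> Result<u32, LookupError> {
    let size = sizes
        .get(key)
        .copied()
        // The closure runs only on the miss path, so the String is
        // allocated only when an error is actually returned.
        .ok_or_else(|| LookupError::KeyNotFound { key: key.to_owned() })?;
    Ok(size)
}
```

`ok_or_else` is preferred over `ok_or` here because building the error allocates; with a unit-like error variant, `ok_or` is just as good.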

When expect() is unavoidable (e.g., in binrw map functions), add a file-level allow with documentation:

//! Module description
//!
//! Uses expect in binrw map functions where Result types cannot be used.
#![allow(clippy::expect_used)]

Test Code

Test code may use unwrap() and expect() with the allow attribute:

#[cfg(test)]
#[allow(clippy::unwrap_used, clippy::expect_used, clippy::panic)]
mod tests {
    // Tests can use unwrap/expect/panic freely
}

Binary Format Parsing

Use binrw

All binary formats use the binrw crate for parsing and building:

use binrw::{BinRead, BinWrite};

#[derive(Debug, BinRead, BinWrite)]
#[brw(big)]  // NGDP uses big-endian
pub struct Header {
    #[brw(magic = b"BLTE")]
    pub magic: (),

    pub header_size: u32,
    pub flags: u8,
}

Big-Endian Default

NGDP/CASC formats use big-endian byte order. Always specify:

#[derive(BinRead, BinWrite)]
#[brw(big)]  // Required for NGDP formats
pub struct Entry {
    pub offset: u32,
    pub size: u32,
}

If a field uses little-endian (rare), annotate explicitly:

#[derive(BinRead, BinWrite)]
#[brw(big)]
pub struct MixedEntry {
    pub big_endian_field: u32,

    #[brw(little)]
    pub little_endian_field: u32,  // Exception - document why
}
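The difference between the two byte orders can be seen without binrw. The sketch below decodes the same four bytes both ways using only the standard library; the helper names `read_u32_be` and `read_u32_le` are made up for this example.

```rust
/// Decodes the first four bytes as a big-endian u32 (most significant byte first).
pub fn read_u32_be(bytes: &[u8]) -> Option<u32> {
    match bytes {
        [a, b, c, d, ..] => Some(u32::from_be_bytes([*a, *b, *c, *d])),
        _ => None, // fewer than four bytes available
    }
}

/// Decodes the first four bytes as a little-endian u32 (least significant byte first).
pub fn read_u32_le(bytes: &[u8]) -> Option<u32> {
    match bytes {
        [a, b, c, d, ..] => Some(u32::from_le_bytes([*a, *b, *c, *d])),
        _ => None,
    }
}
```

For the bytes `00 00 01 02`, the big-endian reading is `0x0000_0102` and the little-endian reading is `0x0201_0000`, which is why a wrong or missing endianness annotation tends to show up as wildly out-of-range sizes and offsets.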

Round-Trip Testing

Every format must have round-trip tests:

use std::io::Cursor;
use binrw::{BinRead, BinWrite};

// Assumes Header also derives PartialEq so assert_eq! can compare values
#[test]
fn test_header_round_trip_preserves_all_fields() {
    let original = Header {
        magic: (),
        header_size: 16,
        flags: 0x01,
    };

    let mut buffer = Vec::new();
    original.write(&mut Cursor::new(&mut buffer)).unwrap();

    let parsed = Header::read(&mut Cursor::new(&buffer)).unwrap();

    assert_eq!(original, parsed);
}

Documentation

Public API Documentation

All public items require documentation:

/// Parses a BLTE header from the given data.
///
/// # Arguments
///
/// * `data` - Raw bytes containing the BLTE header
///
/// # Returns
///
/// The parsed header on success, or an error if parsing fails.
///
/// # Errors
///
/// Returns `ParseError::InsufficientData` if the data is too short.
/// Returns `ParseError::InvalidMagic` if the magic bytes don't match.
///
/// # Examples
///
/// ```
/// use cascette_formats::blte::parse_header;
///
/// let data = include_bytes!("../fixtures/sample.blte");
/// let header = parse_header(data)?;
/// println!("Header size: {}", header.header_size);
/// # Ok::<(), cascette_formats::blte::ParseError>(())
/// ```
pub fn parse_header(data: &[u8]) -> Result<Header, ParseError> {
    // ...
}

Binary Format Documentation

Document binary formats with exact byte layouts:

/// Archive index entry.
///
/// ## Binary Layout
///
/// | Offset | Size | Field | Description |
/// |--------|------|-------|-------------|
/// | 0x00 | 16 | key | Encoding key (MD5 hash) |
/// | 0x10 | 4 | size | Compressed size in bytes |
/// | 0x14 | 4 | offset | Offset into archive file |
///
/// Total size: 24 bytes (0x18)
///
/// All multi-byte fields are big-endian.
#[derive(Debug, BinRead, BinWrite)]
#[brw(big)]
pub struct IndexEntry {
    pub key: [u8; 16],
    pub size: u32,
    pub offset: u32,
}
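A documented layout is easiest to trust when the offsets in the table can be checked against code. The following sketch encodes and decodes the 24-byte layout above by hand, using only the standard library instead of binrw; `INDEX_ENTRY_SIZE`, `to_bytes`, and `from_bytes` are names assumed for this example.

```rust
pub const INDEX_ENTRY_SIZE: usize = 24; // 0x18, as in the layout table

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IndexEntry {
    pub key: [u8; 16],
    pub size: u32,
    pub offset: u32,
}

impl IndexEntry {
    /// Serializes the entry using the offsets from the layout table.
    pub fn to_bytes(&self) -> [u8; INDEX_ENTRY_SIZE] {
        let mut out = [0u8; INDEX_ENTRY_SIZE];
        out[0x00..0x10].copy_from_slice(&self.key);
        out[0x10..0x14].copy_from_slice(&self.size.to_be_bytes());
        out[0x14..0x18].copy_from_slice(&self.offset.to_be_bytes());
        out
    }

    /// Parses an entry; the fixed-size array type rules out short input.
    pub fn from_bytes(bytes: &[u8; INDEX_ENTRY_SIZE]) -> Self {
        let mut key = [0u8; 16];
        key.copy_from_slice(&bytes[0x00..0x10]);
        let size = u32::from_be_bytes([bytes[0x10], bytes[0x11], bytes[0x12], bytes[0x13]]);
        let offset = u32::from_be_bytes([bytes[0x14], bytes[0x15], bytes[0x16], bytes[0x17]]);
        Self { key, size, offset }
    }
}
```

A round-trip assertion such as `IndexEntry::from_bytes(&entry.to_bytes()) == entry` then doubles as a check that the table and the code agree.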

Naming Conventions

Types and Traits

Item	Convention	Example
Structs	PascalCase	`ArchiveIndex`, `BlteHeader`
Enums	PascalCase	`CompressionType`, `ParseError`
Traits	PascalCase	`CascFormat`, `KeyStore`
Type aliases	PascalCase	`ContentKey`, `EncodingKey`

Functions and Methods

Item	Convention	Example
Functions	snake_case	`parse_header`, `build_index`
Methods	snake_case	`self.get_entry()`, `self.is_valid()`
Constructors	`new` or `from_*`	`Header::new()`, `Key::from_hex()`
Conversions	`to_` or `into_`	`to_bytes()`, `into_vec()`
Getters	no prefix	`fn size(&self)` not `fn get_size(&self)`
Boolean getters	`is_` or `has_`	`is_empty()`, `has_entries()`
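The conventions in this table can be shown together on one small type. `Key` below is a hypothetical byte-string wrapper written for this example, not the project's actual key type, and it returns `Option` from `from_hex` only to keep the sketch short.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Key {
    bytes: Vec<u8>,
}

impl Key {
    /// Constructor: `new`.
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { bytes }
    }

    /// Constructor from another representation: `from_*`.
    pub fn from_hex(hex: &str) -> Option<Self> {
        // Require an even number of ASCII hex digits; this also makes
        // the byte-index slicing below safe.
        if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let mut bytes = Vec::with_capacity(hex.len() / 2);
        for i in (0..hex.len()).step_by(2) {
            bytes.push(u8::from_str_radix(&hex[i..i + 2], 16).ok()?);
        }
        Some(Self { bytes })
    }

    /// Getter: no `get_` prefix.
    pub fn len(&self) -> usize {
        self.bytes.len()
    }

    /// Boolean getter: `is_` prefix.
    pub fn is_empty(&self) -> bool {
        self.bytes.is_empty()
    }

    /// Borrowing conversion that produces an owned value: `to_*`.
    pub fn to_bytes(&self) -> Vec<u8> {
        self.bytes.clone()
    }

    /// Consuming conversion: `into_*`.
    pub fn into_vec(self) -> Vec<u8> {
        self.bytes
    }
}
```

The `to_`/`into_` split signals cost at the call site: `to_bytes()` clones, while `into_vec()` hands over the existing allocation.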

Constants and Statics

// Constants: SCREAMING_SNAKE_CASE
pub const HEADER_SIZE: usize = 16;
pub const MAGIC_BYTES: [u8; 4] = *b"BLTE";

// Statics (rare): SCREAMING_SNAKE_CASE
static GLOBAL_CONFIG: Lazy<Config> = Lazy::new(Config::default);

Modules

Module names use snake_case:

mod archive;
mod blte;
mod encoding;
mod root;

File structure mirrors module structure:

src/
├── archive/
│   ├── mod.rs
│   ├── index.rs
│   └── builder.rs
├── blte/
│   ├── mod.rs
│   ├── header.rs
│   └── compression.rs
└── lib.rs

Memory and Performance

Zero-Copy When Possible

Prefer borrowing over copying:

// Good - borrows data
pub fn parse<'a>(data: &'a [u8]) -> Result<Entry<'a>, Error> {
    Ok(Entry {
        key: &data[0..16],
        // ...
    })
}

// Less efficient - copies data
pub fn parse(data: &[u8]) -> Result<Entry, Error> {
    Ok(Entry {
        key: data[0..16].to_vec(),
        // ...
    })
}
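A complete, compilable variant of the borrowing approach is sketched below. The `Entry` fields and the 16-byte `KEY_LEN` are assumptions for illustration, and `Option` stands in for the project's error type.

```rust
pub const KEY_LEN: usize = 16;

/// A parsed view into the input buffer; it owns no bytes itself.
#[derive(Debug, PartialEq, Eq)]
pub struct Entry<'a> {
    pub key: &'a [u8],
    pub payload: &'a [u8],
}

/// Splits the input into key and payload without copying.
pub fn parse<'a>(data: &'a [u8]) -> Option<Entry<'a>> {
    // Guard the split so short input is rejected instead of panicking.
    if data.len() < KEY_LEN {
        return None;
    }
    let (key, payload) = data.split_at(KEY_LEN);
    Some(Entry { key, payload })
}
```

The lifetime `'a` ties the returned `Entry` to the input buffer, so the compiler rejects any use of the entry after the buffer is dropped. That constraint is the price of avoiding the copy.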

Avoid Loading Large Files Into Memory

Stream large files instead of loading entirely:

// Good - streams data
pub fn process_archive<R: Read + Seek>(reader: &mut R) -> Result<(), Error> {
    // Assumes read_entry returns Ok(None) once the input is exhausted
    while let Some(entry) = read_entry(reader)? {
        process_entry(&entry)?;
    }
    Ok(())
}

// Bad - loads everything
pub fn process_archive(data: &[u8]) -> Result<(), Error> {
    let archive = parse_entire_archive(data)?;  // Can exhaust memory on large files
    // ...
}
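As a runnable illustration of the streaming style, the sketch below reads length-prefixed records one at a time from any `Read` source. The record format (a big-endian `u32` length followed by that many bytes) and the function bodies are invented for this example and do not describe a real CASC structure.

```rust
use std::io::{self, Read};

/// Reads one record: a big-endian u32 length followed by that many bytes.
/// Returns Ok(None) when the input ends cleanly before a new record starts.
pub fn read_entry<R: Read>(reader: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut len_buf = [0u8; 4];
    // A zero-byte read here means the stream ended between records.
    if reader.read(&mut len_buf[..1])? == 0 {
        return Ok(None);
    }
    // From here on, running out of input is an error (UnexpectedEof).
    reader.read_exact(&mut len_buf[1..])?;
    // Note: a real parser should cap this length before allocating.
    let len = u32::from_be_bytes(len_buf) as usize;
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(Some(payload))
}

/// Processes records one at a time and returns the total payload size.
/// Only a single record is held in memory at any point.
pub fn process_archive<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut total = 0u64;
    while let Some(entry) = read_entry(reader)? {
        total += entry.len() as u64;
    }
    Ok(total)
}
```

Because `&[u8]` implements `Read`, the same functions can be exercised in tests with an in-memory slice and used in production with a `File` or `BufReader`.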

Use Appropriate Collection Types

Use Case	Type
Ordered, indexed access	`Vec<T>`
Key-value lookup	`HashMap<K, V>` or `BTreeMap<K, V>`
Unique values	`HashSet<T>` or `BTreeSet<T>`
Small fixed-size	`[T; N]` or `ArrayVec<T, N>`
Bytes	`Bytes` (from bytes crate) for shared ownership
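The choice between the hash-based and tree-based types usually comes down to whether iteration order matters. A minimal sketch, with a made-up two-byte `Key` alias to keep the data short:

```rust
use std::collections::{BTreeMap, HashMap};

/// Two-byte keys keep the example short; real keys would be longer.
pub type Key = [u8; 2];

/// Point lookups only: HashMap offers average O(1) access,
/// but its iteration order is unspecified.
pub fn build_lookup(entries: &[(Key, u32)]) -> HashMap<Key, u32> {
    entries.iter().copied().collect()
}

/// Sorted output: BTreeMap iterates in ascending key order,
/// so no separate sort step is needed.
pub fn sorted_keys(entries: &[(Key, u32)]) -> Vec<Key> {
    let ordered: BTreeMap<Key, u32> = entries.iter().copied().collect();
    ordered.keys().copied().collect()
}
```

The same reasoning applies to `HashSet` versus `BTreeSet`: pick the tree variant when output must be deterministic or sorted, for example when writing an index whose entries have to be ordered.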

Unsafe Code

Unsafe code requires explicit documentation:

/// # Safety
///
/// Caller must ensure:
/// - `ptr` is valid for reads of `len` bytes
/// - `ptr` is properly aligned for `T`
/// - The memory is not mutated during this call
pub unsafe fn read_from_ptr<T>(ptr: *const u8, len: usize) -> T {
    // ...
}

Prefer safe abstractions when possible. Use unsafe only when necessary for performance or FFI.
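One way to follow this advice is to keep the `unsafe` function private to a module and expose a safe wrapper that establishes the safety contract itself. The sketch below does this for a native-endian `u32` read; both function names are hypothetical, and in practice `u32::from_ne_bytes` on a checked slice achieves the same result with no `unsafe` at all.

```rust
/// Reads a native-endian u32 from a possibly unaligned pointer.
///
/// # Safety
///
/// Caller must ensure `ptr` is valid for reads of 4 bytes.
pub unsafe fn read_u32_unaligned(ptr: *const u8) -> u32 {
    // SAFETY: the caller guarantees 4 readable bytes, and
    // read_unaligned has no alignment requirement.
    unsafe { ptr.cast::<u32>().read_unaligned() }
}

/// Safe wrapper: the bounds check upholds the contract above.
pub fn read_u32_at(data: &[u8], offset: usize) -> Option<u32> {
    // checked_add avoids overflow for offsets near usize::MAX.
    let end = offset.checked_add(4)?;
    let window = data.get(offset..end)?;
    // SAFETY: `window` is exactly 4 initialized, readable bytes.
    Some(unsafe { read_u32_unaligned(window.as_ptr()) })
}
```

Each `unsafe` block carries a `SAFETY:` comment stating why the documented requirements hold at that call site, which keeps review focused on the small amount of code that can actually cause undefined behavior.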

WASM Compatibility

Core libraries must compile to WASM. Avoid:

C dependencies (use pure Rust implementations)
File system access in library code
Platform-specific code without #[cfg] guards
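A `#[cfg]` guard for the last point might look like the sketch below. The `default_cache_dir` helper and the `cascette-cache` directory name are hypothetical, chosen only to show two implementations of one function selected by target.

```rust
use std::path::PathBuf;

/// Native targets: a file system exists, so suggest a cache location.
#[cfg(not(target_arch = "wasm32"))]
pub fn default_cache_dir() -> Option<PathBuf> {
    Some(std::env::temp_dir().join("cascette-cache"))
}

/// WASM: no file system is assumed, so callers must supply storage themselves.
#[cfg(target_arch = "wasm32")]
pub fn default_cache_dir() -> Option<PathBuf> {
    None
}
```

Both variants share one signature, so calling code compiles unchanged on every target and handles the `None` case instead of carrying its own `#[cfg]` branches.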

Test WASM compilation:

cargo check --target wasm32-unknown-unknown -p cascette-crypto
cargo check --target wasm32-unknown-unknown -p cascette-formats
