Collections

Working with SCALE collection types: sequences, arrays, maps, and sets

Collection Types Overview

SCALE provides several collection types for organizing multiple values. Each has different characteristics and use cases:

  • Sequence (Vec): Variable-length, dynamically-sized
  • Array: Fixed-length, compile-time size
  • Tuple: Fixed-length, mixed element types
  • BTreeMap: Sorted key-value pairs
  • Set: Unique values
  • BitSequence: Compact boolean array

Sequences (Vec)

Sequences are the most common collection type: variable-length vectors with a compact-encoded length prefix.

Basic Sequence

import 'package:polkadart_scale_codec/polkadart_scale_codec.dart';

// Vec<u32>
final codec = SequenceCodec(U32Codec.codec);

// Encode
final values = [100, 200, 300];
final output = ByteOutput();
codec.encodeTo(values, output);
// [12, 100, 0, 0, 0, 200, 0, 0, 0, 44, 1, 0, 0]
// 12 = compact(3), then three u32 values

// Decode
final decoded = codec.decode(Input.fromBytes(output.toBytes()));
print(decoded); // [100, 200, 300]

Encoding Structure

A sequence is encoded as:

  1. Length prefix (compact-encoded)
  2. Elements (each encoded with the inner codec)

// [1, 2, 3] encoded as Vec<u8>:
// 0x0c 01 02 03
//   │  └──┬───┘
//   │     └── Elements
//   └──────── Compact(3)
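The layout above can be reproduced by hand. The following is a minimal sketch using only dart:typed_data; the helper names compactLength and encodeVecU8 are illustrative and are not part of polkadart_scale_codec. It assumes lengths below 2^30 and leaves out the big-integer compact mode.

```dart
import 'dart:typed_data';

/// Encodes [n] as a SCALE compact integer (sketch: lengths below 2^30 only).
Uint8List compactLength(int n) {
  if (n < 0 || n >= 1 << 30) {
    throw ArgumentError('length out of range for this sketch: $n');
  }
  if (n < 1 << 6) {
    // Single-byte mode: value in the upper six bits, mode bits 0b00.
    return Uint8List.fromList([n << 2]);
  }
  if (n < 1 << 14) {
    // Two-byte mode: little-endian, mode bits 0b01.
    final v = (n << 2) | 0x01;
    return Uint8List.fromList([v & 0xff, v >> 8]);
  }
  // Four-byte mode: little-endian, mode bits 0b10.
  final v = (n << 2) | 0x02;
  return Uint8List.fromList(
      [v & 0xff, (v >> 8) & 0xff, (v >> 16) & 0xff, (v >> 24) & 0xff]);
}

/// Encodes a Vec<u8>: compact length prefix followed by the raw bytes.
Uint8List encodeVecU8(List<int> bytes) {
  return Uint8List.fromList([...compactLength(bytes.length), ...bytes]);
}

void main() {
  print(encodeVecU8([1, 2, 3])); // [12, 1, 2, 3], i.e. 0x0c010203
  print(compactLength(1000)); // [161, 15]: two-byte mode
}
```

The prefix grows with the length: one byte up to 63 elements, two bytes up to 16383, four bytes beyond that.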

Nested Sequences

Sequences can contain other sequences:

// Vec<Vec<u32>>
final innerCodec = SequenceCodec(U32Codec.codec);
final outerCodec = SequenceCodec(innerCodec);

final data = [
  [1, 2, 3],
  [4, 5],
  [6, 7, 8, 9],
];

final encoded = outerCodec.encode(data);
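In a nested sequence, every inner vector carries its own length prefix in addition to the outer one. The sketch below shows this byte layout by hand, assuming fewer than 64 elements at each level so that every prefix fits in a single byte. The helpers u32, compactSmall, encodeVecU32, and encodeVecVecU32 are hypothetical names, not library API.

```dart
import 'dart:typed_data';

/// Little-endian encoding of a u32.
List<int> u32(int v) =>
    (ByteData(4)..setUint32(0, v, Endian.little)).buffer.asUint8List();

/// Compact prefix, single-byte mode only (sketch: lengths below 64).
int compactSmall(int n) {
  if (n < 0 || n >= 64) {
    throw ArgumentError('this sketch handles lengths below 64');
  }
  return n << 2;
}

/// Vec<u32>: prefix, then each element as 4 little-endian bytes.
List<int> encodeVecU32(List<int> values) =>
    [compactSmall(values.length), for (final v in values) ...u32(v)];

/// Vec<Vec<u32>>: outer prefix, then each inner Vec<u32> with its own prefix.
List<int> encodeVecVecU32(List<List<int>> rows) =>
    [compactSmall(rows.length), for (final r in rows) ...encodeVecU32(r)];

void main() {
  final bytes = encodeVecVecU32([
    [1, 2, 3],
    [4, 5],
    [6, 7, 8, 9],
  ]);
  print(bytes.length); // 40 = 1 + (1 + 12) + (1 + 8) + (1 + 16)
  print(bytes.take(2).toList()); // [12, 12]: outer compact(3), inner compact(3)
}
```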

Sequences of Structs

// Vec<{id: u32, active: bool}>
final structCodec = CompositeCodec({
  'id': U32Codec.codec,
  'active': BoolCodec.codec,
});

final codec = SequenceCodec(structCodec);

final data = [
  {'id': 1, 'active': true},
  {'id': 2, 'active': false},
  {'id': 3, 'active': true},
];

final encoded = codec.encode(data);

Empty Sequences

final codec = SequenceCodec(U32Codec.codec);

// Empty vector
final empty = <int>[];
final output = ByteOutput();
codec.encodeTo(empty, output);
print(output.toBytes()); // [0] - just compact(0)

Arrays

Arrays have a fixed length known at compile time. No length prefix is encoded.

Basic Arrays

// [u32; 3] - array of exactly 3 u32 values
final codec = ArrayCodec(U32Codec.codec, 3);

// Encode
final values = [100, 200, 300];
final output = ByteOutput();
codec.encodeTo(values, output);
// [100, 0, 0, 0, 200, 0, 0, 0, 44, 1, 0, 0]
// No length prefix!

// Must be exact length
try {
  codec.encodeTo([100, 200], output); // Error!
} catch (e) {
  print('Invalid list length');
}
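To make the "no length prefix" rule concrete, here is a hand-written sketch of a fixed-length u32 array encoder. The function name encodeU32Array is an assumption for illustration; it only mimics the behaviour described above and is not how the library implements ArrayCodec.

```dart
import 'dart:typed_data';

/// Encodes a fixed-length [u32; length] array: elements only, no prefix.
Uint8List encodeU32Array(List<int> values, int length) {
  if (values.length != length) {
    throw ArgumentError('expected $length elements, got ${values.length}');
  }
  final data = ByteData(4 * length);
  for (var i = 0; i < length; i++) {
    // Each u32 occupies 4 bytes in little-endian order.
    data.setUint32(4 * i, values[i], Endian.little);
  }
  return data.buffer.asUint8List();
}

void main() {
  print(encodeU32Array([100, 200, 300], 3));
  // [100, 0, 0, 0, 200, 0, 0, 0, 44, 1, 0, 0]

  try {
    encodeU32Array([100, 200], 3);
  } on ArgumentError catch (e) {
    print('Invalid list length: ${e.message}');
  }
}
```

Because the decoder already knows the element count, the 12 bytes are the whole encoding.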

Byte Arrays

For byte arrays, there's an optimized codec:

// [u8; 32] - common for hashes, keys
final codec = U8ArrayCodec(32);

final hash = List.generate(32, (i) => i);
final encoded = codec.encode(hash);
print(encoded.length); // Exactly 32 bytes

When to Use Arrays vs Sequences

Use Array When                            | Use Sequence When
Size is fixed and known                   | Size varies at runtime
No length prefix (saves at least 1 byte)  | Dynamic collections
Type safety requires it                   | Flexibility is needed
Hash outputs, signatures                  | Lists, vectors

Common Array Sizes

// Common in blockchain
final hash256 = U8ArrayCodec(32);      // SHA-256, Blake2b-256
final hash512 = U8ArrayCodec(64);      // SHA-512, Blake2b-512
final signature = U8ArrayCodec(64);    // ED25519, SR25519
final publicKey = U8ArrayCodec(32);    // Account public keys

Tuples

Tuples are fixed-length collections of potentially different types:

Basic Tuples

// (u32, bool, String)
final codec = TupleCodec([
  U32Codec.codec,
  BoolCodec.codec,
  StrCodec.codec,
]);

final value = [42, true, 'hello'];
final output = ByteOutput();
codec.encodeTo(value, output);

final decoded = codec.decode(Input.fromBytes(output.toBytes()));
print(decoded); // [42, true, 'hello']

Tuple Structure

Tuples encode each element in order with no separator:

// (u8, u16) with values (5, 1000)
// Encoded as: [5, 232, 3]
//             └┘  └──┬──┘
//             u8    u16 (little-endian)
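The (u8, u16) layout can be checked with a few lines of plain Dart. This is a sketch with hypothetical helper names (encodeU8U16, decodeU8U16) that uses ByteData directly instead of TupleCodec.

```dart
import 'dart:typed_data';

/// Encodes a (u8, u16) tuple: fields back to back, no separator or prefix.
Uint8List encodeU8U16(int a, int b) {
  final data = ByteData(3)
    ..setUint8(0, a)
    ..setUint16(1, b, Endian.little);
  return data.buffer.asUint8List();
}

/// Decodes the same layout back into [u8, u16].
List<int> decodeU8U16(Uint8List bytes) {
  final data = ByteData.sublistView(bytes);
  return [data.getUint8(0), data.getUint16(1, Endian.little)];
}

void main() {
  final encoded = encodeU8U16(5, 1000);
  print(encoded); // [5, 232, 3] because 1000 = 0x03E8, stored as E8 03
  print(decodeU8U16(encoded)); // [5, 1000]
}
```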

Nested Tuples

// ((u32, bool), String)
final innerTuple = TupleCodec([U32Codec.codec, BoolCodec.codec]);
final outerTuple = TupleCodec([innerTuple, StrCodec.codec]);

final value = [
  [42, true],
  'test',
];
final encoded = outerTuple.encode(value);

Tuple vs Array

Tuple                | Array
Different types      | Same type
(u32, bool, String)  | [u32; 3]
TupleCodec([...])    | ArrayCodec(codec, N)
Heterogeneous        | Homogeneous

BTreeMap

A BTreeMap is a map whose key-value pairs are encoded in sorted key order:

Basic BTreeMap

// BTreeMap<u32, bool>
final codec = BTreeMapCodec(
  keyCodec: U32Codec.codec,
  valueCodec: BoolCodec.codec,
);

final map = {
  100: true,
  200: false,
  300: true,
};

final output = ByteOutput();
codec.encodeTo(map, output);

final decoded = codec.decode(Input.fromBytes(output.toBytes()));
print(decoded); // {100: true, 200: false, 300: true}

Encoding Structure

BTreeMap is encoded as:

  1. Count (compact-encoded number of entries)
  2. Key-value pairs (sorted by key)

// {2: true, 1: false}
// Sorted order: {1: false, 2: true}
// Encoded: [8, 1, 0, 0, 0, 0, 2, 0, 0, 0, 1]
//          └┘  └───────┬───────┘  └───────┬───────┘
//      compact(2)   entry 1        entry 2
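The sort-then-encode step can be sketched without the library. The function encodeMapU32Bool below is a hypothetical helper written for this example, and it assumes fewer than 64 entries so that the count fits in a one-byte compact prefix.

```dart
import 'dart:typed_data';

/// Encodes a BTreeMap<u32, bool>: compact count, then (key, value) pairs
/// in ascending key order. Sketch: fewer than 64 entries only.
Uint8List encodeMapU32Bool(Map<int, bool> map) {
  if (map.length >= 64) {
    throw ArgumentError('this sketch handles fewer than 64 entries');
  }
  final keys = map.keys.toList()..sort();
  final out = BytesBuilder();
  out.addByte(map.length << 2); // compact count, single-byte mode
  for (final k in keys) {
    final keyBytes = ByteData(4)..setUint32(0, k, Endian.little);
    out.add(keyBytes.buffer.asUint8List());
    out.addByte(map[k]! ? 1 : 0); // bool is a single byte: 0 or 1
  }
  return out.toBytes();
}

void main() {
  // Insertion order does not matter; the encoding is sorted by key.
  print(encodeMapU32Bool({2: true, 1: false}));
  // [8, 1, 0, 0, 0, 0, 2, 0, 0, 0, 1]
}
```

Sorting before encoding means two maps with the same entries always produce the same bytes, regardless of insertion order.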

Nested Maps

// BTreeMap<u32, BTreeMap<u32, bool>>
final innerMap = BTreeMapCodec(
  keyCodec: U32Codec.codec,
  valueCodec: BoolCodec.codec,
);

final outerMap = BTreeMapCodec(
  keyCodec: U32Codec.codec,
  valueCodec: innerMap,
);

final data = {
  1: {10: true, 20: false},
  2: {30: true},
};

final encoded = outerMap.encode(data);

Complex Map Example

// BTreeMap<String, Vec<u32>>
final codec = BTreeMapCodec(
  keyCodec: StrCodec.codec,
  valueCodec: SequenceCodec(U32Codec.codec),
);

final data = {
  'alice': [100, 200],
  'bob': [300, 400, 500],
};

final encoded = codec.encode(data);

Set

A Set is a collection of unique values, encoded in sorted order:

Basic Set

// BTreeSet<u32> (implemented as BTreeMap<u32, ()>)
final codec = SetCodec(U32Codec.codec);

final set = {1, 2, 3, 2, 1}; // Duplicates removed
final output = ByteOutput();
codec.encodeTo(set, output);

final decoded = codec.decode(Input.fromBytes(output.toBytes()));
print(decoded); // {1, 2, 3}

Set Encoding

Sets are encoded similarly to sequences:

  1. Count (compact-encoded)
  2. Elements (sorted and unique)

// {3, 1, 2} becomes {1, 2, 3}
// Encoded: [12, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0]
//          └┘  └────────────┬────────────────────┘
//     compact(3)         sorted elements
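As a rough illustration of the deduplicate-and-sort step, the sketch below builds the same bytes by hand. It uses SplayTreeSet from dart:collection to get unique, ascending elements; encodeSetU32 is an illustrative name, and the one-byte prefix assumes fewer than 64 unique values.

```dart
import 'dart:collection';
import 'dart:typed_data';

/// Encodes a BTreeSet<u32>: compact count, then unique elements ascending.
/// Sketch: fewer than 64 unique values only.
Uint8List encodeSetU32(Iterable<int> values) {
  final sorted = SplayTreeSet<int>.of(values); // drops duplicates, sorts
  if (sorted.length >= 64) {
    throw ArgumentError('this sketch handles fewer than 64 elements');
  }
  final out = BytesBuilder();
  out.addByte(sorted.length << 2); // compact count, single-byte mode
  for (final v in sorted) {
    final bytes = ByteData(4)..setUint32(0, v, Endian.little);
    out.add(bytes.buffer.asUint8List());
  }
  return out.toBytes();
}

void main() {
  print(encodeSetU32([3, 1, 2, 2, 1]));
  // [12, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0]
}
```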

BitSequence

A BitSequence stores a boolean array compactly, one bit per value:

Basic BitSequence

final codec = BitSequenceCodec();

// Boolean array
final bits = [true, false, true, true, false, false, true, false];
final encoded = codec.encode(bits);

// Decodes back to original
final decoded = codec.decode(Input.fromBytes(encoded));
print(decoded); // [true, false, true, true, false, false, true, false]

Encoding Structure

BitSequence packs 8 bools into 1 byte:

// [true, false, true, true, false, false, true, false]
// Packed as: 01001101 (LSB-first: the first bool lands in the lowest bit)
// Plus compact length prefix (the number of bits)
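The packing can be worked through by hand. The sketch below assumes a u8 store with least-significant-bit-first ordering and a compact prefix that counts bits; whether this matches the default configuration of BitSequenceCodec is an assumption to verify against the library. The name encodeBitsLsb0 is hypothetical, and the one-byte prefix limits the sketch to fewer than 64 bits.

```dart
import 'dart:typed_data';

/// Packs bools into bytes, LSB first, behind a compact bit-count prefix.
/// Sketch: fewer than 64 bits, so the prefix is a single byte.
Uint8List encodeBitsLsb0(List<bool> bits) {
  if (bits.length >= 64) {
    throw ArgumentError('this sketch handles fewer than 64 bits');
  }
  final packed = Uint8List((bits.length + 7) ~/ 8);
  for (var i = 0; i < bits.length; i++) {
    if (bits[i]) {
      // Bit i goes to byte i ~/ 8, at position i % 8 counted from the LSB.
      packed[i ~/ 8] |= 1 << (i % 8);
    }
  }
  return Uint8List.fromList([bits.length << 2, ...packed]);
}

void main() {
  final bits = [true, false, true, true, false, false, true, false];
  print(encodeBitsLsb0(bits)); // [32, 77]
  // 32 = compact(8 bits); 77 = 0b01001101
}
```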

When to Use BitSequence

  • Large Boolean Arrays: 1000 bools pack into 125 bytes plus a 2-byte length prefix (vs. roughly 1000 bytes at one byte per bool)
  • Flags and Permissions: User permission bitfields
  • Sparse Matrices: Boolean sparse matrices
  • Bloom Filters: Compact probabilistic data structures

Complete Example: Product Catalog

Real-world example using multiple collection types:

import 'package:polkadart_scale_codec/polkadart_scale_codec.dart';

class Product {
  final int id;
  final String name;
  final BigInt price;
  final List<String> tags;
  final bool inStock;

  Product(this.id, this.name, this.price, this.tags, this.inStock);
}

class ProductCodec with Codec<Product> {
  const ProductCodec();

  @override
  void encodeTo(Product value, Output output) {
    U32Codec.codec.encodeTo(value.id, output);
    StrCodec.codec.encodeTo(value.name, output);
    CompactBigIntCodec.codec.encodeTo(value.price, output);
    SequenceCodec(StrCodec.codec).encodeTo(value.tags, output);
    BoolCodec.codec.encodeTo(value.inStock, output);
  }

  @override
  Product decode(Input input) {
    return Product(
      U32Codec.codec.decode(input),
      StrCodec.codec.decode(input),
      CompactBigIntCodec.codec.decode(input),
      SequenceCodec(StrCodec.codec).decode(input),
      BoolCodec.codec.decode(input),
    );
  }

  @override
  bool isSizeZero() => false;
}

void main() {
  // Catalog: Map<category, Vec<Product>>
  final catalogCodec = BTreeMapCodec(
    keyCodec: StrCodec.codec,
    valueCodec: SequenceCodec(ProductCodec()),
  );

  final catalog = {
    'electronics': [
      Product(1, 'Laptop', BigInt.from(999), ['tech', 'portable'], true),
      Product(2, 'Phone', BigInt.from(699), ['tech', 'mobile'], true),
    ],
    'books': [
      Product(3, 'SCALE Guide', BigInt.from(29), ['tech', 'education'], false),
    ],
  };

  // Encode
  final encoded = catalogCodec.encode(catalog);
  print('Encoded size: ${encoded.length} bytes');

  // Decode
  final decoded = catalogCodec.decode(Input.fromBytes(encoded));
  print('Categories: ${decoded.keys.length}');
}

Performance Considerations

Size Comparison

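Type         | Example    | Regular                   | Optimized | Savings
Vec<u8>      | [1, 2, 3]  | 13 bytes (as Vec<u32>)    | 4 bytes   | 69%
BitSequence  | 8 bools    | 8 bytes (1 byte per bool) | 2 bytes   | 75%
[u8; 32]     | Hash       | 33 bytes (as Vec<u8>)     | 32 bytes  | 3%
BTreeMap     | 3 entries  | Variable                  | Sorted    | Better lookup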
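The byte counts in the table follow from simple arithmetic: a compact length prefix plus the element bytes. A small sketch, with hypothetical helper names (compactLen, vecSize, arraySize, bitSeqSize) and assuming lengths below 2^30:

```dart
/// Size in bytes of a compact-encoded length (sketch: below 2^30).
int compactLen(int n) {
  if (n < 0 || n >= 1 << 30) {
    throw ArgumentError('length out of range for this sketch: $n');
  }
  if (n < 1 << 6) return 1;
  if (n < 1 << 14) return 2;
  return 4;
}

/// Sequence: length prefix plus fixed-size elements.
int vecSize(int count, int elemBytes) => compactLen(count) + count * elemBytes;

/// Array: elements only.
int arraySize(int count, int elemBytes) => count * elemBytes;

/// Bit sequence: prefix counts bits, payload is rounded up to whole bytes.
int bitSeqSize(int bitCount) => compactLen(bitCount) + (bitCount + 7) ~/ 8;

void main() {
  print(vecSize(3, 4)); // 13: [1, 2, 3] as Vec<u32>
  print(vecSize(3, 1)); // 4: [1, 2, 3] as Vec<u8>
  print(bitSeqSize(8)); // 2: 8 bools as a bit sequence
  print(vecSize(32, 1)); // 33: 32-byte hash as Vec<u8>
  print(arraySize(32, 1)); // 32: the same hash as [u8; 32]
  print(bitSeqSize(1000)); // 127: 1000 bools, including a 2-byte prefix
}
```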

Best Practices

Choose the Right Collection

  • Use Array for fixed-size data
  • Use Sequence for dynamic lists
  • Use BTreeMap for key-value lookups
  • Use BitSequence for large boolean arrays

Pre-allocate When Possible

final codec = SequenceCodec(U32Codec.codec);
final values = List.generate(1000, (i) => i);
final expectedSize = codec.sizeHint(values);
// Pre-allocate buffer if needed

Avoid Deep Nesting

Deeply nested collections can be slow to encode/decode:

// Avoid: Vec<Vec<Vec<Vec<u32>>>>
// Prefer: Flatten when possible

Use Byte Arrays for Raw Data

// Prefer U8ArrayCodec for fixed-size raw bytes
final codec = U8ArrayCodec(32);
// Over SequenceCodec for known sizes
