Skip to content

Variable-Length Integer Encoding

Your task is to implement a Variable-length integer encoding compression technique that uses fewer bytes to represent smaller numbers and more bytes for larger numbers. The encoding uses a stop-bit scheme where each byte stores 7 bits of actual data, with the most significant bit (MSB) acting as a continuation flag:

  • MSB = 0: More bytes follow (continuation byte)
  • MSB = 1: This is the last byte (stop byte)

Examples

ValueEncoding
0[0x80]
127[0xFF]
128[0x01, 0x80]
16256[0x7F, 0x80]
19730[0x01, 0x1A, 0x92]

Size Comparison Examples

ValueBinaryFixed 32-bitVariable-LengthSavings
004 bytes1 byte75%
12711111114 bytes1 byte75%
128100000004 bytes2 bytes50%
16383111111111111114 bytes2 bytes50%
163841000000000000004 bytes3 bytes25%
197301001101001100104 bytes3 bytes25%

Common Patterns

Single Byte Values (0-127)

  • Range: 0 to 127
  • Pattern: Only the stop-bit is set (0x80 to 0xFF)
  • Examples: 0→0x80, 1→0x81, 127→0xFF

Two Byte Values (128-16383)

  • Range: 128 to 16383
  • Pattern: First byte has MSB=0, second byte has MSB=1
  • Examples: 128→[0x01,0x80], 255→[0x01,0xFF]

Three Byte Values (16384+)

  • Range: 16384 and above
  • Pattern: First two bytes have MSB=0, last byte has MSB=1
  • Examples: 16384→[0x01,0x00,0x80], 19730→[0x01,0x1A,0x92]

Implement the variable-length integer encoder and decoder using a stop-bit scheme descrived above.

cpp
#include <cstdint>
#include <stdexcept>
#include <type_traits>

// TODO: Implement variable-length integer encoding and decoding
//
// Requirements:
// 1. Each byte stores 7 bits of data with MSB as stop-bit
// 2. Stop-bit (MSB) is set only in the last byte
// 3. Functions should only work with unsigned integer types
// 4. Use static_assert to ensure unsigned types only
// 5. Proper error handling for invalid inputs
//
// Key Methods to Implement:
// - encode(value, buffer, buffer_size): Encode unsigned integer to bytes
// - decode(buffer, length): Decode bytes back to unsigned integer
//
// Implementation Strategy:
// - Use static_assert with std::is_unsigned to restrict types
// - Calculate required bytes using std::bit_width (C++20) or manual calculation
// - Extract 7-bit chunks using bitwise operations
// - Set stop-bit (0x80) on the last byte
// - Validate stop-bits during decoding

// Constants for encoding
static constexpr auto PAYLOAD_BITS_PER_BYTE = 7;
static constexpr auto MSB_MASK = 1 << PAYLOAD_BITS_PER_BYTE;  // 0x80
static constexpr auto ENCODING_MASK = MSB_MASK - 1;           // 0x7F

/**
 * @brief Encodes an unsigned integer into a variable-length byte stream.
 * 
 * Each byte stores 7 bits of the original value, with the MSB acting as a stop-bit.
 * The stop-bit is set only in the last byte of the sequence.
 * 
 * @tparam UInt An unsigned integral type
 * @param value The unsigned integer value to encode
 * @param buffer Pointer to the output buffer
 * @param buffer_size Size of the output buffer in bytes
 * @return Number of bytes used in the encoded output, or 0 if buffer is too small
 */
template <typename UInt>
std::size_t encode(UInt value, char* buffer, std::size_t buffer_size) {
    static_assert(std::is_unsigned_v<UInt>, "UInt must be an unsigned integral type");

    return 0;
}

/**
 * @brief Decodes an unsigned integer from a variable-length encoded byte stream.
 * 
 * Reconstructs the original value from the encoded bytes, validating the stop-bit scheme.
 * 
 * @tparam UInt An unsigned integral type
 * @param buffer Pointer to the input buffer containing encoded bytes
 * @param length Number of bytes in the encoded input
 * @return The decoded unsigned integer
 * @throws std::runtime_error if encoding is invalid
 */
template <typename UInt>
UInt decode(const char* buffer, std::size_t length) {
    static_assert(std::is_unsigned_v<UInt>, "UInt must be an unsigned integral type");

    return 0;
}