ReadSource Concept Design

Overview

This document describes the design of the ReadSource concept: a refinement of ReadStream that adds a complete-read primitive. It explains how ReadSource relates to ReadStream, why the refinement hierarchy mirrors the write side, and the use cases each serves.

Definition

template<typename T>
concept ReadSource =
    ReadStream<T> &&
    requires(T& source, mutable_buffer_archetype buffers)
    {
        { source.read(buffers) } -> IoAwaitable;
        requires awaitable_decomposes_to<
            decltype(source.read(buffers)),
            std::error_code, std::size_t>;
    };

ReadSource refines ReadStream. Every ReadSource is a ReadStream. A ReadSource provides two operations:

read_some(buffers) — Partial Read (inherited from ReadStream)

Reads one or more bytes from the source into the buffer sequence. Returns (error_code, std::size_t), written (ec, n) below, where n is the number of bytes read. May return fewer bytes than the buffer can hold.

Semantics

  • On success: !ec, n >= 1 and n <= buffer_size(buffers).

  • On EOF: ec == cond::eof, n == 0.

  • On error: ec, n == 0.

  • If buffer_empty(buffers): completes immediately, !ec, n == 0.

Once read_some returns an error (including EOF), the caller must not call read_some again. The stream is done. Not all implementations can reproduce a prior error on subsequent calls, so the behavior after an error is undefined.

read(buffers) — Complete Read

Reads data into the buffer sequence. Either fills the entire buffer or returns an error. Returns (error_code, std::size_t), written (ec, n) below, where n is the number of bytes read.

Semantics

  • On success: !ec, n == buffer_size(buffers). The buffer is completely filled.

  • On EOF: ec == cond::eof, n is the number of bytes read before EOF was reached (may be less than buffer_size(buffers)).

  • On error: ec, n is the number of bytes read before the error.

  • If buffer_empty(buffers): completes immediately, !ec, n == 0.

Successful partial reads are not permitted. Either the entire buffer is filled, or the operation returns with an error. This is the defining property of a complete-read primitive.

Once read returns an error (including EOF), the caller must not call read or read_some again. The source is done. Not all implementations can reproduce a prior error on subsequent calls, so the behavior after an error is undefined.

When the buffer sequence contains multiple buffers, each buffer is filled completely before proceeding to the next.

Buffer Lifetime

The caller must ensure that the memory referenced by buffers remains valid until the co_await expression returns.

Conforming Signatures

template<MutableBufferSequence Buffers>
IoAwaitable auto read_some(Buffers buffers);

template<MutableBufferSequence Buffers>
IoAwaitable auto read(Buffers buffers);

Concept Hierarchy

ReadStream    { read_some }
    |
    v
ReadSource    { read_some, read }

This mirrors the write side:

WriteStream   { write_some }
    |
    v
WriteSink     { write_some, write, write_eof(buffers), write_eof() }

Algorithms constrained on ReadStream accept both raw streams and sources. Algorithms that need the complete-read guarantee constrain on ReadSource.

Why ReadSource Refines ReadStream

Every concrete ReadSource type has a natural read_some:

  • HTTP content-length body: read_some returns min(available_from_network, remaining_content_length) bytes. It is the underlying stream’s read_some capped by the body’s limit.

  • HTTP chunked body: read_some delivers whatever unchunked data is available from the current chunk.

  • Decompression source (inflate, zstd): read_some does one decompression pass — feeds available compressed input to the decompressor and returns whatever output is produced. This is how zlib::inflate() naturally works.

  • File source: read_some is a single read() syscall. It is the OS primitive.

  • Memory source: read_some returns min(requested, remaining).

No concrete source type lacks a meaningful read_some. The claim that "many sources can’t meaningfully offer read_some" does not hold up under scrutiny.

The Relay Argument

If ReadSource were disjoint from ReadStream, generic relay code would need two separate implementations:

// One for ReadStream sources
template<ReadStream Src, WriteSink Dest>
task<> relay(Src& src, Dest& dest);

// A different one for ReadSource sources
template<ReadSource Src, WriteSink Dest>
task<> relay(Src& src, Dest& dest);

With the refinement relationship, one function handles both:

// Works for TCP sockets, HTTP bodies, decompressors, files
template<ReadStream Src, WriteSink Dest>
task<> relay(Src& src, Dest& dest);

This is the same argument that justified WriteSink refining WriteStream.

The Latency Argument

With only read (complete read), a relay must wait for the entire buffer to fill before forwarding any data:

// Must fill 64KB before sending -- high latency
auto [ec, n] = co_await src.read(mutable_buffer(buf, 65536));
co_await dest.write_some(const_buffer(buf, n));

With read_some, data is forwarded as it becomes available:

// Returns with 1KB if that's what's available -- low latency
auto [ec, n] = co_await src.read_some(mutable_buffer(buf, 65536));
co_await dest.write_some(const_buffer(buf, n));

For a decompressor backed by a slow network connection, read_some lets you decompress and forward whatever is available instead of blocking until the entire buffer is filled.

Member Function Comparison

read_some                                        read
-----------------------------------------------  --------------------------------------------------
Returns whatever is available (at least 1 byte)  Fills the entire buffer or errors
Low latency: forward data immediately            Higher latency: waits for full buffer
Caller loops for complete reads                  Source guarantees completeness
Natural for relays and streaming                 Natural for fixed-size records and structured data

Composed Algorithms

read(source, dynamic_buffer) — Read Until EOF

auto read(ReadSource auto& source,
          DynamicBufferParam auto&& buffers,
          std::size_t initial_amount = 2048)
    -> io_task<std::size_t>;

Reads from the source into a dynamic buffer until EOF. Each time the buffer fills, it grows by a factor of 1.5. Reaching EOF is the success condition: ec is clear and n is the total number of bytes read.

This is the ReadSource equivalent of the ReadStream overload. Both use the same read free function name, distinguished by concept constraints.

Use Cases

Reading an HTTP Body

An HTTP body with a known content length is a ReadSource. The caller reads into a buffer, and the source ensures exactly the right number of bytes are delivered.

template<ReadSource Source>
task<std::string> read_body(Source& body, std::size_t content_length)
{
    std::string result(content_length, '\0');
    auto [ec, n] = co_await body.read(
        mutable_buffer(result.data(), result.size()));
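    // On success n == content_length. On EOF or error n is the number
    // of bytes that arrived, so trimming to n is correct either way.
    result.resize(n);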
    co_return result;
}

Reading into a Dynamic Buffer

When the body size is unknown (e.g., chunked encoding), read until EOF using the dynamic buffer overload.

template<ReadSource Source>
task<std::string> read_chunked_body(Source& body)
{
    std::string result;
    auto [ec, n] = co_await read(
        body, string_dynamic_buffer(&result));
    if(ec)
        co_return {};
    co_return result;
}

Reading Fixed-Size Records from a Source

When a source produces structured records of known size, read guarantees each record is completely filled.

struct record
{
    uint32_t id;
    char data[256];
};

template<ReadSource Source>
task<> process_records(Source& source)
{
    for(;;)
    {
        record rec;
        auto [ec, n] = co_await source.read(
            mutable_buffer(&rec, sizeof(rec)));
        if(ec == cond::eof)
            co_return; // n > 0 here means a truncated final record
        if(ec)
            co_return; // I/O error: rec is only partially filled

        handle_record(rec);
    }
}

Decompression with Low-Latency Relay

A decompression source wraps a ReadStream and produces decompressed data. Using read_some (inherited from ReadStream), a relay can forward decompressed data as it becomes available instead of waiting for a full buffer.

template<ReadSource Source, WriteSink Sink>
task<> relay_decompressed(Source& inflater, Sink& dest)
{
    char buf[8192];
    for(;;)
    {
        // read_some: decompress whatever is available
        auto [ec, n] = co_await inflater.read_some(
            mutable_buffer(buf));
        if(ec == cond::eof)
        {
            auto [wec] = co_await dest.write_eof();
            co_return;
        }
        if(ec)
            co_return;

        auto [wec, nw] = co_await dest.write(
            const_buffer(buf, n));
        if(wec)
            co_return;
    }
}

Relaying from ReadSource to WriteSink

When connecting a source to a sink, read_some provides low-latency forwarding. Because read_some reports EOF with n == 0, no final chunk is left to combine with the EOF signal, so the relay ends the stream with the no-argument write_eof().

template<ReadStream Src, WriteSink Sink>
task<> relay(Src& src, Sink& dest)
{
    char buf[8192];
    for(;;)
    {
        auto [ec, n] = co_await src.read_some(
            mutable_buffer(buf));
        if(ec == cond::eof)
        {
            auto [wec] = co_await dest.write_eof();
            co_return;
        }
        if(ec)
            co_return;

        auto [wec, nw] = co_await dest.write(
            const_buffer(buf, n));
        if(wec)
            co_return;
    }
}

Because ReadSource refines ReadStream, this relay accepts ReadSource types (HTTP bodies, decompressors, files) as well as raw ReadStream types (TCP sockets, TLS streams).

Type-Erased Source

The any_read_source wrapper type-erases a ReadSource behind a virtual interface. This is useful when the concrete source type is not known at compile time.

task<> handle_request(any_read_source& body)
{
    // Works for content-length, chunked,
    // compressed, or any other source type
    std::string data;
    auto [ec, n] = co_await read(
        body, string_dynamic_buffer(&data));
    if(ec)
        co_return;

    process_request(data);
}

Conforming Types

Examples of types that satisfy ReadSource:

  • HTTP content-length body: read_some returns available bytes capped by remaining length. read fills the buffer, enforcing the content length limit.

  • HTTP chunked body: read_some delivers available unchunked data. read decodes chunk framing and fills the buffer.

  • Decompression source (inflate, zstd): read_some does one decompression pass. read loops decompression until the buffer is filled.

  • File source: read_some is a single read() syscall. read loops until the buffer is filled or EOF.

  • Memory source: read_some returns available bytes. read fills the buffer from the memory region.

Why read_some Returns No Data on EOF

The read_some contract (inherited from ReadStream) requires that when ec == cond::eof, n is always 0. Data and EOF are delivered in separate calls. See ReadStream: Why Errors Exclude Data for the full rationale. The key points:

  • The clean trichotomy (success/EOF/error, where data implies success) eliminates an entire class of bugs where callers accidentally drop the final bytes of a stream.

  • Write-side atomicity (write_eof(buffers)) serves correctness for protocol framing. Read-side piggybacking would be a minor optimization with significant API cost.

  • Every concrete source type naturally separates its last data delivery from its EOF indication.

  • POSIX read() follows the same model.

This contract carries over to ReadSource unchanged. The read member function (complete read) does allow n > 0 on EOF, because it is a composed loop that accumulates data across multiple internal read_some calls. When the underlying stream signals EOF mid-accumulation, discarding the bytes already gathered would be wrong. The caller needs n to know how much valid data landed in the buffer.

Summary

ReadSource refines ReadStream by adding read for complete-read semantics. The refinement relationship has three consequences:

  • Generic algorithms constrained on ReadStream work on both raw streams and sources.

  • read_some provides low-latency forwarding in relays.

  • read provides the complete-fill guarantee for structured data.

Function                       Contract                                                         Use Case
-----------------------------  ---------------------------------------------------------------  -----------------------------------------------------------
ReadSource::read_some          Returns one or more bytes. May fill less than the buffer.        Relays, low-latency forwarding, incremental processing.
ReadSource::read               Fills the entire buffer or returns an error with partial count.  HTTP bodies, decompression, file I/O, structured records.
read composed (on ReadStream)  Loops read_some until the buffer is filled.                      Fixed-size headers, known-length messages over raw streams.
read composed (on ReadSource)  Loops read into a dynamic buffer until EOF.                      Slurping an entire body of unknown size.