Streaming
Receive AI responses in real-time chunks rather than waiting for the complete response, improving user experience for long interactions.
Streaming allows your application to display AI responses as they’re being generated, creating a more responsive and engaging user experience.
Basic Streaming
The simplest way to use streaming is with the `respondStreamed` method:
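A minimal sketch of what that might look like. The agent name `MyAgent`, the `for()` chat-key constructor, and the `getLastChunk()` accessor are illustrative assumptions, not confirmed API; only `respondStreamed` and `StreamedAssistantMessage` come from this page.

```php
// Sketch only: class and method names other than respondStreamed /
// StreamedAssistantMessage are assumptions.
$stream = MyAgent::for('chat-session-1')
    ->respondStreamed('Write a short poem about Laravel');

foreach ($stream as $chunk) {
    if ($chunk instanceof StreamedAssistantMessage) {
        // Output each text chunk as it arrives (accessor name assumed).
        echo $chunk->getLastChunk();
        flush();
    }
}
```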
Streaming with Callback
You can also provide a callback function to process each chunk:
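A hedged sketch of the callback style, assuming the callback is passed as a second argument to `respondStreamed` (the exact signature may differ). Note that, per the best practices below, the returned stream still needs to be consumed:

```php
// Callback position and accessor name are assumptions.
$stream = MyAgent::for('chat-session-1')->respondStreamed(
    'Tell me a story',
    function ($chunk) {
        if ($chunk instanceof StreamedAssistantMessage) {
            echo $chunk->getLastChunk();
            flush();
        }
    }
);

// The stream must still be fully consumed, even with a callback.
foreach ($stream as $_) {
    // Chunks are already handled inside the callback.
}
```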
Understanding Stream Chunks
The stream can yield three types of chunks:
- `StreamedAssistantMessage`: regular text content chunks from the AI assistant
- `ToolCallMessage`: tool call messages (handled internally by LarAgent)
- `Array`: the final chunk when structured output is enabled
For most use cases, you only need to handle `StreamedAssistantMessage` chunks as shown in the examples above. Tool calls are processed automatically by LarAgent.
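The three chunk types above can be dispatched with a simple type check. This is a sketch; the branch bodies are placeholders for your own handling:

```php
foreach ($stream as $chunk) {
    if ($chunk instanceof StreamedAssistantMessage) {
        // Incremental text from the assistant.
        echo $chunk->getLastChunk(); // accessor name assumed
    } elseif (is_array($chunk)) {
        // Final chunk: structured data (only when structured output is enabled).
        $structured = $chunk;
    }
    // ToolCallMessage chunks are handled internally by LarAgent,
    // so they normally need no handling here.
}
```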
Laravel HTTP Streaming
For Laravel applications, LarAgent provides the `streamResponse` method, which returns a Laravel `StreamedResponse`, making it easy to integrate with your controllers:
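Because `streamResponse` returns a `StreamedResponse`, it can be returned directly from a route or controller action. The route path and agent name below are illustrative assumptions:

```php
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Route;

// Route path and agent name are assumptions for illustration.
Route::post('/chat/stream', function (Request $request) {
    return MyAgent::for((string) $request->user()->id)
        ->streamResponse($request->input('message'));
});
```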
The `streamResponse` method supports three formats:
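This page does not name the three formats at this point; assuming the format is selected by an extra argument, a call might look like the sketch below. Both the argument position and the identifier `'sse'` are assumptions to verify against the LarAgent API:

```php
// Hypothetical: format identifier and argument position are assumptions.
return MyAgent::for('chat-session-1')
    ->streamResponse($message, 'sse');
```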
Frontend implementation (JavaScript):
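A minimal browser-side sketch that reads the streamed response body incrementally with the Fetch API. The endpoint path `/chat/stream` and the `#output` element are assumptions:

```javascript
// Endpoint path and DOM selector are assumptions for illustration.
async function streamChat(message) {
  const response = await fetch('/chat/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Append each decoded chunk as it arrives.
    document.querySelector('#output').textContent +=
      decoder.decode(value, { stream: true });
  }
}
```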
Example output:
Frontend implementation:
Example output:
Frontend implementation:
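For the server-sent events format, the browser's built-in `EventSource` can consume the stream. Note that `EventSource` only issues GET requests, so the message is passed as a query parameter here; the endpoint path is an assumption:

```javascript
// Endpoint path and DOM selector are assumptions; EventSource uses GET only.
const source = new EventSource(
  '/chat/stream?message=' + encodeURIComponent('Hello')
);

source.onmessage = (event) => {
  document.querySelector('#output').textContent += event.data;
};

// Close the connection on error to stop automatic reconnection attempts.
source.onerror = () => source.close();
```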
Streaming with Structured Output
When using structured output with streaming, you’ll receive text chunks during generation, and the final structured data at the end:
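Combining the chunk types above, a hedged sketch of that flow (the accessor name is assumed, as before):

```php
$stream = MyAgent::for('chat-session-1')
    ->respondStreamed('Summarize this report');

$structured = null;

foreach ($stream as $chunk) {
    if ($chunk instanceof StreamedAssistantMessage) {
        // Text chunks arrive during generation.
        echo $chunk->getLastChunk(); // accessor name assumed
    } elseif (is_array($chunk)) {
        // The final chunk carries the structured data.
        $structured = $chunk;
    }
}
```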
When using SSE format with structured output, you’ll receive a special event:
Best Practices
Do use streaming for long responses to improve user experience
Do handle both text chunks and structured output appropriately
Do implement proper error handling in your streaming code
Don’t forget to consume the entire stream, even when using callbacks
Don’t rely on specific timing of chunks, as they can vary based on network conditions