# Streaming

> Receive AI responses in real-time chunks rather than waiting for the complete response, improving user experience for long interactions.

<Note>
  Streaming allows your application to display AI responses as they're being generated, creating a more responsive and engaging user experience.
</Note>

## Basic Streaming

The simplest way to use streaming is with the `respondStreamed` method:

```php theme={null}
$agent = WeatherAgent::for('user-123');
$stream = $agent->respondStreamed('What\'s the weather like in Boston and Los Angeles?');

foreach ($stream as $chunk) {
    if ($chunk instanceof \LarAgent\Messages\StreamedAssistantMessage) {
        echo $chunk->getLastChunk(); // Output each new piece of content
    }
}
```

## Streaming with Callback

You can also provide a callback function to process each chunk:

```php theme={null}
$agent = WeatherAgent::for('user-123');
$stream = $agent->respondStreamed(
    'What\'s the weather like in Boston and Los Angeles?',
    function ($chunk) {
        if ($chunk instanceof \LarAgent\Messages\StreamedAssistantMessage) {
            echo $chunk->getLastChunk();
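            // Flush output buffers to send content to the browser immediately.
            // ob_flush() raises a notice when no output buffer is active, so check first.
            if (ob_get_level() > 0) {
                ob_flush();
            }
            flush();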
        }
    }
);

// Consume the stream to ensure it completes
foreach ($stream as $_) {
    // The callback handles the output
}
```

## Understanding Stream Chunks

The stream can yield three types of chunks:

<CardGroup cols={2}>
  <Card title="StreamedAssistantMessage" icon="message">
    Regular text content chunks from the AI assistant
  </Card>

  <Card title="ToolCallMessage" icon="wrench">
    Tool call messages (handled internally by LarAgent)
  </Card>

  <Card title="Array" icon="brackets-curly">
    Final chunk when structured output is enabled
  </Card>
</CardGroup>

For most use cases, you only need to handle `StreamedAssistantMessage` chunks as shown in the examples above. Tool calls are processed automatically by LarAgent.

## Laravel HTTP Streaming

For Laravel applications, LarAgent provides the `streamResponse` method that returns a Laravel `StreamedResponse`, making it easy to integrate with your controllers:

```php theme={null}
// In a controller
public function chat(Request $request)
{
    $message = $request->input('message');
    $agent = WeatherAgent::for(auth()->id());
    
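    // Pick the stream format from the request; fall back to plain text
    $format = $request->input('format', 'plain');
    if (! in_array($format, ['plain', 'json', 'sse'], true)) {
        $format = 'plain';
    }

    // Return a streamable response
    return $agent->streamResponse($message, $format);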
}
```

The `streamResponse` method supports three formats:

* Plain Text
* JSON
* Server-Sent Events (SSE)

<Tabs>
  <Tab title="Plain Text">
    ```php theme={null}
    // Simple text output
    return $agent->streamResponse($message, 'plain');
    ```

    Frontend implementation (JavaScript):

    ```javascript theme={null}
    fetch('/chat?message=' + encodeURIComponent('What is the weather in Boston?'))
      .then(response => {
        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        
        function read() {
          return reader.read().then(({ done, value }) => {
            if (done) return;
            
            // stream: true keeps multi-byte characters intact across chunk boundaries
            const text = decoder.decode(value, { stream: true });
            document.getElementById('output').textContent += text;
            
            return read();
          });
        }
        
        return read();
      });
    ```
  </Tab>

  <Tab title="JSON">
    ```php theme={null}
    // Structured JSON with delta and content
    return $agent->streamResponse($message, 'json');
    ```

    Example output:

    ```json theme={null}
    {"delta":"Hello","content":"Hello"}
    {"delta":" there","content":"Hello there"}
    {"delta":"!","content":"Hello there!"}
    ```

    Frontend implementation:

    ```javascript theme={null}
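    fetch('/chat?message=' + encodeURIComponent('Greet me') + '&format=json')
      .then(response => {
        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        let buffer = ''; // holds a partial line until the rest arrives

        function read() {
          return reader.read().then(({ done, value }) => {
            // A network chunk can end mid-line, so buffer and parse whole lines only
            if (!done) buffer += decoder.decode(value, { stream: true });
            const lines = buffer.split('\n');
            // Keep the unfinished last line unless the stream has ended
            buffer = done ? '' : lines.pop();

            lines.filter(line => line.trim()).forEach(line => {
              try {
                const data = JSON.parse(line);
                document.getElementById('output').textContent = data.content;
              } catch (e) {
                console.error('Error parsing JSON:', e);
              }
            });

            if (!done) return read();
          });
        }

        return read();
      });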
    ```
  </Tab>

  <Tab title="Server-Sent Events">
    ```php theme={null}
    // Server-Sent Events format with event types
    return $agent->streamResponse($message, 'sse');
    ```

    Example output:

    ```
    event: content
    data: {"delta":"Hello","content":"Hello"}

    event: content
    data: {"delta":" there","content":"Hello there"}

    event: content
    data: {"delta":"!","content":"Hello there!"}

    event: complete
    data: {"content":"Hello there!"}
    ```

    Frontend implementation:

    ```javascript theme={null}
    const eventSource = new EventSource('/chat?message=' + encodeURIComponent('Greet me') + '&format=sse');

    eventSource.addEventListener('content', function(e) {
      const data = JSON.parse(e.data);
      document.getElementById('output').textContent = data.content;
    });

    eventSource.addEventListener('complete', function(e) {
      eventSource.close();
    });

    eventSource.addEventListener('error', function(e) {
      console.error('EventSource error:', e);
      eventSource.close();
    });
    ```
  </Tab>
</Tabs>
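
`EventSource` only issues GET requests. If the SSE endpoint is called with `fetch` instead (for example, to send the message in a POST body), the frames have to be parsed by hand. The following is a minimal sketch of such a parser, assuming the frame layout shown in the Server-Sent Events tab (`event:` and `data:` lines, with a blank line ending each frame). `createSseParser` is a hypothetical helper and not part of LarAgent.

```javascript
// Hypothetical helper (not part of LarAgent): incremental parser for the
// "event:" / "data:" frames shown above. A blank line ends each frame.
function createSseParser() {
  let buffer = '';

  // Feed one decoded text chunk; returns the events completed by it.
  return function push(text) {
    // Normalize CRLF after concatenation so a split "\r" + "\n" is handled too.
    buffer = (buffer + text).replace(/\r\n/g, '\n');
    const frames = buffer.split('\n\n');
    // The last element is an unfinished frame (or ''); keep it for later.
    buffer = frames.pop();

    return frames.flatMap(frame => {
      let event = 'message'; // SSE default when no "event:" line is sent
      const data = [];
      for (const line of frame.split('\n')) {
        if (line.startsWith('event:')) event = line.slice(6).trim();
        else if (line.startsWith('data:')) data.push(line.slice(5).trim());
      }
      // Skip frames without data (for example comment-only keep-alives).
      return data.length ? [{ event, data: JSON.parse(data.join('\n')) }] : [];
    });
  };
}

// Usage: a frame split across two network chunks is emitted only once whole.
const push = createSseParser();
const a = push('event: content\ndata: {"delta":"Hel');
const b = push('lo","content":"Hello"}\n\nevent: complete\ndata: {"content":"Hello"}\n\n');
console.log(a.length, b.map(e => e.event).join(',')); // prints: 0 content,complete
```

In a `fetch` reader loop, each chunk decoded with `decoder.decode(value, { stream: true })` would be passed to `push`, and the returned events handled the same way as the `EventSource` listeners above.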

***

## Streaming with Structured Output

When using structured output with streaming, you'll receive text chunks during generation, and the final structured data at the end:

```php theme={null}
$agent = ProfileAgent::for('user-123');
$stream = $agent->respondStreamed('Generate a profile for John Doe');

$finalStructuredData = null;

foreach ($stream as $chunk) {
    if ($chunk instanceof \LarAgent\Messages\StreamedAssistantMessage) {
        echo $chunk->getLastChunk(); // Part of JSON
    } elseif (is_array($chunk)) {
        // This is the final structured data
        $finalStructuredData = $chunk;
    }
}

// $finalStructuredData is now an array containing the structured output
// For example: ['name' => 'John Doe', 'age' => 30, 'interests' => [...]]
```

When using SSE format with structured output, you'll receive a special event:

```
event: structured
data: {"type":"structured","delta":"","content":{"name":"John Doe","age":30,"interests":["coding","reading","hiking"]},"complete":true}

event: complete
data: {"content":{"name":"John Doe","age":30,"interests":["coding","reading","hiking"]}}
```
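
On the frontend, the `structured` event can be handled alongside `content` and `complete`. The sketch below folds the events into a single state object. The payload shapes come from the example output above; the `applyStreamEvent` name, the state fields, and the partial-JSON `content` event in the usage example are illustrative assumptions.

```javascript
// Hypothetical reducer (not part of LarAgent): folds SSE events into state.
function applyStreamEvent(state, event, payload) {
  switch (event) {
    case 'content':
      // "content" carries the full text accumulated so far.
      return { ...state, text: payload.content };
    case 'structured':
      // "structured" carries the parsed object instead of text.
      return { ...state, structured: payload.content };
    case 'complete':
      return { ...state, done: true };
    default:
      return state; // ignore unknown event types
  }
}

// Usage: replay an event sequence; the first event's payload is assumed.
const events = [
  ['content', { delta: '{"name":', content: '{"name":' }],
  ['structured', { type: 'structured', delta: '', content: { name: 'John Doe', age: 30 }, complete: true }],
  ['complete', { content: { name: 'John Doe', age: 30 } }],
];
const finalState = events.reduce(
  (state, [event, payload]) => applyStreamEvent(state, event, payload),
  { text: '', structured: null, done: false }
);
console.log(finalState.structured.name, finalState.done); // prints: John Doe true
```

With `EventSource`, each listener would call `applyStreamEvent(state, type, JSON.parse(e.data))` and re-render from the returned state, which keeps the partial JSON text separate from the final parsed object.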

## Best Practices

<Check>
  **Do** use streaming for long responses to improve user experience
</Check>

<Check>
  **Do** handle both text chunks and structured output appropriately
</Check>

<Check>
  **Do** implement proper error handling in your streaming code
</Check>

<Icon icon="x" iconType="solid" color="red" /> **Don't** forget to consume the entire stream, even when using callbacks

<Icon icon="x" iconType="solid" color="red" /> **Don't** rely on specific timing of chunks, as they can vary based on network conditions
