Streaming
Receive AI responses in real-time chunks rather than waiting for the complete response, improving user experience for long interactions.
Streaming allows your application to display AI responses as they’re being generated, creating a more responsive and engaging user experience.
Basic Streaming
The simplest way to use streaming is with the `respondStreamed` method:
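A minimal sketch of what that might look like. The agent name `MyAgent`, the `for()` chat-key constructor, and the `getLastChunk()` accessor are illustrative assumptions, not confirmed API; only `respondStreamed` and `StreamedAssistantMessage` come from this page.

```php
// Sketch only: class and method names other than respondStreamed /
// StreamedAssistantMessage are assumptions.
$stream = MyAgent::for('chat-session-1')
    ->respondStreamed('Write a short poem about Laravel');

foreach ($stream as $chunk) {
    if ($chunk instanceof StreamedAssistantMessage) {
        // Output each text chunk as it arrives (accessor name assumed).
        echo $chunk->getLastChunk();
        flush();
    }
}
```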
Streaming with Callback
You can also provide a callback function to process each chunk:
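A hedged sketch of the callback style, assuming the callback is passed as a second argument to `respondStreamed` (the exact signature may differ). Note that, per the best practices below, the returned stream still needs to be consumed:

```php
// Callback position and accessor name are assumptions.
$stream = MyAgent::for('chat-session-1')->respondStreamed(
    'Tell me a story',
    function ($chunk) {
        if ($chunk instanceof StreamedAssistantMessage) {
            echo $chunk->getLastChunk();
            flush();
        }
    }
);

// The stream must still be fully consumed, even with a callback.
foreach ($stream as $_) {
    // Chunks are already handled inside the callback.
}
```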
Understanding Stream Chunks
The stream can yield three types of chunks:
- `StreamedAssistantMessage`: regular text content chunks from the AI assistant
- `ToolCallMessage`: tool call messages (handled internally by LarAgent)
- `Array`: the final chunk when structured output is enabled
For most use cases, you only need to handle `StreamedAssistantMessage` chunks as shown in the examples above. Tool calls are processed automatically by LarAgent.
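The three chunk types above can be dispatched with a simple type check. This is a sketch; the branch bodies are placeholders for your own handling:

```php
foreach ($stream as $chunk) {
    if ($chunk instanceof StreamedAssistantMessage) {
        // Incremental text from the assistant.
        echo $chunk->getLastChunk(); // accessor name assumed
    } elseif (is_array($chunk)) {
        // Final chunk: structured data (only when structured output is enabled).
        $structured = $chunk;
    }
    // ToolCallMessage chunks are handled internally by LarAgent,
    // so they normally need no handling here.
}
```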
Laravel HTTP Streaming
For Laravel applications, LarAgent provides the `streamResponse` method, which returns a Laravel `StreamedResponse`, making it easy to integrate with your controllers:
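Because `streamResponse` returns a `StreamedResponse`, it can be returned directly from a route or controller action. The route path and agent name below are illustrative assumptions:

```php
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Route;

// Route path and agent name are assumptions for illustration.
Route::post('/chat/stream', function (Request $request) {
    return MyAgent::for((string) $request->user()->id)
        ->streamResponse($request->input('message'));
});
```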
The `streamResponse` method supports three formats:
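This page does not name the three formats at this point; assuming the format is selected by an extra argument, a call might look like the sketch below. Both the argument position and the identifier `'sse'` are assumptions to verify against the LarAgent API:

```php
// Hypothetical: format identifier and argument position are assumptions.
return MyAgent::for('chat-session-1')
    ->streamResponse($message, 'sse');
```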
Frontend implementation (JavaScript):
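A minimal browser-side sketch that reads the streamed response body incrementally with the Fetch API. The endpoint path `/chat/stream` and the `#output` element are assumptions:

```javascript
// Endpoint path and DOM selector are assumptions for illustration.
async function streamChat(message) {
  const response = await fetch('/chat/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Append each decoded chunk as it arrives.
    document.querySelector('#output').textContent +=
      decoder.decode(value, { stream: true });
  }
}
```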
Example output:
Frontend implementation:
Example output:
Frontend implementation:
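For the server-sent events format, the browser's built-in `EventSource` can consume the stream. Note that `EventSource` only issues GET requests, so the message is passed as a query parameter here; the endpoint path is an assumption:

```javascript
// Endpoint path and DOM selector are assumptions; EventSource uses GET only.
const source = new EventSource(
  '/chat/stream?message=' + encodeURIComponent('Hello')
);

source.onmessage = (event) => {
  document.querySelector('#output').textContent += event.data;
};

// Close the connection on error to stop automatic reconnection attempts.
source.onerror = () => source.close();
```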
Streaming with Structured Output
When using structured output with streaming, you’ll receive text chunks during generation, and the final structured data at the end:
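Combining the chunk types above, a hedged sketch of that flow (the accessor name is assumed, as before):

```php
$stream = MyAgent::for('chat-session-1')
    ->respondStreamed('Summarize this report');

$structured = null;

foreach ($stream as $chunk) {
    if ($chunk instanceof StreamedAssistantMessage) {
        // Text chunks arrive during generation.
        echo $chunk->getLastChunk(); // accessor name assumed
    } elseif (is_array($chunk)) {
        // The final chunk carries the structured data.
        $structured = $chunk;
    }
}
```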
When using SSE format with structured output, you’ll receive a special event:
Best Practices
Do use streaming for long responses to improve user experience
Do handle both text chunks and structured output appropriately
Do implement proper error handling in your streaming code
Don’t forget to consume the entire stream, even when using callbacks
Don’t rely on specific timing of chunks, as they can vary based on network conditions