Building an AI Chatbot with Blazor Server and .NET 10

If you searched "build AI chatbot" in 2024, every result was Python + a JavaScript frontend. In 2026 that's no longer necessary. Microsoft.Extensions.AI and Blazor Server together give .NET developers a clean, end-to-end chatbot stack — with provider-agnostic models, native token streaming over SignalR, and SQL Server for conversation history. No Python in the path, no separate Node service, no copy-pasting JavaScript code into a Razor file.

This guide walks through the production-ready architecture we recommend on Adaptive Web Hosting: .NET 10 on real Windows + IIS, Blazor Server for the UI, IChatClient for the model, and SQL Server 2022 for session storage.

Stream: token-by-token UX

.NET 10: LTS runtime

FREE: SSL + WAF on every plan

The Blazor chatbot stack in 2026

Blazor Server

Reactive components rendered server-side. SignalR pipe stays open, perfect for streaming chat responses without WebSocket plumbing.

✅ AI Abstractions

Microsoft.Extensions.AI

Unified IChatClient across OpenAI, Anthropic, Azure OpenAI, Ollama, ONNX-local. Swap providers without rewriting your code.

✅ Storage

SQL Server 2022

Conversation history, user preferences, audit logs. Included on every Adaptive Web Hosting plan.

✅ Hosting

IIS on Windows

Real Windows Server with dedicated IIS Application Pools. SignalR + Blazor Server thrive on this — no serverless cold starts.

🟡 Provider

Model API

A GPT-4-class model from OpenAI, Anthropic, Azure OpenAI, or self-hosted via Ollama. Your choice — IChatClient abstracts the difference.

🟡 Optional

Content moderation

For public-facing chat, layer in OpenAI Moderation or Azure Content Safety before sending or storing user input.

Quick reference: the six components

  • The Blazor Server architecture

Blazor Server already keeps a SignalR connection open. Streaming chat is a natural fit — push tokens through the existing pipe, re-render the component. No separate WebSocket, no client-side state machine, no race conditions between fetch and render.

The chat page is a single Razor component that keeps the conversation in a private List<ChatMessage> field and re-renders as tokens arrive. SignalR handles transport. The connection lifetime matches the page lifetime, so reconnect logic is built in.

    @page "/chat"
    @using Microsoft.Extensions.AI
    @inject IChatClient ChatClient
    @inject IChatStore Store

    <div class="chat-window">
        @foreach (var msg in _messages)
        {
            <div class="msg @(msg.Role == ChatRole.User ? "user" : "assistant")">
                @msg.Text
            </div>
        }
        @if (_streaming)
        {
            <div class="msg assistant streaming">@_currentStream</div>
        }
    </div>

    <form @onsubmit="SendAsync">
        <input @bind="_input" disabled="@_streaming" />
        <button type="submit" disabled="@_streaming">Send</button>
    </form>

    @code {
        private readonly List<ChatMessage> _messages = new();
        // Assumed here: a per-page session ID, used later when saving history
        private readonly Guid _sessionId = Guid.NewGuid();
        private string _input = "";
        private string _currentStream = "";
        private bool _streaming = false;

        // ...
    }

Why a Developer plan handles 100+ concurrent chat users

Blazor Server is more efficient than people expect. Each connected user holds a circuit (~250 KB RAM + a SignalR connection). With 1 GB of RAM in a dedicated IIS Application Pool on the Adaptive Web Hosting ASP.NET Developer plan, you can comfortably support 100+ concurrent chatters on a small to mid-size app. For higher concurrency, the Business plan at 2 GB RAM gives you headroom.
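As a rough back-of-the-envelope check on that claim, the sketch below estimates circuit memory for a given number of users. The 250 KB baseline is the figure quoted above; the per-user chat state is an assumption for illustration, not a measurement, and CircuitBudget is a hypothetical helper name.

```csharp
using System;

public static class CircuitBudget
{
    // Estimated memory in megabytes for a number of concurrent circuits.
    // baselineKb: per-circuit overhead (the ~250 KB figure quoted above).
    // chatStateKb: assumed per-user conversation state held by the component.
    public static double EstimateMb(int circuits, double baselineKb, double chatStateKb)
        => circuits * (baselineKb + chatStateKb) / 1024.0;
}
```

With 100 circuits, a 250 KB baseline, and an assumed 100 KB of chat state each, EstimateMb returns roughly 34 MB. That leaves most of a 1 GB pool for the runtime, the application itself, and traffic spikes, which is why the limit in practice tends to be CPU and model latency rather than circuit memory.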

  • Provider-agnostic client with Microsoft.Extensions.AI

This is the most important architectural decision: never bind your code to a specific model provider's SDK. Use IChatClient as your interface and inject a provider implementation at startup. You can swap from OpenAI to Anthropic to a self-hosted Ollama model without changing a single line of your chat code.

    // Program.cs: pick a provider via configuration
    // Each arm needs its own provider package; the exact adapter method
    // names can differ between package versions, so check the one you install.
    using Microsoft.Extensions.AI;

    var provider = builder.Configuration["AI:Provider"];

    builder.Services.AddSingleton<IChatClient>(_ => provider switch
    {
        "openai" => new OpenAI.Chat.ChatClient(
                model: builder.Configuration["AI:Model"]!,
                apiKey: builder.Configuration["AI:ApiKey"]!)
            .AsIChatClient(),
        "anthropic" => new AnthropicClient(builder.Configuration["AI:ApiKey"]!)
            .Messages.AsIChatClient(builder.Configuration["AI:Model"]!),
        "ollama" => new OllamaChatClient(
            new Uri("http://localhost:11434"),
            modelId: builder.Configuration["AI:Model"]!),
        _ => throw new InvalidOperationException($"Unknown AI provider '{provider}'")
    });

The Razor component injects IChatClient and uses the same call pattern regardless of which provider is configured. Migration between providers is now a configuration change, not a refactor.

  • Streaming responses with IAsyncEnumerable

Users tolerate AI latency if they see tokens appearing in real time. They abandon chatbots that stare blankly for 8 seconds before dumping a wall of text. Streaming is not optional — it's the difference between "feels fast" and "feels broken."

Microsoft.Extensions.AI returns an IAsyncEnumerable<ChatResponseUpdate> from GetStreamingResponseAsync. The Blazor pattern is to accumulate text chunks into a state field and call StateHasChanged() after each chunk:

    // Requires @using System.Text for StringBuilder
    private async Task SendAsync()
    {
        // Ignore empty submissions and double-submits while a reply is streaming
        if (_streaming || string.IsNullOrWhiteSpace(_input)) return;

        _messages.Add(new ChatMessage(ChatRole.User, _input));
        _input = "";
        _streaming = true;
        _currentStream = "";
        StateHasChanged();

        var responseBuilder = new StringBuilder();
        try
        {
            await foreach (var update in ChatClient.GetStreamingResponseAsync(_messages))
            {
                // ChatResponseUpdate contains the delta text for this chunk
                responseBuilder.Append(update.Text);
                _currentStream = responseBuilder.ToString();
                StateHasChanged();
            }

            // Finalize: move streaming buffer into messages list
            _messages.Add(new ChatMessage(ChatRole.Assistant, responseBuilder.ToString()));
            await Store.SaveAsync(_sessionId, _messages);
        }
        finally
        {
            // Always re-enable the form, even if the provider call throws
            _currentStream = "";
            _streaming = false;
            StateHasChanged();
        }
    }

Two production gotchas worth knowing:

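SignalR message size. By default SignalR caps incoming hub messages at 32 KB. The limit applies to messages the client sends to the server, so the likely trigger is a user pasting a very long prompt, not the streamed reply, which goes out as many small render updates. If you expect large inputs, raise MaximumReceiveMessageSize through the hub options (for Blazor Server, AddHubOptions on the server-side Blazor registration).

Cancellation on navigation. If a user navigates away mid-stream, your await foreach keeps running and keeps consuming tokens you pay for. Create a CancellationTokenSource tied to the component's lifetime, pass its token to GetStreamingResponseAsync, and cancel it in Dispose.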
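The cancellation pattern can be sketched without any AI package, as below. StreamingSession stands in for the chat component and FakeModel.Tokens is a hypothetical endless token source; both names are illustrative. In the real component the same token would be passed as the cancellationToken argument of GetStreamingResponseAsync, and the component would implement IDisposable.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for a chat component: owns a CancellationTokenSource
// and cancels it when the component is disposed.
public sealed class StreamingSession : IDisposable
{
    private readonly CancellationTokenSource _cts = new();
    private readonly StringBuilder _buffer = new();

    public string Text => _buffer.ToString();
    public bool Completed { get; private set; }

    // Consumes a token stream until it ends or the session is disposed.
    public async Task ConsumeAsync(Func<CancellationToken, IAsyncEnumerable<string>> source)
    {
        var token = _cts.Token; // capture once, before any dispose can happen
        try
        {
            await foreach (var delta in source(token))
            {
                _buffer.Append(delta);
            }
            Completed = true;
        }
        catch (OperationCanceledException)
        {
            // Expected when the user leaves mid-stream; keep the partial text
            // and do not touch UI state after this point.
        }
    }

    public void Dispose()
    {
        _cts.Cancel();
        _cts.Dispose();
    }
}

public static class FakeModel
{
    // Hypothetical token source that yields until it is cancelled.
    public static async IAsyncEnumerable<string> Tokens(
        [EnumeratorCancellation] CancellationToken ct)
    {
        while (true)
        {
            await Task.Delay(10, ct);
            yield return "tok ";
        }
    }
}
```

Calling Dispose while ConsumeAsync is running ends the loop promptly with whatever text had arrived. Whether the provider stops billing at that point depends on the provider, so treat the token as a best-effort stop signal.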

  • Conversation history in SQL Server

For anything beyond a demo, conversations need to persist. We store messages in a ChatMessages table keyed by SessionId. The EF Core entities are straightforward:

    public class ChatSession
    {
        public Guid Id { get; set; }
        public string UserId { get; set; } = "";
        public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
        public List<ChatMessageEntity> Messages { get; set; } = new();
    }

    public class ChatMessageEntity
    {
        public long Id { get; set; }
        public Guid SessionId { get; set; }
        public string Role { get; set; } = ""; // "user" | "assistant" | "system" | "tool"
        public string Content { get; set; } = "";
        public int? PromptTokens { get; set; }
        public int? CompletionTokens { get; set; }
        public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
    }

Two practical refinements once you're past a prototype:

Token tracking. Store PromptTokens and CompletionTokens per message. You'll need this when (not if) the finance question comes up.

Conversation truncation. LLM context windows are large but not unlimited. When a session exceeds ~50 messages, summarize the oldest 25 into a single system message and keep going. Roll your own with a second IChatClient call, or use Semantic Kernel's history reducer.
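A roll-your-own version of the truncation step might look like the sketch below. HistoryMessage and HistoryCompactor are hypothetical names, and the summarize delegate is a placeholder for the second IChatClient call; the 50 and 25 thresholds are the ones suggested above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public sealed record HistoryMessage(string Role, string Content);

public static class HistoryCompactor
{
    // When the history exceeds maxMessages, replace the oldest summarizeCount
    // messages with one system message produced by the supplied summarizer.
    public static async Task<List<HistoryMessage>> CompactAsync(
        IReadOnlyList<HistoryMessage> history,
        Func<IReadOnlyList<HistoryMessage>, Task<string>> summarize,
        int maxMessages = 50,
        int summarizeCount = 25)
    {
        if (history.Count <= maxMessages)
            return history.ToList();

        var oldest = history.Take(summarizeCount).ToList();
        var summary = await summarize(oldest);

        var compacted = new List<HistoryMessage>
        {
            new("system", "Summary of earlier conversation: " + summary)
        };
        compacted.AddRange(history.Skip(summarizeCount));
        return compacted;
    }
}
```

A 51-message history comes back as 27 messages: one summary plus the newest 26. If the conversation starts with its own system prompt, a real implementation would keep that message separate from the span being summarized.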

SQL Server 2022 with EF Core 10 handles thousands of conversations per database. Adaptive Web Hosting plans include 2–10 SQL Server 2022 databases (1 GB each) depending on tier, with the option to add capacity if your bot becomes popular.

  • Function calling: letting the bot do things

A chatbot that only talks is a search engine with worse UX. The real value of IChatClient is that you can register C# methods as tools, and the model calls them when relevant. Order lookup, account status, ticket creation — all become tool calls.

    // Requires: using System.ComponentModel; using Microsoft.Extensions.AI;

    [Description("Look up the current status of an order by ID")]
    public async Task<OrderStatus> GetOrderStatusAsync(
        [Description("The order ID, e.g., 'ORD-12345'")] string orderId)
    {
        return await _orders.GetStatusAsync(orderId);
    }

    [Description("Create a support ticket for the user")]
    public async Task<string> CreateTicketAsync(
        [Description("Short subject line")] string subject,
        [Description("Detailed description of the issue")] string description)
    {
        var ticketId = await _tickets.CreateAsync(_currentUser.Id, subject, description);
        return $"Ticket #{ticketId} created.";
    }

    // Expose the methods as tools; pass chatOptions with each chat request
    var chatOptions = new ChatOptions
    {
        Tools = [
            AIFunctionFactory.Create(GetOrderStatusAsync),
            AIFunctionFactory.Create(CreateTicketAsync)
        ]
    };

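Once chatOptions is passed with each request, the model sees these tool definitions and decides when to invoke them. The calls themselves are executed by the function-invocation middleware in Microsoft.Extensions.AI (added with UseFunctionInvocation() when building the client), which binds the model's JSON arguments to typed parameters, runs the method, and sends the result back, so the next response includes the tool result and the model's interpretation. AIFunctionFactory handles the JSON-schema generation and result serialization automatically.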

Guardrails for production tools

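Authorization. Pull _currentUser from the Blazor authentication scope; never trust IDs the model passes in. An order lookup, for example, should confirm that the order belongs to the signed-in user before returning anything.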

Side-effect confirmation. For destructive operations (cancel order, delete account), have the tool return a "confirmation required" result that the UI surfaces as a button, not direct execution.

Timeouts. Wrap every tool in a timeout. A hung database call should not freeze the chat.
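One way to apply the timeout guardrail is a small wrapper like the sketch below. ToolGuard and WithTimeoutAsync are hypothetical names; the wrapper relies only on CancellationTokenSource and Task.WaitAsync from the base class library, and assumes the tool accepts a CancellationToken.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ToolGuard
{
    // Runs a tool call with a deadline. The tool receives a token that is
    // cancelled at the deadline; if it still has not finished by then, the
    // fallback value is returned so the chat loop can continue.
    public static async Task<T> WithTimeoutAsync<T>(
        Func<CancellationToken, Task<T>> tool,
        TimeSpan timeout,
        T fallback)
    {
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            return await tool(cts.Token).WaitAsync(timeout);
        }
        catch (Exception ex) when (ex is TimeoutException or OperationCanceledException)
        {
            return fallback;
        }
    }
}
```

Returning a fallback string such as "The order system did not respond in time" gives the model something to relay to the user. A tool that ignores the token keeps running in the background after the wrapper returns, so passing the token down to the database or HTTP call still matters.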

  • Production hardening

The functional chatbot is now done. The production chatbot needs five more things:

Rate limiting

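Per-user and per-IP. Anonymous users typically get 10 messages per hour; authenticated users get whatever your plan supports. Without this, a botnet can burn through your model budget overnight. One caveat: AspNetCoreRateLimit and similar HTTP middleware work fine for page loads and API endpoints, but Blazor Server chat messages travel over the existing SignalR circuit rather than as separate HTTP requests, so enforce the per-message limit inside the component or a service it calls.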
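Because each chat message in Blazor Server is a method call on the component rather than its own HTTP request, one option is to check a limiter at the top of SendAsync. The sketch below is a minimal in-memory sliding window; MessageRateLimiter is a hypothetical class, and it assumes a single app instance registered as a singleton.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public sealed class MessageRateLimiter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly ConcurrentDictionary<string, Queue<DateTimeOffset>> _hits = new();

    public MessageRateLimiter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Returns true and records the message if the key (a user ID or an IP
    // address) is still under its limit for the current window.
    public bool TryAcquire(string key, DateTimeOffset now)
    {
        var queue = _hits.GetOrAdd(key, _ => new Queue<DateTimeOffset>());
        lock (queue)
        {
            // Drop timestamps that have aged out of the window.
            while (queue.Count > 0 && now - queue.Peek() >= _window)
                queue.Dequeue();

            if (queue.Count >= _limit)
                return false;

            queue.Enqueue(now);
            return true;
        }
    }
}
```

For the anonymous tier above, that would be new MessageRateLimiter(10, TimeSpan.FromHours(1)) keyed by IP address. The timestamp is passed in rather than read from the clock so the logic is easy to test; the built-in System.Threading.RateLimiting types are worth a look before hardening this further.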

Cost caps

Track token usage per session and per user. Hard-cap monthly spend per user. The math is simple — accumulate the token counts we already store in ChatMessageEntity, multiply by your provider's price, refuse new messages above the cap.
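The arithmetic can be sketched as follows. TokenUsage and CostCap are hypothetical names, and the prices in the usage note are placeholders rather than any provider's real rates; substitute the per-million-token prices from your provider's price list.

```csharp
using System;
using System.Collections.Generic;

public sealed record TokenUsage(int PromptTokens, int CompletionTokens);

public static class CostCap
{
    // Total cost in US dollars. Prices are per one million tokens.
    public static decimal CostUsd(
        IEnumerable<TokenUsage> usage,
        decimal promptPricePerMillion,
        decimal completionPricePerMillion)
    {
        long prompt = 0, completion = 0;
        foreach (var u in usage)
        {
            prompt += u.PromptTokens;
            completion += u.CompletionTokens;
        }
        return prompt * promptPricePerMillion / 1_000_000m
             + completion * completionPricePerMillion / 1_000_000m;
    }

    // True once the user has reached or passed the monthly cap.
    public static bool IsOverCap(decimal spentUsd, decimal monthlyCapUsd)
        => spentUsd >= monthlyCapUsd;
}
```

With placeholder prices of $2.50 and $10.00 per million tokens, two messages totalling 3,000 prompt tokens and 2,000 completion tokens cost $0.0275. Check IsOverCap before calling the model, not after, so a refused message costs nothing.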

Content moderation

For any public-facing chat, run user input through a moderation API before sending it to the model. The moderation step is cheap (sub-cent per message at scale) and catches abuse and content policy violations. Treat prompt injection as a separate problem: general moderation endpoints are not designed to detect it, although Azure AI Content Safety offers Prompt Shields for that purpose. Both OpenAI's Moderation endpoint and Azure Content Safety are well-supported in .NET.

Audit log

Compliance, debugging, and product analytics all need the same thing: a permanent record of conversations. Write (SessionId, MessageId, Role, Content, ToolCalls, Tokens, Cost, Timestamp) to an append-only table or log sink. SQL Server 2022 with table partitioning by month works well; many teams ship logs to a separate analytics store via Azure Service Bus or RabbitMQ for long-term retention.

Health checks

Add builder.Services.AddHealthChecks() entries for your model provider (probe endpoint) and SQL Server. Wire them to /health. IIS on Adaptive Web Hosting will respect these via Plesk's monitoring panel.

Configure IIS Application Initialization to keep your worker process warm. Cold-start delays after idle timeout are noticeable in a chat context (3-5 seconds of "is it broken?"). Pre-load the AI client and database connections at startup, not on first request.

Hosting recommendations

Where your chatbot fits on Adaptive Web Hosting depends on traffic and concurrency:

ASP.NET Business — $17.49/mo

Customer-facing chatbots with moderate traffic. 500–1,000 concurrent users. 2 GB RAM per pool. Most popular tier.


ASP.NET Professional — $27.49/mo

Multi-app chatbot platforms, agency white-label deployments, high-traffic public chat. 4 GB RAM, highest priority scheduling.


FAQs

Does Blazor Server work over a regular HTTPS connection?

Yes. SignalR negotiates WebSockets by default and falls back to Server-Sent Events or long polling if WebSockets are blocked. Adaptive Web Hosting includes free Let's Encrypt SSL on every plan and supports WebSockets out of the box on IIS.

Do I need a vector database for a chatbot?

Only if you want it grounded in your own knowledge base (RAG). For a pure conversational bot calling your domain APIs as tools, no — SQL Server holds the history and your existing application database holds the facts.

Can I use a self-hosted model instead of an API?

Yes. Microsoft.Extensions.AI has an Ollama client, and Ollama runs on Windows. Realistically the model will run on a separate machine with a GPU — Adaptive Web Hosting plans are shared and don't provision GPUs. The chatbot app itself runs on Adaptive; the inference runs wherever your GPU sits.

How do I handle long conversations?

Truncate or summarize older messages once the conversation exceeds the model's effective context window. A simple approach: keep the last 20 messages verbatim and replace anything older with a one-paragraph summary generated by a second IChatClient call.

What about Blazor WebAssembly?

Blazor WASM works for the UI but you'll need a separate API project to keep your model API key off the client. Blazor Server is simpler for chat — the key never leaves the server, and SignalR handles streaming natively.

Can I run multiple chatbot instances behind a load balancer?

Blazor Server requires sticky sessions because the circuit lives on a specific server. On Adaptive Web Hosting plans you typically run a single dedicated app pool, which avoids the issue. For multi-instance setups, enable session affinity on the load balancer or use Azure SignalR Service; a Redis backplane on its own does not remove the need for affinity.

Ship it

The .NET ecosystem caught up to AI in 2025. Microsoft.Extensions.AI gave us a clean abstraction, Blazor Server gave us a reactive UI without JavaScript, and .NET 10 LTS gives us a stable runtime through 2028. The chatbot stack is now native .NET top to bottom.

Adaptive Web Hosting's ASP.NET hosting plans are built for exactly this scenario — real Windows + IIS, SQL Server 2022, dedicated app pools, free SSL. Pick the plan that matches your concurrency target and you have a production chatbot platform.
