Semantic Search for .NET Applications: Embeddings + Blazor UI

The search box on your application is probably worse than it should be. Users type "shipping problem with my last order" and get nothing because the article titled "Tracking delivery issues" doesn't share any keywords. Users type "running shoes for flat feet" and get every shoe in the catalog instead of the four that mention pronation. The problem isn't bad data — it's that keyword search doesn't understand meaning.

Semantic search fixes this. Convert the query and every searchable item into a vector via an embedding model. Find items whose vectors are closest to the query's vector. Results match by meaning, not by string overlap. In 2026, .NET 10 has every piece needed to ship this — Microsoft.Extensions.AI for embeddings, vector storage in Postgres / SQL Server / Qdrant, and Blazor for the UI.

Hybrid: lexical + semantic

.NET 10: LTS runtime

SQL Server: or pgvector / Qdrant

Semantic search is not RAG

RAG retrieves chunks and feeds them to an LLM that writes a prose answer. Semantic search retrieves items and shows them as a list, just like a regular search page — but the relevance ranking is meaning-based instead of keyword-based. Same underlying retrieval mechanic, completely different product surface. See our RAG guide for the chatbot-style pattern.

The semantic search ecosystem in .NET 10

Microsoft.Extensions.AI

IEmbeddingGenerator works against OpenAI, Azure OpenAI, Cohere, or self-hosted ONNX models. The interface stays the same; you switch providers in configuration.

✅ Local model

ONNX sentence-transformers

Small open-source embedding models run inside your .NET process. No per-call API cost. Compact vectors (384 dimensions for all-MiniLM-L6-v2), fast inference on CPU.

✅ Storage

SQL Server / Postgres / Qdrant

Choose based on existing infra. SQL Server 2025 vector type for SQL shops. pgvector for Postgres. Qdrant for dedicated vector workloads.

🟡 Recommended

Hybrid ranking

Pure semantic search misses exact matches (SKUs, names). Combine it with lexical ranking such as BM25 or SQL Server full-text search (CONTAINS for structured terms, FREETEXT for natural-language input). Hybrid almost always wins on relevance metrics.

🟡 Optional

Reranker

A second pass that scores top-K candidates against the query directly. Adds ~200 ms but improves precision noticeably for product search.

🟡 Optional

Query reformulation

For conversational queries ("anything for back pain that's under $50"), an LLM reformulates the search into a filter + semantic query before retrieval.

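Quick reference: the five-stage pipeline

The five stages, covered in order below: embedding your searchable items, query embedding, hybrid retrieval, reranking with faceted filtering, and the Blazor search UI.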

Embedding your searchable items

For a product, embed the title + short description + key attributes (material, color, use case). Don't embed everything — irrelevant text dilutes the vector. For a knowledge base article, embed title + first paragraph + headings. Keep each embedded chunk under ~500 tokens.

Item embeddings are computed once at ingestion and re-computed only when the source content changes. Store the vector alongside your existing table:

// Product entity with embedding vector
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Description { get; set; } = "";
    public decimal Price { get; set; }
    public string Category { get; set; } = "";

    // Stored vector column: the mapping depends on your DB choice.
    //   SQL Server 2025:   native VECTOR column (property type depends on your EF Core provider)
    //   Postgres pgvector: vector column
    //   Qdrant:            external store, keyed by Id
    public float[] Embedding { get; set; } = [];
}

// One helper builds the embedding text, so the single-item and batch paths
// always embed the same fields.
private static string ToEmbeddingText(Product p) =>
    $"{p.Name}. {p.Description}. Category: {p.Category}";

public async Task EmbedProductAsync(Product product)
{
    var result = await _embedder.GenerateAsync(new[] { ToEmbeddingText(product) });
    product.Embedding = result[0].Vector.ToArray();
}

public async Task EmbedAllAsync()
{
    // Batch: most providers handle 96-256 inputs per call.
    // Loop until no un-embedded products remain.
    while (true)
    {
        // Adjust this predicate to however your provider maps the vector column.
        var products = await _db.Products.Where(p => p.Embedding.Length == 0).Take(96).ToListAsync();
        if (products.Count == 0) break;

        var texts = products.Select(ToEmbeddingText).ToArray();
        var embeddings = await _embedder.GenerateAsync(texts);

        for (int i = 0; i < products.Count; i++)
            products[i].Embedding = embeddings[i].Vector.ToArray();

        await _db.SaveChangesAsync();
    }
}

For a catalog of 50,000 products, batch embedding takes a few minutes and costs cents-to-dollars depending on provider. Schedule via Hangfire so new products get embedded shortly after they're added.
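The "re-compute only when the source content changes" rule needs some way to detect a change. One option, sketched below, is to store a hash of the embedded text next to the vector and compare it before calling the embedding API. The `EmbeddingText` class and the idea of a stored hash column are illustrative assumptions, not part of any library.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EmbeddingText
{
    // Builds the text that gets embedded (same fields as the product example above).
    public static string Build(string name, string description, string category) =>
        $"{name}. {description}. Category: {category}";

    // SHA-256 of the embedding text, hex-encoded. Store it next to the vector.
    public static string Hash(string text) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(text)));

    // True when the item was never embedded or its text changed since the last run.
    public static bool NeedsEmbedding(string? storedHash, string currentText) =>
        storedHash is null ||
        !string.Equals(storedHash, Hash(currentText), StringComparison.Ordinal);
}
```

A scheduled job can then skip every product whose stored hash still matches, which keeps re-embedding costs proportional to what actually changed.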

Query embedding

Every search request first embeds the user's query. The embedding model must be the same one used for the items: vectors from different models live in unrelated spaces, so mixing models silently destroys relevance.

public async Task<float[]> EmbedQueryAsync(string query)
{
    var result = await _embedder.GenerateAsync(new[] { query });
    return result[0].Vector.ToArray();
}

Practical tweak: cache common queries. On a busy e-commerce site, roughly 30% of searches can be repeats. A simple in-memory LRU cache of the last 10,000 query embeddings saves one embedding API call for every repeat that is still cached.
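A minimal sketch of such a cache, assuming queries are normalized by trimming and lower-casing before lookup. `QueryEmbeddingCache` is a hypothetical helper, not a framework type; a production setup might prefer an existing caching library.

```csharp
using System;
using System.Collections.Generic;

// Least-recently-used cache for query embeddings, guarded by a single lock.
public sealed class QueryEmbeddingCache
{
    private readonly int _capacity;
    private readonly object _gate = new();
    private readonly Dictionary<string, LinkedListNode<(string Key, float[] Vector)>> _map = new();
    private readonly LinkedList<(string Key, float[] Vector)> _order = new(); // most recent first

    public QueryEmbeddingCache(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    // "Running Shoes " and "running shoes" share one entry.
    private static string Normalize(string query) => query.Trim().ToLowerInvariant();

    public int Count { get { lock (_gate) { return _map.Count; } } }

    public bool TryGet(string query, out float[] vector)
    {
        var key = Normalize(query);
        lock (_gate)
        {
            if (_map.TryGetValue(key, out var node))
            {
                // Move the hit to the front so it is evicted last.
                _order.Remove(node);
                _order.AddFirst(node);
                vector = node.Value.Vector;
                return true;
            }
        }
        vector = Array.Empty<float>();
        return false;
    }

    public void Set(string query, float[] vector)
    {
        var key = Normalize(query);
        lock (_gate)
        {
            if (_map.TryGetValue(key, out var existing))
            {
                _order.Remove(existing);
                _map.Remove(key);
            }
            else if (_map.Count >= _capacity)
            {
                // Evict the least recently used entry (the tail of the list).
                var oldest = _order.Last!;
                _order.RemoveLast();
                _map.Remove(oldest.Value.Key);
            }
            _map[key] = _order.AddFirst((key, vector));
        }
    }
}
```

A query-embedding method would call `TryGet` first and only hit the embedding API (followed by `Set`) on a miss.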

Hybrid retrieval: lexical + semantic

Semantic search excels at paraphrase and concept matching. It fails at exact tokens — product SKUs, brand names, model numbers. A user searching for "MX Master 3S" expects exact matches, not semantically related mice. Combining BM25 (which loves exact tokens) with vector search (which loves meaning) gives you both.

Get top-K candidates from each method, merge, score:

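public async Task<List<Product>> HybridSearchAsync(string query, int topK = 20)
{
    // Lexical: SQL Server full-text. FREETEXT accepts natural-language input;
    // CONTAINS expects full-text query syntax and rejects raw multi-word queries.
    var lexicalTask = _db.Products
        .Where(p => EF.Functions.FreeText(p.Name, query)
                 || EF.Functions.FreeText(p.Description, query))
        .Take(50)
        .ToListAsync();

    // Semantic: vector search, started while the lexical query is in flight.
    // Assumes _vectorStore does not share _db's DbContext, which is not safe
    // for concurrent operations.
    var queryVec = await EmbedQueryAsync(query);
    var semanticTask = _vectorStore.SearchAsync(queryVec, limit: 50);

    await Task.WhenAll(lexicalTask, semanticTask);
    var lexical = await lexicalTask;
    var semantic = await semanticTask;

    // Merge: semantic hits keep their similarity score; lexical hits add a fixed
    // weight, so items found by both methods rank above items found by only one.
    const double LexicalWeight = 0.5;
    var scored = new Dictionary<int, (Product Item, double Score)>();
    foreach (var s in semantic)
        scored[s.Item.Id] = (s.Item, s.Score);
    foreach (var p in lexical)
        scored[p.Id] = scored.TryGetValue(p.Id, out var hit)
            ? (hit.Item, hit.Score + LexicalWeight)
            : (p, LexicalWeight);

    return scored.Values
        .OrderByDescending(t => t.Score)
        .Take(topK)
        .Select(t => t.Item)
        .ToList();
}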

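Several scoring approaches exist for hybrid search; reciprocal rank fusion (RRF) is the textbook answer. The simple version above gives lexical hits a fixed score and ranks semantic hits by similarity, which is a reasonable starting point for most catalogs. Tune the lexical weight against your golden set.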
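For comparison, here is a sketch of reciprocal rank fusion over ranked lists of product IDs. Each item scores the sum of 1 / (k + rank) across the lists it appears in. The `RankFusion` class is illustrative; k = 60 is the constant commonly quoted for RRF and should be treated as tunable. The sketch assumes each list contains an ID at most once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RankFusion
{
    // score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
    public static List<int> Fuse(
        IEnumerable<IReadOnlyList<int>> rankedLists, int k = 60, int topK = 20)
    {
        var scores = new Dictionary<int, double>();
        foreach (var list in rankedLists)
        {
            for (int i = 0; i < list.Count; i++)
            {
                var id = list[i];
                scores[id] = scores.GetValueOrDefault(id) + 1.0 / (k + i + 1);
            }
        }

        return scores
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key) // deterministic tie-break
            .Take(topK)
            .Select(kv => kv.Key)
            .ToList();
    }
}
```

Because RRF uses only ranks, it sidesteps the problem of lexical and semantic scores living on different scales.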

Reranking + faceted filtering

The top-K candidates from hybrid retrieval are usually the right items, but not in the best order. A reranker takes the query plus the candidate set and scores each candidate's actual relevance to the query. It is typically a cross-encoder model: slower than embedding lookup, but more precise.

public async Task<List<Product>> SearchWithRerankAsync(
    string query,
    SearchFilters filters)
{
    var candidates = await HybridSearchAsync(query, topK: 30);

    // Apply hard filters (price, category)
    candidates = candidates
        .Where(p => (filters.MinPrice == null || p.Price >= filters.MinPrice)
                 && (filters.MaxPrice == null || p.Price <= filters.MaxPrice)
                 && (filters.Categories.Count == 0 || filters.Categories.Contains(p.Category)))
        .ToList();

    // Rerank the remaining candidates
    var reranked = await _reranker.RerankAsync(query, candidates,
        textSelector: p => $"{p.Name}. {p.Description}");

    return reranked.Take(filters.PageSize).ToList();
}

For Adaptive Web Hosting customers running on a single instance, calling a hosted reranker such as Cohere's, or hosting a local ONNX reranker in-process, is the simplest setup. For multi-instance deployments, run the reranker as a separate service.

The Blazor search UI

A modern semantic search UI has three pieces: instant autocomplete, the result list, and faceted filters. Blazor Server makes all three reactive and stream-friendly.

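@page "/search"
@inject ISearchService Search

<input @bind="_query" @bind:event="oninput" @bind:after="OnQueryChanged"
       placeholder="Try 'shoes for flat feet' or 'lightweight backpack for travel'" />

<div>
    <aside>
        @foreach (var cat in _categories)
        {
            <label>
                <input type="checkbox" @bind="cat.Selected" @bind:after="Refresh" />
                @cat.Name (@cat.Count)
            </label>
        }
    </aside>
    <main>
        @foreach (var product in _results)
        {
            @* Result card: illustrative markup, style to taste *@
            <article>
                <h3>@product.Name</h3>
                <p>@product.Description</p>
            </article>
        }
    </main>
</div>

@code {
    string _query = "";
    List<Product> _results = new();
    List<FacetItem> _categories = new();
    CancellationTokenSource? _cts;

    async Task OnQueryChanged()
    {
        // Debounce + cancel previous search
        _cts?.Cancel();
        _cts = new();
        try
        {
            await Task.Delay(200, _cts.Token);
            await Refresh();
        }
        catch (TaskCanceledException) { /* user kept typing */ }
    }

    async Task Refresh()
    {
        // Capture the current token; a newer keystroke cancels it.
        var token = _cts?.Token ?? CancellationToken.None;
        var filters = new SearchFilters
        {
            Categories = _categories.Where(c => c.Selected).Select(c => c.Name).ToList(),
            PageSize = 24
        };
        var results = await Search.SearchAsync(_query, filters);

        // Drop stale results so an older search cannot overwrite a newer one.
        if (token.IsCancellationRequested) return;
        _results = results;
        StateHasChanged();
    }
}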

Three production patterns worth knowing:

Debounce. Don't fire a search on every keystroke. 200 ms after the user stops typing is the standard.

Cancel prior queries. If the user kept typing, the previous search is wasted work — and worse, can arrive after the newer search and overwrite the correct results.

Stream results. For instant feedback, render the lexical results first (fast), then update with reranked results as they arrive (slower but better).

Where semantic search wins in production

E-commerce. "Lightweight running shoes for marathons" returns appropriate products even when the catalog doesn't use those exact words.

Knowledge bases / help centers. Users describe symptoms in natural language; semantic search finds articles describing solutions.

Internal document search. Engineers search "how do we handle webhook retries" and find the relevant runbook even if it's titled "Idempotency policy v3."

Recipe / content sites. "Dessert with chocolate but not too sweet" returns relevant recipes.

Job boards / classifieds. Match candidate descriptions to listings semantically, not on exact title overlap.

Adaptive Web Hosting includes SQL Server 2022 on every plan. While the native VECTOR type lives in SQL Server 2025, .NET 10 can store vectors as varbinary(max) on 2022 and run cosine similarity in-application for catalogs under ~100K items. For larger workloads, point the application at an external pgvector or Qdrant instance and keep transactional data in SQL Server.
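The varbinary approach can be sketched as follows: serialize each `float[]` to raw bytes for storage, load the vectors into memory, and score them with a brute-force cosine scan. `InMemoryVectorSearch` is a hypothetical helper, and the byte layout assumes the same endianness on write and read (true when one application does both).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;

public static class InMemoryVectorSearch
{
    // float[] -> byte[] for a varbinary(max) column (raw float bytes).
    public static byte[] ToBytes(float[] vector) =>
        MemoryMarshal.AsBytes(vector.AsSpan()).ToArray();

    // byte[] read from the database -> float[].
    public static float[] FromBytes(byte[] bytes) =>
        MemoryMarshal.Cast<byte, float>(bytes.AsSpan()).ToArray();

    public static double Cosine(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Vector lengths differ.");
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        // Zero vectors have no direction; treat their similarity as 0.
        return normA == 0 || normB == 0 ? 0 : dot / Math.Sqrt(normA * normB);
    }

    // Brute-force scan: score every item, keep the best `limit`.
    public static List<(int Id, double Score)> Search(
        float[] query, IEnumerable<(int Id, float[] Vector)> items, int limit) =>
        items.Select(x => (x.Id, Score: Cosine(query, x.Vector)))
             .OrderByDescending(x => x.Score)
             .Take(limit)
             .ToList();
}
```

A linear scan like this is O(N) per query, which is why it only suits smaller catalogs; beyond that, an indexed vector store is the better fit.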

Production patterns

Click-feedback loops

Log which result the user clicks for which query. Over time, this becomes a relevance dataset. Even simple click-through-rate signals improve ranking via a learning-to-rank model bolted onto your reranker.
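As a starting point before any learning-to-rank model, the logged events can be aggregated into a click-through rate per (query, product) pair. The `SearchEvent` record and `ClickFeedback` class below are hypothetical shapes for that log, shown only to illustrate the aggregation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One row per result shown to a user; Clicked marks whether they clicked it.
public record SearchEvent(string Query, int ProductId, bool Clicked);

public static class ClickFeedback
{
    // Click-through rate per (query, product): clicks / impressions.
    // Pairs with fewer than minImpressions rows are skipped as too noisy.
    public static Dictionary<(string Query, int ProductId), double> ClickThroughRates(
        IEnumerable<SearchEvent> events, int minImpressions = 1) =>
        events
            .GroupBy(e => (Query: e.Query.Trim().ToLowerInvariant(), e.ProductId))
            .Where(g => g.Count() >= minImpressions)
            .ToDictionary(g => g.Key, g => g.Count(e => e.Clicked) / (double)g.Count());
}
```

Raising `minImpressions` trades coverage for stability: rarely shown pairs produce rates that swing wildly on a single click.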

Synonyms and abbreviations

Modern embedding models handle most synonyms naturally ("sneakers" ≈ "trainers" ≈ "running shoes"). For domain-specific terms ("XL" = "extra large", "SKU" = "stock-keeping unit"), add a synonym dictionary that expands the query before embedding.
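Query expansion can be as small as a dictionary lookup per token. The sketch below keeps the original token and appends its expansion, so lexical search still sees the exact term. The `QueryExpander` class and its two entries are illustrative; a real dictionary would come from configuration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryExpander
{
    // Hypothetical domain dictionary; matching is case-insensitive.
    private static readonly Dictionary<string, string> Synonyms =
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["XL"] = "extra large",
            ["SKU"] = "stock-keeping unit",
        };

    // Appends the expansion after each known token and leaves other tokens untouched.
    public static string Expand(string query)
    {
        var tokens = query.Split(' ', StringSplitOptions.RemoveEmptyEntries);
        return string.Join(' ', tokens.Select(t =>
            Synonyms.TryGetValue(t, out var full) ? $"{t} ({full})" : t));
    }
}
```

Splitting on spaces is deliberately naive; punctuation-attached tokens such as "XL," would need a proper tokenizer.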

Empty-result handling

If the search returns nothing, fall back to suggesting related queries (cluster recent queries by embedding similarity) or show featured products. Empty result pages are conversion killers — never show them.

Latency budget

Target roughly 300 ms end-to-end for search results. Budget: 50 ms query embedding + 100 ms hybrid retrieval + 100 ms rerank + 50 ms render. Beyond that, users notice the delay. Cache aggressively, batch embeddings, and run reranking only on the top-K candidates, since a rerank pass can add ~200 ms on its own if the candidate set is not kept small.

Hosting recommendations

ASP.NET Business — $17.49/mo

Mid-size e-commerce catalogs (10K-100K items), public knowledge bases, customer-facing search. 2 GB RAM per app pool.


ASP.NET Professional — $27.49/mo

Multi-site search platforms, large catalogs, agency-managed deployments. 4 GB per pool, highest priority scheduling.


FAQs

Do I have to replace my existing search infrastructure?

No, and you usually shouldn't replace it right away. Run semantic search alongside your existing keyword search and A/B test relevance with real users. If semantic wins on your metrics (click-through, conversion, time-to-result), graduate it to the default.

How big a catalog can I search?

With pgvector or Qdrant + proper HNSW indexing, you can scale to 10M+ items with sub-100ms latency. The bottleneck moves from search algorithm to ranking quality, which is a product problem rather than an infrastructure one.

Can I run embeddings offline (no API costs)?

Yes. ONNX sentence-transformer models (e.g., all-MiniLM-L6-v2) run in-process on CPU. Quality is lower than frontier API embeddings, but local models are free per call, private, and predictable. Useful for compliance-sensitive applications.

How often do I re-embed?

Only when item content or your embedding model changes. Adding a new product = embed the new product. Updating a description = re-embed that single item. Changing your embedding model = re-embed everything (rare but real).

Can I do "search by image"?

Yes — multi-modal models (e.g., CLIP variants) produce embeddings in a shared text+image space. Users upload an image, you embed it, retrieve nearest products. Often deployed for fashion and home-goods catalogs.

How do I measure if semantic search is actually better?

Maintain a golden set of (query, expected results) pairs. Track recall@5 and mean reciprocal rank. Also track product metrics: click-through rate, time-to-result, conversion rate. Both should improve before declaring victory.
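Both offline metrics are short to compute once search results and golden-set entries are expressed as product IDs. The sketch below assumes that representation; `RelevanceMetrics` is an illustrative helper, not a library type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RelevanceMetrics
{
    // Fraction of the expected items that appear in the top k results.
    public static double RecallAtK(IReadOnlyList<int> results, IReadOnlySet<int> expected, int k)
    {
        if (expected.Count == 0) return 0;
        int hits = results.Take(k).Distinct().Count(expected.Contains);
        return (double)hits / expected.Count;
    }

    // 1 / rank of the first relevant result (rank starts at 1), or 0 if none is found.
    public static double ReciprocalRank(IReadOnlyList<int> results, IReadOnlySet<int> expected)
    {
        for (int i = 0; i < results.Count; i++)
            if (expected.Contains(results[i])) return 1.0 / (i + 1);
        return 0;
    }

    // Mean reciprocal rank across all golden-set queries.
    public static double MeanReciprocalRank(
        IEnumerable<(IReadOnlyList<int> Results, IReadOnlySet<int> Expected)> runs) =>
        runs.Select(r => ReciprocalRank(r.Results, r.Expected)).DefaultIfEmpty(0).Average();
}
```

Running these over the golden set before and after each ranking change gives a regression check that does not depend on live traffic.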

Ship it

Semantic search is no longer a research project. Microsoft.Extensions.AI gives .NET 10 first-class embeddings; SQL Server, pgvector, and Qdrant all have mature .NET clients; Blazor handles the UI. Three weeks of focused work takes most teams from keyword search to a production-grade hybrid system.

Adaptive Web Hosting's ASP.NET hosting plans run all of this on real Windows + IIS, with SQL Server 2022 included for indexes and metadata, dedicated app pools, and free SSL on every plan.