AI Personalization Engines in .NET 10: ML.NET + Embeddings + LLM

Two visitors land on your e-commerce store. The same homepage greets them. The same featured products. The same recommendations. One leaves immediately because nothing matches their interest. The other lingers but never finds the niche item they would have loved. The fix is personalization — recommend, sort, and rank based on who the visitor is and what they've done. In 2026, this is no longer the exclusive domain of Amazon's data science team. The .NET 10 stack ships every piece: ML.NET for collaborative filtering, Microsoft.Extensions.AI for embedding-based content matching, and LLM re-rankers for the last 10% of relevance lift.

This guide walks through a layered personalization architecture that combines classical machine learning, vector similarity, and LLM judgment — each layer adding precision where the previous one fell short. The whole thing runs in a single Blazor application backed by SQL Server 2022.

ML.NET: classical CF + ranking

.NET 10: LTS runtime

LLM: re-rank + explain

Three layers of personalization in .NET 10

✅ Layer 1: Classical

ML.NET collaborative filtering

Matrix factorization on user-item interaction data. "People who bought X also bought Y." Cheap, fast, well-understood.

✅ Layer 2: Semantic

Embedding similarity

Embed user interests + product descriptions. Surface items similar to recently-viewed even when no other user has bought them.

✅ Layer 3: LLM

Contextual re-ranking

The LLM re-orders the top 20-30 candidates from layers 1-2 based on the user's stated intent and full context. Adds the "why this product?" explanation.

🟡 Required

Cold-start handling

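New users have no history. New items have no interactions. Embedding similarity (Layer 2) covers new items immediately and new users as soon as they view their first product, where Layer 1 would return nothing. A visitor with no activity at all falls back to popular items.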

🟡 Optional

Diversity injection

Pure relevance produces homogeneous results. Inject 10-20% serendipity items to prevent the filter-bubble trap and surface unexpected wins.

🟡 Mandatory at scale

Eval harness

Track click-through, conversion, and time-on-site per recommendation slot. Without measurement, you'll spend months tuning the wrong layer.

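Quick reference: the recommendation request flow

Request → candidate generation (ML.NET CF and embedding similarity, run in parallel) → eligibility filters → LLM re-rank → rendered recommendations, with cold-start fallbacks when the user has no history.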

Layer 1: ML.NET collaborative filtering

Before any AI/LLM layers, build the basic "users who viewed/bought X also engaged with Y" model. ML.NET's MatrixFactorizationTrainer learns this from your interaction logs in minutes. For most product catalogs, this single layer drives 70% of the achievable lift. Adding LLMs without it is premature optimization.

Input is a table of (UserId, ProductId, InteractionStrength) — clicks worth 1, adds-to-cart worth 3, purchases worth 5. ML.NET factors the sparse matrix and learns latent factors per user and per product.
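As a rough sketch of how raw events might be folded into that table, the snippet below maps each event type to the weights above and collapses the log to one row per (user, product). The `EventType`, `RawEvent`, and `InteractionRow` names are hypothetical, and keeping the strongest signal per pair (rather than summing) is an assumption you may want to test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum EventType { Click, AddToCart, Purchase }

// Hypothetical shapes for illustration; adapt to your own event log schema.
public record RawEvent(uint UserId, uint ProductId, EventType Type);
public record InteractionRow(uint UserId, uint ProductId, float Strength);

public static class InteractionBuilder
{
    // Weights from the text: click = 1, add-to-cart = 3, purchase = 5.
    public static float WeightOf(EventType type) => type switch
    {
        EventType.Click => 1f,
        EventType.AddToCart => 3f,
        EventType.Purchase => 5f,
        _ => throw new ArgumentOutOfRangeException(nameof(type))
    };

    // One row per (user, product), keeping the strongest signal seen (an assumption).
    public static List<InteractionRow> Build(IEnumerable<RawEvent> events) =>
        events
            .GroupBy(e => (e.UserId, e.ProductId))
            .Select(g => new InteractionRow(
                g.Key.UserId,
                g.Key.ProductId,
                g.Max(e => WeightOf(e.Type))))
            .ToList();
}
```

The resulting rows can be projected into the training input type before calling the trainer.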

// Requires the Microsoft.ML and Microsoft.ML.Recommender NuGet packages.
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

public class InteractionData
{
    [LoadColumn(0)] public uint UserId;
    [LoadColumn(1)] public uint ProductId;
    [LoadColumn(2)] public float Strength;
}

public class RecommendationPrediction
{
    public float Score;
}

public class CollaborativeFilter
{
    private readonly MLContext _ml = new(seed: 42);
    private ITransformer? _model;

    public void Train(IEnumerable<InteractionData> interactions)
    {
        var data = _ml.Data.LoadFromEnumerable(interactions);

        // Matrix factorization needs key-typed index columns, so map the raw IDs to keys in place.
        var pipeline = _ml.Transforms.Conversion.MapValueToKey(nameof(InteractionData.UserId))
            .Append(_ml.Transforms.Conversion.MapValueToKey(nameof(InteractionData.ProductId)))
            .Append(_ml.Recommendation().Trainers.MatrixFactorization(new MatrixFactorizationTrainer.Options
            {
                MatrixColumnIndexColumnName = nameof(InteractionData.UserId),
                MatrixRowIndexColumnName = nameof(InteractionData.ProductId),
                LabelColumnName = nameof(InteractionData.Strength),
                NumberOfIterations = 25,
                ApproximationRank = 50,
                LossFunction = MatrixFactorizationTrainer.LossFunctionType.SquareLossRegression
            }));

        _model = pipeline.Fit(data);
    }

    public float PredictScore(uint userId, uint productId)
    {
        if (_model is null)
            throw new InvalidOperationException("Call Train before PredictScore.");

        // PredictionEngine is not thread-safe and is costly to create per call.
        // In a web app, prefer PredictionEnginePool from Microsoft.Extensions.ML.
        var engine = _ml.Model.CreatePredictionEngine<InteractionData, RecommendationPrediction>(_model);
        return engine.Predict(new InteractionData { UserId = userId, ProductId = productId }).Score;
    }
}

Train as a Hangfire-scheduled background job — daily for active catalogs, hourly for high-velocity sites. The model serializes to a single file (~tens of MB for most catalogs) and loads at app startup.

Layer 2: Embedding similarity for cold-start

Collaborative filtering fails when there's no interaction data: a brand-new user, a freshly added product. Embeddings fill this gap. Embed every product (its text, plus images if your embedding model is multimodal) and every user-context signal, then find the nearest items by vector similarity.

// Product side: embed at ingestion
public async Task EmbedProductAsync(Product product)
{
    var text = $"{product.Name}. {product.Description}. Category: {product.Category}. Tags: {string.Join(", ", product.Tags)}";
    var result = await _embedder.GenerateAsync(new[] { text });
    product.Embedding = result[0].Vector.ToArray();
}

// User side: embed recent activity into a "current interest" vector
public async Task<float[]> ComputeUserInterestVectorAsync(int userId)
{
    var now = DateTime.UtcNow;
    var cutoff = now.AddDays(-30);

    var recentInteractions = await _db.Interactions
        .Where(i => i.UserId == userId && i.CreatedAt > cutoff)
        .Include(i => i.Product)
        .OrderByDescending(i => i.CreatedAt)
        .Take(20)
        .ToListAsync();

    if (!recentInteractions.Any()) return Array.Empty<float>();

    // Weighted average: more recent + stronger interactions count more
    var weightedSum = new float[_embeddingDimension];
    float totalWeight = 0f;

    foreach (var interaction in recentInteractions)
    {
        var embedding = interaction.Product.Embedding;
        if (embedding is null || embedding.Length != _embeddingDimension)
            continue; // product not embedded yet

        var recencyDays = (now - interaction.CreatedAt).TotalDays;
        // 14-day half-life: the weight halves for every 14 days of age
        var weight = interaction.Strength * (float)Math.Pow(0.5, recencyDays / 14.0);

        for (int i = 0; i < _embeddingDimension; i++)
            weightedSum[i] += embedding[i] * weight;
        totalWeight += weight;
    }

    // Nothing usable (or negative signals cancelled the positives): use the cold-start path
    if (totalWeight <= 0f) return Array.Empty<float>();

    for (int i = 0; i < _embeddingDimension; i++)
        weightedSum[i] /= totalWeight;

    return weightedSum;
}

// Retrieval: find products similar to the user's current interest vector
public async Task<List<Product>> RecommendBySimilarityAsync(int userId, int topK = 30)
{
    var interestVector = await ComputeUserInterestVectorAsync(userId);
    if (interestVector.Length == 0)
        return await GetPopularProductsAsync(topK); // cold-start fallback

    return await _vectorStore.SearchAsync(interestVector, limit: topK);
}
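The `_vectorStore.SearchAsync` call above is not defined in this article. For a small catalog, one plausible stand-in is a brute-force cosine-similarity scan over in-memory vectors. The sketch below assumes product embeddings are already loaded into a dictionary keyed by product ID; `VectorSearch` and its members are hypothetical names, and larger catalogs would want a proper vector index instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VectorSearch
{
    // Cosine similarity in [-1, 1]; returns 0 when either vector has zero length.
    public static float Cosine(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        float dot = 0f, normA = 0f, normB = 0f;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0f || normB == 0f) return 0f;
        return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
    }

    // Scores every item against the query and returns the topK product IDs, best first.
    public static List<int> TopK(float[] query, IReadOnlyDictionary<int, float[]> itemVectors, int topK) =>
        itemVectors
            .Select(kv => (Id: kv.Key, Score: Cosine(query, kv.Value)))
            .OrderByDescending(x => x.Score)
            .Take(topK)
            .Select(x => x.Id)
            .ToList();
}
```

The scan is linear in catalog size, which is usually acceptable for a few thousand items and a good baseline to compare an external vector DB against.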

The "recently viewed" trick

For anonymous visitors with no account, the same approach works against session-level interactions. Embed everything the visitor has viewed in this session, recommend similar items. Conversion rate impact is significant — the visitor sees products tailored to what they've already shown interest in, within their first 30 seconds on the site.
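A minimal sketch of the session-level variant, assuming you keep one small accumulator per session (for example in server-side session state). `SessionInterest` is a hypothetical type, and the unweighted mean is a simplification of the recency-weighted average used for logged-in users.

```csharp
using System;

public sealed class SessionInterest
{
    private readonly float[] _sum;
    private int _count;

    public SessionInterest(int dimension) => _sum = new float[dimension];

    // Call once per product view with that product's stored embedding.
    public void AddView(float[] productEmbedding)
    {
        if (productEmbedding.Length != _sum.Length)
            throw new ArgumentException("Embedding dimension mismatch.");

        for (int i = 0; i < _sum.Length; i++)
            _sum[i] += productEmbedding[i];
        _count++;
    }

    // Mean of all viewed embeddings; an empty array means "no signal yet".
    public float[] ToVector()
    {
        if (_count == 0) return Array.Empty<float>();

        var mean = new float[_sum.Length];
        for (int i = 0; i < _sum.Length; i++)
            mean[i] = _sum[i] / _count;
        return mean;
    }
}
```

The resulting vector can be passed to the same similarity search used for logged-in users.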

Combining the layers

Each layer surfaces different products. CF finds items popular among similar users. Embeddings find items similar to user interest. Combining them (union of top-K from each, deduplicated) gives the LLM re-ranker a richer candidate pool than either alone. Tuning the relative weights is a downstream A/B test, not an architectural decision.

public async Task<List<Product>> GenerateCandidatesAsync(int userId, int topK = 30)
{
    // Both lookups run concurrently. An EF Core DbContext is not thread-safe,
    // so each method needs its own context (for example via IDbContextFactory).
    var cfCandidatesTask = GetTopCfCandidatesAsync(userId, limit: 20);
    var similarityCandidatesTask = RecommendBySimilarityAsync(userId, topK: 20);
    await Task.WhenAll(cfCandidatesTask, similarityCandidatesTask);

    // Union of both lists, deduplicated by product ID (CF results come first)
    var merged = (await cfCandidatesTask)
        .UnionBy(await similarityCandidatesTask, p => p.Id)
        .ToList();

    // Apply hard filters: in-stock, price range, region availability
    var filtered = await ApplyEligibilityFiltersAsync(merged, userId);
    return filtered.Take(topK).ToList();
}
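One thing to watch with a plain union: all CF results come first, so a final `Take(topK)` trims mostly from the similarity list. A possible alternative, sketched here on bare product IDs, is a round-robin interleave that alternates between the two ranked lists. `CandidateMerger` is a hypothetical helper, not part of the pipeline above.

```csharp
using System;
using System.Collections.Generic;

public static class CandidateMerger
{
    // Alternates between the two ranked lists, skipping IDs that were already taken.
    public static List<int> Interleave(IReadOnlyList<int> cf, IReadOnlyList<int> similarity, int limit)
    {
        var seen = new HashSet<int>();
        var merged = new List<int>();
        int longest = Math.Max(cf.Count, similarity.Count);

        for (int i = 0; i < longest && merged.Count < limit; i++)
        {
            if (i < cf.Count && seen.Add(cf[i]))
                merged.Add(cf[i]);

            if (merged.Count < limit && i < similarity.Count && seen.Add(similarity[i]))
                merged.Add(similarity[i]);
        }
        return merged;
    }
}
```

Whether interleaving beats the simple union is an empirical question, which fits the point above that weighting belongs in an A/B test.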

Layer 3: LLM re-ranking

The candidate pool from layers 1+2 is good but not ordered. The LLM does the final ranking — given the user's recent activity, their stated preferences (if any), and the candidate products, output a re-ranked list with brief reasoning per item.

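// using System.Text.Json.Serialization;
// using Microsoft.Extensions.AI;

// JSON names are pinned so they match the shape requested in the prompt ("id", not "productId").
public record RankedItem(
    [property: JsonPropertyName("id")] int ProductId,
    [property: JsonPropertyName("score")] double Score,
    [property: JsonPropertyName("reasoning")] string Reasoning);

// Wrapper for the { "ranked": [...] } response object
public record RerankedResponse(
    [property: JsonPropertyName("ranked")] List<RankedItem> Ranked);

public async Task<List<RankedItem>> RerankWithLlmAsync(
    int userId,
    List<Product> candidates,
    string? statedIntent = null)
{
    var userContext = await BuildUserContextAsync(userId);

    var prompt = $@"Re-rank these product candidates for a user. Output JSON.

USER CONTEXT:
{userContext}

USER'S STATED INTENT (may be empty):
{statedIntent}

CANDIDATES:
{string.Join("\n", candidates.Select(c => $"- [{c.Id}] {c.Name} | {c.Category} | ${c.Price:F2} | {c.Description}"))}

Return JSON: {{ ""ranked"": [{{ ""id"": 42, ""score"": 0.92, ""reasoning"": ""matches their interest in X"" }}, ...] }}.
Score should reflect overall fit (0-1). Include all candidates; do not invent IDs.";

    var response = await _chatClient.GetResponseAsync<RerankedResponse>(
        new[] { new ChatMessage(ChatRole.User, prompt) },
        new ChatOptions { Temperature = 0.2f }); // Temperature is a float?, so the literal needs the f suffix

    // The prompt forbids invented IDs, but verify anyway before trusting the output.
    var validIds = candidates.Select(c => c.Id).ToHashSet();
    return response.Result.Ranked
        .Where(r => validIds.Contains(r.ProductId))
        .DistinctBy(r => r.ProductId)
        .OrderByDescending(r => r.Score)
        .ToList();
}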
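LLM output is not guaranteed to follow the "include all candidates; do not invent IDs" instruction. A defensive post-processing step might look like the sketch below: it drops unknown or duplicate IDs, clamps scores to the 0-1 range, and appends any candidates the model left out. `Ranked` and `RerankGuard` are hypothetical names, and giving omitted candidates a score of 0 is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record Ranked(int ProductId, double Score, string Reasoning);

public static class RerankGuard
{
    public static List<Ranked> Normalize(IEnumerable<Ranked> llmOutput, IReadOnlyList<int> candidateIds)
    {
        var valid = new HashSet<int>(candidateIds);

        var result = llmOutput
            .Where(r => valid.Contains(r.ProductId))            // drop invented IDs
            .GroupBy(r => r.ProductId).Select(g => g.First())   // drop duplicates, keep first
            .Select(r => r with { Score = Math.Clamp(r.Score, 0.0, 1.0) })
            .OrderByDescending(r => r.Score)
            .ToList();

        // Append candidates the model omitted, in their original order, with a neutral score.
        var returned = result.Select(r => r.ProductId).ToHashSet();
        result.AddRange(candidateIds
            .Distinct()
            .Where(id => !returned.Contains(id))
            .Select(id => new Ranked(id, 0.0, "")));

        return result;
    }
}
```

With this in place the UI can always render a full rail, even when the model returns a partial or slightly malformed ranking.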

The "why this?" superpower

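Because the LLM emits reasoning per item, your UI can show a one-line explanation under each recommendation: "Matches your interest in lightweight backpacks", "Highly rated by users who bought your last 3 items", "On sale and in your color preference." The model can only justify an item with facts that appear in its prompt, so include ratings or sale status in the candidate lines if you want explanations like these. Explained recommendations tend to earn more clicks than raw lists, though the size of the effect is worth confirming with an A/B test.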

The Blazor recommendation component

@page "/recommendations"
@inject IRecommendationService Recs

<div>
    @foreach (var item in _ranked)
    {
        @* Pass the product to ProductCard through whatever parameter your component defines *@
        <ProductCard>
            <ExplanationSlot>
                <p>
                    <Sparkles class="h-3 w-3 inline" /> @item.Reasoning
                </p>
            </ExplanationSlot>
        </ProductCard>
    }
</div>

@code {
    // Set by a parent component; on a routed page, resolve the user from the auth state instead
    [Parameter] public int UserId { get; set; }

    List<ExplainedProduct> _ranked = new();

    protected override async Task OnInitializedAsync()
    {
        // 1. Generate candidates from CF + embeddings (fast, parallel)
        var candidates = await Recs.GenerateCandidatesAsync(UserId, topK: 30);

        // 2. Re-rank with LLM, get explanations
        var reranked = await Recs.RerankWithLlmAsync(UserId, candidates);

        // 3. Hydrate into product records with explanations,
        //    skipping any ID the LLM returned that is not in the candidate set
        var byId = candidates.ToDictionary(c => c.Id);
        _ranked = reranked
            .Where(r => byId.ContainsKey(r.ProductId))
            .Take(12)
            .Select(r => new ExplainedProduct(byId[r.ProductId], r.Reasoning))
            .ToList();
    }
}

Skeleton loading + progressive enhancement

The LLM re-rank adds ~300-500 ms. Render the page immediately with CF + embedding candidates in the default order, then animate the LLM-reordered version in once the response arrives. Users see something quickly; the polish shows up moments later.
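The render-then-upgrade pattern can be sketched independently of Blazor. In the hypothetical helper below, `render` stands in for "assign the list and call `StateHasChanged`", and the timeout is an assumed safety net so a slow LLM call never blocks the page.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ProgressiveRanking
{
    // Renders the default order immediately, then renders again with the re-ranked
    // order if the slow call succeeds within the timeout. Returns true on upgrade.
    public static async Task<bool> RenderProgressivelyAsync(
        IReadOnlyList<int> defaultOrder,
        Func<Task<IReadOnlyList<int>>> rerankAsync,
        Action<IReadOnlyList<int>> render,
        TimeSpan timeout)
    {
        render(defaultOrder); // first paint: CF + embedding order

        var rerankTask = rerankAsync();
        var finished = await Task.WhenAny(rerankTask, Task.Delay(timeout));

        // Keep the default order if the re-rank timed out or failed.
        if (finished != rerankTask || !rerankTask.IsCompletedSuccessfully)
            return false;

        render(rerankTask.Result); // second paint: LLM order
        return true;
    }
}
```

In a component, the first `render` call would typically happen in `OnInitializedAsync` and the second after the awaited re-rank returns.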

Where this gets deployed in production

E-commerce homepage. "Recommended for you" rails. The most common production deployment.

Product detail pages. "You might also like" carousels. Easier to nail since the seed item provides strong context.

Content sites. Recommended articles. Same architecture, different item type.

Subscription product retention. Recommend the next add-on. Surface usage-pattern matches.

Email campaigns. Personalize the products in each user's marketing email. Cron job pre-computes recommendations; the email send hydrates the template.

Every Adaptive Web Hosting plan includes SQL Server 2022. Interaction logs, product embeddings (as varbinary(max) or via an external vector DB), trained ML.NET model artifacts, A/B test results — all live cleanly in SQL Server with EF Core 10. Hangfire runs the training jobs and embedding refreshes in the same app pool.

Production patterns

Cold-start playbook

Three tiers of fallback when a user has no history: (1) similar items to what they're looking at right now, (2) popular items in their detected region, (3) top sellers overall. The transition is gradual — as interactions accumulate, CF takes over progressively.
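The three tiers can be expressed as an ordered list of sources that are consulted until the rail is full. This is a minimal sketch on bare product IDs; `ColdStart.Fill` is a hypothetical helper, and each tier delegate would wrap your own query (similar items, regional popularity, top sellers).

```csharp
using System;
using System.Collections.Generic;

public static class ColdStart
{
    // Walks the tiers in priority order and collects up to `count` distinct product IDs.
    public static List<int> Fill(int count, params Func<IEnumerable<int>>[] tiers)
    {
        var picked = new List<int>();
        var seen = new HashSet<int>();

        foreach (var tier in tiers)
        {
            foreach (var id in tier())
            {
                if (picked.Count == count) return picked;
                if (seen.Add(id)) picked.Add(id);
            }
        }
        return picked;
    }
}
```

Because each tier is a delegate, a tier's query only runs when the loop reaches it, so a full rail from tier 1 never triggers the broader queries.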

Privacy-aware

Don't send PII to the LLM re-ranker. The user context should be category preferences, recent actions, and stated intent — not name, email, or address. Strip aggressively in the user-context builder.
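One way to make the stripping hard to get wrong is an allow-list builder: only fields named explicitly in the builder can reach the prompt. The sketch below assumes a hypothetical `UserProfile` shape and also redacts email-like strings from free-text intent, since users sometimes type contact details into search boxes. The regex is deliberately simple and is not a complete PII detector.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical profile shape; Name, Email, and Address exist but are never read below.
public record UserProfile(
    string Name,
    string Email,
    string Address,
    IReadOnlyList<string> PreferredCategories,
    IReadOnlyList<string> RecentActions);

public static class UserContextBuilder
{
    private static readonly Regex EmailLike = new(@"[\w.+-]+@[\w-]+\.[\w.-]+");

    // Allow-list approach: only the fields referenced here can end up in the prompt.
    public static string Build(UserProfile profile, string? statedIntent = null)
    {
        var sb = new StringBuilder();
        sb.AppendLine($"Preferred categories: {string.Join(", ", profile.PreferredCategories)}");
        sb.AppendLine($"Recent actions: {string.Join("; ", profile.RecentActions)}");

        if (!string.IsNullOrWhiteSpace(statedIntent))
            sb.AppendLine($"Stated intent: {EmailLike.Replace(statedIntent, "[redacted]")}");

        return sb.ToString();
    }
}
```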

Caching

LLM re-rank results can be cached per (user, candidate-set) for 5-15 minutes. For non-logged-in users on session-based personalization, the cache key is the session-interest-vector hash, which gets reasonably high hit rates within a session.
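A cache key for the (user, candidate-set) pair needs to be stable regardless of candidate order. The sketch below hashes a canonical string of the user ID plus sorted candidate IDs; `RerankCacheKey` and the `rerank:` prefix are hypothetical. The key could then be used with `IMemoryCache` and a 5-15 minute absolute expiration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class RerankCacheKey
{
    // Same user + same set of candidates (in any order) yields the same key.
    public static string For(int userId, IEnumerable<int> candidateIds)
    {
        var canonical = $"{userId}:{string.Join(",", candidateIds.Distinct().OrderBy(id => id))}";
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return "rerank:" + Convert.ToHexString(hash);
    }
}
```

If stated intent is part of the re-rank prompt, it would need to be folded into the canonical string as well, otherwise two different intents would share one cached ranking.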

A/B testing

Personalization changes are notoriously hard to evaluate by eye. Run every model change through an A/B test against the current production system. Track CTR, conversion rate, and average order value. Roll out only when statistical significance is reached.
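For conversion rate, "statistical significance" usually means something like a two-proportion z-test between the control and the variant. The sketch below uses the pooled standard error and the conventional 1.96 threshold for a two-sided 95% level; `AbTest` is a hypothetical helper, and a real experiment should also fix the sample size in advance rather than peeking repeatedly.

```csharp
using System;

public static class AbTest
{
    // Two-proportion z-statistic: (pB - pA) / pooled standard error.
    public static double ZScore(long conversionsA, long visitorsA, long conversionsB, long visitorsB)
    {
        if (visitorsA <= 0 || visitorsB <= 0)
            throw new ArgumentException("Each variant needs at least one visitor.");

        double pA = (double)conversionsA / visitorsA;
        double pB = (double)conversionsB / visitorsB;
        double pooled = (double)(conversionsA + conversionsB) / (visitorsA + visitorsB);
        double standardError = Math.Sqrt(pooled * (1 - pooled) * (1.0 / visitorsA + 1.0 / visitorsB));

        return (pB - pA) / standardError;
    }

    // |z| > 1.96 corresponds to p < 0.05 for a two-sided test.
    public static bool IsSignificantAt95(double z) => Math.Abs(z) > 1.96;
}
```

For example, 100 conversions from 10,000 visitors against 150 from 10,000 gives a z-score a little above 3, which clears the threshold, while 100 versus 105 conversions from 1,000 visitors each does not.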

Diversity

Pure relevance optimization eventually collapses to the same 5 products everyone has bought. Inject controlled diversity: reserve 1-2 slots out of every 10 for higher-variance picks, meaning items that are moderately related but not top-ranked. Track the effect against long-tail discovery metrics.
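Slot reservation can be sketched as a merge of two lists: the relevance ranking and an exploration pool chosen by some other policy. In the hypothetical helper below, `every: 5` reserves slots 5 and 10 (2 out of 10); how the exploration pool is built is left as an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class DiversityInjector
{
    // Fills `slots` positions; every `every`-th position is taken from the exploration pool.
    public static List<int> Inject(IReadOnlyList<int> ranked, IReadOnlyList<int> explorationPool, int slots, int every)
    {
        if (every < 2) throw new ArgumentOutOfRangeException(nameof(every));

        var result = new List<int>(slots);
        var used = new HashSet<int>();
        int rankedIndex = 0, poolIndex = 0;

        while (result.Count < slots)
        {
            bool exploreSlot = (result.Count + 1) % every == 0;
            int id;

            // Prefer the slot's own source; fall back to the other so the rail still fills.
            bool found = exploreSlot
                ? TryTake(explorationPool, ref poolIndex, used, out id) || TryTake(ranked, ref rankedIndex, used, out id)
                : TryTake(ranked, ref rankedIndex, used, out id) || TryTake(explorationPool, ref poolIndex, used, out id);

            if (!found) break; // both sources exhausted
            result.Add(id);
        }
        return result;
    }

    // Advances through the source until it finds an ID that has not been used yet.
    private static bool TryTake(IReadOnlyList<int> source, ref int index, HashSet<int> used, out int id)
    {
        while (index < source.Count)
        {
            id = source[index++];
            if (used.Add(id)) return true;
        }
        id = default;
        return false;
    }
}
```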

Hosting recommendations

ASP.NET Business — $17.49/mo

Mid-size catalogs (5K-50K items), multi-shop e-commerce, content platforms with full LLM re-rank. 2 GB headroom for in-process ML.NET model.


ASP.NET Professional — $27.49/mo

Large catalogs, multi-tenant agency platforms, high-concurrency recommendation APIs. 4 GB per pool, highest priority scheduling.


FAQs

Do I need the LLM layer? Isn't CF + embeddings enough?

For many catalogs, CF + embeddings is enough and will give you 90% of the achievable lift. Add the LLM re-ranker when you want explanations ("why is this recommended?"), when you need to honor stated intent ("I'm shopping for a gift"), or when you're optimizing the last 10% of conversion rate.

How often should I retrain the ML.NET model?

Daily for most catalogs. Hourly for very high-velocity sites (flash sales, marketplace with constant new items). Monthly is too slow — your model will lag behind new products and trends.

What about real-time updates?

The ML.NET model is offline-trained, but the embedding-based layer responds to the user's current session in real time. A user who's been browsing hiking boots for 5 minutes immediately sees relevant items in the next page render, even before a CF retrain.

How do I measure if personalization is working?

A/B test against the unpersonalized baseline. Track CTR on recommendation slots, conversion rate from those slots, and average order value of users in each variant. Statistical significance usually takes 1-2 weeks of meaningful traffic. Don't ship from vibes.

What about negative signals (returns, dislikes)?

Feed them into the interaction strength. A return after purchase = -2 strength. A "not interested" click = -1. The CF model learns to deprioritize products with negative engagement.

How do I prevent filter-bubble degradation?

The diversity injection layer is the answer. Reserve 10-20% of recommendation slots for items chosen by a different policy — random, popular-but-unrelated, editorially curated. Users who only see hyper-personalized items eventually disengage; some serendipity sustains long-term engagement.

Ship it

Personalization is one of the highest-ROI investments a content or commerce site can make — typical conversion lift in the 15-30% range for sites starting from no personalization. The .NET 10 stack ships everything: ML.NET for the statistical layer, Microsoft.Extensions.AI for embeddings and LLM re-ranking, Blazor for the UI components, and SQL Server 2022 for the interaction log and trained model storage.

Adaptive Web Hosting's ASP.NET hosting plans run all of this on real Windows + IIS, with dedicated app pools that handle in-process ML.NET model serving without cold-start latency, SQL Server 2022 included, and Hangfire-driven background training that runs alongside your main web workload.