Insights

From the field, not from theory.

20 June 2026

Custom Software, Build vs Buy, Security, Production

The SME tenant migration trap: why your build-vs-buy auth decision forgot the data layer

We bought an auth SaaS to handle roles and logins. It worked perfectly until we needed to migrate a 12-tenant database schema. The real build-vs-buy gap is never the login form; it is the data layer.

19 June 2026

Generative AI, Production, Cost Control, Structured Outputs

The structured output schema that doubled our token bill

Forcing LLMs into strict JSON schemas feels safe, but the hidden token cost of schema repetition and refusal loops will quietly destroy your unit economics. Here is how we measured and fixed it.

18 June 2026

Data Intelligence, pgvector, RAG, Reliability

The HNSW probe budget that tanked our RAG latency

Switching to pgvector HNSW cut our index build time by 80% but spiked RAG p99 latency to 4.2 seconds. The culprit was a single default we failed to override for production queries.

17 June 2026

Process Automation, Reliability, Queues, Idempotency

The idempotency key saved our automation from a queue replay

When our message broker replayed 12,000 events overnight, only 14 duplicate actions escaped. Here is how idempotency keys turned a costly incident into a minor log entry, and the implementation pattern we now mandate.

16 June 2026

GEO, AI Search, Measurement, LLMs

The brand mention variant problem in GEO measurement

AI engines cite you in ways you never planned. We tracked 12 unexpected mention variants across Perplexity and ChatGPT, revealing a massive blind spot in standard GEO tracking.

15 June 2026

AI Agents, Production, Reliability, Cost Control

Cheaper to roll back: checkpointing in multi-step agent loops

When an LLM tool call fails halfway through a ten-step agent run, re-running from scratch doubles latency and cost. Checkpointing state at each step cut our retry spend by 62%.

14 June 2026

Semantic Search, Embeddings, Production, Evals

The embedding drift that broke our semantic search

We swapped an embedding model to cut latency by 40ms. Three weeks later, recall silently dropped 18%. Here is the exact measurement that caught it and the reindexing trade-off we chose.

13 June 2026

Custom Software, Build vs Buy, Security, Production

The SME build-vs-buy decision that actually matters: authentication and authorization

Most SMEs treat auth as a solved problem you just outsource. Here is why building custom access control cost us 40 hours upfront but saved 200 hours of integration hell later.

12 June 2026

Generative AI, RAG, Evals, Production

The grounded assistant trade-off: precision at 96%, recall at 41%

Raising citation strictness to 96% precision in grounded assistants cuts useful recall to 41%. Here is how we calibrate the threshold and why 85% precision is the pragmatic ceiling for most enterprise copilots.

11 June 2026

Evals, Guardrails, Production

Evals and guardrails: shipping LLM features you can defend

Moving an LLM feature from demo to production requires more than prompt tweaks. Without structured evaluations and guardrails, you are deploying an unpredictable system and hoping for the best.

1 June 2026

AI Video, Production, Method

An AI-generated TV spot in 48 hours: the method

From blank page to a broadcast-ready 15-second master: the exact timeline of a disciplined generative video production — visual anchoring, multi-shot generation, licensed audio.

18 May 2026

GEO, SEO, AI

GEO: getting cited by AI engines is the new SEO

ChatGPT, Perplexity and AI Overviews now answer in place of your pages. Generative Engine Optimization decides whether you are their source or their blind spot.

22 April 2026

AI Agents, Production, Reliability

AI agents in production: what actually breaks (and how to prevent it)

Agent demos impress; production is less forgiving. Lessons from eighteen months of agents in continuous operation: real failures and the guardrails that work.