🚧 Under construction 🚧
Stand by - moving over my best notes from the old blog.
Failure Categories: Signals, Impact, and First Response
Outages begin with signals - timeouts, 500s, missing data, user reports. Classify the failure type first. Skip guesswork, focus your triage, and go straight to what's broken.
Alert When Context Window Usage Exceeds 90%
Prevent silent LLM failures by catching prompt truncation early - a single alert can save hours of debugging broken AI behavior.
Hallucination Rate Tracking to Cut False Facts and Protect User Trust
Hallucinations undermine trust faster than outages. You can't stop LLMs from making things up, but you can count it, log it, and cut it down. This post shows how to track hallucinations in prod and feed the signals back into your system.
LLM Metrics That Actually Matter in Prod (Not BLEU or Accuracy)
Benchmarks like BLEU and accuracy stop being useful once your model hits prod. What matters are signals like user edits, off-topic drift, long-winded answers, and drop-offs. These are the metrics you should be tracking and charting.