A Deterministic Replacement for LLM-as-Judge in Stateful Agent Evaluation · HackerLangs