Dispersion loss counteracts embedding condensation in small language models · HackerLangs