Dispersion loss counteracts embedding condensation in small language models · Hacker News