I use:
* Aider's Polyglot benchmark seems to be a decent indicator of which models are going to be good at coding:
https://aider.chat/docs/leaderboards/
* I generally assume OpenRouter usage to be an indicator of a model's popularity, and by proxy, utility:
https://openrouter.ai/rankings
* LLM-Stats collects a lot of benchmark charts that I look at:
https://llm-stats.com/