With AI models clobbering every benchmark, it’s time for human evaluation
The latest frontier in AI research is having more humans in the loop assessing just how good the models are.