Takeaways

  • In 2023, researchers introduced new benchmarks (MMMU, GPQA, and SWE-bench) to test the limits of advanced AI systems. Just a year later, performance rose sharply: scores increased by 18.8, 48.9, and 67.3 percentage points on MMMU, GPQA, and SWE-bench, respectively. Beyond benchmarks, AI systems made major strides in generating high-quality video, and in some settings, language model agents even outperformed humans on programming tasks with limited time budgets.
  • Industry continues to dominate frontier AI research, and AI support for scientific progress is accelerating further. However, AI still trails behind on more complex tasks such as competition-level mathematics, visual commonsense reasoning, and planning.
  • The artificial intelligence (AI) revolution is entering a fiercely competitive stage. Large machine learning models are being released almost monthly, and AI now performs tasks that were unimaginable a decade ago. In 2023, this revolution produced 10 takeaways worth considering.