21 November 06:55

A major study shows that many AI evaluation tests overstate the systems' real capabilities.

Adrian Rusu
IT&C knowledge
Photo: pixabay.com

A study conducted by the Oxford Internet Institute, in collaboration with over thirty institutions, analyzes 445 benchmarks used for evaluating artificial intelligence (AI). Researchers emphasize that many of these tests lack scientific rigor and do not accurately measure the abilities they claim to assess.

For example, some benchmarks do not clearly define the competencies being evaluated, while others reuse data from previous tests, affecting the reliability of the results. Adam Mahdi, one of the lead authors, warns that these deficiencies can distort the perception of AI progress. The study proposes eight recommendations for creating more transparent and reliable benchmarks, including a clear definition of the purpose of each test and the use of more representative task sets.
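The data-reuse problem the researchers describe can be illustrated with a minimal sketch (the function names and the toy data below are hypothetical, not from the study): if items from an older test set reappear in a new benchmark, models that saw the old set can score well without demonstrating the ability the benchmark claims to measure. The simplest check is exact matching after light normalization; real contamination audits use fuzzier methods such as n-gram or embedding overlap.

```python
def normalize(item: str) -> str:
    """Lowercase and collapse whitespace so trivial edits don't hide reuse."""
    return " ".join(item.lower().split())

def reused_fraction(new_benchmark: list[str], older_benchmark: list[str]) -> float:
    """Fraction of items in the new benchmark already present in the older one."""
    seen = {normalize(q) for q in older_benchmark}
    reused = sum(1 for q in new_benchmark if normalize(q) in seen)
    return reused / len(new_benchmark)

# Toy example: one of the three "new" items is a verbatim reuse
# (differing only in whitespace) of an older test item.
old = ["What is 2 + 2?", "Name the capital of France."]
new = ["What is 2+2?", "Name the  capital of France.", "Define entropy."]
print(reused_fraction(new, old))  # prints 0.333... (1 of 3 items reused)
```

A high reused fraction is exactly the kind of reliability flaw the study flags: the score partly reflects memorization of prior tests rather than the competency being evaluated.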

Sources

Image source
Control F5
New Study Finds AI Abilities Are Often Overstated Because of Flawed Tests
All rights reserved Strategic Media Team SRL
