This paper evaluates the performance of six open-weight LLMs (llama3-8b, llama3.1-8b, gemma2-9b, mixtral-8x7b, llama3-70b, llama3.1-70b) in recommending experts in physics across five tasks: top-k experts by field, influential scientists by discipline, epoch, seniority, and scholar counterparts. The evaluation examines consistency, factuality, and biases related to gender, ethnicity, academic popularity, and scholar similarity. Using ground-truth data from the American Physical Society and OpenAlex, we establish scholarly benchmarks by comparing model outputs to real-world academic records. Our analysis reveals inconsistencies and biases across all models. mixtral-8x7b produces the most stable outputs, while llama3.1-70b shows the highest variability. Many models produce duplicated recommendations, and some, particularly gemma2-9b and llama3.1-8b, struggle with formatting errors. LLMs generally recommend real scientists, but accuracy drops in field-, epoch-, and seniority-specific queries, consistently favoring senior scholars. Representation biases persist, replicating gender imbalances (reflecting male predominance), under-representing Asian scientists, and over-representing White scholars. Despite some diversity in institutional and collaboration networks, models favor highly cited and productive scholars, reinforcing the rich-get-richer effect while offering limited geographical representation. These findings highlight the need to improve LLMs for more reliable and equitable scholarly recommendations.
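As a rough illustration of the kind of factuality check this evaluation implies, the sketch below asks the OpenAlex API whether a model-recommended name matches a real author record. This is a minimal sketch under our own assumptions (an exact-match rule and a hypothetical `is_real_scholar` helper), not the authors' actual pipeline.

```python
# Minimal factuality check: does a recommended name exist in OpenAlex?
# Illustrative only; the paper's matching procedure may differ.
import requests

OPENALEX_AUTHORS = "https://api.openalex.org/authors"

def is_real_scholar(name: str) -> bool:
    """True if an OpenAlex author record has exactly this display name."""
    resp = requests.get(OPENALEX_AUTHORS,
                        params={"search": name, "per-page": 5}, timeout=10)
    resp.raise_for_status()
    return any(a.get("display_name", "").lower() == name.lower()
               for a in resp.json().get("results", []))

for name in ["Lisa Randall", "A. Fictitious Physicist"]:
    print(name, "->", "found" if is_real_scholar(name) else "not found")
```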
In the aftermath of the “AI boom,” this report examines how the push to integrate AI products everywhere grants AI companies, and the tech oligarchs who run them, power that goes far beyond their deep pockets.
This ILO Working Paper refines the global measurement of occupational exposure to generative AI by combining task-level data, expert input, and AI model predictions. It offers an improved methodological framework to assess how GenAI may impact jobs across countries and sectors.
AI can significantly reduce time spent on government tasks. This report describes a trial of 20,000 civil servants in the United Kingdom, showing they could save nearly two weeks each annually by using the technology. It suggests AI tools have the potential to transform productivity and public service delivery at scale.
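Taking the headline numbers at face value, the aggregate saving is easy to sketch; the working-weeks-per-year figure below is our assumption for illustration, not the report's.

```python
# Back-of-the-envelope aggregate of the report's headline figures.
civil_servants = 20_000
weeks_saved_each = 2                      # "nearly two weeks each annually"
working_weeks_per_year = 44               # assumed, not from the report

total_weeks = civil_servants * weeks_saved_each
person_years = total_weeks / working_weeks_per_year
print(f"{total_weeks:,} weeks ≈ {person_years:,.0f} person-years annually")
```

At roughly 40,000 weeks, that is on the order of 900 working years of capacity freed per year across the trial population.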
Jisc report: Over the last two years we have spoken to groups of students to get a broad understanding of how they view artificial intelligence (AI), how they are using it, and what their concerns and hopes are. We've published two reports summarising our findings, in 2023 and 2024.
The use of chatbots equipped with artificial intelligence (AI) in educational settings has increased in recent years, showing potential to support teaching and learning. However, the adoption of these technologies has raised concerns about their impact on academic integrity, students' ability to problem-solve independently, and potential underlying biases. To better understand students' perspectives and experiences with these tools, a survey was conducted at a large public university in the United States. Through thematic analysis, 262 undergraduate students' responses regarding the perceived benefits and risks of AI chatbots in education were categorized into themes. The students identified several benefits, with feedback and study support, instruction capabilities, and access to information being the most cited. Their primary concerns included risks to academic integrity, accuracy of information, loss of critical-thinking skills, the potential development of overreliance, and ethical considerations such as data privacy, system bias, environmental impact, and preservation of human elements in education. While student perceptions align with previously discussed benefits and risks of AI in education, they show heightened concern about distinguishing between human- and AI-generated work, particularly in cases where authentic work is flagged as AI-generated. To address students' concerns, institutions can establish clear policies regarding AI use and develop curricula around AI literacy. With these in place, practitioners can effectively develop and implement educational systems that leverage AI's potential in areas such as immediate feedback and personalized learning support. This approach can enhance the quality of students' educational experiences while preserving the integrity of the learning process.
On 3 and 4 April, LSE jointly hosted a conference with Peking University on 'Global Approaches to Gen AI in Higher Education'. You can watch a video playlist of highlights from the conference via YouTube, featuring teaching colleagues from several LSE Departments, a student panel, and guest keynote speakers.
Just over a decade ago, the ORCID (Open Researcher and Contributor Identifier) was created to provide a unique digital identifier for researchers around the world. The ORCID has proven essential in identifying individual researchers and their publications, both for bibliometric research analyses and for universities and other organizations tracking the research productivity and impact of their personnel. Yet widespread adoption of the ORCID by individual researchers has proved elusive, with previous studies finding adoption rates ranging from 3% to 42%. Using a national survey of U.S. academic researchers at 31 research universities, we investigate why some researchers adopt an ORCID and some do not. We found an overall adoption rate of 72%, with adoption rates across academic disciplines ranging from a low of 17% in the visual and performing arts to a high of 93% in the biological and biomedical sciences. Many academic journals require an ORCID to submit a manuscript, and this is the main reason why researchers adopt one. The top three reasons for not having an ORCID are not seeing the benefits, being far enough along in one's academic career not to need it, and working in an academic discipline where it is not needed.
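As a technical aside on what makes the ORCID a robust identifier: an ORCID iD is 16 characters whose final character is a check digit computed with ISO 7064 MOD 11-2. A minimal validation sketch, using ORCID's own documented example iD:

```python
def orcid_checksum_ok(orcid: str) -> bool:
    """Validate the check character of an ORCID iD (ISO 7064 MOD 11-2)."""
    digits = orcid.replace("-", "")
    if len(digits) != 16:
        return False
    total = 0
    for ch in digits[:-1]:          # first 15 characters must be digits
        if not ch.isdigit():
            return False
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[-1].upper() == expected

print(orcid_checksum_ok("0000-0002-1825-0097"))  # True (documented example)
```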
Cybersecurity and privacy maturity assessment and strengthening for digital health information systems (WHO/Europe) https://www.who.int/europe/publications/i/item/WHO-EURO-2025-11827-51599-78854 This guide focuses on cybersecurity and privacy risk assessments in digital health, as tailored to the WHO European Region. It provides a framework for technical audiences to develop risk assessment specifications suited to the unique needs and goals of their organizations and countries in order to comply with country-specific cybersecurity and privacy regulations. The assessment questionnaire that forms part of the assessment methodology is also available in the form of a Microsoft Excel spreadsheet and is published as a separate web annex.
An unprecedented look at the state of AI’s energy and resource usage, where it is now, where it is headed in the years to come, and why we have to get it right.
Research Security Report: Stronger cooperation, safer collaboration: driving research security cooperation across Europe. Following on from the 2024 event series, the team have released a report based on its findings.
This report outlines the findings from assessing the extent to which public sector activities in the United Kingdom are suited for Generative AI (GenAI) use. The findings demonstrate the potential supporting role that GenAI could play in freeing up valuable public sector time. However, its potential to support public sector work activities varies across different sectors.
Generative artificial intelligence (GenAI) is increasingly used to support a wide range of human tasks, yet empirical evidence on its effect on creativity remains scattered. Can GenAI generate ideas that are creative? To what extent can it support humans in generating ideas that are both creative and diverse? In this study, we conduct a meta-analysis to evaluate the effect of GenAI on performance in creative tasks. For this, we first perform a systematic literature search, based on which we identify n = 28 relevant studies (m = 8214 participants) for inclusion in our meta-analysis. We then compute standardized effect sizes based on Hedges' g. We compare different outcomes: (i) how creative GenAI is; (ii) how creative humans augmented by GenAI are; and (iii) the diversity of ideas generated by humans augmented by GenAI. Our results show no significant difference in creative performance between GenAI and humans (g = -0.05), while humans collaborating with GenAI significantly outperform those working without assistance (g = 0.27). However, GenAI has a significant negative effect on the diversity of ideas for such collaborations between humans and GenAI (g = -0.86). We further analyze heterogeneity across different GenAI models (e.g., GPT-3.5, GPT-4), different tasks (e.g., creative writing, ideation, divergent thinking), and different participant populations (e.g., laypeople, business, academia). Overall, our results position GenAI as an augmentative tool that can support, rather than replace, human creativity, particularly in tasks benefiting from ideation support.
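For readers unfamiliar with the effect-size metric: Hedges' g is Cohen's d multiplied by a small-sample bias correction. A minimal sketch with made-up group statistics, not data from the meta-analysis:

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Hedges' g: standardized mean difference with small-sample correction."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                          / (n_t + n_c - 2))
    d = (mean_t - mean_c) / pooled_sd       # Cohen's d
    j = 1 - 3 / (4 * (n_t + n_c - 2) - 1)   # Hedges (1981) correction factor
    return d * j

# Illustrative numbers only
print(round(hedges_g(5.4, 1.2, 40, 5.1, 1.1, 40), 3))
```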
Honglin Bao, Mengyi Sun, Misha Teplitskiy; Where there’s a will there’s a way: ChatGPT is used more for science in countries where it is prohibited. Quantitative Science Studies 2025; doi: https://doi.org/10.1162/qss_a_00368
With the latest models achieving top scores in scientific and diagnostic reasoning tests, they could usher in a new era of growth. In our previous report …