Research for trusted, scalable AI deployments

 

Our AI research team advances model interoperability, control and robustness. Focusing on rigorous evaluation, interpretability and decision-making, we deliver insights through open-source solutions and peer-reviewed publications.

Why it matters

Too much of the market still treats evaluation, interpretability and reliability as secondary concerns. That works until teams need to confidently choose a model, mitigate hallucinations or explain how an AI decision was reached.

Our approach

We tackle these challenges from the ground up. Using a first-principles approach to evaluation, we build methods that rigorously assess AI systems at every stage: input, output and internal decision-making.

From the labs

Blog February 27, 2026

Concept consistency score

CCS measures how CLIP attention heads align with concepts; high-CCS heads preserve performance but can amplify social bias.
Blog December 15, 2025

P-less Sampling: A robust hyperparameter-free approach for LLM decoding

Blog December 15, 2025

p-less Sampling: A robust LLM decoding strategy

Blog October 10, 2025

Evaluating LLM-generated summaries using the Lie algebra framework

Blog June 04, 2025

The next frontiers in AI — according to industry leaders

Blog May 06, 2025

Calculating uncertainty in generative AI

Blog March 07, 2025

Evaluating LLMs using semantic entropy

Blog October 31, 2024

LLM benchmarks, evals and tests

Blog October 16, 2023

Decoding LLM uncertainties for better predictability

Blog September 08, 2023

A surprisingly effective way to estimate token importance in LLM prompts

Blog September 02, 2021

Probabilistic machine learning and weak supervision

Blog September 01, 2021

A gentle introduction to machine teaching

Blog

TinySQL

Blog

Beyond linear steering: Unified multi-attribute control for language models

Blog

Turning up the heat: Min-p sampling for creative and coherent LLM outputs

Blog

Beyond I am sorry, I can’t: dissecting large language model refusal

Blog

Distribution-aware feature selection for SAEs

Blog

Towards transparent AI grading: Entropy as a signal for human-AI disagreement

Blog

Steering smarter

Partners and collaborations

Thoughtworks AI labs sit within a wider network of organizations spanning public AI research, semiconductor innovation, cloud platforms, open source and AI engineering.

These relationships strengthen the labs’ ability to contribute to the methods, tools and technical standards shaping reliable AI.

For partnerships and collaboration inquiries

email ai-labs@thoughtworks.com