Decoding lipid–protein interactions with AI | Sasaki Laboratory, Institute of Science Tokyo

Decoding disease mutations and designing experiments

Many lipid-binding proteins are involved in neurological, immune, endocrine, and metabolic diseases, as well as retinal degeneration, hearing loss, cancer, and more. Yet in many cases it has not been adequately studied how the gene mutations found in patients alter lipid binding, lipid-dependent structural changes, membrane localization, and signal transduction.

We aim to decode such disease-related mutations from the as-yet-untested viewpoint of "abnormal lipid binding." That is, we use AI to predict how a disease mutation changes a protein’s three-dimensional structure, its lipid-binding domains, and its lipid-contacting residues, and distill these into experimentally testable hypotheses.

In this approach, for both wild-type and mutant proteins we examine, at the structural level, which lipids tend to bind which regions, which amino acids are involved in lipid recognition, and how mutations change binding domains and contact residues. This lets us build experimentally testable hypotheses about disease mechanisms. If an abnormality caused by a disease mutation turns out to stem from dysregulated lipid binding or lipid signaling, it may also lead to new therapeutic strategies through intervention in lipid metabolism.

The arrival of structure-prediction AI — and its limits

Structure-prediction AIs such as AlphaFold3 can now predict candidate complex structures that include not only proteins but also nucleic acids, small molecules, and lipids. For lipid–protein interactions too, it may be possible to examine which pocket a lipid is placed in and which amino acids it lies close to.

On the other hand, the output of structure-prediction AI is not itself experimental proof. Predictions vary from run to run, and even combinations that do not actually bind can yield plausible-looking complex structures. A single prediction therefore cannot be taken at face value as the correct answer.

Bringing the practices of experimental science into computation

In our laboratory, we therefore bring the repetition, statistics, and controls that are routine in experimental biochemistry into structure-prediction analysis. For the same protein–lipid combination, we run thousands of repeated predictions and treat the resulting candidate structures as an ensemble.

For each structure, we analyze the distances between lipids and amino-acid residues, contact frequencies, the reproducibility of binding domains, and differences from control lipids. This lets us extract recurring lipid-binding modes and interaction candidates characteristic of particular lipids, rather than configurations obtained by chance.
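As a rough illustration of what "contact frequency" across an ensemble can mean, the sketch below counts how often each residue contacts the lipid over a set of predicted models. The residue labels, the toy data, and the function name `contact_frequency` are hypothetical; the laboratory's actual scripts are not described on this page.

```python
from collections import Counter

# Hypothetical input: for each predicted model, the set of residues
# found in contact with the lipid (toy data, not real results).
ensemble_contacts = [
    {"K10", "R11"},
    {"K10"},
    {"K10", "R11", "S45"},
    {"R11"},
    set(),  # a model in which the lipid touches none of the residues
]


def contact_frequency(ensemble):
    """Return {residue: fraction of models in which it contacts the lipid}."""
    n_models = len(ensemble)
    if n_models == 0:
        raise ValueError("ensemble must contain at least one model")
    counts = Counter(res for model in ensemble for res in model)
    return {res: count / n_models for res, count in counts.items()}


freq = contact_frequency(ensemble_contacts)

# Rank residues from most to least frequently contacted (ties by name).
ranked = sorted(freq.items(), key=lambda item: (-item[1], item[0]))
for residue, fraction in ranked:
    print(f"{residue}: {fraction:.2f}")
```

Residues that recur across many independent predictions are the ones worth a closer look; a residue that appears in only a handful of models is more likely a chance placement.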

Key Idea

Evaluate the many candidate structures proposed by deep learning using the mindset of experimental science, and narrow down candidate amino acids, binding domains, and mutants for lipid binding.

Using generative AI and supercomputers

Extracting atom-to-atom distances, contact frequencies, the positions of binding domains, and structural changes caused by mutations from many predicted structures requires analysis programs tailored to the purpose. In our laboratory, we use generative AI to create and revise such Python scripts and to organize analysis procedures. By clearly stating in words what we want to analyze and having the AI assist with drafting code, fixing errors, and improving workflows, even wet-lab researchers and students who are not programming specialists can more easily assemble analyses suited to their own research goals.

Furthermore, we use generative AI not merely as a coding aid but as a support technology for deciding which lipids, which amino-acid residues, which mutants, and which experimental systems to prioritize for verification. Large-scale structure prediction and molecular-dynamics simulations are run on the campus supercomputer TSUBAME4.0, and the resulting numerical data are tabulated and visualized with Python, R, and structure-viewing software.

Workflow for AI-assisted analysis of lipid–protein interactions

Conceptual diagram of lipid–protein interaction analysis. Repeated prediction, generative AI, statistical analysis, and experimental verification are linked, applying AI to hypothesis generation and experiment design.

Predict the structure

Using structure-prediction AI such as AlphaFold3, we predict the same protein–lipid complex thousands of times. Comparing wild-type and mutant, and candidate and control lipids, we obtain a structural ensemble.
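One way to picture the comparison design is as a grid of prediction jobs: every protein variant paired with every lipid, repeated many times. The sketch below only enumerates such a grid; the labels, the use of a per-repeat `seed`, and the small repeat count are illustrative assumptions, and no structure-prediction software is invoked.

```python
from itertools import product

# Hypothetical labels for the conditions being compared.
proteins = ["wild_type", "mutant"]
lipids = ["candidate_lipid", "control_lipid"]

# The text describes thousands of repeats; a small number keeps the example short.
n_repeats = 3

# One job per (protein, lipid, repeat). Distinguishing repeats by a seed
# is an assumption made for this sketch.
jobs = [
    {"protein": protein, "lipid": lipid, "seed": seed}
    for protein, lipid, seed in product(proteins, lipids, range(n_repeats))
]

print(len(jobs), "prediction jobs")
print(jobs[0])
```

Laying the design out this way makes it easy to confirm that every condition, including the controls, receives the same number of repeats.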

Quantify

From each predicted structure, we extract the distances between lipids and amino-acid residues, the presence or absence of contacts, the positions of binding domains, and the placement of lipids. The necessary analysis code is created with the help of generative AI.
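A minimal sketch of this quantification step for a single structure is shown below. It assumes the atom coordinates have already been read from the predicted structure file into simple Python records; the record layout, the toy coordinates, and the 4.0 Å contact cutoff are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical atom records: (chain_id, residue_number, residue_name, (x, y, z)).
# Toy coordinates in angstroms, not taken from a real prediction.
protein_atoms = [
    ("A", 10, "LYS", (0.0, 0.0, 0.0)),
    ("A", 10, "LYS", (1.5, 0.0, 0.0)),
    ("A", 11, "ARG", (8.0, 0.0, 0.0)),
    ("A", 12, "GLY", (20.0, 0.0, 0.0)),
]
lipid_atoms = [(3.0, 0.0, 0.0), (5.0, 0.0, 0.0)]

CONTACT_CUTOFF = 4.0  # angstroms; an assumed threshold, to be chosen per study


def min_distances(protein_atoms, lipid_atoms):
    """Return {(chain, resnum, resname): shortest distance to any lipid atom}."""
    best = defaultdict(lambda: math.inf)
    for chain, resnum, resname, xyz in protein_atoms:
        key = (chain, resnum, resname)
        for lipid_xyz in lipid_atoms:
            best[key] = min(best[key], math.dist(xyz, lipid_xyz))
    return dict(best)


distances = min_distances(protein_atoms, lipid_atoms)

# A residue counts as "in contact" if any of its atoms lies within the cutoff.
contacts = {res for res, d in distances.items() if d <= CONTACT_CUTOFF}

for res, d in sorted(distances.items()):
    print(res, f"{d:.1f}", "contact" if res in contacts else "-")
```

Running the same function over every model in an ensemble yields one contact set per model, which is the raw material for the statistical step.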

Evaluate statistically

We statistically analyze the set of structures obtained by repeated prediction and evaluate contact frequencies, the reproducibility of binding domains, and differences from control lipids. This lets us prioritize the amino acids most likely involved in lipid binding.
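The page does not say which statistical tests are used. As one possible illustration, the sketch below applies a one-sided Fisher's exact test to ask whether a residue contacts the candidate lipid in more models than it contacts a control lipid. The counts are invented, the function name `fisher_one_sided` is hypothetical, and the test treats each predicted model as an independent observation, which is itself an assumption.

```python
from math import comb


def fisher_one_sided(hits_a, n_a, hits_b, n_b):
    """One-sided Fisher's exact test for a 2x2 table.

    Returns the probability of seeing at least `hits_a` contacts in group A
    if groups A and B actually shared the same contact rate.
    """
    total = n_a + n_b
    total_hits = hits_a + hits_b
    upper = min(total_hits, n_a)
    # Sum hypergeometric probabilities for outcomes at least as extreme.
    numerator = sum(
        comb(total_hits, k) * comb(total - total_hits, n_a - k)
        for k in range(hits_a, upper + 1)
    )
    return numerator / comb(total, n_a)


# Toy counts: the residue contacts the candidate lipid in 8 of 10 models
# and the control lipid in 2 of 10 models.
p_value = fisher_one_sided(hits_a=8, n_a=10, hits_b=2, n_b=10)
print(f"one-sided p = {p_value:.4f}")
```

Because repeated predictions from the same model may be correlated, a p-value like this is better read as a way to rank residues for follow-up than as formal proof of lipid specificity.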

Verify experimentally

We generate mutants in which the predicted amino acids are substituted and verify them by mass spectrometry, binding assays, subcellular-localization analysis, transcriptional-activity measurements, and more. We connect AI predictions to an understanding of actual biological phenomena.

AlphaFold3: from an "inference tool" to an "experiment-design tool"

Rather than treating a single structure prediction as the correct answer, we use structure-prediction AI for experimental design by combining repeated prediction, statistical analysis, and control comparison. The key is not to draw conclusions with AI alone, but to use AI to narrow down the candidates that should be confirmed by experiment.

With this approach, we can more rationally choose which candidate lipid-binding amino acids to test, which disease mutations to examine, how to design mutants, and which control lipids to use. Structure-prediction AI does not replace experiments; it is a tool that can substantially improve the precision of experimental design.

What we aim for

Our laboratory does not develop AI itself. We are a wet lab grounded in lipid biology, phosphoinositide analysis, mass spectrometry, cell experiments, and disease-model research. In recent years, AI has become easy to use not only for specialized data scientists but also for experimental researchers in the life sciences. At the same time, verifying the candidates that AI proposes in actual cells, tissues, and disease models requires solid experimental skills and biological insight.

Our starting point is, above all, questions about biological phenomena and disease pathology. We aim to clarify experimentally how lipids control protein function and how their dysregulation leads to disease. To do so, we must precisely design which lipids to examine, which amino acids to mutate, and which experimental systems to use for verification.

AI is a powerful tool that supports this experimental design. By using structure-prediction AI, statistical analysis, and generative AI, we narrow down candidate amino acids involved in lipid binding, set strategies for control lipids and mutant design, and refine them into testable hypotheses.

Researchers pose the questions about biological phenomena and disease pathology, AI sharpens the focus of the experiments, and experiments provide the confirmation. This is the lipid biology for the AI era that our laboratory aims for.

This page is a general-audience explanation based on a presentation at the 5th Lipid Forum (April 2026, Toshiyoshi Yamamoto). This research is carried out in collaboration with the Department of Computational Drug Discovery, Medical Research Laboratory, Institute of Science Tokyo (Prof. Ryuichiro Ishitani and Assoc. Prof. Yoshitaka Moriwaki). Details of specific analysis examples will be added after the related papers are published.
