In digital healthcare, it is important not only to improve efficiency through digitalization but also to create new value and drive innovation. Healthcare DX and digitalization bring about innovative value, such as new treatment methods and digital medical devices, thereby genuinely enhancing people’s well-being and utility. By focusing on innovation, we can invest in increasing the total amount of value in society, rather than merely making existing processes more efficient.
Now, let’s discuss healthcare DX and digitalization, particularly focusing on next-generation AI technologies such as “generative AI” and “foundation models,” as well as “medical and health data.” I’ll explain the three key concepts of next-generation digital healthcare and provide concrete examples of the value created.
Next time, I will delve into how humanity, armed with these new technologies and combined with medical and health data, can create new value. This will be explained in “Generative AI + Medical and Health Data = Next-Generation Healthcare [Part 2]: The Creation of Innovative Services and Businesses.”

[Glossary] What is a Personal Health Record (PHR)?
Before proceeding with our discussion, I would like to explain some terminology.
One of the terms related to medical and health data is the Personal Health Record (PHR). A PHR refers to a record system that allows individuals to manage their own health information. The key point is “self-management.” This includes a variety of health-related data such as medical records from healthcare institutions, medication prescriptions, test results, and vaccination information.
PHRs are designed to allow patients to access and share their health information as needed. This enables patients to better understand their health status, communicate effectively with their doctors, and make informed decisions about their health management and treatment.
The distinctive points are as follows:
Self-Management: PHRs are managed by the patients themselves, providing a mechanism for directly accessing and sharing health data.
Integrated Information: Information obtained from multiple healthcare institutions can be centralized, allowing for a comprehensive understanding of overall health status.
Information Sharing: Patients can share information with healthcare providers and caregivers as needed, contributing to better treatment and care.
PHRs facilitate the management and sharing of personal health information, supporting the receipt of higher quality healthcare.
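To make the idea concrete, here is a minimal sketch of what a self-managed PHR might look like as a data structure. The class and field names are hypothetical illustrations, not any standard PHR schema.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class PersonalHealthRecord:
    """Hypothetical sketch of a self-managed PHR (not a standard schema)."""
    owner: str
    prescriptions: list = field(default_factory=list)  # medication records
    test_results: list = field(default_factory=list)   # lab values with dates
    vaccinations: list = field(default_factory=list)   # vaccination history

    def add_test_result(self, name: str, value: float, unit: str, measured_on: date):
        # Self-management: the individual, not the hospital, appends each entry.
        self.test_results.append(
            {"name": name, "value": value, "unit": unit, "date": measured_on}
        )

    def share_with(self, provider: str) -> dict:
        # Information sharing: the patient decides what to export to a provider.
        return {"owner": self.owner, "provider": provider,
                "test_results": self.test_results}

phr = PersonalHealthRecord(owner="patient-001")
phr.add_test_result("HbA1c", 5.8, "%", date(2024, 4, 1))
print(phr.share_with("clinic-A"))
```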
Three Key Concepts
I will introduce three key concepts: ① Time-Series, ② Multimodal, and ③ Foundation Models. Let’s look at each one in detail.

① Medical and Health Data as Time-Series Data

Medical and health data, including Personal Health Records (PHR), encompass a variety of information related to health and medicine, such as medical records from healthcare institutions, medication prescriptions, test results, and vaccination information. Additionally, life log data obtained from wearable devices are also considered part of PHR.
That medical and health data are time-series data means that the recorded information accumulates over time. Regular measurements such as blood pressure, weight, and blood glucose levels; records of tests and treatments received at medical institutions; vaccination histories; and medication schedules are all examples. Because these pieces of information are recorded in chronological order, they let us see how health status changes over time, such as trends in blood pressure or fluctuations in weight from the past to the present.
PHRs differ from the fragmented and temporary data obtained through traditional methods like imaging or blood tests at hospitals. Instead, they are continuously accumulated over time. This helps in understanding an individual’s health trends and patterns, aiding in the development of appropriate preventive measures and medical strategies.
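As a minimal illustration (the readings below are invented), such continuously accumulated data can be represented and trended as a simple time series:

```python
import pandas as pd

# Hypothetical home blood-pressure readings accumulated day by day
bp = pd.DataFrame(
    {"systolic": [128, 132, 126, 135, 130, 138, 133]},
    index=pd.date_range("2024-04-01", periods=7, freq="D"),
)

# A rolling mean exposes the trend behind day-to-day fluctuation
bp["3day_trend"] = bp["systolic"].rolling(window=3).mean()
print(bp)
```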

What We Want to Know in the Medical Field
What do patients want to know when they visit a hospital, and what do healthcare professionals want to know? From a patient’s perspective, it is “Will my illness be cured?” For doctors, it is “Can I save the patient in front of me?”
However, for example, in the approval of pharmaceuticals, the data used to confirm efficacy is NOT your data. You haven’t verified its effectiveness on yourself beforehand. In other words, you assume that the medication will likely work for you based on the results of research data from other people. Therefore, there are limitations in determining if it will truly be effective for you or this specific patient.

Why These Limitations Exist
The following diagram shows the design of traditional medical research.
This is called a Randomized Controlled Trial (RCT): participants are randomly divided into two groups, one receiving the new treatment and the other receiving the usual treatment. This method allows for a fair and accurate evaluation of the treatment effect. RCTs are considered one of the most reliable forms of scientific evidence and play an important role in verifying the effectiveness of new drugs and treatments.
The same design is widely used in the marketing industry as A/B testing, which compares different versions of web pages, advertisements, or products to determine which is more effective.
The most important aspects here are “sampling and random assignment.” Random assignment equalizes unadjusted individual differences and variances. This clarifies the “average” effect of treatments or drugs.
However, since these are studies conducted on other people, what they reveal is the “average” effect in a group; individual differences and characteristics are not taken into account.
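The toy simulation below (entirely synthetic numbers, not real trial data) makes this concrete: random assignment recovers the average effect of a hypothetical blood-pressure drug, while the wide variation in individual effects stays invisible to the group comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Each participant has an individual response the trial never observes directly
individual_effect = rng.normal(loc=-5.0, scale=8.0, size=n)  # invented mmHg effects
baseline = rng.normal(loc=140.0, scale=10.0, size=n)

# Random assignment: new treatment vs. usual care
treated = rng.integers(0, 2, size=n).astype(bool)
outcome = baseline + np.where(treated, individual_effect, 0.0)

# The RCT recovers the AVERAGE effect by comparing group means...
ate_estimate = outcome[treated].mean() - outcome[~treated].mean()
print(f"estimated average effect: {ate_estimate:.2f} mmHg")  # close to -5.0

# ...but individual effects vary widely; some participants even respond adversely
print(f"share responding in the 'wrong' direction: {(individual_effect > 0).mean():.1%}")
```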

Limitations of Traditional Medical Research
Here is a more academic explanation (feel free to skip it).
Randomized Controlled Trials (RCTs) are among the most reliable research designs, in terms of evidence quality, in fields ranging from medical research to the social sciences. The main feature of RCTs is the random assignment of participants to experimental and control groups, allowing the effects of treatments or interventions to be measured fairly and accurately. Randomization eliminates selection bias, enabling the true treatment effect to be estimated. One theoretical foundation for this is Rubin’s Causal Model (RCM).
Rubin’s Causal Model is a statistical framework for evaluating the effect of an intervention or treatment. It considers the “potential outcomes” each individual would show under different conditions and aims to isolate the treatment effect through random assignment of interventions. By assuming that one individual’s treatment assignment does not affect another’s outcome (the Stable Unit Treatment Value Assumption, SUTVA), more accurate causal inference becomes possible. Within this framework, the average treatment effect (ATE) and the conditional average treatment effect (CATE) quantify the average impact of an intervention on a whole group or on specific subgroups.
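In the standard notation of the potential-outcomes framework, these quantities are written as follows:

```latex
% Potential outcomes of individual i under treatment (1) and control (0)
Y_i(1), \quad Y_i(0)

% Individual treatment effect (never directly observable: only one
% potential outcome is realized per person)
\tau_i = Y_i(1) - Y_i(0)

% Average treatment effect over the whole population
\mathrm{ATE} = \mathbb{E}\bigl[Y(1) - Y(0)\bigr]

% Conditional average treatment effect for a subgroup with covariates X = x
\mathrm{CATE}(x) = \mathbb{E}\bigl[Y(1) - Y(0) \mid X = x\bigr]
```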
Randomization is crucial in the RCM. It ensures that pre-existing characteristics are balanced between the treatment and control groups, so that the measured difference in outcomes can be attributed to the treatment itself. By eliminating selection bias and confounding variables, randomization enhances the reliability and validity of the estimated causal effects.
Rubin’s Causal Model is widely used in fields such as medicine, social sciences, and economics. When ethical or practical reasons prevent conducting experiments, this model becomes a powerful tool for estimating causal relationships from observational data.
However, as shown in the diagram below, randomization equalizes potential covariates (influencing factors) between the treatment and control groups, discarding individual differences and characteristic data (to avoid bias). This inability to perform causal inference on an individual level represents a limitation of modern medicine.

Personal Time-Series Life Log Data as the Ultimate Tool for Uncovering Causal Relationships
Recently, wearable devices have come to carry numerous sensors capable of measuring a variety of biomarkers. For instance, blood values that previously could be measured only about once a month at a medical institution are expected, in the near future, to be obtainable anytime and anywhere with hospital-grade accuracy.

Personal time-series life log data are characterized by the continuous recording of an individual’s behavior, physiological responses, and environmental changes, resulting in a large number of samples. Time-series data are effective in uncovering causal relationships as they capture fluctuations over time rather than just a single moment.

As data science rapidly advances, such personal time-series data have become a crucial source of information. They allow individuals to be compared with, and modeled against, their own past, making it possible to predict the future; in this sense, they are the ultimate tool for revealing causal relationships for a specific individual (n = 1).
In other words, personal medical and health time-series data hold the potential to overcome the limitations of traditional medical research!
Utilizing this data allows the impact of individual behaviors on health, and of environmental changes on psychological states, to be tracked and analyzed scientifically. However, such analysis requires new time-series methods not used in traditional medical research. The crucial factor here is not the sample size (n) but the sampling rate of the data: the finer the temporal resolution, the more precise the causal inferences that can be drawn, enabling personalized interventions.
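As a minimal n = 1 sketch (the data and the “sleep affects next-day resting heart rate” relationship are entirely synthetic), a simple lagged regression is one way to probe within-person relationships in dense personal time series; real analyses would need to treat confounding and autocorrelation far more carefully.

```python
import numpy as np

rng = np.random.default_rng(1)
days = 365  # a year of daily wearable data from ONE person (n = 1, many samples)

sleep_hours = rng.normal(7.0, 1.0, size=days)
noise = rng.normal(0.0, 2.0, size=days)
# Invented ground truth: each hour of sleep lowers next-day resting HR by 1.5 bpm
resting_hr = 75.0 - 1.5 * np.roll(sleep_hours, 1) + noise
resting_hr[0] = 75.0  # day 0 has no previous night in the record

# Lagged regression: does last night's sleep predict today's resting HR?
x = sleep_hours[:-1]  # sleep on day t
y = resting_hr[1:]    # resting HR on day t + 1
slope, intercept = np.polyfit(x, y, 1)
print(f"estimated within-person effect: {slope:.2f} bpm per hour of sleep")
```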

Analyzing time-series data goes beyond identifying correlations, clarifying which factors actually influence outcomes. This has broad applications, from promoting personalized healthcare and wellness to optimizing lifestyle habits. This data-driven approach enables scientifically-based decision-making and provides a powerful foundation for designing the best possible future for each individual.

② Medical and Health Data are Multimodal

These days, you probably hear about AI and deep learning almost every day. But what exactly are they? How can we describe them in a nutshell?

AI is Matrix Calculation
I believe that “AI and deep learning” can be summed up as “matrix calculations.” Deep learning is, at its core, a technique that learns features from large amounts of data through matrix calculations. This makes it possible to model complex patterns and relationships and to run predictions and analyses on new data.

In other words, since “AI and deep learning” are essentially “matrix calculations,” if something can be expressed as a matrix, it can be used for deep learning. In fact, medical data and individual time-series life log data come in the form of images, videos, waveforms, and more, all of which can be represented as matrices. This means that various types of medical data can be handled within a single model.
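A toy illustration (shapes and values are arbitrary): an image patch and a waveform segment are both just matrices, so one and the same neural-network “layer,” which is a matrix multiplication plus a nonlinearity, can process either.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two very different medical signals, both representable as matrices
image_patch = rng.random((8, 8))  # e.g., a tiny grayscale image region
ecg_segment = rng.random((8, 8))  # e.g., waveform samples arranged as a matrix

# One "layer" of a neural network is essentially weights @ input + nonlinearity
weights = rng.normal(size=(4, 8))

def layer(x):
    return np.maximum(weights @ x, 0.0)  # matrix multiply, then ReLU

print(layer(image_patch).shape)  # (4, 8) -- identical computation for both
print(layer(ecg_segment).shape)  # (4, 8)
```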

What is “Multimodality”?
Since its release, ChatGPT, together with the GPT-4 model running behind it, has garnered significant attention. GPT-4 can understand and process not only text but also other types of data, such as images, and integrate them for comprehensive analysis. Specifically, GPT-4 can take text and image inputs simultaneously, answer questions based on the combined information, and analyze its content.
This capability allows GPT-4 to describe image content, answer questions about the text associated with an image, and provide richer background information and context by merging data from multiple sources. This generative capability is what “generative AI” refers to, and the ability to handle multiple types of data is called “multimodality,” which is our second key concept.

Multimodality in Medical and Health Data
As mentioned above, medical data and individual time-series life log data include images, videos, and waveforms, all of which can be expressed as matrices. This means that medical and health data are multimodal, allowing various types of medical data to be handled within a single model.
This capability enables higher-level hypothesis testing, simulation, prediction, and data generation. This concept will be further explained in the next section.
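One common pattern for handling several modalities in a single model (a sketch of the general idea, not the author’s actual architecture) is to encode each modality separately into a shared embedding space and then fuse the embeddings:

```python
import numpy as np

rng = np.random.default_rng(3)

def encode(x, out_dim=16):
    """Stand-in encoder: project any flattened modality into a shared space."""
    w = rng.normal(size=(out_dim, x.size))
    return np.maximum(w @ x.ravel(), 0.0)

# Different modalities from one patient, each just a matrix or vector
ct_slice   = rng.random((32, 32))  # imaging
ecg        = rng.random(256)       # waveform
lab_values = rng.random(12)        # tabular test results

# Fusion: concatenate per-modality embeddings into one joint representation
joint = np.concatenate([encode(ct_slice), encode(ecg), encode(lab_values)])
print(joint.shape)  # (48,) -- a single vector a downstream model can consume
```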


③ Building Foundation Models
The third and final key concept is “foundation models.”

Among recent AI technologies, large language models have garnered particular attention. GPT-4, especially, has demonstrated its capabilities across many applications. It has been trained on an enormous amount of textual data, encompassing all sorts of language data available on the internet since humankind invented writing. It can model human language capabilities, records, and memory, and generate new text based on that knowledge, which is why it is used in conversational AI such as ChatGPT.
Models like GPT-4 are also referred to as “foundation models.” This term implies that the model can be applied to various tasks and applications, serving as a “foundation” for new technologies and products. For example, it is used in areas like language translation, text summarization, and question answering.

The term “foundation models” was proposed by Stanford University, referring to a single large-scale learning model that can be used for a wide range of applications. Researchers at Stanford University explore the potential of these models while also delving into ethical and societal issues, with experts from diverse fields such as law, ethics, and computer science collaborating to evaluate the technological advancements and their impact on society.
The strength of foundation models lies in their ability to solve many different problems with a single model. This eliminates the need for developers to train a new model from scratch for each specific task, making application development more efficient. Additionally, these models can quickly adapt to new data, improving performance over time.

From Partial Optimization by Deep Learning to Global Optimization by Foundation Models
Deep learning utilizes artificial neural networks with multiple layers to learn from data. This approach aims to generate models highly optimized for specific problems, targeting specific tasks or limited datasets. This is known as partial optimization, where one model is trained to solve one problem.
In contrast, foundation models are general-purpose models with broader applications. They learn from vast amounts of data and tasks, possessing the capability to address various types of problems. This represents global optimization, allowing one model to flexibly handle diverse tasks and be used for multiple purposes. Foundation models are easily upgradeable and adaptable to new tasks, integrating knowledge across extensive domains.
Because of these differences, foundation models are expected to respond flexibly to new types of data and unknown problems and to have a wide range of applications, whereas conventional deep learning models, being optimized for specific problems, have a comparatively narrow scope of application.

[More Specialized Content] The Future of Foundation Models and Artificial General Intelligence (AGI)
Foundation models play a crucial role as foundational technologies in the realization of Artificial General Intelligence (AGI). These models are characterized by learning from large amounts of data and acquiring general knowledge and capabilities applicable to diverse tasks. This flexibility in addressing various new problems is essential for the development of highly versatile artificial intelligence.
Particularly noteworthy is the ability of foundation models in transfer learning. Transfer learning refers to the process where a model applies the knowledge learned from one task to another task. Foundation models leverage the general knowledge acquired through extensive initial training to quickly adapt to new tasks with minimal data. This provides the adaptability and generality needed on the path to AGI.
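The sketch below illustrates the transfer-learning pattern numerically (the “pretrained” weights are random stand-ins, not a real model): the backbone is frozen and only a small task-specific head is fitted on a handful of new examples.

```python
import numpy as np

rng = np.random.default_rng(4)

# Pretend this backbone was pretrained on a huge, general dataset
pretrained_backbone = rng.normal(size=(32, 100))  # frozen weights

def features(x):
    return np.maximum(pretrained_backbone @ x, 0.0)  # reused general knowledge

# New task with very little data: fit ONLY a small linear head
x_small = rng.random((100, 20))  # 20 labeled examples
y_small = rng.random(20)
feats = features(x_small)        # (32, 20)
head, *_ = np.linalg.lstsq(feats.T, y_small, rcond=None)

def predict(x_new):
    return head @ features(x_new)  # backbone itself is never retrained

print(predict(rng.random((100, 5))).shape)  # (5,) predictions for the new task
```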
Thus, foundation models, with their generality and transfer learning abilities, enable the application of extensive knowledge beyond specialized domains. These characteristics position foundation models as central technologies in AGI, and their importance will likely grow in future AI research.

Author’s Research
As explained above, large language models are AI technologies that perform natural language processing tasks by learning from vast amounts of text. Today, they can also handle various other types of data, including images and videos.
So, what if we built a foundation model, a “large-scale human model,” using medical and health data, which are also multimodal and time-series data, similar to large language models? The non-profit organization I lead, the Institute for Sustainable Society, has been working on this development since 2020, a few years before ChatGPT was announced, and we are currently applying for a patent related to this technology.
Moreover, my company, Decades Inc., in collaboration with Jin Clinic (Shirokane, Minato-ku), is conducting research to build a large-scale human model based on lifestyle data such as diet and exercise.

What is the “Formula of Life”?
I call this technology, a large-scale human model for the medical field similar to ChatGPT/GPT, the “Formula of Life.”
As stated in “① Medical and Health Data as Time-Series Data,” traditional medical research has often treated individual-level variability, that is, the unique characteristics of individuals that affect outcomes, as bias that distorts true effects, and has typically excluded it. Such variability is usually removed through anonymization or randomization, which can overlook the value of individual-specific data.
My approach with the “Formula of Life” focuses on this long-ignored individual-level variability: each person’s unique features. By leveraging vast personal data, from time-series data obtained from wearable devices and cognitive tendencies grounded in behavioral economics to molecular-level genomic data, we model the whole picture of each individual.
This technology brings the latest deep learning techniques used in fields like financial engineering, autonomous driving, and natural language processing into medicine. The resulting personalized medicine can propose the best treatment plan for each patient, enable more effective health management, and has the potential to fundamentally change healthcare.
With this advanced approach, healthcare is transitioning from providing generalized solutions to offering customized care that meets the specific needs of individual patients. This is not merely a technological innovation but a paradigm shift towards more personalized treatments.

[More Specialized] The “Formula of Life” and Homeostasis
Homeostasis refers to the mechanisms by which living organisms maintain a stable internal environment. This includes conditions necessary for life activities such as body temperature, pH levels, and electrolyte concentrations. For example, body temperature is maintained constant regardless of external temperatures, blood pH stays nearly neutral, and blood glucose levels are regulated by insulin after eating to maintain health. Thus, homeostasis is the basis for optimal biological activity in response to changes in the external environment.
I believe that by analyzing medical and health time-series data and building a large-scale human model, the “Formula of Life,” which is a single, multimodal foundation model, we can model the body’s responses at the individual level as described above. Maintaining homeostasis is crucial for health, and disruptions in this balance can lead to many diseases.
We define the state where homeostasis is maintained as “health,” its breakdown as “disease,” finding the breakdown as “diagnosis,” and restoring the original homeostasis as “treatment.”
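One way to make these definitions operational (a sketch with invented numbers and thresholds, not the author’s actual model) is to treat “diagnosis” as detecting a departure from a person’s own homeostatic baseline:

```python
import numpy as np

rng = np.random.default_rng(5)

# Daily values of some regulated biomarker for one person (invented units)
stable = rng.normal(100.0, 3.0, size=300)                # homeostasis maintained
drift = 100.0 + np.linspace(0, 15, 65) + rng.normal(0, 3.0, size=65)
series = np.concatenate([stable, drift])                 # regulation breaking down

mu, sigma = stable.mean(), stable.std()  # the personal homeostatic range

def diagnose(window):
    """'Diagnosis' = detecting departure from the personal baseline."""
    z = (window.mean() - mu) / sigma
    return "disease (homeostasis broken)" if abs(z) > 3 else "health (homeostasis maintained)"

print(diagnose(series[:30]))   # early window: health
print(diagnose(series[-30:]))  # late window: disease
```

“Treatment,” in this picture, would be any intervention that brings the series back within the baseline range.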
If the “Formula of Life” I am developing could predict and simulate such states, as generative AI does, how exactly would current medicine change?

Next time, we will explore how humanity, combining this new technology with medical and health data, can create new value in “Generative AI + Medical and Health Data = Next-Generation Healthcare [Part 2]: Creation of Innovative Services and Businesses.”