[Medical AI Classroom: B8] Does AI Truly Understand the Meaning of Words? — How AI Learns to Treat Language as Something Meaningful —

May 2, 2025 / June 16, 2025

Introduction: Does AI Truly Understand “Meaning”?

Today’s AI systems have become remarkably fluent in using language.
They can read and summarize text, and even respond as if they’re having a natural conversation with a human.

For example, AI can now:

Infer possible diseases from a set of symptoms
Provide easy-to-understand explanations of test results
Summarize medical records clearly and concisely

These capabilities are made possible by a technology called large language models (LLMs).
ChatGPT is a well-known example of a system built on an LLM. Such models are trained on enormous amounts of text, on the order of trillions of words, to acquire a seemingly human-like command of language.

Seeing this, you may have found yourself wondering:

“Could it be that AI actually understands the meaning of words?”

But let’s pause and think about this more carefully.

Take the word “apple”, for instance. Even if you give this word to an AI, it cannot perceive:

Its shiny, red appearance
Its sweet or tart taste
The healthy image it evokes

That’s because AI has no senses—no sight, no taste, no feelings.
In other words, it doesn’t experience what an apple is. It cannot feel meaning the way humans do.

And yet, AI behaves as if it does understand meaning.
How is this possible?

The secret lies in a clever trick:
representing words as numbers.

In the following chapters, we’ll gently unpack how this works—step by step.

1. To a Computer, Words Are Just Strings of Symbols

When humans see the word “apple”, a variety of rich images may come to mind:

A shiny, red fruit
A sweet or tangy taste
An image of something healthy

But to a computer, “apple” looks like this:

“apple” → [‘a’, ‘p’, ‘p’, ‘l’, ‘e’] → [97, 112, 112, 108, 101]

This is simply a string of characters converted into numeric codes.
At this stage, it carries zero meaning.
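As a small illustration, the conversion above can be reproduced in Python. This is only a sketch: the helper name `to_code_points` is made up for this lesson, and it relies on Python's built-in `ord`, which returns the Unicode code point of a character (for plain English letters, the same values as ASCII).

```python
def to_code_points(word):
    """Return the numeric code (Unicode code point) of each character."""
    return [ord(ch) for ch in word]

characters = list("apple")
codes = to_code_points("apple")

print(characters)  # ['a', 'p', 'p', 'l', 'e']
print(codes)       # [97, 112, 112, 108, 101]
```

Notice that nothing in these numbers says "fruit" or "sweet". They only identify which letters were typed.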

2. A Computer’s Only Tool: Numbers

Humans can instantly associate words with images and emotions.
For example, when you hear the word “apple”, you might imagine a sweet red fruit, think of something healthy, or even picture a tech company.

But computers have no emotions or senses.
So when they encounter the word “apple”, they have no clue what it actually means.

So how does AI get closer to understanding the meaning of a word?

The answer lies in a uniquely computer-like approach:
converting all words into numbers.

Representing each word as a set of numbers
Learning meaning from the relationships between those numbers
Using numbers to calculate similarities and differences between words

In this way, AI uses its only real tool—numbers—to approach the world of meaning.

And the key to this approach is the concept of a vector.

3. What Is a Vector? A Simple, Intuitive Explanation

The word “vector” might remind you of something complex from math class.
But in AI, the idea of a vector is actually quite simple.

A vector is just an ordered list of numbers that represent different features of something.

For example, if we wanted to describe a person using numbers, it might look like this:

Age: 32
Height: 170 cm
Weight: 65 kg

This collection of numbers, representing multiple traits of a person, is a vector.

So you can think of a vector as:
“a bunch of numbers that capture different characteristics of something.”
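To make this concrete, here is a minimal sketch of the person example as a Python list. The names `feature_names` and `describe` are hypothetical, chosen only for this illustration. The point is that the position of each number tells you which trait it describes.

```python
# Each position in the vector has a fixed meaning.
feature_names = ["age_years", "height_cm", "weight_kg"]
person = [32, 170, 65]

def describe(names, vector):
    """Pair each number in the vector with the trait it stands for."""
    return dict(zip(names, vector))

print(describe(feature_names, person))
```

If the order were shuffled, the same three numbers would describe a very different person, which is why a vector is treated as an ordered list rather than a loose collection.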

Words can be represented this way too

Just like people, words can also be turned into vectors.

Take the word “apple”, for instance. It has several characteristics:

It’s a food
It’s sweet
It’s a fruit
It often appears in health-related contexts

These traits can all be encoded into numbers, forming what’s called a word vector.

This is how AI begins to treat words like “apple” or “banana” not as plain text, but as bundles of meaningful numbers.

4. Turning Words into Vectors: Word Embedding

In the previous section, we saw that words can be represented as bundles of features.
The technique used to do this is called Word Embedding.

“apple” becomes a long list of numbers

For example, AI might represent the word “apple” like this:

"apple" → [ 0.11, -0.04, 0.87, ..., 0.32 ]

This is a word vector—a sequence of dozens or even hundreds of numbers.

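Taken together, these numbers encode aspects of what “apple” means and how it’s used. In learned embeddings, a single number rarely corresponds to one human-readable trait; the meaning lies in the overall pattern.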

Words are placed on a “map of meaning”

When all words are represented in this way, AI begins to form what you can think of as a map of meaning.

For example:

“apple” → [ 0.11, -0.04, 0.87, …, 0.32 ]
“banana” → [ 0.09, -0.02, 0.85, …, 0.30 ]
“hospital” → [ -0.55, 0.10, -0.90, …, 0.05 ]

The differences in these vectors reflect the differences in meaning:
similar words cluster together, while unrelated words are farther apart.
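One common way to measure how close two vectors are is cosine similarity, which compares the directions in which they point. The sketch below is illustrative only: it keeps just the four components that are written out in the example above and drops the elided middle part, so these are toy 4-dimensional vectors, not real embeddings.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: 1 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors: only the components visible in the example (assumption).
toy_vectors = {
    "apple":    [0.11, -0.04, 0.87, 0.32],
    "banana":   [0.09, -0.02, 0.85, 0.30],
    "hospital": [-0.55, 0.10, -0.90, 0.05],
}

sim_fruit = cosine_similarity(toy_vectors["apple"], toy_vectors["banana"])
sim_other = cosine_similarity(toy_vectors["apple"], toy_vectors["hospital"])

print(f"apple vs banana:   {sim_fruit:.3f}")   # close to 1
print(f"apple vs hospital: {sim_other:.3f}")   # negative
```

With these toy numbers, “apple” and “banana” point in almost the same direction, while “apple” and “hospital” point in nearly opposite directions.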

Using word vectors, AI can start to “manipulate” meaning

Once words are turned into numbers, AI can process them using math.

Similar vectors = similar meanings
Greater distance = less related
Vectors can be combined to form new meanings

This means AI can now work with meaning through numbers, even without experiencing it.

We’ll explore how this “map of meaning” is used in practice in the next chapter.
It lies at the core of how generative AI models grasp context and produce natural language.

[Advanced Topic] What Is a “Distributed Representation”?

As we’ve seen, representing words as vectors is known as a distributed representation.

In the past, AI treated each word—like “apple” or “banana”—as an isolated symbol, such as ID=1 or ID=2.
With this approach, any sense of similarity between words was completely lost.

But in a distributed representation, a word’s characteristics are spread out across dozens or even hundreds of dimensions in its vector.
This allows the distance and relationship between words to be naturally expressed in space.

Words with similar meanings → have similar vectors
Words with different meanings → are placed far apart

In short, distributed representations offer a revolutionary way to capture meaning spatially.
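The contrast between the two approaches can be sketched in a few lines. In this illustration, the ID-style approach is written as one-hot vectors (a single 1 in the position of the word's ID), and the dense vectors are hypothetical 2-dimensional values invented for the example.

```python
vocab = ["apple", "banana", "hospital"]

def one_hot(word, vocabulary):
    """ID-style representation: a 1 at the word's own position, 0 elsewhere."""
    return [1.0 if w == word else 0.0 for w in vocabulary]

def dot(u, v):
    """Dot product, used here as a simple similarity score."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical dense (distributed) vectors, made up for this sketch.
dense = {
    "apple":    [0.9, 0.1],
    "banana":   [0.8, 0.2],
    "hospital": [-0.7, 0.6],
}

one_hot_score = dot(one_hot("apple", vocab), one_hot("banana", vocab))
dense_related = dot(dense["apple"], dense["banana"])
dense_unrelated = dot(dense["apple"], dense["hospital"])

print(one_hot_score)                   # 0.0 for any two different words
print(dense_related, dense_unrelated)  # related pair scores higher
```

With one-hot vectors, every pair of different words gets the same score of zero, so “apple” is no closer to “banana” than to “hospital”. Dense vectors leave room for degrees of similarity.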

This foundational idea leads directly into more advanced techniques such as contextual embeddings (where meaning depends on surrounding words), and the Attention mechanism, which we’ll explore in the next lesson.

5. Words with Similar Meanings End Up Close Together

As AI reads through massive amounts of text, it starts to observe how words are used.

For example, the words “apple” and “banana” often appear in similar contexts:

“I packed an apple and a banana in my lunch.”
“I had a banana and yogurt for breakfast.”

In contrast, “apple” and “hospital” rarely appear side by side.

Words used in similar situations tend to have similar meanings

This is true for humans too.
When we see words frequently used in the same types of situations, we intuitively feel:

“They seem like they belong to the same group.”
“Their meanings feel close.”

AI learns this same intuition through repeated exposure.

On the “map of meaning,” similar words are placed close together

As a result, “apple” and “banana” end up close to each other in the vector space,
while “apple” and “hospital” are positioned far apart.

[Diagram: Similar words are close in the vector space]

           ( far )   hospital
                ↑
                |
            apple  banana

In this way, words with similar meanings naturally “move closer” in the world of numbers.

The vector space becomes a map of meaning, and AI uses this map to gradually understand the relationships between words.

6. How Are Word Vectors Created?

The vector for “apple” wasn’t handcrafted by a human.
AI learns it gradually by reading massive amounts of text.

Not taught—learned through experience

No one tells the AI, “Apple is a fruit, so give it a vector like this.”

Instead, the model picks up on patterns like:

“apple” often appears with “banana” or “fruit”
“apple” and “hospital” rarely appear together
“patient” and “hospital” frequently co-occur

By noticing how words are used in relation to others, AI begins to infer what each word might mean.
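The kind of pattern described above can be observed by simply counting which words share a sentence. The following is a minimal sketch on a tiny corpus: the first two sentences come from this lesson, the others are invented for illustration, and the function names are hypothetical. Real systems learn from vastly more text and use more refined statistics.

```python
from collections import Counter
from itertools import combinations

corpus = [
    "i packed an apple and a banana in my lunch",
    "i had a banana and yogurt for breakfast",
    "she ate an apple and some fruit",           # invented example
    "the patient was admitted to the hospital",  # invented example
    "the hospital discharged the patient",       # invented example
]

def cooccurrence_counts(sentences):
    """Count, for each pair of words, how many sentences contain both."""
    counts = Counter()
    for sentence in sentences:
        unique_words = sorted(set(sentence.split()))
        for a, b in combinations(unique_words, 2):
            counts[(a, b)] += 1
    return counts

def count_pair(counts, w1, w2):
    """Look up a pair regardless of the order the words are given in."""
    return counts[tuple(sorted((w1, w2)))]

counts = cooccurrence_counts(corpus)
print(count_pair(counts, "apple", "banana"))      # 1
print(count_pair(counts, "apple", "hospital"))    # 0
print(count_pair(counts, "patient", "hospital"))  # 2
```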

Vectors are adjusted automatically, based on usage patterns

Each word starts with a random vector.
But as learning progresses, the vector gradually shifts to reflect its meaning.

The more the AI reads, the more refined the vector becomes—
just like humans gain a sense of a word’s meaning by seeing it in many contexts.

For AI, “experience” means reading lots of text

Humans learn meaning through real-life experiences.
For AI, its experience is reading—tons and tons of text.

Through this, it adjusts word vectors such that:

Similar words move closer together
Dissimilar words move farther apart

That’s how AI learns to represent the meaning of “apple”—not by feeling it, but by encoding it in numbers through patterns it discovers in text.
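The adjustment process can be sketched with a toy training loop. This is a heavily simplified illustration in the spirit of word2vec-style training, not the procedure any particular model uses: the word pairs are hand-picked stand-ins for "seen together" and "not seen together", and all names and settings (`train`, `dim`, `lr`, `epochs`) are assumptions made for this example.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(vocab, related, unrelated, dim=8, lr=0.1, epochs=200, seed=0):
    """Start from random vectors, then nudge them pair by pair."""
    rng = random.Random(seed)
    vectors = {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in vocab}
    for _ in range(epochs):
        for pairs, label in ((related, 1.0), (unrelated, 0.0)):
            for a, b in pairs:
                u, v = vectors[a], vectors[b]
                # Logistic-loss gradient step: positive pulls the pair
                # together, negative pushes it apart.
                step = lr * (label - sigmoid(dot(u, v)))
                vectors[a] = [x + step * y for x, y in zip(u, v)]
                vectors[b] = [y + step * x for x, y in zip(u, v)]
    return vectors

VOCAB = ["apple", "banana", "fruit", "patient", "hospital"]
RELATED = [("apple", "banana"), ("apple", "fruit"),
           ("banana", "fruit"), ("patient", "hospital")]
UNRELATED = [("apple", "hospital"), ("banana", "hospital"),
             ("fruit", "patient")]

trained = train(VOCAB, RELATED, UNRELATED)
print(cosine(trained["apple"], trained["banana"]))
print(cosine(trained["apple"], trained["hospital"]))
```

No one tells the loop what an apple is. It only sees which pairs belong together, and the vectors drift into positions that reflect that.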

7. What Can AI Do with Word Vectors?

By representing words as vectors, AI can treat language not just as symbols, but as meaningful data.

This transformation allows AI to behave as if it understands word meanings and relationships.

Example 1: Swapping similar words still makes sense

Take these two sentences:

He bought an apple.
He bought a banana.

Both are natural and easy to understand.
Humans can tell that apple and banana are similar—they’re both fruits.

Thanks to word vectors, AI can recognize this similarity too.
Because their vectors are close, it knows both words “fit” in the same sentence.

Example 2: Understanding word relationships

Now consider the words “fever” and “infection”, which are commonly paired in medical texts.

As AI reads more documents, it learns:

“fever” often co-occurs with “infection”
They are related in meaning

So when a user says, “I have a fever,” AI can infer, “Maybe an infection is involved,” based on learned patterns.

Example 3: Guessing unfamiliar words from context

Even if AI hasn’t seen the word “grapefruit” often, it might notice that:

“grapefruit” appears near “apple” and “orange”
It shows up in contexts similar to those of “fruit”

From this, AI can guess: “grapefruit is probably a kind of fruit too.”
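This kind of guess can be imitated by describing each word through the words that surround it, then asking which known word has the most similar surroundings. The sketch below uses a tiny invented corpus and hypothetical helper names. It is a simple count-based illustration, not how a large language model works internally.

```python
import math
from collections import Counter

sentences = [  # invented sentences for illustration
    "she ate an apple for breakfast",
    "he ate an orange for breakfast",
    "she ate a grapefruit for breakfast",
    "the patient visited the hospital for surgery",
]

def context_vector(word, corpus):
    """Count the other words appearing in sentences that contain `word`."""
    context = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        if word in tokens:
            context.update(t for t in tokens if t != word)
    return context

def cosine(c1, c2):
    """Cosine similarity between two count vectors stored as Counters."""
    dot = sum(c1[k] * c2[k] for k in c1)
    norm1 = math.sqrt(sum(v * v for v in c1.values()))
    norm2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (norm1 * norm2)

def most_similar(word, candidates, corpus):
    """Return the candidate whose contexts look most like those of `word`."""
    target = context_vector(word, corpus)
    return max(candidates,
               key=lambda c: cosine(target, context_vector(c, corpus)))

guess = most_similar("grapefruit", ["apple", "orange", "hospital"], sentences)
print(guess)  # apple
```

Even though “grapefruit” appears only once, its neighbors (“ate”, “breakfast”) look like those of “apple”, not those of “hospital”.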

Vectors = Maps of Meaning

Thanks to word vectors, AI can navigate a “map of meaning” where:

Similar meanings are close together
Distant meanings are far apart
Related words form patterns and clusters

This mapping helps AI interpret unfamiliar words, understand sentence context, and communicate in ways that feel surprisingly human.

8. Vectors Enable “Meaningful Calculations”

When words are represented as vectors (essentially lists of numbers), AI gains the ability to do calculations on meaning.

It may sound surprising, but AI can perform operations like:

king - man + woman ≈ queen

This famous example comes from research on word embeddings (Mikolov et al., 2013).

What does this mean?

This equation expresses the idea that:

“king” includes the concept of being male
Subtracting “man” removes the “maleness”
Adding “woman” brings in the “femaleness”
→ The resulting meaning is close to “queen”
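The arithmetic can be tried out on toy vectors. In this sketch the vectors have only two hypothetical dimensions, loosely "royalty" and "gender", with values invented for illustration. Real embeddings have many more dimensions, and the analogy holds only approximately.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Hypothetical 2-D vectors: [royalty, gender]. Values are made up.
toy = {
    "king":     [0.9, 0.8],
    "queen":    [0.9, -0.8],
    "man":      [0.1, 0.9],
    "woman":    [0.1, -0.9],
    "hospital": [-0.8, 0.1],
}

def analogy(a, b, c, vectors):
    """Compute a - b + c and return the nearest word other than the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

result = analogy("king", "man", "woman", toy)
print(result)  # queen
```

The input words themselves are excluded from the search, which is the usual convention when this kind of analogy is evaluated.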

Meaning can be moved numerically

This isn’t just a coincidence—it’s possible because meaning is encoded in numbers.

Concepts like:

Gender (male/female)
Social role (royalty, occupation)
Emotions or abstract traits

…are spread across different dimensions of the vector, allowing AI to measure and manipulate meaning numerically.

This is essentially doing math in the space of meaning.

Just like humans might say:

“This word feels more masculine.”
“That one sounds more emotional.”

AI can make similar judgments—not through feelings, but through math.

In short: Vectors are tools for manipulating meaning

With word vectors, AI doesn’t just store words; it can shift, combine, and reason about their meanings.

This capability brings AI one step closer to having a “sense” of language.

Such semantic arithmetic was an influential demonstration in natural language processing (NLP), and the underlying idea of representing words as learned vectors forms part of the foundation of how large language models and generative AI use language so fluently today.

9. Why AI Seems to Understand Meaning

AI doesn’t have emotions or consciousness.
It can’t visualize scenes from words, or feel moved by language the way we do.

And yet, today’s AI systems use language so naturally.
Why is that?

Because it treats words not as symbols, but as meaningful numbers

Thanks to word vectors—numerical representations of meaning—AI can do things like:

Read and summarize a medical record as if it understood the content
Answer questions with contextually appropriate words
Compose natural sentences by selecting words that fit smoothly

It acts like it understands

Of course, AI doesn’t actually “get it.”
It doesn’t know what a word means in the way humans do.

But by representing meaning as numbers, and manipulating those numbers through learned patterns,
AI can:

Choose the right word
Maintain semantic consistency
Respond with phrases that match the context

From our perspective, it feels like the AI understands meaning.

AI doesn’t feel meaning — but it can simulate it

In short, AI doesn’t experience meaning.
But it can work with meaning—in ways that mimic human understanding.

This is one of the most remarkable advances in modern language AI, and it’s what powers its growing impact in fields like healthcare, education, and business.

Summary of Lesson 8: The First Step Toward AI Understanding Language — Vectorization

Key Point	Explanation
Words are just symbols to computers	Computers cannot understand meaning the way humans do
Convert words into vectors	Using word embeddings to represent meaning numerically
Similar words are placed close together	Distance in vector space reflects similarity in meaning
AI can perform calculations on meaning	Examples like “king – man + woman ≈ queen” show semantic arithmetic

In the next lesson, we’ll take a deeper dive into what these vectors mean in context, focusing on the powerful mechanism known as Attention!

⚠️ Disclaimer

This content is based on information available at the time of writing.
Please note that updates to tools, libraries, or technologies may result in changes to the described content.

This material is intended for educational purposes only and should not be considered medical advice.
If applying these technologies in actual clinical settings, please ensure compliance with all relevant laws and guidelines (e.g., from the Ministry of Health, Labour and Welfare [MHLW], PMDA, METI, or relevant academic societies), and seek expert consultation as needed.

When using generative AI, particular caution must be taken regarding issues such as hallucinations (inaccurate outputs) and algorithmic bias.
A human expert should always review and validate any AI-generated outputs before clinical use.

This content includes portions drafted with the assistance of AI. While every effort has been made to ensure accuracy, any situation requiring professional judgment—such as in medicine, law, or education—should always be evaluated by a qualified specialist.

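About the author

髙﨑洋介 (Physician, Doctor of Medical Science, MBA) | Dr. TAKASAKI Yohsuke, MD, PhD, ScM, MPA, MBA, FRSM

AI physician-scientist and entertainer, Doctor of Medical Science, serial entrepreneur, and former medical officer at the Ministry of Health, Labour and Welfare (MHLW)
Master of Science, Harvard University; MBA, University of Cambridge; Master of Public Administration, Columbia University
After graduating from Okayama University Medical School, he worked in internal medicine and community medicine. At the MHLW he served as director of several offices (medical information, emergency and disaster medicine, international expansion, and others), and he also worked on health policy at the Cabinet Secretariat, the Cabinet Office, and the Ministry of Education, Culture, Sports, Science and Technology.
After leaving government service, he was responsible for new business development and investment at a major Japanese IT company and a UK venture capital firm, and founded several medical startups. He currently works on the development of medical AI and digital medical devices, and has opened an internal medicine clinic in Minato, Tokyo.
He teaches and conducts research as a professor at several universities, and runs the AI lab for medical professionals “Medical AI Nexus”, the medical media outlet “The Health Choice | 健康の選択”, and the beauty, medicine, and food portal “Food Connoisseur”.
Associate, University of Cambridge; board-certified supervising physician and specialist in social medicine; Fellow of The Royal Society of Medicine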