AI-03 Knowledge Representation

Table of Contents

    AI-03 Knowledge Representation
    Data, Information, and Knowledge
    What is Knowledge?
    Why Store Knowledge?
    Types of Knowledge Representation
    Expert Systems
    Forward Reasoning
    Backward Reasoning
    Forward vs Backward Reasoning

    Data, Information, and Knowledge

    Data – Raw Facts

    Definition: Data refers to raw, unprocessed, and unorganized facts. By itself, data does not carry any meaning until it is interpreted.

    Key Points:

    • Data is the basic input for information systems.
    • It can be qualitative (e.g., colors, names) or quantitative (e.g., numbers, percentages).
    • Without processing, data is not useful for decision-making.

    Examples:

    • Individual values: 25, Red, Shyam, 85%.
    • AI perspective: sensor readings, images, audio signals, text strings.

    Storage Formats:

    • Databases
    • CSV files
    • Data warehouses
    • Data lakes

    Information – Processed Data

    Definition: Information is data that has been processed, organized, or structured in such a way that it becomes meaningful.

    Key Points:

    • Information answers questions like who, what, where, when.
    • Information provides context and relevance to raw data.
    • Used for identifying trends, generating reports, and supporting analysis.

    Examples:

    • "Rahul scored 85% in Mathematics."
    • In AI: "User clicked 5 times on an advertisement."

    Applications:

    • Reporting systems
    • Data visualization
    • Trend analysis

    Knowledge – Applied Information

    Definition: Knowledge is the result of combining information with experience, insights, and reasoning. It involves interpretation and the ability to apply information to decision-making and problem-solving.

    Key Points:

    • Knowledge is actionable.
    • It goes beyond facts to include understanding, reasoning, and judgment.
    • In AI, knowledge is stored using facts, rules, and procedures.

    Examples:

    • "Rahul is good at Mathematics; he should take advanced classes."
    • Rule-based example: If temperature > 100°C, display a warning message.
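    The rule-based example above can be sketched as a small Python function (a minimal sketch; the function name and messages are illustrative):

```python
# Knowledge as an executable rule: IF temperature > 100°C THEN warn.
# Function name and message strings are illustrative.
def check_temperature(temp_celsius: float) -> str:
    if temp_celsius > 100:
        return "Warning: temperature exceeds 100°C"
    return "Temperature is within the normal range"

print(check_temperature(120))  # Warning: temperature exceeds 100°C
```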

    Storage Mechanisms in AI:

    • Knowledge bases
    • Expert systems
    • Rule engines
    • Ontologies

    Analogy: Baking a Cake

    • Data: Ingredients such as flour, sugar, and eggs (raw, unprocessed items).
    • Information: The recipe (structured steps providing meaning to ingredients).
    • Knowledge: Understanding how to adjust oven temperature or substitute ingredients based on experience.

    What is Knowledge?

    Definition: Knowledge is information that has been processed, structured, and integrated with experience, enabling reasoning, learning, and decision-making.

    In AI Systems:

    • Just as humans store knowledge in memory, AI systems store it in structured formats.
    • Knowledge helps in:
      • Decision-making
      • Problem-solving
      • Learning from data and prior cases

    Types of Knowledge:

    1. Factual Knowledge – "Water boils at 100°C."
    2. Conceptual Knowledge – "A square is a type of rectangle."
    3. Procedural Knowledge – "Steps to make tea."
    4. Heuristic Knowledge – "If someone is shivering, they might be cold."

    Why Store Knowledge?

    Reasons:

    1. To enable intelligent decision-making
      • Example: A chatbot stores user preferences to improve responses.
    2. To ensure reusability
      • Knowledge can be reused across different problems and applications.
    3. To build structured knowledge bases
      • Example: A medical knowledge base stores diseases and symptoms.
    4. To represent information in formal formats
      • Examples of representation: semantic networks, frames, logic rules, ontologies.
    5. To adapt storage formats to problem domains
      • Rules are effective in domains like legal reasoning and medical diagnosis.
    6. To support inference engines in AI
      • Example: If body temperature > 100°F, suggest "Take rest."
      • The inference engine uses stored knowledge to reason and derive new conclusions.

    Types of Knowledge Representation

    Declarative Knowledge – Facts and Rules

    Definition: Declarative knowledge refers to "knowing what." It represents facts, truths, relationships, and conditions about the world in the form of statements or rules.

    Key Characteristics:

    1. Explains what is true about the world.
    2. Includes facts, rules, relationships, and conditions.
    3. Stored as statements, assertions, or logical rules.
    4. Typically static in nature and does not change frequently.
    5. Used in AI for reasoning and inference.
    6. Forms the foundation of knowledge bases and expert systems.
    7. Answers what, when, and where questions.
    8. Separates knowledge representation from control or execution.
    9. Commonly represented using logic, semantic networks, production rules, or frames.
    10. Fundamental to symbolic AI and rule-based systems.

    Examples:

    • "Paris is the capital of France."
    • "If age < 18, then classify as minor."
    • "Birds have feathers."
    • "All squares are rectangles."
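    Declarative knowledge can be stored as plain data, separate from the code that reasons over it. A minimal sketch, assuming a simple (subject, relation, object) triple format; the facts mirror the examples above:

```python
# Declarative knowledge stored as (subject, relation, object) triples.
facts = {
    ("Paris", "capital_of", "France"),
    ("birds", "have", "feathers"),
    ("square", "is_a", "rectangle"),
}

# A rule is also declarative: IF age < 18 THEN classify as minor.
def classify_age(age: int) -> str:
    return "minor" if age < 18 else "adult"

print(("Paris", "capital_of", "France") in facts)  # True
print(classify_age(15))                            # minor
```

    Note how the facts describe what is true without saying how to use it; any inference procedure can consume the same triples.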

    Procedural Knowledge – Step-by-Step Procedures

    Definition: Procedural knowledge refers to "knowing how." It consists of methods, instructions, or step-by-step processes required to perform a specific task or achieve a goal.

    Key Characteristics:

    1. Tells us how to perform actions or tasks.
    2. Involves ordered steps, sequences, or methods.
    3. Stored in the form of algorithms, programs, or scripts.
    4. Dynamic and action-oriented in nature.
    5. Essential for automation, planning, and control.
    6. Answers how-to questions.
    7. Commonly applied in robotics, gaming, and automated tools.
    8. Requires execution to be useful (cannot remain theoretical).
    9. Can be harder to explain or transfer compared to declarative knowledge.
    10. Often works in combination with declarative knowledge for complete AI systems.

    Examples:

    • Steps to sort numbers using Bubble Sort.
    • Procedure to make tea (boil water, add tea leaves, sugar, and milk).
    • Diagnostic steps for identifying an illness.
    • Path-following algorithms used by robots for navigation.
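    The Bubble Sort example above is classic procedural knowledge: an ordered sequence of steps that must be executed to be useful. A minimal Python sketch:

```python
def bubble_sort(values):
    """Procedural knowledge: ordered steps that repeatedly swap
    adjacent out-of-order elements until the list is sorted."""
    result = list(values)  # work on a copy
    n = len(result)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if result[j] > result[j + 1]:
                result[j], result[j + 1] = result[j + 1], result[j]
    return result

print(bubble_sort([5, 2, 9, 1]))  # [1, 2, 5, 9]
```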

    Difference

    | Aspect | Declarative Knowledge | Procedural Knowledge |
    | --- | --- | --- |
    | Meaning | Knowing what (facts, truths, rules). | Knowing how (methods, steps, processes). |
    | Representation | Statements, facts, rules, or logical assertions. | Algorithms, programs, scripts, or step-by-step instructions. |
    | Nature | Static; rarely changes once established. | Dynamic; involves actions and execution, changes with tasks. |
    | Purpose | Reasoning, inference, and understanding relationships. | Performing tasks, problem-solving, and achieving goals. |
    | Answers | What, when, where questions. | How-to questions. |
    | Storage | Knowledge bases, expert systems, semantic networks, or frames. | Procedures, algorithms, code, or action rules. |
    | Execution | Does not require execution; can exist as facts or rules. | Requires execution to demonstrate knowledge. |
    | Ease of Transfer | Easier to explain and transfer (facts can be stated directly). | Harder to explain or transfer (skills often need practice to learn). |
    | Examples | "Paris is the capital of France"; "If age < 18, classify as minor"; "Birds have feathers". | Bubble Sort algorithm; making tea step by step; robot navigation procedure. |
    | Application in AI | Basis of symbolic AI, expert systems, and knowledge representation. | Basis of robotics, planning systems, games, and automation tools. |

    Expert Systems

    An Expert System is a computer program that simulates the decision-making ability of a human expert. It uses a knowledge base (facts and rules) and an inference engine (reasoning mechanism) to solve problems and provide advice in a specific domain, such as medicine, education, or technical troubleshooting.

    Characteristics of Expert Systems

    1. Mimics Human Expertise – Provides advice and solutions similar to a human expert.
    2. Domain-Specific Knowledge – Focuses on a particular area (e.g., medicine, law, or computer repair).
    3. Rule-Based Reasoning – Works using "if–then" type rules.
    4. Explanation Facility – Can explain how and why it reached a conclusion.
    5. User-Friendly Interface – Easy to use without technical expertise.
    6. Consistency – Provides the same answers every time, without fatigue or bias.
    7. Inference Engine – The reasoning component that applies rules to the facts to draw conclusions.

    Advantages of Expert Systems

    • Available 24/7 and can be accessed anytime.
    • Provides quick and reliable answers.
    • Cost-effective compared to hiring a full-time expert.
    • Ensures consistent decisions without variability.
    • Retains and preserves expert knowledge permanently.

    Limitations of Expert Systems

    • Cannot replicate human intuition, creativity, or emotional understanding.
    • Requires human experts to build, update, and maintain the system.
    • Limited to a single specialized domain; cannot solve problems outside it.
    • Struggles with uncertain, incomplete, or entirely new situations.

    Applications / Example Projects

    1. Medical Expert Systems – Diagnose illnesses based on patient symptoms.
    2. Laptop Troubleshooting Systems – Suggest fixes based on reported issues.
    3. Learning Assistants – Recommend study material based on performance.
    4. Car Repair Expert Systems – Identify faulty parts based on problem descriptions.

    Project - What is MYCIN?

    MYCIN was an early expert system developed in the 1970s at Stanford University. It was designed to help doctors diagnose bacterial infections (such as blood infections) and recommend suitable antibiotics.

    • Process: The system asked the doctor a series of questions, such as:

      • Does the patient have a fever?
      • Are there signs of infection in the blood?
      • What is the patient’s age and weight?
    • Functionality: Based on the input (facts), MYCIN applied if–then rules to:

      • Diagnose the type of infection (e.g., Streptococcus, E. coli).
      • Recommend the best antibiotics with correct dosage.
      • Provide an explanation for its reasoning, similar to a real doctor.
    • Modern Successors: Examples of modern expert systems in medicine include:

      • IBM Watson Health
      • DXplain
      • Infermed
      • AI-powered diagnostic tools integrated into hospital systems

    Forward Reasoning

    Forward reasoning—also called forward chaining—is a data-driven inference method used in rule-based systems. It starts from known facts, repeatedly applies if–then rules whose conditions match those facts, derives new facts, and continues until it reaches a goal/conclusion or no new facts can be inferred.

    Key Characteristics

    • Direction: Facts → rules fire → new facts → conclusion (data-driven).
    • Trigger: A rule fires when all its antecedents (IF conditions) are satisfied by current facts.
    • Goal Handling: Conclusions emerge naturally as consequences of rule firing; goals need not be stated upfront (but may be monitored).
    • Determinism: With a fixed conflict-resolution strategy, results are reproducible.
    • Typical Use: Diagnosis, monitoring, configuration, simulation, control—where inputs are available and we want to enumerate consequences.

    Core Components in a Forward-Chaining System

    1. Rule Base (Knowledge Base): Production rules of the form IF condition1 ∧ condition2 ∧ … THEN assert new_fact / perform action.
    2. Working Memory (Fact Base): The current set of known facts.
    3. Inference Engine:
      • Match: Find rules whose conditions match facts in working memory.
      • Select (Conflict Resolution): If multiple rules match, choose one using a strategy (see below).
      • Act (Fire): Execute the selected rule, typically adding new facts or actions.
    4. Agenda (Conflict Set): The set of all currently fireable rules awaiting selection.

    Common conflict-resolution strategies:

    • Specificity: Prefer rules with more specific (more conditions) antecedents.
    • Recency: Prefer rules using the most recently added facts.
    • Priority/Salience: Prefer rules with higher designer-assigned priority.
    • Refraction: Prevent the same rule from firing on the same fact pattern repeatedly.
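    These strategies can be combined into a single selection key. A hypothetical sketch that picks one rule from the agenda by salience first, breaking ties by specificity (the rule names and fields are illustrative):

```python
# Agenda of currently fireable rules; fields are illustrative.
agenda = [
    {"name": "R1", "salience": 0,  "conditions": ["fever", "cough"]},
    {"name": "R2", "salience": 10, "conditions": ["fever"]},
    {"name": "R3", "salience": 10, "conditions": ["fever", "cough", "fatigue"]},
]

# Prefer higher salience; break ties by specificity (more conditions first).
chosen = max(agenda, key=lambda rule: (rule["salience"], len(rule["conditions"])))
print(chosen["name"])  # R3
```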

    Algorithm (High-Level)

    1. Initialize working memory with given facts.
    2. Repeat:
      • Match rules whose conditions are satisfied by working memory.
      • If no rules match, stop (no new inferences).
      • Select one (or an ordered set) of rules from the agenda.
      • Fire the selected rule(s): add new facts or perform actions.
      • Optionally check: has a goal been derived? If yes, stop.
    3. Return the derived conclusion(s) and, if supported, an explanation trace.

    Termination conditions:

    • No new facts can be added (agenda empty), or
    • A desired goal/conclusion appears in working memory.

    Worked Examples

    Medical (diagnosis flow)

    • Facts: patient_has_fever, patient_has_cough, patient_has_fatigue
    • Rules (illustrative):
      • R1: IF fever ∧ cough THEN suspect_infection
      • R2: IF suspect_infection ∧ sneezing THEN suspect_viral_flu_or_covid
      • R3: IF test_covid_positive THEN diagnose_covid
    • Flow: Start with symptoms → apply R1 → if sneezing appears, apply R2 → order test → if positive, apply R3 → diagnosis reached.

    Causal chain

    • Fact: it_is_raining
    • Rules:
      • R1: IF it_is_raining THEN road_is_wet
      • R2: IF road_is_wet THEN traffic_is_slow
    • Conclusion: traffic_is_slow.
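    The causal chain above can be traced with a minimal forward-chaining loop (a sketch under the simplifying assumption that each rule is a set of condition facts plus one conclusion, with no conflict resolution beyond first match):

```python
def forward_chain(initial_facts, rules, goal=None):
    """Minimal forward chaining: fire every rule whose conditions are all
    satisfied, add its conclusion, and repeat until nothing new fires."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)  # the rule fires (act)
                changed = True
                if goal is not None and goal in facts:
                    return facts       # goal derived: stop early
    return facts                       # agenda empty: no new inferences

rules = [
    ({"it_is_raining"}, "road_is_wet"),    # R1
    ({"road_is_wet"}, "traffic_is_slow"),  # R2
]
print(forward_chain({"it_is_raining"}, rules))
```

    Starting from it_is_raining, R1 fires and adds road_is_wet, which lets R2 fire and add traffic_is_slow; the loop then terminates because no rule produces a new fact.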

    When to Prefer Forward over Backward Reasoning

    • Inputs are known; outputs are unknown. Example: real-time monitoring/diagnosis where new sensor facts arrive continuously.
    • You want all implications of current facts, not only to test a specific hypothesis.
    • Control/automation contexts where rule firing must react to changing facts.

    Advantages

    • Naturally accommodates streaming/continually updated facts.
    • Good for discovering multiple consequences in one pass.
    • Transparent reasoning with an explanation facility (trace of fired rules).
    • Modular knowledge: rules can be added/edited without redesigning the whole system.

    Limitations and Practical Considerations

    • Search space growth: May derive many intermediate facts that are irrelevant to a particular goal.
    • Rule interactions/loops: Requires controls (e.g., refraction, cycle checks) to avoid infinite firing.
    • Conflict resolution is crucial: Different strategies can change performance and even derived conclusions when knowledge is underspecified.
    • Uncertainty handling: In noisy domains, pair rules with certainty factors, probabilities, or fuzzy logic.
    • Efficiency: Industrial systems use pattern-matching algorithms (e.g., RETE) to speed up matching.

    Typical Applications

    • Medical diagnosis, fault detection, and troubleshooting.
    • Business rule engines, eligibility and compliance checking.
    • Industrial control and process automation.
    • Configuration systems and decision support.
    • Event processing and monitoring systems.

    Backward Reasoning

    Backward reasoning—also called backward chaining—is a goal-driven inference method. It starts with a hypothesis/goal and works backwards through the rules to check if the facts support the conclusion.

    Key Characteristics

    • Direction: Goal/conclusion → check rules backwards → verify facts.
    • Trigger: Begin with a hypothesis and search for rules that could produce it.
    • Goal Handling: Explicitly defined at the start; reasoning stops when proven or disproven.
    • Typical Use: Diagnosis, theorem proving, legal reasoning—where the goal is clear but facts need to be verified.

    Core Components

    1. Rule Base: Production rules of the form IF condition1 ∧ condition2 THEN conclusion.
    2. Goal (Hypothesis): The conclusion to be proven.
    3. Working Memory: Contains known facts.
    4. Inference Engine:
      • Select a rule whose conclusion matches the goal.
      • Add its conditions as new subgoals.
      • Recursively attempt to prove each subgoal from facts or other rules.
      • If all subgoals succeed → goal proven; if not → backtrack.

    Algorithm (High-Level)

    1. Start with a goal.
    2. Search for rules with the goal in their conclusion.
    3. For each such rule:
      • Add its conditions as subgoals.
      • Try to prove subgoals from facts or by applying more rules.
      • If all subgoals succeed, conclude the goal is true.
      • Otherwise, try another rule (backtracking).
    4. If no rules or facts support the goal, reasoning fails.

    Worked Examples

    Medical Diagnosis

    • Goal: Diagnose appendicitis.
    • Rule: IF abdominal_pain_lower_right ∧ nausea ∧ high_wbc THEN appendicitis.
    • Process:
      • Goal: appendicitis?
      • Check rule conditions → ask about pain, nausea, blood test.
      • If all facts confirmed → goal proven (appendicitis).

    Causal Chain

    • Goal: traffic_is_slow.
    • Rules:
      • R1: IF road_is_wet THEN traffic_is_slow
      • R2: IF it_is_raining THEN road_is_wet
    • Process:
      • Goal: traffic_is_slow?
      • Rule R1 → need to prove road_is_wet.
      • Rule R2 → need to prove it_is_raining.
      • Fact: it_is_raining is true → the subgoal chain succeeds → conclude traffic_is_slow.
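    The same chain can be proven goal-first with a minimal backward-chaining function (a sketch; rules are indexed by their conclusion, and the fact and rule names mirror the example above):

```python
# Rules indexed by conclusion: each entry lists alternative condition sets.
RULES = {
    "traffic_is_slow": [["road_is_wet"]],    # R1
    "road_is_wet":     [["it_is_raining"]],  # R2
}
FACTS = {"it_is_raining"}

def prove(goal):
    """Minimal backward chaining: a goal holds if it is a known fact,
    or if every condition of some rule concluding it can be proven."""
    if goal in FACTS:
        return True
    for conditions in RULES.get(goal, []):
        if all(prove(sub) for sub in conditions):
            return True
    return False  # no rule or fact supports the goal

print(prove("traffic_is_slow"))  # True
print(prove("road_is_icy"))      # False
```

    Unlike the forward-chaining loop, this never touches rules that are irrelevant to the goal; trying each alternative condition set in turn is what the notes call backtracking.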

    When to Prefer Backward Reasoning

    • Goal is clearly defined, but initial facts are unknown or large in number.
    • Efficient when only a few goals need to be tested.
    • Useful in question–answer systems, legal reasoning, diagnosis, and theorem proving.

    Advantages

    • Focused search: only explores rules relevant to the goal.
    • Efficient in large fact spaces, since it avoids irrelevant paths.
    • Provides clear explanation chains ("Why is this goal true?").
    • Naturally aligns with problem-solving and diagnostic reasoning.

    Limitations

    • Not efficient if there are many possible goals (each requires separate reasoning).
    • Requires the goal to be known before reasoning begins.
    • Backtracking can become computationally expensive.
    • Struggles with uncertain or incomplete data unless extended with probabilities/fuzzy logic.

    Typical Applications

    • Medical expert systems (diagnosis).
    • Legal reasoning (proving innocence or guilt).
    • Theorem proving (logic and mathematics).
    • Troubleshooting systems (finding the cause of faults).
    • Knowledge-based tutoring systems.

    Forward vs Backward Reasoning

    | Aspect | Forward Reasoning (Forward Chaining) | Backward Reasoning (Backward Chaining) |
    | --- | --- | --- |
    | Approach | Data-driven (start with facts → derive conclusions). | Goal-driven (start with goal → check supporting facts). |
    | Starting Point | Known facts/data in working memory. | A hypothesis or goal to be tested. |
    | Direction of Reasoning | Facts → apply rules → reach conclusions. | Goal → trace rules backwards → verify facts. |
    | When Useful | All inputs are known but the output is unknown. | Goal/conclusion is known but inputs are uncertain. |
    | Efficiency | Can generate many conclusions, including irrelevant ones. | Focused on proving only the given goal; avoids irrelevant paths. |
    | Example (Traffic) | Fact: "It is raining" → rule: rain → wet road → slow traffic → conclusion: traffic is slow. | Goal: "Traffic is slow" → need to prove road is wet? → need to prove it rained? → if true, confirm goal. |
    | Applications | Simulation, prediction, monitoring, control systems, weather forecasting. | Diagnosis, theorem proving, legal reasoning, troubleshooting. |
    | Computational Behavior | Explores breadth of consequences; may be inefficient with many rules. | Explores depth of a reasoning chain; may backtrack heavily. |
    | Typical Systems | Pattern recognition, forecasting, expert monitoring systems. | Medical expert systems, legal advisors, theorem provers. |
    | Analogy | A detective collecting clues step by step until reaching the truth. | A lawyer starting with a claim and searching for evidence to prove it. |

    Made by an SOU student for SOU students