2 AI for Everyone: An Introductory Overview
Goals
This session aims to introduce AI to a non-specialist audience, ensuring that participants from any background can understand these essential concepts. The focus will be on explaining key terminology and the basic principles of machine learning and deep learning. By the end of this session, participants will have a solid foundational knowledge of key AI concepts, enabling them to better appreciate and engage with more advanced topics in the following sessions.
2.1 The Foundations of AI
Before reading the definition, take a moment: What do you think AI is? How would you define it?
Artificial Intelligence (AI) refers to the development of computer systems capable of performing tasks that typically require cognitive functions associated with human intelligence, such as recognizing patterns, learning from data, and making predictions.
But… there is a minor issue with this definition. What exactly is human intelligence?
Recognizing patterns, learning, and making predictions are all functions of intelligence, but what lies at the core of a “conscious human”? Why is self-awareness important in cognition, and what evolutionary function does subjective, conscious experience serve?
In the philosophy of mind, this phenomenon is referred to as qualia, and there is still no definitive scientific answer to why qualia exist—at least, not yet (see theories of consciousness for more information).
But today, let’s focus on a simpler question. How do humans think?
Historically, many scientists believed humans think through a series of if/else statements (e.g., “If I drink more coffee, I’ll be jittery,” or “If a seagull spots my pizza, it’ll try to snatch a bite”). Geoffrey Hinton, a cognitive psychologist and computer scientist, was one of the advocates of an opposing idea: that humans think more experientially, or probabilistically. For instance, based on today’s cloud cover and similar past experiences, there’s a high probability of rain, so I’ll grab an umbrella. This idea laid the foundation for probabilistic algorithms and, ultimately, the field of Machine Learning.
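To make the contrast concrete, here is a toy Python sketch (not part of the original workshop materials; the remembered days are invented). The first function encodes a fixed if/else rule, while the second part estimates the probability of rain on a cloudy day from past experience.

```python
# Toy illustration: a hard-coded rule vs. a probability estimated from experience.

# Rule-based thinking: a fixed if/else statement.
def rule_based_forecast(sky):
    if sky == "cloudy":
        return "bring umbrella"
    else:
        return "leave umbrella"

# Experiential thinking: estimate P(rain | cloudy) from remembered days.
past_days = [("cloudy", True), ("cloudy", False), ("cloudy", True),
             ("sunny", False), ("cloudy", True), ("sunny", False)]

cloudy_days = [rained for sky, rained in past_days if sky == "cloudy"]
p_rain_given_cloudy = sum(cloudy_days) / len(cloudy_days)

print(rule_based_forecast("cloudy"))                     # bring umbrella
print(f"P(rain | cloudy) = {p_rain_given_cloudy:.2f}")   # 0.75
```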
2.2 Machine Learning
To quickly recap, AI is a broad term encompassing efforts to replicate aspects of human cognition, and Machine Learning is one of its key subsets.
Machine Learning (ML) is a subset of AI that specifically focuses on algorithms that allow computers to learn from data and create probabilistic models.
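As a minimal illustration of this definition, the sketch below fits a small probabilistic model with scikit-learn (assumed to be installed); the weather numbers are made up purely for illustration.

```python
# Minimal sketch of learning a probabilistic model from data with scikit-learn.
from sklearn.linear_model import LogisticRegression

# Each example: [cloud cover in %, humidity in %] -> did it rain? (1 = yes)
X = [[90, 85], [75, 80], [20, 40], [10, 30], [85, 90], [30, 50]]
y = [1, 1, 0, 0, 1, 0]

model = LogisticRegression()
model.fit(X, y)                              # learn from input-output pairs

# Predicted probability of rain for a new day (80% clouds, 70% humidity).
print(model.predict_proba([[80, 70]])[0][1])
```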
Machine Learning includes various types and techniques, but in this workshop we’ll primarily focus on Neural Networks (NNs).
2.3 Neural Networks
NNs are loosely inspired by the structure of the human brain and consist of interconnected nodes, or neurons, that process information. The principle “neurons that fire together, wire together” [1] captures the idea that the strength of their connections, known as weights, adjusts based on experience.
A Neural Network (NN) is a foundational technique within the field of machine learning. NNs are designed to simulate the way the human brain processes information by using a series of connected layers of nodes, or neurons, that transform and interpret input data.
The Perceptron [3], one of the earliest neural network models, was invented in 1957 by psychologist Frank Rosenblatt, who unfortunately did not live long enough to witness the far-reaching impact of his work. Rosenblatt’s Perceptron was a physical machine with retina-like sensors as inputs, wires acting as the hidden layers, and a binary output system. This invention marked the early stages of artificial intelligence, laying the groundwork for the powerful neural networks we use today.
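For the curious, here is a minimal sketch of the perceptron learning rule in NumPy. It illustrates the idea rather than modeling Rosenblatt's physical machine, and the task (learning the logical AND function) is chosen only because it is tiny and linearly separable.

```python
# A minimal perceptron learning rule in NumPy, trained on logical AND.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

weights = np.zeros(2)
bias = 0.0
learning_rate = 0.1

for _ in range(10):                        # several passes over the data
    for inputs, target in zip(X, y):
        prediction = 1 if np.dot(weights, inputs) + bias > 0 else 0
        error = target - prediction        # 0 if correct, +1 or -1 if wrong
        weights += learning_rate * error * inputs
        bias += learning_rate * error

print(weights, bias)                       # learned connection strengths
print([1 if np.dot(weights, x) + bias > 0 else 0 for x in X])  # [0, 0, 0, 1]
```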
2.4 Exercise: NN Playground
Level 1: Browse around
- Orange indicates negative values, while blue represents positive values.
- An 80/20 split between training and testing data is typical. Smaller datasets may need a 90% training portion to provide more examples, while larger datasets can shift more data into the test set.
- Background colors illustrate the network’s predictions, with more intense colors representing higher confidence.
- Adding noise during training helps the model generalize by forcing it to find true patterns, improving robustness and stability on real-world noisy data.
Level 2: Things to try
Hidden Layers are the layers that are neither input nor output. You can think of the values computed at each layer of the network as a different representation for the input X. Each layer transforms the representation produced by the preceding layer to produce a new representation.
Start with one hidden layer and one or two neurons, observing predictions (orange vs. blue background) against actual data points (orange vs. blue dots). With minimal layers and neurons, predictions are often inaccurate. Increasing hidden layers and neurons improves alignment with the actual data, illustrating how added complexity helps the model learn and approximate complex patterns more accurately.
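If you want to see the same idea outside the Playground, the short NumPy sketch below (with arbitrary, illustrative weight values) passes one input through a hidden layer: the layer multiplies the input by its weights, applies a non-linearity, and produces a new representation that the output layer then uses.

```python
# Forward pass through one hidden layer: each layer re-represents the input.
# The weight values here are arbitrary, chosen only for illustration.
import numpy as np

x = np.array([1.0, -2.0])                 # the input X (2 features)

W1 = np.array([[0.5, -0.3],               # weights from input to 3 hidden neurons
               [0.8,  0.1],
               [-0.2, 0.7]])
b1 = np.array([0.1, 0.0, -0.1])

W2 = np.array([[1.2, -0.7, 0.4]])         # weights from hidden layer to 1 output
b2 = np.array([0.05])

hidden = np.tanh(W1 @ x + b1)             # new representation of x (3 numbers)
output = W2 @ hidden + b2                 # prediction built from that representation

print(hidden)   # the hidden layer's representation of the input
print(output)   # the network's raw output
```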
Level 3: More things to try!
Weights are parameters within the neural network that transform input data as it passes through layers. They determine the strength of connections between neurons, with each weight adjusting how much influence one neuron has on another. During training, the network adjusts these weights to reduce errors in predictions.
As you press the play button, you can see the number of epochs increase. In an Artificial Neural Network, an epoch represents one complete pass through the training dataset.
The learning rate is a key setting or hyperparameter that controls how much a model adjusts its weights during training. A higher rate speeds up learning but risks overshooting the optimal solution, while a lower rate makes learning more precise but slower. It’s one of the most crucial settings when building a neural network.
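The toy sketch below ties these three ideas together in plain Python; it is not the Playground's actual code, and the data values are invented. A single weight is adjusted over several epochs, and the learning rate scales the size of every adjustment.

```python
# Toy gradient descent: one weight, several epochs, one learning rate.
# The data follow y = 3x; values are invented for illustration.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

weight = 0.0            # the parameter to be learned
learning_rate = 0.05    # how big each adjustment is

for epoch in range(20):                     # one epoch = one pass over the data
    for x, target in data:
        prediction = weight * x
        error = prediction - target
        gradient = 2 * error * x            # gradient of the squared error
        weight -= learning_rate * gradient  # step size set by the learning rate
    if (epoch + 1) % 5 == 0:
        print(f"epoch {epoch + 1}: weight = {weight:.3f}")

# The weight approaches 3.0; a larger learning rate would get there faster
# but could overshoot and oscillate, a smaller one would get there more slowly.
```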
2.5 Backpropagation
Initially, neural networks were quite shallow feed-forward networks. Adding more hidden layers made training them difficult. However, in the 1980s—often referred to as the rebirth of AI—the invention of the backpropagation algorithm revolutionized the field. It allowed for efficient error correction and gradient calculation across layers, making it possible to train much deeper networks than before.
Backpropagation is an algorithm that calculates the error at the output layer of a neural network and then “back propagates” this error through the network, layer by layer. It updates the connections (weights) between neurons to reduce the error, allowing the model to improve its accuracy during training.
Thus, the backpropagation algorithm enabled the training of neural networks with multiple layers, laying the foundation for the field of deep learning.
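For readers who like to see the mechanics, here is a minimal NumPy sketch of backpropagation for a tiny network with one hidden layer; the numbers are invented, and real frameworks automate exactly these steps.

```python
# Minimal backpropagation sketch: one hidden layer of 3 tanh neurons,
# one linear output, squared-error loss, trained on a single made-up example.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)   # input -> hidden
W2, b2 = rng.normal(size=3), 0.0                # hidden -> output
lr = 0.1

x, y = np.array([0.5, -1.0]), 2.0               # one training example

for step in range(200):
    # Forward pass: compute the prediction and the error.
    h = np.tanh(W1 @ x + b1)
    y_hat = W2 @ h + b2
    loss = (y_hat - y) ** 2

    # Backward pass: propagate the error from the output back to each weight.
    d_yhat = 2 * (y_hat - y)          # error signal at the output
    dW2, db2 = d_yhat * h, d_yhat     # gradients for the output weights
    d_h = d_yhat * W2                 # error sent back to the hidden layer
    d_z1 = d_h * (1 - h ** 2)         # through the tanh non-linearity
    dW1, db1 = np.outer(d_z1, x), d_z1

    # Update every weight a little in the direction that reduces the error.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(loss)   # close to zero after training
```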
2.6 Deep Learning
Deep Learning (DL) is a subset of ML that uses multilayered neural networks, called deep neural networks.
Deep learning (DL) techniques are typically classified into three categories: supervised, semi-supervised, and unsupervised. Reinforcement learning (RL) is often treated as an additional, partially supervised category that sometimes overlaps with unsupervised methods, since models learn from reward signals rather than explicit labels.
Supervised Learning involves learning from labeled data, where models directly learn from input-output pairs. Common examples include Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and Transformers. These models are generally simpler in terms of training and achieve high performance.
Semi-Supervised Learning combines a small amount of labeled data with a large amount of unlabeled data, often using auto-labeling techniques. Examples include Self-training models, where a model iteratively labels data to improve, and Graph Neural Networks (GNNs), which are useful for understanding relationships between data points.
Unsupervised Learning relies on unlabeled data, focusing on identifying patterns or structures. Popular models include Autoencoders, Generative Adversarial Networks (GANs), and Restricted Boltzmann Machines (RBMs).
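The models named above are deep neural networks, but the labeled-versus-unlabeled distinction itself can be shown with much simpler scikit-learn models (assumed installed), as in the sketch below, which uses the library's built-in iris dataset for illustration.

```python
# Sketch of the supervised / unsupervised distinction with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)        # X: measurements, y: species labels

# Supervised: learn from input-output pairs (the labels y are given).
classifier = LogisticRegression(max_iter=1000)
classifier.fit(X, y)
print(classifier.predict(X[:3]))         # predicted species for 3 flowers

# Unsupervised: no labels -- look for structure in X alone.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0)
print(clusters.fit_predict(X)[:3])       # cluster assignments for the same flowers
```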
Despite advances in backpropagation, deep learning, computing power, and optimization, neural networks still face the problem known as catastrophic forgetting: losing old knowledge when trained on new tasks. Current AI models are often “frozen” and specialized, needing complete retraining for updates, unlike even simple animals, which can continuously learn without forgetting [4]. This limitation is one of the reasons that led to the development of specialized deep learning models, each with a unique architecture tailored to specific tasks. Let’s explore how each of these models can be applied in scientific research!
2.7 AI Beyond Machine Learning
Within the field of AI, there are many techniques that don’t rely on ML principles.
| Technique | Description |
|---|---|
| If/Else or Rule-Based Systems | Collections of predefined rules or conditions (if statements) to make decisions. |
| Symbolic AI (Logic-Based AI) | Logical rules and symbols to represent knowledge, focusing on reasoning through deductive logic. |
| Genetic Algorithms (Evolutionary Algorithms) | Optimization algorithms inspired by natural selection. |
| Fuzzy Logic | A form of logic that works with “degrees of truth”, making it useful for uncertain or ambiguous scenarios. |
| Knowledge Representation and Reasoning (KR&R) | Techniques for structuring and processing information, often using ontologies and semantic networks. |
| Bayesian Networks | Probabilistic graphical models that represent relationships between variables. |
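As a small, invented illustration of two rows of this table, the snippet below contrasts a crisp rule-based decision with a fuzzy “degree of truth” for the statement “the room is warm”.

```python
# Toy contrast: a crisp rule-based decision vs. a fuzzy membership value.

def rule_based(temp_c):
    # Rule-based system: a hard threshold, fully true or fully false.
    return "warm" if temp_c >= 25 else "not warm"

def fuzzy_warm(temp_c):
    # Fuzzy logic: membership rises gradually from 0 (at 18 C) to 1 (at 30 C).
    return min(1.0, max(0.0, (temp_c - 18) / 12))

for t in (20, 24, 26, 32):
    print(t, rule_based(t), round(fuzzy_warm(t), 2))
```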
Recent research increasingly combines these AI paradigms, such as symbolic AI and Knowledge Representation and Reasoning (KR&R), with Machine Learning (ML) to build systems that are more effective for specific tasks.
2.8 The Future of AI in Science
AI is transforming the scientific method by supporting each step of scientific discovery. Let’s consider how various AI techniques can be applied at each stage of the scientific process:
- Observation: Using computer vision for data collection.
- Hypothesis: Clustering data with unsupervised learning.
- Experiment: Simulating environments through reinforcement learning.
- Data Analysis: Simplifying data with PCA and classifying insights using neural networks or SVMs (see the sketch after this list).
- Conclusion: Combining LLMs with KR&R to generate complex findings and insights.
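As an illustrative sketch of the Data Analysis step, the code below chains PCA with an SVM using scikit-learn (assumed installed); the built-in iris dataset stands in for real experimental measurements.

```python
# Sketch of the "Data Analysis" step: reduce dimensions with PCA, then
# classify with an SVM.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)      # the usual 80/20 split

model = make_pipeline(PCA(n_components=2),    # simplify: 4 features -> 2
                      SVC())                  # classify in the reduced space
model.fit(X_train, y_train)

print(model.score(X_test, y_test))            # accuracy on unseen data
```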
In conclusion, regardless of model type, high-quality data is essential for accurate AI predictions and insights. In the next sessions, we’ll explore practical tips for working with well-prepared, high-quality data.
Do You Have Any Questions?
Feel free to reach out!
Email: alyonak@nceas.ucsb.edu
Website: alonakosobokova.com
YouTube: Dork Matter Girl