How to Fast-track Your Learning with AI

“Any sufficiently advanced technology is indistinguishable from magic” — Arthur C. Clarke

I’ve always been obsessed with leveraging as much of the technology society has to offer, and I’d often fear missing out on opportunities that I knew would give me an edge.

On November 30th, 2022, the world changed forever. Artificial intelligence became accessible to the public, for free. From that point forward I’ve been able to use it to speed up my learning by a large margin. But getting there took a lot of researching, experimenting, and feeling frustrated before the frameworks and processes took shape. I want to help you save that time and effort, so I’ve distilled my findings into one article. After reading it, you’ll learn faster and amaze others with your speed of learning.

How well we learn depends on how and when we process and manipulate information. We can improve how well we do this by using ChatGPT.

Here’s a glimpse of the framework we will be going over.

The AI Enhanced Learning Process

Each step in our diagram is like adding a turbocharger to a car: the mechanics stay the same, but the turbocharger amplifies the engine.

The goal of this system is to optimize the following:


Learning Rate = Enlightenment/Unit Time


We often trick ourselves into believing that studying and learning are the same. But they aren't. Studying is the action, and learning is the desired outcome. If our efforts don't result in the desired outcome, then we are wasting time. We shouldn't measure our learning by random things like the number of pages we read or how long we studied. Instead, we want the number of lightbulb/'aha' moments to guide us.

Mindset Shift: If you want to do this correctly, you need to transition from passive spectator to active agent throughout the learning process. This means that you'll need to stop doing the following activities.

— Memorizing without intuition
— Getting lost in the details
— Constant repetition in practice (doing the same problems as opposed to switching it up)
— Re-reading
— Passively going through the lecture instead of actively looking for information

This kind of behavior frequently leads to the illusion of learning, so be sure to watch out for it.

Before we start learning, we need to understand the different types of knowledge that exist (this section) and how to acquire them as quickly as possible (the next few sections), all while using AI to enhance the process.

There are two main types of knowledge I’m going to speak of (there are more nuanced kinds, but we’ll stick to these for the purposes of this article):

Procedural and Declarative.

Procedural: This type of knowledge is often hard to articulate. It’s largely thought of as knowing ‘how’ as opposed to knowing ‘what’.

For instance, you can know how to ride a bike without really being able to explain the nuances of how you do it.

Declarative: This type of knowledge is easier to articulate. It has more to do with understanding and remembering concepts and the connections between them.

We can define these knowledge types more succinctly as follows:

Quality of Declarative Knowledge = Retention x Fluidity of Understanding

Quality of Procedural Knowledge = Accuracy x Speed of Execution

Note: What’s interesting is that these two knowledge types are interdependent, meaning that strengthening one has a side effect on the other. But their relationship does not stand on equal footing: the influence from declarative to procedural is much stronger than from procedural to declarative. In other words, translating declarative knowledge into procedural knowledge is easier than the reverse.

This is part of ACT-R theory, and we will use it in the section on acquiring procedural knowledge. More on that later.

Here’s what mastery looks like for both knowledge types.

A high level of procedural knowledge looks like automatic execution with minimal errors.

A high level of declarative knowledge looks like being able to navigate concepts in the higher orders of learning with ease and high retention.

This is the ideal we are striving for.

The first step toward achieving this ideal lies in answering the following question.

How do lightbulb moments happen?


Answer:

The principle of relevance/connections.

Humans develop insight by connecting things to other things:

This might seem simple, but it’s actually quite nuanced and can come in many forms (as we will soon see).

Consider the following story.


Imagine a group of early humans who discovered fire. At first, they only understood that fire provided warmth and a way to cook food. This was their initial framework: fire leads to warmth and cooked food.

Later, these early humans faced a new problem: predators. One night, a large predator approached the camp. Feeling scared, someone grabbed a burning stick and waved it, making the animal run away. Now, the concept of fire expanded in their minds. It could be used not only for warmth and cooking but also for protection.

Over time, they discovered that fire could improve their wooden spears for hunting: fire hardened the tips. Again, the framework expanded; fire could also be a tool for crafting better weapons.

Researchers know this effect as the ‘spreading activation’ effect. But I like to think about it as the ‘principle of connections’ in disguise.

Here’s how it works.

When a cue linked to a certain concept pops up, the neural pathways connecting them light up. Then other ideas related to that concept light up as well.

In doing this, we get a chain of effects. (This is the crucial bit)

In the previous example, there were three cues that activated the idea of fire.

. Food
. Predators
. Hunting



These cues then triggered not just the concept of ‘fire’ but also, as a consequence, the nearby pathways linked to it:



. Warmth
. Protection
. Crafting better weapons


Note: The spreading activation effect will be one of the most important drivers behind how we process information. I’m going to show you a special way of harnessing it, and then I’ll show you how you can speed it up with AI.
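To make this mechanism a bit more concrete, here’s a toy sketch in Python of how activation might spread from a cue through a small concept graph. The graph, edge weights, decay, and threshold are made-up illustrations, not a cognitive model.

```python
# Toy sketch of spreading activation over a small concept graph.
# The graph, edge weights, decay, and threshold are made up for illustration.

concept_graph = {
    "food":      {"fire": 0.9},
    "predators": {"fire": 0.8},
    "hunting":   {"fire": 0.7},
    "fire":      {"warmth": 0.9, "protection": 0.8, "crafting weapons": 0.6},
}

def spread_activation(graph, cue, decay=0.5, threshold=0.1):
    """Propagate activation from a cue to linked concepts, decaying at each hop."""
    activation = {cue: 1.0}
    frontier = [cue]
    while frontier:
        node = frontier.pop()
        for neighbor, weight in graph.get(node, {}).items():
            new_activation = activation[node] * weight * decay
            if new_activation > activation.get(neighbor, 0.0) and new_activation > threshold:
                activation[neighbor] = new_activation
                frontier.append(neighbor)
    return activation

# The 'predators' cue lights up 'fire', which in turn lights up warmth,
# protection, and crafting weapons, just like in the story above.
print(spread_activation(concept_graph, "predators"))
```

The chain of effects falls out naturally: activating one node hands a smaller amount of activation to everything connected to it.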

How effective AI will be in helping us achieve this effect will be limited by our ability to ask the right questions and guide it in the right direction.

There are two types of prompts that are going to help us with this.

. Questions
. Commands

A clever combination of each will give us the effect we want.

The process we are going to be speaking of shortly revolves around actively finding the right information at the right time; this concept is known as ‘order control’ in the research on learning. ChatGPT gives us the ability to control what information we consume at what stage in the learning process, which makes a huge difference. (We will also see how this relates to the spreading activation effect.)

But before we go further, we should learn to create good prompts that get better ChatGPT responses. So let’s do that.

Prompt Engineering Principles

Below are a few tips that will help you get improved outputs.

1) Be specific with your prompts. Here are a few ways you can do that.

— Be specific about the subject you are asking about
— Be specific about the length of the response
— Be specific about the role that you give it
— Use descriptors like exhaustive, simple, short, etc.
— Provide examples of what you have in mind so that the model can give you a similar response.

2) Don’t overwhelm it with too much information in one go or else it will get ‘confused’. Split it up. There are two ways you can do this.

— Split up your prompts and prompt the bot several times


or


— Give the entire prompt in one go, and then prompt it something like: “You don’t have to give me all the information I’m asking of you at once, take your time and provide me answers for each keyword one at a time”. After you’ve been provided with each answer, use the follow-up prompt: “Keep going”.

If you still aren’t getting the output you want, re-prompt the model and explain what it got wrong and how to change it. Remember, it’s kind of like working with someone else: it can’t guess what’s in your head, which means you’ll have to go back and forth until you’re satisfied.
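If you interact with the model through the API rather than the chat interface, the same principles apply. Here’s a minimal sketch assuming the openai Python package (v1+); the model name is a placeholder, and the role, length, and format constraints are just examples of the specificity described above.

```python
# Minimal sketch of a specific, role-based prompt sent through the API.
# Assumes the `openai` package (v1+) and an OPENAI_API_KEY in the environment;
# the model name below is a placeholder, not a recommendation.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        # A specific role for the assistant...
        {"role": "system", "content": "You are a patient physics tutor."},
        # ...and a request that pins down subject, length, and format.
        {
            "role": "user",
            "content": (
                "Give me a short (under 150 words), simple explanation of "
                "Newton's second law, with one concrete numeric example."
            ),
        },
    ],
)

print(response.choices[0].message.content)
```

Splitting a big request into several smaller messages in the same conversation is the programmatic version of the ‘split it up’ tip above.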

Finding Resources

The first step to learning anything involves finding the right resources. After all, you can’t learn if you don’t have the information to begin with.

But we have to be on the same page. What is a resource anyway?

Resource: Information about a subject presented through some medium- such as audio, video, text, etc…

Prompts for collecting resources

— Give me an explicit list of resources for learning [subject].

— Go through this paper/textbook/etc. and extract the table of contents (using the advanced data analysis feature, which lets you upload files).

or

— You could also ask it to provide you with a list of the most important keywords for learning [ ].

or

— You could gather keywords from various resources yourself and then paste them into the AI (the best alternative, in my opinion).

Note: The purpose of this stage is to set things up for the true learning phase- nothing super special here.

Note: With this approach, having our content in text format is optimal. However, if the resources are audio or video, we have two options when it comes to using this strategy.

You can use an AI like Whisper to transcribe it into text

or

Skip through sections in the video or audio and try to get the information you need by spotting keywords that you can then paste into the AI.
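For the first option, here’s a minimal sketch using the open-source openai-whisper package; the model size and file names are placeholders, and any transcription tool works just as well.

```python
# Minimal sketch: transcribe an audio lecture to text with the open-source
# openai-whisper package (pip install openai-whisper). The file names and the
# model size are placeholders.
import whisper

model = whisper.load_model("base")        # smaller models are faster, less accurate
result = model.transcribe("lecture.mp3")  # returns a dict with the full text and segments

with open("lecture.txt", "w", encoding="utf-8") as f:
    f.write(result["text"])
```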

Quick Note: If you decide to have the AI extract the keywords from the text (instead of doing it manually), you’ll realize that it will sometimes have difficulty reading through PDFs. To get around this you can search online for a PDF-to-text converter (the AI is better at reading plain text files).

Here is an example of a prompt I used to extract the table of contents from a simple grade 1 math workbook.

Example:

Me: Can you go in-depth into the various sections that it has in its table of contents

1. Extend a Pattern (Page 1)

2. Create a Pattern (Page 3)

3. Make a Pattern (Page 5)

4. Numbers to 10 (Page 7)

5. Numbers 11 to 15 (Page 9)

6. Numbers 16 to 20 (Page 11)

7. Choose a Strategy (Page 13)

8. Compare Numbers (Page 15)

9. Order Numbers (Page 17)

10. Addition Stories (Page 19)

11. Modeling Addition (Page 21)

12. Addition Sentences (Page 23)

13. Adding Zero (Page 25)

14. Write a Number Sentence (Page 27)

15. Ways to Make 4, 5, and 6 (Page 29)

16. Ways to Make 7, 8, and 9 (Page 31)

17. Ways to Make 10, 11, and 12 (Page 33)

18. Choose a Strategy (Page 35)

19. Vertical Addition (Page 37)

20. Subtraction Stories (Page 39)

Manually pasting the keywords



On the other hand, if I were using the ‘better’ approach (better in most cases), I would look through the list of keywords from the various texts that I find or that I ask the AI to give me.

[ ]


I’d then copy and paste them into the AI and tell the bot that I’m going to be asking questions about them in the upcoming prompts.



Protip: After you have the resources it gives you, store them in your notes, a Word document, a LaTeX document, or wherever else, so that you can easily refer to them when necessary.



Protip: Have a high tolerance for how the information is given. If you can’t find good keywords, go to another resource it gave you, or re-prompt it with an adjusted prompt to get new resources.

Processing Information

Now we can move on to the interesting stuff. How do we encode information in a way that optimizes this ‘spreading activation’ effect?

Here are three ways we can do that (that I’ve found).

1) Grouping Concepts


2) Finding relationships between concepts


3) Establishing the right cues




Here’s why they work:




When we group concepts together, we can represent them as one item called a chunk. This helps us remember more information at once, based on how it affects our working memory. It also helps us organize information, making it easier to retrieve when we need it. (Chunks are also the basic units of declarative knowledge.)
 
Finding connections between concepts creates more pathways, which makes it easier to navigate between them.

Establishing the right cues provides signals for when one concept should be used over another.

Note: Remember, even though these all have the common factor of bringing ideas together in one way or another, they have different effects, which is why we are distinguishing them.

The magic happens when you combine all three (which is what we will be going over in a later section).


But first, let’s dive into each one of them more deeply.

Cues

The main idea I want you to take away about establishing cues is the following:

‘Concepts can be thought of as solutions to problems’.



Here are a few examples:



— Understanding plants solves the problem of classifying nature
— Metal solves the problem of building certain kinds of infrastructure
— Newton’s laws solve the problem of predicting the motion of objects
— Neural networks solve the problem of learning in machine learning
— Cats solve the problem of having unwanted mice around: they eat them

Why is this important?



We want to generate as much spreading activation as we can. But focusing only on the information itself, and not on the cues, holds us back from activating the brain’s pathways to it. If we don’t also encode the cues, then when a problem comes up, we won’t remember the ideas that solve it.

New Mindset

Instead of asking yourself about the concepts, ask yourself about the cues.

What are the concepts? ——> What are the cues?




Note:

It’s interesting that by knowing enough about the cues, you can figure out the concept itself. For instance, if I only told you the circumstances associated with a certain animal:

— Licking itself (licking as a cue)

— Chasing mice (mice as a cue)

— Eating fish (fish as a cue)

you could probably figure out that the animal is a cat. This detailed cue information also helps us understand the concepts better, which is a nice side bonus.

Prompt for Cues:

“Given the following list of concepts, please identify the specific cues for when each should be employed.”

Grouping

We talked about grouping but it’s important that we be clear on the kind of grouping that we want.

Meaningful vs Non-Meaningful grouping

To group concepts meaningfully, we use shared cues instead of random features.

Bad example:

Group concepts whose names start with the letter A. — No bueno.

Good Example:

Group concepts that help predict the motion of an object. — Much better.

Grouping information based on cues helps make it more meaningful and memorable.

Prompt:

“Group the following concepts based on similarities in their cues or criteria for use.”

Note: Sometimes the AI will ignore the cues and start grouping things however it wants. When faced with these situations, take the initiative to group things together yourself. This will also help you think more deeply about the concepts.

We should also know about the idea of subchunking. It helps us refine how we combine things during the encoding process. To better understand it, consider the following little experiment.

Experiment:
First Question: Think of a fruit.

Second Question: Think of a fruit that is yellow and often found in tropical locations.


In the first question, only one cue is provided: “fruit.” This could lead to a wide range of possible target memories. You might think of apples, oranges, grapes, cherries, and so on. The single cue activates many neural pathways linked to various types of fruits.

In the second question, two cues are given: “yellow” and “found in tropical locations.” These cues, when combined, generate a more specific and stronger activation in your memory network. The cues work together to limit the number of possible fruits you might think of, making it more likely that you’ll think of a “banana,” for example.

The two cues in the second question jointly produce enough activation of the word “banana” that the activation likely surpasses that of other fruits in your memory. Therefore, the word “banana” is more easily and quickly retrieved from your memory.

By using this idea of ‘layered cues’ we can start to create not only chunks but also subchunks. This is going to be super important.

Note: Remember, our subchunks should be specific and meaningful, not irrelevant as mentioned earlier.

Here’s a useful tip: after putting things into groups, you can rename each group to something you already know. Here are some examples to make more sense of this.

Examples of reframing

Let’s say we are studying classical mechanics and decide to reframe things as follows.





“Newton’s Laws” ——> Understanding motion and interaction
“Force” ——> Push or pull
“Moment of Inertia” ——> Rotational sluggishness



Notice that when you compare them side by side, the latter is far more intuitive. The reason is that it reframes things in terms of prior knowledge.



Here’s another example:

Take a brief moment and go through the following.


Internal and External


Causes and Effects


Problems and Solutions


Beginning, Middle, End


Input vs Output


Evolution of Models

If I told you that the new groupings you’re learning map onto these familiar frameworks, you would start to have a feel for what’s going on without knowing the intricate details. You can make your cues/labels more familiar by using what you already know, which leads to much quicker learning. This is why analogies and metaphors can be so powerful.



Another Protip: To gain different viewpoints, ask the AI to group your previous cues in different ways. We will see as we go on that this idea of ‘variability’ is important.

I’ve made a list of the cues you encounter when studying self-driving cars to show how different variations of chunks could occur.

1st Grouping:

Always Active Components
Sensors and Data Gathering: Continuously collects raw data from the environment.
Data Preprocessing: Cleans and fuses the raw data for downstream components.
Perception: Always operational to make sense of the vehicle’s surroundings.
Localization: Keeps track of the vehicle’s position within a known map.
Control: Executes the planned actions, sending real-time commands to the vehicle’s hardware.
Human-Machine Interface: Active for user interactions and emergency controls.



Conditional or Optional Components
Prediction: Operates when other agents like cars or pedestrians are detected, to anticipate their actions.
Planning: Activated during driving to determine the optimal path and actions.
Communication (V2X): Can be active at all times but especially beneficial in complex traffic scenarios.


User-Interactive Components
Human-Machine Interface: Provides interfaces like voice commands, touch screens, and emergency buttons for human interaction.


2nd Grouping:



Data Collection and Preprocessing
Sensors and Data Gathering: Collects initial raw data.
Data Preprocessing: Makes the raw data usable.


Understanding the Environment
Perception: Recognizes and understands objects and road conditions.
Localization: Determines the vehicle’s position relative to the world.
Prediction: Anticipates the actions of other agents.


Decision Making
Planning: Chooses what actions to take based on perceived data and predictions.

Execution and Control
Control: Translates planned actions into physical movements.


User and Vehicle Interaction
Human-Machine Interface: Manages user inputs and displays.
Communication (V2X): Handles external communications, like vehicle-to-vehicle or vehicle-to-infrastructure.

Safety and Overrides
Emergency Systems: Engages in critical situations to ensure safety.

3rd Grouping:



Hardware-Centric
Sensors and Data Gathering: Direct interaction with physical hardware.
Control: Executes physical movements, and interacts with the vehicle’s mechanical systems.
Emergency Systems: Directly tied to hardware for immediate response.



Software-Centric
Data Preprocessing: Purely a software task to clean and structure data.
Perception: Software algorithms for object recognition and understanding.
Localization: Software-based mapping and positioning.
Prediction: Algorithmic models for forecasting actions of other agents.
Planning: Decision-making algorithms.
Human-Machine Interface: Software interface for user interaction.




Communication-Centric
Communication (V2X): Manages all external communications, both hardware and software-based.

Connections

Connections are like a network of roads. A single road can get you from one place to another, but the network allows you to reach a wide range of destinations.

There are two different types of connections I’d like to speak of.


1) Connections between previous concepts that you’re familiar with.


2) Connections between new concepts that you’re learning.

There are two important pieces to getting this right.


Explore and Exploit

By ‘explore’ I’m referring to finding as many connections between ideas as possible. By ‘exploit’ I mean prioritizing those connections and getting a deeper sense of their nature.

Let’s dive into both of them.

Explore

Here’s an example of finding connections for the concept of an Encoder in the field of NLP.


(Connecting new concepts together)


Prompt: “Can you map out how the concept of encoders in NLP is related to other concepts within the field? I’m looking for as many connections as possible.”




(Connecting it to familiar concepts)


Prompt: “Can you provide analogies or simpler explanations to help me understand how encoders in NLP connect to more basic or intuitive concepts?- Give me as many as possible”

Comparisons often lead to connections. Here are four approaches I’ve found to be effective when it comes to comparing ideas. (Not exhaustive of course)

Comparing them based on:

1) Subcomponents

2) Cues for when to use the concepts

3) Connections to other concepts (Yes you can compare not just concepts but relationships between concepts)

4) Compare and contrast them by looking at examples

Prompt: “Can you analyze the similarities and differences between [Concept A] and [Concept B], based on their subcomponents/ appropriate use-cases/connections to other concepts/ concrete examples?”

Note: You should probably split the prompt above into separate prompts, one per comparison criterion, in order to get a more accurate response, rather than sending the entire thing at once.

Exploit

Now that we have a decent number of connections, let’s explore the nature of these connections.


Prompt: “How are A and B connected?”



From experience, I’ve found the following relationship types useful to keep in mind when trying to better understand the nature of a connection.


1) Sequential — like an evolution
2) Causal
3) Hierarchical

Protip: When establishing connections it’s often useful to have it all in a table.

Learning in Layers

Consider both of these pictures side by side:

Notice that on the left-hand side, we have more connections than on the right.

This is because we are using a layer/breadth-first approach as opposed to a depth-first approach.

When we learn in layers, the number of connections between ideas increases exponentially and we get a big-picture overview to start. On the other hand, when we go depth-first, the number of connections increases only linearly and we run the risk of getting lost in the details.

Exponential >>> Linear

So, as we are learning, we are going to want to learn things at a high level, connect ideas, and then dive into the details bit by bit.
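The breadth-first vs depth-first contrast is literal in graph terms. Here’s a toy sketch, with a made-up concept tree, showing the order in which each approach visits topics; learning in layers corresponds to the breadth-first ordering.

```python
# Toy sketch: the order in which topics get visited breadth-first (in layers)
# versus depth-first. The concept tree is made up for illustration.
from collections import deque

topics = {
    "Machine Learning": ["Supervised", "Unsupervised"],
    "Supervised": ["Regression", "Classification"],
    "Unsupervised": ["Clustering"],
    "Regression": [], "Classification": [], "Clustering": [],
}

def breadth_first(tree, root):
    order, frontier = [], deque([root])
    while frontier:
        node = frontier.popleft()
        order.append(node)
        frontier.extend(tree[node])
    return order

def depth_first(tree, root):
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(tree[node]))
    return order

# Layered: the whole big picture first, then the details.
print(breadth_first(topics, "Machine Learning"))
# Depth-first: straight down one branch before touching the others.
print(depth_first(topics, "Machine Learning"))
```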

The Encoding Cycle

Having the previous effects at our disposal will give us a huge increase in how quickly we learn. But we can do better. Consider this:

If we combine the previous techniques, we end up with a self-reinforcing loop.

. Chunks help us make more connections

. Making connections helps us make more chunks

. Learning in layers helps us make more connections

. Learning through cues helps make more connections

. Making more connections can lead to more cues

These interdependencies make the system as a whole far more powerful than its individual parts, because each technique reinforces the others.

This is known as a lollapalooza effect, a term coined by the famous investor Charlie Munger. Lollapalooza effects arise when an outcome is much bigger than the sum of its parts: one positive effect enhances the power of the next, and so forth.

Note: You could try a variation of this system and tailor it to your needs; this is just what I’ve found to be the most effective.

1. Collect cues for each of the concepts you are trying to learn

2. Group concepts together based on similar cues

3. Connect groups of information together by studying how they are related

4. Repeat steps 2 and 3, but now use more specific subcues. For instance, you might break down the chunk ‘techniques for boiling tea’ into techniques for boiling white tea and techniques for boiling black tea (refer to the previous section on subchunks).

5. Keep doing this until you feel like you can no longer keep subchunking. At this point, we should be left with a layer that only has individual concepts to deal with. Once we reach this layer, we do things a little differently: prompt the AI to give us a list of the subcomponents of each of the given concepts, then repeat the cycle from step 1 all over again. Keep going until you reach a level of nuanced detail where doing this is no longer feasible.

If you simplify the process, all we are doing is grouping by cues and finding connections. At its simplest level, that’s the system.

Procedural Knowledge

After experimenting with and researching this topic, I found it more effective to treat procedural knowledge differently from declarative knowledge.

As it turns out, people don’t develop general-purpose problem-solving skills. Instead, they store common solution patterns, obtained from experience, that they can then apply when they encounter a problem to solve.

This is why working through examples is so powerful, and why working through examples first is a better alternative to unnecessarily banging your head against the wall on a problem for which you don’t yet have any patterns to apply.

When you work through an example and then start to do a problem, there are two possible ways in which you can respond.

1) If the example matches the problem, you apply what you learned.

2) If not, you analogically extend the pattern learned from the example.

Then, with sufficient practice and error correction, general rules develop and the example is no longer needed.

This is how knowledge transitions from a declarative form (the encoding of examples) to a procedural form (production rules).

Note: This is the ACT-R theory (Adaptive Control of Thought — Rational); we saw a hint of it in the earlier sections.

There is an interesting analogy between ACT-R theory and how neural networks learn. Think about it:

Neural Networks learn by:

(1) Gathering an example data-set

(2) Training it on these examples by comparing its predictions with the true answers

(3) Testing it on new problems to see how well it does

So the trick here is to gather a set of examples covering the same concepts, learn abstract problem-solving patterns from them, and then test those patterns on new problems.
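To make the analogy concrete, here’s a minimal scikit-learn sketch of that gather/train/test loop on toy synthetic data; it’s purely illustrative, and the point is the three-step structure, not the model.

```python
# Minimal sketch of the gather -> train -> test loop from the analogy,
# using scikit-learn on toy synthetic data (illustrative only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# (1) Gather an example data set.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# (2) Train on the examples, comparing predictions against the true answers.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# (3) Test on problems the model hasn't seen to check that the patterns generalize.
print("accuracy on unseen examples:", model.score(X_test, y_test))
```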

Here’s the process.

(i) Gather a set of high-quality examples.

(ii) Work through them, zoom out, and extract common solution patterns.

(iii) Implement patterns on other problems.

Mindset shift: Your goal with examples is to store the solution process, not the specific surface features. Don’t solve problems for the sake of solving problems; solve them for the sake of extending your ‘pattern bank’.

When you are solving a problem, here’s what you are doing:

You have an initial state and a final state, and your goal is to transform the initial state into the final state. It seems simple, but that’s it.

In between are IF-THEN substeps that allow you to reach your goal. In ACT-R theory these are called productions. Each production rule represents a fundamental step or sub-step that, when its conditions are met, triggers a specific action that moves the process forward to the next step.
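Here’s a toy sketch of what production rules can look like in code: each rule is an IF condition paired with a THEN action, and rules fire until the goal state is reached. The example problem and the rules are made up for illustration.

```python
# Toy sketch of ACT-R-style productions: IF a condition matches the current
# state, THEN an action fires and moves us one step closer to the goal.
# The problem (isolating x in 2x + 3 = 11) and the rules are made up.

state = {"lhs_constant": 3, "coefficient": 2, "rhs": 11, "solved": False}

def remove_constant(s):
    # IF a constant is still added on the left, THEN subtract it from both sides.
    if s["lhs_constant"] != 0:
        s["rhs"] -= s["lhs_constant"]
        s["lhs_constant"] = 0
        return True
    return False

def divide_by_coefficient(s):
    # IF x still has a coefficient, THEN divide both sides by it.
    if s["lhs_constant"] == 0 and s["coefficient"] != 1:
        s["rhs"] /= s["coefficient"]
        s["coefficient"] = 1
        s["solved"] = True
        return True
    return False

productions = [remove_constant, divide_by_coefficient]

while not state["solved"]:
    # One production fires per cycle; if none match, we're stuck (a knowledge gap).
    if not any(rule(state) for rule in productions):
        break

print("x =", state["rhs"])  # x = 4.0
```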



In order to optimize the number of productions we store, we want to introduce as much variation into the ‘example consumption’ process as we can.

— We can vary the initial state (Concepts and constraints for solving the problem)
— We can vary the final state (What we want to solve)
— We can vary the path that we use to get from the initial to the final state (The strategy we plan to use).

If we want to come up with a more difficult example, we would:

— Incorporate more difficult concepts (Initial State)
— Change the constraints in a way that makes the problem more difficult (Initial State)
— Change the final state so that it is harder to get to
— Use a more difficult strategy to get there.

On the other hand, if we want to have simpler examples we would:

— Limit the number and complexity of concepts
— Ease the constraints of the task
— Make a simpler final state
— Use a simpler strategy to get to the final state

The idea is to start with simpler examples and then increase the complexity bit by bit using the first list above.

Prompts:

“Can you give me an example of concept [ A ], walking me through it step by step and getting into the details/numbers? “

Note: This doesn’t mean that you don’t need to practice; we will later see that practice is a crucial part of our system. But when you’re just starting out, examples are more beneficial. There does come a point, though, where examples become less effective than consistent practice; this is known as the expertise-reversal effect.

Actionable Steps:

Pick a concept or group of concepts that you want to learn the ‘how’ behind.


Start simple. Then, bit by bit, increase the complexity along one of the three (really four) dimensions mentioned above.



Protip: You can collect examples with different variations and compare them to spot differences; comparisons make learning patterns a quicker process.



Protip: When asking for examples, be sure to include phrases like “use numbers” or “detailed and concrete examples” in your prompt. This ensures that it goes into the details, which is what we want at this stage.

Example of generating examples

Below is an example of how I tried to acquire different patterns while learning concepts in machine learning:

Prompt: Can you walk me through a step-by-step example of how to implement min-max scaling?

ChatGPT:

Certainly! Min-Max Scaling is a simple technique to scale your data into a specific range, usually 0 to 1, or -1 to 1. Here’s how you can do it step by step:

1) Identify the Data:

Assume you have data: X = [4, 8, 15, 16, 23, 42]

2) Find Min and Max:

  • Minimum Value (min): 4

  • Maximum Value (max): 42

3) Apply the Formula:

Formula: X_new = (X − X_min) / (X_max − X_min)

4) Apply this formula to each data point in X:

  • X_new[0] = (4 − 4)/(42 − 4) = 0

  • X_new[1] = (8 − 4)/(42 − 4) ≈ 0.10

  • X_new[2] = (15 − 4)/(42 − 4) ≈ 0.28

  • X_new[3] = (16 − 4)/(42 − 4) ≈ 0.31

  • X_new[4] = (23 − 4)/(42 − 4) ≈ 0.49

  • X_new[5] = (42 − 4)/(42 − 4) = 1

Now, your scaled data X_new is [0, 0.10, 0.28, 0.31, 0.49, 1]

Prompt: “Now with that in mind can you vary the constraints we are given for min-max scaling and give another example with these varied constraints.”

ChatGPT:

Certainly! Let’s say you want to scale your data into a range of -1 to 1 instead of 0 to 1. The formula will change slightly:

X_new = ((X − X_min)/(X_max − X_min)) × 2 − 1

Let’s apply this formula to your data X = [4, 8, 15, 16, 23, 42] with the same min and max values:

  • X_new[0] = ((4 − 4)/(42 − 4)) × 2 − 1 = −1

  • X_new[1] = ((8 − 4)/(42 − 4)) × 2 − 1 ≈ −0.8

  • X_new[2] = ((15 − 4)/(42 − 4)) × 2 − 1 ≈ −0.44

  • X_new[3] = ((16 − 4)/(42 − 4)) × 2 − 1 ≈ −0.4

  • X_new[4] = ((23 − 4)/(42 − 4)) × 2 − 1 = 0

  • X_new[5] = ((42 − 4)/(42 − 4)) × 2 − 1 = 1

Now, your scaled data X_new is [−1, −0.8, −0.44, −0.4, 0, 1].

Prompt: “Now again give me another example but I want you to also mix in a few other concepts from NLP into it.”

ChatGPT

Certainly! Let’s consider a text dataset, where you are analyzing the frequency of certain words or phrases. You have the following word counts: X = [5, 10, 20, 25, 30, 60] . You want to scale these counts into a range of -1 to 1 for further analysis, perhaps as features for a machine learning model.

The formula for Min-Max scaling into a range of -1 to 1 is:

X_new = ((X − min)/(max − min)) × 2 − 1

Applying this formula to your data:

  • X_new[0] = ((5 − 5)/(60 − 5)) × 2 − 1 = −1

  • X_new[1] = ((10 − 5)/(60 − 5)) × 2 − 1 ≈ −0.8

  • X_new[2] = ((20 − 5)/(60 − 5)) × 2 − 1 ≈ −0.4

  • X_new[3] = ((25 − 5)/(60 − 5)) × 2 − 1 ≈ −0.2

  • X_new[4] = ((30 − 5)/(60 − 5)) × 2 − 1 ≈ 0

  • X_new[5] = ((60 − 5)/(60 − 5)) × 2 − 1 = 1

Now, your scaled word counts X_new are [-1, -0.8, -0.4, -0.2, 0, 1] which can now be used as features in, for instance, a sentiment analysis model or other NLP tasks, with the benefit of having normalized the varying frequency counts.
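If you want to sanity-check AI-generated worked examples like these, a few lines of code go a long way. Here’s a minimal sketch of min-max scaling to an arbitrary range (plain Python, no libraries); running it will show you which of the rounded values above are only rough approximations.

```python
# Minimal sketch: min-max scaling to an arbitrary [low, high] range, handy for
# double-checking AI-generated worked examples like the ones above.
def min_max_scale(data, low=0.0, high=1.0):
    x_min, x_max = min(data), max(data)
    return [low + (x - x_min) / (x_max - x_min) * (high - low) for x in data]

# The [0, 1] example and the [-1, 1] word-count example from the responses above.
print([round(v, 2) for v in min_max_scale([4, 8, 15, 16, 23, 42])])
print([round(v, 2) for v in min_max_scale([5, 10, 20, 25, 30, 60], low=-1, high=1)])
```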

Note: I’ve also found examples useful for learning the details of declarative knowledge, not just procedural knowledge.

Retrieval Methods

Now…

Even if you did all the previous steps correctly, you’re still not going to reach the level of mastery you want, because you’re missing a few important pieces of the puzzle.

Two of them are resolved by retrieval practice:

— A system for finding knowledge gaps

— A way to strengthen your acquired knowledge

Note: We will get into the third missing piece once we reach the feedback section.

Being able to identify gaps and strengthen prior knowledge is important for ‘obvious’ reasons. If we can identify and fill in gaps, we gain detailed knowledge that’s difficult to get using other methods. If we can strengthen what we’ve learned, we’ll be more flexible and confident with the ideas we work with.

As we do, something interesting happens.

We rise up through Noel Burch’s hierarchy of competence model.

Unconscious competence is the kind of feeling you get when you can fluidly navigate between concepts and quickly execute processes with minimal to no errors.

It’s the feeling of mastery.

Important Note: You might think that the point of doing retrieval is to get the right answer. But this is a fallacy. The purpose is to get as many wrong answers out of the way as we can.

So…

Instead of measuring your success by the questions you get right, focus on finding the most gaps per unit of time.

Gaps/Unit Time

Caveat: Repeated retrieval practice can be beneficial for improving the speed and strength of concepts. But beyond a certain point, repetition of the same processes produces hardly any benefits.

Note: Interestingly enough, I’ve found that examples can also be used to find gaps.

Note: In the section that follows we will dive deeper into what signs to look for when identifying a gap as well as approaches for filling it in. For now, we will look into different retrieval techniques.

Retrieval Techniques

I’ll provide two different retrieval techniques: one aimed at helping correct declarative knowledge and the other aimed at helping correct procedural knowledge. They aren’t mutually exclusive; sometimes while doing one, we will realize the gap is actually of a different nature.

Conceptual/Declarative
— Relational Teaching (done in a specific way)

Procedural
— Relational Problem Solving

Note: There are other techniques we could use, but I’ve found these to be highly effective- especially when we combine them with AI.

Teaching:

Write/type a list of the concepts that you are trying to learn.

Then pick and choose different concepts from the list, and ask the AI to prompt you to explain:

1. Different scenarios in which the concepts can be used

2. Different relationships between the concepts

3. What happens when you incorporate different concepts together

Protip: You can also ask it to prompt you to not just explain scenarios, relationships, and incorporations, but also to compare and contrast each of them.

Tips for when you’re teaching that don’t necessarily involve AI. Your explanation should be:

1) Simple

2) Condensed

3) Insightful/High value

There are different ways you can achieve these effects. Here are a few that I know of:

Metaphors

Analogies

Examples/case studies

Lower grade language

Visuals

Frameworks

Break it down into sub-concepts

Give a general overview of the idea first

Relational Problem Solving

This idea is similar to what we talked about in the section on exposing ourselves to different examples, except that instead of using problems as examples, we now actually use them as problems.

The list isn’t exhaustive, but here are a few ways in which we can increase variation or reduce variation depending on which effect we want to generate.

1. Considering sub-goals of the multi-step process (Changing final state)

2. Adding or removing concepts to generate a new goal. (Changing final state and initial state)

3. Varying the constraints of the task (Changing initial states)

4. Varying the context in which we use the problem (Changing final states)

5. Varying the strategy we use to reach the goal (Changing the path)

(The same kind of variations that we saw in the example part)

Just like we saw in the section on using examples, we want to keep our problems in the Goldilocks zone of difficulty.

Too complicated, and we lack prerequisite knowledge and have too many gaps to spot all at once.

Too simple, and we might just be repeating known patterns.

Examples of different Variations

1) Breaking it down into subproblems (Variation of the final state)

Notice how the question ‘7 times 8’ can be split into a chain of simpler questions: What is 7 times 10? What is 70 − 7? What is 63 − 7? It’s the same idea as breaking a large problem down into a bunch of subproblems.

Prompt: Given a problem, such as [Specify Problem], identify ways to break it down into simpler sub-problems or sub-questions.

2) Concept Addition/Removal (Variation of initial and final state)

Start with a list of concepts




For instance:


1. Force
2. Mass
3. Acceleration
4. Friction
5. Gravity
6. Air Resistance
7. Initial Speed
8. Final Speed
9. Distance
10. Time

Now the fun starts. Here are different prompts you could use to remove or add concepts in this specific case.

Prompt 1: Generate a problem using minimal concepts, such as just Force, Mass, and Acceleration.

Response: Problem 1: A 2 kg object is subjected to a force of 10 N. What is its acceleration?

Prompt 2: Add “Friction” to the list and generate a new problem.

Response: Problem 2: A 2 kg object is subjected to a force of 10 N while experiencing a frictional force of 2 N. What is its net acceleration?

Prompt 3: Add more concepts like Gravity and Time.

Response: Problem 3: A 2 kg object is subjected to a force of 10 N and experiences a frictional force of 2 N. It is also subject to gravity. How far will it move in 2 seconds?

Prompt 4: Now remove gravity

Response: A 2 kg object is subjected to a force of 10 N and experiences a frictional force of 2 N. How far will it move in 2 seconds without the influence of gravity?
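As with the worked examples earlier, it’s worth double-checking the numbers in generated problems. Here’s a minimal sketch for the force-and-friction problems above, assuming the object starts from rest and friction directly opposes the applied force.

```python
# Minimal sketch: checking the generated force/friction problems above.
# Assumes the object starts from rest and friction directly opposes the applied force.
mass = 2.0            # kg
applied_force = 10.0  # N
friction = 2.0        # N
duration = 2.0        # s

net_force = applied_force - friction            # F_net = F_applied - F_friction
acceleration = net_force / mass                 # a = F_net / m
distance = 0.5 * acceleration * duration ** 2   # d = (1/2) * a * t^2, starting from rest

print(f"net acceleration: {acceleration} m/s^2")   # 4.0 m/s^2
print(f"distance after {duration} s: {distance} m")  # 8.0 m
```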

3) Contextual Variation (Variation of Final State)

Contextual Variation Examples:

Physics: Newton’s Second Law (F = ma)
Standard Context: A box sliding down an inclined plane.

Contextual Variation: An elevator accelerating upward or downward.
Another Context: A rocket’s acceleration in space, considering the force of thrust and change in mass.

Protip: If you apply contextual variation using a very unusual context, one where those concepts aren’t typically used, you get a surprising amount of learning out of it.

Prompt: For the problem [Specify Problem], walk me through various different contexts in which it can be used. Specific contexts to consider could be [XYZ].

4) Task Variation (Variation of the initial conditions)

The idea here is to alter the constraints you have for getting to a certain goal.

Prompt: For the problem [Specify Problem], walk me through various ways that simplify or modify this problem by changing its values or conditions without altering the core concepts. Specific aspects to consider changing could be [XYZ].

5) Variable path (Variation of strategy)

Variability can also come in the form of maintaining the desired outcome but changing the strategy we use to get there. For instance, in physics you can try to solve the equations of motion of a classical mechanics problem using different solution techniques: the Newtonian, Lagrangian, or Hamiltonian framework.

Prompt:
Given [Problem XYZ], prompt me with different strategies [Strategy A, Strategy B, Strategy C, etc.] that could be used to solve it.

Protip: When solving problems, tackle them in layers. Work on several problems in parallel so that your subconscious can take over and aid you; this will prevent you from becoming too fixated on one problem. It’s like the idea of layered learning, except its usefulness comes from a different reason.

Feedback Flywheel

One of the hardest parts about learning is not knowing what it is that you don’t know.

The point of this section is to solve this issue by:

1. Better understanding the different gaps we could encounter

2. Knowing how to fill them in

Here are a few signs you should watch out for.

Signs of a gap:

1) Feeling like you lack confidence about an idea

2) Getting it wrong

Note: These two things will happen as you go through your retrieval sessions.

But if we want to fill in knowledge gaps. We’ll need to define feedback from first principles.

Feedback: Information about the gap between where you are and where you want to be, which is then used to alter the gap in some way.

The key takeaway is that we need to look for the ‘right’ information. To do this we need to better understand the gap itself, not just have awareness of the gap.

A useful way to get this done is by passing it through the following filter.

I call it the Knowledge Gap Quadrant (cheesy name, I know).

Knowledge Gap Quadrant:

As you encounter signs of a gap, ask yourself:

Is it a singular concept that I’m missing or a connection between concepts?

Is the error conceptual? Or does it have more to do with the how behind using concepts to achieve an outcome?

Note: It’s often a combination of individual concepts and connections, and of procedural and declarative, rather than one of these alone. So you’ll usually find gaps all around, more in some places than others (this is what I’ve noticed).

After passing it through this filter, and understanding the nature of the gap, we send it through the following feedback cycles.

Note: Another reminder. Don’t worry too much about making sure that the gap you find is in the right quadrant. Each quadrant helps the others, so solving one will benefit them all. Furthermore, when you try to encode it again, you often notice the error is different.

Note: You could be more specific than the knowledge quadrant when looking for errors. For instance, if your issues arise in the form of declarative connections, you could ask yourself more specific questions: Is it about not understanding how it connects to the big-picture concepts? Is it about not being sure how it connects to certain cues? Or is it just about how it relates to concept XYZ? And so on.

Important note:
As you start to fill in one gap, you’ll notice others arising.

This is expected. And actually good because you are getting more gaps out of the way.

I call this the gap solution paradigm: one gap leads to many more gaps, which eventually die off as we manage to fill them in.

Note: There is one illusion we fall prey to when learning something new that makes it hard to spot and fill in gaps. It’s called silly mistake syndrome: when we can’t solve a problem but the answer makes sense to us afterwards, we think we understand it. But if your behavior doesn’t change the next time the same situation presents itself, you don’t. This is why it’s important to retrieve in similar ways after ‘filling in’ the gap, to double-check.

Scheduling

The last piece of the puzzle has to do with scheduling when and what we should be covering.



But, we want to do it in a way that focuses on:

1) Weakest links in knowledge

2) Most important concepts

3) Proper spacing

Note: By proper spacing, we mean implementing the spaced repetition strategy in order to combat forgetting. The idea is that studying the same amount of time in spaced intervals is more effective than studying that same amount of time in one big chunk.

This forgetting effect is nicely captured by the so-called “Ebbinghaus Forgetting Curve”:


You’ll notice that after each review session, the curve gets less steep. This is what we want, because it then takes longer to forget before the next review.
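For intuition, here’s a toy sketch of the forgetting curve as R = e^(−t/S), where each review increases the stability S so the curve flattens. The numbers are illustrative only, not empirical.

```python
# Toy sketch of an Ebbinghaus-style forgetting curve, R = exp(-t / S).
# Each review multiplies the stability S, so retention decays more slowly
# afterwards. All numbers are illustrative, not empirical.
import math

def retention(days_since_review, stability):
    return math.exp(-days_since_review / stability)

stability = 1.0
for review in range(1, 5):
    r = retention(days_since_review=2, stability=stability)
    print(f"after review {review}: retention two days later = {r:.2f}")
    stability *= 2  # the spacing effect: each review makes forgetting slower
```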

The way I like to set this up is by using the what, when, and how framework.


We give information about:

The concepts we are going to study, how weak we feel on each one, how important each is to the overall subject, and the time constraints we have each day over a set period, so that the AI can optimize the spacing effect.

We then want the AI to act like a calculator that quickly organizes a schedule for us so that we don’t have to do the manual work and figure out the optimal schedule ourselves.

Here’s an example of a prompt that I gave for studying different topics in trigonometry. I didn’t provide the list of concepts explicitly because it had already been provided in a previous prompt.


1st Prompt: “I’m going to provide you with my confidence level for each topic in trigonometry on a scale of 1–10. I also have an importance level for each topic. I plan to study these topics for 3 hours each day for a week. After I give you this information, could you please create an optimal study schedule for me that uses the spaced repetition method? I want to review all the concepts each day but with varying amounts of time based on my confidence and their importance.”

Example:

Prompt: Basic Concepts: 9, Trigonometric Functions: 7, Trigonometric Identities: 5, Trigonometric Equations: 7, Applications: 4, Advanced Topics: 4, Real World Applications: 3. Also: the first 4 topics you gave me are the most important, just so you know.


Note: You could also be more specific and give numbers for the importance part too, but in this case, I decided to be a little bit more general and only provide numbers relevant to my weakest links.



Response


Day 1: Initial Exposure
Basic Concepts: 15 minutes
Trigonometric Functions: 30 minutes
Trigonometric Identities: 45 minutes
Trigonometric Equations: 30 minutes
Applications: 30 minutes
Advanced Topics: 15 minutes
Real-World Applications: 15 minutes


Day 2: First Review
Basic Concepts: 10 minutes
Trigonometric Functions: 20 minutes
Trigonometric Identities: 40 minutes
Trigonometric Equations: 20 minutes
Applications: 25 minutes
Advanced Topics: 10 minutes
Real-World Applications: 15 minutes



Day 3: Second Review
Basic Concepts: 10 minutes
Trigonometric Functions: 20 minutes
Trigonometric Identities: 35 minutes
Trigonometric Equations: 20 minutes
Applications: 25 minutes
Advanced Topics: 10 minutes
Real-World Applications: 20 minutes



Day 4: Third Review
Basic Concepts: 5 minutes
Trigonometric Functions: 15 minutes
Trigonometric Identities: 35 minutes
Trigonometric Equations: 20 minutes
Applications: 30 minutes
Advanced Topics: 15 minutes
Real-World Applications: 20 minutes


Day 5–7: Final Reviews and Practice
Days 5–7: 3 hours each day
Reason: Use these days to revisit topics based on your performance during the first four days. Allocate more time to areas where you find gaps.





Quick Tip: For the planning part, you can put the importance and confidence metrics in a Google Sheet and share them with the AI so that it returns a table-based schedule.
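If you’d rather skip the spreadsheet, here’s a minimal sketch of the kind of weighting the AI is doing for us: splitting each day’s minutes in proportion to importance × (10 − confidence). The confidence numbers come from the example prompt above; the importance numbers and the heuristic itself are my own illustrative assumptions, not a standard formula.

```python
# Minimal sketch of the 'AI as calculator' idea: split a daily time budget
# across topics in proportion to importance x (10 - confidence).
# The confidence numbers come from the example prompt above; the importance
# numbers and the heuristic itself are illustrative assumptions.
topics = {
    # name: (confidence 1-10, importance 1-10)
    "Basic Concepts":           (9, 9),
    "Trigonometric Functions":  (7, 9),
    "Trigonometric Identities": (5, 9),
    "Trigonometric Equations":  (7, 9),
    "Applications":             (4, 5),
    "Advanced Topics":          (4, 4),
    "Real World Applications":  (3, 4),
}

daily_minutes = 180  # 3 hours per day
weights = {name: importance * (10 - confidence)
           for name, (confidence, importance) in topics.items()}
total_weight = sum(weights.values())

for name, weight in weights.items():
    minutes = round(daily_minutes * weight / total_weight)
    print(f"{name}: {minutes} min")
```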

Summary

In this article, we learned:

— The difference between studying and learning

— How to provide effective prompts

— How to collect resources using AI

— How to encode declarative knowledge more easily using the learning cycle

— How to use examples to encode procedural knowledge

— How to use retrieval methods to improve the strength of knowledge and find gaps

— How to be aware of gaps, analyze them, and fix them

— How to schedule study sessions once you’ve found gaps in knowledge using AI

References

This article arose as a combination of personal trial and error, a few research articles, and a big chunk of Justin Sung’s course and Scott Young’s articles on learning. I couldn’t have done it without them, so thank you.

Scott Young: Long-Term Memory Guide (https://www.scotthyoung.com/blog/2019/02/15/memory/)

Scott Young: Working Memory Guide (https://www.scotthyoung.com/blog/2019/04/24/working-memory/)

Scott Young: Cognitive Load Theory And Its Applications for Learning (https://www.scotthyoung.com/blog/2022/01/04/cognitive-load-theory/)

Scott Young: Do You Learn More by Struggling on Hard Problems (https://www.scotthyoung.com/blog/2022/05/25/do-you-learn-more-by-struggling-on-hard-problems/)

Scott Young: Variability, Not Repetition Is The Key to Mastery (https://www.scotthyoung.com/blog/2022/10/26/variable-mastery/)

Scott Young: How do we learn complex skills? Understanding ACT-R Theory (https://www.scotthyoung.com/blog/2022/02/15/act-r/#:~:text=In%20the%20ACT%2DR%20theory,current%20representation%20of%20the%20problem.)

Justin Sung’s Icanstudy course (https://icanstudy.com/)

John Hattie: The Power of Feedback [2007] (https://journals.sagepub.com/doi/abs/10.3102/003465430298487#:~:text=John%20Hattie%20and%20Helen%20Timperley,be%20either%20positive%20or%20negative)

The Incubation Effect (https://telrp.springeropen.com/articles/10.1186/s41039-021-00171-x)

Share your opinions and comments below. I would love to hear suggestions on topics for future articles.