Building a Personal Knowledge Base for Deep Learning
Most people do not have a knowledge base. They have a download folder.
If you have been studying a hard subject for more than a few months, you probably have dozens of PDFs, a sprawling Notion workspace, half-written notes in three different apps, and a vague sense that you should "organise all this someday." When you sit down with an AI to study, you re-paste the same five paragraphs into the chat window every time, because pulling up the actual source is more friction than pasting again.
That is the problem this guide solves. Not "how to take perfect notes" — that is a lifetime project. Just how to assemble a personal knowledge base that your tools can actually reach into, in an afternoon.
What a knowledge base is, exactly
A knowledge base is a curated bundle of sources — your notes, your readings, your references — attached to a study goal. Three properties matter:
- It is searchable. You (or your tutor) can pull up the relevant passage in under five seconds.
- It is grounded. Answers cite your documents, not the model's training data.
- It is reusable. Next session, the bundle is still there. You do not rebuild it from scratch.
The third property is the one most people skip. They build context for one chat session, then walk away. Two weeks later they are doing it again. A knowledge base persists.
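To make the three properties concrete, here is a minimal sketch in Python. It assumes the knowledge base is nothing more than a folder of plain-text or Markdown files; the `Hit` class and the `search` function are hypothetical names for illustration, not how any particular tool works. The folder persists between sessions (reusable), the function finds passages by keyword (searchable), and every result carries the name of the file it came from (grounded).

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Hit:
    source: str   # file name of the document the passage came from
    passage: str  # the matching paragraph, verbatim


def search(kb_dir: Path, query: str) -> list[Hit]:
    """Return each paragraph in kb_dir's .txt/.md files that contains every query word."""
    words = query.lower().split()
    hits = []
    for path in sorted(kb_dir.iterdir()):
        if path.suffix not in {".txt", ".md"}:
            continue  # this sketch only reads plain-text formats
        for paragraph in path.read_text(encoding="utf-8").split("\n\n"):
            text = paragraph.strip()
            if text and all(word in text.lower() for word in words):
                hits.append(Hit(source=path.name, passage=text))
    return hits
```

Real tools use smarter retrieval than substring matching, but the shape is the same: a stable place on disk, a way to look things up, and a source attached to every answer.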
A four-step setup
Here is the setup we recommend for a single subject. Pick one — networking, organic chemistry, distributed systems, art history. Do not start with three subjects at once.
Step 1: Choose the spine
Every knowledge base has a spine. That is the canonical source you would point a friend at if they asked "what should I read?" Usually one textbook, one paper, or one course. Sometimes two.
Resist the urge to add five spines. The spine sets the vocabulary and the scope. With several spines the vocabulary drifts and the tutor gets confused: every term has two definitions, every concept has two notations.
Step 2: Add the supporting cast
Around the spine, you can add:
- Lecture notes (yours or someone else's)
- Survey papers or review articles
- Specific chapters from related textbooks
- A cheat sheet or formula reference
- Anything you have already written — drafts, problem-set solutions, summaries
Aim for 5–15 documents. More than that and your tutor will struggle to retrieve the right passage; fewer than that and you have not given it enough to ground answers.
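If you like a quick sanity check, the sizing advice from steps 1 and 2 fits in a few lines of Python. This is only a sketch: `check_bundle` is a hypothetical helper, and it assumes the 5–15 range counts the spine together with the supporting documents, which is one reasonable reading.

```python
def check_bundle(spines: list[str], supporting: list[str]) -> list[str]:
    """Return warnings for a bundle that strays from the sizing guidelines."""
    warnings = []
    # Step 1: usually one spine, sometimes two.
    if not 1 <= len(spines) <= 2:
        warnings.append(f"expected 1 or 2 spines, found {len(spines)}")
    # Step 2: aim for 5 to 15 documents (assumption: spines included in the count).
    total = len(spines) + len(supporting)
    if total < 5:
        warnings.append(f"only {total} documents; answers may be thinly grounded")
    elif total > 15:
        warnings.append(f"{total} documents; retrieval may struggle")
    return warnings
```

An empty list means the bundle is within the guidelines. Treat the warnings as nudges, not rules.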
Step 3: Write a one-page goal doc
This is the unsexy step that makes everything else work. Open a fresh document and write, in plain language:
- What are you trying to learn?
- Why are you trying to learn it? (Exam? Job? Curiosity?)
- What do you already know?
- What is the deadline, if any?
- What does "done" look like?
Save this as the first document in your knowledge base. Every tutoring session can reach for it. It is the difference between asking a friend "explain quantum entanglement" and asking "explain quantum entanglement at the level of someone who just finished Griffiths' chapter 4 and has a final in three weeks."
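If a blank page slows you down, a small template helps. The sketch below turns the five questions above into a plain-text goal doc; `GOAL_QUESTIONS` and `render_goal_doc` are hypothetical names, and the layout is just one option.

```python
GOAL_QUESTIONS = [
    "What are you trying to learn?",
    "Why are you trying to learn it? (Exam? Job? Curiosity?)",
    "What do you already know?",
    "What is the deadline, if any?",
    'What does "done" look like?',
]


def render_goal_doc(answers: list[str]) -> str:
    """Pair each question with its answer and return the goal doc as text."""
    if len(answers) != len(GOAL_QUESTIONS):
        raise ValueError(f"expected {len(GOAL_QUESTIONS)} answers, got {len(answers)}")
    lines = ["Goal doc", ""]
    for question, answer in zip(GOAL_QUESTIONS, answers):
        # Keep unanswered questions visible so they get filled in later.
        lines += [question, answer.strip() or "(not answered yet)", ""]
    return "\n".join(lines).rstrip() + "\n"
```

Write the result to a file whose name sorts first in the folder (for example, one starting with `00`, purely as a convention) so it is the first thing you and your tutor see.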
Step 4: Pick a working space
Now you need somewhere to actually do the studying — write notes, ask questions, draft explanations. The trap is using the same tool for storing sources and for working. Storing wants stability; working wants to be messy.
RoxWhy splits these on purpose. Knowledge bases hold the spine and supporting cast. The chat and Co-Writer surfaces are where you actually work. Whatever tool you pick, look for that separation.
Maintaining the knowledge base
A knowledge base that nobody touches is a graveyard. A few small habits keep it alive:
Add things back
When you understand something well — really well, in your own words — write that explanation as a new document and add it to the knowledge base. Now your tutor has access to your phrasing of the idea, which compounds with every session.
This is the most underrated move in self-directed learning. You start with sources you read. Over time, half the knowledge base becomes things you wrote.
Prune ruthlessly
Once a month, walk through the knowledge base and delete anything you do not actually use. Stale sources confuse retrieval; they pollute every search the tutor does. If you have not opened a document in six weeks, delete it. You can always re-add it.
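The six-week rule is easy to automate if you keep even a rough record of when you last opened each document. The sketch below assumes such a record exists as a mapping from document name to date; `stale_documents` is a hypothetical helper, and it only lists candidates rather than deleting anything.

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(weeks=6)  # the six-week threshold from the rule above


def stale_documents(last_opened: dict[str, date], today: date) -> list[str]:
    """Return the names of documents last opened more than six weeks before today."""
    cutoff = today - STALE_AFTER
    return sorted(name for name, opened in last_opened.items() if opened < cutoff)
```

For example, with `today = date(2024, 3, 1)`, a document last opened on 1 January 2024 is listed and one opened on 20 February 2024 is not. Skim the list before deleting; the spine and the goal doc probably deserve an exemption.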
Keep the goal doc current
The original goal doc you wrote in step 3 will rot. By week four, "I want to pass the exam" should have evolved into "I want to be solid on chapters 5–7 and shaky-but-functional on chapter 8 by Friday." Update it. Your tutor reads it.
What knowledge bases unlock
Once your sources are in one place and your tutor can reach them, a few things become possible that were not before:
- Cross-source synthesis. "How does the definition in chapter 4 relate to the proof in the survey paper?"
- Personalised practice. "Make me five practice problems at the level of the chapter 6 exercises, but using examples from my lecture notes."
- Gap finding. "Based on what I have studied, what topics am I missing?"
- Drafting from sources. "Write a one-paragraph summary of my understanding of consensus algorithms, citing the readings."
None of those work with a generic chatbot, because the chatbot does not have your sources. They become trivial once it does.
A real example
We worked with one early user who was studying for a graduate-level distributed systems course. Their initial knowledge base:
- The textbook (1 spine document)
- 4 papers the professor had assigned
- Lecture notes from weeks 1–6
- A goal doc with the exam date and the three topics they felt weakest on
After three weeks of tutoring sessions, the same knowledge base had grown to include:
- The original 6 sources (the textbook, the 4 papers, and the lecture notes)
- 8 of their own written summaries (one per major concept)
- A "frequently confused" doc they had built up of pairs of terms they kept mixing up
- A revised goal doc
The compounding is the point. The first session, the tutor mostly explained. By week three, the tutor was challenging their explanations against their own past summaries. That only works if the summaries are in the knowledge base.
Try it this week
A small commitment that pays back fast:
- Pick one subject.
- Spend 30 minutes assembling 5–10 sources into a knowledge base.
- Write a one-page goal doc.
- Have one tutoring session against it.
- Add your written explanation back as a new source.
That is the loop. It works on any tool that supports persistent knowledge bases.
If you want a calm, focused workspace built around exactly this loop, create a RoxWhy account and start from one question.

