The Basics: What is a CPU Core?
A CPU core is a physical processing unit inside your CPU chip. Think of it as a worker in a factory. A single-core CPU has one worker; a quad-core CPU has four. Each core can independently execute instructions.
Many modern CPUs also support simultaneous multithreading (SMT), which Intel markets as Hyper-Threading and AMD simply calls SMT. It makes one physical core appear as two logical cores to the OS by letting two threads share the core's execution resources, so that when one thread stalls (e.g., waiting for memory) the other can use the otherwise idle execution units.
So with hyperthreading, a 4-core CPU gives you 8 logical cores. The OS sees 8 workers, but physically only 4 exist — the extra 4 are "virtual" slots that help fill in idle cycles.
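To check how many logical cores the OS reports on a particular machine, a minimal Python sketch is shown below. The helper name `logical_core_count` is made up for illustration; `os.cpu_count()` is the standard-library call, and it counts logical cores, so hyperthreaded siblings are included.

```python
import os


def logical_core_count() -> int:
    """Return the number of logical cores the OS reports (at least 1)."""
    # os.cpu_count() may return None when the count cannot be determined,
    # so fall back to 1 in that case.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"Logical cores visible to the OS: {logical_core_count()}")
```

On a 4-core CPU with hyperthreading enabled, this would normally report 8.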
What is a Process vs. a Thread?
Before multithreading makes sense, you need to understand the difference between a process and a thread.
A process is a running program with its own private memory space (RAM), file handles, and other resources. When you open Chrome, that's a process. When you open VS Code, that's another process. By default, they cannot see each other's memory.
A thread is a unit of execution within a process. A single process can have many threads, and they all share the same memory. This is powerful (fast communication) but dangerous (race conditions).
Key takeaway: threads within a process share memory — that's what makes them fast, but also what requires careful synchronization (more on that later).
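As a small illustration of shared memory, the sketch below starts several threads that all append to the same list. The names `run_workers` and `worker` are hypothetical. The point is only that every thread sees the same object, with no copying or message passing involved.

```python
import threading


def run_workers(count: int) -> list:
    """Start `count` threads that all write into one shared list."""
    shared_log = []  # one object in the process's memory, visible to all threads

    def worker(name: str) -> None:
        # Every thread appends to the very same list object.
        shared_log.append(name)

    threads = [
        threading.Thread(target=worker, args=(f"thread-{i}",))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every worker has finished

    # Arrival order depends on scheduling, so sort for a stable result.
    return sorted(shared_log)


if __name__ == "__main__":
    print(run_workers(3))  # ['thread-0', 'thread-1', 'thread-2']
```

Two separate processes would each get their own copy of such a list and would need an explicit channel (a pipe, a socket, shared memory) to exchange the data.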
What Does the OS Do? The Scheduler
Your OS has a scheduler — its job is to decide which thread runs on which core at what time. The CPU doesn't know about threads directly; the OS manages this entirely.
The OS uses a technique called time-slicing: every thread gets a small slice of CPU time (typically ~1–10 milliseconds), then is paused and another thread gets a turn. This happens so fast that it feels like everything is running simultaneously, even on a single core.
This is called concurrency — multiple tasks making progress over time. True parallelism is when tasks literally run at the same instant on different cores.
Scenario 1: More cores than threads (cores > threads)
This is the easy case. If you have 8 cores and only 3 threads running, each thread gets its own dedicated core. No competition, no waiting. Every thread runs at full speed simultaneously.
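In this scenario, the unused cores simply sit idle. Your threads do not compete for CPU time and are rarely, if ever, switched out, so each one runs as fast as the hardware allows. The trade-off is that the idle cores' capacity goes unused.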
Scenario 2: Threads equal cores (threads == cores)
Each thread maps to exactly one core, which is still ideal: every core is busy, no capacity sits idle, and no thread has to wait for CPU time.
Scenario 3: More threads than cores (threads > cores) — The interesting case
This is the most common real-world situation. Your OS runs hundreds of threads even on a basic laptop. With more threads than cores, the OS must multiplex — rapidly switching cores between threads.
This is called a context switch: the OS saves the full state of the currently running thread (registers, program counter, stack pointer) and loads the state of the next thread. It's like pausing a video game, saving your spot, loading a different save file, playing a bit, then swapping back.
[Interactive animation: the OS scheduler rotating threads across 2 cores. The "context switch" ticks are overhead; the CPU does useful work during all other ticks.]
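The rotation of threads across cores can also be reproduced as a toy simulation. This is a simplified sketch, not how a real scheduler is implemented: the function name `simulate_round_robin`, the tick-based model, and the rule "preempt only when another thread is waiting" are all assumptions made for illustration, and the model charges no time for the switches themselves.

```python
from collections import deque


def simulate_round_robin(work: dict, cores: int, slice_ticks: int) -> tuple:
    """Toy round-robin scheduler.

    `work` maps a thread name to the ticks of CPU work it needs.
    Returns (total ticks elapsed, number of preemptive context switches).
    """
    ready = deque(work.items())   # threads waiting for a core
    on_core = [None] * cores      # (name, remaining) running on each core
    slice_left = [0] * cores      # ticks left in each core's current slice
    ticks = 0
    switches = 0

    while ready or any(slot is not None for slot in on_core):
        # Hand every idle core to the next ready thread, if there is one.
        for c in range(cores):
            if on_core[c] is None and ready:
                on_core[c] = ready.popleft()
                slice_left[c] = slice_ticks

        # Run every busy core for one tick.
        for c in range(cores):
            if on_core[c] is None:
                continue
            name, remaining = on_core[c]
            remaining -= 1
            slice_left[c] -= 1
            if remaining == 0:
                on_core[c] = None                 # thread finished
            elif slice_left[c] <= 0 and ready:
                ready.append((name, remaining))   # slice used up: preempt
                on_core[c] = None
                switches += 1
            else:
                on_core[c] = (name, remaining)    # keep running
        ticks += 1

    return ticks, switches


if __name__ == "__main__":
    # 3 threads on 2 cores: threads must take turns.
    print(simulate_round_robin({"A": 4, "B": 4, "C": 4}, cores=2, slice_ticks=2))  # (6, 3)
    # 2 threads on 4 cores: nobody is ever preempted.
    print(simulate_round_robin({"A": 3, "B": 3}, cores=4, slice_ticks=1))  # (3, 0)
```

With more cores than threads the switch count stays at zero, which matches Scenario 1; with three threads on two cores, preemptions appear as soon as a slice expires while another thread is waiting.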
What Happens During a Context Switch?
A context switch is not free. The OS must:
1. Save the current thread's CPU state: all registers, including the program counter, stack pointer, and general-purpose registers
2. Update the thread's entry in the OS thread table (marking it as "waiting" or "ready")
3. Load the next thread's saved state into the CPU registers
4. Resume execution from where the next thread left off
This takes roughly 1–10 microseconds. That sounds tiny, but if you have thousands of threads switching thousands of times per second, it adds up. This is why creating 10,000 threads for 10,000 tasks is usually a bad idea: context switch overhead can come to dominate the useful work.
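To get a feel for how the overhead adds up, the arithmetic can be written out as a short function. The switch rates used in the example are hypothetical inputs, not measurements, and the function name `context_switch_overhead` is made up for this sketch. It counts only the direct cost of the switch itself.

```python
def context_switch_overhead(switches_per_second: float,
                            cost_microseconds: float) -> float:
    """Fraction of one core's time spent on context switches (0.0 to 1.0)."""
    # switches/second * microseconds/switch = microseconds lost per second
    seconds_lost_per_second = switches_per_second * cost_microseconds / 1_000_000
    # A core cannot lose more than all of its time.
    return min(seconds_lost_per_second, 1.0)


if __name__ == "__main__":
    # 1,000 switches/s at 5 microseconds each: 5 ms lost per second.
    print(f"{context_switch_overhead(1_000, 5):.1%}")    # 0.5%
    # 100,000 switches/s at 5 microseconds each: half the core is overhead.
    print(f"{context_switch_overhead(100_000, 5):.1%}")  # 50.0%
```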
Thread States
At any moment, each thread is in one of several states, typically New, Ready, Running, Blocked (waiting), or Terminated.
The Blocked state is important: when a thread is waiting for something (a file to load, a network response, a lock to be released), it gives up the CPU, and the OS immediately hands that core to a Ready thread. A blocked thread consumes no CPU time, which is why an I/O-heavy server can keep far more threads alive than it has cores: at any given moment, most of them are blocked waiting on I/O.
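The effect of blocking can be observed with a rough timing experiment. In this sketch, `time.sleep` stands in for an I/O wait (an assumption for illustration; a real server would be waiting on sockets or files), and `run_blocking_tasks` is a hypothetical helper name.

```python
import threading
import time


def run_blocking_tasks(task_count: int, wait_seconds: float) -> float:
    """Run `task_count` threads that each block once; return elapsed seconds."""

    def task() -> None:
        # While sleeping, the thread is Blocked and uses no CPU time,
        # so the OS is free to run other threads on the core.
        time.sleep(wait_seconds)

    threads = [threading.Thread(target=task) for _ in range(task_count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_blocking_tasks(task_count=20, wait_seconds=0.1)
    print(f"20 blocking tasks took {elapsed:.2f}s in total")
```

Because the waits overlap, the total is typically close to a single wait rather than twenty waits back to back, regardless of how many cores the machine has.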
Synchronization: The Danger of Shared Memory
Since threads in a process share memory, two threads can try to read and write the same variable at the same time. This causes race conditions — bugs that are hard to reproduce because they depend on exact timing.
Classic example — the counter problem:
Imagine two threads both doing counter = counter + 1. Each thread reads the current value, adds 1, and writes it back. If both read 5 before either writes, both write 6 — but the counter should be 7. You just lost an increment.
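Real races depend on timing and are hard to trigger on demand, so the sketch below replays the two threads' steps in a fixed, hand-written order instead of using real threads. The function `run_interleaving` and the schedule format are invented for this illustration.

```python
def run_interleaving(start: int, schedule: list) -> int:
    """Replay the read/write steps of counter = counter + 1 in a given order.

    `schedule` is a list of (thread_id, step) pairs, where step is
    "read" (copy the counter into the thread's private value) or
    "write" (store private value + 1 back into the counter).
    """
    counter = start
    private = {}  # each thread's own copy of the value it read
    for thread_id, step in schedule:
        if step == "read":
            private[thread_id] = counter
        elif step == "write":
            counter = private[thread_id] + 1
        else:
            raise ValueError(f"unknown step: {step}")
    return counter


# Both threads read before either writes: one increment is lost.
racy = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]
# Each thread finishes its read and write before the other starts.
safe = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

if __name__ == "__main__":
    print(run_interleaving(5, racy))  # 6
    print(run_interleaving(5, safe))  # 7
```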
The solution is synchronization primitives:
- Mutex (mutual exclusion lock): only one thread can hold it at a time. Others block until it's released. Like a bathroom key: one person in at a time.
- Semaphore: a counter-based lock that allows N threads in at once (useful for limiting resource access).
- Atomic operations: CPU-level instructions that read-modify-write in one uninterruptible step. No lock needed for simple increments.
- Condition variable: lets a thread sleep and wake up when a condition becomes true (e.g., "wake me when the queue has items").
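Two of these primitives can be sketched with Python's standard `threading` module. `threading.Lock` and `threading.Semaphore` are real standard-library classes; the function names `locked_count` and `peak_concurrency`, and the short sleep standing in for "using a resource", are assumptions made for illustration.

```python
import threading
import time


def locked_count(thread_count: int, increments: int) -> int:
    """Increment a shared counter from several threads, guarded by a mutex."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            with lock:  # only one thread at a time may run this block
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


def peak_concurrency(thread_count: int, limit: int) -> int:
    """Admit at most `limit` threads at once; return the highest count seen."""
    slots = threading.Semaphore(limit)
    state_lock = threading.Lock()  # protects the two bookkeeping counters
    active = 0
    peak = 0

    def worker() -> None:
        nonlocal active, peak
        with slots:  # blocks while `limit` threads are already inside
            with state_lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.01)  # pretend to use the limited resource
            with state_lock:
                active -= 1

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    print(locked_count(thread_count=4, increments=10_000))  # 40000
    print(peak_concurrency(thread_count=8, limit=3))        # never more than 3
```

With the lock in place, no increment can be lost, so the final count always equals threads times increments. The semaphore version never observes more than `limit` threads inside the guarded section.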
Thread Pools: The Smart Way to Use Threads
Creating and destroying threads is expensive. A thread pool pre-creates a fixed number of worker threads and reuses them for tasks. Tasks go into a queue; idle workers pick them up.
Thread pools are used everywhere: web servers, databases, game engines. A common starting point for the pool size is the number of logical CPU cores for CPU-bound work, or somewhat higher for I/O-bound work, where threads frequently block.
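A minimal pool can be sketched with Python's `concurrent.futures.ThreadPoolExecutor`. The helpers `suggested_pool_size`, `square`, and `run_in_pool` are hypothetical, and the `io_multiplier` default of 4 is an arbitrary placeholder rather than a recommendation; the right value depends on how long tasks spend blocked.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def suggested_pool_size(io_bound: bool, io_multiplier: int = 4) -> int:
    """Rough starting point: one worker per logical core, more if I/O-bound."""
    cores = os.cpu_count() or 1  # cpu_count() can return None
    return cores * io_multiplier if io_bound else cores


def square(n: int) -> int:
    """Stand-in for a real task."""
    return n * n


def run_in_pool(values: list, workers: int) -> list:
    """Run `square` over `values` on a fixed pool of reusable worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tasks are queued; idle workers pick them up. map() returns
        # results in the same order as the inputs.
        return list(pool.map(square, values))


if __name__ == "__main__":
    size = suggested_pool_size(io_bound=False)
    print(f"pool size: {size}")
    print(run_in_pool([1, 2, 3, 4], workers=size))  # [1, 4, 9, 16]
```

One caveat specific to this example language: in standard CPython, the global interpreter lock keeps threads from speeding up pure-Python CPU-bound work, so the sizing rule here illustrates the general idea rather than a Python performance tip.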
Summary: Everything Together
Here's a mental map of how all the concepts connect:
| Concept | What it is |
|---|---|
| CPU Core | Physical processing unit; runs one thread at a time (two with hyperthreading/SMT) |
| Logical core | Virtual core exposed via hyperthreading/SMT; typically 2 per physical core |
| Process | Running program with its own private memory |
| Thread | Unit of execution within a process; shares memory with siblings |
| OS Scheduler | Decides which thread runs on which core and when |
| Time-slice | Small window (~1–10 ms) a thread gets before being preempted |
| Context switch | Saving one thread's state and loading another's (~1–10µs cost) |
| Concurrency | Multiple tasks making progress (can be on 1 core via time-slicing) |
| Parallelism | Tasks truly running simultaneously on different cores |
| Race condition | Bug caused by unsynchronized access to shared data |
| Mutex / lock | Tool to prevent race conditions — only one thread at a time |
| Thread pool | Pre-created workers that reuse threads to avoid creation overhead |
The key insight tying it all together: the OS is the air traffic controller, cores are the runways, threads are the planes, and context switches are the turnaround time lost each time one plane clears a runway and the next one takes it. The scheduler tries to keep all runways busy without letting any plane wait too long.