đ Github Interview University
Extracted from: https://github.com/jwasham/coding-interview-university
Table of Content
- Algorithmic complexity / Big-O / Asymptotic analysis
- Data Structures
- More Knowledge
-
Trees
- Trees - Intro
- Binary search trees: BSTs
- Heap / Priority Queue / Binary Heap
- balanced search trees (general concept, not details)
- traversals: preorder, inorder, postorder, BFS, DFS
-
Sorting
- selection
- insertion
- heapsort
- quicksort
- mergesort
-
Graphs
- directed
- undirected
- adjacency matrix
- adjacency list
- traversals: BFS, DFS
- Even More Knowledge
- Final Review
ââââââââ Everything below this point is optional ââââââââ
Optional Extra Topics & Resources
- Additional Books
- System Design, Scalability, Data Handling (if you have 4+ years experience)
-
Additional Learning
- Compilers
- Emacs and vi(m)
- Unix command line tools
- Information theory
- Parity & Hamming Code
- Entropy
- Cryptography
- Compression
- Computer Security
- Garbage collection
- Parallel Programming
- Messaging, Serialization, and Queueing Systems
- A*
- Fast Fourier Transform
- Bloom Filter
- HyperLogLog
- Locality-Sensitive Hashing
- van Emde Boas Trees
- Augmented Data Structures
-
Balanced search trees
- AVL trees
- Splay trees
- Red/black trees
- 2-3 search trees
- 2-3-4 Trees (aka 2-4 trees)
- N-ary (K-ary, M-ary) trees
- B-Trees
- k-D Trees
- Skip lists
- Network Flows
- Disjoint Sets & Union Find
- Math for Fast Processing
- Treap
- Linear Programming
- Geometry, Convex hull
- Discrete math
- Additional Detail on Some Subjects
- Video Series
- Computer Science Courses
- Papers
Algorithmic complexity / Big-O / Asymptotic analysis
- Nothing to implement here, youâre just watching videos and taking notes! Yay!
- There are a lot of videos here. Just watch enough until you understand it. You can always come back and review.
- Donât worry if you donât understand all the math behind it.
- You just need to understand how to express the complexity of an algorithm in terms of Big-O.
- [ ] Harvard CS50 - Asymptotic Notation (video)
- [ ] Big O Notations (general quick tutorial) (video)
- [ ] Big O Notation (and Omega and Theta) - best mathematical explanation (video)
- [ ] Skiena (video)
- [ ] UC Berkeley Big O (video)
- [ ] Amortized Analysis (video)
- [ ] TopCoder (includes recurrence relations and master theorem):
- [ ] Cheat sheet
- [ ] [Review] Analyzing Algorithms (playlist) in 18 minutes (video)
Well, thatâs about enough of that.
When you go through âCracking the Coding Interviewâ, there is a chapter on this, and at the end there is a quiz to see if you can identify the runtime complexity of different algorithms. Itâs a super review and test.
Data Structures
Arrays
- [ ] About Arrays:
-
[ ] Implement a vector (mutable array with automatic resizing):
- [ ] Practice coding using arrays and pointers, and pointer math to jump to an index instead of using indexing.
-
[ ] New raw data array with allocated memory
- can allocate int array under the hood, just not use its features
- start with 16, or if the starting number is greater, use power of 2 - 16, 32, 64, 128
- [ ] size() - number of items
- [ ] capacity() - number of items it can hold
- [ ] is_empty()
- [ ] at(index) - returns the item at a given index, blows up if index out of bounds
- [ ] push(item)
- [ ] insert(index, item) - inserts item at index, shifts that indexâs value and trailing elements to the right
- [ ] prepend(item) - can use insert above at index 0
- [ ] pop() - remove from end, return value
- [ ] delete(index) - delete item at index, shifting all trailing elements left
- [ ] remove(item) - looks for value and removes index holding it (even if in multiple places)
- [ ] find(item) - looks for value and returns first index with that value, -1 if not found
-
[ ] resize(new_capacity) // private function
- when you reach capacity, resize to double the size
- when popping an item, if the size is 1/4 of capacity, resize to half
-
[ ] Time
- O(1) to add/remove at end (amortized for allocations for more space), index, or update
- O(n) to insert/remove elsewhere
-
[ ] Space
- contiguous in memory, so proximity helps performance
- space needed = (array capacity, which is >= n) * size of item, but even if 2n, still O(n)
Linked Lists
-
[ ] Description:
- [ ] Linked Lists CS50 Harvard University - this builds the intuition.
- [ ] Singly Linked Lists (video)
- [ ] CS 61B - Linked Lists 1 (video)
- [ ] CS 61B - Linked Lists 2 (video)
- [ ] [Review] Linked lists in 4 minutes (video)
-
[ ]
C Code (video)
- not the whole video, just portions about Node struct and memory allocation
- [ ] Linked List vs Arrays:
- [ ] Why you should avoid linked lists (video)
- [ ] Gotcha: you need pointer to pointer knowledge: (for when you pass a pointer to a function that may change the address where that pointer points) This page is just to get a grasp on ptr to ptr. I donât recommend this list traversal style. Readability and maintainability suffer due to cleverness.
-
[ ] Implement (I did with tail pointer & without):
- [ ] size() - returns the number of data elements in the list
- [ ] empty() - bool returns true if empty
- [ ] value_at(index) - returns the value of the nth item (starting at 0 for first)
- [ ] push_front(value) - adds an item to the front of the list
- [ ] pop_front() - remove the front item and return its value
- [ ] push_back(value) - adds an item at the end
- [ ] pop_back() - removes end item and returns its value
- [ ] front() - get the value of the front item
- [ ] back() - get the value of the end item
- [ ] insert(index, value) - insert value at index, so the current item at that index is pointed to by the new item at the index
- [ ] erase(index) - removes node at given index
- [ ] value_n_from_end(n) - returns the value of the node at the nth position from the end of the list
- [ ] reverse() - reverses the list
- [ ] remove_value(value) - removes the first item in the list with this value
-
[ ] Doubly-linked List
- Description (video)
- No need to implement
Stack
- [ ] Stacks (video)
- [ ] [Review] Stacks in 3 minutes (video)
- [ ] Will not implement. Implementing with the array is trivial
Queue
- [ ] Queue (video)
- [ ] Circular buffer/FIFO
- [ ] [Review] Queues in 3 minutes (video)
-
[ ] Implement using linked-list, with tail pointer:
- enqueue(value) - adds value at a position at the tail
- dequeue() - returns value and removes least recently added element (front)
- empty()
-
[ ] Implement using a fixed-sized array:
- enqueue(value) - adds item at end of available storage
- dequeue() - returns value and removes least recently added element
- empty()
- full()
-
[ ] Cost:
- a bad implementation using a linked list where you enqueue at the head and dequeue at the tail would be O(n) because youâd need the next to last element, causing a full traversal of each dequeue
- enqueue: O(1) (amortized, linked list and array [probing])
- dequeue: O(1) (linked list and array)
- empty: O(1) (linked list and array)
Hash table
-
[ ] Videos:
- [ ] Hashing with Chaining (video)
- [ ] Table Doubling, Karp-Rabin (video)
- [ ] Open Addressing, Cryptographic Hashing (video)
- [ ] PyCon 2010: The Mighty Dictionary (video)
- [ ] PyCon 2017: The Dictionary Even Mightier (video)
- [ ] (Advanced) Randomization: Universal & Perfect Hashing (video)
- [ ] (Advanced) Perfect hashing (video)
- [ ] [Review] Hash tables in 4 minutes (video)
-
[ ] Online Courses:
- [ ] Core Hash Tables (video)
- [ ] Data Structures (video)
- [ ] Phone Book Problem (video)
- [ ] distributed hash tables:
-
[ ] Implement with array using linear probing
- hash(k, m) - m is the size of the hash table
- add(key, value) - if the key already exists, update value
- exists(key)
- get(key)
- remove(key)
More Knowledge
Binary search
- [ ] Binary Search (video)
- [ ] Binary Search (video)
- [ ] detail
- [ ] blueprint
- [ ] [Review] Binary search in 4 minutes (video)
-
[ ] Implement:
- binary search (on a sorted array of integers)
- binary search using recursion
Bitwise operations
-
[ ]
Bits cheat sheet
- you should know many of the powers of 2 from (2^1 to 2^16 and 2^32)
-
[ ] Get a really good understanding of manipulating bits with: &,
|, ^, ~, >>, <<
- [ ] words
- [ ] Good intro: Bit Manipulation (video)
- [ ] C Programming Tutorial 2-10: Bitwise Operators (video)
- [ ] Bit Manipulation
- [ ] Bitwise Operation
- [ ] Bithacks
- [ ] The Bit Twiddler
- [ ] The Bit Twiddler Interactive
- [ ] Bit Hacks (video)
- [ ] Practice Operations
- [ ] 2s and 1s complement
- [ ] Count set bits
- [ ] Swap values:
- [ ] Absolute value:
Trees
Trees - Intro
- [ ] Intro to Trees (video)
- [ ] Tree Traversal (video)
-
[ ]
BFS(breadth-first search) and DFS(depth-first search) (video)
-
BFS notes:
- level order (BFS, using queue)
- time complexity: O(n)
- space complexity: best: O(1), worst: O(n/2)=O(n)
-
DFS notes:
- time complexity: O(n)
- space complexity: best: O(log n) - avg. height of tree worst: O(n)
- inorder (DFS: left, self, right)
- postorder (DFS: left, right, self)
- preorder (DFS: self, left, right)
-
BFS notes:
- [ ] [Review] Breadth-first search in 4 minutes (video)
- [ ] [Review] Depth-first search in 4 minutes (video)
- [ ] [Review] Tree Traversal (playlist) in 11 minutes (video)
Binary search trees: BSTs
- [ ] Binary Search Tree Review (video)
- [ ] Introduction (video)
- [ ] MIT (video)
-
[ ] C/C++:
- [ ] Binary search tree - Implementation in C/C++ (video)
- [ ] BST implementation - memory allocation in stack and heap (video)
- [ ] Find min and max element in a binary search tree (video)
- [ ] Find the height of a binary tree (video)
- [ ] Binary tree traversal - breadth-first and depth-first strategies (video)
- [ ] Binary tree: Level Order Traversal (video)
- [ ] Binary tree traversal: Preorder, Inorder, Postorder (video)
- [ ] Check if a binary tree is a binary search tree or not (video)
- [ ] Delete a node from Binary Search Tree (video)
- [ ] Inorder Successor in a binary search tree (video)
-
[ ] Implement:
- [ ] insert // insert value into tree
- [ ] get_node_count // get count of values stored
- [ ] print_values // prints the values in the tree, from min to max
- [ ] delete_tree
- [ ] is_in_tree // returns true if a given value exists in the tree
- [ ] get_height // returns the height in nodes (single nodeâs height is 1)
- [ ] get_min // returns the minimum value stored in the tree
- [ ] get_max // returns the maximum value stored in the tree
- [ ] is_binary_search_tree
- [ ] delete_value
- [ ] get_successor // returns the next-highest value in the tree after given value, -1 if none
Heap / Priority Queue / Binary Heap
- visualized as a tree, but is usually linear in storage (array, linked list)
- [ ] Heap
- [ ] Introduction (video)
- [ ] Binary Trees (video)
- [ ] Tree Height Remark (video)
- [ ] Basic Operations (video)
- [ ] Complete Binary Trees (video)
- [ ] Pseudocode (video)
- [ ] Heap Sort - jumps to start (video)
- [ ] Heap Sort (video)
- [ ] Building a heap (video)
- [ ] MIT 6.006 Introduction to Algorithms: Binary Heaps
- [ ] CS 61B Lecture 24: Priority Queues (video)
- [ ] Linear Time BuildHeap (max-heap)
- [ ] [Review] Heap (playlist) in 13 minutes (video)
-
[ ] Implement a max-heap:
- [ ] insert
- [ ] sift_up - needed for insert
- [ ] get_max - returns the max item, without removing it
- [ ] get_size() - return number of elements stored
- [ ] is_empty() - returns true if the heap contains no elements
- [ ] extract_max - returns the max item, removing it
- [ ] sift_down - needed for extract_max
- [ ] remove(x) - removes item at index x
- [ ] heapify - create a heap from an array of elements, needed for heap_sort
- [ ] heap_sort() - take an unsorted array and turn it into a sorted array in place using a max heap or min heap
Sorting
-
[ ] Notes
-
[ ] Implement sorts & know best case/worst case, average complexity of each
- no bubble sort - itâs terrible - O(n^2), except when n <= 16
-
[ ] Stability in sorting algorithms (âIs Quicksort stable?â)
-
[ ] Which algorithms can be used on linked lists? Which on arrays? Which of both?
- I wouldnât recommend sorting a linked list, but merge sort is doable.
- Merge Sort For Linked List
-
For heapsort, see the Heap data structure above. Heap sort is great, but not stable
-
[ ] Sedgewick - Mergesort (5 videos)
- [ ] 1. Mergesort
- [ ] 2. Bottom-up Mergesort
- [ ] 3. Sorting Complexity
- [ ] 4. Comparators
- [ ] 5. Stability
-
[ ] Sedgewick - Quicksort (4 videos)
- [ ] 1. Quicksort
- [ ] 2. Selection
- [ ] 3. Duplicate Keys
- [ ] 4. System Sorts
-
- [ ] UC Berkeley:
- [ ] Bubble Sort (video)
- [ ] Analyzing Bubble Sort (video)
- [ ] Insertion Sort, Merge Sort (video)
- [ ] Insertion Sort (video)
- [ ] Merge Sort (video)
- [ ] Quicksort (video)
- [ ] Selection Sort (video)
- [ ] Merge sort code:
- [ ] Quick sort code:
- [ ] [Review] Sorting (playlist) in 18 minutes
-
[ ] Implement:
- [ ] Mergesort: O(n log n) average and worst case
- [ ] Quicksort O(n log n) average case
- Selection sort and insertion sort are both O(n^2) average and worst-case
- For heapsort, see Heap data structure above
- [ ] Not required, but I recommended them:
As a summary, here is a visual representation of 15 sorting algorithms. If you need more detail on this subject, see the âSortingâ section in Additional Detail on Some Subjects
Graphs
Graphs can be used to represent many problems in computer science, so this section is long, like trees and sorting.
Notes:
-
There are 4 basic ways to represent a graph in memory:
- objects and pointers
- adjacency matrix
- adjacency list
- adjacency map
- Familiarize yourself with each representation and its pros & cons
- BFS and DFS - know their computational complexity, their trade-offs, and how to implement them in real code
- When asked a question, look for a graph-based solution first, then move on if none
Courses:
- [ ] MIT(videos):
-
[ ] Skiena Lectures - great intro:
- [ ] CSE373 2020 - Lecture 10 - Graph Data Structures (video)
- [ ] CSE373 2020 - Lecture 11 - Graph Traversal (video)
- [ ] CSE373 2020 - Lecture 12 - Depth First Search (video)
- [ ] CSE373 2020 - Lecture 13 - Minimum Spanning Trees (video)
- [ ] CSE373 2020 - Lecture 14 - Minimum Spanning Trees (conât) (video)
- [ ] CSE373 2020 - Lecture 15 - Graph Algorithms (conât 2) (video)
-
[ ] Graphs (review and more):
- [ ] 6.006 Single-Source Shortest Paths Problem (video)
- [ ] 6.006 Dijkstra (video)
- [ ] 6.006 Bellman-Ford (video)
- [ ] 6.006 Speeding Up Dijkstra (video)
- [ ] Aduni: Graph Algorithms I - Topological Sorting, Minimum Spanning Trees, Primâs Algorithm - Lecture 6 (video)
- [ ] Aduni: Graph Algorithms II - DFS, BFS, Kruskalâs Algorithm, Union Find Data Structure - Lecture 7 (video)
- [ ] Aduni: Graph Algorithms III: Shortest Path - Lecture 8 (video)
- [ ] Aduni: Graph Alg. IV: Intro to geometric algorithms - Lecture 9 (video)
- [ ] CS 61B 2014: Weighted graphs (video)
- [ ] Greedy Algorithms: Minimum Spanning Tree (video)
- [ ] Strongly Connected Components Kosarajuâs Algorithm Graph Algorithm (video)
- [ ] [Review] Shortest Path Algorithms (playlist) in 16 minutes (video)
- [ ] [Review] Minimum Spanning Trees (playlist) in 4 minutes (video)
- Full Coursera Course:
-
Iâll implement:
- [ ] DFS with adjacency list (recursive)
- [ ] DFS with adjacency list (iterative with stack)
- [ ] DFS with adjacency matrix (recursive)
- [ ] DFS with adjacency matrix (iterative with stack)
- [ ] BFS with adjacency list
- [ ] BFS with adjacency matrix
- [ ] single-source shortest path (Dijkstra)
- [ ] minimum spanning tree
-
[ ] DFS-based algorithms (see Aduni videos above):
- [ ] check for a cycle (needed for topological sort, since weâll check for the cycle before starting)
- [ ] topological sort
- [ ] count connected components in a graph
- [ ] list strongly connected components
- [ ] check for bipartite graph
Even More Knowledge
Recursion
-
[ ] Stanford lectures on recursion & backtracking:
-
When it is appropriate to use it?
-
How is tail recursion better than not?
Dynamic Programming
- You probably wonât see any dynamic programming problems in your interview, but itâs worth being able to recognize a problem as being a candidate for dynamic programming.
- This subject can be pretty difficult, as each DP soluble problem must be defined as a recursion relation, and coming up with it can be tricky.
- I suggest looking at many examples of DP problems until you have a solid understanding of the pattern involved.
-
[ ] Videos:
- [ ] Skiena: CSE373 2020 - Lecture 19 - Introduction to Dynamic Programming (video)
- [ ] Skiena: CSE373 2020 - Lecture 20 - Edit Distance (video)
- [ ] Skiena: CSE373 2020 - Lecture 20 - Edit Distance (continued) (video)
- [ ] Skiena: CSE373 2020 - Lecture 21 - Dynamic Programming (video)
- [ ] Skiena: CSE373 2020 - Lecture 22 - Dynamic Programming and Review (video)
- [ ] Simonson: Dynamic Programming 0 (starts at 59:18) (video)
- [ ] Simonson: Dynamic Programming I - Lecture 11 (video)
- [ ] Simonson: Dynamic programming II - Lecture 12 (video)
- [ ] List of individual DP problems (each is short): Dynamic Programming (video)
- [ ] Yale Lecture notes:
-
[ ] Coursera:
- [ ] The RNA secondary structure problem (video)
- [ ] A dynamic programming algorithm (video)
- [ ] Illustrating the DP algorithm (video)
- [ ] Running time of the DP algorithm (video)
- [ ] DP vs. recursive implementation (video)
- [ ] Global pairwise sequence alignment (video)
- [ ] Local pairwise sequence alignment (video)
Design patterns
- [ ] Quick UML review (video)
-
[ ] Learn these patterns:
- [ ] strategy
- [ ] singleton
- [ ] adapter
- [ ] prototype
- [ ] decorator
- [ ] visitor
- [ ] factory, abstract factory
- [ ] facade
- [ ] observer
- [ ] proxy
- [ ] delegate
- [ ] command
- [ ] state
- [ ] memento
- [ ] iterator
- [ ] composite
- [ ] flyweight
- [ ] Series of videos (27 videos)
-
[ ]
Book: Head First Design Patterns
- I know the canonical book is âDesign Patterns: Elements of Reusable Object-Oriented Softwareâ, but Head First is great for beginners to OO.
- Handy reference: 101 Design Patterns & Tips for Developers
Combinatorics (n choose k) & Probability
- [ ] Math Skills: How to find Factorial, Permutation, and Combination (Choose) (video)
- [ ] Make School: Probability (video)
- [ ] Make School: More Probability and Markov Chains (video)
-
[ ] Khan Academy:
- Course layout:
- Just the videos - 41 (each are simple and each are short):
NP, NP-Complete and Approximation Algorithms
- Know about the most famous classes of NP-complete problems, such as the traveling salesman and the knapsack problem, and be able to recognize them when an interviewer asks you them in disguise.
- Know what NP-complete means.
- [ ] Computational Complexity (video)
- [ ] Simonson:
- [ ] Skiena:
- [ ] Complexity: P, NP, NP-completeness, Reductions (video)
- [ ] Complexity: Approximation Algorithms (video)
- [ ] Complexity: Fixed-Parameter Algorithms (video)
- Peter Norvig discusses near-optimal solutions to the traveling salesman problem:
- Pages 1048 - 1140 in CLRS if you have it.
How computers process a program
- [ ] How CPU executes a program (video)
- [ ] How computers calculate - ALU (video)
- [ ] Registers and RAM (video)
- [ ] The Central Processing Unit (CPU) (video)
- [ ] Instructions and Programs (video)
Caches
- [ ] LRU cache:
- [ ] CPU cache:
Processes and Threads
-
[ ] Computer Science 162 - Operating Systems (25 videos):
- for processes and threads see videos 1-11
- Operating Systems and System Programming (video)
- What Is The Difference Between A Process And A Thread?
-
Covers:
-
Processes, Threads, Concurrency issues
- Difference between processes and threads
- Processes
- Threads
- Locks
- Mutexes
- Semaphores
- Monitors
- How do they work?
- Deadlock
- Livelock
- CPU activity, interrupts, context switching
- Modern concurrency constructs with multicore processors
- Paging, segmentation, and virtual memory (video)
- Interrupts (video)
- Process resource needs (memory: code, static storage, stack, heap, and also file descriptors, i/o)
- Thread resource needs (shares above (minus stack) with other threads in the same process but each has its own PC, stack counter, registers, and stack)
- Forking is really copy on write (read-only) until the new process writes to memory, then it does a full copy.
- Context switching
-
Processes, Threads, Concurrency issues
- [ ] threads in C++ (series - 10 videos)
- [ ] CS 377 Spring â14: Operating Systems from University of Massachusetts
- [ ] concurrency in Python (videos):
Testing
-
To cover:
- how unit testing works
- what are mock objects
- what is integration testing
- what is dependency injection
- [ ] Agile Software Testing with James Bach (video)
- [ ] Open Lecture by James Bach on Software Testing (video)
- [ ] Steve Freeman - Test-Driven Development (thatâs not what we meant) (video)
-
[ ] Dependency injection:
- [ ] video
- [ ] Tao Of Testing
- [ ] How to write tests
String searching & manipulations
- [ ] Sedgewick - Suffix Arrays (video)
- [ ] Sedgewick - Substring Search (videos)
- [ ] Search pattern in a text (video)
If you need more detail on this subject, see the âString Matchingâ section in Additional Detail on Some Subjects.
Tries
- Note there are different kinds of tries. Some have prefixes, some donât, and some use strings instead of bits to track the path
- I read through the code, but will not implement
- [ ] Sedgewick - Tries (3 videos)
- [ ] Notes on Data Structures and Programming Techniques
- [ ] Short course videos:
- [ ] The Trie: A Neglected Data Structure
- [ ] TopCoder - Using Tries
- [ ] Stanford Lecture (real-world use case) (video)
- [ ] MIT, Advanced Data Structures, Strings (can get pretty obscure about halfway through) (video)
Floating Point Numbers
- [ ] simple 8-bit: Representation of Floating Point Numbers - 1 (video - there is an error in calculations - see video description)
Unicode
- [ ] The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets
- [ ] What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text
Endianness
- [ ] Big And Little Endian
- [ ] Big Endian Vs Little Endian (video)
-
[ ]
Big And Little Endian Inside/Out (video)
- Very technical talk for kernel devs. Donât worry if most is over your head.
- The first half is enough.
Networking
- **If you have networking experience or want to be a reliability engineer or operations engineer, expect questions **
- Otherwise, this is just good to know
- [ ] Khan Academy
- [ ] UDP and TCP: Comparison of Transport Protocols (video)
- [ ] TCP/IP and the OSI Model Explained! (video)
- [ ] Packet Transmission across the Internet. Networking & TCP/IP tutorial. (video)
- [ ] HTTP (video)
- [ ] SSL and HTTPS (video)
- [ ] SSL/TLS (video)
- [ ] HTTP 2.0 (video)
- [ ] Video Series (21 videos) (video)
- [ ] Subnetting Demystified - Part 5 CIDR Notation (video)
- [ ] Sockets:
Final Review
This section will have shorter videos that you can watch pretty quickly to review most of the important concepts.
It's nice if you want a refresher often.
- [ ] Series of 2-3 minutes short subject videos (23 videos)
- [ ] Series of 2-5 minutes short subject videos - Michael Sambol (48 videos):
- [ ] Sedgewick Videos - Algorithms I
- [ ] Sedgewick Videos - Algorithms II
Everything below this point is optional. It is NOT needed for an entry-level interview. However, by studying these, youâll get greater exposure to more CS concepts and will be better prepared for any software engineering job. Youâll be a much more well-rounded software engineer.
Additional Books
These are here so you can dive into a topic you find interesting.
-
The Unix Programming Environment
- An oldie but a goodie
-
The Linux Command Line: A Complete Introduction
- A modern option
- TCP/IP Illustrated Series
-
Head First Design Patterns
- A gentle introduction to design patterns
-
Design Patterns: Elements of Reusable Object-Oriented Software
- AKA the âGang Of Fourâ book or GOF
- The canonical design patterns book
-
Algorithm Design Manual
(Skiena)
- As a review and problem-recognition
- The algorithm catalog portion is well beyond the scope of difficulty youâll get in an interview
-
This book has 2 parts:
-
Class textbook on data structures and algorithms
-
Pros:
- Is a good review as any algorithms textbook would be
- Nice stories from his experiences solving problems in industry and academia
- Code examples in C
-
Cons:
- Can be as dense or impenetrable as CLRS, and in some cases, CLRS may be a better alternative for some subjects
- Chapters 7, 8, and 9 can be painful to try to follow, as some items are not explained well or require more brain than I have
- Donât get me wrong: I like Skiena, his teaching style, and mannerisms, but I may not be Stony Brook material
-
Pros:
-
Algorithm catalog:
- This is the real reason you buy this book.
- This book is better as an algorithm reference, and not something you read cover to cover.
-
Class textbook on data structures and algorithms
- Can rent it on Kindle
- Answers:
- Errata
- Algorithm (Jeff Erickson)
-
Write Great Code: Volume 1: Understanding the Machine
- The book was published in 2004, and is somewhat outdated, but itâs a terrific resource for understanding a computer in brief
- The author invented HLA, so take mentions and examples in HLA with a grain of salt. Not widely used, but decent examples of what assembly looks like
-
These chapters are worth the read to give you a nice foundation:
- Chapter 2 - Numeric Representation
- Chapter 3 - Binary Arithmetic and Bit Operations
- Chapter 4 - Floating-Point Representation
- Chapter 5 - Character Representation
- Chapter 6 - Memory Organization and Access
- Chapter 7 - Composite Data Types and Memory Objects
- Chapter 9 - CPU Architecture
- Chapter 10 - Instruction Set Architecture
- Chapter 11 - Memory Architecture and Organization
-
Introduction to Algorithms
- Important: Reading this book will only have limited value. This book is a great review of algorithms and data structures, but wonât teach you how to write good code. You have to be able to code a decent solution efficiently
- AKA CLR, sometimes CLRS, because Stein was late to the game
-
Computer Architecture, Sixth Edition: A Quantitative Approach
- For a richer, more up-to-date (2017), but longer treatment
System Design, Scalability, Data Handling
You can expect system design questions if you have 4+ years of experience.
- Scalability and System Design are very large topics with many topics and resources, since there is a lot to consider when designing a software/hardware system that can scale. Expect to spend quite a bit of time on this
-
Considerations:
-
Scalability
- Distill large data sets to single values
- Transform one data set to another
- Handling obscenely large amounts of data
-
System design
- features sets
- interfaces
- class hierarchies
- designing a system under certain constraints
- simplicity and robustness
- tradeoffs
- performance analysis and optimization
-
Scalability
- [ ] START HERE: The System Design Primer
- [ ] System Design from HiredInTech
- [ ] How Do I Prepare To Answer Design Questions In A Technical Interview?
- [ ] 8 steps guide to ace your system design interview
- [ ] Database Normalization - 1NF, 2NF, 3NF and 4NF (video)
- [ ] System Design Interview - There are a lot of resources in this one. Look through the articles and examples. I put some of them below
- [ ] How to ace a systems design interview
- [ ] Numbers Everyone Should Know
- [ ] How long does it take to make a context switch?
- [ ] Transactions Across Datacenters (video)
- [ ] A plain English introduction to CAP Theorem
- [ ] MIT 6.824: Distributed Systems, Spring 2020 (20 videos)
-
[ ] Consensus Algorithms:
- [ ] Paxos - Paxos Agreement - Computerphile (video)
-
[ ] Raft -
An Introduction to the Raft Distributed Consensus Algorithm
(video)
- [ ] Easy-to-read paper
- [ ] Infographic
- [ ] Consistent Hashing
- [ ] NoSQL Patterns
-
[ ] Scalability:
- You donât need all of these. Just pick a few that interest you.
- [ ] Great overview (video)
- [ ] Short series:
- [ ] Scalable Web Architecture and Distributed Systems
- [ ] Fallacies of Distributed Computing Explained
- [ ] Jeff Dean - Building Software Systems At Google and Lessons Learned (video)
- [ ] Introduction to Architecting Systems for Scale
- [ ] Scaling mobile games to a global audience using App Engine and Cloud Datastore (video)
- [ ] How Google Does Planet-Scale Engineering for Planet-Scale Infra (video)
- [ ] The Importance of Algorithms
- [ ] Sharding
- [ ] Engineering for the Long Game - Astrid Atkinson Keynote(video)
- [ ] 7 Years Of YouTube Scalability Lessons In 30 Minutes
- [ ] How PayPal Scaled To Billions Of Transactions Daily Using Just 8VMs
- [ ] How to Remove Duplicates in Large Datasets
- [ ] A look inside Etsyâs scale and engineering culture with Jon Cowie (video)
- [ ] What Led Amazon to its Own Microservices Architecture
- [ ] To Compress Or Not To Compress, That Was Uberâs Question
- [ ] When Should Approximate Query Processing Be Used?
- [ ] Googleâs Transition From Single Datacenter To Failover, To A Native Multihomed Architecture
- [ ] The Image Optimization Technology That Serves Millions Of Requests Per Day
- [ ] A Patreon Architecture Short
- [ ] Tinder: How Does One Of The Largest Recommendation Engines Decide Who Youâll See Next?
- [ ] Design Of A Modern Cache
- [ ] Live Video Streaming At Facebook Scale
- [ ] A Beginnerâs Guide To Scaling To 11 Million+ Users On Amazonâs AWS
- [ ] A 360 Degree View Of The Entire Netflix Stack
- [ ] Latency Is Everywhere And It Costs You Sales - How To Crush It
- [ ] What Powers Instagram: Hundreds of Instances, Dozens of Technologies
- [ ] Salesforce Architecture - How They Handle 1.3 Billion Transactions A Day
- [ ] ESPNâs Architecture At Scale - Operating At 100,000 Duh Nuh Nuhs Per Second
- [ ] See âMessaging, Serialization, and Queueing Systemsâ way below for info on some of the technologies that can glue services together
- [ ] Twitter:
- For even more, see the âMining Massive Datasetsâ video series in the Video Series section
-
[ ] Practicing the system design process: Here are some ideas to try
working through on paper, each with some documentation on how it was
handled in the real world:
- review: The System Design Primer
- System Design from HiredInTech
- cheat sheet
-
flow:
-
Understand the problem and scope:
- Define the use cases, with the interviewerâs help
- Suggest additional features
- Remove items that the interviewer deems out of scope
- Assume high availability is required, add as a use case
-
Think about constraints:
- Ask how many requests per month
- Ask how many requests per second (they may volunteer it or make you do the math)
- Estimate reads vs. writes percentage
- Keep the 80/20 rule in mind when estimating
- How much data is written per second
- Total storage required over 5 years
- How much data read per second
-
Abstract design:
- Layers (service, data, caching)
- Infrastructure: load balancing, messaging
- Rough overview of any key algorithm that drives the service
- Consider bottlenecks and determine solutions
-
Understand the problem and scope:
- Exercises:
Additional Learning
I added them to help you become a well-rounded software engineer and to be aware of certain
technologies and algorithms, so you'll have a bigger toolbox.
Compilers
- [How a Compiler Works in ~1 minute (video)](https://www.youtube.com/watch?v=IhC7sdYe-Jg)
- [Harvard CS50 - Compilers (video)](https://www.youtube.com/watch?v=CSZLNYF4Klo)
- [C++ (video)](https://www.youtube.com/watch?v=twodd1KFfGk)
- [Understanding Compiler Optimization (C++) (video)](https://www.youtube.com/watch?v=FnGCDLhaxKU)
Emacs and vi(m)
- Familiarize yourself with a UNIX-based code editor
- vi(m):
- emacs:
- The Absolute Beginnerâs Guide to Emacs (video by David Wilson)
- The Absolute Beginnerâs Guide to Emacs (notes by David Wilson)
Unix/Linux command line tools
- I filled in the list below from good tools.
- bash
- cat
- grep
- sed
- awk
- curl or wget
- sort
- tr
- uniq
- strace
- tcpdump
- Essential Linux Commands Tutorial
DevOps
Information theory (videos)
- Khan Academy
- More about Markov processes:
- See more in the MIT 6.050J Information and Entropy series below
Parity & Hamming Code (videos)
- Intro
- Parity
- Hamming Code:
- Error Checking
Entropy
- Also see the videos below
- Make sure to watch information theory videos first
- Information Theory, Claude Shannon, Entropy, Redundancy, Data Compression & Bits (video)
Cryptography
- Also see the videos below
- Make sure to watch information theory videos first
- Khan Academy Series
- Cryptography: Hash Functions
- Cryptography: Encryption
Compression
- Make sure to watch information theory videos first
- Computerphile (videos):
- Compressor Head videos
- (optional) Google Developers Live: GZIP is not enough!
Computer Security
Garbage collection
- GC in Python (video)
- Deep Dive Java: Garbage Collection is Good!
- Deep Dive Python: Garbage Collection in CPython (video)
Parallel Programming
Messaging, Serialization, and Queueing Systems
- Thrift
- Protocol Buffers
- gRPC
- Redis
- Amazon SQS (queue)
- Amazon SNS (pub-sub)
- RabbitMQ
- Celery
- ZeroMQ
- ActiveMQ
- Kafka
- MessagePack
- Avro
A*
Fast Fourier Transform
- An Interactive Guide To The Fourier Transform
- What is a Fourier transform? What is it used for?
- What is the Fourier Transform? (video)
- Divide & Conquer: FFT (video)
- Understanding The FFT
Bloom Filter
- Given a Bloom filter with m bits and k hashing functions, both insertion and membership testing are O(k)
- Bloom Filters (video)
- Bloom Filters | Mining of Massive Datasets | Stanford University (video)
- Tutorial
- How To Write A Bloom Filter App
HyperLogLog
Locality-Sensitive Hashing
- Used to determine the similarity of documents
- The opposite of MD5 or SHA which are used to determine if 2 documents/strings are exactly the same
- Simhashing (hopefully) made simple
van Emde Boas Trees
Augmented Data Structures
Balanced search trees
-
Know at least one type of balanced binary tree (and know how itâs implemented):
-
âAmong balanced search trees, AVL and 2/3 trees are now passĂ© and red-black trees seem to be more popular. A particularly interesting self-organizing data structure is the splay tree, which uses rotations to move any accessed key to the root.â - Skiena
-
Of these, I chose to implement a splay tree. From what Iâve read, you wonât implement a balanced search tree in your interview. But I wanted exposure to coding one up and letâs face it, splay trees are the beeâs knees. I did read a lot of red-black tree code
- Splay tree: insert, search, delete functions If you end up implementing a red/black tree try just these:
- Search and insertion functions, skipping delete
-
I want to learn more about B-Tree since itâs used so widely with very large data sets
-
AVL trees
- In practice: From what I can tell, these arenât used much in practice, but I could see where they would be: The AVL tree is another structure supporting O(log n) search, insertion, and removal. It is more rigidly balanced than redâblack trees, leading to slower insertion and removal but faster retrieval. This makes it attractive for data structures that may be built once and loaded without reconstruction, such as language dictionaries (or program dictionaries, such as the opcodes of an assembler or interpreter)
- MIT AVL Trees / AVL Sort (video)
- AVL Trees (video)
- AVL Tree Implementation (video)
- Split And Merge
- [Review] AVL Trees (playlist) in 19 minutes (video)
-
Splay trees
- In practice: Splay trees are typically used in the implementation of caches, memory allocators, routers, garbage collectors, data compression, ropes (replacement of string used for long text strings), in Windows NT (in the virtual memory, networking and file system code) etc
- CS 61B: Splay Trees (video)
-
MIT Lecture: Splay Trees:
- Gets very mathy, but watch the last 10 minutes for sure.
- Video
-
Red/black trees
- These are a translation of a 2-3 tree (see below).
- In practice: Redâblack trees offer worst-case guarantees for insertion time, deletion time, and search time. Not only does this make them valuable in time-sensitive applications such as real-time applications, but it makes them valuable building blocks in other data structures that provide worst-case guarantees; for example, many data structures used in computational geometry can be based on red-black trees, and the Completely Fair Scheduler used in current Linux kernels uses redâblack trees. In version 8 of Java, the Collection HashMap has been modified such that instead of using a LinkedList to store identical elements with poor hashcodes, a Red-Black tree is used
- Aduni - Algorithms - Lecture 4 (link jumps to the starting point) (video)
- Aduni - Algorithms - Lecture 5 (video)
- Red-Black Tree
- An Introduction To Binary Search And Red Black Tree
- [Review] Red-Black Trees (playlist) in 30 minutes (video)
-
2-3 search trees
- In practice: 2-3 trees have faster inserts at the expense of slower searches (since height is more compared to AVL trees).
- You would use 2-3 trees very rarely because its implementation involves different types of nodes. Instead, people use Red-Black trees.
- 23-Tree Intuition and Definition (video)
- Binary View of 23-Tree
- 2-3 Trees (student recitation) (video)
-
2-3-4 Trees (aka 2-4 trees)
- In practice: For every 2-4 trees, there are corresponding redâblack trees with data elements in the same order. The insertion and deletion operations on 2-4 trees are also equivalent to color-flipping and rotations in redâblack trees. This makes 2-4 trees an important tool for understanding the logic behind red-black trees, and this is why many introductory algorithm texts introduce 2-4 trees just before redâblack trees, even though 2-4 trees are not often used in practice.
- CS 61B Lecture 26: Balanced Search Trees (video)
- Bottom Up 234-Trees (video)
- Top Down 234-Trees (video)
-
N-ary (K-ary, M-ary) trees
- note: the N or K is the branching factor (max branches)
- binary trees are a 2-ary tree, with branching factor = 2
- 2-3 trees are 3-ary
- K-Ary Tree
-
B-Trees
- Fun fact: itâs a mystery, but the B could stand for Boeing, Balanced, or Bayer (co-inventor).
- In Practice: B-trees are widely used in databases. Most modern filesystems use B-trees (or Variants). In addition to its use in databases, the B-tree is also used in filesystems to allow quick random access to an arbitrary block in a particular file. The basic problem is turning the file block address into a disk block (or perhaps to a cylinder head sector) address
- B-Tree
- B-Tree Datastructure
- Introduction to B-Trees (video)
- B-Tree Definition and Insertion (video)
- B-Tree Deletion (video)
-
MIT 6.851 - Memory Hierarchy Models (video)
- covers cache-oblivious B-Trees, very interesting data structures
- the first 37 minutes are very technical, and may be skipped (B is block size, cache line size)
- [Review] B-Trees (playlist) in 26 minutes (video)
k-D Trees
- Great for finding a number of points in a rectangle or higher-dimensional object
- A good fit for k-nearest neighbors
- kNN K-d tree algorithm (video)
Skip lists
- âThese are somewhat of a cult data structureâ - Skiena
- Randomization: Skip Lists (video)
- For animations and a little more detail
Network Flows
- Ford-Fulkerson in 5 minutes â Step by step example (video)
- Ford-Fulkerson Algorithm (video)
- Network Flows (video)
Disjoint Sets & Union Find
Math for Fast Processing
- Integer Arithmetic, Karatsuba Multiplication (video)
- The Chinese Remainder Theorem (used in cryptography) (video)
Treap
- Combination of a binary search tree and a heap
- Treap
- Data Structures: Treaps explained (video)
- Applications in set operations
Linear Programming (videos)
- Linear Programming
- Finding minimum cost
- Finding maximum value
- Solve Linear Equations with Python - Simplex Algorithm
Geometry, Convex hull (videos)
- Graph Alg. IV: Intro to geometric algorithms - Lecture 9
- Geometric Algorithms: Graham & Jarvis - Lecture 10
- Divide & Conquer: Convex Hull, Median Finding
Discrete math
- Computer Science 70, 001 - Spring 2015 - Discrete Mathematics and Probability Theory
- Discrete Mathematics by Shai Simonson (19 videos)
- Discrete Mathematics By IIT Ropar NPTEL
Additional Detail on Some Subjects
I added these to reinforce some ideas already presented above, but didnât want to include them above because itâs just too much. Itâs easy to overdo it on a subject. You want to get hired in this century, right?
-
SOLID
- [ ] Bob Martin SOLID Principles of Object Oriented and Agile Design (video)
- [ ] S - Single Responsibility Principle | Single responsibility to each Object
- [ ] O - Open/Closed Principle | On production level Objects are ready for extension but not for modification
- [ ] L - Liskov Substitution Principle | Base Class and Derived class follow âIS Aâ Principle
- [ ] I - Interface segregation principle | Clients should not be forced to implement interfaces they donât use
- [ ] D -Dependency Inversion principle | Reduce the dependency In composition of objects.
-
Union-Find
-
More Dynamic Programming (videos)
- 6.006: Dynamic Programming I: Fibonacci, Shortest Paths
- 6.006: Dynamic Programming II: Text Justification, Blackjack
- 6.006: DP III: Parenthesization, Edit Distance, Knapsack
- 6.006: DP IV: Guitar Fingering, Tetris, Super Mario Bros.
- 6.046: Dynamic Programming & Advanced DP
- 6.046: Dynamic Programming: All-Pairs Shortest Paths
- 6.046: Dynamic Programming (student recitation)
-
Advanced Graph Processing (videos)
-
MIT Probability (mathy, and go slowly, which is good for mathy things) (videos):
-
String Matching
- Rabin-Karp (videos):
- Knuth-Morris-Pratt (KMP):
- BoyerâMoore string search algorithm
-
Coursera: Algorithms on Strings
- starts off great, but by the time it gets past KMP it gets more complicated than it needs to be
- nice explanation of tries
- can be skipped
-
Sorting
- Stanford lectures on sorting:
- Shai Simonson:
- Steven Skiena lectures on sorting:
-
NAND To Tetris: Build a Modern Computer from First Principles
Video Series
Sit back and enjoy.
- List of individual Dynamic Programming problems (each is short)
- x86 Architecture, Assembly, Applications (11 videos)
- MIT 18.06 Linear Algebra, Spring 2005 (35 videos)
- Excellent - MIT Calculus Revisited: Single Variable Calculus
- Skiena lectures from Algorithm Design Manual - CSE373 2020 - Analysis of Algorithms (26 videos)
- UC Berkeley 61B (Spring 2014): Data Structures (25 videos)
- UC Berkeley 61B (Fall 2006): Data Structures (39 videos)
- UC Berkeley 61C: Machine Structures (26 videos)
- OOSE: Software Dev Using UML and Java (21 videos)
- MIT 6.004: Computation Structures (49 videos)
- Carnegie Mellon - Computer Architecture Lectures (39 videos)
- MIT 6.006: Intro to Algorithms (47 videos)
- MIT 6.033: Computer System Engineering (22 videos)
- MIT 6.034 Artificial Intelligence, Fall 2010 (30 videos)
- MIT 6.042J: Mathematics for Computer Science, Fall 2010 (25 videos)
- MIT 6.046: Design and Analysis of Algorithms (34 videos)
- MIT 6.824: Distributed Systems, Spring 2020 (20 videos)
- MIT 6.851: Advanced Data Structures (22 videos)
- MIT 6.854: Advanced Algorithms, Spring 2016 (24 videos)
- Harvard COMPSCI 224: Advanced Algorithms (25 videos)
- MIT 6.858 Computer Systems Security, Fall 2014
- Stanford: Programming Paradigms (27 videos)
- Introduction to Cryptography by Christof Paar
- Mining Massive Datasets - Stanford University (94 videos)
- Graph Theory by Sarada Herke (67 videos)
Computer Science Courses
Algorithms implementation
Papers
- Love classic papers?
- 1978: Communicating Sequential Processes
-
2003: The Google File System
- replaced by Colossus in 2012
-
2004: MapReduce: Simplified Data Processing on Large Clusters
- mostly replaced by Cloud Dataflow?
- 2006: Bigtable: A Distributed Storage System for Structured Data
- 2006: The Chubby Lock Service for Loosely-Coupled Distributed Systems
-
2007: Dynamo: Amazonâs Highly Available Key-value Store
- The Dynamo paper kicked off the NoSQL revolution
- 2007: What Every Programmer Should Know About Memory (very long, and the author encourages skipping of some sections)
- 2012: AddressSanitizer: A Fast Address Sanity Checker:
- 2013: Spanner: Googleâs Globally-Distributed Database:
- 2015: Continuous Pipelines at Google
- 2015: High-Availability at Massive Scale: Building Googleâs Data Infrastructure for Ads
- 2015: How Developers Search for Code: A Case Study
- More papers: 1,000 papers