Data Structures in Python: Arrays, Linked Lists, and Trees

Data Structures in Python: Arrays, Linked Lists, and Trees

My Interviewer Asked About Linked Lists. I Froze.

Pune, 2021. Second-round technical at a fintech startup. I’d been writing Python for two years by that point — Flask APIs, pandas pipelines, even a small Django app running in production. Felt pretty good about myself walking in.

“Implement a linked list with insert and delete,” the interviewer said. Calm voice. Simple whiteboard marker.

And I just… sat there.

Not because I didn’t know what a linked list was. I’d read about them. Watched YouTube videos. Could probably sketch the pointer diagram on a napkin. But write one from scratch, under pressure, with someone watching? My hands wouldn’t move. I mumbled something about Python lists being dynamic arrays, tried to buy time, and eventually produced code so broken it wouldn’t survive a single test case.

Didn’t get the job. Obviously.

That evening, on the train back to my PG in Kothrud, I made a decision. I was going to implement every common data structure myself. Not copy from GeeksforGeeks. Not watch someone else code them. Actually build them, debug them, understand why each line existed. Stacks, queues, linked lists, binary search trees — all of it.

What follows is everything I built during those weeks. We’ll go through four data structures in Python, each one implemented from scratch with real methods you can drop into your own projects. If you’re preparing for interviews or just want to understand what’s actually happening under the hood when you call .append() on a list, this is for you.

Why Build Data Structures by Hand When Python Has Built-Ins?

Fair question. Python gives you lists, dicts, and sets right out of the box. They’re fast, well-tested, backed by decades of CPython optimization. So why bother?

Three reasons, and I think the first one matters most:

  • Interview survival. Every FAANG screen, every startup coding round, every competitive programming contest assumes you can implement basic data structures without reaching for a library. You probably already know this.
  • Debugging intuition. When your code runs slow and you can’t figure out why, understanding that a Python list is secretly a dynamic array — and that inserting at index 0 means copying every single element — changes how you think about the problem. I’ve caught O(n) bottlenecks in production code because I understood the internals.
  • Knowing when NOT to use a list. Sounds weird, but once you’ve built a linked list and felt where it shines (constant-time insertions at known positions) versus where it’s awful (random access by index), you start picking the right tool instinctively.

Alright. Let’s build things.

Stack: Last In, First Out

Stacks are everywhere and you’ve been using them without realizing it. Every function call your Python interpreter makes? Pushed onto a call stack. That undo button in your text editor? Stack. Depth-first search in a graph? Also a stack, often implicit through recursion.

LIFO — last in, first out. Picture a stack of thalis at a wedding buffet. You grab from the top. You never reach into the middle. That constraint is the whole point.

Here’s a clean implementation. I’ve kept it simple on purpose — a Python list as the backing store, O(1) push and pop, proper error handling for edge cases.


class Stack:
    """A stack implementation using a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        """Add an item to the top of the stack. O(1)."""
        self._items.append(item)

    def pop(self):
        """Remove and return the top item. O(1). Raises IndexError if empty."""
        if self.is_empty():
            raise IndexError("Pop from empty stack")
        return self._items.pop()

    def peek(self):
        """Return the top item without removing it. O(1)."""
        if self.is_empty():
            raise IndexError("Peek at empty stack")
        return self._items[-1]

    def is_empty(self):
        """Check if the stack is empty. O(1)."""
        return len(self._items) == 0

    def size(self):
        """Return the number of items. O(1)."""
        return len(self._items)

    def __repr__(self):
        return f"Stack({self._items})"


# Usage
stack = Stack()
stack.push(10)
stack.push(20)
stack.push(30)
print(stack)          # Stack([10, 20, 30])
print(stack.peek())   # 30
print(stack.pop())    # 30
print(stack.pop())    # 20
print(stack.size())   # 1

# Practical example: Check balanced parentheses
def is_balanced(expression):
    """Check if parentheses, brackets, and braces are balanced."""
    stack = Stack()
    pairs = {")": "(", "]": "[", "}": "{"}

    for char in expression:
        if char in "([{":
            stack.push(char)
        elif char in ")]}":
            if stack.is_empty() or stack.pop() != pairs[char]:
                return False
    return stack.is_empty()

print(is_balanced("({[()]})"))   # True
print(is_balanced("({[()]}"))    # False
print(is_balanced(")("))         # False

Few things to notice. We’re using self._items with a leading underscore — a Python convention that says “hey, don’t touch this directly from outside the class.” It’s not enforced, but it signals intent. And is_empty() gets called before every pop() and peek(), because popping from an empty stack should raise an explicit error, not some cryptic IndexError from the underlying list.

Tip: That is_balanced() function at the end? It comes up in interviews constantly. I’ve been asked some version of it at three different companies. Worth memorizing the pattern even if you forget the exact code.

One thing people sometimes get wrong: they try to implement a stack using a linked list for “better performance.” For Python, it’s almost never worth it. list.append() and list.pop() are amortized O(1) already because CPython over-allocates the underlying array. A linked list stack would give you true O(1) without amortization, but the constant-factor overhead from node object creation eats that advantage alive. Stick with the list.

Queue: First In, First Out

Where stacks go depth-first, queues go breadth-first. FIFO — first in, first out. Standing in line at Saravana Bhavan for a dosa on Sunday morning? That’s a queue. (Unless someone cuts the line. No data structure handles that gracefully.)

Queues show up in BFS traversals, task schedulers, print spoolers, message brokers like RabbitMQ and Kafka — basically anywhere work needs to be processed in the order it arrived.

Now here’s where it gets interesting from an implementation standpoint. You could use a regular Python list. But you shouldn’t. Removing from the front of a list is O(n) because every remaining element has to shift left. For a queue that processes thousands of items, that’s brutal.

Instead, we’ll use collections.deque — a double-ended queue backed by a doubly linked list of fixed-size blocks. Both append() and popleft() run in O(1). It’s what the standard library uses internally, and it’s what we should use too.


from collections import deque

class Queue:
    """A queue implementation using collections.deque for O(1) operations."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        """Add an item to the back of the queue. O(1)."""
        self._items.append(item)

    def dequeue(self):
        """Remove and return the front item. O(1). Raises IndexError if empty."""
        if self.is_empty():
            raise IndexError("Dequeue from empty queue")
        return self._items.popleft()

    def front(self):
        """Return the front item without removing it. O(1)."""
        if self.is_empty():
            raise IndexError("Front of empty queue")
        return self._items[0]

    def is_empty(self):
        return len(self._items) == 0

    def size(self):
        return len(self._items)

    def __repr__(self):
        return f"Queue({list(self._items)})"


# Usage
q = Queue()
q.enqueue("Task A")
q.enqueue("Task B")
q.enqueue("Task C")
print(q)              # Queue(['Task A', 'Task B', 'Task C'])
print(q.dequeue())    # Task A
print(q.front())      # Task B
print(q.size())       # 2

Tip: If you ever need a thread-safe queue in Python (say, for a producer-consumer pipeline), skip this implementation and use queue.Queue from the standard library. It handles locking internally. Our version here is great for single-threaded work and learning, but it’ll break in concurrent scenarios.

A pattern I’ve noticed in my own projects: I reach for queues way less often than stacks. Maybe that’s just the kind of work I do. But when you need one — processing events in order, managing a job pipeline, running BFS on a graph — nothing else will do. Trying to fake FIFO behavior with a list or a stack just creates bugs.

Linked List: The Interview Favorite

And here we are. The one that tripped me up in that Pune interview room.

A linked list stores elements in nodes. Each node holds some data and a reference (pointer, if you’re coming from C) to the next node. No contiguous memory block like an array. No index-based access. Just a chain of objects pointing forward through the heap.

Why would anyone want that?

Insertions and deletions. If you’ve got a reference to a specific node, adding or removing an element next to it costs O(1). No shifting. No copying. Compare that to a Python list where inserting at position 0 means moving every other element one slot to the right — O(n) work for a single insert.

The tradeoff? You can’t jump to element 47 directly. Want the fifth element? Start at the head, follow four pointers. That’s O(n) access time, which is why linked lists are terrible for random lookups but excellent for scenarios involving lots of insertions and deletions at known positions.

Here’s a full singly linked list. Head pointer, tail pointer, and methods for appending, prepending, deleting, finding, reversing, and converting to a regular Python list.


class Node:
    """A single node in a linked list."""

    def __init__(self, data):
        self.data = data
        self.next = None

    def __repr__(self):
        return f"Node({self.data})"


class LinkedList:
    """A singly linked list with head and tail pointers."""

    def __init__(self):
        self.head = None
        self.tail = None
        self._size = 0

    def append(self, data):
        """Add a node to the end of the list. O(1)."""
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            self.tail = new_node
        else:
            self.tail.next = new_node
            self.tail = new_node
        self._size += 1

    def prepend(self, data):
        """Add a node to the beginning of the list. O(1)."""
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node
        if self.tail is None:
            self.tail = new_node
        self._size += 1

    def delete(self, data):
        """Delete the first node with the given data. O(n)."""
        if self.head is None:
            return False

        if self.head.data == data:
            self.head = self.head.next
            if self.head is None:
                self.tail = None
            self._size -= 1
            return True

        current = self.head
        while current.next:
            if current.next.data == data:
                if current.next == self.tail:
                    self.tail = current
                current.next = current.next.next
                self._size -= 1
                return True
            current = current.next
        return False

    def find(self, data):
        """Search for a node by value. Returns the node or None. O(n)."""
        current = self.head
        while current:
            if current.data == data:
                return current
            current = current.next
        return None

    def to_list(self):
        """Convert the linked list to a Python list. O(n)."""
        result = []
        current = self.head
        while current:
            result.append(current.data)
            current = current.next
        return result

    def reverse(self):
        """Reverse the linked list in place. O(n)."""
        self.tail = self.head
        prev = None
        current = self.head
        while current:
            next_node = current.next
            current.next = prev
            prev = current
            current = next_node
        self.head = prev

    def __len__(self):
        return self._size

    def __repr__(self):
        return " -> ".join(str(d) for d in self.to_list()) + " -> None"


# Usage
ll = LinkedList()
ll.append(10)
ll.append(20)
ll.append(30)
ll.prepend(5)
print(ll)             # 5 -> 10 -> 20 -> 30 -> None
print(len(ll))        # 4
print(ll.find(20))    # Node(20)

ll.delete(20)
print(ll)             # 5 -> 10 -> 30 -> None

ll.reverse()
print(ll)             # 30 -> 10 -> 5 -> None

Let’s walk through the tricky parts, because these are exactly the spots where I messed up during that interview.

Deletion Is Where Bugs Live

Deleting a node from a singly linked list means you need access to the node before it. You can’t look backward — there’s no prev pointer in a singly linked list. So the delete() method walks the chain checking current.next, not current. Once it finds the target, it skips over it by setting current.next = current.next.next.

Special case: deleting the head. There’s no node before the head, so we handle it separately. And if the head was also the tail (single-element list), we need to set self.tail = None too. Miss that edge case and your tail pointer dangles into garbage.

Tip: In interviews, always ask: “Can there be duplicate values? Should I delete the first occurrence or all of them?” Shows you’re thinking about edge cases before writing code. Interviewers love that.

Reversal: The Classic Interview Problem

Reversing a linked list in place shows up in probably 40% of linked list interview questions, based on what I’ve seen on LeetCode and heard from friends. The algorithm uses three pointers — prev, current, and next_node — and walks through the list flipping each pointer backward. After the loop, prev points to what was the last node, which becomes the new head.

I’ll be honest: I had to draw this out on paper about six times before the pointer arithmetic clicked. Don’t feel bad if it doesn’t make sense on the first read. Grab a pen, draw four nodes, trace the loop iteration by iteration. It’ll click.

Binary Search Tree: Sorted Storage with O(log n) Lookups

Trees are where data structures start feeling genuinely powerful. Not “oh that’s a nice trick” powerful. More like “I can search a million records and only touch twenty of them” powerful.

A binary search tree — BST for short — maintains a single rule at every node: everything in the left subtree is smaller, everything in the right subtree is larger. That ordering property means searching for a value is like playing a guessing game where you eliminate half the remaining options at every step. On a balanced tree with a million nodes, that’s roughly 20 comparisons to find anything. Wild.

Average-case complexity for insert, search, and delete: O(log n). Worst case, if you insert already-sorted data, the tree degenerates into a linked list and everything drops to O(n). We won’t fix that problem here — that’s what AVL trees and Red-Black trees are for — but it’s worth knowing the limitation.

Here’s a full BST with insert, search, delete (including the three-case logic), in-order and pre-order traversals, and height calculation.


class TreeNode:
    """A node in a binary search tree."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


class BinarySearchTree:
    """A binary search tree with insert, search, delete, and traversal."""

    def __init__(self):
        self.root = None
        self._size = 0

    def insert(self, value):
        """Insert a value into the BST. O(log n) average."""
        if self.root is None:
            self.root = TreeNode(value)
        else:
            self._insert_recursive(self.root, value)
        self._size += 1

    def _insert_recursive(self, node, value):
        if value < node.value:
            if node.left is None:
                node.left = TreeNode(value)
            else:
                self._insert_recursive(node.left, value)
        elif value > node.value:
            if node.right is None:
                node.right = TreeNode(value)
            else:
                self._insert_recursive(node.right, value)
        # Duplicate values are ignored

    def search(self, value):
        """Search for a value. Returns True if found. O(log n) average."""
        return self._search_recursive(self.root, value)

    def _search_recursive(self, node, value):
        if node is None:
            return False
        if value == node.value:
            return True
        elif value < node.value:
            return self._search_recursive(node.left, value)
        else:
            return self._search_recursive(node.right, value)

    def delete(self, value):
        """Delete a value from the BST. O(log n) average."""
        self.root, deleted = self._delete_recursive(self.root, value)
        if deleted:
            self._size -= 1
        return deleted

    def _delete_recursive(self, node, value):
        if node is None:
            return None, False

        deleted = False
        if value < node.value:
            node.left, deleted = self._delete_recursive(node.left, value)
        elif value > node.value:
            node.right, deleted = self._delete_recursive(node.right, value)
        else:
            deleted = True
            # Node with one child or no child
            if node.left is None:
                return node.right, True
            elif node.right is None:
                return node.left, True
            # Node with two children: find in-order successor
            successor = self._find_min(node.right)
            node.value = successor.value
            node.right, _ = self._delete_recursive(node.right, successor.value)
        return node, deleted

    def _find_min(self, node):
        current = node
        while current.left:
            current = current.left
        return current

    def inorder(self):
        """In-order traversal: returns sorted values. O(n)."""
        result = []
        self._inorder_recursive(self.root, result)
        return result

    def _inorder_recursive(self, node, result):
        if node:
            self._inorder_recursive(node.left, result)
            result.append(node.value)
            self._inorder_recursive(node.right, result)

    def preorder(self):
        """Pre-order traversal. O(n)."""
        result = []
        self._preorder_recursive(self.root, result)
        return result

    def _preorder_recursive(self, node, result):
        if node:
            result.append(node.value)
            self._preorder_recursive(node.left, result)
            self._preorder_recursive(node.right, result)

    def height(self):
        """Calculate the height of the tree. O(n)."""
        return self._height_recursive(self.root)

    def _height_recursive(self, node):
        if node is None:
            return -1
        left_h = self._height_recursive(node.left)
        right_h = self._height_recursive(node.right)
        return 1 + max(left_h, right_h)

    def __len__(self):
        return self._size

    def __contains__(self, value):
        return self.search(value)


# Usage
bst = BinarySearchTree()
for val in [50, 30, 70, 20, 40, 60, 80, 10]:
    bst.insert(val)

print("In-order:", bst.inorder())    # [10, 20, 30, 40, 50, 60, 70, 80]
print("Pre-order:", bst.preorder())  # [50, 30, 20, 10, 40, 70, 60, 80]
print("Height:", bst.height())       # 3
print("Size:", len(bst))             # 8
print("Contains 40:", 40 in bst)     # True
print("Contains 45:", 45 in bst)     # False

bst.delete(30)
print("After deleting 30:", bst.inorder())  # [10, 20, 40, 50, 60, 70, 80]

Okay, let’s break down the hard part: deletion. It’s got three cases and every CS textbook covers them, but I think most explanations make it sound more complicated than it actually is.

Case 1: Leaf Node (No Children)

Easy. Just remove it. Set the parent’s reference to None. Done.

Case 2: One Child

Almost as easy. Replace the node with its only child. If the node-to-delete has a left child but no right child, that left child takes its place. Works identically in reverse.

Case 3: Two Children (The Annoying One)

Can’t just remove the node — you’d orphan two subtrees. Instead, find the in-order successor: the smallest value in the right subtree. Copy that value into the current node, then delete the successor from the right subtree. Since the successor is the minimum of that subtree, it has at most one child (a right child), so deleting it falls into Case 1 or Case 2. Recursive elegance.

Tip: When interviewers ask you to delete from a BST, they’re mostly testing whether you know Case 3. Practice it until you can explain it without looking at code. Draw the tree, find the successor, do the swap, delete the duplicate. That’s it.

Traversals: Reading the Tree in Different Orders

In-order traversal (left, root, right) gives you sorted output. Always. If your in-order traversal doesn’t return sorted values, your BST has a bug. I use this as a sanity check after implementing insert() and delete().

Pre-order (root, left, right) captures the tree’s structure. If you serialize a BST using pre-order traversal, you can reconstruct the exact same tree from that sequence. Handy for saving/loading tree data.

Post-order exists too (left, right, root), and it’s useful for cleanup operations — deleting an entire tree, for example, where you want to free children before parents. We haven’t implemented it here, but the pattern is identical to the other traversals with the result.append() call moved after both recursive calls.

When to Pick What: A Cheat Sheet

After building all four, here’s the mental model I use when choosing between them:

  • Stack — need LIFO behavior. Undo systems, DFS, expression evaluation, backtracking algorithms. Reach for it when you’re processing things in reverse order.
  • Queue — need FIFO behavior. BFS, task scheduling, event processing, rate limiting. Reach for it when order-of-arrival matters.
  • Linked list — frequent insertions/deletions at known positions, and you don’t care about random access. LRU caches (paired with a hash map), music playlists, undo histories where each action points to the previous one.
  • Binary search tree — need sorted data with fast insert/search/delete. Autocomplete systems, in-memory indexes, range queries. If your data arrives in random order, a BST gives you O(log n) everything without sorting.

And when should you just use Python’s built-in list? Honestly, most of the time. For general-purpose storage, iteration, and index-based access, nothing beats it. Python lists are dynamic arrays with over-allocation, meaning append() is amortized O(1) and random access is O(1). Unless your specific use case involves one of the patterns above, a plain list is probably fine.

Complexity at a Glance

Data Structure Insert Delete Search Access by Index
Stack (list-backed) O(1) push O(1) pop O(n) O(1) top only
Queue (deque-backed) O(1) enqueue O(1) dequeue O(n) O(1) front only
Linked List O(1) at head/tail O(n) by value O(n) O(n)
Binary Search Tree O(log n) avg O(log n) avg O(log n) avg N/A
Python list (array) O(1) amortized append O(n) by value O(n) O(1)

Keep that table somewhere handy. I had a printout of something similar taped to my monitor for months while I was prepping for interviews back in late 2021. It seems basic, but under interview pressure your brain blanks on whether linked list access is O(1) or O(n), and having drilled the table helps.

Mistakes I Made (So You Don’t Have To)

Building these taught me a bunch of things that no textbook mentioned. Or maybe they did and I wasn’t paying attention.

  • Forgetting to update the tail pointer. In the linked list, every delete() and prepend() has to consider whether the tail needs updating. I spent two hours debugging a case where my tail still pointed to a deleted node. Embarrassing.
  • Using a Python list as a queue backing store. list.pop(0) is O(n). I benchmarked it once — 10,000 dequeues took 0.8 seconds with a list versus 0.003 seconds with a deque. That’s not a small difference.
  • Not handling the empty-structure case. Popping from an empty stack, dequeuing from an empty queue, deleting from an empty linked list. Each one needs an explicit check. I’d sometimes let the underlying Python data structure throw its own error, but the error messages were confusing (“pop from empty list” when the user thinks they’re using a stack).
  • BST insertion order matters hugely. Insert [10, 20, 30, 40, 50] into a BST and you get a straight line — basically a linked list. Every operation becomes O(n). Shuffling the input or using a self-balancing tree fixes this, but I didn’t realize how dramatic the difference was until I plotted it.

Where to Go From Here

You’ve got four working data structures. That’s a strong foundation, but there’s more ground to cover if you’re serious about interviews or systems work. A few directions I’d suggest, roughly in order of how useful I’ve found them:

  • Hash tables. Understand how Python’s dict works under the hood — open addressing, hash functions, collision resolution. Building one from scratch is a great exercise.
  • Balanced BSTs. AVL trees or Red-Black trees guarantee O(log n) even in the worst case. They’re significantly harder to implement than a plain BST (the rotation logic is gnarly), but worth studying conceptually even if you don’t code them from zero.
  • Heaps and priority queues. Python has heapq in the standard library. Build a min-heap yourself first, then switch to the built-in. Used everywhere in Dijkstra’s algorithm, scheduling, and top-K problems.
  • Graphs. Adjacency list representation, BFS, DFS, shortest path. Probably the single most important topic for interview prep after trees.

And practice on LeetCode or HackerRank. I know, I know — everyone says that. But genuinely, the difference between “I understand linked lists” and “I can implement a linked list reversal in 4 minutes under pressure” is just reps. It’s muscle memory.

Back to That Interview Room

Six months after that disaster in Pune, I interviewed again. Different company, similar round. “Implement a linked list,” the interviewer said. “Then reverse it.”

My hands moved before my brain caught up. Node class, three attributes. LinkedList class, head and tail pointers. append(), delete(), reverse() with the three-pointer trick. I even handled the edge case where deleting the tail required updating the tail pointer — the exact bug that had cost me hours months earlier.

Finished in under ten minutes. The interviewer nodded, asked a follow-up about time complexity, and moved on to system design.

Got the offer.

It wasn’t talent. Wasn’t some flash of insight. I’d just built the thing enough times that my fingers knew the code. There’s no shortcut past that. You read about data structures, you nod along, you think you get it. Then you sit in front of a blank editor and realize you don’t. Not yet. But you will, once you’ve written it, broken it, fixed it, and written it again.

Start with the stack — it’s the easiest. Move to the queue. Then the linked list. Then the BST. By the time you’ve debugged all four, you’ll be a different programmer than you were when you started. I promise.

Leave a Comment

Your email address will not be published. Required fields are marked with an asterisk.