Python. Easy. Asked at Google, Amazon, Meta.

What is the most Pythonic way to count word frequencies in a string, and what does Counter return for missing keys?

For Data Analyst, Data Scientist, Data Engineer

The short answer

collections.Counter is the standard tool: it accepts any iterable and returns a dict-like object with counts. For a missing key it returns 0 rather than raising KeyError, which makes downstream arithmetic safe without extra guards.

How to think about it

This is the classic frequency count, and the interesting part isn’t getting the numbers. It’s knowing the rest of the Counter API: the zero default for missing keys, top-k via most_common (which uses a heap), counter arithmetic, and the dict.get version for when you want to show what’s happening underneath.

Counter takes any iterable of hashable items and hands back a dict subclass that maps each item to its count. The one behaviour that sets it apart from a plain dict: looking up a missing key returns 0 instead of raising KeyError, and the lookup does not add the key. That keeps downstream arithmetic safe without guard clauses.
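A minimal sketch of that zero default, using a made-up word list rather than the text from the worked example. It compares the same lookup on a Counter, a plain dict, and a collections.defaultdict(int); the variable names are illustrative only.

```python
from collections import Counter, defaultdict

# Illustrative data, chosen only for this sketch
counts = Counter(["cat", "cat", "mat"])

# A missing key reads as 0 ...
missing = counts["dog"]

# ... and the lookup does not store the key.
stored_after_lookup = "dog" in counts

# A plain dict raises KeyError for the same lookup.
try:
    dict(counts)["dog"]
    plain_dict_raised = False
except KeyError:
    plain_dict_raised = True

# defaultdict(int) also returns 0, but it stores the key as a side effect.
dd = defaultdict(int, counts)
dd["dog"]
dd_stored = "dog" in dd

print(missing, stored_after_lookup, plain_dict_raised, dd_stored)
```

The difference from defaultdict matters when you later iterate or call len(): a Counter that has only been read from still contains just the words it actually saw.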

A worked example

from collections import Counter

text = "the cat sat on the mat the cat ate the rat"
words = text.split()

freq = Counter(words)
print("Full counter:", freq)
print()

print("Top 3:", freq.most_common(3))          # heap-based, O(n log k)
print("Count of 'dog':", freq["dog"])          # 0, not KeyError
print()

# Case-insensitive: normalise as you feed the Counter
text2 = "Python python PYTHON is great"
print("Case-insensitive:", Counter(w.lower() for w in text2.split()))
print()

# Counters support arithmetic — merge two with +
print("After merging:", (freq + Counter(["cat", "dog", "cat"])).most_common(4))
print()

# The manual equivalent, for when you're asked to do it without collections
manual = {}
for w in words:
    manual[w] = manual.get(w, 0) + 1           # the zero default, by hand
print("Manual top 3:", dict(sorted(manual.items(), key=lambda x: -x[1])[:3]))

Full counter: Counter({'the': 4, 'cat': 2, 'sat': 1, 'on': 1, 'mat': 1, 'ate': 1, 'rat': 1})

Top 3: [('the', 4), ('cat', 2), ('sat', 1)]
Count of 'dog': 0

Case-insensitive: Counter({'python': 3, 'is': 1, 'great': 1})

After merging: [('the', 4), ('cat', 4), ('sat', 1), ('on', 1)]

Manual top 3: {'the': 4, 'cat': 2, 'sat': 1}

Three things to lift out. most_common(3) returns the top entries already ranked, and because it uses a heap it’s cheaper than sorting the whole thing when k is small. freq["dog"] returned 0 instead of raising, which is what makes Counter safe to do arithmetic on. And freq + Counter(["cat", "dog", "cat"]) merged the two by adding counts key by key (cat went 2 → 4), turning a fiddly loop into a single operator.
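To make the heap point concrete, here is a sketch of the same top-k selection done by hand with heapq.nlargest. CPython's most_common(n) is written along these lines, but treat that as an implementation detail rather than a documented guarantee; the names top3_heap and top3_sorted are just for illustration.

```python
import heapq
from collections import Counter

words = "the cat sat on the mat the cat ate the rat".split()
freq = Counter(words)

# Top-k by hand: select the 3 (word, count) pairs with the largest counts.
top3_heap = heapq.nlargest(3, freq.items(), key=lambda item: item[1])

# Sorting every pair and slicing gives the same ranking at O(n log n) cost.
top3_sorted = sorted(freq.items(), key=lambda item: item[1], reverse=True)[:3]

print(top3_heap)
print(top3_heap == freq.most_common(3) == top3_sorted)
```

For a vocabulary this small the difference is invisible; the heap approach only pays off when there are many distinct words and k is small by comparison.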

Why Counter beats a manual loop

The hand-rolled dict.get(w, 0) + 1 version is the same O(n) and clarifies what’s happening underneath — worth showing if the interviewer asks “without collections?” But in real code there’s no reason to write it: Counter gives you most_common, the arithmetic operators (+, -, &, |), and the zero-default for free.
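The four operators are easier to remember with a small example. This sketch uses two made-up counters, a and b, which are not part of the worked example above.

```python
from collections import Counter

# Illustrative inputs for this sketch only
a = Counter(["cat", "cat", "dog"])           # cat: 2, dog: 1
b = Counter(["cat", "dog", "dog", "rat"])    # cat: 1, dog: 2, rat: 1

added = a + b          # add counts key by key
subtracted = a - b     # subtract, then drop zero and negative results
intersection = a & b   # keep the minimum count for each key
union = a | b          # keep the maximum count for each key

print("a + b:", dict(added))
print("a - b:", dict(subtracted))
print("a & b:", dict(intersection))
print("a | b:", dict(union))
```

One thing to watch: all four operators discard keys whose resulting count is zero or negative, so a - b never reports a negative count. If signed differences are needed, the in-place subtract() method keeps them.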
