Most of the time, Python’s built-in sorted() and list.sort() are all you need.
But if you ever try to sort a lot of data—millions to billions of values, big numeric logs, or giant SQL exports—you quickly run into a wall: RAM, speed, or both.
So I built Quill-Sort (quill-sort on PyPI).
quill-sort
Adaptive Ultra-Sort — the fastest general-purpose sorting library for Python.
```python
from quill import quill_sort, quill_sorted

quill_sort([3, 1, 4, 1, 5, 9])                 # → [1, 1, 3, 4, 5, 9]
quill_sort(records, key=lambda r: r['age'])    # sort objects
quill_sort(big_data, parallel=True)            # use all CPU cores
result = quill_sorted(iterable, reverse=True)  # mirrors built-in sorted()
```
Installation
```shell
pip install quill-sort        # core (no dependencies)
pip install quill-sort[fast]  # + numpy + psutil for max speed
pip install quill-sort[all]   # + numpy + pandas + psutil
```
How it works
Quill profiles your data at intake and routes to the optimal algorithm automatically:
| Data type | Strategy | Complexity |
|---|---|---|
| Dense integer range | `np.bincount` counting sort | O(n + k) |
| uint8 / uint16 | Radix sort (`kind='stable'`, 1-2 passes) | O(n) |
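For intuition, the dense-integer row boils down to a counting sort. Here's a minimal NumPy sketch of that idea (my own illustration of the technique, not Quill's actual code):

```python
import numpy as np

def counting_sort_dense(a):
    """Counting sort for non-negative ints in a dense range: O(n + k)."""
    counts = np.bincount(a)                           # counts[v] = occurrences of v
    return np.repeat(np.arange(counts.size), counts)  # emit each value count times

data = np.array([3, 1, 4, 1, 5, 9, 2, 6])
print(counting_sort_dense(data))  # [1 1 2 3 4 5 6 9]
```

Two vectorized passes over the data, no comparisons at all, which is why this path can beat comparison sorts so decisively when the value range is small.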
Quill is a drop-in Python sorting library that behaves like sorted() and list.sort(), but is specifically optimized for high-volume, numeric, and external (disk-backed) sorting. It’s designed to attack the weak spots of Python’s sort, not replace it for every tiny list.
Yes, I actually sorted 1,000,000,000 int32 values (4 GB) on a 28‑core machine in about 21 seconds using Quill’s external mode.
Important: Quill-Sort is still under active, major development. APIs, performance characteristics, and internal behavior may change, and bugs may occur. The GitHub repo also isn't fully set up yet. Please don't throw your only copy of critical production data at it without backups.
What is Quill-Sort?
Quill-Sort is a hyper-optimized sorting library for Python with a familiar API:
```python
from quill import quill_sort, quill_sorted

data = [1, 2, 3, 4, 5]
quill_sort(data)             # in-place, like list.sort()
result = quill_sorted(data)  # returns new list, like sorted()
```
It’s built for two worlds:
- Everyday data: millions of integers/floats/strings in memory.
- Stupidly large data: hundreds of millions or even 1 billion integers using external disk-backed sorting (`.qwrite` files) plus parallel workers.
You don’t have to think about algorithms or chunk sizes—Quill routes to the right strategy internally.
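To make that routing idea concrete, here's a toy dispatcher. The names (`route_sort`, `counting_sort`) and thresholds are my own illustration, not Quill's API or internals:

```python
def route_sort(data):
    """Toy dispatcher: pick a strategy from a cheap profile of the data.
    Illustration only -- Quill's real routing is more involved."""
    if not data:
        return list(data)
    sample = data[:1000]                        # profile a prefix, not the whole input
    if all(isinstance(x, int) for x in sample):
        lo, hi = min(data), max(data)
        if hi - lo < 10 * len(data):            # dense range -> counting sort pays off
            return counting_sort(data)
    return sorted(data)                         # otherwise fall back to timsort

def counting_sort(data):
    """O(n + k) counting sort for integers in a narrow range."""
    lo = min(data)
    counts = [0] * (max(data) - lo + 1)
    for x in data:
        counts[x - lo] += 1
    out = []
    for v, c in enumerate(counts):
        out.extend([v + lo] * c)
    return out

print(route_sort([5, 3, 5, 1, 2]))  # [1, 2, 3, 5, 5]
```

The point is that profiling is cheap relative to the sort itself, so a wrong-but-fast profile pass costs almost nothing and a right one can change the asymptotics.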
Key features (in plain English)
- Drop-in API
  - `quill_sort()` behaves like `list.sort()`.
  - `quill_sorted()` behaves like `sorted()`.
- Type-aware strategies
  - Specialized integer paths (radix, counting, NumPy-backed).
  - Float paths with proper `inf` handling.
  - Strings, bytes, and custom keys supported.
- Fully stable: equal keys preserve order, matching Python's sort semantics.
- Early-exit detection: detects already sorted / reverse-sorted / all-same data and bails out fast instead of doing full work.
- Parallel sorting: can fan out across cores for large workloads.
- External mode (`.qwrite`): when data is too big for RAM, Quill can spill to disk, partition into pages, and run a multi-phase external sort with automatic temp-file management.
- Plugins for NumPy and Pandas: drop in a NumPy array, `Series`, or `DataFrame` and let Quill handle it.
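The early-exit idea is simple enough to sketch in a few lines. This is my own illustration of the technique, not Quill's implementation; note that reversing a run only preserves stability when the run is strictly descending, hence the `>` check:

```python
def early_exit_check(data):
    """One linear pass to spot trivially-ordered inputs before any real sorting.
    Illustration of the technique, not Quill's internals."""
    if len(data) < 2:
        return "sorted"
    if all(a <= b for a, b in zip(data, data[1:])):
        return "sorted"            # already sorted (covers all-same too)
    if all(a > b for a, b in zip(data, data[1:])):
        return "reversed"          # strictly descending: an O(n) reverse fixes it
    return None                    # no shortcut; do a full sort

print(early_exit_check([1, 2, 3]))  # sorted
print(early_exit_check([3, 2, 1]))  # reversed
print(early_exit_check([2, 9, 1]))  # None
```

Timsort does a version of this too (it detects runs), but bailing out at the top level skips even the run-merging machinery.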
Benchmarks vs Python’s sort (real numbers)
All of these were run on my own machine:
- Python 3.13.12
- 28 cores
- `quill-sort` v4.0.4
In-memory benchmarks
These are single-process, in-RAM tests where Python can reasonably compete.
Integers with many duplicates (0–1000, 2M elements)
```
Quill  : 0.1203s
Python : 0.1734s
Speedup: 1.44x
```
Quill uses integer-specific strategies here, so it wins especially hard on high-dup distributions.
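If you want a feel for why duplicate-heavy integers favor counting strategies, here's a scaled-down, dependency-free comparison you can run yourself. It's my illustration of the effect, not the article's actual benchmark harness:

```python
import random
import time

def counting_sort_small_range(data, k=1001):
    """Counting sort tuned for values in [0, k): O(n + k)."""
    counts = [0] * k
    for x in data:
        counts[x] += 1
    out = []
    for v, c in enumerate(counts):
        out.extend([v] * c)
    return out

# Scaled-down stand-in for the 2M-element, 0-1000 benchmark above.
data = [random.randrange(1001) for _ in range(200_000)]

t0 = time.perf_counter()
a = sorted(data)
t1 = time.perf_counter()

t2 = time.perf_counter()
b = counting_sort_small_range(data)
t3 = time.perf_counter()

assert a == b
print(f"timsort : {t1 - t0:.4f}s")
print(f"counting: {t3 - t2:.4f}s")
```

With only 1001 distinct values across 200K elements, the counting pass does no comparisons at all, which is exactly the structure timsort can't exploit.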
Floats with ±inf (~1.02M elements)
```
Quill  : 0.1069s
Python : 0.1206s
Speedup: 1.13x
```
Even against CPython’s heavily optimized timsort, Quill still edges ahead on large float arrays.
Random strings (3–40 chars, 500K elements)
```
Quill  : 0.1089s
Python : 0.0954s
Speedup: ~0.88x (Python slightly faster)
```
For string-heavy workloads, Python’s built-in sort is already excellent. Quill stays in the same ballpark, but its real superpowers show up on numeric and huge-scale datasets.
The fun part: 1,000,000,000 integers
Here’s the one-billion 32‑bit integer stress test I ran:
- Target: 1,000,000,000 integers
- Type: int32
- Range: 0–999,999,999
- Data size: 4.0 GB
- Machine: 28 cores, ~17 GB RAM
- Quill: v4.0.4
High-level strategy:
- 28 worker processes (one per core).
- Phase 1: generate + radix partition into buckets (.qwrite).
- Phase 2: parallel bucket merge + sort + concatenation.
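The bucket idea behind Phase 1 and Phase 2 can be sketched in a few lines. This is a single-process illustration (names and parameters are mine, not Quill's): partition by high bits so every bucket is independent, then sort buckets and concatenate.

```python
def partition_sort(data, bits=28, bucket_bits=4):
    """Phase 1: route each value to a bucket by its top bits.
    Phase 2: sort buckets and concatenate -- bucket order equals global order,
    so in a parallel version each bucket goes to its own worker.
    Illustration only; not Quill's implementation."""
    shift = bits - bucket_bits
    buckets = [[] for _ in range(1 << bucket_bits)]
    for x in data:
        buckets[x >> shift].append(x)   # Phase 1: radix partition
    out = []
    for b in buckets:                   # Phase 2: independent bucket sorts
        b.sort()
        out.extend(b)
    return out

import random
data = [random.randrange(1 << 28) for _ in range(10_000)]
assert partition_sort(data) == sorted(data)
```

Because bucket i holds only values smaller than everything in bucket i+1, no merge step is needed after the per-bucket sorts; concatenation is enough.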
Result:
```
Elements sorted : 1,000,000,000
Sort errors     : 0 [OK] perfect
Total time      : 21.3s
Throughput      : 46.9M elements/second
Output file     : 4.00 GB
Temp files      : 7408 .qwrite files (cleaned automatically)
```
This is not something CPython’s built-in sort can do directly on this hardware. With normal Python lists of int, you would run out of RAM long before 1B elements. Even with compact buffers, you’d still need hand-written external sorting logic (chunk → sort → write → merge) plus temp-file management.
Quill wraps all of that into:
```python
quill_sort(..., high_performance_mode=True)
```
External mode in action
When Quill detects that a dataset is too big for the available RAM, it prompts to enable external mode and shows you the plan:
- How many pages it will create.
- Approximate sizes.
- Where temp `.qwrite` files will live.
- That cleanup is automatic (even on crash).
Once enabled, Quill:
- Splits data into pages.
- Sorts each page (in parallel).
- Writes each sorted page as a `.qwrite` file.
- Runs a k-way merge over those files.
- Streams the final sorted output.
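Those steps are the classic external merge sort. Here's a compact stdlib-only sketch of the general technique, using plain pickle temp files rather than Quill's `.qwrite` format:

```python
import heapq
import os
import pickle
import tempfile

def external_sort(iterable, page_size=100_000):
    """Classic external merge sort: sort pages in RAM, spill each to a temp
    file, then stream a k-way merge over the files. Sketch of the technique;
    Quill's .qwrite format and page planning are its own."""
    pages, page = [], []

    def flush():
        page.sort()
        with tempfile.NamedTemporaryFile(delete=False) as f:
            for x in page:
                pickle.dump(x, f)      # one record per pickle frame
            pages.append(f.name)
        page.clear()

    for x in iterable:
        page.append(x)
        if len(page) >= page_size:
            flush()                    # spill a full, sorted page to disk
    if page:
        flush()

    def read(path):
        with open(path, "rb") as f:
            while True:
                try:
                    yield pickle.load(f)
                except EOFError:
                    return

    try:
        yield from heapq.merge(*(read(p) for p in pages))   # k-way merge
    finally:
        for p in pages:
            os.remove(p)               # temp files cleaned up even on error

import random
data = [random.random() for _ in range(50_000)]
assert list(external_sort(data, page_size=20_000)) == sorted(data)
```

Peak memory is bounded by one page per open file rather than the whole dataset, which is the entire trick: RAM usage is a parameter, not a function of input size.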
From your perspective, it’s still just:
```python
quill_sort(data, high_performance_mode=True)
```
When should you use Quill?
You probably don’t need Quill if:
- You’re sorting small or medium lists (thousands → low millions) casually.
- You only care about simple string sorting and are happy with Python’s built-ins.
Quill starts to earn its keep when:
- You’re regularly sorting millions of ints/floats and care about shaving 20–40% off the runtime.
- You work with log data, telemetry, event streams, or numeric analytics.
- You hit RAM ceilings and need a proper external sort that doesn’t require reinventing merge sort on disk.
- You want drop-in speedups without rewriting everything in C++ or Rust.
Remember: this project is still under major development. Expect sharp edges, and please report bugs if you hit them.
Quick start
Install:
```shell
pip install quill-sort[fast]
```
Basic usage:
```python
from quill import quill_sort, quill_sorted

# In-place
data = [2, 4, 5, 6, 7, 1]
quill_sort(data)

# New list
result = quill_sorted(data)

# Reverse sort
quill_sort(data, reverse=True)

# With key function
users = [{"name": "Alice", "age": 30},
         {"name": "Bob", "age": 20}]
quill_sort(users, key=lambda u: u["age"])
```
External “I have way too much data” mode:
```python
quill_sort(big_data,
           high_performance_mode=True,  # allow external sort
           silent=False)                # see progress output
```