Most of the time, Python’s built-in sorted() and list.sort() are all you need.
But if you ever try to sort a lot of data—millions to billions of values, big numeric logs, or giant SQL exports—you quickly run into a wall: RAM, speed, or both.
So I built Quill-Sort (quill-sort on PyPI).
quill-sort
Adaptive Ultra-Sort — the fastest general-purpose sorting library for Python.
```python
from quill import quill_sort, quill_sorted

quill_sort([3, 1, 4, 1, 5, 9])                 # → [1, 1, 3, 4, 5, 9]
quill_sort(records, key=lambda r: r['age'])    # sort objects
quill_sort(big_data, parallel=True)            # use all CPU cores
result = quill_sorted(iterable, reverse=True)  # mirrors built-in sorted()
```
Installation
```shell
pip install quill-sort        # core (no dependencies)
pip install quill-sort[fast]  # + numpy + psutil for max speed
pip install quill-sort[all]   # + numpy + pandas + psutil
```
How it works
Quill profiles your data at intake and routes to the optimal algorithm automatically:
| Data type | Strategy | Complexity |
|---|---|---|
| Dense integer range | `np.bincount` counting sort | O(n + k) |
| uint8 / uint16 | Radix sort (`kind='stable'`, 1-2 passes) | O(n) |
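For intuition, the dense-integer row boils down to a counting sort. Here's a minimal NumPy sketch of that idea (my own illustration of the technique, not Quill's actual code):

```python
import numpy as np

def counting_sort_dense(a):
    """Counting sort for non-negative ints in a dense range: O(n + k)."""
    counts = np.bincount(a)                           # counts[v] = occurrences of v
    return np.repeat(np.arange(counts.size), counts)  # emit each value count times

data = np.array([3, 1, 4, 1, 5, 9, 2, 6])
print(counting_sort_dense(data))  # [1 1 2 3 4 5 6 9]
```

Two vectorized passes over the data, no comparisons at all, which is why this path can beat comparison sorts so decisively when the value range is small.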
Quill is a drop-in Python sorting library that behaves like sorted() and list.sort(), but is specifically optimized for high-volume, numeric, and external (disk-backed) sorting. It’s designed to attack the weak spots of Python’s sort, not replace it for every tiny list.
Yes, I actually sorted 1,000,000,000 int32 values (4 GB) on a 28‑core machine in about 21 seconds using Quill’s external mode.
Important: Quill-Sort is still under active, major development. APIs, performance characteristics, and internal behavior may change, and bugs may occur. The GitHub repo also isn't fully set up yet. Please don't throw your only copy of critical production data at it without backups.
What is Quill-Sort?
Quill-Sort is a hyper-optimized sorting library for Python with a familiar API:
```python
from quill import quill_sort, quill_sorted

data = [1, 2, 3, 4, 5]
quill_sort(data)             # in-place, like list.sort()
result = quill_sorted(data)  # returns new list, like sorted()
```
It’s built for two worlds:
- Everyday data: millions of integers/floats/strings in memory.
- Stupidly large data: hundreds of millions or even 1 billion integers using external disk-backed sorting (`.qwrite` files) plus parallel workers.
You don’t have to think about algorithms or chunk sizes—Quill routes to the right strategy internally.
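To make that routing idea concrete, here's a toy dispatcher. The names (`route_sort`, `counting_sort`) and thresholds are my own illustration, not Quill's API or internals:

```python
def route_sort(data):
    """Toy dispatcher: pick a strategy from a cheap profile of the data.
    Illustration only -- Quill's real routing is more involved."""
    if not data:
        return list(data)
    sample = data[:1000]                        # profile a prefix, not the whole input
    if all(isinstance(x, int) for x in sample):
        lo, hi = min(data), max(data)
        if hi - lo < 10 * len(data):            # dense range -> counting sort pays off
            return counting_sort(data)
    return sorted(data)                         # otherwise fall back to timsort

def counting_sort(data):
    """O(n + k) counting sort for integers in a narrow range."""
    lo = min(data)
    counts = [0] * (max(data) - lo + 1)
    for x in data:
        counts[x - lo] += 1
    out = []
    for v, c in enumerate(counts):
        out.extend([v + lo] * c)
    return out

print(route_sort([5, 3, 5, 1, 2]))  # [1, 2, 3, 5, 5]
```

The point is that profiling is cheap relative to the sort itself, so a wrong-but-fast profile pass costs almost nothing and a right one can change the asymptotics.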
Key features (in plain English)
- Drop-in API
  - `quill_sort()` behaves like `list.sort()`.
  - `quill_sorted()` behaves like `sorted()`.
- Type-aware strategies
  - Specialized integer paths (radix, counting, NumPy-backed).
  - Float paths with proper `inf` handling.
  - Strings, bytes, and custom keys supported.
- Fully stable: equal keys preserve order, matching Python's sort semantics.
- Early-exit detection: detects already sorted / reverse-sorted / all-same data and bails out fast instead of doing full work.
- Parallel sorting: can fan out across cores for large workloads.
- External mode (`.qwrite`): when data is too big for RAM, Quill can spill to disk, partition into pages, and run a multi-phase external sort with automatic temp-file management.
- Plugins for NumPy and Pandas: drop in a NumPy array, `Series`, or `DataFrame` and let Quill handle it.
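The early-exit idea is simple enough to sketch in a few lines. This is my own illustration of the technique, not Quill's implementation; note that reversing a run only preserves stability when the run is strictly descending, hence the `>` check:

```python
def early_exit_check(data):
    """One linear pass to spot trivially-ordered inputs before any real sorting.
    Illustration of the technique, not Quill's internals."""
    if len(data) < 2:
        return "sorted"
    if all(a <= b for a, b in zip(data, data[1:])):
        return "sorted"            # already sorted (covers all-same too)
    if all(a > b for a, b in zip(data, data[1:])):
        return "reversed"          # strictly descending: an O(n) reverse fixes it
    return None                    # no shortcut; do a full sort

print(early_exit_check([1, 2, 3]))  # sorted
print(early_exit_check([3, 2, 1]))  # reversed
print(early_exit_check([2, 9, 1]))  # None
```

Timsort does a version of this too (it detects runs), but bailing out at the top level skips even the run-merging machinery.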
Benchmarks vs Python’s sort (real numbers)
All of these were run on my own machine:
- Python 3.13.12
- 28 cores
- `quill-sort` v4.0.4
In-memory benchmarks
These are single-process, in-RAM tests where Python can reasonably compete.
Integers with many duplicates (0–1000, 2M elements)
```
Quill  : 0.1203s
Python : 0.1734s
Speedup: 1.44x
```
Quill uses integer-specific strategies here, so it wins especially hard on high-dup distributions.
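If you want a feel for why duplicate-heavy integers favor counting strategies, here's a scaled-down, dependency-free comparison you can run yourself. It's my illustration of the effect, not the article's actual benchmark harness:

```python
import random
import time

def counting_sort_small_range(data, k=1001):
    """Counting sort tuned for values in [0, k): O(n + k)."""
    counts = [0] * k
    for x in data:
        counts[x] += 1
    out = []
    for v, c in enumerate(counts):
        out.extend([v] * c)
    return out

# Scaled-down stand-in for the 2M-element, 0-1000 benchmark above.
data = [random.randrange(1001) for _ in range(200_000)]

t0 = time.perf_counter()
a = sorted(data)
t1 = time.perf_counter()

t2 = time.perf_counter()
b = counting_sort_small_range(data)
t3 = time.perf_counter()

assert a == b
print(f"timsort : {t1 - t0:.4f}s")
print(f"counting: {t3 - t2:.4f}s")
```

With only 1001 distinct values across 200K elements, the counting pass does no comparisons at all, which is exactly the structure timsort can't exploit.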
Floats with ±inf (~1.02M elements)
```
Quill  : 0.1069s
Python : 0.1206s
Speedup: 1.13x
```
Even against CPython’s heavily optimized timsort, Quill still edges ahead on large float arrays.
Random strings (3–40 chars, 500K elements)
```
Quill  : 0.1089s
Python : 0.0954s
Speedup: ~0.88x (Python slightly faster)
```
For string-heavy workloads, Python’s built-in sort is already excellent. Quill stays in the same ballpark, but its real superpowers show up on numeric and huge-scale datasets.
The fun part: 1,000,000,000 integers
Here’s the one-billion 32‑bit integer stress test I ran:
- Target: 1,000,000,000 integers
- Type: int32
- Range: 0–999,999,999
- Data size: 4.0 GB
- Machine: 28 cores, ~17 GB RAM
- Quill: v4.0.4
High-level strategy:
- 28 worker processes (one per core).
- Phase 1: generate + radix partition into buckets (.qwrite).
- Phase 2: parallel bucket merge + sort + concatenation.
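The bucket idea behind Phase 1 and Phase 2 can be sketched in a few lines. This is a single-process illustration (names and parameters are mine, not Quill's): partition by high bits so every bucket is independent, then sort buckets and concatenate.

```python
def partition_sort(data, bits=28, bucket_bits=4):
    """Phase 1: route each value to a bucket by its top bits.
    Phase 2: sort buckets and concatenate -- bucket order equals global order,
    so in a parallel version each bucket goes to its own worker.
    Illustration only; not Quill's implementation."""
    shift = bits - bucket_bits
    buckets = [[] for _ in range(1 << bucket_bits)]
    for x in data:
        buckets[x >> shift].append(x)   # Phase 1: radix partition
    out = []
    for b in buckets:                   # Phase 2: independent bucket sorts
        b.sort()
        out.extend(b)
    return out

import random
data = [random.randrange(1 << 28) for _ in range(10_000)]
assert partition_sort(data) == sorted(data)
```

Because bucket i holds only values smaller than everything in bucket i+1, no merge step is needed after the per-bucket sorts; concatenation is enough.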
Result:
```
Elements sorted : 1,000,000,000
Sort errors     : 0 [OK] perfect
Total time      : 21.3s
Throughput      : 46.9M elements/second
Output file     : 4.00 GB
Temp files      : 7408 .qwrite files (cleaned automatically)
```
This is not something CPython’s built-in sort can do directly on this hardware. With normal Python lists of int, you would run out of RAM long before 1B elements. Even with compact buffers, you’d still need hand-written external sorting logic (chunk → sort → write → merge) plus temp-file management.
Quill wraps all of that into:
```python
quill_sort(..., high_performance_mode=True)
```
External mode in action
When Quill detects that a dataset is too big for the available RAM, it prompts to enable external mode and shows you the plan:
- How many pages it will create.
- Approximate sizes.
- Where temp `.qwrite` files will live.
- That cleanup is automatic (even on crash).
Once enabled, Quill:
- Splits data into pages.
- Sorts each page (in parallel).
- Writes each sorted page as a `.qwrite` file.
- Runs a k-way merge over those files.
- Streams the final sorted output.
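Those steps are the classic external merge sort. Here's a compact stdlib-only sketch of the general technique, using plain pickle temp files rather than Quill's `.qwrite` format:

```python
import heapq
import os
import pickle
import tempfile

def external_sort(iterable, page_size=100_000):
    """Classic external merge sort: sort pages in RAM, spill each to a temp
    file, then stream a k-way merge over the files. Sketch of the technique;
    Quill's .qwrite format and page planning are its own."""
    pages, page = [], []

    def flush():
        page.sort()
        with tempfile.NamedTemporaryFile(delete=False) as f:
            for x in page:
                pickle.dump(x, f)      # one record per pickle frame
            pages.append(f.name)
        page.clear()

    for x in iterable:
        page.append(x)
        if len(page) >= page_size:
            flush()                    # spill a full, sorted page to disk
    if page:
        flush()

    def read(path):
        with open(path, "rb") as f:
            while True:
                try:
                    yield pickle.load(f)
                except EOFError:
                    return

    try:
        yield from heapq.merge(*(read(p) for p in pages))   # k-way merge
    finally:
        for p in pages:
            os.remove(p)               # temp files cleaned up even on error

import random
data = [random.random() for _ in range(50_000)]
assert list(external_sort(data, page_size=20_000)) == sorted(data)
```

Peak memory is bounded by one page per open file rather than the whole dataset, which is the entire trick: RAM usage is a parameter, not a function of input size.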
From your perspective, it’s still just:
```python
quill_sort(data, high_performance_mode=True)
```
When should you use Quill?
You probably don’t need Quill if:
- You’re sorting small or medium lists (thousands → low millions) casually.
- You only care about simple string sorting and are happy with Python’s built-ins.
Quill starts to earn its keep when:
- You’re regularly sorting millions of ints/floats and care about shaving 20–40% off the runtime.
- You work with log data, telemetry, event streams, or numeric analytics.
- You hit RAM ceilings and need a proper external sort that doesn’t require reinventing merge sort on disk.
- You want drop-in speedups without rewriting everything in C++ or Rust.
Remember: this project is still under major development. Expect sharp edges, and please report bugs if you hit them.
Quick start
Install:
```shell
pip install quill-sort[fast]
```
Basic usage:
```python
from quill import quill_sort, quill_sorted

# In-place
data = [2, 4, 5, 6, 7, 1]
quill_sort(data)

# New list
result = quill_sorted(data)

# Reverse sort
quill_sort(data, reverse=True)

# With key function
users = [{"name": "Alice", "age": 30},
         {"name": "Bob", "age": 20}]
quill_sort(users, key=lambda u: u["age"])
```
External “I have way too much data” mode:
```python
quill_sort(big_data,
           high_performance_mode=True,  # allow external sort
           silent=False)                # see progress output
```