
El Housseine Jaafari

Posted on • Originally published at clawship.app

Building an Engineering & Security News Aggregator (10 Sources, No Paid APIs)

We built a curated engineering and security news aggregator that pulls from 10 high-signal sources, deduplicates content, and updates every 6 hours.

No paid APIs. No scraping. No login. Just clean, structured news for developers.

This post breaks down exactly how it works.


What This Is

A lightweight news wire combining:

  • Hacker News
  • Lobsters
  • InfoQ
  • Cloudflare Blog
  • Krebs on Security
  • The Hacker News (Security)
  • NIST NVD (vulnerabilities)
  • GitHub Blog
  • OpenAI Blog
  • Anthropic Research

The goal: high-quality signal, zero noise, zero cost.


Why Build This?

Most engineering news aggregators fail in one of these ways:

  • Too noisy (no curation)
  • Too expensive (paid APIs)
  • Too slow (manual updates)
  • Too fragmented (you check 10 sites anyway)

We wanted:

  • A single feed
  • Fresh updates, without chasing real-time
  • No operational cost
  • No lock-in (no accounts, no tracking)

Stack

  • Hono (API layer)
  • Drizzle ORM
  • Postgres
  • Next.js (frontend)
  • RSS feeds + Hacker News Firebase API

High-Level Architecture

           ┌───────────────┐
           │ RSS + HN API  │
           │ (10 sources)  │
           └──────┬────────┘
                  │
                  ▼
           ┌───────────────┐
           │ Fetch Workers │
           │ (every 6 hrs) │
           └──────┬────────┘
                  │
                  ▼
        ┌──────────────────────┐
        │ Normalize Articles   │
        │ title, url, date     │
        └─────────┬────────────┘
                  │
                  ▼
        ┌──────────────────────┐
        │ SHA-256 Deduplication│
        │ (based on URL)       │
        └─────────┬────────────┘
                  │
                  ▼
           ┌───────────────┐
           │   Postgres    │
           └──────┬────────┘
                  │
                  ▼
           ┌───────────────┐
           │   Hono API    │
           └──────┬────────┘
                  │
                  ▼
           ┌───────────────┐
           │   Next.js UI  │
           └───────────────┘

Data Sources

We deliberately chose sources with:

  • High editorial quality
  • Low duplication between each other
  • Stable RSS feeds or APIs

Breakdown

| Source | Type | Why It Matters |
| --- | --- | --- |
| Hacker News | API | Real-time dev signal |
| Lobsters | RSS | More technical discussions |
| InfoQ | RSS | Deep engineering content |
| Cloudflare Blog | RSS | Infra + performance insights |
| Krebs on Security | RSS | Trusted security reporting |
| The Hacker News | RSS | Security news (broader) |
| NIST NVD | RSS/API | Verified vulnerabilities |
| GitHub Blog | RSS | Platform + ecosystem updates |
| OpenAI Blog | RSS | AI developments |
| Anthropic Research | RSS | AI + safety research |
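
Hacker News is the one source read through an API rather than RSS. As a rough sketch of what that could look like against the public Firebase endpoints (`topstories.json` and `item/<id>.json`): the names `HnItem`, `hnItemToArticle`, and `fetchHnTopStories`, and the default of 30 stories, are illustrative assumptions and not the project's actual code.

```typescript
// Sketch only: helper names and the default limit are assumptions.
const HN_BASE = "https://hacker-news.firebaseio.com/v0";

// Subset of the fields the HN API returns for a story item.
type HnItem = {
  id: number;
  title?: string;
  url?: string; // absent for text-only posts such as Ask HN
  time?: number; // Unix time in seconds
};

// Same shape as the Article type used in the Normalization section.
type Article = {
  title: string;
  url: string;
  source: string;
  publishedAt: Date;
};

// Map one HN item to the common shape. Items without an external URL
// fall back to their discussion page.
function hnItemToArticle(item: HnItem): Article | null {
  if (!item.title || item.time === undefined) return null;
  return {
    title: item.title,
    url: item.url ?? `https://news.ycombinator.com/item?id=${item.id}`,
    source: "Hacker News",
    publishedAt: new Date(item.time * 1000),
  };
}

// Fetch the first `limit` top stories (global fetch, Node 18+).
async function fetchHnTopStories(limit = 30): Promise<Article[]> {
  const idsRes = await fetch(`${HN_BASE}/topstories.json`);
  const ids = (await idsRes.json()) as number[];
  const items = await Promise.all(
    ids.slice(0, limit).map(async (id) => {
      const res = await fetch(`${HN_BASE}/item/${id}.json`);
      return (await res.json()) as HnItem | null;
    }),
  );
  return items
    .map((item) => (item ? hnItemToArticle(item) : null))
    .filter((a): a is Article => a !== null);
}
```

Keeping the mapping in a pure function separate from the network call makes it easy to test without hitting the API.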

Fetching Strategy

We run a simple scheduled job:

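import cron from "node-cron"; // assumption: cron.schedule(expr, fn) here is node-cron's API

// Minute 0 of every 6th hour: 00:00, 06:00, 12:00, 18:00
cron.schedule("0 */6 * * *", async () => {
  await fetchAllSources();
});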

Why every 6 hours?

  • Keeps content fresh
  • Avoids unnecessary load
  • Works well with RSS update frequencies
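
With ten independent sources, a single scheduled run should probably tolerate one feed being down. Below is a minimal sketch of how a `fetchAllSources` job might do that with `Promise.allSettled`. The `Source` and `FetchReport` types are hypothetical, and this version takes the source list as a parameter only to stay self-contained.

```typescript
// Sketch only: `Source`, `FetchReport`, and this signature are assumptions.
type Article = {
  title: string;
  url: string;
  source: string;
  publishedAt: Date;
};

type Source = {
  name: string;
  fetchArticles: () => Promise<Article[]>;
};

type FetchReport = {
  articles: Article[];
  failed: string[]; // names of sources whose fetch threw or rejected
};

// Fetch every source concurrently. A failing source is recorded and skipped
// instead of aborting the whole run.
async function fetchAllSources(sources: Source[]): Promise<FetchReport> {
  const results = await Promise.allSettled(
    // The async wrapper also turns synchronous throws into rejections.
    sources.map(async (s) => s.fetchArticles()),
  );
  const report: FetchReport = { articles: [], failed: [] };
  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      report.articles.push(...result.value);
    } else {
      report.failed.push(sources[i]?.name ?? "unknown");
    }
  });
  return report;
}
```

The `failed` list gives the job something to log, and the next run six hours later retries those sources naturally.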

Deduplication (Key Part)

Different sources often link to the same story.

We detect this by hashing each article URL with SHA-256 and treating matching hashes as duplicates.

import { createHash } from "crypto";

function hashUrl(url: string) {
  return createHash("sha256").update(url).digest("hex");
}

Why URL hashing?

  • Fast
  • Deterministic
  • No fuzzy matching complexity
  • Works across sources, as long as they link to the same URL

Tradeoff

  • Won’t catch rewritten articles with different URLs
  • But avoids false positives (important for trust)
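
To make the deduplication step concrete, here is a minimal sketch that filters a batch of articles by URL hash. One part goes beyond what the post describes: `canonicalizeUrl` strips the fragment and `utm_*` tracking parameters before hashing, which is an assumption added here so that trivially different links to the same page collide. The `dedupe` helper and its `seen` set are also hypothetical.

```typescript
import { createHash } from "node:crypto";

type Article = {
  title: string;
  url: string;
  source: string;
  publishedAt: Date;
};

// Assumption (not from the post): drop the fragment and utm_* parameters,
// which do not change the page a URL points to. Throws on an invalid URL.
function canonicalizeUrl(raw: string): string {
  const u = new URL(raw);
  u.hash = "";
  const tracking: string[] = [];
  u.searchParams.forEach((_value, key) => {
    if (key.startsWith("utm_")) tracking.push(key);
  });
  tracking.forEach((key) => u.searchParams.delete(key));
  return u.toString();
}

// SHA-256 of the canonical URL, as a 64-character hex string.
function hashUrl(url: string): string {
  return createHash("sha256").update(canonicalizeUrl(url)).digest("hex");
}

// Keep the first article seen for each URL hash. Passing a pre-filled `seen`
// set lets a caller skip hashes that are already stored.
function dedupe(articles: Article[], seen: Set<string> = new Set()): Article[] {
  const unique: Article[] = [];
  for (const article of articles) {
    const key = hashUrl(article.url);
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(article);
  }
  return unique;
}
```

Across runs, the same effect could be had by storing the hash in a column with a unique constraint in Postgres, so a repeated insert is rejected by the database rather than by application code.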

Normalization

Each source has its own format. We normalize into a single shape:

type Article = {
  title: string;
  url: string;
  source: string;
  publishedAt: Date;
};

This keeps the frontend simple and predictable.
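
As an illustration of the normalization step, the sketch below maps one parsed feed entry into the `Article` shape. `RawFeedItem` is a hypothetical stand-in for whatever an RSS parser returns; real feeds differ in field names, so treat the input type as an assumption.

```typescript
// Sketch only: `RawFeedItem` and `normalizeItem` are hypothetical names.
type RawFeedItem = {
  title?: string;
  link?: string;
  pubDate?: string;
};

type Article = {
  title: string;
  url: string;
  source: string;
  publishedAt: Date;
};

// Return a clean Article, or null when a required field is missing
// or the date cannot be parsed.
function normalizeItem(raw: RawFeedItem, source: string): Article | null {
  // Collapse newlines and repeated spaces that feeds often leave in titles.
  const title = raw.title?.replace(/\s+/g, " ").trim();
  const url = raw.link?.trim();
  if (!title || !url) return null;

  const publishedAt = new Date(raw.pubDate ?? "");
  if (Number.isNaN(publishedAt.getTime())) return null;

  return { title, url, source, publishedAt };
}
```

Rejecting incomplete entries at this stage means the database and the frontend never have to handle a missing title or an invalid date.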


API Layer (Hono)

Example endpoint:

app.get("/articles", async (c) => {
  const articles = await db.query.articles.findMany({
    orderBy: (a, { desc }) => [desc(a.publishedAt)],
    limit: 100,
  });

  return c.json(articles);
});

Minimal, fast, no overengineering.
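
The endpoint above returns a fixed 100 rows. If a caller-supplied page size were ever wanted, one hedged way to add it without opening the door to huge queries is a small clamp on a query parameter. `parseLimit` and the `limit` parameter name are hypothetical and not part of the existing API.

```typescript
// Sketch only: `parseLimit` and the `?limit=` parameter are assumptions.
const MAX_LIMIT = 100; // same ceiling as the fixed limit in the endpoint

// Parse a raw query-string value into a safe row limit.
// Missing, non-numeric, zero, or negative input falls back to the default.
function parseLimit(raw: string | undefined, fallback: number = MAX_LIMIT): number {
  const n = Number.parseInt(raw ?? "", 10);
  if (Number.isNaN(n) || n < 1) return fallback;
  return Math.min(n, MAX_LIMIT);
}
```

Inside the Hono handler this could be used as `limit: parseLimit(c.req.query("limit"))`, since `c.req.query(name)` returns the raw string or `undefined`.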


Frontend (Next.js)

  • Server-rendered list
  • No login required
  • No personalization
  • Just chronological, deduplicated news

Limitations

  • Not real-time (by design)
  • No personalization
  • Deduplication is URL-based only
  • Dependent on RSS availability

What We’d Improve

  • Smarter clustering (same story, different URLs)
  • Tagging (infra, AI, security, etc.)
  • Optional filters (without accounts)

Try It

The news wire is open to everyone:

👉 https://clawship.app/blog/engineering-security-news-wire

