News intelligence vs headline noise: how we dedupe and rank crypto stories

Hundreds of sources, dozens of repeats per story, and one signal you actually need. Here's the cross-source pipeline behind the News agent.

8 May 2026·5 min read·xris team

If you've tried to follow crypto news in 2026, you've noticed the same article landing in your feed five times under five different bylines: two of them with slightly wrong details, two with sensationalized headlines, and one (usually the most boring) that's actually the primary source.

Then you have the inverse problem: a real, market-moving story buried under the noise because no big outlet picked it up yet.

The News agent's job is to deal with both. Here's how.

Three ingestion pipelines, not one

We don't trust any single source category, so we run three in parallel:

  1. Curated RSS + email newsletters. A vetted list of outlets (CoinDesk, The Block, Decrypt, Bankless, plus a handful of newsletter authors who consistently surface primary information). High signal, slow cadence.
  2. CoinGecko's news firehose for per-token coverage. Wide, noisy, but exhaustive — picks up stories the curated outlets miss.
  3. X / Twitter via Grok, narrowly scoped to a list of known-quality accounts (project teams, on-chain analysts, traders we trust). Fastest to react.

Each pipeline runs at its own cadence and writes to a shared store, tagged by source type. The downstream stages don't care where an item came from — they care about its content.
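As a rough illustration of the "shared store, tagged by source type" idea, here is a minimal Python sketch. The names NewsItem, SourceType and SharedStore are hypothetical, and the in-memory list is only a stand-in for whatever storage the real pipeline uses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class SourceType(Enum):
    CURATED = "curated"    # RSS feeds and email newsletters
    FIREHOSE = "firehose"  # wide per-token news feed
    SOCIAL = "social"      # narrowly scoped X / Twitter accounts


@dataclass
class NewsItem:
    title: str
    body: str
    url: str
    source_type: SourceType
    fetched_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class SharedStore:
    """In-memory stand-in for the store all three pipelines write to."""

    def __init__(self) -> None:
        self._items: list[NewsItem] = []

    def write(self, item: NewsItem) -> None:
        self._items.append(item)

    def all_items(self) -> list[NewsItem]:
        # Downstream stages read every item, whatever its source type.
        return list(self._items)


store = SharedStore()
store.write(NewsItem("BTC ETF outflow", "...", "https://example.com/a", SourceType.CURATED))
store.write(NewsItem("BTC ETF outflow hits $400M", "...", "https://example.com/b", SourceType.SOCIAL))
```

The tag travels with the item, so later stages can ignore it by default and still use it when the source category matters.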

Dedupe: same story, different bylines

Two articles about the same event typically share most of their wording, often around 80%. We use a combination of fuzzy title matching, body shingle hashing, and (when needed) an LLM verification pass to collapse near-duplicates into one canonical item.
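To make the first two checks concrete, here is a simplified sketch. It is not the production code: it compares word-shingle sets directly with Jaccard similarity instead of hashing them, it uses Python's difflib for the fuzzy title match, and it leaves out the LLM pass. The function names and both thresholds are illustrative assumptions.

```python
from difflib import SequenceMatcher


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles in the text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def title_similarity(a: str, b: str) -> float:
    """Fuzzy title match as a ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_near_duplicate(
    title_a: str, body_a: str, title_b: str, body_b: str,
    title_threshold: float = 0.8,  # illustrative value, not from the post
    body_threshold: float = 0.5,   # illustrative value, not from the post
) -> bool:
    # In this sketch either signal alone is enough to call it a duplicate.
    if title_similarity(title_a, title_b) >= title_threshold:
        return True
    return jaccard(shingles(body_a), shingles(body_b)) >= body_threshold
```

Borderline pairs that sit close to either threshold are the natural candidates for the slower LLM check.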

The collapsed item keeps all the source URLs. So when the dashboard shows you "BTC ETF saw $400M outflow," you can expand to see that CoinDesk, The Block and two newsletters all carried it. The "merged ×4" badge in the dashboard means exactly that: the same story, carried by four sources, and therefore higher confidence.
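A collapsed item of this kind could be modelled roughly as follows. CanonicalItem and collapse are hypothetical names, and taking the canonical title from the first duplicate is an assumption of this sketch rather than a documented rule.

```python
from dataclasses import dataclass, field


@dataclass
class CanonicalItem:
    title: str
    source_urls: list[str] = field(default_factory=list)

    @property
    def merged_count(self) -> int:
        return len(self.source_urls)

    def badge(self) -> str:
        # Only show a badge when more than one source carried the story.
        if self.merged_count > 1:
            return f"merged ×{self.merged_count}"
        return ""


def collapse(duplicates: list[tuple[str, str]]) -> CanonicalItem:
    """Collapse (title, url) pairs already judged to be the same story.

    The first pair supplies the canonical title; every distinct URL is kept.
    """
    if not duplicates:
        raise ValueError("need at least one item to collapse")
    item = CanonicalItem(title=duplicates[0][0])
    for _, url in duplicates:
        if url not in item.source_urls:
            item.source_urls.append(url)
    return item


merged = collapse([
    ("BTC ETF saw $400M outflow", "https://example.com/outlet-1"),
    ("Bitcoin ETFs bleed $400M", "https://example.com/outlet-2"),
    ("ETF outflows hit $400M", "https://example.com/newsletter-1"),
    ("$400M leaves BTC ETFs", "https://example.com/newsletter-2"),
])
```

Keeping the URL list on the item is what lets the badge count and the expandable source list come from the same data.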

Why does this matter? Because multi-source coverage is a signal. If only one outlet is carrying something, it's either a scoop (rare) or unverified (common). If five carry it within an hour, it's almost certainly real and the market is likely to react.

Sentiment scoring

Each item gets a sentiment label — bullish, bearish, neutral, or mixed — assigned by a lightweight classifier trained on labeled crypto headlines. The label is shown next to the item, color-coded, and the dashboard's Sentiment bar at the top of the News page aggregates all of today's items into a single bar so you can see the day's mood at a glance.
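The aggregation behind a bar like this can be very simple. The sketch below assumes the bar shows each label's share of the day's items; the function name and the choice to ignore unknown labels are illustrative.

```python
from collections import Counter

LABELS = ("bullish", "bearish", "neutral", "mixed")


def sentiment_bar(labels: list[str]) -> dict[str, float]:
    """Return each label's share of the items, as fractions of the total.

    Labels outside LABELS are ignored in this sketch.
    """
    counts = Counter(labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


today = sentiment_bar(["bullish", "bullish", "bearish", "neutral"])
# today["bullish"] is 0.5, today["bearish"] and today["neutral"] are 0.25
```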

Sentiment isn't a directional prediction. A bearish item doesn't mean "sell now." It means the news flow is leaning negative, which is one input among others.

Impact tagging and pinning

Sentiment tells you the mood. Impact tells you whether anyone should care.

We tag each item with an impact_size ranging from low (chatter) to high (event). High-impact items get pinned to the relevant token's news strip: instead of scrolling away, they sit at the top of that token's news view until they decay (typically 7 days for unlocks, 24 hours for price-moving events).

The pin reason appears on the item: 📌 Unlock event T-36h or 📌 ETF flow inflection. You always know why something was pinned.
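Pinning with decay might look something like the sketch below. The decay windows are the typical figures quoted above; the Pin class, the kind keys and the string values for impact_size are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Typical decay windows from the text; the dictionary keys are hypothetical.
PIN_TTL = {
    "unlock": timedelta(days=7),
    "price_event": timedelta(hours=24),
}


@dataclass
class Pin:
    reason: str         # shown on the item, e.g. "Unlock event T-36h"
    kind: str           # key into PIN_TTL
    pinned_at: datetime

    def is_active(self, now: datetime) -> bool:
        # A pin stays up until its decay window has fully elapsed.
        return now - self.pinned_at < PIN_TTL[self.kind]


def should_pin(impact_size: str) -> bool:
    # Only high-impact items are pinned; "high" as a string is an assumption.
    return impact_size == "high"


pinned_at = datetime(2026, 5, 1, tzinfo=timezone.utc)
etf_pin = Pin("ETF flow inflection", "price_event", pinned_at)
unlock_pin = Pin("Unlock event T-36h", "unlock", pinned_at)
```

Storing the reason on the pin itself is what makes it possible to always show why an item is sitting at the top.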

Per-token scoping

A generic news feed is useless. What you actually want is: when you click on SOL in the dashboard, you see the news that matters for SOL, not the broad market.

The News agent maintains a per-token index. An item about a Bitcoin ETF flow gets associated with BTC. An item about Solana validator economics gets associated with SOL. An item about a SOL/ETH bridge bug gets both. Multi-token attribution is the rule, not the exception.
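A per-token index with multi-token attribution can be sketched as a mapping from token symbol to item IDs. TokenIndex and the item IDs below are hypothetical; how tokens are extracted from an article is outside this sketch.

```python
from collections import defaultdict


class TokenIndex:
    """Maps a token symbol to the IDs of news items attributed to it."""

    def __init__(self) -> None:
        self._by_token: dict[str, list[str]] = defaultdict(list)

    def add(self, item_id: str, tokens: set[str]) -> None:
        # One item can be filed under several tokens.
        for token in tokens:
            self._by_token[token.upper()].append(item_id)

    def for_token(self, token: str) -> list[str]:
        # .get avoids creating empty entries for tokens with no news.
        return list(self._by_token.get(token.upper(), []))


index = TokenIndex()
index.add("etf-flow", {"BTC"})
index.add("validator-economics", {"SOL"})
index.add("bridge-bug", {"SOL", "ETH"})
```

Here index.for_token("SOL") returns both the validator item and the bridge bug, while index.for_token("BTC") returns only the ETF flow item.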

This is what powers the news badges on the main grid. A small dot next to SOL means "there's pinned news on this token, click in." Amber for high-impact, blue for recent. You're not scanning a global feed — you're being told which tokens have news worth a look.

Cross-checking with Grok and Perplexity

For high-impact items, we add a verification pass. The item gets handed to two independent research models — Grok (for X-native context) and Perplexity (for web-grounded synthesis) — to confirm the claim against primary sources.

The dashboard shows you the verification status: ✓ confirmed if both checks agreed with the original, or via grok_only / via perplexity_only if only one of them did. Items that fail both verifications get downranked (still visible if you expand, but not pinned).
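The status logic reduces to combining two boolean results. In the sketch below the calls to the two research models are not shown, and the "unverified" label for items that fail both checks is a hypothetical name.

```python
def verification_status(grok_ok: bool, perplexity_ok: bool) -> str:
    """Combine two independent verification results into a display status."""
    if grok_ok and perplexity_ok:
        return "confirmed"
    if grok_ok:
        return "via grok_only"
    if perplexity_ok:
        return "via perplexity_only"
    return "unverified"  # hypothetical label for items that fail both checks


def can_pin(status: str) -> bool:
    # Items that fail both checks stay visible but are never pinned.
    return status != "unverified"
```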

This is the "second opinion" pattern that single-LLM news summarizers can't do — they have one voice and one context. We have specialists.

Voting

Finally, you can vote on items. Up means "this matters for trading intelligence," down means "this is noise, deprioritize." Votes feed back into the ranker so the system learns from your judgment over time.

The vote count is shown next to each item, and you can vote only once per item per browser. It's a lightweight feedback loop, not a popularity contest.
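The once-per-item-per-browser rule can be sketched as a ledger keyed on the pair of item and browser. VoteBook and browser_id are hypothetical names, and how the resulting score feeds the ranker is deliberately left out.

```python
class VoteBook:
    """Records at most one vote per (item, browser) pair."""

    def __init__(self) -> None:
        self._votes: dict[tuple[str, str], int] = {}

    def vote(self, item_id: str, browser_id: str, direction: int) -> bool:
        """Record +1 (up) or -1 (down); return False if already voted."""
        if direction not in (1, -1):
            raise ValueError("direction must be +1 or -1")
        key = (item_id, browser_id)
        if key in self._votes:
            return False
        self._votes[key] = direction
        return True

    def score(self, item_id: str) -> int:
        # Net votes for one item across all browsers.
        return sum(v for (iid, _), v in self._votes.items() if iid == item_id)


book = VoteBook()
book.vote("etf-flow", "browser-a", 1)
book.vote("etf-flow", "browser-b", 1)
book.vote("etf-flow", "browser-c", -1)
repeat_accepted = book.vote("etf-flow", "browser-a", -1)  # second vote, rejected
```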

What you see in the dashboard

When you open the News tab on a token (or the standalone News view), you get:

  • A sentiment bar at the top — the day's aggregate mood across the items you're filtering.
  • A tag heatmap — what topics are dominating today (use it as a filter).
  • The macro brief and crypto brief — two AI-written summaries (separate post on those upcoming).
  • The news list itself, with sentiment, priority, type, verification, sources, and merged-count badges.

Each item is one to two clicks away from its primary source. Nothing is lost in the synthesis.

What this isn't

This isn't a "summarize the news for me" copilot. We're not trying to replace your reading — we're trying to make the reading you do worth doing. You still click through to the high-impact items. You still form your own view. The agent just makes sure the items in front of you are the ones worth your time.

If you want to see it in practice, open the news view — it's live and free during beta.

For more on how the agents work together, read about the merge layer that combines their outputs, or how the S/R agent contributes its piece.