Algorithmic Signal Discovery: Applying PageRank and LLMs to AI Twitter

A LessWrong contributor outlines a methodology for bypassing engagement algorithms to find high-quality, underrated technical voices.

In a recent post on LessWrong, a contributor explores a methodological approach to solving the "discovery problem" on social media, specifically within the artificial intelligence sector. The article, titled "Finding high signal people - applying PageRank to Twitter," details an experiment in algorithmic curation designed to bypass standard engagement metrics and identify high-value technical voices.

The Context

For developers and researchers in the AI/ML space, Twitter (now X) remains a primary channel for real-time discourse and paper releases. However, the platform's native discovery algorithms prioritize engagement, controversy, and mass appeal over technical accuracy or insight. This creates a "rich get richer" dynamic where established figures dominate the conversation, while high-signal researchers with smaller followings, or those who refuse to play the engagement game, remain invisible. The challenge lies in separating signal from noise without relying on vanity metrics like follower counts.

The Methodology

The author proposes a solution that combines classical graph theory with modern Large Language Models (LLMs). The core of the approach is a reapplication of the PageRank algorithm, the same logic that powered early Google search. Instead of web pages linking to one another, the model treats a "follow" as a citation or vote of confidence. However, not all votes are equal; a follow from a highly respected researcher carries significantly more weight than a follow from a random account.
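To make the "weighted vote" idea concrete, here is a minimal sketch of textbook PageRank run over a follow graph using power iteration. It is not the author's code: the account names, the toy graph, the damping factor of 0.85, and the iteration count are all illustrative assumptions.

```python
def pagerank(follows, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `follows` maps each account to the list of accounts it follows.
    A follow is treated as an outgoing link (a "vote") from follower to followee.
    """
    nodes = set(follows)
    for targets in follows.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every account receives the same small baseline share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = follows.get(node, [])
            if targets:
                # Split this account's rank evenly among the accounts it follows.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Accounts that follow nobody spread their rank over everyone.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: two experts both follow "quiet_researcher",
# while three low-rank fans follow "popular_account".
follows = {
    "expert_a": ["expert_b", "quiet_researcher"],
    "expert_b": ["expert_a", "quiet_researcher"],
    "fan_1": ["expert_a", "popular_account"],
    "fan_2": ["expert_a", "popular_account"],
    "fan_3": ["expert_b", "popular_account"],
}
rank = pagerank(follows)
for account, score in sorted(rank.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{account:18s} {score:.3f}")
```

In this toy graph, "quiet_researcher" has fewer followers than "popular_account" yet ends up with a higher score, because its two followers are themselves highly ranked.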

The process involves three distinct stages:

Bootstrapping the Graph: Starting with a seed list of universally recognized AI figures to establish a baseline of "importance."
Identifying the Underrated: By comparing a user's calculated PageRank against their actual follower count, the system identifies anomalies: users who are highly respected by the experts but relatively unknown to the general public.
LLM Qualitative Filtering: To ensure the identified accounts are actually posting technical content rather than just being friends with researchers, an LLM analyzes their timeline to verify "consistent high signal."
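The second stage, comparing PageRank against follower count, could be sketched as follows. The summary does not state the exact formula used in the post, so this example assumes a simple one: the gap between an account's position when sorted by followers and its position when sorted by PageRank. The function name, account names, and numbers are hypothetical.

```python
def underrated_scores(pagerank, followers):
    """Score = position by follower count minus position by PageRank.

    Position 0 is the top of each ordering, so a positive score means the
    account sits higher in the expert-weighted ordering than in the
    popularity ordering. This scoring rule is an assumption, not the
    post's stated formula.
    """
    accounts = [a for a in pagerank if a in followers]
    by_rank = sorted(accounts, key=lambda a: pagerank[a], reverse=True)
    by_followers = sorted(accounts, key=lambda a: followers[a], reverse=True)
    rank_pos = {a: i for i, a in enumerate(by_rank)}
    follower_pos = {a: i for i, a in enumerate(by_followers)}
    return {a: follower_pos[a] - rank_pos[a] for a in accounts}


# Hypothetical inputs: PageRank values and public follower counts.
pagerank_values = {
    "famous_lab_head": 0.40,
    "quiet_researcher": 0.35,
    "viral_commentator": 0.25,
}
follower_counts = {
    "famous_lab_head": 900_000,
    "quiet_researcher": 4_000,
    "viral_commentator": 500_000,
}

scores = underrated_scores(pagerank_values, follower_counts)
most_underrated = max(scores, key=scores.get)
print(most_underrated, scores)
```

A position-based gap is only one option; a ratio of PageRank to log follower count would serve the same purpose. Either way, the top-scoring accounts would then be passed to the LLM filtering stage to check what they actually post.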

This approach offers a replicable framework for anyone looking to curate a higher-quality information diet, moving beyond the limitations of "Who to Follow" recommendations.

Read the full post on LessWrong

Key Takeaways

Traditional social media metrics (likes, follower counts) are poor proxies for technical competence or insight.
PageRank can be adapted to social graphs by treating 'follows' as weighted citations, prioritizing connections from established experts.
The 'Underrated' metric identifies users with high centrality in expert networks but low public follower counts.
LLMs serve as a necessary qualitative filter to distinguish between high-status accounts and high-signal technical contributors.
