Algorithmic Signal Discovery: Applying PageRank and LLMs to AI Twitter
Coverage of lessw-blog
A LessWrong contributor outlines a methodology for bypassing engagement algorithms to find high-quality, underrated technical voices.
In a recent post on LessWrong, a contributor explores a methodological approach to solving the "discovery problem" on social media, specifically within the artificial intelligence sector. The article, titled "Finding high signal people - applying PageRank to Twitter," details an experiment in algorithmic curation designed to bypass standard engagement metrics and identify high-value technical voices.
The Context
For developers and researchers in the AI/ML space, Twitter (now X) remains a primary channel for real-time discourse and paper releases. However, the platform's native discovery algorithms prioritize engagement, controversy, and mass appeal over technical accuracy or insight. This creates a "rich get richer" dynamic in which established figures dominate the conversation, while high-signal researchers with smaller followings, or those who refuse to play the engagement game, remain invisible. The challenge lies in separating signal from noise without relying on vanity metrics like follower counts.
The Methodology
The author proposes a solution that combines classical graph theory with modern Large Language Models (LLMs). The core of the approach is a reapplication of the PageRank algorithm, the same logic that powered early Google search. Instead of web pages linking to one another, the model treats a "follow" as a citation or vote of confidence. Not all votes are equal, however: a follow from an account that itself ranks highly, such as a respected researcher, carries significantly more weight than a follow from a random account.
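To make the "weighted vote" idea concrete, here is a minimal sketch of textbook PageRank run over a list of follow edges. It is not the author's code: the function name `pagerank`, the damping factor of 0.85, the iteration count, and the toy account names are all assumptions chosen for illustration.

```python
def pagerank(follows, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    follows: iterable of (follower, followee) pairs (hypothetical input format).
    Returns a dict mapping each account to a score; scores sum to 1.
    """
    nodes = set()
    out_links = {}
    for follower, followee in follows:
        nodes.update((follower, followee))
        out_links.setdefault(follower, set()).add(followee)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Accounts that follow nobody spread their score evenly over everyone.
        dangling = sum(rank[node] for node in nodes if node not in out_links)
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Each account splits its score equally among the accounts it follows.
        for follower, followees in out_links.items():
            share = damping * rank[follower] / len(followees)
            for followee in followees:
                new_rank[followee] += share
        rank = new_rank
    return rank


# Toy graph: both target accounts have exactly one follower each,
# but only one of those followers is itself widely followed.
follows = [
    ("fan_1", "expert"),
    ("fan_2", "expert"),
    ("fan_3", "expert"),
    ("expert", "quiet_researcher"),
    ("rando", "other_account"),
]
ranks = pagerank(follows)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{name}: {ranks[name]:.3f}")
```

In this toy graph, `quiet_researcher` and `other_account` have the same follower count, yet the former scores higher because its single follower is itself followed by several accounts. That is the property the post relies on.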
The process involves three distinct stages:
- Bootstrapping the Graph: Starting with a seed list of universally recognized AI figures to establish a baseline of "importance."
- Identifying the Underrated: By comparing a user's calculated PageRank against their actual follower count, the system identifies anomalies: users who are highly respected by the experts but relatively unknown to the general public.
- LLM Qualitative Filtering: To ensure the identified accounts are actually posting technical content rather than just being friends with researchers, an LLM analyzes their timeline to verify "consistent high signal."
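The last two stages can be sketched as a ranking step followed by a filter. The summary does not give the post's exact formula or prompt, so everything here is an assumption: the "PageRank per follower" ratio, the names `underrated_scores` and `shortlist`, and the sample data are hypothetical. The LLM call is represented by a pluggable function, with a simple keyword check (`keyword_stub`) standing in so the example runs without any external service.

```python
from typing import Callable, Dict, List


def underrated_scores(rank: Dict[str, float], followers: Dict[str, int]) -> Dict[str, float]:
    # Assumed metric: PageRank divided by follower count (+1 avoids division by zero).
    return {user: rank[user] / (followers.get(user, 0) + 1) for user in rank}


def shortlist(
    rank: Dict[str, float],
    followers: Dict[str, int],
    timelines: Dict[str, List[str]],
    is_high_signal: Callable[[List[str]], bool],
    top_k: int = 10,
) -> List[str]:
    scores = underrated_scores(rank, followers)
    ordered = sorted(scores, key=scores.get, reverse=True)
    # Keep only accounts whose recent posts pass the qualitative check.
    kept = [user for user in ordered if is_high_signal(timelines.get(user, []))]
    return kept[:top_k]


def keyword_stub(posts: List[str]) -> bool:
    # Placeholder for an LLM judgment; a real filter would send the posts to a model.
    technical = ("paper", "benchmark", "gradient", "ablation")
    hits = sum(any(word in post.lower() for word in technical) for post in posts)
    return bool(posts) and hits / len(posts) >= 0.5


# Made-up sample data for illustration only.
rank = {"big_name": 0.40, "quiet_researcher": 0.30, "friendly_account": 0.30}
followers = {"big_name": 500_000, "quiet_researcher": 1_200, "friendly_account": 900}
timelines = {
    "big_name": ["Our new paper is out", "Thoughts on the keynote"],
    "quiet_researcher": ["New ablation on attention heads", "Benchmark results for our paper"],
    "friendly_account": ["Great dinner with the team!", "Happy birthday!"],
}
picks = shortlist(rank, followers, timelines, keyword_stub)
print(picks)
```

Keeping the filter as a plain callable means the ranking logic can be tested on its own, and the keyword stub can later be swapped for a model-backed check without changing `shortlist`.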
This approach offers a replicable framework for anyone looking to curate a higher-quality information diet, moving beyond the limitations of "Who to Follow" recommendations.
Read the full post on LessWrong
Key Takeaways
- Traditional social media metrics (likes, follower counts) are poor proxies for technical competence or insight.
- PageRank can be adapted to social graphs by treating 'follows' as weighted citations, prioritizing connections from established experts.
- The 'Underrated' metric identifies users with high centrality in expert networks but low public follower counts.
- LLMs serve as a necessary qualitative filter to distinguish between high-status accounts and high-signal technical contributors.