Uncovering the Hidden Priors: What LLMs Generate by Default
Coverage of together-blog
together-blog investigates the inherent 'knowledge priors' of major model families, revealing distinct default behaviors in GPT, Llama, and others when left unprompted.
In a recent analysis, together-blog poses a fundamental question regarding the nature of artificial intelligence: "What do LLMs think when you don't tell them what to think about?" While most users interact with Large Language Models (LLMs) through specific prompts and instructions, the underlying models possess inherent tendencies, referred to as "knowledge priors," that shape their output when no specific direction is provided. This post explores these default behaviors across several major model families, offering a glimpse into the "subconscious" of modern AI.
The Context: The Myth of the Blank Slate
It is common to conceptualize foundation models as neutral engines that simply process user intent. However, every model is a product of its training data mix, architectural choices, and fine-tuning strategies. These factors create a gravitational pull toward certain types of content. Understanding these priors is not merely an academic exercise; it is critical for developers who need to understand model bias, steerability, and the suitability of specific architectures for distinct tasks. If a model has a strong default tendency toward a specific domain, it may require more aggressive prompting to perform well in unrelated areas.
The Analysis: Distinct Personalities in Code
The investigation by together-blog reveals that different model families exhibit surprisingly distinct "personalities" when generating content without constraints. The findings suggest that the training data distribution leaves a permanent fingerprint on the model's default state:
- GPT Models: These models demonstrate a strong inclination toward code and mathematics. This suggests a heavy weighting of technical data in their training corpus, positioning them as logic-first engines.
- Llama Models: In contrast, the Llama family shows a marked preference for narratives and storytelling, indicating a training focus on prose and general literature.
- DeepSeek: Perhaps the most surprising finding is this family's tendency to generate religious content, pointing to specific cultural or textual concentrations in its dataset.
- Qwen: These models frequently output exam questions, likely reflecting a training regimen heavy on educational benchmarks and testing materials.
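The coverage does not detail the post's methodology, but a common way to surface such priors is to sample many completions from an empty (BOS-only) prompt and tally them by topic. The sketch below is a hypothetical keyword-based tallier, not the post's actual pipeline; the category names, keyword lists, and sample texts are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical keyword lists per category; the real study's taxonomy is unknown.
CATEGORY_KEYWORDS = {
    "code_math": ["def ", "return", "theorem", "equation", "integral"],
    "narrative": ["once upon", "she said", "the story", "character"],
    "religious": ["scripture", "prayer", "faith", "worship"],
    "exam": ["multiple choice", "question 1", "correct answer", "a) ", "b) "],
}

def classify(sample: str) -> str:
    """Assign one unprompted sample to the category with the most keyword hits."""
    text = sample.lower()
    scores = {
        cat: sum(text.count(kw) for kw in kws)
        for cat, kws in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"

def prior_profile(samples: list[str]) -> Counter:
    """Tally categories across many unconditional generations."""
    return Counter(classify(s) for s in samples)

# Illustrative stand-ins for model outputs; a real run would instead sample
# each model from its BOS token with no user prompt.
samples = [
    "def add(a, b):\n    return a + b",
    "Once upon a time, the story began in a quiet village.",
    "Question 1: Which option is the correct answer? a) ... b) ...",
]
print(prior_profile(samples))
```

Run over a few hundred unconditional samples per model, a profile like this would make the families' different default leanings directly comparable.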
Why This Matters
Identifying these priors allows engineers to better understand the intrinsic biases of the tools they are building with. For example, knowing that Qwen leans toward exam formats might explain certain rigidities in creative writing tasks, while GPT's code bias explains its proficiency in logic but occasional dryness in creative prose. This research underscores that there is no such thing as a truly "unbiased" model; there are only models with different default settings.
To understand the methodology behind these findings and see the specific examples generated by each model family, we recommend reading the full report.
Read the full post at together-blog
Key Takeaways
- LLM families possess distinct "knowledge priors" that dictate their default output when unprompted.
- GPT models show a strong default bias toward generating code and mathematical content.
- Llama models exhibit a natural preference for narrative structures and storytelling.
- DeepSeek and Qwen models display unique tendencies toward religious text and exam questions, respectively.
- Understanding these intrinsic biases is essential for effective model selection and prompt engineering.