# The Structuralist: Tsinghua’s Jiuge and the Pre-LLM Era of Constrained Generative AI

> How a specialized deep learning system from 2022 highlights the trade-offs between structural precision and generalist flexibility.

**Published:** April 18, 2022
**Author:** Editorial Team
**Category:** platforms

**Tags:** Generative AI, Natural Language Processing, Tsinghua University, Deep Learning, Classical Chinese Poetry, Jiuge, AI Architecture

**Canonical URL:** https://pseedr.com/platforms/the-structuralist-tsinghuas-jiuge-and-the-pre-llm-era-of-constrained-generative-

---

In April 2022, the landscape of generative AI was defined not by monolithic foundation models but by highly specialized systems designed to master distinct domains. In that context, the Natural Language Processing Lab at Tsinghua University (THUNLP), led by Professor Sun Maosong, solidified the capabilities of Jiuge, a deep learning system engineered specifically for the automatic generation of classical Chinese poetry. While the industry’s focus has since shifted toward Artificial General Intelligence (AGI) and massive Large Language Models (LLMs), Jiuge remains a useful case study in how neural networks can adhere to "hard constraints": the strict tonal and rhythmic rules of classical verse, which can still trip up advanced modern transformers.

The development of Jiuge marked a significant milestone at the intersection of computational linguistics and cultural heritage. Unlike the prose generation tasks that dominate current benchmarks, classical Chinese poetry requires adherence to rigorous structural rules. A standard _Jueju_ (quatrain) or _Lüshi_ (eight-line regulated verse) demands specific tonal patterns (level and oblique tones), rhyme schemes, and semantic parallelism. In 2022, THUNLP explicitly positioned Jiuge as a "Deep Learning-based System for Automatic Generation of Chinese Classical Poetry," a description that separated it from both the rule-based templates of the past and the unstructured text generators of the time.

### The Architecture of Constraint

Jiuge used deep learning to search the very large space of possible Chinese character sequences. The system was designed to handle the dual challenges of "meaning" and "form." While modern LLMs like GPT-4 or Baidu’s Ernie Bot excel at semantic coherence, they often struggle with the rigid structural constraints of poetry unless given extensive prompt engineering or fine-tuning. Jiuge was built to prioritize these constraints natively.
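One common way to make a generator respect form "natively" is constrained decoding: at each position, candidates that break the required tone are discarded before the best-scoring character is chosen. The following is a minimal sketch of that idea only, not Jiuge's documented implementation. The function names, the hand-entered tone table (`"P"` for level, `"Z"` for oblique), and the candidate scores are all hypothetical stand-ins for what a real model and rhyme dictionary would supply.

```python
# Hypothetical toy tone table: "P" = level (ping), "Z" = oblique (ze).
# A real system would look tones up in a full rhyme dictionary.
TONE_OF = {
    "白": "Z", "日": "Z", "依": "P", "山": "P",
    "尽": "Z", "黄": "P", "河": "P",
}


def constrained_pick(scores, tone_of, required_tone):
    """Return the highest-scoring character whose tone fits the slot.

    `scores` maps candidate characters to model scores (assumed given).
    Returns None when no candidate satisfies the constraint.
    """
    allowed = {ch: s for ch, s in scores.items()
               if tone_of.get(ch) == required_tone}
    if not allowed:
        return None
    return max(allowed, key=allowed.get)


def decode_line(step_scores, tone_of, pattern):
    """Greedily decode one line so every position matches `pattern`."""
    if len(step_scores) != len(pattern):
        raise ValueError("need one score table per pattern position")
    chars = []
    for position, (scores, required) in enumerate(zip(step_scores, pattern)):
        ch = constrained_pick(scores, tone_of, required)
        if ch is None:
            raise ValueError(f"no valid candidate at position {position}")
        chars.append(ch)
    return "".join(chars)


# Made-up scores: the top-scoring candidate is often tonally wrong.
step_scores = [
    {"黄": 0.6, "白": 0.4},
    {"河": 0.7, "日": 0.3},
    {"依": 0.9, "尽": 0.1},
    {"山": 0.8, "日": 0.2},
    {"河": 0.6, "尽": 0.4},
]

print(decode_line(step_scores, TONE_OF, "ZZPPZ"))  # prints 白日依山尽
```

In this toy run, unconstrained greedy decoding would start with 黄 (score 0.6), but the oblique-tone slot forces the lower-scoring 白. That is the trade-off in miniature: the constraint overrides raw model preference wherever the two disagree.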

According to the project documentation, the system was trained on a massive corpus of classical literature, allowing it to internalize the stylistic nuances of different dynasties, specifically the Tang and Song eras. The public implementation, hosted on GitHub and available via a web interface, demonstrated the lab's commitment to open research, a practice that has become more complicated in the proprietary era of 2024.

### Competitive Landscape: The Specialist vs. The Chatbot

In 2022, the primary comparison for Jiuge was Microsoft’s Xiaoice, a social chatbot that had gained fame for publishing a book of poetry. However, where Xiaoice focused on emotional resonance and modern free verse, Jiuge targeted the academic rigor of classical forms. This distinction is vital; evaluating creative AI is notoriously subjective, but classical poetry offers objective metrics—a character is either tonally correct, or it is not. By focusing on these verifiable metrics, Tsinghua provided a clear benchmark for model performance that purely semantic models could not replicate.
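To make the "tonally correct or not" claim concrete, here is a rough sketch of how tonal compliance could be scored mechanically. It assumes a lookup table from character to tone class; the tiny table below is hand-entered for the opening couplet of Wang Zhihuan's 登鹳雀楼 and is purely illustrative. The function names and the pattern notation (`"P"` level, `"Z"` oblique, `"*"` either) are assumptions for this example, not part of any THUNLP tooling.

```python
# Illustrative tone table: "P" = level (ping), "Z" = oblique (ze).
TONE_OF = {
    "白": "Z", "日": "Z", "依": "P", "山": "P", "尽": "Z",
    "黄": "P", "河": "P", "入": "Z", "海": "Z", "流": "P",
}


def tonal_violations(line, pattern, tone_of=TONE_OF):
    """List (index, char, expected, actual) for each tonal mismatch.

    "*" in the pattern accepts either tone. Characters missing from
    the table have actual tone None and are counted as violations.
    """
    if len(line) != len(pattern):
        raise ValueError("line and pattern must have the same length")
    violations = []
    for i, (ch, expected) in enumerate(zip(line, pattern)):
        if expected == "*":
            continue
        actual = tone_of.get(ch)
        if actual != expected:
            violations.append((i, ch, expected, actual))
    return violations


def compliance_rate(lines, patterns, tone_of=TONE_OF):
    """Fraction of constrained positions whose tone matches the pattern."""
    checked = 0
    bad = 0
    for line, pattern in zip(lines, patterns):
        checked += sum(1 for p in pattern if p != "*")
        bad += len(tonal_violations(line, pattern, tone_of))
    return 1.0 if checked == 0 else (checked - bad) / checked


couplet = ["白日依山尽", "黄河入海流"]
print(compliance_rate(couplet, ["ZZPPZ", "PPZZP"]))  # prints 1.0
print(tonal_violations("黄河入海流", "ZPZZP"))
```

Because every position yields a yes-or-no answer, the score is reproducible across evaluators, which is what distinguishes this kind of check from subjective judgments of emotional resonance.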

### Retrospective: The Shift from Specialized to Generalist Models

Looking back from the vantage point of the current AI landscape, Jiuge represents the apex of the "Specialized Model" era. In the years since its release, the industry thesis has shifted. The prevailing belief today is that a sufficiently large generalist model can perform any task, including poetry, via few-shot prompting. However, this assumption often fails under scrutiny. Generalist models frequently "hallucinate" constraints, producing poems that look correct to a layperson but violate the tonal rules essential to the art form.

Jiuge’s legacy suggests that for high-stakes or highly constrained domains—whether it be classical poetry, legal contract drafting, or code generation—architecture matters. The "black box" approach of massive transformers often sacrifices precision for versatility. The work done by THUNLP highlights that cultural preservation through AI requires more than just ingesting data; it requires encoding the structural DNA of the culture into the model's objective functions.

### Conclusion

While Jiuge may not possess the conversational fluidity of today's leading chatbots, its contribution to the field of Natural Language Processing (NLP) remains relevant. It demonstrated that deep learning could be disciplined. As the next generation of AI is refined, the lessons from Jiuge, specifically the integration of hard constraints into neural generation, will likely resurface as researchers try to address the hallucination and imprecision issues that affect current foundation models.

### Key Takeaways

*   **Constraint Adherence:** Jiuge demonstrated that deep learning systems could master strict structural rules (rhyme, meter, tone) better than generalist models, a capability that remains a benchmark for AI logic.
*   **Specialized vs. General:** The system represents the pre-2023 focus on domain-specific architectures, contrasting with the current trend of massive, multi-purpose Foundation Models.
*   **Cultural AI:** THUNLP's work highlights the necessity of specialized training data and architectural adjustments to preserve the integrity of culturally specific forms like Classical Chinese poetry.
*   **Objective Evaluation:** Unlike free verse or prose, classical poetry provided a rare objective metric (tonal compliance) for evaluating the precision of generative text models.

---

## Sources

- https://xiaoyuanyi.github.io/slides/jiuge.pdf
- https://github.com/THUNLP-AIPoet
- http://jiuge.thunlp.org/index_phone.html
