Large language models generate text by predicting the next word in a sequence, a process that seems to prioritize content.

But their actual power (in some ways, their only power) emerges from their architecture - the transformer structure that allows attention to every other word, the layers that build conceptual hierarchies, the parameters that encode patterns.

GPT-4 doesn't simply know more words than GPT-3; its superiority plausibly lies in how it is built and trained, though OpenAI has not published the architectural details.

This isn’t actually about AI.

My basic idea is this: from scientific papers to political movements, from educational curricula to religious texts, the underlying architecture determines success or failure more reliably than the specific content contained within.

If this is true (and I think it is), then we've been optimizing for the wrong variable all along.

The Cathedral Effect

The builders of medieval cathedrals understood something profound about human cognition. Walking into Notre Dame, the feeling of wonder and reverence isn’t drawn from any particular stone or stained glass panel. It's the totality - the structural arrangement that creates an experience no individual element could produce alone.

Academic papers operate similarly. The IMRAD structure (Introduction, Methods, Results, And Discussion) signals "serious research" to readers before they process a single data point. Scientists who deviate from this structure, no matter how brilliant their findings, face skepticism simply because the expected architecture is absent.

The implication: if architecture matters more than content, then the best ideas might be hiding in structures we ignore.

The corollary: sometimes mediocre ideas gain traction through superior architecture alone.

Memetic Survival Structures

Religious texts represent perhaps the oldest successful information architectures. The Bible is a carefully structured document designed for propagation across generations.

Its architectural elements - repetition, narrative arcs, embedded moral frameworks - serve as memetic survival mechanisms.

Compare this with philosophical treatises. Despite often containing equally profound ideas, most philosophical works reach far fewer minds. Plato's dialogues endure partly because their question-and-answer architecture proves more memorable than dense analytical prose. Spinoza's work, geometrically structured with axioms and propositions, appeals to mathematically inclined thinkers, but its broader reach has remained limited.

This architectural advantage explains why religious frameworks persist even as specific theological claims become scientifically untenable. The architecture remains functional long after the content requires updating.

Companies often fail because their information architecture prevents their best ideas from reaching the right people. Amazon's "six-page memo" culture is an architectural intervention. By standardizing how ideas are presented (six pages, narrative form, read silently at the beginning of meetings), Bezos created an environment where ideas compete based on merit rather than presentation skill.

Microsoft, under Ballmer, suffered from the opposite problem. Its stack-ranking performance review system created an architectural environment where protecting your ideas became more important than sharing them. The content of those ideas - many likely brilliant - couldn't overcome the architectural barriers to their propagation.

When I talk with founders, I rarely focus first on generating new ideas.

Instead, I examine their information architecture.

How do ideas flow?

What structures impede or facilitate this flow?

The same people generating the same ideas under a different architectural framework produce dramatically different results.

The Education Paradox

Finland's educational system consistently outperforms that of the United States despite spending fewer hours on instruction.

The content difference is minimal - both teach mathematics, reading, science.

The architectural difference is profound.

Finnish education emphasizes integrated learning blocks rather than discrete subjects. Their architecture creates natural connections between domains that American education artificially separates. An American student might excel at chemistry but struggle with physics, not realizing the profound connection between the subjects because the educational architecture separates them.

Students who just memorize facts fail to apply them in novel contexts. The architecture of their knowledge lacks connective tissue.

I've experimented with this in my own learning.

When I’m studying a new domain, I spend more time on architectural questions (How do the core concepts relate? What's the hierarchical structure?) than on content acquisition (dates, names, etc.).

And it works.

The Architectural Immune System

If architecture matters more than content, why don't we acknowledge this?

Why the persistent focus on what rather than how?

The answer lies in what I call the "architectural immune system."

Dominant architectures develop defensive mechanisms against architectural criticism. They redirect attention toward content debates, where disagreements can occur without threatening the underlying structure.

Example: Democrats and Republicans fiercely debate policy content within an architectural framework that remains largely unchallenged. Suggestions of parliamentary systems or ranked-choice voting face bipartisan resistance because they’re architectural threats.

Science shows the same pattern. Challenges to methodological architecture face stiffer resistance than content disagreements within the accepted framework, as Thomas Kuhn recognized in "The Structure of Scientific Revolutions."

Claude Shannon's information theory offers another perspective. Effective architectures compress information, making it easier to transmit, store, and recall. The periodic table is an example we’re all familiar with - it compresses vast chemical knowledge into a simple architectural framework. Once you understand the architecture, you can predict properties of elements you've never encountered.
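To make the compression point concrete, here is a minimal sketch in Python using the standard-library zlib module. It is an illustration under stated assumptions, not a measurement of anything in this essay: a short repeated pattern stands in for content that follows a rule, random bytes stand in for content with no structure, and the names compression_ratio, structured, and unstructured are my own.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Structured data: a short pattern repeated many times, standing in for
# content generated by an underlying rule (an "architecture").
structured = b"ABCDEFGH" * 500

# Unstructured data: random bytes of the same length, with no pattern
# for the compressor to exploit.
unstructured = os.urandom(len(structured))

structured_ratio = compression_ratio(structured)
unstructured_ratio = compression_ratio(unstructured)

print(f"structured:   {structured_ratio:.3f}")
print(f"unstructured: {unstructured_ratio:.3f}")
```

The structured input should shrink to a small fraction of its original size, while the random input should barely shrink at all. A general-purpose compressor is only a rough proxy for Shannon's ideas, but the contrast is the point: the more of the data a rule can account for, the less has to be stored.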

Stories function as compression algorithms for human experience. A narrative architecture - setup, conflict, resolution - allows complex life lessons to be transmitted efficiently across generations. The content (specific characters or settings) can vary while the architecture keeps its functional integrity.

Which is why certain architectures persist across domains.

The three-act structure appears in everything from Hollywood films to scientific papers to political speeches. Its persistence plausibly comes down to a favorable compression ratio for human information processing.

Good educators implicitly understand this principle.

They give their pupils architectural frameworks for knowledge acquisition before filling them with content.

Poor educators dump content without architectural scaffolding, then wonder why students fail to retain information.

User interface designers understand that architecture determines user behavior more reliably than content. Social media platforms deliberately architect interaction flows to maximize engagement. The specific content becomes almost irrelevant once the architectural hooks are established.

Political movements work the same way. The Tea Party's decentralized structure allowed rapid growth that more hierarchical movements couldn't match. Early Christianity's cell-based organization enabled it to spread under Roman persecution while more centralized religions struggled.

The Architecture-Content Loop

There’s a coda here: a recursive relationship between structure and content.

Architecture shapes content, but content eventually reshapes architecture.

“Normal” science operates within established architectural constraints until anomalous content accumulates beyond a threshold. This content pressure eventually forces architectural revision, establishing a new paradigm with different constraints.

English common law embodies this. The architectural framework of precedent shapes legal decisions, but novel cases gradually reshape the architecture itself. The adaptive tension is why common law stays remarkably resilient - it builds architectural stability while allowing content-driven evolution.

Is it possible - even conceptually - to cultivate this balance with any degree of actual intent?

Can we design architectures that optimize for content innovation while maintaining, even building, structural integrity?

Beyond the Architecture-Content Dichotomy

Architecture and content exist on a spectrum. The distinction blurs when you start to think about meta-architecture - the architecture of creating architectures.

And if you’ve made it this far, bear with me.

We’re almost done...

A language like Python represents both content (specific syntax and libraries) and architecture (design principles and paradigms). Languages that succeed typically offer architectural advantages that transcend their specific content implementations.

Maybe instead of asking whether architecture or content matters more, we should ask:

What is the optimal relationship between them for a given purpose?

How can we design architectures that enable the right kind of content to emerge?

If architecture matters more than content, we have an educational challenge. Our systems emphasize content mastery but neglect architectural understanding.

Students learn facts rather than frameworks for organizing those facts.

Literacy needs both. Reading is recognizing both letters (content) and grammar (architecture); intellectual development is understanding both specific ideas and structures for organizing them.

Architectural literacy education should probably include:

  • Pattern recognition across domains

  • Systems thinking and relationship mapping

  • Meta-cognitive frameworks for knowledge organization

  • Explicit study of successful information architectures

Some educational approaches move in this direction. Montessori education emphasizes discovery of underlying patterns. A liberal arts education (my own academic background) attempts to provide architectural frameworks that transcend specific content domains.

But these remain exceptions rather than the rule.

Most education still prioritizes content acquisition over architectural understanding.

Knowledge management systems like Roam Research and Obsidian emphasize relationship networks over content collection. Complexity scientists study emergent properties of systems rather than individual components. Network theorists map architectural relationships across domains from biology to information spread.
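As a rough sketch of what "relationship networks over content collection" can mean in practice, the Python below builds a backlink index from notes that use the [[wikilink]] style of linking found in tools like Roam Research and Obsidian. Everything here is hypothetical: the note titles and text are invented, and build_backlinks is my own helper, not an API of either tool.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Hypothetical notes, invented for illustration: title -> body text.
notes = {
    "Attention": "Builds on [[Transformers]] and relates to [[Compression]].",
    "Transformers": "See [[Attention]].",
    "Compression": "Shannon's idea; compare [[Attention]].",
}

# Matches the text inside double square brackets, e.g. [[Attention]].
WIKILINK = re.compile(r"\[\[([^\]]+)\]\]")


def build_backlinks(notes: dict[str, str]) -> dict[str, set[str]]:
    """Map each linked-to title to the set of notes that link to it."""
    backlinks: defaultdict[str, set[str]] = defaultdict(set)
    for title, body in notes.items():
        for target in WIKILINK.findall(body):
            backlinks[target].add(title)
    return dict(backlinks)


backlinks = build_backlinks(notes)
for title in sorted(backlinks):
    print(title, "<-", sorted(backlinks[title]))
```

The index contains none of the notes' prose. It records only which note points at which, and that is the part these tools put in the foreground.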

This shift from content to architecture is a fundamental change in how we understand the world. Rather than seeing reality as composed of discrete objects and facts, we increasingly recognize patterns of relationship as the fundamental unit of meaning.

For those paying attention, this architectural turn offers tremendous leverage. Those who understand and design architectures will shape the future more profoundly than those who merely generate content within existing structures.

And yet - irony noted - I've just spent 2,000 words of content arguing for architecture's primacy. Perhaps the most persuasive argument wouldn't be an essay at all, but a new architecture for sharing ideas that demonstrates the principle directly.

Maybe next time.
