Building Co-Creative AI Systems as a Method for Testing Enactive Theory: Bridging HCI, Enaction, and AI
Nicholas Davis, PhD
Abstract
Cognitive science and artificial intelligence were once closely braided, sharing methods, questions, and conceptual commitments. Today, however, they have largely diverged. Enactive and embodied approaches within cognitive science have developed rich theoretical accounts of sense-making, temporality, and lived interaction, while contemporary AI has advanced powerful systems primarily organized around optimization, prediction, and performance. This divergence has produced an asymmetry: cognitive theories increasingly lack constructive contact with artificial systems, while AI systems increasingly lack grounding in theories of cognition. This paper argues that building co-creative AI systems constitutes a form of theory-in-action through which enactive cognitive theories can be tested, refined, and made operational. We propose co-creative AI—particularly as developed within Human–Computer Interaction—as a methodological bridge that reunites theory and construction. By treating system-building as an epistemic practice rather than an application step, co-creative systems function as experimental instruments for cognitive theory, revealing constraints, failure modes, and commitments that abstract theorizing alone cannot expose.
1. Introduction: The Drift Between Cognitive Science and AI
Cognitive science, artificial intelligence, and human–computer interaction did not begin as separate enterprises. Early work in cybernetics, artificial intelligence, and cognitive modeling treated theory and construction as mutually informing: building systems was a way of thinking about cognition, and theories of cognition were expected to survive contact with working artifacts (Wiener, 1948; Ashby, 1956; Simon, 1969; Winograd & Flores, 1986). Over time, however, these traditions have drifted apart.
In contemporary cognitive science, particularly under enactive, embodied, and phenomenological approaches, cognition is understood as sense-making enacted through ongoing coupling with the world (Varela et al., 1991; Thompson, 2007; Di Paolo et al., 2017). Concepts such as viability, autonomy, temporality, and regulation now occupy a central place in theoretical accounts of mind (Maturana & Varela, 1980; Di Paolo, 2005; Weber & Varela, 2002). These developments have significantly enriched our understanding of cognition as lived, situated, and historically extended (Merleau-Ponty, 2012; Gallagher, 2017).
At the same time, artificial intelligence has undergone rapid technical expansion. Large-scale learning systems, generative models, and optimization-driven architectures now achieve impressive performance across a wide range of tasks (LeCun et al., 2015; Sutton & Barto, 2018). Yet these systems are largely developed independently of contemporary cognitive theory. Intelligence is typically framed in terms of prediction accuracy, reward maximization, or benchmark performance, and interaction is often reduced to input–output mapping rather than treated as an unfolding process of sense-making (Russell & Norvig, 2021; Lipton & Steinhardt, 2019).
The result is a growing gap. Cognitive theories of mind increasingly lack constructive instantiation, while artificial systems increasingly lack cognitive grounding. Enactive theories describe what cognition is, but are rarely stress-tested by systems that must operate, adapt, and remain coherent over time (Froese & Ziemke, 2009). AI systems, meanwhile, excel at short-horizon success while exhibiting subtle but persistent failures in long-term interaction, including over-engagement, mis-timing, loss of trust, and breakdowns of coherence that are not easily captured by existing metrics (Lee & See, 2004; Sculley et al., 2015; Amodei et al., 2016).
This paper argues that Human–Computer Interaction—and, in particular, co-creative AI systems—provides a practical site where these traditions can be re-entangled. Co-creative systems force artificial agents to participate in ongoing human activity rather than merely produce outputs (Jordan & Henderson, 1995; Fischer et al., 2014). In doing so, they make unavoidable many of the core commitments of enactive theory: regulation over time, sensitivity to drift, pacing, yielding, and the maintenance of interactional coherence (Dourish, 2004; De Jaegher & Di Paolo, 2007).
This perspective aligns closely with more recent work on dynamic human–computer interaction (DHCI), which conceptualizes interaction as a temporally extended, adaptive process rather than a sequence of discrete user actions. Contemporary DHCI research emphasizes long-term interaction, mutual adaptation, breakdown and recovery, and the co-evolution of users and systems over time—particularly in contexts involving intelligent, autonomous, or learning systems (Bødker, 2015; Kuutti et al., 2021; Lim et al., 2019). Rather than treating interaction as momentary input–output exchange, DHCI foregrounds trajectories, histories, and transitions as primary analytic units. Co-creative AI systems naturally instantiate these concerns: they expose how pacing, yielding, responsiveness, and coherence must be regulated across unfolding interactional contexts. As such, DHCI provides a contemporary interactional framework in which enactive commitments—such as participatory sense-making and regulation under drift—become operationally unavoidable rather than theoretically optional.
We propose that building such systems is not an application of cognitive theory but a mode of theoretical inquiry. Co-creative AI systems function as epistemic instruments through which theories of cognition are enacted, constrained, and revised (Schön, 1983; Kirsh, 2013). Returning construction to the center of cognitive science, via HCI, opens a path toward re-braiding enaction, AI, and interaction around the shared problem of understanding intelligence as something that unfolds over time.
Contributions
This paper makes three primary contributions.
First, it articulates a methodological diagnosis of the current divide between enactive cognitive science and artificial intelligence, arguing that the separation is not merely disciplinary but rooted in a twofold retreat: enactive cognitive science has withdrawn from constructive engagement, and AI has withdrawn from cognitive theory. We characterize this condition as a form of theoretical and constructive drift that limits both explanatory power and practical viability (Suchman, 2007; Froese & Ziemke, 2009).
Second, the paper advances co-creative AI within HCI as a methodological hinge through which enactive theory and artificial systems can productively constrain one another. We argue that co-creative interaction—by demanding sustained, situated participation over time—forces cognitive commitments such as regulation, autonomy, and sense-making to become operational, while simultaneously exposing limitations of optimization-centric AI approaches that neglect interactional coherence (Bown et al., 2009; Davis et al., 2015; Jordanous & Keller, 2016).
Third, we propose building as a mode of theoretical inquiry in cognitive science. Rather than treating construction as downstream of theory, we show how co-creative systems function as epistemic instruments that reveal where cognitive concepts succeed, fail, or require refinement when enacted in real interaction (Schön, 1983; Kirsh, 2013; Di Paolo et al., 2017). This reframing supports a re-braided cognitive science in which theory, system-building, and lived interaction evolve together, offering a practical path toward cognitively grounded AI without reducing cognition to performance metrics or abstraction alone.
This paper proposes not a new interface or system for adoption but a methodological stance for HCI research: that building co-creative systems can function as a form of cognitive experimentation.
2. Enaction Without Construction: Limits of Contemporary Enactive Cognitive Science
2.1 Enaction’s Contributions to Cognitive Theory
Enactive cognitive science has produced some of the most compelling accounts of cognition available today. By rejecting internalist, representational models, enaction reframes cognition as sense-making enacted through ongoing coupling between organism and environment (Varela et al., 1991; Di Paolo et al., 2017). Concepts such as autonomy, viability, structural coupling, and normativity have reshaped how cognition is understood—placing time, embodiment, and lived experience at the center rather than at the margins (Maturana & Varela, 1980; Weber & Varela, 2002; Thompson, 2007).
These contributions are not merely philosophical. Enactive approaches have clarified why cognition cannot be reduced to symbol manipulation or statistical inference alone (Hutto & Myin, 2017). They have foregrounded the role of affect, action readiness, and historical continuity in sustaining meaningful engagement with the world (Di Paolo, 2005; Colombetti, 2014). In doing so, enaction has offered a powerful ontology of cognition—one that emphasizes becoming over state, regulation over control, and participation over representation (Varela, 1979; Gallagher, 2017).
2.2 The Absence of Constructive Pressure
Despite these strengths, much contemporary enactive work remains non-constructive. Enactive theories are often articulated at a conceptual or descriptive level, supported by phenomenological analysis or reinterpretations of empirical findings, but are rarely stress-tested by systems that must actually operate over time (Froese & Ziemke, 2009). As a result, many core enactive commitments remain underdetermined with respect to implementation, dynamics, and failure.
This limitation is not accidental. Enactive cognitive science has, in part, defined itself in opposition to classical AI and computational modeling, which it correctly identified as overly representational and disembodied (Varela et al., 1991; Winograd & Flores, 1986). In distancing itself from those traditions, however, enaction has also retreated from system-building as a methodological practice. Artificial systems are frequently treated as metaphors, cautionary examples, or external foils rather than as sites where enactive theory might be enacted, tested, or challenged (Suchman, 2007; Newen et al., 2018).
2.3 The Gap Between Theory and Operation
The consequence is a growing gap between theoretical richness and operational clarity. Key enactive notions—such as regulation, autonomy, sense-making, and normativity—are invoked with increasing sophistication, yet often without specifying how these processes would function in systems that must sustain interaction, recover from misalignment, or adapt under ongoing change (Di Paolo & Thompson, 2014). Without constructive pressure, theories risk stabilizing internally while remaining permissive about their own commitments.
We refer to this condition as theoretical drift. Theoretical drift occurs when a framework remains coherent within its own discourse but gradually loses traction with practice. Concepts accumulate, distinctions proliferate, and explanatory narratives become more refined, yet the theory is no longer constrained by the demands of building something that must work (Kuhn, 1962; Schön, 1983). Importantly, theoretical drift is not a failure of rigor; it is a failure of contact.
2.4 Temporality Without Operational Constraint
Theoretical drift becomes especially visible when enactive theories are applied to phenomena that unfold over time. While enaction emphasizes temporality in principle, few enactive accounts are tested against systems that must manage long-duration interaction, where misalignment accumulates slowly, breakdowns are subtle, and viability depends on pacing, restraint, and historical sensitivity (Thompson, 2007; Rietveld et al., 2018). As a result, temporality often remains a descriptive feature rather than an operational constraint.
2.5 Ambiguity About Success and Failure
Without construction, enactive theories face a second risk: ambiguity about what counts as success or failure. In the absence of running systems, it becomes difficult to distinguish between concepts that are explanatorily necessary and those that are merely rhetorically appealing. Does a given account of regulation genuinely constrain behavior, or does it simply redescribe it? Does autonomy imply specific capacities, or does it function as a post hoc label? These questions cannot be resolved through theory alone (Beer, 2003; Froese & Ziemke, 2009).
2.6 Construction as Inquiry, Not Engineering
None of this suggests that enactive cognitive science must become engineering in the narrow sense. Rather, it suggests that enactive theory requires constructive counterparts—systems that instantiate its commitments sufficiently to expose their consequences. Building, in this context, is not a validation exercise but a form of inquiry (Schön, 1983; Kirsh, 2013). It reveals where concepts are underspecified, where assumptions conflict, and where additional distinctions are required.
In the absence of such constructive engagement, enactive cognitive science risks becoming phenomenologically rich but operationally thin. Its accounts may remain compelling, yet increasingly detached from the practical problem of how cognition—biological or artificial—maintains coherence under changing conditions. Re-engaging with construction is therefore not a retreat from enaction’s philosophical commitments, but a way of honoring them (Di Paolo et al., 2017).
The next section turns to the complementary problem: contemporary artificial intelligence systems that build powerful machinery while largely abandoning theories of cognition altogether.
3. AI Without Cognition: Limits of Contemporary Artificial Intelligence
3.1 Technical Success and Conceptual Narrowing
Contemporary artificial intelligence has achieved extraordinary technical success. Large-scale learning systems now demonstrate impressive performance across perception, language, and control tasks. Advances in representation learning, optimization, and scaling have produced systems that are robust, flexible, and widely deployable (LeCun et al., 2015; Brown et al., 2020; Bommasani et al., 2021). From an engineering standpoint, modern AI is undeniably powerful.
Yet this success has come with a narrowing of what intelligence is taken to be. In dominant AI paradigms, intelligence is framed primarily as performance on well-defined tasks: prediction accuracy, reward maximization, loss minimization, or benchmark superiority (Russell & Norvig, 2021; Sutton & Barto, 2018). Learning is evaluated in terms of convergence and generalization, and interaction is often treated as a transient channel for input and output rather than as an ongoing process of sense-making (Bender & Koller, 2020).
3.2 Performance Without Viability
This framing has proven effective for short-horizon problems, but it carries significant cognitive limitations. Most contemporary AI systems are optimized to act correctly, not to remain viable. They excel at producing appropriate outputs under assumed conditions, yet struggle to sustain coherent behavior across extended interaction, shifting contexts, or evolving norms (Amodei et al., 2016; Marcus, 2018). When breakdown occurs, it is typically handled through retraining, fine-tuning, or external intervention rather than through endogenous regulation.
From a cognitive perspective, this reflects a fundamental asymmetry. Human and animal cognition is not organized primarily around performance metrics, but around maintaining coherent engagement with the world over time (Di Paolo et al., 2017; Thompson, 2007). Cognition persists by regulating participation—by adjusting pacing, yielding control, reorienting attention, and responding to subtle signs of misalignment before failure becomes explicit (Rietveld et al., 2018). These processes are largely absent from contemporary AI architectures, not because they are infeasible, but because they are rarely treated as first-class design problems (Suchman, 2007).
3.3 Interaction as Interface Rather Than Relationship
Interaction, in particular, is frequently reduced to a technical interface problem. Users provide inputs; systems return outputs. Even in interactive learning settings, the dominant question is how to extract signal from human behavior efficiently, rather than how to participate meaningfully in a shared activity (Amershi et al., 2014; Kocaballi et al., 2020). As a result, interaction is modeled as data exchange rather than as a temporally extended relationship.
This reduction obscures a key issue: intelligence that unfolds over time cannot be evaluated solely through instantaneous performance. Systems that appear competent in isolated episodes may nevertheless accumulate misalignment, rigidity, or overconfidence when deployed in real-world settings (Dietterich, 2019; Mitchell et al., 2021). The growing literature on distribution shift, dataset bias, and model brittleness reflects this problem, yet these phenomena are often treated as technical pathologies rather than as symptoms of a deeper cognitive gap (Quinonero-Candela et al., 2009; Ovadia et al., 2019).
3.4 The Absence of Sense-Making Mechanisms
The deeper cognitive gap concerns sense-making. Contemporary AI systems typically do not maintain an internal distinction between doing well and being aligned with the situation. They lack mechanisms for detecting when continued action is no longer appropriate, when participation should be modulated, or when coherence is degrading despite locally valid behavior (Floridi et al., 2018; Russell, 2019). As a result, they may persist confidently in regimes that appear successful according to internal metrics but are interactionally failing.
Crucially, this is not a matter of insufficient scale or data. Increasing model capacity can improve short-term adaptability, but it does not by itself introduce the capacity for self-regulation under drift (Bengio et al., 2021). Without architectural commitments to temporality, coupling, and participation, systems remain brittle over extended interaction—no matter how impressive their immediate outputs (Marcus & Davis, 2019).
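The distinction drawn above, between doing well by an internal metric and remaining aligned with the situation, can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not a mechanism proposed in this paper: the class name `CoherenceMonitor`, the use of user acceptance as a proxy for interactional coherence, the exponential moving averages, and all threshold values are hypothetical choices made only for the example.

```python
from dataclasses import dataclass


@dataclass
class CoherenceMonitor:
    """Tracks two signals separately: the system's own task metric and a
    proxy for how well the interaction itself is going (all values assumed)."""

    alpha: float = 0.3            # smoothing factor for both moving averages
    task_floor: float = 0.7       # "doing well" threshold on the internal metric
    coherence_floor: float = 0.4  # below this, the interaction counts as degrading
    task_avg: float = 1.0
    coherence_avg: float = 1.0

    def update(self, task_score: float, user_accepted: bool) -> str:
        """Fold in one turn and return a participation mode."""
        self.task_avg += self.alpha * (task_score - self.task_avg)
        self.coherence_avg += self.alpha * (float(user_accepted) - self.coherence_avg)
        if self.coherence_avg >= self.coherence_floor:
            return "continue"
        # The interaction is degrading. If the internal metric still looks
        # healthy, this is the locally valid but interactionally failing case,
        # so the system holds back instead of persisting.
        return "yield" if self.task_avg >= self.task_floor else "repair"


# Hypothetical session: the internal score stays high while the user
# starts rejecting every contribution.
monitor = CoherenceMonitor()
accepted_per_turn = [True, True, False, False, False, False]
modes = [monitor.update(0.9, accepted) for accepted in accepted_per_turn]
print(modes)
```

In this sketch the monitor keeps returning `"continue"` for the first four turns and switches to `"yield"` once the smoothed acceptance signal falls below its floor, even though the task score never drops. A purely metric-driven system would have no reason to change its behavior in the same session.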
3.5 Construction Without Cognitive Theory
From the perspective of enactive cognitive science, this reveals a missed opportunity. AI systems constitute one of the few domains where cognitive theories could be materially instantiated and stressed under real conditions. Yet most contemporary AI research proceeds with minimal engagement with theories of cognition, embodiment, or sense-making (Newell, 1990; Lake et al., 2017). Cognition is implicitly assumed rather than explicitly modeled.
The result is a form of cognitive underdetermination: systems that work remarkably well without clarifying what kind of intelligence they embody. Benchmarks are passed, but questions about autonomy, regulation, and viability are deferred or externalized (Rahimi et al., 2019). Intelligence becomes something inferred from outputs rather than understood as an ongoing organizational achievement (Bender et al., 2021).
3.6 Why the Absence of Cognition Matters
This does not imply that contemporary AI must adopt enactive theory wholesale. Rather, it suggests that AI’s current trajectory leaves key dimensions of intelligence unaddressed—particularly those that matter in sustained interaction with humans (Dreyfus, 2007; Suchman, 2007). The absence of cognitive theory is not neutral; it shapes what systems can notice, how they fail, and how responsibility for breakdown is distributed (Benjamin, 2019).
If enactive cognitive science risks becoming theory without construction, contemporary AI risks becoming construction without cognition. The challenge, then, is not to privilege one over the other, but to identify a domain in which both must confront each other’s limits.
The next section argues that human–computer interaction provides precisely such a domain—a space where cognition, construction, and lived experience are forced into direct contact.
4. Human–Computer Interaction as a Bridging Discipline
4.1 HCI Between Cognitive Theory and Artificial Systems
Human–computer interaction occupies a distinctive position between cognitive theory and artificial systems. Unlike much of cognitive science, HCI cannot remain purely descriptive; unlike much of AI, it cannot treat interaction as incidental. Systems must be encountered by real users, over time, in contexts where breakdown, confusion, frustration, and trust are unavoidable. This makes HCI an unusually demanding site for cognitive ideas (Norman, 2013; Rogers, 2012).
Where enactive cognitive science emphasizes lived experience, coupling, and sense-making, HCI operationalizes these concerns by necessity (Suchman, 2007). Interaction unfolds temporally. Misalignment accumulates. Users adapt, resist, reinterpret, or disengage. A system that performs well in isolation may still fail interactionally, not because it is inaccurate, but because it cannot pace itself, yield control, or remain intelligible as circumstances change (Dourish, 2001; Höök, 2018).
4.2 Interaction as a Temporal and Normative Process
For this reason, HCI functions as a phenomenological laboratory for cognitive theory. Concepts such as engagement, breakdown, coordination, and repair are not abstract topics but empirical realities encountered in use (Winograd & Flores, 1986; Suchman, 1987). Designers cannot assume stable meanings, fixed goals, or uniform users. Instead, they must confront how sense-making emerges in practice—how it stabilizes, drifts, and sometimes collapses (Rogers, Sharp, & Preece, 2011).
Crucially, HCI also constrains theory through construction. Cognitive commitments become embedded in interface timing, system responsiveness, feedback structure, and permissible actions. A theory that emphasizes coupling but produces systems that interrupt, dominate, or over-automate reveals a mismatch between its conceptual claims and its operational consequences (Höök et al., 2016). In this way, HCI exposes gaps between what a theory says cognition is and what it actually enables systems to do.
4.3 Viability as an Interactional Criterion
This constraint through construction distinguishes HCI from post-hoc evaluation approaches. Rather than asking whether a system achieves predefined outcomes, HCI asks whether interaction remains viable. Does the system remain understandable? Does it support repair? Can users regain orientation after breakdown? These questions align directly with enactive concerns about sense-making and normativity, yet they demand concrete answers in the form of working systems (Di Paolo et al., 2017; Kaptelinin & Nardi, 2012).
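One way to see how such questions could demand concrete answers is to phrase viability as a check over an interaction log instead of an outcome score. The sketch below is a simplified assumption for illustration only: the event labels `"ok"`, `"breakdown"`, and `"repair"`, the function names, and the `max_latency` threshold are hypothetical, and real interaction data would need a far richer coding scheme.

```python
from typing import List, Optional


def repair_latencies(events: List[str]) -> List[Optional[int]]:
    """For each "breakdown" event, return the number of turns until the next
    "repair" event, or None if the log ends without one."""
    latencies: List[Optional[int]] = []
    for i, event in enumerate(events):
        if event != "breakdown":
            continue
        latency: Optional[int] = None
        for j in range(i + 1, len(events)):
            if events[j] == "repair":
                latency = j - i
                break
        latencies.append(latency)
    return latencies


def is_viable(events: List[str], max_latency: int = 3) -> bool:
    """Viable, in this sketch, means every breakdown is repaired within
    max_latency turns. A log with no breakdowns counts as viable."""
    return all(
        lat is not None and lat <= max_latency
        for lat in repair_latencies(events)
    )


# Hypothetical log: the first breakdown is repaired two turns later,
# the second is never repaired before the session ends.
log = ["ok", "breakdown", "ok", "repair", "ok", "breakdown", "ok", "ok"]
print(repair_latencies(log))
print(is_viable(log))
```

Note the design choice: the check says nothing about whether the session reached a predefined goal. It asks only whether orientation was regained after each breakdown. One simplification to keep in mind is that a single repair event is credited to every unrepaired breakdown that precedes it.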
Historically, HCI has often been framed as a downstream application area—where insights from psychology or AI are translated into usable interfaces. However, this framing understates its epistemic role. HCI does not merely apply cognitive theories; it tests their consequences (Rogers, 2012). It forces theories to encounter time, variability, and situated use. In doing so, it reveals which theoretical distinctions matter and which dissolve under interactional pressure (Suchman, 2007).
4.4 Creative Interaction as a Stress Test for Cognitive Theory
This is especially apparent in creative and co-creative systems. Creative interaction cannot be reduced to goal satisfaction or task completion. It requires sensitivity to timing, openness to deviation, and tolerance for ambiguity (Fischer et al., 2014). Systems must know when to act, when to hold back, and when to let human activity reconfigure the interaction. These demands make creative HCI a particularly rich testbed for enactive ideas (Höök, 2018; Davis et al., 2021).
Moreover, HCI provides a shared language across disciplines. Terms such as interactional breakdown, repair, legibility, trust, and pacing are intelligible to designers, engineers, and cognitive scientists alike (Norman, 2013; Rogers et al., 2011). This makes HCI a practical site for re-entangling enactive theory and AI construction without requiring either community to abandon its core commitments.
4.5 HCI as a Load-Bearing Structure
From this perspective, HCI is not merely a bridge between enaction and AI—it is the load-bearing structure that makes their reunion possible. It supplies the experimental ecology in which cognitive theories must survive contact with real systems and real users. It also supplies AI with criteria of success that extend beyond optimization toward sustained participation, intelligibility, and trust (Shneiderman, 2020).
The argument of this paper is therefore not that HCI should import enactive theory wholesale, nor that AI should be subordinated to cognitive science. Rather, HCI provides the space in which enactive ideas can become operational and AI systems can become cognitively accountable. It is the discipline in which theory, construction, and lived interaction are forced into alignment.
Table X situates co-creative AI at the intersection of three traditions that are frequently invoked together but rarely compared at the level of their operative commitments: enactive cognitive science, artificial intelligence, and human–computer interaction. Rather than contrasting these fields in terms of disciplinary scope or technical maturity, the table highlights how each addresses core dimensions of co-creative interaction—such as temporality, regulation, drift, autonomy, and evaluation—through distinct methodological lenses. The comparison makes visible both complementarities and blind spots: what each tradition foregrounds, what it tends to abstract away, and which phenomena become legible only when systems must participate in sustained interaction with human partners. In doing so, the table supports the paper’s central claim that co-creative AI functions as a methodological hinge—one that exposes the limits of isolated approaches and motivates a re-braided framework in which theory, construction, and interaction mutually constrain one another.